Tag Archives: VXLAN

NSX Troubleshooting – VMs out of Network on VNI 5XXX

I am currently working for a customer running network virtualization (NSX) in their SDDC environment. A few weeks ago we faced an issue where multiple VMs in one of the compute clusters lost network connectivity. I wanted to share what we found in the hope that it is useful for others working with NSX. The customer runs NSX 6.1.1 with multiple VNIs serving networks for multiple environments (e.g. Prod, DR, Dev, QA, Test).

Here are the steps we followed:-

  1. After receiving the issue, we tried to ping random VMs from the list; none of them were reachable.
  2. The next step was to find the VNI number for those VMs and check whether they were all on the same VNI. They were (e.g. 5XXX).
  3. Once we knew the VNI number, we checked whether all VMs connected to VNI 5XXX were impacted or only some.
  4. Step 3 showed that only some VMs were impacted. Drilling down, we found that the impacted VMs were all running on one ESXi host in the cluster; the VNI was working fine on the other hosts.
  5. To bring the VMs back online, we migrated them to another host. After the migration the VMs were reachable and users were able to connect to the applications.
  6. Next was the root cause analysis (RCA): why did VMs connected to VNI 5XXX on ESXi host XXXXXXXXXX lose network connectivity?
  7. We connected to the ESXi host with PuTTY and ran the following command to check the VNI status on the host: net-vdl2 -l. The output showed that VXLAN network 5002 was DOWN, and all impacted VMs were part of it.

  9. To fix the issue, we needed to restart the netcpa daemon on the host. Here are the commands to stop, start, and check the status of the netcpa daemon:

     1) Stop the netcpa daemon: /etc/init.d/netcpad stop

     2) Start the netcpa daemon: /etc/init.d/netcpad start

     3) Check the status of the service: /etc/init.d/netcpad status

  10. After starting the netcpa daemon, we checked the VNI status again with net-vdl2 -l. VXLAN 5002 was now UP.

  11. Finally, we moved a few VMs on VNI 5002 back to this host and checked the connectivity of the VMs and applications. Everything worked fine on this host after the move.
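The narrowing-down in steps 1 to 4 (are all VMs on the VNI affected, or only those on one host?) can be sketched in a few lines of Python. This is only an illustration: the VM names, host names, and the two input dictionaries are hypothetical, and in practice the data would come from your own inventory and ping checks.

```python
from collections import defaultdict


def hosts_with_only_unreachable_vms(vm_to_host, reachable):
    """Return the hosts on which every VM of the VNI is unreachable."""
    per_host = defaultdict(list)
    for vm, host in vm_to_host.items():
        per_host[host].append(reachable[vm])
    # A host is suspect when none of its VMs on this VNI answers a ping.
    return sorted(host for host, states in per_host.items() if not any(states))


# Hypothetical inventory for one VNI: VM name -> ESXi host it runs on.
vm_to_host = {"web01": "esx-a", "web02": "esx-a", "app01": "esx-b", "db01": "esx-c"}
# Hypothetical ping results: VM name -> reachable or not.
reachable = {"web01": False, "web02": False, "app01": True, "db01": True}

suspect_hosts = hosts_with_only_unreachable_vms(vm_to_host, reachable)
print(suspect_hosts)
```

If a single host stands out, as it did in our case, that host is where to check the VNI status.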

Note:- This issue has been addressed in NSX version 6.1.4e. If you are running NSX 6.1.4e you should not hit this issue, because the Controller monitors the netcpad daemon and starts it again if it has failed on any of the hosts.

That’s it ….SHARE & SPREAD THE KNOWLEDGE!!

Network Virtualization with VMware NSX – Part 5

In Network Virtualization with VMware NSX – Part 4 we discussed configuring and deploying an NSX Distributed Router. Here in Network Virtualization with VMware NSX – Part 5 we will discuss VXLAN to VLAN Layer 2 bridging, configuring and deploying an NSX Edge gateway, and configuring static routes on the NSX Edge gateway and on the Distributed Router.

VXLAN to VLAN Layer 2 Bridging

A VXLAN to VLAN bridge enables direct Ethernet connectivity between virtual machines in a logical switch and virtual machines in a distributed port group. This connectivity is called layer 2 bridging.

We can create a layer 2 bridge between a logical switch and a VLAN, which enables us to migrate virtual workloads to physical devices with no effect on IP addresses. A logical network can leverage a physical gateway and access existing physical network and security resources by bridging the logical switch broadcast domain to the VLAN broadcast domain. Bridging can also be used in a migration strategy where you might be using P2V and do not want to change subnets.

Note:- VXLAN to VXLAN bridging or VLAN to VLAN bridging is not supported. Bridging between different data centers is also not supported. All participants of the VLAN and VXLAN bridge must be in the same data center.

NSX Edge Services Gateway

The services gateway gives you access to all NSX Edge services such as firewall, NAT, DHCP, VPN, load balancing, and high availability. You can install multiple NSX Edge services gateway virtual appliances in a datacenter. Each NSX Edge virtual appliance can have a total of ten uplink and internal network interfaces.


The NSX Edge logical router provides East-West routing and the NSX Edge Services Gateway provides North-South routing.

NSX Edge Services Gateway Sizing:-

NSX Edge can be deployed in four different sizes: Compact, Large, Quad Large, and X-Large. When we deploy an NSX Edge gateway we need to choose the right size for the load and requirements. We can also convert the size of an ESG later, for example from Compact to Large, X-Large, or Quad Large.

Note:- A service interruption might occur when we convert the size of an ESG, because the old NSX Edge gateway instance is removed and a new NSX Edge gateway instance is redeployed with the new size.

NSX Edge Services Gateway features:-

For resiliency and high availability, the NSX Edge Services Gateway can be deployed as a pair of Active/Standby units (HA mode).

When we deploy an ESG/DLR in HA mode, NSX Manager deploys the pair of NSX Edges/DLRs on different hosts (anti-affinity rule). Heartbeat keepalives are exchanged every second between the active and standby edge instances to monitor each other’s health status.

If the ESXi server hosting the active NSX Edge fails, at the expiration of a “Declare Dead Time” timer, the standby node takes over the active duties. The default value for this timer is 15 seconds, but it can be tuned down (via UI or API calls) to 6 seconds.
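To make the timer arithmetic concrete, here is a minimal Python sketch that uses only the numbers quoted above (1-second heartbeats, a 15-second default, a 6-second minimum). The function name is made up for illustration and is not part of any NSX API; it simply counts how many heartbeat intervals pass before the standby takes over.

```python
HEARTBEAT_INTERVAL_S = 1     # keepalives are exchanged every second
DEFAULT_DECLARE_DEAD_S = 15  # default "Declare Dead Time"
MIN_DECLARE_DEAD_S = 6       # lowest value mentioned in the text


def heartbeats_before_takeover(declare_dead_s=DEFAULT_DECLARE_DEAD_S):
    """Return how many heartbeat intervals elapse before the standby takes over."""
    if declare_dead_s < MIN_DECLARE_DEAD_S:
        raise ValueError(
            f"Declare Dead Time must be at least {MIN_DECLARE_DEAD_S} seconds"
        )
    return declare_dead_s // HEARTBEAT_INTERVAL_S


print(heartbeats_before_takeover())   # default timer
print(heartbeats_before_takeover(6))  # most aggressive setting from the text
```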

The NSX Manager also monitors the health of the deployed NSX Edges and restarts a failed unit on another ESXi host.

The NSX Edge appliance supports static and dynamic routing (OSPF, IS-IS, BGP, and Route redistribution).

Deploy NSX Edge gateway and Configure the static routing:

1. Connect to vCenter Server through the vSphere Web Client –> click the Home tab –> Inventories –> Networking & Security and select NSX Edges.

2. Click the green plus sign (+) to open the New NSX Edge dialog box. On the Name and description page, select Edge Services Gateway. (To enable HA for the ESG, select the Enable High Availability check box; otherwise leave it unchecked.) Enter the name of the ESG as per your company standard and click Next.

3. On the CLI credentials page, enter the password for the ESG in the password text box. Check the Enable SSH Access box to enable SSH access to the ESG appliance. Note:- The password must be at least 12 characters long.

4. Select the datacenter where you want to deploy this appliance. Select the appliance size depending on your requirements; it can be converted to another size later. Check Enable auto rule generation to automatically generate service rules that allow the flow of control traffic. Under NSX Edge Appliances, click the green plus sign (+) to open the Add NSX Edge Appliance dialog box.

5. In the Add NSX Edge Appliance dialog box, select the cluster and datastore so that the NSX Edge appliance is deployed in the required location and on the designated datastore, then click OK.

6. Verify all the settings on the Configure deployment page and click Next.

7. On the Configure Interfaces page, click the green plus sign (+) to open the Add NSX Edge Interface dialog box.

8. Enter the interface name in the Name text box, choose the Type, click the Connected To –> Select link, and choose the required distributed port group. Click the green plus sign (+) under Configure Subnets to add a subnet for the interface.

9. In the Add Subnet dialog box, click the green plus sign (+) to add an IP address field. Enter the required IP address (192.168.100.3) in the IP Address text box and click OK to confirm the entry. Enter the subnet prefix length (24) in the Subnet prefix length text box and click OK.

10. Verify all the settings in the Add NSX Edge Interface dialog box and click OK.

11. Repeat steps 7-10 to add all the required interfaces for the ESG.

12. Once all interfaces have been added, verify the settings on the Configure Interfaces page and click Next.

13. On the Default gateway settings page, select the Configure Default Gateway check box. Verify that the vNIC selection is Uplink-Interface, enter the default gateway address (192.168.100.2) in the Gateway IP text box, and click Next.

14. On the Firewall and HA page, select the Configure Firewall default policy check box and set Default Traffic Policy to Accept. The Configure HA parameters section is grayed out because we did not select the Enable High Availability check box in step 2. Click Next.

15. On the Ready to Complete page, verify all the settings (go back if you want to change any of them) and click Finish to complete the deployment of the NSX Edge.

16. The deployment takes a few minutes to complete. Under NSX Edges you can then see the status shown as Deployed.

17. Double-click the NSX Edge to see the configuration settings we chose while deploying it.
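One thing worth sanity-checking before clicking Finish is that the default gateway from step 13 is actually on the same subnet as the uplink interface address from step 9. The Python sketch below does that check with the standard ipaddress module; the helper name is made up for illustration, and the second address in the example is just a hypothetical off-subnet value.

```python
import ipaddress


def gateway_is_on_link(interface_cidr, gateway_ip):
    """Return True if the gateway sits in the interface's subnet and is not the interface itself."""
    iface = ipaddress.ip_interface(interface_cidr)
    gateway = ipaddress.ip_address(gateway_ip)
    return gateway in iface.network and gateway != iface.ip


uplink = "192.168.100.3/24"  # uplink interface address and prefix length from step 9

print(gateway_is_on_link(uplink, "192.168.100.2"))  # default gateway from step 13
print(gateway_is_on_link(uplink, "192.168.110.1"))  # hypothetical address on another subnet
```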

Now we will configure static routes on the NSX Edge gateway:-

1. Double-click the NSX Edge to browse it –> click the Manage tab –> click Routing and select Static Routes. Click the green plus sign (+) to open the Add Static Route dialog box.

2. Select the interface connected to the DLR (Transit-Interface), enter the network ID with subnet mask (172.16.0.0/24) for which you want to add routing and the next hop address for that network (in my case 192.168.10.2), and click OK.

3. Every setting change or modification needs to be published. Click Publish Changes.

4. Once publishing has finished, you can see the entry under Static Routes.

Configure Static Routes on the Distributed Router:-

1. Under Networking & Security –> NSX Edges, double-click the Distributed Router entry to manage that object.

2. After opening the DLR, click the Manage tab and then Routing. In the routing category panel, select Static Routes and click the green plus sign (+) to add static routes on the DLR.

3. Select the interface connected to the ESG (Transit-Interface), enter the network ID with subnet mask (192.168.110.0/24) for which you want to add routing and the next hop address for that network (in my case 192.168.10.1), and click OK.

4. Every setting change or modification needs to be published. Click Publish Changes. Once done, you can see the static routes in the Static Routes list.

Once static routing has been configured, we can ping between the external network and the logical switch networks, e.g. from the external address 192.168.110.10 to the logical switch network 172.16.0.0/24 created in an earlier part.
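To see why the two static routes are enough for traffic in both directions, here is a small Python sketch of a longest-prefix-match lookup over the routes configured above. It is a simplification under stated assumptions: directly connected networks are not modelled, the ESG default route is the gateway entered during deployment, and the destination 172.16.0.10 is a hypothetical VM address.

```python
import ipaddress


def next_hop(static_routes, destination):
    """Return the next hop of the most specific matching route, or None."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, hop in static_routes.items():
        network = ipaddress.ip_network(prefix)
        # Keep the match with the longest prefix (most specific route).
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    return best[1] if best else None


# Static route added on the ESG, plus its default gateway from the deployment wizard.
esg_routes = {"172.16.0.0/24": "192.168.10.2", "0.0.0.0/0": "192.168.100.2"}
# Static route added on the DLR.
dlr_routes = {"192.168.110.0/24": "192.168.10.1"}

print(next_hop(esg_routes, "172.16.0.10"))     # towards the logical switch network, via the DLR
print(next_hop(dlr_routes, "192.168.110.10"))  # back towards the external network, via the ESG
```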


That’s it. We are done with deploying the NSX Distributed Router and the NSX Edge Services Gateway, and with configuring static routing on the DLR and ESG.

In the next part (Network Virtualization with VMware NSX – Part 6) we will discuss how to configure dynamic routing on the NSX Edge appliances and the NSX Distributed Router.

Thank you and stay tuned for next part. Keep sharing the knowledge 🙂

Other NSX Parts:-

Network Virtualization with VMware NSX – Part 1

Network Virtualization with VMware NSX – Part 2

Network Virtualization with VMware NSX – Part 3

Network Virtualization with VMware NSX – Part 4

Network Virtualization with VMware NSX – Part 5

Network Virtualization with VMware NSX – Part 3

In Network Virtualization with VMware NSX – Part 2 we discussed the NSX Controller cluster and how to deploy the NSX Controller instances, create an IP pool, and install the network virtualization components (prepare hosts) on vSphere hosts.

In this part we will discuss logical switch networks and VXLAN overlays.

Before discussing VXLAN, let’s briefly discuss the virtual LAN (VLAN):-

A VLAN is a group of devices on one or more LANs that are configured to communicate as if they were attached to the same wire, when in fact they are located on a number of different LAN segments. Because VLANs are based on logical instead of physical connections, they are extremely flexible.

VLANs address scalability, security, and network management by enabling a switch to serve multiple virtual subnets from its LAN ports.

VLANs split a switch into separate virtual switches (broadcast domains). Only members of a virtual LAN (VLAN) can see that VLAN’s traffic. Traffic between VLANs must go through a router.

By default, all ports on a switch are in a single broadcast domain. VLANs enable a single switch to serve multiple switching domains. The forwarding table on the switch is partitioned between all ports belonging to a common VLAN. By default, all ports on a switch belong to a single default VLAN (VLAN 1 on most switches), which is usually also the native VLAN.

Virtual Extensible LAN (VXLAN) enables you to create a logical network for your virtual machines across different networks. You can create a layer 2 network on top of your layer 3 networks.

VXLAN is an Ethernet-in-IP overlay technology, where the original layer 2 frame is encapsulated in a User Datagram Protocol (UDP) packet and delivered over a transport network. This technology provides the ability to extend layer 2 networks across layer 3 boundaries and consume capacity across clusters. VXLAN encapsulation adds 50 to 54 bytes of information to the frame, depending on whether VLAN tagging is used. VMware recommends increasing the MTU to at least 1,600 bytes to support NSX.
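The 50 to 54 bytes can be broken down header by header. The Python sketch below adds them up, assuming an IPv4 transport network and standard header sizes, and checks that a 1500-byte guest MTU plus the overhead fits into the recommended 1600-byte transport MTU. The function names are made up for illustration.

```python
OUTER_ETHERNET = 14  # outer MAC header
OUTER_VLAN_TAG = 4   # optional 802.1Q tag on the outer frame
OUTER_IPV4 = 20      # outer IP header (IPv4 transport assumed)
OUTER_UDP = 8        # outer UDP header
VXLAN_HEADER = 8     # VXLAN header carrying the VNI


def vxlan_overhead(vlan_tagged):
    """Return the bytes added in front of the original frame by VXLAN encapsulation."""
    tag = OUTER_VLAN_TAG if vlan_tagged else 0
    return OUTER_ETHERNET + tag + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER


def fits_transport_mtu(guest_mtu, transport_mtu, vlan_tagged):
    """Return True if an encapsulated full-size guest packet fits the transport MTU."""
    return guest_mtu + vxlan_overhead(vlan_tagged) <= transport_mtu


print(vxlan_overhead(vlan_tagged=False))             # untagged transport
print(vxlan_overhead(vlan_tagged=True))              # VLAN-tagged transport
print(fits_transport_mtu(1500, 1600, vlan_tagged=True))
print(fits_transport_mtu(1500, 1500, vlan_tagged=False))
```

With a standard 1500-byte transport MTU the encapsulated packet does not fit, which is why the MTU has to be raised on the transport network.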

A VXLAN Network Identifier (VNI) is a 24-bit number that gets added to the VXLAN frame. The 24-bit address space theoretically enables up to 16 million VXLAN networks. Each VXLAN network is an isolated logical network. VMware NSX™ starts with VNI 5000.

A VXLAN Tunnel End Point (VTEP) is an entity that encapsulates an Ethernet frame in a VXLAN frame or de-encapsulates a VXLAN frame and forwards the inner Ethernet frame.
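To show where the 24-bit VNI lives, here is a minimal Python sketch that packs and unpacks the 8-byte VXLAN header as laid out in RFC 7348 (8 flag bits, 24 reserved bits, 24-bit VNI, 8 reserved bits). It only illustrates the header layout; it is not how a VTEP is implemented, and the function names are made up.

```python
import struct

VNI_MAX = 2**24 - 1  # largest value a 24-bit VNI can hold
VNI_VALID_FLAG = 0x08  # "I" flag: the VNI field is valid


def pack_vxlan_header(vni):
    """Build the 8-byte VXLAN header for the given VNI."""
    if not 0 <= vni <= VNI_MAX:
        raise ValueError("VNI must fit in 24 bits")
    # First word: flags in the top byte, rest reserved. Second word: VNI, then 8 reserved bits.
    return struct.pack("!II", VNI_VALID_FLAG << 24, vni << 8)


def unpack_vni(header):
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VNI_VALID_FLAG:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8


header = pack_vxlan_header(5002)
print(header.hex())
print(unpack_vni(header))
print(VNI_MAX + 1)  # number of possible VNIs, the "16 million" above
```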

VXLAN Frame :-

The original frame from the virtual machine, minus the Frame Check Sequence (FCS), is encapsulated in a VXLAN frame. A new FCS is created by the VTEP to cover the entire VXLAN frame. The VLAN tag in the outer layer 2 Ethernet frame exists if the port group that your VXLAN VMkernel port is connected to has an associated VLAN number. When the port group is associated with a VLAN number, the port group tags the VXLAN frame with that VLAN number.

VXLAN Replication Modes:-

Three modes of traffic replication exist: two modes are based on the VMware NSX Controller™ and one mode is based on the data plane.

Unicast has no physical network requirements apart from the MTU. All traffic is replicated by the VTEPs. In NSX, the default mode of traffic replication is unicast. Unicast has higher overhead on the source VTEP and UTEP.

Multicast mode uses the VTEP as a proxy. In multicast, the VTEP never goes to the NSX Controller instance. As soon as the VTEP receives the broadcast traffic, the VTEP multicasts the traffic to all devices. Multicast has the lowest overhead on the source VTEP.

Hybrid mode is not the default mode of operation in NSX for vSphere, but it is important for larger scale operations. Also, the configuration overhead and complexity of L2 IGMP is significantly lower than that of multicast routing.

In Network Virtualization with VMware NSX – Part 2 we prepared the hosts, so now let’s configure VXLAN on the ESXi hosts.

1. Connect to vCenter using the Web Client.

2. Click Networking & Security and then click Installation.

3. Click the Host Preparation tab and, under the VXLAN column, click Configure to start configuring VXLAN on the ESXi hosts.

4. In the Configure VXLAN networking dialog box, select the switch and VLAN, and set the MTU to 1600. For VMKNic IP Addressing, choose an existing IP pool from the list if you have created one, or click IP Pool to create a new pool. Click OK.

5. It takes a few minutes to configure, depending on the number of hosts in the cluster. If an error is indicated, it is a transitory condition that occurs early in the process of applying the VXLAN configuration to the cluster: the vSphere Web Client interface has not yet updated to display the actual status. Click Refresh to update the console.

6. Repeat the steps to configure all the clusters. Once configuration is done on all clusters, verify that the VXLAN status is Enabled with a green check mark.

7. Click the Logical Network Preparation tab and verify that VXLAN Transport is selected. In the Clusters and Hosts list, expand each of the clusters and confirm that each host has a vmk# interface created with an IP address from the IP pool we created.

Once we have finished configuring VXLAN and verified the configuration for all the clusters, we need to configure the VXLAN ID pool to identify VXLAN networks:-

1. On the Logical Network Preparation tab, click the Segment ID button and click Edit to open the Segment ID pool dialog box.

2. Enter the segment ID pool and click OK to complete. VMware NSX™ VNIs start from 5000.
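As a quick check before typing a pool into the dialog, the Python sketch below validates a range against the two limits mentioned in this post: VNIs start at 5000 and must fit in 24 bits. The "start-end" text format and the example pool 5000-5999 are assumptions made for illustration, not values taken from the setup above.

```python
NSX_FIRST_VNI = 5000  # NSX VNIs start from 5000
VNI_MAX = 2**24 - 1   # a VNI is a 24-bit number


def parse_segment_pool(text):
    """Parse a 'start-end' pool (assumed format) and return it as a range of VNIs."""
    start_text, end_text = text.split("-")
    start, end = int(start_text), int(end_text)
    if not NSX_FIRST_VNI <= start <= end <= VNI_MAX:
        raise ValueError(f"pool must lie within {NSX_FIRST_VNI}-{VNI_MAX}")
    return range(start, end + 1)


pool = parse_segment_pool("5000-5999")  # hypothetical pool
print(len(pool))     # how many logical switches this pool allows
print(5002 in pool)  # whether a given VNI falls inside the pool
```

Each logical switch consumes one VNI from the pool, so the pool size limits how many logical switches can be created.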

Next we need to configure a global transport zone:-

A transport zone specifies the hosts and clusters that are associated with logical switches created in the zone. Hosts in a transport zone are automatically added to the logical switches that you create. This process is very similar to manually adding hosts to VMware vSphere Distributed Switch.

1. On the Logical Network Preparation tab, click Transport Zones and click the green plus sign to open the New Transport Zone dialog box.

2. Enter the name for the transport zone and select the control plane mode. Select the clusters to add to the transport zone and click OK to complete the creation.

———————————————————————————————————-

NSX Logical Switching

The logical switching capability in the NSX platform gives customers the ability to spin up isolated logical L2 networks with the same flexibility and agility with which they spin up virtual machines. Endpoints, both virtual and physical, can then connect to those logical segments and establish connectivity independently of the specific location where they are deployed in the data center network. This is possible because of the decoupling between network infrastructure and logical networks provided by NSX network virtualization. Each logical switch gets its own unique VNI.

The deployment of the NSX virtualization components helps with the agile and flexible creation of applications together with their required network connectivity and services. A typical example is the creation of a multi-tier application.

Configure Logical Switch Networks

We need to create logical switches for all the required networks (e.g. the Transit, Web-Tier, App-Tier, and DB-Tier networks of a multi-tier application).

1. Connect to vCenter Server using the Web Client, click Networking & Security, and select Logical Switches in the left navigation pane.

2. Click the green plus sign to open the New Logical Switch dialog box. Enter the logical switch name, select the global transport zone we created earlier, choose the control plane mode, and click OK to complete the switch creation.

3. Wait for the update to complete and confirm that Transit-Network appears with a status of Normal. Repeat the steps to create all the required logical switches and confirm that all of them are Normal.

Once the logical switches have been created, we need to migrate virtual machines to them:-

1. In the left pane under Networking & Security, select Logical Switches. In the center pane, select the logical switch (e.g. Web-Tier), right-click it, and choose Add VM.

2. Select the virtual machines you want to add to the logical switch and click Next.

3. Select the vNIC you want to add to the network and click Next.

4. In the Ready to complete box, verify the settings and click Finish to complete adding the VMs to the desired network.

5. To verify that the VMs have been added to the logical switch, double-click the logical switch.

6. Click Related Objects and then the Virtual Machines tab to see the list of VMs added to this specific logical switch.

7. Repeat the same steps for all the logical switches to add VMs. Once done, try to ping VMs on the same switch and between switches.

At this point you can only ping VMs connected to the same switch. To communicate with VMs on another switch we need to configure routing, which we will discuss in the next part.

======================================================


Please share if useful …..Thank You 🙂