Device and I/O virtualization involves managing the routing of I/O requests between virtual devices and the shared physical hardware. Software-based I/O virtualization and management, in contrast to a direct pass-through to the hardware, enables a rich set of features and simplified management. With networking, virtual NICs and virtual switches create virtual networks between virtual machines running on the same host, without the network traffic consuming bandwidth on the physical network.
NIC teaming consists of multiple physical NICs and provides failover and load balancing for virtual machines. Virtual machines can be seamlessly relocated to different systems by using VMware vMotion, while keeping their existing MAC addresses and the running state of the VM. The key to effective I/O virtualization is to preserve these virtualization benefits while keeping the added CPU overhead to a minimum.
The hypervisor virtualizes the physical hardware and presents each virtual machine with a standardized set of virtual devices. These virtual devices effectively emulate well-known hardware and translate the virtual machine requests to the system hardware. This standardization on consistent device drivers also helps with virtual machine standardization and portability across platforms, because all virtual machines are configured to run on the same virtual hardware, regardless of the physical hardware in the system. In this article we will discuss the following:
- Various network performance problems
- The causes of network performance problems
- Solutions to correct network performance problems
Designing a network for load balancing and failover for vSphere Standard Switch
The load balancing and failover policies chosen for the infrastructure can have an impact on the overall design. Using NIC teaming, we can group several physical network adapters attached to a vSwitch. This grouping enables load balancing between the different physical NICs and provides fault tolerance if a card or link failure occurs.
Network adapter teaming offers a number of load balancing and load distribution options. Despite the name, load balancing here is load distribution based on the number of connections, not on actual network traffic. In most cases, only outgoing traffic is balanced, using one of three policies:
- Route based on the originating virtual switch port ID (default)
- Route based on the source MAC hash
- Route based on IP hash
There are also two network failure detection options, and a quick way to check a host's current settings from the command line is shown after this list:
- Link status only
- Beacon probing
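If you just want to see what a host is currently configured with before changing anything, the ESXi shell can show it. This is a minimal sketch; vSwitch0 is only an example name, so substitute your own vSwitch:
# List the standard vSwitches and their uplinks on this host
esxcli network vswitch standard list
# Show the current load balancing and failure detection policy for one vSwitch
esxcli network vswitch standard policy failover get --vswitch-name=vSwitch0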
Getting ready
To step through this recipe, you will need one or more running ESXi hosts, a vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.
How to do it…
To change the load balancing policy, select the right one for your environment, and also choose the appropriate failover policy, follow these steps:
- Open up your VMware vSphere Client.
- Log in to the vCenter Server.
- On the left-hand side, choose any ESXi Server and then choose the Configuration tab in the right-hand pane.
- Click on the Networking section and select the vSwitch for which you want to change the load balancing and failover settings.
- Click on Properties.
- Select the vSwitch and click on Edit.
- Go to the NIC Teaming tab.
- Select one of the available policies from the Load Balancing drop-down menu.
- Select one of the available policies from the Network Failover Detection drop-down menu.
- Click on OK to apply the settings.
You may wish to override this at the port group level as well, as shown in the command-line sketch that follows.
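The same settings can also be applied from the ESXi shell instead of the vSphere Client. The following is a minimal sketch, assuming a vSwitch named vSwitch0 and a port group named Production (both example names); exact option values may differ slightly between ESXi releases:
# Set the vSwitch-level policy: route based on the originating port ID, link status detection
esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --load-balancing=portid --failure-detection=link
# Override the policy for a single port group, if required
esxcli network vswitch standard portgroup policy failover set --portgroup-name=Production --load-balancing=iphash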
How it works…
Route based on the originating virtual switch port ID (default)
In this configuration, load balancing is based on the number of physical network cards and the number of virtual ports used. With this configuration policy, a virtual network card connected to a vSwitch port will always use the same physical network card. If a physical network card fails, the virtual network card is redirected to another physical network card.
You typically do not see the individual ports on a vSwitch. However, each vNIC that gets connected to a vSwitch is implicitly using a particular port on the vSwitch. (It’s just that there’s no reason to ever configure which port, because that is always done automatically.)
It does a reasonable job of balancing traffic across the egress uplinks of an ESXi host, as long as all the virtual machines using these uplinks have similar usage patterns.
It is important to note that port allocation occurs only when a VM is started or when a failover occurs. Balancing is based on a port's occupation rate at the time the VM starts up, which means the pNIC used by a VM is determined when the VM powers on, according to which vSwitch ports are occupied at that moment. For example, if you started 20 VMs in a row on a vSwitch with two pNICs, the odd-numbered VMs would use the left pNIC and the even-numbered VMs would use the right pNIC, and that assignment would persist: even if you then shut down all the even-numbered VMs, the left pNIC would carry all the remaining VMs while the right pNIC carried none. It can also happen that two heavily loaded VMs end up on the same pNIC, in which case the load is not balanced.
This is the simplest of the policies, and we generally favor the simplest option because it keeps operations simple.
When considering this policy, it is important to understand that if, for example, a team is built from two 1 Gbps cards and one VM consumes more than a single card's capacity, a performance problem arises: traffic above 1 Gbps will not go through the other card, and the VMs sharing the same uplink as the VM consuming all the resources are affected. Likewise, if two VMs each want to use 600 Mbps and both happen to land on the first pNIC, that pNIC cannot meet the combined 1.2 Gbps demand no matter how idle the second pNIC is.
Route based on source MAC hash
This policy works like the default policy, but the uplink is chosen from a hash of the source MAC address rather than from the virtual port ID. Several vNICs can still end up on the same physical uplink, depending on how their MAC hashes resolve.
With the MAC hash, VMware assigns uplinks in a different way. The choice is not based on the dynamically assigned vSwitch port (after a power off and power on, a VM usually gets a different port), but on the fixed MAC address. As a result, a VM is always assigned to the same physical NIC as long as the configuration does not change, whereas with the port ID policy the VM could get a different pNIC after a reboot or a vMotion.
If you have two ESXi Servers with the same configuration, the VM will stay on the same pNIC number even after a vMotion. But again, one pNIC may be congested while the others sit idle, so there is no real load balancing.
Route based on IP hash
The limitation of the two previously discussed policies is that a given virtual NIC always uses the same physical network card for all its traffic. IP hash-based load balancing uses the source and destination IP addresses to determine which physical network card to use. With this algorithm, a VM can communicate through several different physical network cards, depending on the destination. This option requires the physical switch ports to be configured as an EtherChannel. Because the physical switch is configured to match, this is the only policy that also provides inbound load distribution, although that distribution is not necessarily balanced.
There are some limitations and reasons why this policy is not commonly used. These reasons are described as follows:
- The route based on IP hash load balancing option involves added complexity and configuration support from the upstream switches. Link Aggregation Control Protocol (LACP) or a static EtherChannel is required for this algorithm to be used; note that LACP is not supported on the vSphere Standard Switch, so a static EtherChannel must be used there.
- For IP hash to be an effective algorithm for load balancing there must be many IP sources and destinations. This is not a common practice for IP storage networks, where a single VMkernel port is used to access a single IP address on a storage device.
A given vNIC will also always send its traffic to a particular destination (for example, google.com) through the same pNIC, though traffic to another destination (for example, bing.com) might go through another pNIC.
So, in a nutshell, due to the added complexity, the upstream dependency on advanced switch configuration, and the management overhead, this configuration is rarely used in production environments. The main reason is that if you use IP hash, the physical switch must be configured with LACP or EtherChannel, and conversely, if you use LACP or EtherChannel, the load balancing algorithm must be IP hash. This is because with link aggregation, inbound traffic for a VM can arrive on either pNIC and the vSwitch must be ready to deliver it to the VM; only IP hash will do that (the other policies drop inbound traffic for a VM when it arrives on a pNIC that the VM does not use).
There are only two failover detection options:
Link status only
The link status option enables the detection of failures related to the physical network cables and switch. Be aware, however, that configuration issues are not detected. This option also cannot detect link state problems on upstream switches; it works only with the switch that is the first hop from the host.
Beacon probing
The beacon probing option allows the detection of failures unseen by the link status option by sending Ethernet broadcast frames through all the network cards. These frames allow the vSwitch to detect faulty configurations or upstream switch failures and to force a failover if ports are blocked. When using an inverted-U physical network topology in conjunction with a dual-NIC server, it is recommended to enable link state tracking or a similar network feature in order to avoid traffic black holes. According to VMware's best practices, you should have at least three cards before activating this functionality. If IP hash is going to be used, however, beacon probing should not be used for network failure detection, in order to avoid an ambiguous state caused by the limitation that a packet cannot hairpin on the port on which it is received. Beacon probing works by sending out and listening to beacon probes from the NICs in a team: if there are two NICs, each NIC sends out a probe and the other NIC receives it. Because an EtherChannel is considered one link, this does not function properly, as the NIC uplinks are not logically separate uplinks. If beacon probing is used in this situation, it can result in MAC address flapping errors and interrupted network connectivity.
Designing a network for load balancing and failover for vSphere Distributed Switch
The load balancing and failover policies chosen for the infrastructure can have an impact on the overall design. Using NIC teaming, we can group several physical network adapters attached to a vSwitch. This grouping enables load balancing between the different physical NICs and provides fault tolerance if a card failure occurs.
The vSphere Distributed Switch offers a load balancing option that actually takes the network workload into account when choosing the physical uplink: route based on physical NIC load, also called Load Based Teaming (LBT). We recommend this load balancing option over the others when using a distributed switch. The benefits of using this policy are as follows:
- It is the only load balancing option that actually considers NIC load when choosing uplinks.
- It does not require upstream switch configuration dependencies like the route based on IP hash algorithm does.
- When the route based on physical NIC load is combined with the network I/O control, a truly dynamic traffic distribution is achieved.
Getting ready
To step through this recipe, you will need one or more running ESXi Servers, a vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.
How to do it…
To change the load balancing policy and select the right one for your environment, and also select the appropriate failover policy, follow these steps:
- Open up your VMware vSphere Client.
- Log in to the vCenter Server.
- Navigate to Networking on the home screen.
- Navigate to a distributed port group, right-click it, and select Edit Settings.
- Click on the Teaming and Failover section.
- From the Load Balancing drop-down menu, select Route Based on physical NIC load as the load balancing policy.
- Choose the appropriate network failover detection policy from the drop-down menu.
- Click on OK and your settings will take effect. You can then verify the host-side view of the distributed switch as shown below.
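While the teaming policy of a distributed switch is configured centrally in vCenter, you can confirm what the host actually sees. A minimal sketch from the ESXi shell:
# Show the distributed switches this host participates in, with their uplinks and ports
esxcli network vswitch dvs vmware list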
How it works…
Load based teaming, also known as route based on physical NIC load, maps vNICs to pNICs and remaps a vNIC to pNIC affiliation if the load on a pNIC exceeds specific thresholds. LBT uses the originating port ID algorithm for the initial port assignment, which results in the first vNIC being affiliated to the first pNIC, the second vNIC to the second pNIC, and so on. After this initial placement when the VM is powered on, LBT examines both the inbound and outbound traffic on each of the pNICs and redistributes the load if there is congestion.
LBT flags congestion when the average utilization of a pNIC exceeds 75 percent over a period of 30 seconds; the 30-second interval is used to avoid MAC address flapping issues. You should also enable PortFast on the upstream switch ports if STP is in use. VMware recommends LBT over IP hash when you use the vSphere Distributed Switch, as it does not require any special or additional settings in the upstream switch layer, which reduces unnecessary operational complexity. LBT maps a vNIC to a pNIC and then distributes the load across all the available uplinks, unlike IP hash, which maps vNICs to pNICs without considering the actual load. With IP hash it may therefore happen that while a VM with high network I/O is sending traffic through pNIC0, another VM is mapped to the same pNIC and sends its traffic there as well.
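If you want to watch LBT in action, you can check which uplink a VM's port is currently using and check again after a pNIC becomes busy. A minimal sketch; the world ID 69632 is only a placeholder, so substitute the value reported for your VM:
# List the running VMs with their networking world IDs
esxcli network vm list
# Show the ports of one VM; the Team Uplink field shows the pNIC currently in use
esxcli network vm port list --world-id 69632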
What to know when offloading checksum
VMware takes advantage of many of the performance features of modern network adapters.
In this section, we are going to talk about two of them:
- TCP checksum offload
- TCP segmentation offload
Getting ready
To step through this recipe, you will need a running ESXi Server and an SSH client (such as PuTTY). No other prerequisites are required.
How to do it…
The list of network adapter features that are enabled on your NIC can be found in the file /etc/vmware/esx.conf on your ESXi Server. Look for the lines that start with /net/vswitch.
However, do not change the default NIC’s driver settings unless you have a valid reason to do so. A good practice is to follow any configuration recommendations that are specified by the hardware vendor. Carry out the following steps in order to check the settings:
- Open up your SSH Client and connect to your ESXi host.
- Open the file /etc/vmware/esx.conf.
- Look for the lines that start with /net/vswitch; a grep one-liner that does the same is shown after these steps.
- Your output should look like the following screenshot:
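If you prefer to pull only the relevant entries rather than scrolling through the whole file, a minimal sketch:
# Print only the vSwitch-related entries from the host configuration file
grep "^/net/vswitch" /etc/vmware/esx.conf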
How it works…
A TCP message must be broken down into Ethernet frames. The size of each frame is the maximum transmission unit (MTU); the default MTU is 1500 bytes. The process of breaking messages into frames is called segmentation.
Modern NIC adapters have the ability to perform checksum calculations natively. TCP checksums are used to detect errors in transmitted or received network packets. These calculations are traditionally performed by the host's CPU. By offloading them to the network adapter, the CPU is freed up to perform other tasks, and the system as a whole runs better. TCP segmentation offload (TSO) allows the TCP/IP stack of the guest OS inside the VM to emit large frames (up to 64 KB) even though the MTU of the interface is smaller.
Earlier operating systems used the CPU to perform segmentation. Modern NICs try to optimize TCP segmentation by using a larger segment size and by offloading work from the CPU to the NIC hardware. ESXi utilizes this concept to provide a virtual NIC with TSO support without requiring specialized network hardware. A quick way to check the host-level TSO settings is shown after the following list.
- With TSO, instead of processing many small MTU frames during transmission, the system can send fewer, larger virtual MTU frames.
- TSO improves performance for the TCP network traffic coming from a virtual machine and for network traffic sent out of the server.
- TSO is supported at the virtual machine level and in the VMkernel TCP/IP stack.
- TSO is enabled on the VMkernel interface by default. If TSO becomes disabled for a particular VMkernel interface, the only way to enable TSO is to delete that VMkernel interface and recreate it with TSO enabled.
- TSO is used in the guest when the VMXNET 2 (or later) network adapter is installed. To enable TSO at the virtual machine level, you must replace the existing VMXNET or flexible virtual network adapter with a VMXNET 2 (or later) adapter. This replacement might result in a change in the MAC address of the virtual network adapter.
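If you want to confirm these defaults on a particular host, the ESXi shell can report them. A minimal sketch; the per-NIC query is only available on newer ESXi releases:
# Check whether hardware TSO is enabled at the host level (1 = enabled)
esxcli system settings advanced list -o /Net/UseHwTSO
# On newer ESXi releases, query TSO support and status per physical NIC
esxcli network nic tso get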
Selecting the correct virtual network adapter
When you configure a virtual machine, you can add NICs and specify the adapter type. The types of network adapters that are available depend on the following factors:
- The virtual machine hardware version, which depends on the host that created it or most recently updated it.
- Whether or not the virtual machine has been updated to the latest version for the current host.
- The guest operating system.
The following virtual NIC types are supported:
- Vlance
- VMXNET
- Flexible
- E1000
- Enhanced VMXNET (VMXNET 2)
- VMXNET 3
If you want to know more about these network adapter types, refer to the following KB article:
http://kb.vmware.com/kb/1001805
Getting ready
To step through this recipe, you will need one or more running ESXi Servers, a vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.
How to do it…
There are two ways to choose a particular virtual network adapter: one is while you create a new VM, and the other is while adding a new network adapter to an existing VM.
To choose a network adapter while creating a new VM, follow these steps:
- Open vSphere Client.
- Log in to the vCenter Server.
- Click on the File menu and navigate to New | Virtual Machine.
- Go through the wizard until you reach the step where you create the network connections. Here you need to choose how many network adapters you need, which port group you want each of them to connect to, and an adapter type.
To choose an adapter type while adding a new network interface to an existing VM, follow these steps:
- Open vSphere Client.
- Log in to the vCenter Server.
- Navigate to VMs and Templates on your home screen.
- Select the existing VM where you want to add a new network adapter, right-click it, and select Edit Settings.
- Click on the Add button.
- Select Ethernet Adapter.
- Select the adapter type and the network where you want this adapter to connect.
- Click on Next and then click on Finish.
How it works…
Among all the supported virtual network adapter types, VMXNET is the paravirtualized device driver for virtual networking. The VMXNET driver implements an idealized network interface that passes network traffic between the virtual machine and the physical cards with minimal overhead. The three versions of VMXNET are VMXNET, VMXNET 2 (Enhanced VMXNET), and VMXNET 3.
The VMXNET driver improves the performance through a number of optimizations as follows:
- Shares a ring buffer between the virtual machine and the VMkernel and uses zero copy, which saves CPU cycles by reducing the internal copy operations between buffers.
- Takes advantage of transmission packet coalescing to reduce address space switching.
- Batches packets and issues a single interrupt, rather than issuing multiple interrupts. This improves efficiency, but in some cases with slow packet-sending rates, it could hurt throughput while waiting to get enough packets to actually send.
- Offloads TCP checksum calculation to the network hardware rather than using the CPU resources of the virtual machine monitor.
In general, use VMXNET 3 if you can, or the most recent model available to you, and install VMware Tools where possible. For certain unusual types of network traffic the generally best model is not always optimal; if you see poor network performance, experiment with the other vNIC types to see which performs best. A quick way to confirm which driver and offload features a guest is actually using is shown below.
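From inside a Linux guest, you can confirm the adapter driver and offload features with standard tools. A minimal sketch; eth0 is only an example interface name, so substitute the name reported by your guest:
# Confirm that the vNIC is using the vmxnet3 driver
ethtool -i eth0
# Show which offload features (checksum offload, TSO, and so on) are currently enabled
ethtool -k eth0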