In this article by Christian Stankowic, author of the book vSphere High Performance Essentials, we look at cluster setups, where the Distributed Resource Scheduler (DRS) can assist you with automatically balancing CPU load, and Storage DRS with balancing storage load. DRS monitors the ESXi hosts in a cluster and migrates the running VMs using vMotion, primarily to ensure that all the VMs get the resources they need. Secondarily, it tries to balance the cluster. In addition to this, Storage DRS monitors the shared storage for information about latency and capacity consumption. If Storage DRS recognizes the potential to optimize storage resources, it will make use of Storage vMotion to balance the load. We will cover Storage DRS in detail later.
How DRS works
DRS primarily uses two metrics to determine the cluster balance:
- Active host CPU: This includes the usage (CPU task time in ms) and ready (wait time in ms for VMs to get scheduled on physical cores) metrics.
- Active host memory: This describes the number of memory pages that are predicted to have changed in the last 20 seconds. A statistical sampling algorithm estimates this amount; however, it is quite inaccurate.
Active host memory is often used for capacity planning. Be careful with using this value as an indicator, as it only describes how aggressively a workload changes its memory. Depending on your application architecture, it may not reflect how much memory a particular VM really needs. Think about applications that allocate a lot of memory for caching purposes: using the active host memory metric for capacity planning might lead to inappropriate settings.
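To illustrate why this metric is only an estimate, here is a minimal, hypothetical Python sketch of sampling-based active memory estimation: a small random subset of a VM's pages is checked for recent changes, and the result is extrapolated to the whole VM. The page counts and sampling rate are invented for illustration; this is not VMware's actual implementation.

```python
import random

def estimate_active_pages(touched, total_pages, sample_size, seed=42):
    """Estimate active memory by sampling: check a random subset of
    pages for recent changes and extrapolate to the whole VM."""
    rng = random.Random(seed)
    sample = rng.sample(range(total_pages), sample_size)
    touched_in_sample = sum(1 for p in sample if p in touched)
    return round(touched_in_sample / sample_size * total_pages)

# A cache-heavy VM: 100,000 allocated pages, only 5,000 recently touched.
total = 100_000
touched = set(range(5_000))
estimate = estimate_active_pages(touched, total, sample_size=400)
print(estimate)  # roughly 5,000 -- far below the 100,000 allocated pages
```

Note how a VM that has allocated far more memory than it actively touches (for example, a caching application) reports a low active memory value, which is exactly why the metric is a poor basis for capacity planning.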
The migration threshold controls DRS's aggressiveness and defines how much imbalance is tolerated in the cluster. Refer to the following table for a detailed explanation:

| Migration threshold | Behavior |
| --- | --- |
| Level 1 (conservative) | Only affinity/anti-affinity constraints are applied |
| Level 2 | Recommendations addressing significant improvements are also applied |
| Level 3 (default) | Recommendations that, at least, promise good improvements are applied |
| Level 4 | Recommendations that promise only a moderate improvement are also applied |
| Level 5 (aggressive) | Recommendations addressing even small improvements are applied |
Apart from the migration threshold, two other metrics—Target Host Load Standard Deviation (THLSD) and Current host load standard deviation (CHLSD)—are calculated.
THLSD defines how much a cluster node's load can differ from the others' while the cluster is still considered balanced. The migration threshold and the particular ESXi host's active CPU and memory values heavily influence this metric. CHLSD measures whether the cluster is currently balanced. If this value exceeds the THLSD, the cluster is imbalanced and DRS will calculate recommendations in order to balance it.
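As a rough sketch (with made-up load figures, not DRS's exact formula), the balance check can be thought of as comparing the standard deviation of per-host load ratios against a target threshold:

```python
import statistics

def cluster_balanced(host_loads, thlsd):
    """Return (chlsd, balanced): the current host load standard
    deviation and whether it is within the target threshold."""
    chlsd = statistics.pstdev(host_loads)
    return chlsd, chlsd <= thlsd

# Per-host load ratios (consumed resources / available capacity).
loads = [0.55, 0.60, 0.58, 0.57]
chlsd, balanced = cluster_balanced(loads, thlsd=0.05)
print(round(chlsd, 4), balanced)    # small deviation -> balanced

skewed = [0.20, 0.90, 0.25, 0.85]   # heavily skewed cluster
chlsd2, balanced2 = cluster_balanced(skewed, thlsd=0.05)
print(round(chlsd2, 4), balanced2)  # large deviation -> imbalanced
```

A more aggressive migration threshold corresponds to a smaller THLSD, so smaller deviations already trigger migration recommendations.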
In addition to this, DRS also calculates the vMotion overhead that the migration requires. If a migration's overhead is deemed higher than its benefit, the vMotion will not be executed. DRS also evaluates migration recommendations multiple times in order to avoid ping-pong migrations.
By default, once enabled, DRS is polled every five minutes (300 seconds). Depending on your landscape, you might need to change this behavior. To do so, alter the vpxd.cfg configuration file on the vCenter Server machine. Search for the following lines and alter the period (in seconds):
```xml
<config>
  <drm>
    <pollPeriodSec>300</pollPeriodSec>
  </drm>
</config>
```
Refer to the following table for configuration file location, depending on your vCenter implementation:
| vCenter Server type | Configuration file location |
| --- | --- |
| vCenter Server (Windows) | %ALLUSERSPROFILE%\VMware\VMware VirtualCenter\vpxd.cfg |
| vCenter Server Appliance | /etc/vmware-vpx/vpxd.cfg |
Checklist – performance tuning
There are a couple of things to consider when optimizing DRS for high-performance setups:
- Make sure to use hosts with homogeneous CPU and memory configurations. Heterogeneous nodes make DRS less effective.
- Use at least a 1 Gbps network connection for vMotion. For better performance, 10 Gbps is recommended.
- Do not oversize your virtual machines. Only configure as many CPU and memory resources as the workload needs; migrating workloads with unneeded resources takes more time.
- Make sure not to exceed the ESXi host and cluster limits that are mentioned in the VMware vSphere Configuration Maximums document.
For vSphere 5.5, refer to https://www.vmware.com/pdf/vsphere5/r55/vsphere-55-configuration-maximums.pdf.
For vSphere 6.0, refer to https://www.vmware.com/pdf/vsphere6/r60/vsphere-60-configuration-maximums.pdf.
To configure DRS for your cluster, proceed with the following steps:
- Select your cluster from the inventory tab and click Manage and Settings.
- Under Services, select vSphere DRS. Click Edit.
- Select whether DRS should act in the Partially Automated or Fully Automated mode. In partially automated mode, DRS will place VMs on appropriate hosts when they are powered on; however, it will not migrate the running workloads. In fully automated mode, DRS will also migrate the running workloads in order to balance the cluster load. The Manual mode only gives you recommendations, and the administrator can select which recommendations to apply. To create resource pools at the cluster level, you will need to have at least the manual mode enabled.
- Select the DRS aggressiveness. Refer to the preceding table for a short explanation.
Using more aggressive DRS levels is only recommended for homogeneous CPU and memory setups!
When opening VMware support calls regarding DRS issues, a DRS dump file called drmdump is important. This file contains various metrics that DRS uses to calculate the possible migration benefits. On the vCenter Server Appliance, this file is located in /var/log/vmware/vpx/drmdump/clusterName. On the Windows variant, the file is located in %ALLUSERSPROFILE%\VMware\VMware VirtualCenter\Logs\drmdump\clusterName.
VMware also offers an online tool called VM Resource and Availability Service (http://hasimulator.vmware.com), which tells you which VMs can be restarted after ESXi host failures. It requires you to upload this metric file in order to give you results. This can be helpful for simulating failure scenarios.
Enhanced vMotion Compatibility
Enhanced vMotion Compatibility (EVC) enables your cluster to migrate workloads between ESXi hosts with different processor generations. Note that it is not possible to migrate workloads between Intel-based and AMD-based servers; EVC only enables migrations across different Intel or different AMD CPU generations. Once enabled, all the ESXi hosts are configured to provide the same set of CPU features. In other words, the features of newer CPU generations are disabled to match those of the older ESXi hosts in the cluster, creating a common baseline.
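Conceptually, the EVC baseline is the feature set common to all hosts; features outside that intersection are masked. A minimal sketch with made-up CPU feature names (a simplification; real EVC works with predefined per-generation baselines, not arbitrary intersections):

```python
def evc_baseline(host_features):
    """Compute the common CPU feature baseline: the intersection of
    all hosts' feature sets. Features outside it are masked by EVC."""
    return set.intersection(*map(set, host_features))

hosts = [
    {"sse4.2", "aes", "avx"},          # older host
    {"sse4.2", "aes", "avx", "avx2"},  # newer host
]
baseline = evc_baseline(hosts)
print(sorted(baseline))  # ['aes', 'avx', 'sse4.2'] -- avx2 is masked
```

This is also why EVC conflicts with high-performance goals: the newer host loses its avx2 capability for all VMs in the cluster.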
To enable EVC, perform the following steps:
- Select the affected cluster from the inventory tab.
- Click on Manage, Settings, VMware EVC, and Edit.
- Choose Enable EVC for AMD Hosts or Enable EVC for Intel Hosts. Select the appropriate CPU generation for the cluster (the oldest).
- Make sure that Compatibility acknowledges your configuration and save the changes.
As mixing older hosts in high-performance clusters is not recommended, you should also avoid using EVC.
To sum it up, keep the following steps in mind when planning the use of DRS:
- Enable DRS if you plan to have automatic load balancing; this is highly recommended for high-performance setups. Adjust the DRS aggressiveness level to match your requirements. Too aggressive migration thresholds may result in too many migrations; therefore, play with this setting to find the best fit for you.
- Make sure to have a separate vMotion network. Using the same logical network components as for VM traffic is not recommended and might result in poor workload performance.
- Don't overload ESXi hosts; spare some CPU resources for vMotion processes in order to avoid performance bottlenecks during migrations.
- In high-performance setups, mixing various CPU and memory configurations is not recommended to achieve better performance. Try not to use EVC.
- Also, keep license constraints in mind when configuring DRS. Some software products might require additional licenses if they run on multiple servers. We will focus on this later.
Affinity and anti-affinity rules
Sometimes, it is necessary to separate workloads or keep them together. To name some examples, think about classical multi-tier applications such as the following:
- Frontend layer
- Database layer
- Backend layer
One possibility would be to separate the participating VMs across multiple ESXi hosts to increase resilience: if a single ESXi host serving all the workloads crashes, all application components are affected by this fault. On the other hand, moving all the participating application VMs to one single ESXi host can result in higher performance, as network traffic does not need to leave the ESXi host.
However, there are more use cases for creating affinity and anti-affinity rules, such as the following:
- Dividing production, development, and test workloads. For example, it would be possible to separate production from development and test workloads. This is a common procedure that many application vendors require.
- Licensing reasons (for example, license bound to the USB dongle, per core licensing, software assurance denying vMotion, and so on.)
- Application incompatibility (for example, applications that need to run on separate hosts).
As VMware vSphere has no knowledge of the license conditions of the workloads running virtualized, it is very important to check your software vendors' license agreements. You, as the virtual infrastructure administrator, are responsible for ensuring that your software is fully licensed. Some software vendors require special licenses when their products run virtualized or on multiple hosts.
There are two kinds of affinity/anti-affinity rules: VM-Host (relationships between VMs and ESXi hosts) and VM-VM (relationships between particular VMs). Each rule consists of at least one VM and host DRS group, and these groups contain at least one entry each.

Every rule has a designation, where the administrator can choose between must and should. Implementing a rule with the should designation results in a preference for hosts satisfying all the configured rules; if no applicable host is found, the VM is put on another host in order to ensure that the workload at least keeps running. If the must designation is selected, a VM only runs on hosts that satisfy the configured rules; if no applicable host is found, the VM cannot be moved or started. This approach is strict and requires thorough testing in order to avoid unplanned effects.

DRS rules are combined rather than ranked. Therefore, if multiple rules are defined for a particular VM/host or VM/VM combination, a power-on is only granted if all the rules allow the requested action. If two rules conflict for a particular VM/host or VM/VM combination, the first rule is chosen and the other rule is automatically disabled. The use of must rules, in particular, should be evaluated very carefully, as HA might not restart some workloads if these rules cannot be satisfied after a host crash.
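The must/should semantics can be sketched as a two-pass host filter. This is an illustrative simplification with hypothetical host names, not vSphere's actual placement logic: must rules hard-filter the candidate hosts, while should rules only express a preference with a fallback.

```python
def place_vm(hosts, must_hosts=None, should_hosts=None):
    """Pick a host for a VM under VM-Host rules.

    must_hosts: hosts the VM may run on (hard constraint); if no
    candidate remains, placement fails (returns None).
    should_hosts: preferred hosts; fall back to any remaining host
    so the workload at least keeps running.
    """
    candidates = list(hosts)
    if must_hosts is not None:
        candidates = [h for h in candidates if h in must_hosts]
        if not candidates:
            return None  # strict rule: VM cannot be started or moved
    if should_hosts:
        preferred = [h for h in candidates if h in should_hosts]
        if preferred:
            return preferred[0]
    return candidates[0] if candidates else None

cluster = ["esxi01", "esxi02", "esxi03"]
print(place_vm(cluster, should_hosts={"esxi02"}))  # esxi02 (preferred)
print(place_vm(cluster, must_hosts={"esxi04"}))    # None (strict, no match)
print(place_vm(cluster, should_hosts={"esxi04"}))  # esxi01 (fallback)
```

The second call illustrates the HA caveat: with a must rule and no satisfying host, the VM simply cannot be placed.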
Configuring affinity/anti-affinity rules
In this section, we will have a look at two use cases to which affinity/anti-affinity rules can apply.
Example 1: VM-VM relationship
This example consists of two VMs serving a two-tier application: db001 (database VM) and web001 (frontend VM). It is advisable to have both VMs running on the same physical host in order to reduce networking hops to connect the frontend server to its database.
To configure the VM-VM affinity rule, proceed with the following steps:
- Select your cluster from the inventory tab and click Manage and VM/Host Rule underneath Configuration.
- Click Add. Enter a readable rule name (for example, db001-web001-bundle) and select Enable rule.
- Select the Keep Virtual Machines Together type and select the affected VMs.
- Click OK to save the rule.
When migrating one of the virtual machines using vMotion, the other VM will also migrate.
Example 2: VM-Host relationship
In this example, a VM (vcsa) is pinned to a particular ESXi host of a two-node cluster designated for production workloads.
To configure the VM-Host affinity rule, proceed with the following steps:
- Select your cluster from the inventory tab and click Manage and VM/Host Groups underneath Configuration.
- Click Add. Enter a group name for the VM; make sure to select the VM Group type. Also, click Add to add the affected VM.
- Click Add once again. Enter a group name for the ESXi host; make sure to select the Host Group type. Later, click Add to add the ESXi host.
- Select VM/Host Rule underneath Configuration and click Add. Enter a readable rule name (for example, vcsa-to-esxi02) and select Enable rule.
- Select the Virtual Machines to Hosts type and select the previously created VM and host groups.
- Make sure to select Must run on hosts in group or Should run on hosts in group before clicking OK.
- Migrating the virtual machine to another host will fail with the following error message if Must run on hosts in group was selected earlier:
Keep the following in mind when designing affinity and anti-affinity rules:
- Enable DRS.
- Double-check your software vendor’s licensing agreements.
- Make sure to test your affinity/anti-affinity rules by simulating vMotion processes. Also, simulate host failures by using maintenance mode to ensure that your rules are working as expected. Note that the created rules also apply to HA and DPM.
- KISS – Keep it simple, stupid. Try to avoid utilizing too many or multiple rules for one VM/host combination.
Distributed power management
High-performance setups are often the opposite of efficient, green infrastructures; however, high-performing virtual infrastructure setups can be efficient as well. Distributed Power Management (DPM) can help you reduce the power consumption and costs of your virtual infrastructure. It is part of DRS and monitors the CPU and memory usage of all workloads running in the cluster. If it is possible to run all VMs on fewer hosts, DPM will put one or more ESXi hosts in standby mode (they will be powered off) after migrating the VMs using vMotion.
By default, DPM tries to keep the CPU and memory usage of all cluster nodes between 45% and 81%. If this range is exceeded, hosts will be powered on or off. Setting two advanced parameters changes this behavior:
- DemandCapacityRatioTarget: Utilization target for the ESXi hosts (default: 63%)
- DemandCapacityRatioToleranceHost: Utilization range around the target utilization (default: 18%)
The range is calculated as follows: (DemandCapacityRatioTarget - DemandCapacityRatioToleranceHost) to (DemandCapacityRatioTarget + DemandCapacityRatioToleranceHost). With the default values, this gives (63% - 18%) to (63% + 18%), that is, 45% to 81%.
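The default range can be verified with a quick calculation:

```python
def dpm_target_range(target=63, tolerance=18):
    """DPM keeps per-host CPU/memory utilization within
    (target - tolerance) to (target + tolerance) percent."""
    return target - tolerance, target + tolerance

low, high = dpm_target_range()
print(f"{low}% to {high}%")  # 45% to 81%
```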
To control a server’s power state, DPM makes use of these three protocols in the following order:
- Intelligent Platform Management Interface (IPMI)
- Hewlett Packard Integrated Lights-Out (HP iLO)
- Wake-on-LAN (WoL)
To enable IPMI/HP iLO management, you will need to configure the Baseboard Management Controller (BMC) IP address and other access information. To configure them, follow the given steps:
- Log in to vSphere Web Client and select the host that you want to configure for power management.
- Click on Configuration and select the Power Management tab.
- Select Properties and enter an IP address, MAC address, username, and password for the server's BMC. Note that entering hostnames will not work.
To enable DPM for a cluster, perform the following steps:
- Select the cluster from the inventory tab and select Manage.
- From the Services tab, select vSphere DRS and click Edit.
- Expand the Power Management tab and select Manual or Automatic. Also, select the threshold DPM uses to make power decisions. The higher the value, the more readily DPM will put ESXi hosts in standby mode.
It is also possible to disable DPM for a particular host (for example, the strongest in your cluster). To do so, select the cluster and select Manage and Host Options. Check the host and click Edit. Make sure to select Disabled for the Power Management option.
Consider giving a thought to the following when planning to utilize DPM:
- Make sure your servers have a supported BMC, such as HP iLO or IPMI.
- Evaluate the right DPM threshold. Also, keep your server’s boot time (including firmware initialization) in mind and test your configuration before running in production.
- Keep in mind that DPM also uses active memory and CPU usage for its decisions. Booting VMs might claim all of their memory without actively using much of it. If hosts are powered down while plenty of VMs are booting, this might result in extensive swapping.
In this article, you learned how to implement affinity and anti-affinity rules. You have also learned how to save power while still meeting your workload requirements.