
“Live Migration between hosts in a Hyper-V cluster is very straightforward and requires no specific configuration, apart from type and amount of simultaneous Live Migrations. If you add multiple clusters and standalone Hyper-V hosts into the mix, I strongly advise you to configure Kerberos Constrained Delegation for all hosts and clusters involved.”

Hans Vredevoort – MVP Hyper-V

This article, written by Benedict Berger, the author of Hyper-V Best Practices, will guide you through the installation of Hyper-V clusters and their best practice configuration. After installing the first Hyper-V host, it may become necessary to add another layer of availability to your virtualization services. With Failover Clusters, you gain independence from hardware failures and are protected from planned and unplanned service outages.

This article covers the prerequisites for and the implementation of Failover Clusters.


Preparing for High Availability

Like every project, a High Availability (HA) scenario starts with a planning phase. Virtualization projects often raise the question of additional availability for the first time in an environment. In traditional data centers with physical server systems and local storage, an outage of a hardware component affects only one server hosting one service. The source of the outage can be localized quickly and the affected parts can be replaced in a short amount of time. Server virtualization comes with great benefits, such as improved operating efficiency and reduced hardware dependencies; however, a single component failure can impact many virtualized systems at once. By adding redundant systems, these single points of failure can be avoided.

Planning a HA environment

The most important factor in deciding whether you need a HA environment is your business requirements. You need to find out how often and for how long an IT-related production service can be interrupted, planned or unplanned, without causing a serious problem for your business. Those requirements are defined in the central IT strategy of a business as well as in IT-driven process definitions. They include the Service Level Agreements of critical business services run in the various departments of your company. If those definitions do not exist or are unavailable, talk to the process owners to find out the level of availability needed. High Availability is structured in different classes, measured by the total uptime over a defined timespan, for example, 99.999 percent per year. To put this in perspective, 99.9 percent availability still allows almost nine hours of downtime per year, whereas 99.999 percent allows only about five minutes. Every additional nine adds a huge amount of complexity and money needed to ensure that availability, so take the time to find out the real availability needed by your services and resist the temptation to run every service on multi-redundant, geo-spread cluster systems, as it may not fit the budget.

Be sure to plan for additional capacity in a HA environment, so you can lose hardware components without sacrificing application performance.

Overview of the Failover Cluster

A Hyper-V Failover Cluster consists of two or more Hyper-V Server compute nodes. Technically, it's possible to run a Failover Cluster with just one compute node; however, it will not provide any availability advantages over a standalone host and is typically only used for migration scenarios. A Failover Cluster hosts roles such as Hyper-V virtual machines on its compute nodes. If one node fails due to a hardware problem, it stops responding to cluster heartbeat communication, so the service interruption is detected almost instantly. The virtual machines running on that node are powered off abruptly by the hardware failure of their compute node. The remaining cluster nodes immediately take over these VMs in an unplanned failover process and start them on their own hardware. The virtual machines will be back up and running after a successful boot of their operating systems and applications, in just a few minutes.

Hyper-V Failover Clusters require that all compute nodes have access to a shared storage instance holding the virtual machine configuration data and virtual hard disks. In the case of a planned failover, for example, when patching compute nodes, it's possible to move running virtual machines from one cluster node to another without interrupting the VM. All cluster nodes can run virtual machines at the same time, as long as there is enough failover capacity to run all services when a node goes down. Even though a Hyper-V cluster is still called a Failover Cluster (utilizing the Windows Server Failover Clustering feature), it is indeed capable of running as an Active/Active cluster.

To ensure that all these capabilities of a Failover Cluster actually work, an accurate planning and implementation process is required.

Failover Cluster prerequisites

To successfully implement a Hyper-V Failover Cluster, we need suitable hardware, software, permissions, and network and storage infrastructure as outlined in the following sections.

Hardware

The hardware used in a Failover Cluster environment needs to be validated against the Windows Server Catalog. Microsoft will only support Hyper-V clusters in which all components are certified for Windows Server 2012 R2.

The servers used to run our HA virtual machines should ideally be identical hardware models with identical components. It is possible, and supported, to run servers with different hardware components, for example, different amounts of RAM, in the same cluster; however, due to the higher level of complexity, this is not a best practice.

Special planning considerations are needed to address the CPU requirements of a cluster. To ensure maximum compatibility, all CPUs in a cluster should be exactly the same model. While it's technically possible to mix Intel and AMD CPUs in the same cluster, you will lose core cluster capabilities such as Live Migration due to the different architectures.

Choosing a single vendor for your CPUs is not enough: even with different CPU models from the same vendor, your cluster nodes may support different sets of CPU instruction set extensions. With different instruction sets, Live Migration won't work either. There is a compatibility mode that disables most of the extended instruction sets on all CPUs on all cluster nodes; however, this has a negative impact on performance and should be avoided. A better approach to this problem is to create another cluster from the legacy CPUs, running smaller or non-production workloads without affecting your high-performance production workloads.
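Note that the compatibility mode mentioned above is configured per virtual machine, not per host. The following is a minimal sketch of enabling it with PowerShell; the VM name SRV01 is only an example, and the VM must be powered off to change this setting:

# Enable processor compatibility mode so the VM can Live Migrate across differing CPU generations
Stop-VM -Name SRV01
Set-VMProcessor -VMName SRV01 -CompatibilityForMigrationEnabled $true
Start-VM -Name SRV01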

If you want to extend your cluster after some time, you will often find that the exact same hardware is no longer available for purchase. Choose the current revision of the model or product line you are already using in your cluster and manually compare the CPU instruction sets at http://ark.intel.com/ and http://products.amd.com/, respectively. Choose the current CPU model that best matches the CPU features of your existing cluster and have this design validated by your hardware partner.

Ensure that your servers are equipped with compatible CPUs, the same amount of RAM, and the same network cards and storage controllers.

The network design

Mixing network cards from different vendors in a single server is fine and is a best practice for availability, but make sure all your Hyper-V hosts use an identical hardware setup. A network adapter should be used exclusively for either LAN traffic or storage traffic; do not mix these two types of communication in any basic scenario. There are some more advanced scenarios involving converged networking that can enable mixed traffic, but in most cases, this is not a good idea.

A Hyper-V Failover Cluster requires multiple layers of communication between its nodes and storage systems. Hyper-V networking and storage options have changed dramatically through the different releases of Hyper-V, and with Windows Server 2012 R2, the network design options are nearly endless. In this article, we will work with a typical basic network design. We assume at least six Network Interface Cards (NICs) per server, each with a bandwidth of 1 Gb/s. If you have more than five interface cards available per server, use NIC Teaming to ensure the availability of the network, or even use converged networking. Converged networking will also be your choice if you have fewer than five network adapters available.

  1. The first NIC will be used exclusively for host communication to our Hyper-V host and will not be involved in VM network traffic or cluster communication at any time. It will carry Active Directory and management traffic to our Management OS.
  2. The second NIC will ensure Live Migration of virtual machines between our cluster nodes.
  3. The third NIC will be used for VM traffic. Our virtual machines will be connected to the various production and lab networks through this NIC.
  4. The fourth NIC will be used for internal cluster communication. The first four NICs can either be teamed through Windows Server NIC Teaming or be abstracted from the physical hardware through Windows Server network virtualization and a converged fabric design (a teaming sketch follows at the end of this section).
  5. The fifth NIC will be reserved for storage communication. As advised, we will be isolating storage and production LAN communication from each other. If you do not use iSCSI or SMB3 storage communication, this NIC will not be necessary. If you use Fibre Channel SAN technology, use a FC-HBA instead. If you leverage Direct Attached Storage (DAS), use the appropriate connector for storage communication.
  6. The sixth NIC will also be used for storage communication, for redundancy. The redundancy will be established via MPIO and not via NIC Teaming.

There is no need for a dedicated heartbeat network as in older revisions of Windows Server with Hyper-V. All cluster networks will automatically be used for sending heartbeat signals to the other cluster members.

If you don’t have 1 Gb/s interfaces available, or if you use 10 GbE adapters, it’s best practice to implement a converged networking solution.
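As an illustration of the teaming mentioned in the NIC list above, the following is a minimal sketch of creating a switch-independent team with the built-in Windows Server NIC Teaming cmdlets. The team name and the member adapter names (VMs1, VMs2) are assumptions; replace them with the adapters you want to bundle:

# Team two physical adapters into one logical adapter for VM traffic
New-NetLbfoTeam -Name "VMTeam" -TeamMembers "VMs1", "VMs2" `
  -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic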

Storage design

All cluster nodes must have access to the virtual machines residing on a centrally shared storage medium. This could be a classic setup with a SAN or NAS, or a more modern concept with Windows Scale-Out File Servers hosting the virtual machine files on SMB3 file shares. In this article, we will use a NetApp SAN system that is capable of providing a classic SAN approach with LUNs mapped to our hosts as well as serving SMB3 file shares, but any other SAN validated for Windows Server 2012 R2 will fulfill the requirements.

In our first setup, we will utilize Cluster Shared Volumes (CSVs) to store several virtual machines on the same storage volume. Creating a single volume per virtual machine is no longer advisable because of the massive management overhead. A good rule of thumb is one CSV per cluster node; in larger environments with more than eight hosts, one CSV per two to four cluster nodes. To utilize CSVs, follow these steps:

  1. Ensure that all components (SAN, Firmware, HBAs, and so on) are validated for Windows Server 2012 R2 and are up to date.
  2. Connect your SAN physically to all your Hyper-V hosts via iSCSI or Fibre Channel connections.
  3. Create two LUNs on your SAN for hosting virtual machines. Activate Hyper-V performance options for these LUNs if possible (for example, on a NetApp, by setting the LUN type to Hyper-V). Size the LUNs with enough capacity to host all your virtual hard disks.
  4. Label the LUNs CSV01 and CSV02 with appropriate LUN IDs.
  5. Create another small LUN, 1 GB in size, and label it Quorum.
  6. Make the LUNs available to all Hyper-V hosts in this specific cluster by mapping them on the storage device.
  7. Do not make these LUNs available to any other hosts or cluster.
  8. Prepare storage DSMs and drivers (that is, MPIO) for Hyper-V host installation.
  9. Refresh disk configuration on hosts, install drivers and DSMs, and format volumes as NTFS (quick).
  10. Install Microsoft Multipath IO when using redundant storage paths:
    Install-WindowsFeature -Name Multipath-IO `
      -ComputerName ElanityHV01, ElanityHV02

In this example, I added the MPIO feature to two Hyper-V hosts with the computer names ElanityHV01 and ElanityHV02.
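If you connect to the SAN via iSCSI, MPIO additionally needs to claim the iSCSI devices, and a load-balancing policy should be set. A minimal sketch, assuming iSCSI connectivity and a round-robin policy; check your storage vendor's recommendation before applying it:

# Let MPIO claim all iSCSI-attached devices (a reboot may be required to take effect)
Enable-MSDSMAutomaticClaim -BusType iSCSI
# Use round robin as the default load-balancing policy
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR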

SANs are typically equipped with two storage controllers for redundancy. Make sure to distribute your workloads across both controllers for optimal availability and performance.

If you leverage file servers providing SMB3 shares, the preceding steps do not apply to you. Perform the following steps instead:

  1. Create a storage space with the desired disk types; use storage tiering if possible.
  2. Create a new SMB3 Fileshare for applications.
  3. Customize the permissions to include all Hyper-V servers from the planned cluster as well as the Hyper-V cluster computer object itself with full control (a sketch follows after this list).
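The share and file-system permissions can also be set from PowerShell. A minimal sketch, assuming the host names and cluster name used in this article (ElanityHV01, ElanityHV02, ElanityClu1), a domain called cloud, and an example path of E:\Shares\VMs1; adjust all of these to your environment:

# Create the application share and grant full control to the host and cluster computer accounts
New-SmbShare -Name "VMs1" -Path "E:\Shares\VMs1" `
  -FullAccess 'cloud\ElanityHV01$', 'cloud\ElanityHV02$', 'cloud\ElanityClu1$'
# Mirror the permissions to NTFS for the same computer accounts
icacls E:\Shares\VMs1 /grant 'cloud\ElanityHV01$:(OI)(CI)F' 'cloud\ElanityHV02$:(OI)(CI)F' 'cloud\ElanityClu1$:(OI)(CI)F'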

Server and software requirements

To create a Failover Cluster, you need to install a second Hyper-V host. Use the same unattended file but change the IP address and the hostname. Join both Hyper-V hosts to your Active Directory domain if you have not done so already. Hyper-V can be clustered without leveraging Active Directory, but it will lack several key capabilities, such as Live Migration, and this should not be done on purpose. The ability to successfully boot a domain-joined Hyper-V cluster without an Active Directory domain controller being present at boot time is the major benefit of the reduced Active Directory dependency of Failover Clusters.

Ensure that you create a Hyper-V virtual switch, as shown earlier, with the same name on both hosts to ensure cluster compatibility, and that both nodes are installed with all updates. A sketch of the virtual switch creation follows below.
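As a minimal sketch, the virtual switch can be created remotely on both hosts in one step. This assumes the adapter dedicated to VM traffic is named VMs on each host, as in the naming scheme used later in this article:

# Create an identically named external switch on both hosts, bound to the "VMs" adapter
Invoke-Command -ComputerName ElanityHV01, ElanityHV02 -ScriptBlock {
    New-VMSwitch -Name "VMs" -NetAdapterName "VMs" -AllowManagementOS $false
}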

If you have System Center 2012 R2 in place, use the System Center Virtual Machine Manager to create a Hyper-V cluster.

Implementing Failover Clusters

After preparing our Hyper-V hosts, we will now create a Failover Cluster using PowerShell. I'm assuming your hosts are installed, storage and network connections are prepared, and the Hyper-V role is already active, with up-to-date drivers and firmware on your hardware.

  1. First, we need to ensure that the server name, date, and time of our hosts are correct. Time and time zone configuration should be done via Group Policy.
  2. For the automatic network configuration later on, it's important to rename the network connections from their defaults to their designated roles using PowerShell, as seen in the following commands:
    Rename-NetAdapter -Name "Ethernet" -NewName "Host"
    Rename-NetAdapter -Name "Ethernet 2" -NewName "LiveMig"
    Rename-NetAdapter -Name "Ethernet 3" -NewName "VMs"
    Rename-NetAdapter -Name "Ethernet 4" -NewName "Cluster"
    Rename-NetAdapter -Name "Ethernet 5" -NewName "Storage"

    The Network Connections window should look like the following screenshot:

    (Screenshot: Hyper-V host Network Connections)

  3. Next, configure the IP settings of the network adapters. If you are not using DHCP for your servers, manually set the IP configuration (using different subnets) for the specified network cards; a minimal sketch follows. Here is a great blog post on how to automate this step: http://bit.ly/Upa5bJ
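    The following sketch configures one adapter; the addresses, prefix length, and DNS server are placeholders and must be adjusted to your subnets:

    # Example static configuration for the Host adapter (values are placeholders)
    New-NetIPAddress -InterfaceAlias "Host" -IPAddress 192.168.1.51 -PrefixLength 24 -DefaultGateway 192.168.1.1
    Set-DnsClientServerAddress -InterfaceAlias "Host" -ServerAddresses 192.168.1.10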
  4. Next, we need to activate the necessary Failover Clustering features on both of our Hyper-V hosts:
    Install-WindowsFeature -Name Failover-Clustering `
      -IncludeManagementTools -ComputerName ElanityHV01, ElanityHV02
  5. Before actually creating the cluster, we launch the cluster validation cmdlet via PowerShell:
    Test-Cluster ElanityHV01, ElanityHV02

    (Screenshot: Test-Cluster cmdlet)

    Open the generated .mht file for more details, as shown in the following screenshot:

    (Screenshot: Cluster validation)

As you can see, there are some warnings that should be investigated. As long as there are no errors, the configuration is ready for clustering and fully supported by Microsoft; still, review the warnings to be sure you won't run into problems in the long run. After you have fixed any errors and addressed the warnings listed in the Cluster Validation Report, you can finally create the cluster as follows:

New-Cluster -Name "CN=ElanityClu1,OU=Servers,DC=cloud,DC=local" `
  -Node ElanityHV01, ElanityHV02 `
  -StaticAddress 192.168.1.49

This will create a new cluster named ElanityClu1 consisting of the nodes ElanityHV01 and ElanityHV02 and using the cluster IP address 192.168.1.49.

This cmdlet will create the cluster and the corresponding Active Directory Object in the specified OU. Moving the cluster object to a different OU later on is no problem at all; even renaming is possible when done the right way.
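With the cluster created, the LUNs prepared earlier (CSV01 and CSV02) can be promoted to Cluster Shared Volumes. A minimal sketch; note that New-Cluster may already have added the eligible disks automatically, and the cluster disk resource names on your system may differ from the examples used here:

# Add any disks that are visible to all nodes but not yet part of the cluster
Get-ClusterAvailableDisk -Cluster ElanityClu1 | Add-ClusterDisk
# Promote the two data disks to Cluster Shared Volumes (resource names may differ)
Add-ClusterSharedVolume -Cluster ElanityClu1 -Name "Cluster Disk 1", "Cluster Disk 2"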

After creating the cluster, when you open the Failover Cluster Manager console, you should be able to connect to your cluster:

(Screenshot: Failover Cluster Manager)

You will see that all your cluster nodes and Cluster Core Resources are online. Rerun the cluster validation and copy the generated .mht files to a secure location in case you need them for support queries. Keep in mind that you have to rerun this validation if any hardware or configuration changes occur to the cluster components, including any of its nodes.

The initial cluster setup is now complete and we can continue with post creation tasks.

Summary

With the knowledge from this article, you are now able to design and implement Hyper-V Failover Clusters as well as guest clusters. You are aware of the basic concepts of High Availability and the storage and networking options necessary to achieve it. You have seen real-world, proven configurations to ensure a stable operating environment.
