This article by Simon M.C. Cheng, author of the book Proxmox High Availability, will show you some basic concepts of Proxmox VE before you actually use it, including the technology involved, basic administration, and some of the options available during setup.
The following topics are going to be covered in this article: the basic concepts of virtualization, the virtualization options in Proxmox (OpenVZ and KVM), availability and Proxmox cluster mode, storage options (DRBD, LVM, GlusterFS, and Ceph), fencing devices and network bonding, backup, templates, and troubleshooting the installation.
Have you ever heard about cloud computing? It is a hot topic in the IT industry, and it promises that you can allocate nearly unlimited computing resources on a pay-as-you-go basis. Aren't you curious how providers are able to offer such a service? The underlying technology that allows them to do so is hardware virtualization. Depending on the kind of processor used, there are three different types of virtualization available: full virtualization, para-virtualization, and hardware-assisted virtualization.
We have discussed why we need to learn server virtualization and how virtualization works, so aren't you curious about which major virtualization packages are on the market and what the differences between them are? Let's take a closer look:
There are two types of virtualization available in Proxmox: OpenVZ and KVM.
OpenVZ is an operating-system-level virtualization based on the GNU/Linux kernel and the host operating system. Strictly speaking, OpenVZ is not a type of virtualization but more like the jail concept in Linux. Since a patched Linux kernel is needed, only Linux guests can be created. The guests are called containers; they share the same kernel and architecture as the host OS, while each container keeps a separate user space.
Kernel-based Virtual Machine (KVM) is basically hardware-assisted virtualization using a Linux kernel built with the KVM module. KVM itself does not perform any emulation or virtualization. Instead, it simply exposes the /dev/kvm interface, and QEMU is used as the software-based emulator to simulate hardware for the virtualized environment.
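Because KVM depends on hardware-assisted virtualization, the host CPU must support Intel VT-x or AMD-V. As a quick sanity check on a standard Linux host, a minimal sketch:

    # Count the CPU flags for Intel VT-x (vmx) or AMD-V (svm);
    # a result of 0 means hardware virtualization is unavailable or disabled
    egrep -c '(vmx|svm)' /proc/cpuinfo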
During virtual machine creation, the following virtual disk formats are available: raw disk image (raw), QEMU image format (qcow2), and VMware image format (vmdk).
What does availability mean? Availability is expressed as a percentage of uptime within a year. To calculate it, we subtract the Downtime duration (DD) from the Expected uptime (EU), divide the result by the Expected uptime (EU), and then multiply by 100. Here is the formula:

    Availability (%) = ((EU - DD) / EU) x 100
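For example, a system that is down for 8.76 hours out of an expected 8,760 hours of uptime in a year gives ((8760 - 8.76) / 8760) x 100 = 99.9 percent availability, the well-known "three nines".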
What problems does downtime bring? Let's have a look:
DRBD is short for Distributed Replicated Block Device, and it is intended for use in high availability environments. DRBD provides high availability by mirroring the block device (that is, the disk storage) of the existing system to another machine in real time over the network. So, if the existing system goes out of service, we can quickly switch to the backup system to avoid service interruption.
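To make this concrete, here is a minimal sketch of a DRBD resource definition; the hostnames, disk paths, and IP addresses are examples only:

    # /etc/drbd.d/r0.res: mirror /dev/sdb1 between two nodes
    resource r0 {
        protocol C;                  # synchronous replication
        on node1 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.1:7788;
            meta-disk internal;
        }
        on node2 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.2:7788;
            meta-disk internal;
        }
    }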
Besides high availability, Proxmox cluster mode provides a few more functions, but the most important one is live migration. Unlike a normal migration, in a Proxmox cluster, a migration can be performed without shutting down the virtual machine. This approach is called live migration, and it greatly reduces the downtime of each virtual machine.
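As a rough sketch, a running KVM guest can also be live-migrated from the command line; the VMID and node name below are examples:

    # Move VM 100 to the node named 'node2' while it keeps running
    qm migrate 100 node2 --online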
Are you curious about how multiple configuration files are managed in Proxmox cluster mode? The Proxmox Cluster file system (pmxcfs) is a built-in function that the Proxmox cluster provides to synchronize configuration files between cluster member nodes. It is an essential component of a Proxmox cluster, acting as version control for configuration files, including the cluster configuration, the virtual machine configurations, and so on. It is basically a database-driven file system that stores the configuration files for all host servers and replicates them in real time to all host nodes using corosync. The underlying file system is created with FUSE and currently has a maximum size of 30 MB. Here are the concepts for this file system:
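You can see pmxcfs at work on any cluster node; assuming a standard installation, it appears as a FUSE mount at /etc/pve:

    # pmxcfs is mounted as a FUSE file system at /etc/pve
    mount | grep /etc/pve
    # The output should resemble: /dev/fuse on /etc/pve type fuse (rw,...)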
The following diagram shows the structure of the Proxmox Cluster file system:
Unlike building a local RAID 1 device using the mdadm command, we need to form an LVM volume with a dedicated local disk on multiple servers. LVM is used to simplify the disk management of large hard disks. By adding an abstraction layer, users are able to add or replace their hard disks without downtime when combined with hot swapping. Besides, users are able to add, remove, and resize their LVM volumes, or even create a RAID volume, easily. The structure of LVM is shown as follows:
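In practice, creating such a volume takes three steps: initialize a physical volume, group it, and carve out a logical volume. A minimal sketch, where /dev/sdb and the volume names are examples:

    pvcreate /dev/sdb                        # initialize the disk as a physical volume
    vgcreate vg_storage /dev/sdb             # create a volume group on it
    lvcreate -L 100G -n lv_data vg_storage   # create a 100 GB logical volume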
GlusterFS is a distributed file system that runs in a server-client architecture. It uses the native Gluster protocol, but a volume can also be exposed as an NFS share or even serve as object storage (Amazon S3-like networked key-value storage) with GlusterFS UFO.
Gluster over LVM with iSCSI provides an auto-healing function. With auto healing, a Gluster client can still read and write files even if one Gluster server has failed, which is similar to what RAID 1 offers. Let's check out how the Gluster file system handles a server failure:
Initially, we need to have at least two storage servers installed with the Gluster server package in order to enjoy the auto-healing functionality. On the client side, we have configured Replicate mode and mounted the file system at /glusterfs.
In this mode, the file content will be stored on both storage servers, as follows:
If Storage 1 fails, the Gluster client will redirect its requests to Storage 2.
When Storage 1 becomes available again, the updated content will be synchronized from Storage 2. Therefore, the client will not notice that there was a server failure. This is shown in the following diagram:
Thus, the Gluster file system can provide high availability if we use replication mode. For performance, we can distribute files across more servers, as follows:
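As a minimal sketch of the replicated setup described above (the hostnames, brick paths, and mount point are examples):

    # On one Gluster server: create and start a two-way replicated volume
    gluster volume create vol0 replica 2 gfs1:/bricks/b1 gfs2:/bricks/b2
    gluster volume start vol0

    # On the client: mount the volume with the native FUSE client
    mount -t glusterfs gfs1:/vol0 /glusterfs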
Ceph is also a distributed file system providing petabyte-level storage, but it is more focused on eliminating any single point of failure. To ensure high availability, replicas are created on other storage nodes. Ceph is developed around the concept of RADOS (reliable autonomic distributed object store), with different access methods provided:
Access method | Supported platforms | Usage
Library packages | C, C++, Java, Python, Ruby, PHP | Programming
RADOS gateway | Amazon S3, Swift | Cloud platform
RBD | KVM | Virtualization
Ceph file system | Linux kernel, FUSE | File system
Here is a simple diagram demonstrating the structure of the Ceph file system:
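Of these access methods, RBD is the one Proxmox uses for virtual machine disks. As a hypothetical sketch, a Ceph pool could be attached as storage through an entry in /etc/pve/storage.cfg; the storage name, pool, and monitor addresses below are examples:

    rbd: ceph-storage
        monhost 10.0.0.1 10.0.0.2 10.0.0.3
        pool rbd
        username admin
        content images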
A fencing device, as the name suggests, is a virtual fence that prevents communication between two nodes. It is used to stop a failed node from accessing shared resources. If two nodes access the same shared resources at the same time, a collision occurs, which might corrupt the shared data, that is, the data inside the virtual machines.
It is very important to protect our data from any corruption, so what types of fencing devices are available, and how do they build their fences during a node failure? There are two approaches, as listed below:
The voting system is a democratic system, which means there is one vote for each node. So, if we only have two nodes, neither one can win the race, which causes the racing problem. As a result, we need a third node to join the system (that is, the quorum in our case). Here is an example of why the racing problem appears and how we can fix it:
Assume we have a cluster system with only two nodes; the above diagram shows the initial state of the cluster. We have marked Node 1 as the primary node.
Here, Node 1 is disconnected, so Node 2 would like to take over its position and become the primary node. However, it cannot succeed, because two votes are needed for the role-switching operation. Therefore, the cluster will remain non-operational until Node 1 is recovered, as follows:
When Node 1 recovers from the failure, it tries to rejoin the cluster but fails because the cluster has stopped working. To solve the problem, it is recommended that an extra node joins the cluster in order to create a highly available environment. Here is an example where a node fails and Node 2 would like to become the primary node:
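Once a third node is in place, the cluster can stay quorate through a single node failure. Assuming the standard Proxmox tooling, you can verify the vote count and quorum state from any node:

    # Show cluster membership, expected votes, and whether the cluster is quorate
    pvecm status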
For the network interface, a bonding device (Bond0 and Bond1) will be created in Proxmox. A bonding device, also called NIC teaming, is a native Linux kernel feature that allows users to aggregate network bandwidth or provide network redundancy. There are two options for network redundancy, 802.3ad and Active-backup, and they respond differently when handling multiple sessions.
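As a minimal sketch, an active-backup bond can be declared in /etc/network/interfaces and used as the bridge port for the VM bridge; the interface names and addresses are examples:

    auto bond0
    iface bond0 inet manual
        slaves eth0 eth1
        bond_miimon 100
        bond_mode active-backup

    auto vmbr0
    iface vmbr0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0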
These network redundancy options are explained in the following points:
The following points explain the concepts of the cluster manager:
NOTE: The operation of existing virtual machines without high availability is not affected.
After we have made a copy of the container configurations, we are going to back up the actual data inside the virtual machine. There are two different methods: a manual backup with the vzdump command for both KVM and OpenVZ guests, and a backup via the GUI management console.
There are three different backup approaches available in the vzdump command: stop mode (the guest is shut down for the duration of the backup), suspend mode (the guest is suspended while its data is copied), and snapshot mode (an LVM snapshot is used, so the backup runs with almost no downtime).
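A minimal command-line sketch of the snapshot approach; the VMID and the dump directory are examples:

    # Back up guest 101 in snapshot mode with gzip compression
    vzdump 101 --mode snapshot --compress gzip --dumpdir /mnt/backup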
Apart from manually backing up the container from the command-line interface, we can also do it from the web management interface. Here are the steps to perform a backup with the GUI:
1. Log in to the web management console with the root account information.
2. Browse the left panel to locate the virtual machine to be backed up.
3. Choose the Backup tab in the right panel; you will see only the backup files you created in the previous steps:
4. Then, simply click on the Backup button to open the backup dialog:
Notice that Proxmox packages the backup as a TAR archive and makes use of snapshot mode by default. Therefore, make sure you have enough free space in the volume group that stores the data of your virtual machines before accepting the default values. By default, the volume group used is pve, which is mounted at /var/lib/vz, and you cannot place your dump file in the same volume group.
From the dialog, we can choose whether the backup output file is compressed or not. To conserve disk space, here we choose GZIP as the compression method, and we choose snapshot mode to enjoy a zero-downtime backup process, as follows:
There are two types of templates: one is the OpenVZ template and the other is the VM template.
Here are the steps to download an OpenVZ template:
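Besides the graphical steps, Proxmox also provides the pveam appliance manager for fetching OpenVZ templates from the command line; a sketch, where the template name is an example and varies by release:

    pveam update                 # refresh the list of available appliance templates
    pveam available              # list the templates that can be downloaded
    pveam download local debian-7.0-standard_7.0-2_i386.tar.gz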
Basically, it should not be difficult for you to install a Proxmox server from scratch. However, after performing a few installations on different platforms, I noticed a few scenarios that might get you into trouble. Here are the problems I have found.
Symptom: On some motherboards, you will receive the Undefined video mode number warning after you press Enter to begin the installation. It simply tells you that the fancy installation wizard cannot run, as shown below:
Root cause: The main problem is the display chipset. This error message appears when your motherboard uses a display chipset that is not VESA 2.0 compatible. To learn more about VESA 2.0, refer to the following links:
Solution: When the warning appears, you will be asked to press <ENTER> or <SPACE>, or to wait 30 seconds to continue. If you press <ENTER>, the video modes available on your system will be listed:
You can pick a display mode number from the list shown. Normally, you can choose display mode 314, which gives you an 800 x 600 resolution with 16-bit color depth, or display mode 311, which gives you a 640 x 480 resolution with 16-bit color depth. You should then be able to continue the installation process.
Prevention: I found that this problem usually happens with Nvidia display cards. If possible, try replacing the card with an Intel or ATI display card during your installation.
In this article, we explained the concept of virtualization and compared Proxmox with other virtualization software.