Apache CloudStack Architecture

17 min read

(For more resources related to this topic, see here.)

Introducing cloud

Before embarking on a journey to understand and appreciate CloudStack, let’s revisit the basic concepts of cloud computing and how CloudStack can help us in achieving our private, public, or hybrid cloud objectives.

Let’s start this article with a plain and simple definition of cloud. Cloud is a shared multi-tenant environment built on a highly efficient, highly automated, and preferably virtualized IT infrastructure where IT resources can be provisioned on demand from anywhere over a broad network, and can be metered. Virtualization is the technology that has made the enablement of these features simpler and convenient. A cloud can be deployed in various models; including private, public, community or hybrid clouds. These deployment models can be explained as follows:

  • Private cloud: In this deployment model, the cloud infrastructure is operated solely for an organization and may exist on premise or off premise. It can be managed by the organization or a third-party cloud provider.
  • Public cloud: In this deployment model, the cloud service is provided to the general public or a large industry group, and is owned and managed by the organization providing cloud services.
  • Community cloud: In this deployment model, the cloud is shared by multiple organizations and is supported by a specific community that has shared concerns. It can be managed by the organization or a third party provider, and can exist on premise or off premise.
  • Hybrid cloud: This deployment model comprises two or more types of cloud (public, private, or community) and enables data and application portability between the clouds.

A cloud—be it private, public, or hybrid—has the following essential characteristics:

  • On-demand self service
  • Broad network access
  • Resource pooling
  • Rapid elasticity or expansion
  • Measured service
  • Shared by multiple tenants

Cloud has three possible service models, which means there are three types of cloud services that can be provided. They are:

  • Infrastructure as a service (IaaS): This type of cloud service model provides IT infrastructure resources as a service to the end users. This model provides the end users with the capability to provision processing, storage, networks, and other fundamental computing resources that the customer can use to run arbitrary software including operating systems and applications. The provider manages and controls the underlying cloud infrastructure and the user has control over the operating systems, storage and deployed applications. The user may also have some control over the networking services.
  • Platform as a service (PaaS): In this service model, the end user is provided with a platform that is provisioned over the cloud infrastructure. The provider manages the network, operating system, or storage and the end user has control over the applications and may have control over the hosting environment of the applications.
  • Software as a service (SaaS): This layer provides software as a service to the end users, such as providing an online calculation engine for their end users. The end users can access these software using a thin client interface such as a web browser. The end users do not manage the underlying cloud infrastructure such as network, servers, OS, storage, or even individual application capabilities but may have some control over the application configurations settings.

As depicted in the preceding diagram, the top layers of cloud computing are built upon the layer below it. In this book, we will be mainly dealing with the bottom layer—Infrastructure as a service.

Thus providing Infrastructure as a Service essentially means that the cloud provider assembles the building blocks for providing these services, including the computing resources hardware, networking hardware and storage hardware. These resources are exposed to the consumers through a request management system which in turn is integrated with an automated provisioning layer. The cloud system also needs to meter and bill the customer on various chargeback models. The concept of virtualization enables the provider to leverage and pool resources in a multi-tenant model. Thus, the features provided by virtualization resource pooling, combined with modern clustering infrastructure, enable efficient use IT resources to provide high availability and scalability, increase agility, optimize utilization, and provide a multi-tenancy model.

One can easily get confused about the differences between the cloud and a virtualized Datacenter; well, there are many differences, such as:

  • The cloud is the next stage after the virtualization of datacenters. It is characterized by a service layer over the virtualization layer. Instead of bare computing resources, services are built over the virtualization platforms and provided to the users. Cloud computing provides the request management layer, provisioning layer, metering and billing layers along with security controls and multi-tenancy.
  • Cloud resources are available to consumers on an on demand model wherein the resources can be provisioned and de-provisioned on an as needed basis. Cloud providers typically have huge capacities to serve variable workloads and manage variable demand from customers. Customers can leverage the scaling capabilities provided by cloud providers to scale up or scale down the IT infrastructure needed by the application and the workload. This rapid scaling helps the customer save money by using the capacity only when it is needed.
  • The resource provisioning in the cloud is governed by policies and rules, and the process of provisioning is automated.
  • Metering, Chargeback, and Billing are essential governance characteristics of any cloud environment as they govern and control the usage of precious IT resources.

Thus setting up a cloud is basically building capabilities to provide IT resources as a service in a well-defined manner. Services can be provided to end users in various offerings, depending upon the amount of resources each service offering provides. The amount of resources can be broken down to multiple resources such as the computing capacity, memory, storage, network bandwidth, storage IOPS, and so on. A cloud provider can provide and meter multiple service offerings for the end users to choose from.

Though the cloud provider makes upfront investments in creating the cloud capacity, however from a consumer’s point of view the resources are available on demand on a pay per use model. Thus the customer gets billed for consumption just like in case of electricity or telecom services that individuals use. The billing may be based on hours of compute usage, the amount of storage used, bandwidth consumed, and so on.

Having understood the cloud computing model, let’s look at the architecture of a typical Infrastructure as a Service cloud environment.

Infrastructure layer

The Infrastructure layer is the base layer and comprises of all the hardware resources upon which IT is built upon. These include computing resources, storage resources, network resources, and so on.

Computing resources

Virtualization is provided using a hypervisor that has various functions such as enabling the virtual machines of the hosts to interact with the hardware. The physical servers host the hypervisor layer. The physical server resources are accessed through the hypervisor. The hypervisor layer also enables access to the network and storage. There are various hypervisors on the market such as VMware, Hyper-V, XenServer, and so on. These hypervisors are responsible for making it possible for one physical server to host multiple machines, and for enabling resource pooling and multi tenancy.


Like the Compute capacity, we need storage which is accessible to the Compute layer.

The Storage in cloud environments is pooled just like the Compute and accessed through the virtualization layer. Certain types of services just offer storage as a service where the storage can be programmatically accessed to store and retrieve objects.

Pooled, virtualized storage is enabled through technologies such as Network Attached Storage (NAS) and Storage Area Network (SAN) which helps in allowing the infrastructure to allocate storage on demand that can be based on policy, that is, automated.

The storage provisioning using such technologies helps in providing storage capacity on demand to users and also enables the addition or removal of capacity as per the demand. The cost of storage can be differentiated according to the different levels of performance and classes of storage.

Typically, SAN is used for storage capacity in the cloud where statefulness is required. Direct-attached Storage (DAS) can be used for stateless workloads that can drive down the cost of service. The storage involved in cloud architecture can be redundant and prevent the single point of failure. There can be multiple paths for the access of disk arrays to provide redundancy in case connectivity failures.

The storage arrays can also be configured in a way that there is incremental backup of the allocated storage. The storage should be configured such that health information of the storage units is updated in the system monitoring service, which ensures that the outage and its impact are quickly identified and appropriate action can be taken in order to restore it to its normal state.

Networks and security

Network configuration includes defining the subnets, on-demand allocation of IP addresses, and defining the network routing tables to enable the flow of data in the network. It also includes enabling high availability services such as load balancing. Whereas the security configuration aims to secure the data flowing in the network that includes isolation of data of different tenants among each other and with the management data of cloud using techniques such as network isolation and security groups.

Networking in the cloud is supposed to deal with the isolation of resources between multiple tenants as well as provide tenants with the ability to create isolated components. Network isolation in the cloud can be done using various techniques of network isolation such as VLAN, VXLAN, VCDNI, STT, or other such techniques.

Applications are deployed in a multi-tenant environment and consist of components that are to be kept private, such as a database server which is to be accessed only from selected web servers and any other traffic from any other source is not permitted to access it. This is enabled using network isolation, port filtering, and security groups. These services help with segmenting and protecting various layers of application deployment architecture and also allow isolation of tenants from each other.

The provider can use security domains, layer 3 isolation techniques to group various virtual machines. The access to these domains can be controlled using providers’ port filtering capabilities or by the usage of more stateful packet filtering by implementing context switches or firewall appliances. Using network isolation techniques such as VLAN tagging and security groups allows such configuration. Various levels of virtual switches can be configured in the cloud for providing isolation to the different networks in the cloud environment.

Networking services such as NAT, gateway, VPN, Port forwarding, IPAM systems, and access control management are used in the cloud to provide various networking services and accessibility. Some of these services are explained as follows:

  • NAT: Network address translation can be configured in the environment to allow communication of a virtual machine in private network with some other machine on some other network or on the public Internet. A NAT device allows the modification of IP address information in the headers of IP packets while they are transformed from a routing device. A machine in a private network cannot have direct access to the public network so in order for it to communicate to the Internet, the packets are sent to a routing device or a virtual machine with NAT configured which has direct access to the Internet. NAT modifies the IP packet header so that the private IP address of the machine is not visible to the external networks.
  • IPAM System/DHCP: An IP address management system or DHCP server helps with the automatic configuration of IP addresses to the virtual machines according to the configuration of the network and the IP range allocated to it. A virtual machine provisioned in a network can be assigned an IP address as per the user or is assigned an IP address from the IPAM. IPAM stores all the available IP addresses in the network and when a new IP address is to be allocated to a device, it is taken from the available IP pool, and when a device is terminated or releases the IP address, the address is given back to the IPAM system.
  • Identity and access management: A access control list describes the permissions of various users on different resources in the cloud. It is important to define an access control list for users in a multi-tenant environment. It helps in restricting actions that a user can perform on any resource in the cloud. A role-based access mechanism is used to assign roles to users’ profile which describes the roles and permissions of users on different resources.

Use of switches in cloud

A switch is a LAN device that works at the data link layer (layer 2) of the OSI model and provides multiport bridge. Switches store a table of MAC addresses and ports. Let us see the various types of switches and their usage in the cloud environment:

  • Layer 3 switches: A layer-3 switch is a special type of switch which operates at layer 3—the Network layer of the OSI model. It is a high performance device that is used for network routing. A layer-3 switch has a IP routing table for lookups and it also forms a broadcast domain. Basically, a layer-3 switch is a switch which has a router’s IP routing functionality built in.

    A layer-3 switch is used for routing and is used for better performance over routers. The layer-3 switches are used in large networks like corporate networks instead of routers. The performance of the layer-3 switch is better than that of a router because of some hardware-level differences. It supports the same routing protocols as network routers do. The layer-3 switch is used above the layer-2 switches and can be used to configure the routing configuration and the communication between two different VLANs or different subnets.

  • Layer 4-7 switches: These switches use the packet information up to OSI layer 7 and are also known as content switches, web-switches, or application switches. These types of switches are typically used for load balancing among a group of servers which can be performed on HTTP, HTTPS, VPN, or any TCP/IP traffic using a specific port. These switches are used in the cloud for allowing policy-based switching—to limit the different amount of traffic on specific end-user switch ports. It can also be used for prioritizing the traffic of specific applications. These switches also provide forwarding decision making like NAT services and also manages the state of individual sessions from beginning to end thus acting like firewalls. In addition, these switches are used for balancing traffic across a cluster of servers as per the configuration of the individual session information and status. Hence these types of switches are used above layer-3 switches or above a cluster of servers in the environment. They can be used to forward packets as per the configuration such as transferring the packets to a server that is supposed to handle the requests and this packet forwarding configuration is generally based on the current server loads or sticky bits that binds the session to a particular server.
  • Layer-3 traffic isolation provides traffic isolation across layer-3 devices. It’s referred to as Virtual Routing and Forwarding (VRF). It virtualizes the routing table in a layer-3 switch and has set of virtualized tables for routing. Each table has a unique set of forwarding entries. Whenever traffic enters, it is forwarded using the routing table associated with the same VRF. It enables logical isolation of traffic as it crosses a common physical network infrastructure. VRFs provide access control, path isolation, and shared services. Security groups are also an example of layer-3 isolation capabilities which restricts the traffic to the guests based on the rules defined. The rules are defined based on the port, protocol, and source/destination of the traffic.
  • Virtual switches: The virtual switches are software program that allows one guest VM to communicate with another and is similar to the Ethernet switch explained earlier. Virtual switches provide a bridge between the virtual NICs of the guest VMs and the physical NIC of the host. Virtual switches have port groups on one side which may or may not be connected to the different subnets. There are various types of virtual switches used with various virtualization technologies such as VMware Vswitch, Xen, or Open Vswitch. VMware also provides a distributed virtual switch which spans multiple hosts. The virtual switches consists of port groups at one end and an uplink at the other. The port groups are connected to the virtual machines and the uplink is mapped to the physical NIC of the host. The virtual switches function as a virtual switch over the hypervisor layer on the host.
  • Management layer

    The Management layer in a cloud computing space provides management capabilities to manage the cloud setup.

    It provides features and functions such as reporting, configuration for the automation of tasks, configuration of parameters for the cloud setup, patching, and monitoring of the cloud components.


    The cloud is a highly automated environment and all tasks such as provisioning the virtual machine, allocation of resources, networking, and security are done in a self-service mode through automated systems.

    The automation layer in cloud management software is typically exposed through APIs. The APIs allow the creation of SDKs, scripts, and user interfaces.


    The Orchestration layer is the most critical interface between the IT organization and its infrastructure, and helps in the integration of the various pieces of software in the cloud computing platform.

    Orchestration is used to join together various individual tasks which are executed in a specified sequence with exception handling features. Thus a provisioning task for a virtual machine may involve various commands or scripts to be executed. The orchestration engine binds these individual tasks together and creates a provisioning workflow which may involve provisioning a virtual machine, adding it to your DNS, assigning IP Addresses, adding entries in your firewall and load balancer, and so on.

    The orchestration engine acts as an integration engine and also provides the capabilities to run an automated workflow through various subsystems. As an example, the service request to provision cloud resources may be sent to an orchestration engine which then talks to the cloud capacity layer to determine the best host or cluster where the workload can be provisioned. As a next step, the orchestration engine chooses the component to call to provision the resources.

    The orchestration platform helps in easy creation of complex workflows and also provides ease of management since all integrations are handled by a specialized orchestration engine and provide loose coupling.

    The orchestration engine is executed in the cloud system as an asynchronous job scheduler which orchestrates the service APIs to fulfill and execute a process.

    Task Execution

    The Task execution layer is at the lower level of the management operations that are performed using the command line or any other interface. The implementation of this layer can vary as per the platform on which the execution takes place. The activity of this layer is activated by the layers above in the management layer.

    Service Management

    The Service Management layer helps in compliance and provides means to implement automation and adapts IT service management best practices as per the policies of the organization, such as the IT Infrastructure Library (ITIL). This is used to build processes to implement different types of incident resolutions and also provide change management.

    The self service capability in cloud environment helps in providing users with a self-service catalog which consists of various service options that the user can request and provision resources from the cloud. The service layer can be comprised of various levels of services such as basic provisioning of virtual machines with some predefined templates/configuration, or can be of an advanced level with various options for provisioning servers with configuration options as well.

Subscribe to the weekly Packt Hub newsletter

* indicates required


Please enter your comment!
Please enter your name here