(For more resources related to this topic, see here.)
Understanding the SAP HANA architecture
Architecture is the key for SAP HANA to be a game changing innovative technology. SAP HANA has been designed so well architecture-wise such that it makes a lot of difference when compared to other traditional databases available today. This section explains us the various components of SAP HANA and its functionalities.
Enterprise application requirements have become more demanding—complex reports with high computation on huge volumes of transaction data and also business data of other formats (both structured and semi-structured). Data is being written or updated, and also read from the database in parallel. Thus, integration of both transactional and analytical data into single database is required, where SAP HANA has evolved. Columnar storage exploits modern hardware and technology (multiple CPU cores, large main memory, and caches) in achieving the requirements of enterprise applications. Apart from this, it should also support procedural logic where certain tasks cannot be completed with simple SQL.
How it works…
The SAP HANA database consists of several services (servers). Index server is the most important component of all the servers. Other servers are name server, preprocessor server, statistics server, and XS Engine:
- Index server: This server holds the actual data and the engines for processing the data. When SQL or MDX is fired against the SAP HANA system in the case of authenticated sessions and transactions, an index server takes care of these commands and processes them.
- Name server: This server holds complete information about the system landscape. Name server is responsible for the topology of the SAP HANA system. In a distributed system, SAP HANA instances will be running on multiple hosts. In this kind of setup, the name server knows where the components are running and how data is spread on different servers.
- Preprocessor server: This server comes into the picture during text data analysis. Index server utilizes the capabilities of preprocessor server in text data analysis and searching. This helps to extract the information on which text search capabilities are based.
- Statistics server: This server helps in collecting the data for the system monitor and helps you know the health of the SAP HANA system. The statistics server is responsible for collecting the data related to status, resource allocation/consumption and performance of the SAP HANA system. Monitoring the clients and getting the status of various alert monitors use the data collected by Statistics server. This server also provides a history of measurement data for further analysis.
- XS Engine: The XS Engine allows external applications and application developers to access the SAP HANA system via the XS Engine clients, for example, a web browser accesses SAP HANA apps built by application developers via HTTP. Application developers build applications by using the XS Engine, and the users access the app via HTTP by using a web browser. The persistent model in the SAP HANA database is converted into a consumption model for clients to access it via HTTP. This allows an organization to host system services that are a part of the SAP HANA database (for example, Search service, a built-in web server that provides access to static content in the repository).
The following diagram shows the architecture of SAP HANA:
There’s is more…
Let us continue learning about the different components:
- SAP Host Agent: According to the new approach from SAP, the SAP Host Agent should be installed on all machines that are related to the SAP landscape. It is used by Adaptive Computing Controller (ACC) to manage the system and Software Update Manager (SUM) for automatic updates.
- LM-structure: LM-structure for SAP HANA contains the information about current installation details. This information will be used by SUM during automatic updates.
- SAP Solution Manager diagnostic agent: This agent provides all the data to SAP Solution Manager (SAP SOLMAN) to monitor the SAP HANA system. After the SAP SOLMAN is integrated with the SAP HANA system, this agent provides information about the database at a glance, which includes the database state and general information about the system, such as alerts, CPU, or memory and disk usage.
- SAP HANA Studio repository: This helps the end users to update the SAP HANA studio to higher versions. The SAP HANA Studio repository is the code that does this process.
- Software Update Manager for SAP HANA: This helps in automatic updates of SAP HANA from the SAP Marketplace and patching the SAP host agent. It also allows distribution of the Studio repository to the end users.
Explaining IMCE and its components
We have seen the architecture of SAP HANA and its components. In this section, we will learn about IMCE (in-memory computing engine) and how its components and its functionalities.
The SAP in-memory computing engine (formerly Business Analytic Engine (BAE)) is the core engine for SAP’s next generation high-performance, in-memory solutions as it leverages technologies such as in-memory computing, columnar databases, massively parallel processing (MPP), and data compression, to allow organizations to instantly explore and analyze large volumes of transactional and analytical data from across the enterprise in real time.
How it works…
In-memory computing allows the processing of massive quantities of real-time data in the main memory of the server, providing immediate results from analyses and transactions. The SAP in-memory computing database delivers the following capabilities:
- In-memory computing functionality with native support for row and columnar datastores providing full ACID (atomicity, consistency, isolation, and durability) transactional capabilities
- Integrated lifecycle management capabilities and data integration capabilities to access SAP and non-SAP data sources
- SAP IMCE Studio, which includes tools for data modeling, data and life cycle management, and data security
The SAP IMCE that resides at the heart of SAP HANA is an integrated database and calculation layer that allows the processing of massive quantities of real-time data in the main memory to provide immediate results from analysis and transactions. Like any standard database, the SAP IMCE not only supports industry standards such as SQL and MDX, but also incorporates a high-performance calculation engine that embeds procedural language support directly into the database kernel. This approach is designed to remove the need to read data from the database, process it, and then write data back to the database, that is, process the data near the database and return the results.
The IMCE is an in-memory, column-oriented database technology. It is a powerful calculation engine at the heart of SAP HANA. As data resides in the Random Access Memory (RAM), highly accelerated performance can be achieved compared to systems that read data from disks. The heart lies within the IMCE, which allows us to create and perform calculations on data. SAP IMCE Studio includes tools for data modeling activities, data and life cycle management, and also tools that are related to data security.
The following diagram shows the components of IMCE alone:
SAP HANA database has the following two engines:
- Column-based store: This engine stores the huge amounts of relational data in column-optimized tables, which are aggregated and used in analytical operations.
- Row-based store: This engine stores the relational data in rows, similar to the storage mechanism of traditional database systems. The row store is more optimized for write operations and has a lower compression rate. Also, the query performance is lower when compared to the column-based store.
The engine that is used to store data can be selected on a per-table basis at the time of creating a table. Tables in the row-based store are loaded at start up time. In the case of column-based stores, tables can be either loaded at start up or on demand, that is, during normal operation of the SAP HANA database.
Both engines share a common persistence layer, which provides data persistency that is consistent across both engines. Like a traditional database, we have page management and logging in SAP HANA. The changes made to the in-memory database pages are persisted through savepoints. These savepoints are written to those data volumes on the persistent storage for which the storage medium is hard drives. All transactions committed in the SAP HANA database are stored/saved/referenced by the logger of the persistency layer in a log entry written to the log volumes on the persistent storage. To get high I/O performance and low latency, log volumes use the flash technology storage.
The relational engines can be accessed through a variety of interfaces. The SAP HANA database supports SQL (JDBC/ODBC), MDX (ODBO), and BICS (SQLDBC). The calculation engine performs all the calculations in the database. No data moves into the application layer until calculations are completed. It also contains the business functions library that is called by applications to perform calculations based on the business rules and logic. The SAP HANA-specific SQL script language is an extension of SQL that can be used to push down data-intensive application logic into the SAP HANA database for specific requirements.
This component creates and manages sessions and connections for the database clients. When a session is created, a set of parameters are maintained. These parameters are like auto-commit settings or the current transaction isolation level. After establishing a session, database clients communicate with the SAP HANA database using SQL statements. SAP HANA database treats all the statements as transactions while processing them. Each new session created will be assigned to a new transaction.
The transaction manager is the component that coordinates database transactions, takes care of controlling transaction isolation, and keeps track of running and closed transactions. The transaction manager informs the involved storage engines about the running or closed transactions, so that they can execute necessary actions, when a transaction is committed or rolled back. The transaction manager cooperates with the persistence layer to achieve atomic and durable transactions.
The client requests are analyzed and executed by a set of components summarized as request processing and execution control. The client requests are analyzed by a request parser, and then it is dispatched to the responsible component. The transaction control statements are forwarded to the transaction manager. The data definition statements are sent to the metadata manager. The object invocations are dispatched to the object store. The data manipulation statements are sent to the optimizer, which creates an optimized execution plan that is given to the execution layer.
The SAP HANA database also has built-in support for domain-specific models (such as for financial planning domain) and it offers scripting capabilities that allow application-specific calculations to run inside the database. It has its own scripting language named SQLScript that is designed to enable optimizations and parallelization. This SQLScript is based on free functions that operate on tables by using SQL queries for set processing.
The SAP HANA database also contains a component called the planning engine that allows financial planning applications to execute basic planning operations in the database layer. For example, while applying filters/transformations, a new version of a dataset will be created as a copy of an existing one. An example of planning operation is disaggregation operation in which based on a distribution function; target values from higher to lower aggregation levels are distributed.
Metadata manager helps to access metadata. SAP HANA database’s metadata consists of a variety of objects, such as definitions of tables, views and indexes, SQLScript function definitions, and object store metadata. All these types of metadata are stored in one common catalog for all the SAP HANA database stores. Metadata is stored in tables in the row store. The SAP HANA features such as transaction support and multi-version concurrency control (MVCC) are also used for metadata management. Central metadata is shared across the servers in the case of a distributed database systems. The background mechanism of metadata storage and sharing is hidden from the components that use the metadata manager.
As row-based tables and columnar tables can be combined in one SQL statement, both the row and column engines must be able to consume the intermediate results. The main difference between the two engines is the way they process data: the row store operators process data in a row-at-a-time fashion, whereas column store operations (such as scan and aggregate) require the entire column to be available in contiguous memory locations. To exchange intermediate results created by each other, the row store provides results to the column store. The result materializes as complete rows in the memory, while the column store can expose results using the iterators interface needed by the row store.
The persistence layer is responsible for durability and atomicity of transactions. The persistent layer ensures that the database is restored to the most recent committed state after a restart, and makes sure that transactions are either completely executed or completely rolled back. To achieve this in an efficient way, the persistence layer uses a combination of write-ahead logs, shadow paging, and savepoints. Moreover, the persistence layer also offers interfaces for writing and reading data. It also contains SAP HANA’s logger that manages the transaction log.
The authorization manager is invoked by other SAP HANA database components to check the required privileges for users to execute the requested operations. Privileges to other users or roles can be granted. A privilege grants the right to perform a specified operation (such as create, update, select, and execute data manipulation languages) on a specified object such as a table, view, and SQLScript function. Analytic privileges represent filters or hierarchy, and they drill down limitations for analytic queries. Analytic privileges such as granting access to values with a certain combination of dimension attributes are supported in SAP HANA. Users are authenticated either by the SAP HANA database itself (log in with username and password), or authentication can be delegated to external authentication providers third-party such as an LDAP directory.
- SAP HANA in-memory analytics and in-memory computing available at http://scn.sap.com/people/vitaliy.rudnytskiy/blog/2011/03/22/time-to-update-your-sap-hana-vocabulary
This article explains the SAP architecture and the IMCE feature in brief.
Resources for Article:
- SAP HANA integration with Microsoft Excel [Article]
- Data Migration Scenarios in SAP Business ONE Application- part 2 [Article]
- Data Migration Scenarios in SAP Business ONE Application- part 1 [Article]