
In this article, Dmitry Anoshin, author of the book SAP Lumira Essentials, talks about living in the century of information technology. We are surrounded by electronic devices that generate lots of data. For example, you can surf the Internet, visit a couple of news portals, order new Nike Air Max shoes from a web store, write a couple of messages to a friend, and chat on Facebook. Every one of these actions produces data. Multiply that by the number of people who have access to the Internet or simply use a cell phone, and we get really Big Data. Of course, you have a question: how big is it? Today, it starts at terabytes or even petabytes. Volume is not the only issue; we also struggle with the variety of data. As a result, it is not enough to analyze only structured data. We should also dive deep into unstructured data, such as the machine data generated by various devices.


Nowadays, we need a new core competence, dealing with Big Data, because these vast data volumes won’t just be stored; they need to be analysed and mined for information that management can use to make the right business decisions. This helps to make the business more competitive and efficient.

Unfortunately, modern organizations still need many manual steps to get data and answer business questions. You need help from your IT team, or you have to wait until new data is available in the enterprise data warehouse. In addition, you are often working with an inflexible BI tool that can only refresh a report or export it into Excel. You definitely need a new approach, one that gives you a competitive advantage, dramatically reduces errors, and accelerates business decisions.

So, we can highlight some of the key points for this kind of analytics:

  • Integrating data from heterogeneous systems
  • Giving more access to data
  • Using sophisticated analytics
  • Reducing manual coding
  • Simplifying processes
  • Reducing time to prepare data
  • Focusing on self-service
  • Leveraging powerful computing resources

We could continue this list with many other bullet points.

If you are a fan of traditional BI tools, you may think that all of this is almost impossible. You are right: with those tools, it is. That’s why we need to change the rules of the game. As the business world changes, you must change as well.

Maybe you have guessed what this means, but if not, I can help you. I will focus on a new approach to data analytics, one that is more flexible and powerful. It is called data discovery. Of course, we need the right tool to overcome all the challenges of the modern world. That’s why we have chosen SAP Lumira, one of the most powerful data discovery tools on the market. But before diving deep into this amazing tool, let’s consider some of the data discovery challenges that lie in our path, as well as the advantages of data discovery.

Data discovery challenges

Let’s imagine that you have several terabytes of data. Unfortunately, it is raw, unstructured data. To get business insight from it, you have to spend a lot of time preparing and cleaning the data. In addition, you are restricted by the capabilities of your machine. That’s why a good data discovery tool usually combines software and hardware. As a result, it gives you more power for exploratory data analysis.

Let’s imagine that this entire Big Data store is in Hadoop or some NoSQL data store. You have to be at least a good programmer to do analytics on this data. Here we find another benefit of a good data discovery tool: it gives a powerful tool to business users, who are less technical and may not even know SQL.

Apache Hadoop is an open source software project that enables distributed processing of large data sets across clusters of commodity servers. It is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance. Rather than relying on high-end hardware, the resilience of these clusters comes from the software’s ability to detect and handle failures at the application layer.

A NoSQL data store is a next generation database, mostly addressing some of the following points: non-relational, distributed, open-source, and horizontally scalable.

Data discovery versus business intelligence

You may be confused about data discovery and business intelligence technologies; they seem very close to each other, and it may even seem that BI tools can do everything data discovery can do. So why do we need a separate data discovery tool, such as SAP Lumira?

In order to better understand the difference between the two technologies, you can look at the table below:


|  | Enterprise BI | Data discovery |
|---|---|---|
| Key users | All users | Advanced analysts |
|  | Vertically-oriented (top to bottom), semantic layers, requests to existing repositories | Vertically-oriented (bottom-up), mashup, putting data in the selected repository |
|  | Reports, dashboards |  |
|  | By IT consultants | By business users |

Let’s consider the pros and cons of data discovery.

Pros:

  • Rapidly analyze data with a short shelf life
  • Ideal for small teams
  • Best for tactical analysis
  • Great for answering one-off questions quickly

Cons:

  • Difficult to handle for enterprise organizations
  • Difficult for junior users
  • Lack of scalability

As a result, it is clear that BI and data discovery each handle their own tasks and complement each other.

The role of data discovery

Most organizations have a data warehouse. It was designed to support daily operations and to help make business decisions. But sometimes organizations need to meet new challenges. For example, a retail company wants to improve its customer experience and decides to work closely with its customer database. Analysts try to segment customers into cohorts and analyse customer behavior. They need to handle all the customer data, which is quite big. In addition, they can use external data to learn more about their customers. If they use a corporate BI tool, every interaction, such as adding a new field or filter, can take 10-30 minutes. Another issue is adding a new field to an existing report. Usually, this is impossible without the help of IT staff, due to security or the complexity of the enterprise BI solution. This is unacceptable in a modern business. Analysts want an answer to their business questions immediately, and they prefer to visualize data because, as you know, people take in visual information far more readily than text. In addition, these analysts may be independent of IT. They have their own data discovery tool, and they can connect to any data source in the organization and check their crazy hypotheses.

There are hundreds of examples where BI and the data warehouse (DWH) are weak and data discovery is strong.

Introducing SAP Lumira

Starting from this point, we will focus on learning SAP Lumira. First of all, we need to understand what SAP Lumira is exactly.

SAP Lumira is a family of data discovery tools that give us the opportunity to create amazing visualizations or even tell fantastic stories based on our big or small data. We can connect to most of the popular data sources, such as Relational Database Management Systems (RDBMSs), flat files, Excel spreadsheets, or SAP applications. We are able to create datasets with measures, dimensions, hierarchies, or variables. In addition, Lumira allows us to prepare, edit, and clean our data before it is processed.

SAP Lumira offers us a huge arsenal of graphical charts and tables to visualize our data. In addition, we can create data stories or even infographics by grouping charts, single cells, or tables together on boards to create presentation-style dashboards. Moreover, we can add images or text to provide extra detail.

The following are the three main products in the Lumira family offered by SAP:

  • SAP Lumira Desktop
  • SAP Lumira Server
  • SAP Lumira Cloud

Lumira Desktop can be either a personal edition or a standard edition. Both of them give you the opportunity to analyse data on your local machine. You can even share your visualizations or insights via PDF or XLS.

Lumira Server also comes in two variations: Edge and Server. As you know, SAP BusinessObjects also has two types of license for the same software, Edge and Enterprise, and they differ only in the number of users and the type of license. The Edge version is smaller; for example, it can cover the needs of a team or even a whole department.

Lumira Cloud is Software as a Service (SaaS). It helps to quickly visualize large volumes of data without having to sacrifice performance or security. It is especially designed to speed time to insight. In addition, it saves time and money with flexible licensing options.

Data connectors

So far, we have met SAP Lumira for the first time, played with its interface, and adjusted its general settings. In addition, we can find this interesting menu in the middle of the window:

Several steps help us discover our data and gain business insights. In this article, we start with the first step: exploring data in SAP Lumira by creating a document and acquiring a dataset, which can include some or all of the original values from a data source. This is done through Acquire Data. Let’s click on Acquire Data. A new window will come up:

This window has four areas:

  • A list of possible data sources (1): Here, the user can connect to a data source.
  • Recently used objects (2): Here, the user can reopen previous connections or files.
  • Ordinary buttons (3), such as Previous, Next, Create, and Cancel.
  • A small chat box (4), which appears on almost every page. SAP cares about the quality of the product and lets the user take a screenshot and send feedback to SAP.

Let’s go deeper and consider more closely every connection in the table below:

| Data Source |  |
|---|---|
| Microsoft Excel | Excel data sheets |
| Flat file |  |
| SAP HANA | There are two possible ways: Offline (downloading data) and Online (connected to SAP HANA) |
| SAP BusinessObjects universe |  |
| SQL Databases | Query data via SQL from relational databases |
| SAP Business Warehouse | Downloaded data from a BEx Query or an InfoProvider |

Let’s try to connect some data sources and extract some data from them.

Microsoft spreadsheets

Let’s start with the easiest exercise. For example, our inventory manager has asked us to analyse flop products, that is, products that are not popular, and has sent us two Excel spreadsheets, Unicorn_flop_products.xls and Unicorn_flop_price.xls. There are two separate files because prices and product attributes are kept in different systems. Both files share a unique field, SKU. As a result, it is possible to merge them on this field and analyse them as one dataset.

An SKU, or stock keeping unit, is a distinct item for sale, such as a product or service, together with the attributes that distinguish it from other items. For a product, these attributes include, but are not limited to, manufacturer, product description, material, size, color, packaging, and warranty terms. When a business takes inventory, it counts the quantity of each SKU.
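Outside Lumira, the same kind of merge can be sketched in a few lines of Python. This is only an illustration of joining two datasets on a shared SKU key, not what Lumira does internally; the column names (`product_name`, `category`, `price`), the sample rows, and the helper functions are all assumptions and do not reflect the actual contents of the Unicorn files.

```python
import csv
import io

# Hypothetical stand-ins for the two spreadsheets, exported as CSV text.
PRODUCTS_CSV = """sku,product_name,category
1001,Red scarf,Accessories
1002,Blue jeans,Clothing
1003,Green hat,Accessories
"""

PRICES_CSV = """sku,price
1001,9.99
1002,39.50
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_by_sku(products, prices):
    """Inner-join two row lists on the shared 'sku' field."""
    price_by_sku = {row["sku"]: row for row in prices}
    merged = []
    for product in products:
        price = price_by_sku.get(product["sku"])
        if price is not None:  # keep only SKUs present in both files
            merged.append({**product, **price})
    return merged


merged_rows = merge_by_sku(read_rows(PRODUCTS_CSV), read_rows(PRICES_CSV))
for row in merged_rows:
    print(row["sku"], row["product_name"], row["price"])
```

In this sketch, SKU 1003 has no price row, so it is dropped from the result; a tool that keeps unmatched rows would behave like an outer join instead.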

Connecting to the SAP BO universe

The universe is a core concept in the SAP BusinessObjects BI platform. It is the semantic layer that isolates business users from the technical complexities of the databases where their corporate information is stored. For the ease of the end user, universes are made up of objects and classes that map to data in the database, using everyday terms that describe their business environment.
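To make the idea of a semantic layer more concrete, here is a minimal Python sketch. It is not how SAP BusinessObjects implements universes; the object names, the table and column names, and the `build_query` helper are all hypothetical and only illustrate how everyday business terms can be mapped to database columns.

```python
# Hypothetical mapping from business-friendly object names to database columns.
UNIVERSE = {
    "Brand": "dim_product.brand",
    "Category": "dim_product.category",
    "Revenue": "SUM(fact_sales.revenue)",
}

# Objects that are aggregated measures rather than dimensions.
MEASURES = {"Revenue"}


def build_query(objects, from_clause):
    """Translate business object names into a SQL string (illustrative only)."""
    select_parts = [f'{UNIVERSE[name]} AS "{name}"' for name in objects]
    group_parts = [UNIVERSE[name] for name in objects if name not in MEASURES]
    sql = f"SELECT {', '.join(select_parts)} FROM {from_clause}"
    # Group by the dimensions only when at least one measure is selected.
    if group_parts and len(group_parts) < len(objects):
        sql += f" GROUP BY {', '.join(group_parts)}"
    return sql


query = build_query(
    ["Brand", "Revenue"],
    "fact_sales JOIN dim_product ON fact_sales.sku = dim_product.sku",
)
print(query)
```

The business user only picks "Brand" and "Revenue"; the mapping hides the table names, the join, and the aggregation.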

Introducing Unicorn Fashion universe

The Unicorn Fashion company uses the SAP BusinessObjects BI platform (BIP) as its primary BI tool. There is also a Unicorn Fashion universe, which was built on the Unicorn datamart. It has a similar structure and the same joins as the datamart. The following image shows the Unicorn Fashion universe:

It unites two business processes, Sales (orange) and Stock (green), and has the following structure in the business layer:

  • Product: This specifies the attributes of an SKU, such as brand, category, ant, and so on
    • Price: This denotes the different pricing of the SKU
  • Sales: This specifies the sales business process
    • Order: This denotes the order number, the shipping information, and order measures
    • Sales Date: This specifies the attributes of order date, such as month, year, and so on
    • Sales Measures: This denotes various aggregated measures, such as shipped items, revenue waterfall, and so on
  • Stock: This specifies information about the quantity in stock
    • Stock Date: This denotes the attributes of stock date, such as month, year, and so on


The book is a step-by-step guide to learning the essentials of SAP Lumira, starting with an overview of the SAP Lumira product family. It demonstrates various data discovery techniques using real-world scenarios from an online e-commerce retailer. It also offers detailed recipes for installing, administering, and customizing SAP Lumira. In addition, it shows how to work with data: acquiring it from various data sources, then preparing it and visualizing it with SAP Lumira’s rich functionality. Finally, it teaches how to present data through a data story or infographic and publish it across your organization or on the World Wide Web.

Learn data discovery techniques, build amazing visualizations, create fantastic stories, and share them electronically with one of the most powerful tools available, SAP Lumira. The book also focuses on extracting data from different sources, such as plain text, Microsoft Excel spreadsheets, the SAP BusinessObjects BI platform, SAP HANA, and SQL databases. Finally, it teaches how to publish the results of your painstaking work to various destinations, such as SAP BI clients, SAP Lumira Cloud, and so on.
