Choosing your database type
There are three broad categories for setting up a database when you install Oracle database using the installation wizard:
- Transaction Processing: Online transactio processing is typically for applications with multiple users inserting and updating low volumes per transaction
- General Purpose: A batch application that loads large volumes of data during specific windows
- Data Warehouse: An application that can have multiple users (transactions) requesting large volumes of data for read-only
Identify the key stakeholders, business process owners, and system owners within the identified process flows.
How to do it…
To determine the initial configuration of the database, it is important to get an understanding of the number of users and the expected volumes you will be processing:
- Schedule meetings with the stakeholders.
- Ask key questions to understand the nature of the solution:
- How many anticipated users do you expect for the solution?
- What time of the month/day do you expect the highest number of users utilizing the information?
- Does the information need to be updated real time, and why?
- Schedule meetings with the business process owners.
- Ask key questions to understand the nature of the business process:
- Is the business process updated real time by users, or is information entered and then processed within a batch?
- How many business users are involved with the business process?
- Schedule meetings with the system owners.
- Ask key questions to understand the source system:
- What is the database upon which the system resides?
- How large is the current database?
- Where is the system located?
- How is the system connected to the network?
- Does the system do a lot of batch processing?
- Can we connect to the system directly?
- What is the preferred method to connect to the system to extract information?
- What is the process to request access to extract information?
How it works…
There are many other questions that can and should be asked. The aforementioned questions are focused on gathering information to determine how many databases should be created, and the general configuration for each database.
Generally, a business intelligence solution is logically structured in the following way:
Information comes from multiple sources, and is loaded into a staging area. From the staging area, the information is transformed and loaded into the data warehouse. In the data warehouse, this information can be summarized or enhanced (additional calculations), ready to be consumed from the presentation area.
For the information sessions, if you have determined that you will be continuously extracting large volumes of data from multiple sources, which are geographically dispersed, then you will probably require a separate staging database to collect and consolidate the information. It is advantageous to select a separate database, so that as your information grows over a period of time, you can tune the stage environment to suit the requirements. When you perform the installation for Oracle, you can select a General Purpose template as the initial configuration for the stage database.
For the data warehouse, if you have a large number of users who access information 24 hours per day, 7 days per week, then it is recommended that the data warehouse again be a separate database. This will allow you to prepare the information within the staging database and load the results into the data warehouse. You can minimize the impact of the data transformation and loading on the users, and only require a smaller window to load information. The Data Warehouse template should be used when creating this database.
For the presentation area, if you are using a relational tool to publish information, then this should be located within the same database as the data warehouse. The main reason for this is the ease of administration and the ability to utilize the Oracle technology (summary management, query optimization, and so on), and minimize the amount of data movement.
For most projects, if you have a handful of sources with data loads happening during a preset window and reasonable user counts, a single General Purpose database will be the correct place to start. Data Marts would also be classified under this definition.
Oracle has published a white paper called Best Practices for a Data Warehouse on Oracle Database 11g, An Oracle White Paper, November 2010. This has a lot of very valuable information with regards to Balanced Configuration and Disk Layout for your database.