
What is GoldenGate?

Oracle GoldenGate is Oracle’s strategic solution for real-time data integration. GoldenGate software enables mission-critical systems to have continuous availability and access to real-time data. It offers a fast and robust solution for replicating transactional data between operational and analytical systems.

Oracle GoldenGate captures, filters, routes, verifies, transforms, and delivers transactional data in real-time, across Oracle and heterogeneous environments with very low impact and preserved transaction integrity. The transaction data management provides read consistency, maintaining referential integrity between source and target systems.

Competing data replication products and solutions are available from other software vendors. These are mainly storage replication solutions that provide fast point-in-time data restoration. The following is a list of the most common solutions available today:

  • EMC SRDF and EMC RecoverPoint
  • IBM PPRC and Global Mirror (known together as IBM Copy Services)
  • Hitachi TrueCopy
  • Hewlett-Packard Continuous Access (HP CA)
  • Symantec Veritas Volume Replicator (VVR)
  • DataCore SANsymphony and SANmelody
  • FalconStor Replication and Mirroring
  • Compellent Remote Instant Replay

Data replication techniques have improved enormously over the past 10 years and have always been a requirement in nearly every IT project in every industry. Whether for Disaster Recovery (DR), High Availability (HA), Business Intelligence (BI), or even regulatory reasons, the requirements and expected performance have also increased, making the implementation of efficient and scalable data replication solutions a welcome challenge.

Oracle GoldenGate evolution

GoldenGate Software Inc was founded in 1995. Originating in San Francisco, the company was named after the famous Golden Gate Bridge by its founders, Eric Fish and Todd Davidson. The tried and tested product that emerged quickly became very popular within the financial industry. Originally designed for fault-tolerant Tandem computers, the resilient and fast data replication solution was in demand. Banks initially used GoldenGate software in their ATM networks to send transactional data from high street machines to central mainframe computers. In that setting, data integrity and guaranteed zero data loss are paramount. The key architectural properties of the product are as follows:

  • Data is sent in “real time” with sub-second speed.
  • Supports heterogeneous environments across different database and hardware types.
  • “Transaction aware”: maintains read consistency and referential integrity between source and target systems.
  • High performance with low impact; able to move large volumes of data very efficiently while maintaining very low lag times and latency.
  • Flexible modular architecture.
  • Reliable and extremely resilient to failure and data loss. No single point of failure or dependencies, and easy to recover.

Oracle Corporation acquired GoldenGate Software in September 2009. Today there are more than 500 customers around the world using GoldenGate technology for over 4000 solutions, realizing over $100 million in revenue for Oracle.

Oracle GoldenGate solutions

Oracle GoldenGate provides five data replication solutions:

  1. High Availability
    • Live Standby for an immediate fail-over solution that can later re-synchronize with your primary source.
    • Active-Active solutions for continuous availability and transaction load distribution between two or more active systems.
  2. Zero-Downtime Upgrades and Migrations
    • Eliminates downtime for upgrades and migrations.
  3. Live Reporting
    • Feeding a reporting database so as not to burden the source production systems with BI users or tools.
  4. Operational Business Intelligence (BI)
    • Real-time data feeds to operational data stores or data warehouses, directly or via Extract Transform and Load (ETL) tools.
  5. Transactional Data Integration
    • Real-time data feeds to messaging systems for business activity monitoring, business process monitoring, and complex event processing.
    • Uses event-driven architecture and service-oriented architecture (SOA).

The following diagram shows the basic architecture for the various solutions available from GoldenGate software:

Oracle GoldenGate Implementer's Guide

We have discovered there are many solutions where GoldenGate can be applied. Now we can dive into how GoldenGate works, the individual processes, and the data flow that is adopted for all.

Oracle GoldenGate technology overview

Let’s take a look at GoldenGate’s fundamental building blocks: the Capture process, Trail files, Data pump, Server collector, and Apply processes. The order in which the processes are listed matches the sequence of events for GoldenGate data replication across distributed systems. A Manager process runs on both the source and the target systems and “oversees” the processing and transmission of data.

All the individual processes are modular and can be easily decoupled or combined to provide the best solution to meet the business requirements. It is normal practice to configure multiple Capture and Apply processes to balance the load and enhance performance.

Filtering and transformation of the data can be done at either the source by the Capture or at the target by the Apply processes. This is achieved through parameter files.
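As a rough illustration of that choice, the following Python sketch applies the same filter and transformation either on the capture side or on the apply side. The function names, the record layout, and the `keep`/`transform` hooks are all hypothetical; in GoldenGate itself this behavior is configured in parameter files, not written in Python.

```python
def capture(changes, keep=lambda c: True, transform=lambda c: c):
    """Source side: optionally filter and transform before writing to the trail."""
    return [transform(c) for c in changes if keep(c)]


def apply(trail, keep=lambda c: True, transform=lambda c: c):
    """Target side: the same hooks, used when reading the trail."""
    return [transform(c) for c in trail if keep(c)]


# Hypothetical change records; not GoldenGate's trail format.
changes = [
    {"table": "orders", "amount": 10},
    {"table": "audit", "amount": 0},
    {"table": "orders", "amount": 25},
]


def only_orders(change):
    return change["table"] == "orders"


def in_cents(change):
    return {**change, "amount": change["amount"] * 100}


# Filter and transform at the source, then apply unchanged at the target.
at_source = apply(capture(changes, keep=only_orders, transform=in_cents))

# Capture everything, then filter and transform at the target.
at_target = apply(capture(changes), keep=only_orders, transform=in_cents)
```

Both variants deliver the same rows. The practical difference is where the work happens: filtering at the source means fewer records travel over the network, while filtering at the target keeps the source configuration simpler.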

The capture process (Extract)

Oracle GoldenGate’s capture process, known as Extract, obtains the necessary data from the databases’ transaction logs. For Oracle, these are the online redo logs that contain all the data changes made in the database. GoldenGate does not require access to the source database and only extracts the committed transactions from the online redo logs. It can, however, read archived redo logs to extract the data from long-running transactions.
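The idea of extracting only committed transactions can be sketched in Python as follows. This is a simplified model, assuming a hypothetical log of `(txn_id, kind, payload)` tuples; it does not reflect the actual redo log format or how Extract is implemented.

```python
from collections import defaultdict


def committed_operations(log_records):
    """Yield (txn_id, operations) for each committed transaction, in commit order.

    Each record is a hypothetical (txn_id, kind, payload) tuple where kind is
    "op", "commit", or "rollback".
    """
    pending = defaultdict(list)  # txn_id -> operations seen so far
    for txn_id, kind, payload in log_records:
        if kind == "op":
            pending[txn_id].append(payload)
        elif kind == "commit":
            # Buffered operations are released only when the commit is seen.
            yield txn_id, pending.pop(txn_id, [])
        elif kind == "rollback":
            pending.pop(txn_id, None)  # uncommitted work is discarded


log = [
    (1, "op", "INSERT a"),
    (2, "op", "INSERT b"),
    (1, "op", "UPDATE a"),
    (2, "rollback", None),
    (1, "commit", None),
]
captured = list(committed_operations(log))
```

Here transaction 2 is rolled back, so only the two operations of transaction 1 reach `captured`. A long-running transaction would sit in the buffer until its commit record appears, which is one way to picture why older log files may still be needed.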

The Extract process will regularly checkpoint its read and write position, typically to a file. The checkpoint data ensures that GoldenGate can recover its processes without data loss in the case of failure.
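A minimal sketch of file-based checkpointing, written in Python, shows how a saved position lets a process resume where it left off. The JSON layout, the file name `extract.cpt`, and the function names are assumptions made for illustration; GoldenGate's real checkpoint files use their own format.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the last saved read position, or 0 if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)["position"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, position):
    """Write to a temporary file, then rename it over the old checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"position": position}, f)
    os.replace(tmp, path)


def process(records, path, every=2):
    """Process records from the checkpointed position onward."""
    start = load_checkpoint(path)
    done = []
    for i in range(start, len(records)):
        done.append(records[i])
        if (i + 1) % every == 0:
            save_checkpoint(path, i + 1)  # periodic checkpoint
    save_checkpoint(path, len(records))  # final checkpoint
    return done


checkpoint_file = os.path.join(tempfile.mkdtemp(), "extract.cpt")
records = ["r0", "r1", "r2", "r3", "r4"]

save_checkpoint(checkpoint_file, 2)  # simulate a failure after r0 and r1
resumed = process(records, checkpoint_file)
```

After the simulated failure, `process` skips the two records already handled and continues from the third. Checkpointing less often reduces overhead but means more records are reprocessed after a restart.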

The Extract process can have one of the following statuses:

  • STOPPED
  • STARTING
  • RUNNING
  • ABENDED

The ABENDED status dates back to the Tandem computer, where processes either stop (end normally) or abend (end abnormally). Abend is short for “abnormal end”.

Trail files

To replicate transactional data efficiently from one database to another, Oracle GoldenGate converts the captured data into a canonical format, which is written to trail files on both the source and the target system. The provision of source and target trail files in the GoldenGate architecture eliminates any single point of failure and ensures data integrity is maintained. A dedicated checkpoint process keeps track of the data being written to the trails on both the source and target for fault tolerance.

It is possible to configure GoldenGate not to use trail files on the source system and to write data directly from the database’s redo logs to the target server data collector. In this case, the Extract process sends data in large blocks across a TCP/IP network to the target system. However, this configuration is not recommended because data can be lost during unplanned system or network outages. Best practice is to use local trail files, which provide a history of transactions and support the recovery of data for retransmission via a Data Pump.

Data pump

When using trail files on the source system, known as a local trail, GoldenGate requires an additional Extract process called a Data Pump that sends data in large blocks across a TCP/IP network to the target system. As previously stated, this is best practice and should be adopted for all Extract configurations.
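The value of a local trail during a network outage can be illustrated with a small Python model. The `DataPump` class, the `send` callable, and the list-based trails below are hypothetical stand-ins chosen for this sketch, not GoldenGate components or APIs.

```python
class DataPump:
    """Toy data pump: forwards local-trail records and remembers its position."""

    def __init__(self, local_trail, send):
        self.local_trail = local_trail  # records written by the capture side
        self.send = send                # callable that may raise ConnectionError
        self.position = 0               # index of the next record to send

    def run(self):
        """Send pending records; stop at the first network failure."""
        while self.position < len(self.local_trail):
            try:
                self.send(self.local_trail[self.position])
            except ConnectionError:
                return False  # position is unchanged, so the record is retried
            self.position += 1
        return True


remote_trail = []
network = {"up": True}


def send(record):
    if not network["up"]:
        raise ConnectionError("network outage")
    remote_trail.append(record)


local_trail = ["txn-1", "txn-2"]
pump = DataPump(local_trail, send)
pump.run()

network["up"] = False
local_trail.append("txn-3")  # capture keeps writing during the outage
first_try = pump.run()

network["up"] = True
second_try = pump.run()
```

Because `txn-3` is held in the local trail while the network is down, the pump can retransmit it once the connection returns. Without a local trail, a record captured during the outage would have nowhere to wait.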

Server collector

The server collector process runs on the target system and accepts data from the source (Extract/Data Pump). Its job is to reassemble the data and write it to a GoldenGate trail file, known as a remote trail.

The Apply process (Replicat)

The Apply process, known in GoldenGate as Replicat, is the final step in the data delivery. It reads the trail file and applies it to the target database in the form of DML (deletes, updates, and inserts) or DDL* (database structural changes). This can be concurrent with the data capture or performed later.
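Applying DML from a trail can be pictured with the following Python sketch, which replays inserts, updates, and deletes against an in-memory table. The `(op, key, row)` record shape and the dictionary standing in for the target table are assumptions for illustration only; Replicat works against a real database and a real trail format.

```python
def apply_trail(trail, table):
    """Apply hypothetical DML records, in order, to a dict keyed by primary key."""
    for op, key, row in trail:
        if op == "insert":
            table[key] = dict(row)
        elif op == "update":
            table[key].update(row)  # only the changed columns are supplied
        elif op == "delete":
            del table[key]
        else:
            raise ValueError(f"unknown operation: {op}")
    return table


trail = [
    ("insert", 1, {"name": "Ann", "balance": 100}),
    ("insert", 2, {"name": "Bob", "balance": 50}),
    ("update", 1, {"balance": 80}),
    ("delete", 2, None),
]
target = apply_trail(trail, {})
```

The operations are applied in the order they appear in the trail, which is what keeps the target consistent with the source. Since the trail is just a sequence of records, the apply step can run while capture is still in progress or at a later time.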

The Replicat process will regularly checkpoint its read and write position, typically to a file. The checkpoint data ensures that GoldenGate can recover its processes without data loss in the case of failure.

The Replicat process can have one of the following statuses:

  • STOPPED
  • STARTING
  • RUNNING
  • ABENDED

* DDL is only supported in unidirectional configurations and non-heterogeneous (Oracle to Oracle) environments.

The Manager process

The Manager process runs on both source and target systems. Its job is to control activities such as starting, monitoring, and restarting processes; allocating data storage; and reporting errors and events. The Manager process must exist in any GoldenGate implementation. However, there can be only one Manager process per Changed Data Capture configuration on the source and target.

The Manager process can have either of the following statuses:

  • STOPPED
  • RUNNING
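The monitoring and restarting role of the Manager, together with the process statuses listed earlier, can be modeled in a few lines of Python. The class names, the process names `EXT1` and `REP1`, and the `max_restarts` limit are hypothetical and chosen only for this sketch; they do not describe how the real Manager is configured.

```python
from enum import Enum


class Status(Enum):
    STOPPED = "STOPPED"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    ABENDED = "ABENDED"


class Process:
    """Stand-in for an Extract or Replicat process."""

    def __init__(self, name):
        self.name = name
        self.status = Status.STOPPED

    def start(self):
        # A real process would remain in STARTING while it initializes.
        self.status = Status.STARTING
        self.status = Status.RUNNING


class Manager:
    """Toy manager: reports abended processes and restarts them a limited number of times."""

    def __init__(self, processes, max_restarts=1):
        self.processes = processes
        self.max_restarts = max_restarts
        self.restarts = {p.name: 0 for p in processes}
        self.events = []

    def check(self):
        for p in self.processes:
            if p.status is Status.ABENDED:
                self.events.append(f"{p.name} abended")
                if self.restarts[p.name] < self.max_restarts:
                    self.restarts[p.name] += 1
                    p.start()


extract = Process("EXT1")
replicat = Process("REP1")
manager = Manager([extract, replicat])
extract.start()
replicat.start()

extract.status = Status.ABENDED  # simulate a first failure
manager.check()                  # reported and restarted
status_after_first = extract.status

extract.status = Status.ABENDED  # simulate a second failure
manager.check()                  # reported, but the restart limit is reached
```

Capping the number of restarts in this model avoids endlessly restarting a process that fails for the same reason each time; the second failure is recorded and left for an administrator to investigate.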

GGSCI

In addition to the processes previously described, Oracle GoldenGate 10.4 ships with its own command line interface known as GoldenGate Software Command Interface (GGSCI). This tool provides the administrator with a comprehensive set of commands to create, configure, and monitor all GoldenGate processes.

Oracle GoldenGate 10.4 is command-line driven. However, there is a product called Oracle GoldenGate Director that provides a GUI for configuration and management of your GoldenGate environment.

Process data flow

The following diagram illustrates the GoldenGate processes and their dependencies. The arrows largely depict replicated data flow (committed transactions), apart from checkpoint data and configuration data. The Extract and Replicat processes periodically checkpoint to a file for persistence. The parameter file provides the configuration data. As described in the previous paragraphs, two options exist for sending data from source to target; these are shown as broken arrows:

Oracle GoldenGate Implementer's Guide

Having discovered all the processes required for GoldenGate to replicate data, let’s now dive a little deeper into the architecture and configurations.
