3 min read

You use Designer, one of the DataStage clients, for ETL jobs development. When you start Designer, in addition to entering your host system and credential, you must also specify the project you’d like to work on.

Visual ETL Development With IBM DataStage

A DataStage project stores jobs and define their environment, such as their security and execution resources. Your project, as well as your user account, is typically created by your DataStage administrator.

To develop a new job, on the icon toolbar, click the drop-down arrow to the right of the New icon (I circle in red).

Visual ETL Development With IBM DataStage

We’ll develop a parallel job, so click the Parallel Job from the drop-down list.

Visual ETL Development With IBM DataStage

In DataStage, a parallel job can execute in parallel mode (in simple terms, this means the job does not process its data records sequentially, one record after another).

We can now start to design our job in the job canvas. For now, our parallel job name is Untitled 1, as shown on the canvas title.

Visual ETL Development With IBM DataStage

Our job extracts its input data from a customer sequential file, transforms the data, and then loads it into an Oracle table. The customer file contains new customers and name changes of the existing customers. The customer file has more data (more columns) than we need; we’ll extract only those we need.

In DataStage, you build a job by selecting and linking stages on the job canvas. So, on the Palette, click File, look for Sequential File stage by clicking the up or down arrow.

Visual ETL Development With IBM DataStage

Drag and drop our first stage, the Sequential File stage, from the Palette to the left side of the canvas.

We’d like to layout our stages left to right to visualize them processing data in that direction, so we start placing the input stage on the left of the canvas (we can move stages around the canvas, though, by dragging them).

Visual ETL Development With IBM DataStage

Rename the Sequential File stage to Customer by selecting (clicking) and clicking it again, and typing over the default name (Sequential_File_1).

Visual ETL Development With IBM DataStage

Next, add Transformer stage (under Processing palette) and Oracle Enterprise stage (under Database palette) that we need for our job. Rename them as shown.

We use the Transformer stage to change (transform) customer names to uppercase.

Visual ETL Development With IBM DataStage

Now that we have the stages we need for our job, we’ll link them together. Right click the Customer stage, hold and drag & drop on the Uppercase stage, and then, from Uppercase stage to the Customer_Dim stage.

Visual ETL Development With IBM DataStage

You can rename the links just like the way you rename stages.

Visual ETL Development With IBM DataStage

Just in case something happens to our system, we’d like to save our works so far.

Visual ETL Development With IBM DataStage

Name the job Load_Customer_Dim and you can choose the folder (called Category in DataStage) of your choice where you want to put the job.

Visual ETL Development With IBM DataStage

Notice the canvas title now reflects the name of our parallel job (changes from its default name given when we’re creating it).

Visual ETL Development With IBM DataStage

The Customer stage has a warning sign (the yellow! sign), which indicates that it has an error. This is because we haven’t specified its parameters. For example, the file it refers to. So, let’s fix it.

Double-click the Customer stage. On the Properties tab select the File property, and type in the file name including its full path. The sequential file must be accessible by the DataStage server.

Visual ETL Development With IBM DataStage

Next, we need to specify the format of our file. For example, set the Record level’s Final delimiter and Field default’s Delimiter as shown.

Visual ETL Development With IBM DataStage

We must also specify the record layout, i.e. its fields (columns).

Visual ETL Development With IBM DataStage

To test that the stage will be able to read the data as specified, click the View Data button (upper right hand corner).

You can specify the number of rows (records) you’d like to view.

Visual ETL Development With IBM DataStage

When you click OK and you have set the correct specification for the file, you’d see the data.

Visual ETL Development With IBM DataStage

When you finish viewing the data, close the view window as well as the property window. We’re done with the Customer stage.


Subscribe to the weekly Packt Hub newsletter

* indicates required

LEAVE A REPLY

Please enter your comment!
Please enter your name here