03 November 2013

Summary of DataStage Design Patterns

Here is a list with the DataStage patterns I have published on this blog:

Will try to update it when something new is posted.

DataStage Patterns–Compare Transformer Record with the Previous Record

A common requirement in a DataStage job is to compare within a Transformer stage the current record with the previous record. For example consider the following rule:

if record(T).TotalPrice – record(T-1).TotalPrice > 10 then 0 else 1

There are several ways to implement this in DataStage:

Use Stage Variables in the Transformer

The straight forward approach will be to define stage variables in the Transformer stage and keep the value(s) from the previous record there. The following screenshot illustrates this (please note that the order of the stage variables is important):

image

Use Join Over a Dummy Column

The disadvantage of the previous method is the use of the stage variables – if we need to access more columns from the previous record the number of the stage variables will increase and the transformer will be too complicated.

Another idea is to have a transformer that will have two output links. Each output link will introduce a new column called “dummy”. The dummy column in the first link will have values from 1..n (where n is the number of input records) and the dummy column from the second link will have values from 0..(n-1). Later if you join both links over this dummy column you will get the values of the current and the next record in one row. The above is illustrated in the following screenshots:

image

image

image

02 November 2013

DataStage Patterns–Generate Sequence of Numbers in a Parallel Transformer

A common requirement in the DataStage jobs is to create a sequence of unique numbers (row id in other words) in a parallel transformer stage. The obvious solution bellow does not work if the job is running on multiple partitions:

image

To overcome this the @PARTITIONNUM, @NUMPARTITIONS and @INROWNUM system variables should be used in a transformer derivation like this:

@PARTITIONNUM + (@NUMPARTITIONS * (@INROWNUM -1))

image