(For more resources related to this topic, see here.)
Open the job jo_cook_ch03_0010_validationSubjob. As you can see, the reject flow has been attached and the output is being sent to a temporary store (tHashMap).
((Integer)globalMap.get("tFileOutputDelimited_1_NB_LINE")) > 0
((Integer)globalMap.get("tFileOutputDelimited_1_NB_LINE")) == 0
The job should now look like the following:
What we have done in this exercise is created a validation stage prior to processing the data.
Valid rows are held in temporary storage (tHashOutput) and invalid rows are written to a reject file until all input rows are processed.
The job then checks to see how many records are rejected (using the RunIf link). In this instance, there are invalid rows, so the RunIf link is triggered, and the job is killed using tDie.
By ensuring that the data is correct before we start to process it into a target, we know that the data will be fit for writing to the target, and thus avoiding the need for rollback procedures.
The records captured can then be sent to the support team, who will then have a record of all incorrect rows. These rows can be fixed in situ within the source file and the job simply re-run from the beginning.
This article is particularly important when rollback/correction of a job may be particularly complex, or where there may be a higher than expected number of errors in an input.
An example would be when there are multiple executions of a job that appends to a target file. If the job fails midway through, then rolling back involves identifying which records were appended to the file by the job before failure, removing them from the file, fixing the offending record, and then re-running. This runs the risk of a second error causing the same thing to happen again.
On the other hand, if the job does not die, but a subsection of the data is rejected, then the rejects must be manipulated into the target file via a second manual execution of the job.
So, this method enables us to be certain that our records will not fail to write due to incorrect data, and therefore saves our target from becoming corrupted.
This article has shown how the rejects are collected before killing a job. This article also shows how incorrect rejects be manipulated into the target file.
Further resources on this subject:
I remember deciding to pursue my first IT certification, the CompTIA A+. I had signed…
Key takeaways The transformer architecture has proved to be revolutionary in outperforming the classical RNN…
Once we learn how to deploy an Ubuntu server, how to manage users, and how…
Key-takeaways: Clean code isn’t just a nice thing to have or a luxury in software projects; it's a necessity. If we…
While developing a web application, or setting dynamic pages and meta tags we need to deal with…
Software architecture is one of the most discussed topics in the software industry today, and…