I have a positional file with a header row and a footer row that carry information about the actual data (row count, sum of amounts, ...).
How can I validate the file's data against the header/footer information before processing continues?
To enable validation I have filtered the rows by type (header, footer and data). I also run tAggregateRow on the data rows to determine the row count etc.
But I have no idea how to proceed from there. Could you help me?
First, I think you need to decide whether you want to parse the file, check it, and then run the processing (parsing the file a second time), or do everything in one step.
The design of your job will mainly depend on your data and on which values you want to calculate. For example, if it is only the number of rows, you could count them in a context variable and check the count against your footer row at the end.
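As a rough illustration of the "count and compare" idea, here is a minimal plain-Java sketch of the logic outside of Talend. The class name RowCountCheck, the method rowCountMatches and the assumption that data lines start with the marker 'D' are all hypothetical; in a real job the counter would live in a context variable and the declared count would come from the footer row.

```java
import java.util.List;

final class RowCountCheck {
    // Counts the lines that start with the (assumed) data marker 'D' and
    // compares the result with the row count declared in the footer.
    static boolean rowCountMatches(List<String> lines, int declaredCount) {
        int dataRows = 0;
        for (String line : lines) {
            if (!line.isEmpty() && line.charAt(0) == 'D') {
                dataRows++;
            }
        }
        return dataRows == declaredCount;
    }
}
```

Calling rowCountMatches with the lines of the file and the footer's count returns true only when every declared data row is actually present, so the result can drive the decision whether to continue.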
Here is a way of getting the header/footer line:
tFileInputPositional--->tSampleRow (lets you choose a list of line numbers and/or a list of ranges; set the range to "1" to get the first line)-->tJavaRow (do your filtering/validation/processing to get the expected data, then store it in context variables)
tFileRowCount (gets the total number of lines in the file; say there are 8 lines in your file)
tJava (store the line count in a context variable, e.g. context.lineNumber (String))-->tJavaRow (do your filtering/validation/processing to get the expected data, then store it in context variables)
tFileInputPositional--->tSampleRow (set the range to context.lineNumber to get the last line)
If you still have problems, please show us an example of your file and your expected result.
Can you show us an example of your file and your expected result?
Explanation of rows
H -> Header Line with Timestamp 2009-03-17 12:30:00
D -> Data Record with description and amount, Desc: abcd, Amount: 0000120.50 -> 120.50 €
F -> Footer Line with count rows of data records and overall amount, 0000002 -> 2 Rows (D), 00000000000220.49 -> 220.49 Amount
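To make the footer description concrete, here is a small plain-Java sketch of turning such a line into numbers. The exact column positions are not given in the thread, so the layout used here (type marker 'F', then a 7-digit zero-padded row count, then the zero-padded total up to the end of the line) is an assumption read off the sample values, and the class FooterLine is a hypothetical helper.

```java
import java.math.BigDecimal;

final class FooterLine {
    final int rowCount;
    final BigDecimal totalAmount;

    private FooterLine(int rowCount, BigDecimal totalAmount) {
        this.rowCount = rowCount;
        this.totalAmount = totalAmount;
    }

    // Assumed layout: 'F' + 7-digit row count + total amount (rest of line).
    static FooterLine parse(String line) {
        if (line.length() < 9 || line.charAt(0) != 'F') {
            throw new IllegalArgumentException("Not a footer line: " + line);
        }
        // Leading zeros are accepted by both parsers.
        int rowCount = Integer.parseInt(line.substring(1, 8));
        BigDecimal total = new BigDecimal(line.substring(8).trim());
        return new FooterLine(rowCount, total);
    }
}
```

BigDecimal is used for the amount so that the total keeps its exact decimal value instead of being rounded to the nearest binary floating-point number.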
I want to check the data records based on information in the footer line and do the real processing (database etc) if data quality is ok.
I would propose the following job:
tFileInputRegex("(.)(.*)") --(row)--> tMap --(type = D)--> tAggregateRow(fix key, sum amount) --(row)--> tJavaRow_1
--(type = F)--> tJavaRow_2
Code in the tJavaRow components:
// code in tJavaRow_1
context.sumOfAmount = input_row.sumOfAmount;
context.fileValid = false;

// code in tJavaRow_2
context.fileValid = context.sumOfAmount == input_row.sumOfFooter;
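One caveat about the comparison: the thread does not say which types sumOfAmount and sumOfFooter have. If they are object types such as Double or BigDecimal, == compares references rather than values, and sums computed as double can carry rounding error. A minimal sketch of a value-based check, assuming the amounts are available as zero-padded strings (the class AmountCheck and its method names are hypothetical):

```java
import java.math.BigDecimal;
import java.util.List;

final class AmountCheck {
    // Adds up the raw amount fields exactly, without floating-point rounding.
    static BigDecimal sum(List<String> rawAmounts) {
        BigDecimal total = BigDecimal.ZERO;
        for (String raw : rawAmounts) {
            total = total.add(new BigDecimal(raw.trim()));
        }
        return total;
    }

    // compareTo ignores differences in scale (220.49 vs 220.490),
    // which equals() would treat as unequal.
    static boolean matchesFooter(List<String> rawAmounts, String rawFooterTotal) {
        return sum(rawAmounts).compareTo(new BigDecimal(rawFooterTotal.trim())) == 0;
    }
}
```

Inside a tJavaRow the same idea would reduce to a single compareTo call on the two values, provided the corresponding columns and context variables are defined as BigDecimal.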
Now you could use a second job and trigger it based on context.fileValid. In the second job you have to parse the file again, unless you are able to process the file in step one and roll back in case of errors.
@Shong: I'm not sure: does a "Run if" trigger work in this case (instead of tInSubJobOk and/or an exception thrown when the checks in tJavaRow_2 fail)?