I am using Excel/flat files (no databases at this point) and I need to create additional columns derived from the original columns, for example substrings. I am using TOS 2.3.1 with a Perl project. My input looks like this:
Class, Subsection, Description
1000, 1-1, Saturn
1001, 1-2, Nissan
1002, 2-3, Honda
And I want output like this:
Class, Subsection, Description, Substring_Section, Upper_Case_Description, length_Description
1000, 1-1, Saturn, 1, SATURN, 6
1001, 1-2, Nissan, 1, NISSAN, 6
1002, 2-3, Honda, 2, HONDA, 5
So I am keeping my current columns and want to add more columns with functions like substring, uc, split etc. There are no joins or anything. I am not replacing original columns or filtering on any particular value (that will be later).
Q1. Will a component like tPERL or tMap handle this? A line by line conversion and creation of additional columns as the rows are processed?
Q2. If I can use any PERL function on the flow and create new column, then I've all PERL string functions available. Is there an example for this? This might be simpler for me.
Q3. Can I split the subsection in the example above based on the presence of "-"? So if I have input 23-1, I get 23. If I have 1-23, I get 1 etc. (I think I can figure this out in PERL using rindex or something).
Thanks in advance for your help.
Your requirement is very easy to meet in the Java version. For a Perl project, plegall (Perl project developer) will give you an answer. If needed, I will show you the Java example.
The Talend Perl version also "makes the simple things easy and the hard things possible":
Use the tPerlRow component to process your input columns and fill your additional columns. In tPerlRow, the output schema can differ from the input schema, and you define specific code that is executed for each row.
# fill output columns with input columns
@output_row = @input_row;

# split returns a list, we pick the first element
$output_row[substring_section] = ( split('-', $input_row[subsection]) )[0];
$output_row[uc_description] = uc $input_row[description];
$output_row[length_description] = length $input_row[description];
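To try the same expressions outside Talend before pasting them into tPerlRow, here is a minimal standalone sketch. The helper name add_columns and the hard-coded rows are only for illustration and are not part of Talend; inside a job, the generated code supplies @input_row and @output_row instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a Talend function): takes one input row as
# (class, subsection, description) and returns it with three extra columns.
sub add_columns {
    my ($class, $subsection, $description) = @_;

    # split returns a list; the [0] slice keeps the part before the first "-".
    # If the value contains no "-", the whole value comes back unchanged.
    my $substring_section = ( split /-/, $subsection )[0];

    return (
        $class, $subsection, $description,
        $substring_section,
        uc($description),
        length($description),
    );
}

# Sample rows taken from the question.
my @rows = (
    [ 1000, '1-1', 'Saturn' ],
    [ 1001, '1-2', 'Nissan' ],
    [ 1002, '2-3', 'Honda'  ],
);

print join( ', ', add_columns(@$_) ), "\n" for @rows;
```

This also covers Q3: an input of 23-1 gives 23 and an input of 1-23 gives 1, since only the first element of the split is kept.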
Hope it helps.
For shong: could you please show us your Java version?
This will work perfectly for me and is easy to implement.
I refer to users guide 2.3_a quite often and it has no reference to tPerlRow, tJoin etc. Is there a newer in-progress doc anywhere? I know it is always going to be a moving target with such a rapidly evolving code base.
Thanks for your answer. You guys are great!
I refer to users guide 2.3_a quite often and it has no reference to tPerlRow, tJoin etc. Is there a newer in-progress doc anywhere?
Yes, 23b is available for download, but tPerlRow and tJoin are not covered in it. I know esabot (in charge of documentation) has worked with rbillerey (Perl team developer) on tJoin, so it should come as soon as tJoin becomes a priority :-) About tPerlRow, I have no info. This component has been available since TOS 1.0.0 but is not documented yet.
I know it is always going to be a moving target with such a rapidly evolving code base.
You're so right... documentation is a never-ending piece of work. In addition to general features, the number of components has been growing faster and faster since TOS 2.0.0.