Hi,
I'm struggling to figure out how to make a joblet reusable. Here is the basic scenario I'm facing:
I have a source with four columns: "First Name", "Last Name", "Age", "Gender". It would go to a target with three columns: "Full Name", "Age", "Gender".
I have another source with four columns: "First Name", "Last Name", "Marital Status", "Number of Children". It would go to a target with three columns: "Full Name", "Marital Status", "Number of Children".
I'd like to use a joblet to concatenate the first and last names together to form "Full Name". I still want to keep the other attributes and put them into their respective targets.
So I thought I'd create a joblet with an input schema of "First Name" and "Last Name" and an output schema of "Full Name". Then I could plug it into either of the two jobs and create my full name. My issue is that I can't find a way to pass the rest of my attributes to the target. I tried creating a tMap that split out the names and the rest of the attributes, but once I have two data streams I can't find a way to rejoin them. I tried a tMap, tJoin, and tUnite, but none would allow both streams to connect.
Can anyone show me how to do this properly? I know I could write a routine in Java to do it and call it within my tMap, but I'd really like to use the joblet, since this is what I thought it was intended for. I have also thought about just adding a bunch of generic input and output columns to the joblet schema to pass the attributes through, but that would be a pretty dirty way to do it and could create datatype nightmares.
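For reference, the routine-based alternative mentioned above could look roughly like the sketch below. The class name NameRoutines and the method fullName are hypothetical, and the null/blank handling is an assumption about what the data needs, not something Talend provides.

```java
// Hypothetical routine sketch. In Talend, a routine is normally declared
// as a public class inside the "routines" package; that wrapper is omitted here.
final class NameRoutines {
    private NameRoutines() {}

    /** Joins first and last name with one space, tolerating null or blank parts. */
    static String fullName(String firstName, String lastName) {
        String first = firstName == null ? "" : firstName.trim();
        String last = lastName == null ? "" : lastName.trim();
        if (first.isEmpty()) {
            return last;
        }
        if (last.isEmpty()) {
            return first;
        }
        return first + " " + last;
    }
}
```

In a tMap output expression this would be called along the lines of NameRoutines.fullName(row1.First_Name, row1.Last_Name), where the row and column names are placeholders for whatever the job's schema actually uses.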
Thanks for any help!
Mike
PS. My example is not real, we are actually dealing with web information around URLs and paths, but I didn't want to complicate my question.
Hello
So I thought I'd create a joblet with an input schema of "First Name" and "Last Name" and an output schema of "Full Name". Then I could plug it into either of the two jobs and create my full name. My issue is that I can't find a way to pass the rest of my attributes to the target.
Here is a way to rejoin the full name with the remaining attributes:
tFileInputDelimited_1 --row1--> tJoblet_1 --> tMap_1 --main--> tMap_3 --> tLogRow_1
tFileInputDelimited_2 --row2--> tMap_2 --lookup--> tMap_3
tFileInputDelimited_1: reads the "First Name" and "Last Name" columns.
tFileInputDelimited_2: reads the remaining columns, "Age" and "Gender".
tMap_1: adds a new column id and fills it with a sequence number; set its expression to: Numeric.sequence("s1",1,1)
tMap_2: adds a new column id and fills it with a sequence number; set its expression to: Numeric.sequence("s2",1,1) //note that the sequence name is "s2"
tMap_3: does an inner join on the id column and outputs the matched rows. This relies on both inputs delivering rows in the same order, so that row N receives the same id in each stream.
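The logic of the three tMap steps can be sketched in plain Java as follows. This is only an illustration of the technique, not code generated by Talend: the class SequenceJoinSketch, its method names, and the use of ";"-separated strings as rows are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the job above; rows are simplified to strings.
final class SequenceJoinSketch {
    private SequenceJoinSketch() {}

    /** Mimics tMap_1 / tMap_2: tag each row with a 1-based id in arrival order. */
    static <T> Map<Integer, T> withSequenceId(List<T> rows) {
        Map<Integer, T> byId = new LinkedHashMap<>();
        int id = 1; // same start value and step as Numeric.sequence(name, 1, 1)
        for (T row : rows) {
            byId.put(id++, row);
        }
        return byId;
    }

    /** Mimics tMap_3: inner join of main and lookup on id, keeping matched rows only. */
    static List<String> innerJoin(Map<Integer, String> main, Map<Integer, String> lookup) {
        List<String> joined = new ArrayList<>();
        for (Map.Entry<Integer, String> mainRow : main.entrySet()) {
            String rest = lookup.get(mainRow.getKey());
            if (rest != null) { // unmatched main rows are dropped, as in an inner join
                joined.add(mainRow.getValue() + ";" + rest);
            }
        }
        return joined;
    }
}
```

Each stream gets its own counter, which is why the two tMaps use different sequence names ("s1" and "s2"): a shared counter would hand out interleaved ids and the rows would no longer line up.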
Best regards
Shong