I'm fairly new to Talend, but have done a significant amount of work with it over the last 2 months or so. I reached a point where I had to create a custom component for an internal file format, and it went fairly well. My second effort though is not going so well.
Basically, I need a new file output component that creates a specific internal file format we use within the business. This file format is extremely specific and needs exact field inputs; they can't be variable. It also needs two sets of looping data merged together. I've struggled to figure out how to even configure the component's XML so that the input schemas and connectors work correctly. I don't know what types of connectors to declare, or how to configure two distinct schemas on two inputs.
The overall setup starts from an XML file, where I have two different sets of metadata that loop through the XML at two different tags. The first one goes into a tMap and then out to the first input of the new file output component. The second also goes through a tMap, with different fields mapped, and I want that to go into the second input of the new file output component. Then I want both of those merged on a particular id.
So, I need my file output component to have exactly two required inputs, and each of those inputs needs to have a specific schema. Is this possible?
Thanks ahead of time for your help.
Hello, sorry, I would like to help, but after reading your description four times I did not understand... well, basically anything.
1) Which kind of component are you trying to design? It is a component that outputs data to a Talend row connection, right?
2) How do you feed data to this component? Is it going to read it from a file itself, or is it going to get this data from an incoming row connection?
3) Is the eventual input schema fixed or can it vary?
4) Is the output schema fixed or can it vary?
I totally lost the part about XML and tMap. Is it really relevant to what you need, or is it something that works completely separately?
Sorry, I'll try to explain a bit more by answering your questions.
The component creates a file, so no, it doesn't output a row, it outputs a custom internal company file format.
The data is fed to the component from an XML file through a tMap. So, I guess that means it gets the data from an incoming row connection. Well, hopefully it gets it from 2 incoming row connections as that's what I would like.
The input schema is fixed. I'm hoping to have two inputs, both with fixed schemas, but both schemas different.
The output file is fixed, it has fixed data and fixed fields that it needs to write out.
Basically, the whole point is to create a custom file format that does not match any of the generic formats supplied by Talend. This file requires certain input, but it requires 2 different looping elements, and 2 different fixed schemas. I guess it might be more of a merge.
Hopefully that helps, if you need any more info, I'm happy to supply.
Thanks a bunch for your help.
Last edited by ssherriff (2012-08-07 01:58:12)
It's a bit clearer now, thanks.
So, you basically have two inputs and will probably use one of them to drive a sort of loop.
This is a key piece of information, but unless you are trying to output a row connection, it should not be a big issue.
However, since the whole loop will happen within the component, you will need to buffer your data in memory.
If you deal with huge data sets, that might be a problem too.
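To make the buffering idea concrete, here is a minimal sketch in plain Java, assuming both inputs carry an id value to merge on. The class name `MergeBuffer` and the `header` and `details` fields are hypothetical placeholders, not part of any Talend API; in a real component the calls would sit in the generated code for each incoming row.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: buffer the rows of the first input by id, then
// attach each row of the second input to the buffered row with the same id.
class MergeBuffer {

    // One row of the first input plus all matching rows of the second input.
    static class Merged {
        final String id;
        final String header;                            // assumed field of input 1
        final List<String> details = new ArrayList<>(); // assumed field of input 2

        Merged(String id, String header) {
            this.id = id;
            this.header = header;
        }
    }

    // LinkedHashMap keeps the arrival order of the first input.
    private final Map<String, Merged> byId = new LinkedHashMap<>();

    // Call once per row of the first input.
    void addFirst(String id, String header) {
        byId.put(id, new Merged(id, header));
    }

    // Call once per row of the second input; returns false if no id matches.
    boolean addSecond(String id, String detail) {
        Merged m = byId.get(id);
        if (m == null) {
            return false;
        }
        m.details.add(detail);
        return true;
    }

    // Snapshot of the merged rows, ready to be written out.
    List<Merged> result() {
        return new ArrayList<>(byId.values());
    }
}
```

Everything stays in memory until `result()` is called, which is exactly why huge inputs could become a problem with this approach.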
The fact that your input schemas are fixed only constrains the output row of the preceding component (most likely the tMap in your case).
While it is possible to pre-define an input schema (you can check the tRssOutput component XML for that), I am not sure it is possible to define more than one.
One practical issue would be telling the two inputs apart once they are connected.
There are possibly other ways to achieve it, e.g. you could leave the input formats open and then assign the columns dynamically using a mapping table in your parameters, something like this one:
<PARAMETER NAME="mappings" FIELD="TABLE" REQUIRED="true" NB_LINES="3" NUM_ROW="6">
  <ITEMS>
    <ITEM NAME="output_column_1" FIELD="COLUMN_LIST" />
    <ITEM NAME="whatever_fixed" FIELD="CLOSED_LIST">
      <ITEMS>
        <ITEM NAME="outfield1" VALUE="outfield1"/>
        <ITEM NAME="bbb" VALUE="2"/>
        <ITEM NAME="ccc" VALUE="3"/>
      </ITEMS>
    </ITEM>
    <ITEM NAME="input_column_1" FIELD="PREV_COLUMN_LIST"/>
  </ITEMS>
</PARAMETER>
You can also use single parameters with the type FIELD="PREV_COLUMN_LIST".
That would retrieve the list of columns from the incoming connections.
As a general comment, you are probably better off managing everything with standard components, then outputting the row result and finally writing the actual file with a tJavaFlex if your format cannot be handled by standard components.
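As a rough illustration of the "write the file yourself" step, here is a sketch of a small helper that could be called from the tJavaFlex code. The actual internal format is not described in this thread, so the fixed-width layout (a 6-character id followed by a 10-character name) and the name `FixedFormatWriter` are purely assumptions for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper: writes one fixed-width record per merged row.
// The column widths are invented; replace them with the real layout.
class FixedFormatWriter {
    private final Writer out;

    FixedFormatWriter(Writer out) {
        this.out = out;
    }

    // Pads with spaces or truncates so the result is exactly 'width' chars.
    // Assumes width is positive.
    static String pad(String value, int width) {
        String v = value == null ? "" : value;
        if (v.length() > width) {
            return v.substring(0, width);
        }
        return String.format("%-" + width + "s", v);
    }

    // Writes a single record followed by a newline.
    void writeRecord(String id, String name) {
        try {
            out.write(pad(id, 6) + pad(name, 10) + "\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One way to use it would be to open the writer in the start code, call `writeRecord` in the main code for each row, and close the writer in the end code.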
Otherwise, if you go for the custom component solution, I would even skip the tMap part and read your XML etc. directly from the component, which would then basically be a normal Java program that does the whole job without using any specific Talend functionality.
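For the "read the XML directly" route, a minimal sketch with the JDK's built-in DOM parser could look like the following. The tag and attribute names are passed in by the caller because the real XML structure is not shown in this thread; `TwoLoopXmlReader` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: parse the XML once, then loop over it at two
// different tags, which replaces the two metadata loops plus tMaps.
class TwoLoopXmlReader {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Collects one attribute value from every element with the given tag.
    static List<String> attributeValues(Document doc, String tag, String attr) {
        List<String> values = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(((Element) nodes.item(i)).getAttribute(attr));
        }
        return values;
    }
}
```

Calling `attributeValues` once per loop tag gives the two row sets, which can then be merged on the id and written out. DOM loads the whole document into memory, so a streaming parser may be a better fit for very large files.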
Last edited by saburo (2012-08-07 11:58:38)
Thanks for your help. I've put this on temporary hold, but will be back to it soon and will go over your pointers more closely then.
I figured it would probably be easier to do it in Java, but I had hoped to create a re-usable component for the business as a whole to use. With that, the format coming in might not be the same for everyone, but I'll evaluate that and see what is worth the time.