I am new to Talend but have managed to create my first scenarios. I like the tool.
Now I am at a point where I do not know how to handle this with Talend. I have to transfer mass data from a Progress database (not Postgres) to SAP. The systems are in different parts of the world, so I have runtime, network bandwidth, and performance issues. I have a working scenario doing this:
tJDBCInput1 -> Call Subjob
tJDBCInput2 -> tMap -> tSAPOutput
tJDBCInput1 just selects all keys from the database. Every key is passed to the subjob, where tJDBCInput2 selects the complete row (around 80 fields) and sends it to SAP. (SAP could be replaced by a web service; the problem is the same.)
This works, but as you can imagine it is not very efficient.
What I want to do now is group the results from tJDBCInput1 into batches of, for example, 100 rows and pass each batch to the subjob. With tJDBCInput2 I can then select these 100 records at once, collect them, and send them with tSAPOutput.
The key of the database is the material number. With the first tJDBCInput I get around 20,000 material numbers. They are sorted, but they are a mixture of numeric and alphanumeric values. My idea now is to take the first number and the number from row 100, pass these two values to the subjob, and there, in tJDBCInput2, use "where matnr between 4711 and abc23" to fetch the complete data for these 100 records. Then use the numbers from rows 101 and 200, and so on.
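Outside the Talend job itself, the chunking step described above can be sketched in plain Java (e.g. in a tJavaFlex). The method name, chunk size, and sample keys here are illustrative assumptions, not part of the original job:

```java
import java.util.ArrayList;
import java.util.List;

public class KeyRanges {
    // Split an already-sorted key list into chunks and return (first, last)
    // pairs that can be plugged into "WHERE matnr BETWEEN ? AND ?".
    // Because the keys come from the same table and are sorted, the range
    // covers exactly the keys of that chunk.
    static List<String[]> toRanges(List<String> sortedKeys, int chunkSize) {
        List<String[]> ranges = new ArrayList<>();
        for (int i = 0; i < sortedKeys.size(); i += chunkSize) {
            int last = Math.min(i + chunkSize, sortedKeys.size()) - 1;
            ranges.add(new String[] { sortedKeys.get(i), sortedKeys.get(last) });
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Hypothetical mixed numeric/alphanumeric keys, as in the post.
        List<String> keys = List.of("0001", "0002", "4711", "abc23", "zz99");
        for (String[] r : toRanges(keys, 2)) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

Each (first, last) pair would then be passed to the subjob, e.g. via context variables, for the BETWEEN query.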
Collecting the data for use in tSAPOutput can be done as described here:
I hope I have explained my problem in a way you can understand.
How can I achieve the grouping of 100 records and work with them in a subjob?
Any ideas, or maybe better solutions, are welcome.
Many thanks for your help.
I think using two tJDBCInput components won't make the job efficient.
In particular, there is no way to compute the next key by simply adding a number, since the keys are sorted but are a mixture of numeric and alphanumeric values.
In MySQL, there is 'LIMIT', as below, to get a specified range of rows.
SELECT * FROM table LIMIT [offset,] rows | rows OFFSET offset
So your DB may support similar syntax. That would be more efficient, because a single tJDBCInput would be enough.
Thanks for this answer. The Progress SQL version I have to connect to does not support LIMIT, TOP, or similar SQL statements.
So I thought where <field> between <xx> and <yy> would be the best approach.
Any further ideas?
As far as I know, there is no dedicated component for Progress DB at the moment.
A plain JDBC connection won't give you better performance. Even if you select a range between <xx> and <yy>, it doesn't help much.
Usually what you can do is use tMap with 'Store on disk' and inner join Table1 and Table2.
Besides, the material numbers are not sequential. Is there a rule for adding some value to get the next material number?
The performance problem has two sides.
The way it is now:
1. I just select all material numbers (select matnr from mat_table) - this is OK.
2. I iterate over every material number, select the complete row, and send it to SAP via tSAPOutput - this is slow because the RFC overhead to SAP is big. So I have to send a number of rows per call that is fast both in the select and in the RFC (e.g. 100 or 500). For this I need the grouping of the keys from step 1, so that step 2 can fetch that many complete rows at once. I am sure this will cut the tSAPOutput time a lot, as I can send many rows at once.
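The batched loop in step 2 can be sketched as follows. This is a minimal sketch, not the actual Talend/SAP API: the table and column names come from the post, but rangeQuery and sendToSap are hypothetical placeholders (in a real job, the query would belong to tJDBCInput2, ideally as a prepared statement with bound parameters rather than string concatenation, and the send would be tSAPOutput):

```java
import java.util.Arrays;
import java.util.List;

public class BatchedTransfer {
    // Build the range query for one chunk of sorted keys; in the job this
    // would be the query of tJDBCInput2, fed from context variables.
    static String rangeQuery(String low, String high) {
        return "SELECT * FROM mat_table WHERE matnr BETWEEN '" + low
             + "' AND '" + high + "'";
    }

    // Placeholder for the RFC / web service call: one call per batch of
    // rows instead of one call per row is what cuts the per-call overhead.
    static void sendToSap(List<String[]> rows) {
        System.out.println("sending " + rows.size() + " rows in one call");
    }

    public static void main(String[] args) {
        // Hypothetical batch boundaries taken from the sorted key list.
        String sql = rangeQuery("4711", "abc23");
        System.out.println(sql);
        // The rows would come from executing sql via JDBC; faked here
        // with two empty 80-field rows, matching the post's row width.
        sendToSap(Arrays.asList(new String[80], new String[80]));
    }
}
```

With 20,000 keys and batches of 100, this reduces the number of RFC calls from 20,000 to 200.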