Hi, can someone tell me how to increase the rate at which rows pass from source to destination? For example, if a CSV file contains more than 100,000 (1 lakh) rows, the rows are transferred at only 200 to 300 rows/s when the job runs. How can I increase that to more than 1000 rows/s? Please reply if you can help.
Hi,
Make sure that the metadata schema in your tMap and in your tAccessOutput matches the schema in your final database exactly (including data types).
Have you tried loading a temporary table as the output of the first job?
You could then map it to the output table in the second job.
OK, I will define a key on my second tMap and post the results afterwards.
Thx.
Be careful not to confuse the Key column in a schema with the Expression key column in tMap.
Checking the Key column in a schema will not improve tMap performance, but setting an Expression key on one of your lookups will.
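One way to picture why an Expression key on a lookup can help: with a key, each main row can be matched through a hash index instead of being compared against every lookup row. The sketch below is plain Python, not Talend code, and it only assumes that a keyed lookup is resolved roughly this way. The row layout and the names `build_index`, `join_with_index` and `join_with_scan` are made up for illustration.

```python
def build_index(lookup_rows, key):
    """Index lookup rows by key so each match is a single dict access."""
    index = {}
    for row in lookup_rows:
        # Keep the first row per key (an assumed "unique match" policy).
        index.setdefault(row[key], row)
    return index


def join_with_index(main_rows, index, key):
    """Inner-join main rows to the indexed lookup on `key`."""
    joined = []
    for row in main_rows:
        match = index.get(row[key])
        if match is not None:
            joined.append({**match, **row})
    return joined


def join_with_scan(main_rows, lookup_rows, key):
    """Same join without an index: every main row scans the whole lookup."""
    joined = []
    for row in main_rows:
        for candidate in lookup_rows:
            if candidate[key] == row[key]:
                joined.append({**candidate, **row})
                break
    return joined


# Hypothetical sample data: a small lookup and a main flow.
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [{"cust_id": 2, "amount": 30}, {"cust_id": 3, "amount": 5}]

index = build_index(customers, "cust_id")
print(join_with_index(orders, index, "cust_id"))
```

Both joins return the same rows. The difference is that the scan version does work proportional to main rows times lookup rows, which is where large lookups without a usable key tend to hurt.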
I don't have a key defined on my schema. Do you think defining one would improve performance?
Hi suzchr,
Try testing with only one tMap; it accepts many inputs and outputs.
You may also be able to improve performance by choosing a better Expression key.
Regards
To help, I have added two screenshots showing the two jobs.
Yes, it's done, but the best time I get is 22 minutes. I am trying to use 2 jobs to improve this time.
From my tests, when a job only writes to Access, the best commit value is 125 000 rows. But if I write to Access in the same job that reads from my data warehouse, a commit of 125 000 rows gives a run time of 6 h 40 min...
Hi suzchr,
Is there a special reason for using a tBufferOutput? Can't you load directly into Access instead of loading into the buffer first?
So I am running my benchmarks and the results are not good...
In fact I ran two types of benchmark. The first is the complete job, where the best time is obtained with a commit value of 10 on the tAccessOutput. That best time is 22 minutes, versus 9 minutes with my loader in Access.
Then I created the same job without the write to Access (I deleted the tComponentOutput). The time is 4 min 40 s, which is very good.
Then I created a job that only writes to Access. It writes 500 000 rows generated by the tRowGenerator. The best time is obtained with a commit value of 125 000, and that best time is 5 min 39 s. This is also efficient.
All in all, I created two jobs: one that only extracts the data and ends with the tBufferOutput component, and another that takes the data from the first job and writes it to Access. But the performance is bad: after two hours I have written only 125 000 rows.
What can I do to improve performance? Does anyone have a good idea?
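For comparing runs like these, it can help to convert each timing into rows per second. A minimal sketch, using only the write-only figures quoted above (500 000 rows in 5 min 39 s); the helper name `rows_per_second` is just illustrative.

```python
def rows_per_second(rows, minutes, seconds=0):
    """Average throughput for a run that moved `rows` rows."""
    elapsed = minutes * 60 + seconds
    return rows / elapsed


# Write-only job from the post above: 500 000 generated rows in 5 min 39 s.
write_only = rows_per_second(500_000, 5, 39)
print(f"{write_only:.0f} rows/s")  # prints "1475 rows/s"
```

The same helper can be applied to the other runs once the row count of each run is known.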
Thank you for all your answers !
I am benchmarking my job and will post my results afterwards.
My first impression is that with Access the most efficient setting is to commit after every single row. That is unusual, but it is what I see in my case.
suzchr,
I have a question for you! I see that your status is Talend Team. Does that mean you work for the Talend company?
Yes I'm working for Talend ! :-)
However, I don't know whether it's better to use a big commit (every 20000 rows, for example) or a small one (every 10 rows, for example).
In general, it's better with a big "commit every" value, but it is not as simple as that.
You may get better performance with a commit every of 40000 than with a commit every of 50000.
You have to run tests to find the best value.
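A rough way to run such tests outside Talend is to time the same load with several commit sizes. The sketch below uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in for Access, so the absolute numbers will not carry over; the table `target` and the functions `load_rows` and `time_commit_sizes` are hypothetical, and the commit size is assumed to be positive.

```python
import sqlite3
import time


def load_rows(conn, rows, commit_every):
    """Insert rows, committing after every `commit_every` rows.

    Returns the number of commits issued, including the final one.
    """
    if commit_every < 1:
        raise ValueError("this sketch assumes a positive commit size")
    commits = 0
    cur = conn.cursor()
    for count, row in enumerate(rows, start=1):
        cur.execute("INSERT INTO target (id, label) VALUES (?, ?)", row)
        if count % commit_every == 0:
            conn.commit()
            commits += 1
    if len(rows) % commit_every != 0:
        conn.commit()  # flush the last partial batch
        commits += 1
    return commits


def time_commit_sizes(row_count, sizes):
    """Time the same load once for each candidate commit size."""
    rows = [(i, f"row {i}") for i in range(row_count)]
    timings = {}
    for size in sizes:
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE target (id INTEGER, label TEXT)")
        conn.commit()
        start = time.perf_counter()
        load_rows(conn, rows, size)
        timings[size] = time.perf_counter() - start
        conn.close()
    return timings


if __name__ == "__main__":
    for size, seconds in time_commit_sizes(20_000, [10, 1_000, 20_000]).items():
        print(f"commit every {size:>6}: {seconds:.3f} s")
```

With a real target database the gap between commit sizes is usually much larger than with an in-memory one, so the loop is best pointed at the actual destination and the actual row volume.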
Does anybody know how the commit is done if I enter 0 as the value for commit every?
With 0 or empty, there won't be any commit at all.
HTH,
Does anybody know how the commit is done if I enter 0 as the value for commit every?
I have another question about my job. To improve performance, I need to adjust the commit setting on tAccessOutput. However, I don't know whether it's better to use a big commit (every 20000 rows, for example) or a small one (every 10 rows, for example).
I am using a computer with 1 GB of RAM, and my process writes 422188 rows to my Access database.