• Index
  •  » Talend Enterprise Data Integration » General questions
  •  » Performance - 12 millions rows in tFileInputdelimited

#1 2012-04-25 01:21:15

sree2
Member
Registered: 2012-04-02
Posts: 24

Performance - 12 million rows in tFileInputDelimited

Every hour a new log file is created with about 12 million entries. The ETL job runs every hour, and has to read these 12 million rows from the file and load them into a staging table.
I'm facing a huge performance issue.


I'm not able to upload an image of the Talend job, so here is the flow:
1) tFileList
2) tFileInputDelimited
3) tConvertType
4) tMap
5) insert into the table.

It takes about 3 hours to process the 12 million rows in a single log file. Because that exceeds the one-hour window, the log files keep accumulating and the ETL job can never work through the backlog.
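
To put numbers on the gap, here is a rough back-of-the-envelope sketch in Java. It only uses the figures from this post (12 million rows, 3 hours now, 1 hour available); the class and method names are made up for illustration.

```java
public class ThroughputSketch {
    // Rows per second, rounded up, so the result is a safe lower bound
    // on the rate the job has to sustain.
    static long rowsPerSecondCeil(long rows, long seconds) {
        return (rows + seconds - 1) / seconds;
    }

    public static void main(String[] args) {
        long rows = 12_000_000L;                       // rows per hourly log file
        long current = rowsPerSecondCeil(rows, 3 * 3600L); // today: about 3 hours
        long required = rowsPerSecondCeil(rows, 3600L);    // target: within 1 hour
        System.out.println("current rows/s:  " + current);
        System.out.println("required rows/s: " + required);
    }
}
```

So the job currently moves roughly 1,100 rows per second and needs about three times that just to keep pace, and more than that to clear the files already waiting.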


Does anyone have any suggestions on how to improve this process?

Thanks


#2 2012-04-25 03:47:26

pedro
Member
Registered: 2011-11-17
Posts: 3682

Re: Performance - 12 million rows in tFileInputDelimited

Hi

Usually tMap and tConvertType are the bottleneck in a job like this.
First, use the 'store on disk' option of tMap.
Second, remove tConvertType and do the type conversion in the tMap expressions instead.
Besides that, using a bulk component (e.g. tOracleOutputBulkExec) will improve the load performance.
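
As a rough illustration of the second point: tMap expressions are plain Java, so the conversion can be written directly in the output column expression, e.g. something like `Integer.parseInt(row1.status)`. The standalone sketch below shows the same idea outside Talend. The column names (`id`, `status`, `message`), the `;` delimiter and the class names are assumptions for the example, not taken from the actual job.

```java
import java.util.regex.Pattern;

public class InlineConvertSketch {
    // Hypothetical staging row; the real schema is not shown in this thread.
    static final class StagingRow {
        final long id;
        final int status;
        final String message;

        StagingRow(long id, int status, String message) {
            this.id = id;
            this.status = status;
            this.message = message;
        }
    }

    // Splits one delimited line and converts the fields in the same step,
    // which is what a conversion inside a tMap expression amounts to.
    static StagingRow parseLine(String line, String delimiter) {
        // Pattern.quote treats the delimiter literally; -1 keeps trailing empty fields.
        String[] f = line.split(Pattern.quote(delimiter), -1);
        return new StagingRow(
                Long.parseLong(f[0].trim()),
                Integer.parseInt(f[1].trim()),
                f[2]);
    }

    public static void main(String[] args) {
        StagingRow r = parseLine("42;200;GET /index.html", ";");
        System.out.println(r.id + " " + r.status + " " + r.message);
    }
}
```

Doing the conversion in the same pass means each row goes through one component fewer. How much that saves on a given job would have to be measured.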

Regards,
Pedro


Only Paranoid Survive.

