• Index
  •  » Talend Open Studio for Data Integration » Usage, Operation
  •  » merging multiple files with the same schema

#1 2009-08-21 02:02:19

ejb11235
New member
Company: One Tall Tree Consulting
Registered: 2009-04-09
Posts: 6

merging multiple files with the same schema

Tags: [error, memory, merge]

I have multiple files with the same schema. They are already sorted. I want to merge them together into a single file that is also sorted by the same field.

I can use tUnite --> tSortRow, but this is highly inefficient because it re-sorts data that is already sorted, and I also get an out-of-memory error.

Is there a "version" of tUnite that reads multiple files and outputs the records in sorted order?
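
To make clear what I mean: a streaming k-way merge, which holds only one record per input file in memory at a time. Below is a minimal sketch in plain Java, not an existing Talend component. The class name SortedFileMerge, the sortKey helper, and the file layout (no header row, sort field first, ';' as delimiter, keys compared as strings) are all assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class SortedFileMerge {

    /** Pairs the current record of one input with the reader it came from. */
    private static final class Head {
        final String line;
        final BufferedReader source;

        Head(String line, BufferedReader source) {
            this.line = line;
            this.source = source;
        }
    }

    /** Hypothetical key extractor: the first ';'-separated field (assumed layout). */
    static String sortKey(String line) {
        int i = line.indexOf(';');
        return i < 0 ? line : line.substring(0, i);
    }

    /** Merges inputs that are each already sorted by sortKey into one sorted output. */
    public static void merge(List<Path> inputs, Path output) throws IOException {
        // Min-heap ordered by sort key; it never holds more than one record per input.
        PriorityQueue<Head> heap =
                new PriorityQueue<>(Comparator.comparing((Head h) -> sortKey(h.line)));
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            // Seed the heap with the first record of every non-empty input.
            for (Path p : inputs) {
                BufferedReader r = Files.newBufferedReader(p, StandardCharsets.UTF_8);
                readers.add(r);
                String first = r.readLine();
                if (first != null) {
                    heap.add(new Head(first, r));
                }
            }
            // Write the smallest record, then refill the heap from the same input.
            while (!heap.isEmpty()) {
                Head smallest = heap.poll();
                out.write(smallest.line);
                out.newLine();
                String next = smallest.source.readLine();
                if (next != null) {
                    heap.add(new Head(next, smallest.source));
                }
            }
        } finally {
            for (BufferedReader r : readers) {
                r.close();
            }
        }
    }
}
```

Memory use here grows with the number of files rather than the number of records, which is why I would expect it to avoid the out-of-memory problem. A numeric or date sort field would need a different comparator than the string comparison assumed above.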

Suggestions?

Thanks,

--eric

Last edited by ejb11235 (2009-08-21 02:31:12)


#2 2009-08-21 04:12:35

shong
Talend team
Registered: 2007-08-29
Posts: 10350

Re: merging multiple files with the same schema

Hello,
tUnite is the most suitable component for merging records.
To work around the out-of-memory error, you can try the following:
1) Go to Window > Preferences > Talend > Run/Debug and modify the VM arguments in the "Job Run VM arguments" table to give the job more memory (for example, a larger -Xmx value).
2) Split your job into two subjobs: the first merges all the records into a temporary file; the second reads the records from the temporary file, sorts them, and writes them to the target file.

Best regards

          shong


Email:shong@talend.com
