Are your data sorted?
I'm trying to compare two XML files that have the same structure. My question is: how can I show the differences between the two XML files? Any solutions?
I don't know how Talend gets the metadata info, but it looks as if it runs a select * from table to fetch the metadata, when a single record would be enough.
That's the only reason I can see for the hellish wait time, as this also happens on databases. For delimited files, I suggest using a sample file instead, since I don't think it is possible to fetch just one row for this purpose.
The file contains 99,800 records, not fields, one of which is a header record. The individual lines contain around 100 fields.
It is a Java project.
So is it typical to use a subset of the file in order to get through the metadata setup?
I sometimes have problems with the execution time of "guess schema" in the Metadata section too (primarily with XML). As a workaround, I reduce the file to a limited number of lines myself and use that reduced file for the schema. In job execution I use the full file (without problems).
In fairness, I have to say that opening my (large) files in Notepad or UltraEdit sometimes makes those editors crash too.
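For anyone who wants to script that workaround, here is a minimal Java sketch that copies the first N lines (header included) of a large delimited file into a smaller sample file, which can then be used for the schema. This is not something Talend provides: the class name `SampleFile`, the method `writeSample`, and the file names `big_file.tsv` and `sample.tsv` are all made up for illustration, and it assumes the file is UTF-8 encoded.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Talend.
public class SampleFile {

    /**
     * Copies at most maxLines lines (header included) from source to target.
     * Assumes UTF-8 input. Returns the number of lines actually written.
     */
    public static int writeSample(Path source, Path target, int maxLines) throws IOException {
        int written = 0;
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            // Stop at the line limit or at end of file, whichever comes first.
            while (written < maxLines && (line = in.readLine()) != null) {
                out.write(line);
                out.newLine();
                written++;
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder paths; pass your own as arguments.
        Path source = Paths.get(args.length > 0 ? args[0] : "big_file.tsv");
        Path target = Paths.get(args.length > 1 ? args[1] : "sample.tsv");
        int n = writeSample(source, target, 100);
        System.out.println("Wrote " + n + " lines to " + target);
    }
}
```

The file is read line by line, so the full file is never loaded into memory. The limit of 100 lines is simply the value that was reported to work in this thread.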
Can you confirm that you have 99,800 fields on each line of your tab delimited file? That really sounds like a lot, and I don't remember any test with such a huge schema.
Reading your previous topics, I think you're using a Java project. It's a good idea to mention that when you ask a question: on this kind of operation, Perl and Java won't behave the same.
I went through the process of creating a Metadata -> File Delimited schema.
The file I selected is a tab delimited 52.9 MB file and contains just under 99,800 records/lines. When I click next on Step 3 of 4, Talend just sits there and I eventually have to kill the process. If, however, I limit the number of lines to 100, I do get to step 4 of 4.
Is there a limit on the size of the file I can select to create a schema from?