I need some help devising a strategy to parse JSON docs within a Talend job (Java job, not Perl). I am using Talend version 5.0.2, developing on a Mac and planning to run on a Linux box.
Unfortunately, I cannot use the tFileInputJSON component because of the format of my files -- each file contains several hundred JSON docs, with one complete JSON doc per line. I think the right solution is to read the file line by line, pass each line into a JSON parser, and from there send the results to the rest of the job.
As I see it my options are:
a) Send each input line to some sort of Java JSON parser. If that's the strategy I need to take, I'd like some advice on how to deal with the output and how to pass that output into my tMap/other parts of the job.
b) Find a Talend component that parses JSON docs, but doesn't require the input file to be a single valid JSON document.
I've searched around for this component but can't seem to find it. From my search, it seems even the tFileInputJSON component is relatively new.
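To make option (a) concrete, here is a rough sketch of what I have in mind, as plain Java outside of Talend. Everything in it is an assumption for illustration: the class name LineDelimitedJson, the methods parseFlat and readRows, and the sample fields id and name are made up. The parser is only a stand-in that handles flat objects with unescaped string, number, true/false or null values; a real job would plug a proper JSON library in at that point. It also uses Java 8+ syntax, so an older JDK would need anonymous classes instead of lambdas.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class LineDelimitedJson {

    // Stand-in parser (assumption): matches "key": "text" | number | true | false | null
    // in a FLAT object only. No nesting, no arrays, no escaped quotes.
    private static final Pattern PAIR = Pattern.compile(
        "\"([^\"\\\\]*)\"\\s*:\\s*(?:\"([^\"\\\\]*)\"|(-?\\d+(?:\\.\\d+)?|true|false|null))");

    // Turns one line (one JSON doc) into one row of column name -> text value.
    static Map<String, String> parseFlat(String json) {
        Map<String, String> row = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(json);
        while (m.find()) {
            String value = m.group(2) != null
                ? m.group(2)                                       // quoted string
                : ("null".equals(m.group(3)) ? null : m.group(3)); // literal
            row.put(m.group(1), value);
        }
        return row;
    }

    // Reads the input line by line and emits one parsed row per non-blank line.
    static List<Map<String, String>> readRows(
            BufferedReader in, Function<String, Map<String, String>> parser) {
        return in.lines()
                 .filter(line -> !line.trim().isEmpty())
                 .map(parser)
                 .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String file = "{\"id\": 1, \"name\": \"alpha\"}\n"
                    + "{\"id\": 2, \"name\": \"beta\"}\n";
        List<Map<String, String>> rows =
            readRows(new BufferedReader(new StringReader(file)), LineDelimitedJson::parseFlat);
        for (Map<String, String> row : rows) {
            System.out.println(row.get("id") + ";" + row.get("name"));
        }
    }
}
```

Each map would correspond to one row handed on to the rest of the job, with the map keys playing the role of schema columns.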
Anyone have some advice on where I should turn next?
Thanks in advance.
This post closely mirrors a previous, unanswered post: http://www.talendforge.org/forum/viewtopic.php?id=18291
TOS doesn't support a JSON component like tExtractXMLField.
For any new feature or new component, please report it on BugTracker.
So the workaround is to create a job as follows.
No.1: Read line by line from this file.
No.2: Save current line into a new delimited file called temp.txt.
No.3: Use tFileInputJSON to parse temp.txt and do your job logic.
No.4: Use tFileDelete to delete temp.txt and start a new loop for next line.
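The four steps above can be sketched in plain Java to show the control flow of the loop. This is only an illustration, not Talend-generated code: the class PerLineTempFile, its methods, and the handler argument (standing in for "extract the temp file and do your job logic") are hypothetical names, and the sketch assumes Java 8+ and UTF-8 input.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class PerLineTempFile {

    // Runs the handler once per non-blank input line, each time on a fresh
    // temp file that holds only that line.
    static <T> List<T> processEachLine(Path input, Function<Path, T> handler) {
        List<T> results = new ArrayList<>();
        try {
            for (String line : Files.readAllLines(input, StandardCharsets.UTF_8)) { // No.1
                if (line.trim().isEmpty()) {
                    continue;
                }
                Path temp = Files.createTempFile("temp", ".txt");
                try {
                    Files.write(temp, line.getBytes(StandardCharsets.UTF_8));       // No.2
                    results.add(handler.apply(temp));                               // No.3
                } finally {
                    Files.deleteIfExists(temp);                                     // No.4
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return results;
    }

    // Helper for the demo: writes sample content to a new temp input file.
    static Path writeInput(String content) {
        try {
            Path file = Files.createTempFile("docs", ".json");
            Files.write(file, content.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demo: a handler that just returns the temp file's text.
    static String readText(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path input = writeInput("{\"id\": 1}\n{\"id\": 2}\n");
        try {
            for (String doc : processEachLine(input, PerLineTempFile::readText)) {
                System.out.println(doc);
            }
        } finally {
            Files.deleteIfExists(input);
        }
    }
}
```

The point of the sketch is that the write, parse and delete steps all sit inside the per-line loop, so each pass sees a file with exactly one JSON doc in it.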
Thanks pedro -- I think that's going to have to work for now. I've also opened a question at Stack Overflow for anyone who'd like to follow what's going on there. Will keep you posted if we end up making a component/routine. Also filed a request!
Alright pedro -- I am about to share with you how bad I am with Talend...
How do I accomplish No. 1 and No. 2 on your list? I know tFileInputFullRow reads line by line, but I am having trouble getting it to write out just one of those rows at a time. It seems to read every line, then write every line. So if I have a two-line file, I cannot figure out how to split off a single line to write.
Care to give me another push?
Can anyone help me with this?
I am new to Talend and I want to load data from a MongoDB source. In the basic settings I have the option to edit the schema. How do I specify the schema for a nested document in MongoDB?
Thanks in advance
A schema is a row description, i.e. it defines the number of fields that will be processed and passed on to the next component.
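Because a row is flat, the fields of a nested document have to end up as individual columns one way or another. As a rough illustration only (plain Java, not a Talend or MongoDB API), the sketch below flattens a nested map into a single row. The class NestedDocFlattener, the dotted column names such as address.city, and the sample data are all assumptions made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NestedDocFlattener {

    // Flattens nested maps into one row; nested keys become "parent.child" columns.
    static Map<String, Object> flatten(Map<String, ?> doc) {
        Map<String, Object> row = new LinkedHashMap<>();
        flattenInto("", doc, row);
        return row;
    }

    private static void flattenInto(String prefix, Map<String, ?> node, Map<String, Object> row) {
        for (Map.Entry<String, ?> entry : node.entrySet()) {
            String column = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                // Sub-document: recurse, extending the column prefix.
                @SuppressWarnings("unchecked")
                Map<String, ?> child = (Map<String, ?>) value;
                flattenInto(column, child, row);
            } else {
                // Leaf value: becomes one field of the row.
                row.put(column, value);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Paris");
        address.put("zip", "75001");

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "Ann");
        doc.put("address", address);

        // Columns of the resulting row: name, address.city, address.zip
        System.out.println(flatten(doc).keySet());
    }
}
```

Arrays inside a document are left as single values here; how to spread them over rows or columns is a separate design decision.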
I have replied in your related forum thread, Forum 29488.
Hello pedro - I'm not seeing any images in your post dated 2012-04-11 04:15:09. I know it's been over 3 years, but is there any chance you still have those images and would be kind enough to post them? I need to do something similar, but have limited Talend experience.