#1 2012-04-04 01:09:11

gbadge
Member
4 posts

gbadge said:

Parse file containing multiple JSON documents

Tags: [java, JSON]

I need some help devising a strategy to parse JSON docs within a Talend job (Java job, not Perl). I am using Talend version 5.0.2, developing on a Mac and planning to run on a Linux box.

Unfortunately, I cannot use the tFileInputJSON component because of the format of my files -- each file contains several hundred JSON docs, with a complete JSON doc taking up one line in the file. I think the right solution is to read the file line by line then pass it into a JSON parser and from there send the results to the rest of the job.
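For readers following along, here is a minimal sketch of that line-by-line idea in plain Java, outside of Talend. It assumes one complete JSON document per line and skips blank lines. The class name `JsonLines`, the method `parseEachLine`, and the `parser` argument are hypothetical names for illustration; the parse step is left as a pluggable function because the JDK ships no JSON parser, so a real job would pass in a call to whichever JSON library is on its classpath.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class JsonLines {

    /**
     * Reads the input one line at a time and hands every non-blank line,
     * assumed to hold one complete JSON document, to the supplied parser.
     * The parser is a placeholder for a real JSON library call.
     */
    static <T> List<T> parseEachLine(Reader input, Function<String, T> parser) {
        List<T> results = new ArrayList<>();
        BufferedReader reader = new BufferedReader(input);
        reader.lines()
              .map(String::trim)
              .filter(line -> !line.isEmpty())
              .forEach(line -> results.add(parser.apply(line)));
        return results;
    }

    public static void main(String[] args) {
        // Two documents, one per line, as described in the post above.
        String file = "{\"id\": 1, \"name\": \"a\"}\n{\"id\": 2, \"name\": \"b\"}\n";

        // Stand-in parser: it only measures each line. Swap in a real parser here.
        List<Integer> lengths = parseEachLine(new StringReader(file), String::length);
        System.out.println(lengths.size() + " documents read");
    }
}
```

Inside a Talend job the same shape would probably live in a routine or a tJavaRow, with each parsed value mapped to an output column, but that wiring is not shown here.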

As I see it my options are:

a) Send each line to some sort of Java JSON parser. If that's the strategy I need to take, I'd like some advice on how to deal with the output and how to pass it into my tMap and other parts of the job.

b) Find a Talend component that parses JSON docs but doesn't require its input to be a file containing a single valid JSON document.

I've searched around for this component but can't seem to find it. From my search, it seems even the tFileInputJSON component is relatively new.

Anyone have some advice on where I should turn next?

Thanks in advance.



This post closely mirrors a previous, unanswered post: http://www.talendforge.org/forum/viewtopic.php?id=18291

#2 2012-04-05 08:40:41

pedro
Member
3682 posts

pedro said:

Re: Parse file containing multiple JSON documents

Hi

TOS doesn't have a JSON equivalent of tExtractXMLField.
For any new feature or new component, please request it on the BugTracker.
So the workaround is to create a job as follows.
No.1: Read this file line by line.
No.2: Save the current line into a new delimited file called temp.txt.
No.3: Use tFileInputJSON to extract temp.txt and do your job logic.
No.4: Use tFileDelete to delete temp.txt and start a new loop for the next line.
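
The four steps above can be sketched in plain Java as a rough illustration of the loop. This is not the code Talend generates and it does not use Talend components. The class `TempFilePerLine`, the `TempFileHandler` callback (standing in for the tFileInputJSON step and the job logic), and the `demo` helper are all hypothetical names, and the temp file gets a generated name rather than a fixed temp.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class TempFilePerLine {

    /** Hypothetical callback standing in for "extract temp.txt and do your job logic". */
    interface TempFileHandler {
        void handle(Path tempFile) throws IOException;
    }

    /** Returns the number of lines that were written out and processed. */
    static int processLineByLine(Path source, TempFileHandler handler) {
        int processed = 0;
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {            // No.1: read line by line
                if (line.trim().isEmpty()) {
                    continue;                                        // ignore blank lines
                }
                Path temp = Files.createTempFile("temp", ".txt");
                try {
                    Files.write(temp, line.getBytes(StandardCharsets.UTF_8)); // No.2: save current line
                    handler.handle(temp);                            // No.3: process the one-document file
                    processed++;
                } finally {
                    Files.deleteIfExists(temp);                      // No.4: delete, then loop again
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return processed;
    }

    /** Small self-contained run: the handler just records what each temp file contained. */
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        try {
            Path source = Files.createTempFile("docs", ".json");
            try {
                Files.write(source, "{\"id\": 1}\n{\"id\": 2}\n".getBytes(StandardCharsets.UTF_8));
                processLineByLine(source,
                        temp -> seen.add(new String(Files.readAllBytes(temp), StandardCharsets.UTF_8)));
            } finally {
                Files.deleteIfExists(source);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Deleting the temp file in a `finally` block keeps a failure in one document from leaving a stale file behind for the next iteration.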

Regards,
Pedro


Only Paranoid Survive.

#3 2012-04-11 01:24:53

gbadge
Member
4 posts

gbadge said:

Re: Parse file containing multiple JSON documents

Thanks pedro -- I think that's going to have to work for now. I've also opened a question on Stack Overflow for anyone who'd like to follow what's going on there. Will keep you posted if we end up making a component/routine. I've also filed a feature request!

http://stackoverflow.com/questions/1000 … for-talend

#4 2012-04-11 03:47:43

gbadge
Member
4 posts

gbadge said:

Re: Parse file containing multiple JSON documents

Alright pedro -- I am about to share with you how bad I am with Talend...

How do I accomplish No. 1 and No. 2 on your list? I know tFileInputFullRow reads line by line, but I am having trouble getting it to write just one of those rows at a time. It seems to read every line and then write every line. So if I have a two-line file, I cannot figure out how to split off one line and write it on its own.

Care to give me another push?

Thanks!

#5 2012-04-11 04:15:09

pedro
Member
3682 posts

pedro said:

Re: Parse file containing multiple JSON documents

Hi

You can create a job as shown in the following images.

Regards,
Pedro


Only Paranoid Survive.

#6 2012-04-11 18:49:44

gbadge
Member
4 posts

gbadge said:

Re: Parse file containing multiple JSON documents

Pedro! Thanks so much. Learned a ton from your example and got it to work. Really really appreciate it.

#7 2013-06-11 09:19:31

snigdha224
Member
3 posts

snigdha224 said:

Re: Parse file containing multiple JSON documents

Hi,
Can anyone help me with this?
I am new to Talend and I want to load data from a MongoDB source. In the basic settings I have the Edit schema option. How do I specify the schema for a nested document in MongoDB?

Thanks in advance

#8 2013-06-17 05:21:25

xdshi
Talend Team


xdshi said:

Re: Parse file containing multiple JSON documents

Hi snigdha224,

A schema is a row description, i.e. it defines the fields that will be processed and passed on to the next component.
I have replied to your related forum topic 29488.

Best regards
Sabrina


What we can do is to make sure that Talend will be your best choice!

#9 2015-09-17 22:27:27

bdavis
Guest

bdavis said:

Re: Parse file containing multiple JSON documents

Hello pedro - I'm not seeing any images in your post dated 2012-04-11 04:15:09. I know it was over 3 years ago, but is there any chance you still have those images and would be kind enough to post them? I need to do something similar, but have limited Talend experience.

Many thanks!
