• Index
  •  » Talend Open Studio for Data Integration » Usage, Operation
  •  » tFileOutputDelimited with text enclosure only for type "String"

#1 2010-11-03 09:58:05

ODN
New member
Registered: 2008-12-11
Posts: 4

tFileOutputDelimited with text enclosure only for type "String"

Hello

I want to create a CSV file with a tFileOutputDelimited, with text enclosure applied only to columns of type "String" (not to numbers).

Example:

     12;"text2";"text3";15
     45;"text4";"text25";12

How can I do this?

Thanks

Eric

Offline

#2 2010-11-03 14:14:01

JohnGarrettMartin
Member
Registered: 2009-01-07
Posts: 762

Re: tFileOutputDelimited with text enclosure only for type "String"

In a tJavaRow you can use java.lang.reflect to retrieve the column types, enclose the strings in quotes, and then write them out to a delimited file with no text enclosure.

Offline

#3 2010-11-03 15:51:24

tchd
Member
Company: edexe
Registered: 2010-07-05
Posts: 79
Website

Re: tFileOutputDelimited with text enclosure only for type "String"

Hi ODN,

I hope you don't mind me hijacking the thread a little, but I really liked John's idea of using java.lang.reflect, so I gave it a go myself (see code below). I'm fairly new to Java, so I'm not sure whether this would be considered robust enough for, say, production use.

John, if you would do it differently or have any tips or ideas regarding robustness, error checking, etc., then I'd love to hear your thoughts.

Regards,
Rick

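// tJavaRow body. Requires: import java.lang.reflect.*; (Field, Modifier)
// Assumes input_row and output_row use the same column names.
Class<?> cls = input_row.getClass();
Class<?> clsOut = output_row.getClass();

for (Field fld : cls.getDeclaredFields()) {
    int mod = fld.getModifiers();
    // Only plain public instance fields are treated as columns.
    if (!Modifier.isPublic(mod) || Modifier.isStatic(mod) || Modifier.isFinal(mod)) {
        continue;
    }
    // Match the output field by name instead of relying on array order,
    // which getDeclaredFields() does not guarantee.
    Field fldOut = clsOut.getField(fld.getName());
    Object value = fld.get(input_row);
    if (fld.getType() == String.class && value != null) {
        // Enclose non-null strings; embedded quotes are not escaped here.
        fldOut.set(output_row, "\"" + value + "\"");
    } else {
        // Numbers, dates and null strings are copied unchanged.
        fldOut.set(output_row, value);
    }
}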

Offline

#4 2010-11-04 00:19:25

CaptainRoo
Member
Registered: 1970-01-01
Posts: 66

Re: tFileOutputDelimited with text enclosure only for type "String"

This is all good, but the actual question is that the component should not even place "s around non string columns...
I would consider the behavior not correct, since an integer is non-text.


HTH,

Gábor

Offline

#5 2010-11-04 12:22:20

janhess
Member
Company: Newcastle University
Registered: 2009-05-19
Posts: 1123

Re: tFileOutputDelimited with text enclosure only for type "String"

You could do it manually in a map rule by adding " to the start and end of each string field and leaving the CSV options unset on the output file.
That is fine if there are only a few fields.
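
A minimal sketch of what the manual approach amounts to, written as plain Java so it can be run outside the Studio. The Row class, its column names (id, label, note, qty) and the helper methods are hypothetical stand-ins, not Talend-generated code; in a real job each quoted assignment would be the expression of one String output column in the map.

```java
public class ManualEnclosureDemo {
    /** Hypothetical stand-in for a generated row class with public column fields. */
    public static class Row {
        public Integer id;
        public String label;
        public String note;
        public Integer qty;
    }

    /** Mimics per-column map expressions: only the String columns get quotes. */
    public static Row encloseStrings(Row in) {
        Row out = new Row();
        out.id = in.id;                       // numeric column: copied as-is
        out.label = "\"" + in.label + "\"";   // string column: quotes added
        out.note = "\"" + in.note + "\"";     // string column: quotes added
        out.qty = in.qty;                     // numeric column: copied as-is
        return out;
    }

    /** Mimics a delimited output with ';' as separator and no text enclosure. */
    public static String toLine(Row r) {
        return r.id + ";" + r.label + ";" + r.note + ";" + r.qty;
    }

    public static void main(String[] args) {
        Row in = new Row();
        in.id = 12;
        in.label = "text2";
        in.note = "text3";
        in.qty = 15;
        System.out.println(toLine(encloseStrings(in)));
    }
}
```

Run with the values from the first post, this prints 12;"text2";"text3";15. Note that a null string would come out as "null" in quotes with this naive concatenation, so a null check may be needed per column.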

Offline

#6 2010-11-04 12:55:33

tchd
Member
Company: edexe
Registered: 2010-07-05
Posts: 79
Website

Re: tFileOutputDelimited with text enclosure only for type "String"

CaptainRoo wrote:

This is all good, but the actual question is that the component should not even place "s around non string columns...
I would consider the behavior not correct, since an integer is non-text.

Hi CaptainRoo,

The code does enclose only strings; however, I neglected to state that:

- It should be in a tJavaRow
- You will need to import java.lang.reflect.*
- You will need to ensure that the tFileOutputDelimited has the CSV options turned off, so that no automatic enclosure is performed.


janhess: you are absolutely correct.  This method would add an overhead in processing due to the need to loop through the fields for every row, whereas adding a rule for each field wouldn't have that overhead.
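
One way to reduce that per-row overhead might be to resolve the String fields once and reuse the list for every row. The following is only a sketch under assumptions: the class name StringFieldCache, the Row class and both method names are hypothetical, and where such a cached list would live in a job (a routine, the globalMap, etc.) is left open. It also modifies the row in place rather than copying from input_row to output_row.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class StringFieldCache {
    /** Hypothetical stand-in for a generated row class with public column fields. */
    public static class Row {
        public Integer id;
        public String label;
        public String note;
        public Integer qty;
    }

    /** Done once per row class: collect the public, non-static, non-final String fields. */
    public static List<Field> publicStringFields(Class<?> rowClass) {
        List<Field> result = new ArrayList<>();
        for (Field f : rowClass.getDeclaredFields()) {
            int mod = f.getModifiers();
            boolean isColumn = Modifier.isPublic(mod)
                    && !Modifier.isStatic(mod)
                    && !Modifier.isFinal(mod);
            if (isColumn && f.getType() == String.class) {
                result.add(f);
            }
        }
        return result;
    }

    /** Done per row: wrap each non-null String value in double quotes. */
    public static void encloseInPlace(Object row, List<Field> stringFields) {
        for (Field f : stringFields) {
            try {
                Object value = f.get(row);
                if (value != null) {
                    f.set(row, "\"" + value + "\"");
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot access field " + f.getName(), e);
            }
        }
    }
}
```

The reflective get and set calls still happen for every row, so this only removes the repeated field lookup and type test, not the whole cost.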

Another option would be to create a code routine to add the "s and then apply it to each string field.
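
Such a routine could look roughly like the sketch below. The class and method names (EncloseRoutine, enclose) are hypothetical, and doubling embedded quotes is an assumption based on the usual CSV convention, not something asked for in this thread; drop the replace call if the target system expects something else.

```java
public class EncloseRoutine {
    /**
     * Wraps a value in double quotes for CSV output.
     * Returns null for null input so empty columns stay empty.
     * Assumption: embedded double quotes are escaped by doubling them.
     */
    public static String enclose(String value) {
        if (value == null) {
            return null;
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(enclose("text2"));
        System.out.println(enclose("say \"hi\""));
    }
}
```

Each String column would then use an expression along the lines of EncloseRoutine.enclose(row1.label), where row1 and label are placeholder names, while numeric columns are mapped directly.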


Regards,
Rick

Offline

#7 2010-11-14 23:36:11

archenroot
Member
Company: CoeTech Unconnected
Registered: 2010-02-23
Posts: 166
Website

Re: tFileOutputDelimited with text enclosure only for type "String"

Well, I edited this post because my first answer was a little bit wrong :-)) hehe

I checked tFileInputDelimited, and you can use the "Text enclosure" feature if you check the CSV option on the component. So the functionality you need is already there.

This is also available when you create metadata for a delimited file: select the type CSV, and the "Text enclosure" option is already there, set to """. The schema will then be based on reading that file like this:
"alfa" will be String
245 will be Integer

So Talend already supports your scenario.

Best regards,

archenroot

Last edited by archenroot (2010-11-15 07:49:01)


Emperor wants to control outer space Yoda wants to explore inner space that's the fundamental difference between good and bad sides of the Force

Offline

#8 2010-11-17 11:32:07

ODN
New member
Registered: 2008-12-11
Posts: 4

Re: tFileOutputDelimited with text enclosure only for type "String"

Hello

I tried the "tFileOutputDelimited" component with the "Text enclosure" option.

With an Integer column I still get "123".

Offline

#9 2010-11-19 03:56:27

CaptainRoo
Member
Registered: 1970-01-01
Posts: 66

Re: tFileOutputDelimited with text enclosure only for type "String"

Told you so. ;-)

Rick - aka tchd - why are you saying that it has to be tJavaRow? ODN did not specify that...


HTH,

Gábor

Offline

#10 2010-11-19 09:25:48

tchd
Member
Company: edexe
Registered: 2010-07-05
Posts: 79
Website

Re: tFileOutputDelimited with text enclosure only for type "String"

Hi Gabor,

It was a comment on the Java code I wrote, which uses java.lang.reflect to test the field types and then enclose the strings, i.e. that code has to be in a tJavaRow component to work.

Regards,
Rick

Offline
