• Index
  •  » Talend Open Studio for Data Integration » Usage, Operation
  •  » Import one XML file into multiple tables and keep the relations

#1 2009-11-17 14:15:16

gavelin
Member
Registered: 2009-11-12
Posts: 12

Import one XML file into multiple tables and keep the relations

Tags: [import, mysql, schema, xml]

Hi,

I have a XML file like this :

<?xml version="1.0" encoding="UTF-8"?>
<PrintedProduct xmlns="editorialin" Name="AIXPR" NewsPaperName="BLP" PublicationDate="2009-10-08T00:00:00:000" Priority="50" Version="1">
    <PrintedProductDescription Structure="B34T20">
                  <Section Format="Broadsheet" Layout="GABBLP" Name="A" PhysicalSectionNumber="1">
            <Page FileName="B01AIX_20091008.ps" Name="B01AIX-Q" NumberInSection="1">
                <Color Name="Cyan" NumberOfPlates="2" id="80001" sepId="BLP_AIXPR_0_20091008_001_C"/>
                <Color Name="Magenta" NumberOfPlates="2" id="80002" sepId="BLP_AIXPR_0_20091008_001_M"/>
                <Color Name="Yellow" NumberOfPlates="2" id="80003" sepId="BLP_AIXPR_0_20091008_001_Y"/>
                <Color Name="Black" NumberOfPlates="2" id="80004" sepId="BLP_AIXPR_0_20091008_001_K"/>
            </Page>
            <Page FileName="P2BCH_20091008.ps" Name="P2BCH-Q" NumberInSection="2">
                <Color Name="Cyan" NumberOfPlates="0" id="80005" sepId="BLP_AIXPR_0_20091008_002_C"/>
                <Color Name="Magenta" NumberOfPlates="0" id="80006" sepId="BLP_AIXPR_0_20091008_002_M"/>
                <Color Name="Yellow" NumberOfPlates="0" id="80007" sepId="BLP_AIXPR_0_20091008_002_Y"/>
                <Color Name="Black" NumberOfPlates="0" id="80008" sepId="BLP_AIXPR_0_20091008_002_K"/>
            </Page>
                  </Section>
        </PrintedProductDescription>
</PrintedProduct>

I want to import this in a database like this :

Table "PrintProduct" :
------------------------
Fields :
idPrintProduct (auto increment), primary key
name
date
structure

Table "Section " :
------------------------
Fields :
idSection (auto increment), primary key
idPrintProduct (foreign key)
format
layout
name

Table "Page"
-----------------------
Fields :
idPage (auto increment), primarykey
idSection (foreign key)
fileName
name


So i set up a metadata xml to extract the data, i do a loop on the "page" element (see the capture)

http://www.guillaumekuhmel.com/files/capture1.JPG

So now each row extracted from the xml contains page info, section info and printedproduct info.

Now i want to feed my three tables :
- first the PrintedProduct : a new ID is generated by the mysql database,
- then the section : i need to lookup my PrintedProduct table to retrieve the newly created ID and fill the foreign key of Section table with it, a new idSection is created by the mysql database,
- then the page : i need to lookup my Section table to retrive the newly created idSection and fill the foreign key of the Page table with it, a new idPage is created.

I have absolutely no idea of how to deal with that !

Have i to create one tFileInputXML for each element (ie one for PrintedProduct, one for Section, one for Page), and deal with the feed of my tables separately ?

Any help would be very very appreciated !!

Last edited by gavelin (2009-11-17 14:16:55)

Offline

#2 2010-12-10 12:43:08

fclement
New member
Registered: 2010-12-09
Posts: 1

Re: Import one XML file into multiple tables and keep the relations

I have the same question.

Did you find an answer or solution ?

Offline

#3 2010-12-10 18:04:54

gavelin
Member
Registered: 2009-11-12
Posts: 12

Re: Import one XML file into multiple tables and keep the relations

I can tell you how i've work with that :

solution one :

- read the XML file for each reference table, feed the destination table for the "reference tables" and create id's

- read the XML file then for the child rows, make a map between the previously feeded tables and feed each child tables. You have to make a join between the values of the xml and the values of the reference table (not the id of course because u dont know them in the xml), and put the id of the reference row matched in the child table.

solution two :
you can read only once the XML , then replicate the flox with tReplicate , then do the same than above with just a difference : for each flow, make a aggregate if needed.
Can be faster for huge XML files.

Offline

  • Index
  •  » Talend Open Studio for Data Integration » Usage, Operation
  •  » Import one XML file into multiple tables and keep the relations

Board footer

Powered by FluxBB