5mn Demo - Set up a Join Familiarize yourself with Talend Open Studio for Data Integration.

The 5' demo presents Talend Open Studio for Data Integration's main features. This demo is available in video format on Talend's homepage http://www.talend.com.

In this tutorial, you will learn the tMap features by joining an input file and a database table, and mapping and transforming the data to create a database table.

For ease of use, this tutorial is divided into four segments:
- [5mn Demo - Talend Open Studio for Data Integration Overview]
- [5mn Demo - tMap Essential Features]
- [5mn Demo - Set up a Join]
- [5mn Demo - Export Job Script]


Prerequisites:

To follow this tutorial, you need to download and extract the exampleFile.zip file available at the bottom of this page, in the Download it! section of this tutorial.

Once unzipped, you will get three new files:
- customers_demo5mn.csv, the input file used for the join.
- states_demo5mn.txt, a file containing the data to be loaded in the database table used for the join.
- demo5mn_prerequisite.zip, a zip file containing Jobs to import in the Studio creating the tables to be used for the join from the states_demo5mn.txt and customers_demo5mn.csv files. (To import the two prerequisite Jobs, click the Import items button from the Studio, select the demo5mn_prerequisite.zip in the Select archive file field and select the two Jobs: Customers2DbTable_0.1 and States2DbTable_0.1.)

Once the prerequisite Jobs imported, change the settings of their components to match your configuration and execute them.

Now the two tables: customers and states needed for this tutorial, are created.

Download it!

You want to practice?

Download exampleFile.zip to get the files used for this tutorial.

You can also download tutorialProject.zip containing all the jobs needed to carry out this tutorial.

You can also:
Send it!

Share it!
Next Step: [5mn Demo - Export Job Script]

 


Create and use Db Connection metadata


In the Repository on the left:

Right-click on Db Connections.

In the menu, click Create connection to open the Database connection wizard.

Next

In the Database connection wizard:

In the Name field, fill in the name of the connection: demo5mn.

Click Next to continue.

Next

In the Database Settings area:

In the DB Type list, select the type of database you want to connect to.

In the Login and Password fields, fill in the login and password used to connect to the database.

In the Server field, fill in the name or IP address of the database server.

In the Port field, fill in the listening port number of the database server.

In the Database field, fill in the name of the database you want to connect to.

Click Check to verify the database connection.

Next

The Check Connection message box will indicate if the connection is working.

Click OK.

Next

If the connection has succeeded, click Finish to close the wizard and create the new connection.

If the connection has failed, change the database settings and click Check again.

Next

In the Repository on the left:

The new connection demo5mn will appear under the Metadata > Db Connections node.

Right-click on it.

In the menu, click Retrieve Schema to open the Schema wizard.

Next

In the Schema wizard:

In the Select Filter Conditions area, select Use the Name Filter or Use the Sql Filter depending on the type of filter you want to use.

In the Select Types area, select the type of item you are looking for:
table, view and/or synonym.

In the Set the Name Filter or Set the Sql Filter areas, set the filter you want to use.

For this tutorial, leave the default settings.

Click Next.

Next

In the Select Schema to create area, enter the name of the table you are looking for in the Name Filter field and check the corresponding box in the Name list, or simply check the table in the list.

Here, we select the customers and states tables.

Click Next.

Next

In the Schema area on the left of the wizard, select the table.

Once selected, its name displays in the Name field at the top of the wizard. You can modify it.

Its structure also displays in the Schema area at the bottom of the wizard. You can modify it via the toolbar below the table.

For this tutorial, we don't have to modify the structure.

In the Schema area on the left, click the other table.

Next

The Name field displays the name of the table.

The Schema area on the bottom displays its structure.

Click Finish.

Next

In the Repository on the left:

The new schemas are displayed in the Metadata > Db Connections > demo5mn > Table schemas node.

To add the input component, press Ctrl+Shift while you drag and drop the states metadata into the Job Designer above the tMap.

The tMysqlInput component is automatically created in the Job Designer.

Next

In the Job Designer:

Double-click the tMysqlInput to show the corresponding Component view to set its Basic settings.

In the Component view:

Click [...] next to the Query field to open the SQL Builder wizard.

Next

In the SQL Builder wizard:

Click the Designer tab at the bottom of the demo5mn(demo5mn).sql view.

Uncheck the ID, Capital and MostPopulousCity boxes.

Click the running man icon.

The result of the query displays in the Result: 1 view at the bottom of the wizard.

Click OK.

Next

In the Job Designer:

To link the components, right-click on tMysqlInput, hold, and drag to the tMap.

Next

In the Job Designer:

Right-click on the tLogRow. In the menu, click Delete.

In the Repository on the left:

To add the output component, press Ctrl while you drag and drop the customers metadata in the Job Designer, to where the tLogRow was.

Next

In the Job Designer:

Double-click on the tMysqlOutput to show the corresponding Component view and define its Basic settings.

In the Component view:

In the Action on table list, click Drop table if exists and create to replace the old version of the table with the new one.

Next

In the Job Designer:

To link the components together, right-click on the tMap, hold, and drag it to the tMysqlOutput.

In the tMap_1 Output wizard, name the link between the tMap and the tMysqlOutput: out1.

Double-click on the tMap to open the tMap editor.

Next
Set the processing component parameters

In the tMap editor:

Click the row1 table and click [x] to delete it.

In the row2 table, select all the columns except states and drag them to the out1 table.
Now select the states column and drag it to the Postal column of the row3 table to create the join.

In the row3 table, select the State column and drag it at the end of the out1 table.

Next

In the out1 table, select the row3.State column. Go to the out1 table in the Schema editor area, and set it length to 14 characters.

Click OK
Next
Run the Job

In the Job Designer:

Press Ctrl+S to save the Job.

Click the Run view.

Check the Statistics and Traces boxes.

Click Run to execute the Job in Statistics and Traces mode.

In the Job Designer:

The Traces table displays the data being processed.

In the Run view:

Click Kill to stop the execution of the Job.

Next

In the Run view:

Uncheck the Traces box.

Click Run again.

Next
  

The Job is working!

It creates a join between two inputs and populates a database with the aggregated data.

Now we want to export the Job.

  

 

    Download it!     Send it!     Share it!

You want to practice?

Download exampleFile.zip to get the files used for this tutorial.

You can also download tutorialProject.zip containing all the jobs needed to carry out this tutorial.

Friends / colleagues may be interested in this tutorial? Send it to them!

You liked this tutorial ? Support it!

[ top ]