  |
 | Create a Job Design
|
In the Repository on the left of the Talend Open Studio main screen: Right-click Job Designs. In the menu, click Create Job to open the New Job wizard.
|
  |
In the New Job wizard: In the Name field, fill in the name of the Job: howToSortFile. Click Finish to close the wizard and create the Job. The Job Designer opens an empty Job.
 | The Name field does not accept accents, special characters, or spaces, and the name must not start with a digit.
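The naming rule above can be expressed as a quick check. This is a minimal sketch in Python, not Talend's own validation: the function name `is_valid_job_name` is hypothetical, and the pattern assumes that only ASCII letters, digits, and underscores are accepted.

```python
import re

# Assumed rule: ASCII letters, digits, and underscores only,
# and the first character must not be a digit.
_JOB_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_job_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rule sketched above."""
    return _JOB_NAME.fullmatch(name) is not None


print(is_valid_job_name("howToSortFile"))  # True
print(is_valid_job_name("2sortFile"))      # False: starts with a digit
print(is_valid_job_name("sort file"))      # False: contains a space
print(is_valid_job_name("triéFichier"))    # False: contains an accent
```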
|
|
  |
 | Set the parameters of the component that reads the delimited file
|
In the Palette on the right: To add an input component, click the File family and the Input sub-family. Click the tFileInputDelimited component and drop it on the Job Designer.
|
  |
In the Job Designer: Double-click the tFileInputDelimited to show the corresponding Component view to define its Basic settings. In the Component view: To specify the path to the customer.csv file, click [...] next to the File Name field and select the file from the wizard. To describe the structure of the file, click [...] next to the Edit schema field to open the "Schema of tFileInputDelimited_1" wizard.
|
  |
 | Set the structure of the data flow schema
|
In the Schema of tFileInputDelimited_1 wizard: To describe the columns of the customer file, click (+) nine times. Nine lines are added to the schema; set them according to your file as shown in the next step.
 | For a schema with many columns, consider defining the file under Metadata in the Repository rather than entering each column by hand.
|
|
  |
In the Schema of tFileInputDelimited_1 wizard: In the Column column, rename each field to match the columns of the file. In the Type column, set the data type of each column. In the Length column, enter the length of each field of your schema. Click OK to close the wizard.
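To make the role of the schema more concrete, here is a minimal sketch in Python of reading a delimited file against a list of column definitions. It is not the code Talend generates. The column names `id`, `name`, and `city` are hypothetical (the real customer file has nine columns), the semicolon separator and the single header line are assumptions, and rejecting over-long values is a choice made for this sketch only.

```python
import csv
import io

# Hypothetical schema: (column name, Python type, maximum length).
# Three columns are enough to show the idea.
SCHEMA = [
    ("id", int, 6),
    ("name", str, 30),
    ("city", str, 20),
]


def read_delimited(stream, schema, separator=";", header_rows=1):
    """Yield one dict per data row, converting each field to its schema type."""
    reader = csv.reader(stream, delimiter=separator)
    for _ in range(header_rows):
        next(reader, None)  # skip header lines
    for fields in reader:
        row = {}
        for (name, col_type, length), raw in zip(schema, fields):
            if len(raw) > length:
                raise ValueError(f"{name!r} exceeds length {length}: {raw!r}")
            row[name] = col_type(raw)
        yield row


sample = io.StringIO("id;name;city\n12;Alice;Paris\n3;Bob;Lyon\n")
rows = list(read_delimited(sample, SCHEMA))
print(rows)
```

Each row comes back as a dictionary keyed by column name, with the `id` field already converted to an integer.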
|
  |
 | Set the parameters of the component that writes the delimited file
|
In the Palette on the right: To add the output component, click the Output sub-family of the File family. Click the tFileOutputDelimited component and drop it on the Job Designer.
|
  |
In the Job Designer: Double-click tFileOutputDelimited to show the corresponding Component view to define its Basic settings. In the Component view: To specify the path of the file you are creating, click [...] next to the File Name field. In the wizard, define the same path as for the customer.csv file but name the file temp.csv. Check the Include Header box to write the column names as the first line of the file.
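The effect of the Include Header option can be illustrated with a short Python sketch. This is an illustration under assumptions, not Talend's generated code: the function `write_delimited`, its parameters, and the semicolon separator are hypothetical.

```python
import csv
import io


def write_delimited(stream, rows, columns, separator=";", include_header=True):
    """Write dict rows as delimited text, optionally preceded by a header line."""
    writer = csv.DictWriter(
        stream, fieldnames=columns, delimiter=separator, lineterminator="\n"
    )
    if include_header:
        writer.writeheader()  # first line holds the column names
    writer.writerows(rows)


buffer = io.StringIO()
write_delimited(buffer, [{"id": 1, "name": "Alice"}], ["id", "name"])
print(buffer.getvalue())
```

With `include_header=False` the same call would write only the data line.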
|
  |
 | Define the processing component and link the components
|
In the Palette on the right: To add the component sorting the data, click on the Processing family. Click the tSortRow component and drop it on the Job Designer.
|
  |
In the Job Designer: To link the components, right-click on tFileInputDelimited, hold and drag to the tSortRow. Do the same to link the tSortRow to the tFileOutputDelimited.
 | You can also right-click on the component and select Row > Main on the right-click menu to link the components.
|
|
  |
In the Job Designer: Double-click the tSortRow to show the corresponding Component view to define its Basic settings. In the Component view: Define the sorting criteria by clicking (+) to add a line to the Criteria table. Select the column to sort by, as shown in the screenshot.
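The idea of a criteria table can be sketched in Python as follows. This is an illustration only, not the tSortRow implementation. It assumes that each criterion names a column, a numeric or alphabetical comparison, and an ascending or descending order; the function `sort_rows` and the sample data are hypothetical.

```python
# Each criterion: (column name, "num" or "alpha", "asc" or "desc").
def sort_rows(rows, criteria):
    """Return rows sorted by the criteria, the first criterion taking priority."""
    result = list(rows)
    # Python's sort is stable, so applying the criteria from last to first
    # gives the first criterion the highest priority.
    for column, kind, order in reversed(criteria):
        convert = float if kind == "num" else str
        result.sort(key=lambda row: convert(row[column]), reverse=(order == "desc"))
    return result


customers = [
    {"name": "Bob", "age": "9"},
    {"name": "Alice", "age": "30"},
    {"name": "Carol", "age": "30"},
]

numeric = sort_rows(customers, [("age", "num", "asc")])
print([row["name"] for row in numeric])       # ['Bob', 'Alice', 'Carol']

alphabetical = sort_rows(customers, [("age", "alpha", "asc")])
print([row["name"] for row in alphabetical])  # ['Alice', 'Carol', 'Bob']
```

The two calls differ only in the comparison type: compared as text, "30" sorts before "9", which is why choosing the right type for each criterion matters.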
|
At this point, the Job creates a new file named temp.csv containing the sorted data. Because the purpose of the Job is to sort the source file, not to create a new one, the source file must now be replaced with the new one.
  |
 | Define the file management component and link it to the first Subjob
|
In the Palette on the right: To replace the source file with the new one, click the File family and the Management sub-family. Click the tFileCopy component and drop it on the Job Designer, under the tFileInputDelimited component.
|
  |
In the Job Designer: To link the first Subjob to the tFileCopy component, right-click tFileInputDelimited and select Trigger > OnSubjobOk from the menu. Click tFileCopy to draw the OnSubjobOk link. Double-click tFileCopy to show the corresponding Component view and define its Basic settings.
|
  |
In the Component view: To copy the temp.csv file containing the sorted data, click [...] next to the File Name field and select that file. To specify the destination folder, click [...] next to the Destination directory field and select the folder that contains the customer.csv source file. To replace the source file with the sorted file, check the Rename box and enter "customer.csv", including the quotes. To delete the temporary file, check the Remove source file box.
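The combined effect of these settings (copy, rename, remove the source) can be sketched in Python. This is a rough equivalent written for illustration, not the tFileCopy implementation; the function `copy_file` and its parameter names are hypothetical, and the demonstration runs in a throwaway temporary folder.

```python
import shutil
import tempfile
from pathlib import Path


def copy_file(source, destination_dir, rename_to=None, remove_source=False):
    """Copy `source` into `destination_dir`, optionally renaming it and deleting the original."""
    source = Path(source)
    target = Path(destination_dir) / (rename_to or source.name)
    shutil.copyfile(source, target)  # overwrites an existing target file
    if remove_source:
        source.unlink()
    return target


with tempfile.TemporaryDirectory() as folder:
    folder = Path(folder)
    (folder / "customer.csv").write_text("unsorted\n")
    (folder / "temp.csv").write_text("sorted\n")

    copy_file(folder / "temp.csv", folder, rename_to="customer.csv", remove_source=True)

    print((folder / "customer.csv").read_text().strip())  # sorted
    print((folder / "temp.csv").exists())                 # False
```

After the call, only customer.csv remains in the folder, and it holds the sorted content.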
|
  |
 | Run the Job
|
In the Job Designer: Press Ctrl+S to save the Job. Press F6 to run it. The Run view is displayed at the bottom of Talend Open Studio, and the console shows the progress of the Job execution.
 | Check the Statistics box in the Run view and run the Job again: this option shows how the Subjobs are orchestrated.
|
|
|
The howToSortFile Job is working! It comprises two Subjobs: sorting the data into a temporary file, and replacing the source file with that temporary file. Now you have to document it!
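The way the two Subjobs are chained can be sketched in Python: the second step starts only if the first one finishes without error, which is the behavior the OnSubjobOk link is used for in this tutorial. The function `run_job` and the stand-in subjobs below are hypothetical and only model the ordering, not the actual file operations.

```python
def run_job(subjobs):
    """Run subjobs in order and stop at the first failure, like an OnSubjobOk chain."""
    completed = []
    for name, subjob in subjobs:
        try:
            subjob()
        except Exception as error:
            print(f"{name} failed: {error}")
            break  # later subjobs are not started
        completed.append(name)
    return completed


def failing_sort():
    # Stand-in for a first subjob that cannot read its input.
    raise FileNotFoundError("customer.csv")


log = []
ok_run = run_job([
    ("Sorting data in a new file", lambda: log.append("sorted")),
    ("Replacing the source file", lambda: log.append("replaced")),
])
failed_run = run_job([
    ("Sorting data in a new file", failing_sort),
    ("Replacing the source file", lambda: log.append("replaced")),
])
print(ok_run)
print(failed_run)
```

In the second run the replacement step never executes, so a failed sort cannot overwrite the source file.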
  |
 | Document the Job
|
In the Job Designer: To document your Job, add a title to each Subjob. To do so, click in the blue area around the first Subjob, then click the Component view. Check the Show subjob title box and, in the Title field, enter the corresponding title: Sorting data in a new file. Give the second Subjob the title: Replacing the source file. Save the Job again.
|
This tutorial is finished. The Job is working and it's documented. It's your turn now!