Talend Exchange is the place where the Talend community can share items related to Talend open source products, such as Data Integration, Data Quality and Master Data Management. Contribution is open to any user; no specific validation is needed. As soon as you have a forum account, you automatically get a Talend Exchange account.


Statistics

  • 500 extensions
  • 820 revisions
  • 223 contributors
  • 109969 downloads
 

Version Author Released on Rating Downloads
Java

tHTTPTableInput

4 dthonon 2009-05-29
253

This Component extracts HTML-Tables from a given URL.

The 'Syntax for Table' means:
T=3 / Third(fourth) Table inside the page
C=0,0 / The Column at the position 0,0 in the third(fourth) table
T=1 / The first(second) table inside Cell 0,0 from the third(fourth) [optional]

Example 1: "T=3;C=0,0;T=1"
Example 2: "T=2;C=0,0;" with the change of revision 3
Example 3: "T=2;" with the change of revision 3
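
The path syntax above can be illustrated with a small parser. This is only a sketch of how such a string might be split into steps, not the component's own code; the function name parse_table_path is hypothetical, and the sketch makes no assumption about whether the indices are zero-based or one-based.

```python
def parse_table_path(path):
    """Split a path such as "T=3;C=0,0;T=1" into (kind, indices) steps."""
    steps = []
    for part in path.split(";"):
        part = part.strip()
        if not part:
            # Tolerate a trailing ";" as in "T=2;".
            continue
        kind, _, value = part.partition("=")
        if kind not in ("T", "C"):
            raise ValueError("unknown step: %r" % part)
        indices = tuple(int(v) for v in value.split(","))
        # A table step carries one index, a cell step carries two.
        expected = 1 if kind == "T" else 2
        if len(indices) != expected:
            raise ValueError("wrong number of indices in %r" % part)
        steps.append((kind, indices))
    return steps


if __name__ == "__main__":
    print(parse_table_path("T=3;C=0,0;T=1"))
```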

Routine

Securities Validation

1 hugo 2009-05-25
204

Validate Fixed Income and Equity Securities.
This routine calculates and compares the check digits for the following security identifiers:

ISIN
SEDOL
CUSIP
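
For readers unfamiliar with these identifiers, the check digit rules can be sketched as follows. This is an illustration of the publicly documented algorithms, not the code of the routine itself, and all function names are hypothetical. The CUSIP sketch handles digits and letters only and leaves out the special characters *, @ and #.

```python
import string

# Character values shared by all three schemes: 0-9 map to 0..9, A-Z to 10..35.
_VALUES = {ch: i for i, ch in enumerate(string.digits + string.ascii_uppercase)}


def isin_is_valid(isin):
    """ISIN: expand letters to numbers, then apply the Luhn test to all digits."""
    if len(isin) != 12 or any(ch not in _VALUES for ch in isin):
        return False
    digits = "".join(str(_VALUES[ch]) for ch in isin)
    total = 0
    for i, d in enumerate(reversed(digits)):
        n = int(d)
        if i % 2 == 1:  # double every second digit, counting from the right
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0


def sedol_check_digit(body):
    """SEDOL: weighted sum of the first six characters."""
    weights = (1, 3, 1, 7, 3, 9)
    total = sum(w * _VALUES[ch] for w, ch in zip(weights, body))
    return (10 - total % 10) % 10


def sedol_is_valid(sedol):
    if len(sedol) != 7 or any(ch not in _VALUES for ch in sedol):
        return False
    return sedol[6] in string.digits and sedol_check_digit(sedol[:6]) == int(sedol[6])


def cusip_check_digit(body):
    """CUSIP: double every second character value and sum the decimal digits."""
    total = 0
    for i, ch in enumerate(body):
        v = _VALUES[ch]
        if i % 2 == 1:
            v *= 2
        total += v // 10 + v % 10
    return (10 - total % 10) % 10


def cusip_is_valid(cusip):
    if len(cusip) != 9 or any(ch not in _VALUES for ch in cusip):
        return False
    return cusip[8] in string.digits and cusip_check_digit(cusip[:8]) == int(cusip[8])
```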


Java

tmOracleOutput

1.2 madamovic 2009-05-14
117

This component adds "Delete obsolete" functionality to the existing tOracleOutput component.

The "Delete Obsolete" option is enabled for all variants of the "Insert and Update" data actions. If Delete Obsolete is turned on, all records in the target table that were not inserted or updated from the input row set are deleted. In other words, records in the target table that do not exist in the source set (matched on predefined IDs) are deleted. The result is a full synchronization between the source and the target record sets.

Parameters (in addition to the existing tOracleOutput parameters):
- Delete obsolete records – enable deleting the obsolete records.
- Where clause condition for delete obsolete - optional where clause for deleted records. Only records that satisfy the where clause and are not in the input set will be deleted.

The component uses HashSet to store the IDs of the processed records. The limitation is that this is not efficient for large data sets. The component can be further enhanced to use disk implementation of HashSet, or to store the IDs of processed records in a database table.
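
The "Delete obsolete" idea can be sketched in a few lines. This is a simplified in-memory illustration, not the component's generated code: the target table is modeled as a dict keyed by ID, a Python set stands in for the HashSet, and the names sync_with_delete_obsolete and where are hypothetical.

```python
def sync_with_delete_obsolete(target, input_rows, where=None):
    """Upsert input_rows into target, then delete rows that were not seen.

    target maps an ID to a row (a dict with an "id" key). where is an
    optional predicate that restricts which unseen rows may be deleted.
    Returns the list of deleted IDs.
    """
    processed_ids = set()  # plays the role of the HashSet of processed IDs
    for row in input_rows:
        target[row["id"]] = row  # insert or update
        processed_ids.add(row["id"])

    obsolete = [
        key
        for key, row in target.items()
        if key not in processed_ids and (where is None or where(row))
    ]
    for key in obsolete:
        del target[key]
    return obsolete
```

Because every processed ID is held in memory, the sketch shares the limitation noted above for large data sets.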

Java

tmMysqlOutput

1.2 madamovic 2009-05-13
170

This component adds "Delete obsolete" functionality to the existing tMysqlOutput component.

The "Delete Obsolete" option is enabled for all variants of the "Insert and Update" data actions. If Delete Obsolete is turned on, all records in the target table that were not inserted or updated from the input row set are deleted. In other words, records in the target table that do not exist in the source set (matched on predefined IDs) are deleted. The result is a full synchronization between the source and the target record sets.

Parameters (in addition to the existing tMysqlOutput parameters):
- Delete obsolete records – enable deleting the obsolete records.
- Where clause condition for delete obsolete - optional where clause for deleted records. Only records that satisfy the where clause and are not in the input set will be deleted.

The component uses HashSet to store the IDs of the processed records. The limitation is that this is not efficient for large data sets. The component can be further enhanced to use disk implementation of HashSet, or to store the IDs of processed records in a database table.

Java

tSharepointFile

1.0 jjolley 2009-05-12
319

This component allows you to grab any file from a SharePoint server over HTTP.
It performs the necessary NTLM authentication. The component creates a temporary copy of the SharePoint file. The temporary file name is stored in tSharepointFile.FILE and can be used with the rest of Talend's components. The temporary file is deleted once the job has completed. (knowledgerelay.com)

Perl

tFileInputXbase

0.3 plegall 2009-05-12
352

Read DBase and FoxPro files with the XBase Perl module.

Perl

tOneToMany

3 plegall 2009-05-12
392

This component is a proof of concept: a row component that takes a data flow as input and creates several distinct data flows as output. Each output has its own schema, which you can set dynamically at design time.

This component needs at least trunk r20522 (it will be available in 3.1.0M1).

Job

CofigurableJobUsingSingleton

0.1 pravu 2009-05-11
891

Problem Definition
The ETL job must satisfy the following requirements.
1. Configuration values such as database credentials and the log file location and name must be kept in an XML file.
a. The name and location of the log file can be changed without modifying the ETL job.
b. By editing the configuration file, the user can change the database credentials for the source and target databases.
2. The ETL job must support both Windows and Unix-family operating systems.
3. The configuration file must be validated.
a. The log file must report whether the database credentials in the configuration file are correct. If the credentials are correct but the connection still fails, for example because the database is down, the ETL job must log that as well.
b. The console must report whether the log file path given in the configuration file is correct.
4. The configuration file must be passed on the command line, because more than one instance of the job is expected to run at the same time. Multiple target databases share the same structure, so multiple instances of the same job, each with its own configuration file, can migrate data from the source database when the targets need to be brought in sync at the same time. Values such as the target database name and IP address must differ across the configuration files.
5. The ETL job must check the configuration file name and location given on the command line. If they are wrong, the job must inform the user on the console and exit.
6. The configuration file must not be reloaded from disk each time an ETL subjob needs its values. It must be loaded only once, and its values kept in memory for use by all subjobs.
7. A log file must record the execution of the main job and its subjobs.
a. The start and end of each subjob and of the main job must be logged with status and time information.
b. Any record rejected during insertion must be logged with the date, time and an error message.
c. The number of records fetched from the source database and the number of records processed and inserted into the target database must be logged.
d. Each instance of the job must use a different log file, and the user must be advised of this. If instances running at the same time share one log file, its contents will be garbled.
e. The ETL job must create log files by date, by appending the date to the log file name given in the configuration file.
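
Requirement 6 (load the configuration once and share it) can be sketched as below. This is not the job's own implementation, which is a Talend job; it only illustrates the load-once idea with a cached loader. The function name load_config and the flat XML layout, with one child element per setting, are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from functools import lru_cache


@lru_cache(maxsize=None)
def load_config(path):
    """Parse the XML configuration file once per path and cache the result.

    Assumed layout (hypothetical): <config><db_host>...</db_host>...</config>
    Later calls with the same path return the cached dict without touching
    the disk again.
    """
    root = ET.parse(path).getroot()
    return {child.tag: (child.text or "").strip() for child in root}
```

Every subjob that calls load_config with the same path gets the same in-memory dictionary. A missing or malformed file raises an exception on the first call, which is where a check like the one in requirement 5 could report the problem and exit.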

Java

tFileOutputPDF

1.2 cahsohtoa 2009-05-11
3359

This is the first version of the component that allows you to export your data to a PDF file.
Please have a look at the advanced settings, where you will find many parameters to customize the result.
I hope it will be helpful.

Java

tRunJobRow

0.1 bcourtine 2009-05-08
426

This component runs another job, sends data rows to the subjob, and gets result rows back:

- input and output schemas of the subjob can be different (technically, the tRunJobRow component has only an output schema)
- the numbers of input and output rows can be different

To work properly, this component requires the tBufferCopyInput component.

Usage:
1) In the main job, data rows are sent to a tBufferOutput
2) In the subjob, data rows are read with a tBufferCopyInput. This component also cleans the global buffer for the next tBufferOutput
3) In the subjob, output data rows are sent to a tBufferOutput

See the screenshot for a real example.

Statistics

  • 139 extensions
  • 172 revisions
  • 23 contributors
  • 12667 downloads
 

Version Author Released on Rating Downloads
Regex

US Zipcode Validation

3.1.2 mhallam 2009-07-07
83

Matches only if 5 digits are present; does not match if there are fewer than 5 digits.
The ZIP code must consist of 5 digits.
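
The listing does not show the expression itself, so the following is only a sketch of a pattern with the described behavior. The names ZIP_PATTERN and is_us_zip are hypothetical, and rejecting inputs longer than 5 digits is an assumption.

```python
import re

# Exactly five ASCII digits; [0-9] is used because \d also matches
# non-ASCII digits in Python.
ZIP_PATTERN = re.compile(r"[0-9]{5}")


def is_us_zip(value):
    return ZIP_PATTERN.fullmatch(value) is not None
```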

Regex

Swedish personnummer with accepted foreigners

3.1.2 mhallam 2009-07-07
67

with a "P", "T", or "F" instead of the first of the four last numbers.

This code fixes the problem, but does not check the validity of the date, or the last number.

Regex

Swedish Personal Nr (Personnummer)

3.1.2 mhallam 2009-07-07
65

Simple regex for the Swedish personal number. It's in the form: YYMMDD-xxxx where xxxx is an arbitrary number from 0000-9999.

791231-1234
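
A pattern for the YYMMDD-xxxx form described above might look like the sketch below. The exact expression published in this item is not shown in the listing, so the names PERSONNUMMER_PATTERN and looks_like_personnummer are hypothetical. Like the description, the sketch checks only the shape and not the validity of the date.

```python
import re

# Six digits, a hyphen, then four digits.
PERSONNUMMER_PATTERN = re.compile(r"[0-9]{6}-[0-9]{4}")


def looks_like_personnummer(value):
    return PERSONNUMMER_PATTERN.fullmatch(value) is not None
```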

Regex

UK Vehicle Registration Plate Number Plate

3.1.2 mhallam 2009-07-07
52

Matches: AB12 RCY, CD07 TES, S33 GTT, Y999 FVB
Non-matches: ab12 rcy, CD07 TIS, S34 GTT, Z999 FVB

UK Vehicle Registration Plate / Number Plate format as specified by the DVLA. Accepts both "Prefix" and "New" style. Allows only valid DVLA number combinations, as not all are supported. The registration number must be written exactly as displayed on the car, so all letters must be in uppercase, with a space separating the two sets of characters.

Regex

Phone Brazil

3.1.2 mhallam 2009-07-07
56

Matches: 011 5555-1234, (011) 5555 1234, (11) 5555.1234, 1155551234
Non-matches: (011 5555-1234, (01) 5555 1234, (11) 0555.1234, (11) 5555 abcd

Matches Brazilian phone numbers, including the DDD (long-distance) code with or without a leading 0. Accepts the characters -, . and [space] as separators.

Regex

No special chart

3.1.2 mhallam 2009-07-07
32

QDE|||QDE#RF

Allows only 3 characters or numbers, without any special characters.

Regex

Mobile number of India

3.1.2 mhallam 2009-07-07
38

This expression is useful for checking Indian mobile numbers. It accepts various formats, such as numbers in local format or international numbers, with or without a hyphen (-) as separator.

Regex

International phone number

3.1.2 mhallam 2009-07-07
61

Matches most internationally formatted phone numbers.

Regex

International Passport

3.1.2 mhallam 2009-07-07
34

9 characters made up of a combination of numbers and/or letters. Where there are fewer than 9 characters, the value is padded out to the right with chevrons (

Regex

ISBN Checker

3.1.2 mhallam 2009-07-07
53

Expression to check for a valid ISBN number
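
A regular expression can check the shape of an ISBN, but the check digit needs arithmetic. The sketch below shows the standard ISBN-10 and ISBN-13 checksum rules. It is not the expression published in this item, and the function name isbn_is_valid is hypothetical.

```python
_DIGITS = set("0123456789")


def isbn_is_valid(isbn):
    """Validate the check digit of an ISBN-10 or ISBN-13.

    Hyphens and spaces are ignored.
    """
    s = isbn.replace("-", "").replace(" ", "").upper()
    if len(s) == 10:
        body, last = s[:9], s[9]
        if not set(body) <= _DIGITS or (last not in _DIGITS and last != "X"):
            return False
        values = [int(c) for c in body] + [10 if last == "X" else int(last)]
        # Weights run from 10 down to 1; the total must be divisible by 11.
        return sum((10 - i) * v for i, v in enumerate(values)) % 11 == 0
    if len(s) == 13 and set(s) <= _DIGITS:
        # Weights alternate 1, 3, 1, 3, ...; the total must be divisible by 10.
        return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(s)) % 10 == 0
    return False
```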

Statistics

  • 5 extensions
  • 7 revisions
  • 4 contributors
  • 3382 downloads
 

Version Author Released on Rating Downloads
Export

DStar

1.0 ctoum 2010-08-04
1096

D* Industries Demo Model

