You are not logged in.

Unanswered posts



Important! This site has been replaced. All content here is read-only. Please visit our brand-new community at https://community.talend.com/. We look forward to hearing from you there!



#1 2006-11-17 18:04:32

Richard
Guest

Richard said:

tMap with Mysql question, 2 scenarios

Hi,

Scenario 1:
=================================================================
I have a csv-File for input with the following content:

Item;Size;Description;Brand;Quantity;Price

Then I have to do a lookup in a mysql-table:

select item-id from table foo
where
(brand-from-csv-file) = field_brandnames and
country = 'XYZ' and
(Item-from-csv-file) = field_itemid

Then I have to create a csv-file with the following format:

Item;Size;Description;Brand;Quantity;Price;(field_itemid-from-mysql-lookup)

That means if the lookup was successful, the field (field_itemid-from-mysql-lookup) contains the same value as Item, but if the lookup was NOT successful the field has to remain empty.

The csv file normally has 200-400 lines, the mysql table for lookup has 79 columns and about 700000 records. The fields I am looking for are not indexed and will not in the future.

How do I have to set up tMap?
=================================================================

Scenario 2:
=================================================================
I have a csv-File for input with the following content:

Date;Size;Description;Brand;Price

The field 'Description' has a format like 'xxx xx-xx/xxx xxxxxx' where x represents a ascii char. I need to extract the second 'block' of chars (xx-xx/xxx). Then I have to delete the - and the / char. At last I have to add a leading 0 (zero). The result (0xxxxxxx) is now used for my query.

Then I have to do a lookup in a mysql-table:

select item-id from table foo
where
(brand-from-csv-file) = field_brandnames and
country = 'XYZ' and
(0xxxxxxx-result-from-csv-file) = field_itemid

Then I have to create a csv-file with the following format:

Size;Description;Brand;Quantity;Price;(field_itemid-from-mysql-lookup)

The column 'Quantity' must be set to 0 (zero) for all lines.

If the lookup was successful, the field (field_itemid-from-mysql-lookup) contains the same value as Item, but if the lookup was NOT successful the field has to remain empty.

The csv file normally has 200-400 lines, the mysql table for lookup has 79 columns and about 700000 records. The fields I am looking for are not indexed and will not in the future.

How do I have to set up tMap?
=================================================================

I hpe you can help.

#2 2006-11-17 23:54:38

plegall
Talend Team


plegall said:

Re: tMap with Mysql question, 2 scenarios

Richard wrote:

Scenario 1: [...]

You need a tMap and 2 input components: a tFileInputDelimited (or tFileInputCSV if you really have CSV file format and not simple delimited) as main row and a tDBInput as lookup row. You also one output component: tFileOutputDelimited.

In your tDBInput, the query must be:

'
SELECT field_itemid, field_brandnames, item-id
  FROM foo
  WHERE country = \'XYZ\'
'

The schema of your tDBInput must match the 3 selected columns, call them "field_itemid", "field_brandnames" and "item_id". The two first columns must be set as schema keys.

The join will be made in the tMap, as you can see on screenshots.

The number of columns in table "foo" is not important as long as you don't put all of them in the lookup. The number of line in table "foo" can be a problem if it's really too large. You said you had 700,000 records, but as you filter on the country, I suppose it's less (twice at least).

Offline

#3 2006-11-18 00:23:21

plegall
Talend Team


plegall said:

Re: tMap with Mysql question, 2 scenarios

Richard wrote:

Scenario 2 [...]

Same principle as Scenario1, the trick is to transform the description from "abc de-fg/hij klmnop" to "defghij". I propose 2 solutions, the first one is to write this piece of funny complicated Perl code directly in the tMap, in the join expression for the "Description":

(map {s{[-/]}{}g; $_} ((split /\s+/, $row1[description])[1]))[0]

or you can (you should, to tell the truth) create a dedicated function in the project routine Perl module:

# "abc de-fg/hij klmnop" as input
# "defghij" as output
sub extractKeyFromDesc {
    my ($desc) = @_;

    # extract the second token of the string splitted by spaces
    # "de-fg/hij" is now in $middle
    my $middle = (split /\s+/, $s)[1];

    # replace all "-" and "/" in the string
    $middle =~ s{[-/]}{}g;

    return $middle;
}

Don't forget to add the new function extractKeyFromDesc in the @EXPORT array at the beginning of the file. Then in the expression for the join on description field, write:

extractKeyFromDesc($row1[description])

Less fun, but far more readable and maintainable, isn't it?

Offline

Board footer

Talend Contributor Agreement - Talend Website Privacy Policy