You are not logged in.
I've got an issue when inserting the records. My scenario is that I'd like to monitor a table call A1 which contains the company id, company name, contact and other information. And then check if the that company id is in B1 or not. (just use that column to check) If it is not in B1, in other words B1 doesn't contain this company info, insert that company name and other information getting from A1 to B1. If it is already in B1, then just update some columns in B1.
Here's my job design:
tOracleCDC - A1(Main) - tMap - update B1(Main Order:2) - tOracleOutput
- insert B1(Main Order:1) - tOracleOutput
(in tMap I'm joinning A1 and B1 by using the company id and didn't check inner join since B1 sometimes will contain no record;also, there are 2 outputs in tMap. if id==null, the primary key for B1, then insert B1; if id!=null then update B1)
But the problem is that if I have 2 records with same company name executed in one job session, it seems that the first insertion is not captured so that it will insert twice instead of updating the second so that I got 2 duplicate records. In other words, in tMap it will still use the old information to compare the expression(id is null or not), not the one(new id) just inserted.
It will update if I stop and restart the job again, which is not I want.
How can I make the insertion works well?
One of the reason that I can think of is that the B1 Lookup table will executed first and it won't get updated after the first insertion. Then it will insert twice. How should I change the job and make it work in the right way?
You need to change the lookup model in tMap to "Reload at each row", using the company id as the globalMap key to include in the tOracleInput component. Read Scenario 6 in the tMap help for a detailed example.
BTW, you can use an inner join. Just change the first output table's settings (for the insert) to "Catch inner join rejects". Your way is fine, though.
The same principles should apply. I should have also said you need to ensure that each row being inserted is committed so in tOracleOutput uncheck "Use Batch Size" and (if you're not using an existing connection) change "Commit every" to 1. If you are using an existing connection, set that to auto-commit.
Thanks for the quick reply. So in my case I'd like to check the new changes, then in tMap I don't have to set globalMap key and don't have to include that in the query?
One more question I didn't see "Commit every" in tOracleOutput component. Do you mean to add one tOracleCommit component separately?
No, you should not change what you had before i.e. you should still use "Reload at each row" and include the company id in the lookup query's where clause. If you remove the where clause then the job will run much more slowly since the entire lookup table will be read into Talend for each main row, rather than just for that company id.
"Commit every" is only on tOracleOutput if you are not using an existing connection; otherwise committing is controlled at the connection level. In that case, just set the tOracleConnection to auto-commit. If that's not appropriate for other parts of your job that use the same connection, then you need to have tOracleCommit connected with a row from tOracleOutput.
But as long as the CDC rows are in the same session, it won't reload the latest data in the lookup table though it reloaded when processing each row. I still didn't see one more row when reloading even it's slower(I didn't use the GlobalMap key).
And I'm not using existing connection but I'm still not able to see commit every. Does the software edition matter?