lijolawrance wrote:
Hi
use tHashInput/output components
I agree.
Sorry, I've also implemented this with a hash ... that's why I posted here (I'm familiar with the DataStage hash).
I was confusing this with my other post, http://www.talendforge.org/forum/viewtopic.php?id=22948
where I'm asking whether it is possible to update the KEY of an "in memory" lookup.
Something like:
update lookup set KEY= X where KEY= Y
If you want, please reply there.
Many many thanks!
Andrea
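For readers wondering what "update lookup set KEY= X where KEY= Y" could look like on a plain in-memory hash, here is a minimal Java sketch. It is not a Talend component or API; the class name `RekeyLookup` and the method `rekey` are made up for illustration, and the lookup is assumed to be an ordinary `java.util.Map`.

```java
import java.util.HashMap;
import java.util.Map;

public class RekeyLookup {
    // Moves the value stored under oldKey to newKey.
    // Returns false (and changes nothing) if oldKey is not in the lookup.
    public static <K, V> boolean rekey(Map<K, V> lookup, K oldKey, K newKey) {
        if (!lookup.containsKey(oldKey)) {
            return false;
        }
        V value = lookup.remove(oldKey);
        lookup.put(newKey, value); // overwrites any entry already stored under newKey
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = new HashMap<>();
        lookup.put("Y", "row for Y");
        rekey(lookup, "Y", "X"); // update lookup set KEY = X where KEY = Y
        System.out.println(lookup);
    }
}
```

A hash map cannot change a key in place, so the sketch removes the entry and puts it back under the new key.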
Hi
use tHashInput/output components
janhess wrote:
use a database table?
Yes, I did it this way.
I'd like to know whether there is a way to do this "in memory" ....
Could arrays help?
use a database table?
Hi.
The DataStage hash is used not only for lookups, but also to implement other techniques.
For example, writing to and reading from the same hash lets you know which records you have inserted into the hash up to the previous record. This technique helps in many cases.
It works like a DB insert that commits after every record.
I need to understand how to implement this technique with Talend.
Many thanks,
Andrea
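The "write and read the same hash" technique described above can be sketched in plain Java as follows. This is only an illustration of the idea, not a Talend or DataStage implementation; the class `SeenSoFar` and the method `flagRepeats` are hypothetical names. Each row checks the hash and then writes to it before the next row is read, so every row sees everything inserted up to the previous record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SeenSoFar {
    // For each key, in input order, reports whether the same key
    // already occurred in an earlier row.
    public static List<Boolean> flagRepeats(List<String> keys) {
        Set<String> seen = new HashSet<>();
        List<Boolean> flags = new ArrayList<>();
        for (String key : keys) {
            // add() returns false when the key is already in the set,
            // so the read and the write happen in one step per row
            boolean firstTime = seen.add(key);
            flags.add(!firstTime);
        }
        return flags;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "a");
        System.out.println(flagRepeats(keys));
    }
}
```

In a Talend job, similar logic could presumably live in a tJavaRow with the set kept in globalMap; that placement is an assumption and has not been tested here.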
amaumont wrote:
- you can't reuse already stored data in memory for the moment, this is a planned feature.
- you can't reuse already stored on disk data for the moment like DataStage, this is a planned feature.
Have those features been implemented, or are they still planned?
Thanks,
Nicolas
Thanks. I will try it once I have successfully ported from 2.1.4 to 2.3.3. I'm running into some problems with my custom components.
Writing custom hashing and mapping processing is a lot of trouble and tough to maintain...
tlittle,
Yes, you can try 2.4!
You can have both 2.3.3 and 2.4 on the same computer.
Install TOS 2.4 in another folder and use import project.
Regards,
Hi,
I am currently using TOS 2.3.3. I am wondering if TOS 2.4 can help me solve my issue:
My hash lookup files are a few GB in size (more than the physically available memory). Would it be better to move over to TOS 2.4 and use the "store on disk" option?
Previously, I tried to use a DB lookup table, but the performance was horribly slow (I can't remember exactly how slow, but it was a few orders of magnitude slower than using in-memory hashes). I resorted to splitting the hash by some predetermined categories, which helped to reduce memory consumption.
Please advise if I should move over to TOS 2.4
Thank you
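The workaround mentioned above, splitting the hash by predetermined categories, might look roughly like this in plain Java. It is a sketch under assumptions: the row layout (`category`, `key`, `value`), the class `PartitionedLookup`, and the method `loadPartition` are all hypothetical, and a real job would stream the source from a file or database rather than hold it in a list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionedLookup {
    // Hypothetical lookup row: a category used for partitioning,
    // a join key, and the looked-up value.
    public static final class LookupRow {
        final String category;
        final String key;
        final String value;

        public LookupRow(String category, String key, String value) {
            this.category = category;
            this.key = key;
            this.value = value;
        }
    }

    // Builds the in-memory hash for one category only, so just that
    // slice of the lookup is held in the map at any one time.
    public static Map<String, String> loadPartition(List<LookupRow> source, String category) {
        Map<String, String> hash = new HashMap<>();
        for (LookupRow row : source) {
            if (row.category.equals(category)) {
                hash.put(row.key, row.value);
            }
        }
        return hash;
    }

    public static void main(String[] args) {
        List<LookupRow> source = new ArrayList<>();
        source.add(new LookupRow("EU", "k1", "Paris"));
        source.add(new LookupRow("US", "k1", "Boston"));
        source.add(new LookupRow("EU", "k2", "Rome"));

        // Process one category at a time; the previous hash can be
        // garbage-collected once the loop moves on.
        for (String category : new String[] {"EU", "US"}) {
            Map<String, String> hash = loadPartition(source, category);
            System.out.println(category + " -> " + hash.size() + " entries");
        }
    }
}
```

The main rows would have to be processed per category as well, so that each row is only ever joined against the partition it belongs to.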
Thank you both for answering. It was very helpful.
Hi dsg78,
I can summarize Talend's current behavior regarding lookups:
- By default, Talend loads lookup data into memory once (before the current subjob starts). Hashing is implicit, so you don't need a specific component before tMap or tJoin lookups.
- Since TOS 2.4, the tMap component can store lookup data (and temporary join data) on disk if you check the 'Store on disk' option on the lookup.
- you can't reuse already stored data in memory for the moment, this is a planned feature.
- you can't reuse already stored on disk data for the moment like DataStage, this is a planned feature.
I hope this information helps.
amaumont
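To illustrate the first point in the list above (lookup data loaded into memory once, then probed for every main row), here is a generic hash-join sketch in Java. It is not the code Talend generates for tMap or tJoin; the class `HashLookupJoin`, the method `join`, and the two-column row layout are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashLookupJoin {
    // Joins main rows {key, payload} against lookup rows {key, value}.
    // The lookup is loaded into a HashMap once, before any main row is read.
    // Main rows without a match are kept with "null" in the last column
    // (left outer join behavior).
    public static List<String> join(List<String[]> mainRows, List<String[]> lookupRows) {
        Map<String, String> lookup = new HashMap<>();
        for (String[] row : lookupRows) {
            lookup.put(row[0], row[1]);
        }

        List<String> out = new ArrayList<>();
        for (String[] row : mainRows) {
            String matched = lookup.get(row[0]); // one hash probe per main row
            out.add(row[0] + "," + row[1] + "," + matched);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> lookupRows = new ArrayList<>();
        lookupRows.add(new String[] {"1", "one"});

        List<String[]> mainRows = new ArrayList<>();
        mainRows.add(new String[] {"1", "a"});
        mainRows.add(new String[] {"2", "b"});

        for (String line : join(mainRows, lookupRows)) {
            System.out.println(line);
        }
    }
}
```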
Of course, Talend and Datastage are different products with a different philosophy.
Talend doesn't need you to manually create hash files for lookup operations.
You can still do it if you want to, but this is simply not required.
HTH,
Hi all,
I previously worked with DataStage and am just starting with Talend. I'm not quite sure how to do certain things I did with DataStage. For example, I created hash files for lookups. All these hash files were created in separate jobs and could be run in parallel in the job sequence. In the main job I could use them again as lookups.
In Talend, I used the data-source (Informix database) directly for the lookup, since there are no real hash-files (I think). Is this the best way to do it?
I hope I explained my problem well enough for someone to be able to help me.
Thanks,
Heidi