I'm a new Talend user and I'm trying to do an update on a Hive table. My problem is that I have some data that changed, and I want to create a new row in the Hive table for the new data and update the end timestamp of the old row. It's something like:
ID  CITY   TIMESTAMP            TIMESTAMP_END
2   NY     2009-10-02 16:52:30  2009-10-05 17:52:30
2   MIAMI  2009-10-05 17:52:30  NULL
In the above case, the CITY field changed, so I updated TIMESTAMP_END on the old row to record when the data changed. This lets me keep track of the state of the data over a period of time. The new data was inserted without a TIMESTAMP_END, so I know it's the most recent state of the data for ID 2.
I want to check in Talend whether the new data I receive is already in the table. If it is, I do this update and insert the new data; otherwise I just insert the data.
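To make the logic concrete, here is a minimal Python sketch of what I'm after. It works on an in-memory list rather than on Hive, it is not Talend code, and the names (apply_change, timestamp_end and so on) are made up for illustration only.

```python
from datetime import datetime


def apply_change(rows, new_id, new_city, changed_at):
    """Return a new row list: close the open row for new_id (if one exists),
    then append the new state as the open row."""
    result = []
    for row in rows:
        if row["id"] == new_id and row["timestamp_end"] is None:
            # The ID is already there: set the end timestamp on the old row.
            row = {**row, "timestamp_end": changed_at}
        result.append(row)
    # In both cases the new data is inserted without an end timestamp.
    result.append({
        "id": new_id,
        "city": new_city,
        "timestamp": changed_at,
        "timestamp_end": None,
    })
    return result


# Hypothetical sample data matching the table above, before the change.
history = [{
    "id": 2,
    "city": "NY",
    "timestamp": datetime(2009, 10, 2, 16, 52, 30),
    "timestamp_end": None,
}]
history = apply_change(history, 2, "MIAMI", datetime(2009, 10, 5, 17, 52, 30))
```

After the call, history holds the two rows shown in the table: the NY row with its end timestamp set and the MIAMI row still open.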
Can anyone please help me? I just can't find a way to do this check and update the Hive table.
Have you already centralized the connection information to the Hive database in your studio? Could you please try to retrieve the table schema of interest from the connected Hive database to see if the new data you receive is already there?
What we can do is to make sure that Talend will be your best choice!