#1 2014-12-24 12:24:36

hemant056
Member
1 post

hemant056 said:

tHBaseInput - Incremental data read from Hbase

Tags: [bug, context]

Problem Statement 1: We are working on a project in which we need to read only the incremental data from an HBase table (Cloudera 4.6). A background process runs 24x7 and continuously loads data into the HBase table.

Approach: We are using a tHBaseInput component to read the data from the HBase table, but we cannot find any filter that accepts a timestamp value so that the component reads only the data loaded since the last run.

I am not sure whether I am missing something in the component or whether this is a limitation of Talend's tHBaseInput. I am using Talend Big Data 5.4.2.
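
To make the requirement concrete, the bookkeeping behind an incremental read can be sketched independently of Talend. The snippet below is only an illustration in plain Java: the class `IncrementalWindow` and its methods are hypothetical names, not part of Talend or HBase. It assumes that cell timestamps are epoch milliseconds and that the end of each run's window is saved and reused as the start of the next one. In a hand-written HBase client, such a pair would typically be passed to `Scan.setTimeRange(minStamp, maxStamp)`, which treats the minimum as inclusive and the maximum as exclusive; whether tHBaseInput 5.4.2 exposes this is not established in this thread.

```java
// Hypothetical helper for illustration only; not a Talend or HBase class.
final class IncrementalWindow {
    final long minStamp; // inclusive lower bound, epoch milliseconds (assumed)
    final long maxStamp; // exclusive upper bound, epoch milliseconds (assumed)

    IncrementalWindow(long minStamp, long maxStamp) {
        if (minStamp < 0 || maxStamp < minStamp) {
            throw new IllegalArgumentException(
                "invalid window: [" + minStamp + ", " + maxStamp + ")");
        }
        this.minStamp = minStamp;
        this.maxStamp = maxStamp;
    }

    // First run: read everything written before 'now'.
    static IncrementalWindow firstRun(long now) {
        return new IncrementalWindow(0L, now);
    }

    // Later runs: start exactly where the previous window ended,
    // so consecutive windows neither overlap nor leave a gap.
    IncrementalWindow next(long now) {
        return new IncrementalWindow(maxStamp, now);
    }

    // True if a cell timestamp falls inside this window.
    boolean contains(long timestamp) {
        return timestamp >= minStamp && timestamp < maxStamp;
    }

    public static void main(String[] args) {
        IncrementalWindow first = IncrementalWindow.firstRun(System.currentTimeMillis());
        IncrementalWindow second = first.next(System.currentTimeMillis());
        System.out.println("first run:  [" + first.minStamp + ", " + first.maxStamp + ")");
        System.out.println("second run: [" + second.minStamp + ", " + second.maxStamp + ")");
    }
}
```

The job would need to persist `maxStamp` somewhere between runs (for example in a context variable or a small file); how best to do that in Talend is left open here.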

Problem Statement 2: By default, tHBaseInput uses a Scan to fetch data from the HBase table, and the cache size (scanner caching) of this Scan object is set to 1, which means the map task makes a call back to the region server for every record processed. Because of this, tHBaseInput takes a long time to read from the HBase table (30 minutes for 1 lakh, i.e. 100,000, records). We tried doing the same in Java by creating a new Scan object and setting the cache size to 1000, and we were able to read 1 lakh records in just 2 minutes.

Is there any property in tHBaseInput with which we can increase the default cache size for the scan?
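
The gap between the two timings is roughly what one would expect if every scanner fetch is a separate round trip to the region server. As a back-of-the-envelope sketch (the class and method names are illustrative, and the extra calls a real client makes to open and close the scanner are ignored), the number of fetches for 1 lakh rows can be estimated like this:

```java
// Illustrative estimate only; not taken from Talend or HBase source code.
final class ScanRoundTrips {
    // Approximate number of fetch round trips needed to read 'rows' rows
    // when each fetch returns up to 'caching' rows.
    static long estimate(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 100_000L; // 1 lakh
        System.out.println(estimate(rows, 1));    // prints 100000
        System.out.println(estimate(rows, 1000)); // prints 100
    }
}
```

In a hand-written HBase client the value is set with `Scan.setCaching(int)`, and the client-wide default comes from the `hbase.client.scanner.caching` configuration property. Whether tHBaseInput 5.4.2 passes such a property through is exactly the open question above.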


#2 2014-12-30 03:43:14

shong
Talend Team


shong said:

Re: tHBaseInput - Incremental data read from Hbase

Hi

1.) Is there a timestamp field in your table?
2.) Try adding the related property in the Advanced settings panel of the tHBaseInput component.

Best regards
Shong


Email:shong@talend.com


#3 2016-03-07 07:33:17

ptarwadi
Member
3 posts

ptarwadi said:

Re: tHBaseInput - Incremental data read from Hbase

Hi
hemant056, did you find a solution for this?
I am facing the same problem. My input data is in the form of epoch time. Urgent help needed.


#4 2016-03-14 06:49:49

ptarwadi
Member
3 posts

ptarwadi said:

Re: tHBaseInput - Incremental data read from Hbase

Hi Shong,
Can you please help me with this? I have a timestamp field in my input database, which is epoch time stored as byte[].
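
For reference, comparing such a field against the last run time first requires decoding the bytes. The sketch below is a plain-Java illustration under two assumptions that this thread does not confirm: the value is an 8-byte big-endian long (the layout HBase's `Bytes.toBytes(long)` produces), and the unit (seconds or milliseconds) matches the value it is compared with. The class `EpochBytes` and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not a Talend or HBase class.
final class EpochBytes {
    // Decodes an 8-byte big-endian value into a long (ByteBuffer is big-endian by default).
    static long toEpoch(byte[] raw) {
        if (raw == null || raw.length != Long.BYTES) {
            throw new IllegalArgumentException("expected exactly 8 bytes");
        }
        return ByteBuffer.wrap(raw).getLong();
    }

    // Encodes a long into the same 8-byte big-endian layout.
    static byte[] fromEpoch(long epoch) {
        return ByteBuffer.allocate(Long.BYTES).putLong(epoch).array();
    }

    // True if the stored value is strictly later than the last run time (same unit assumed).
    static boolean isNewerThan(byte[] raw, long lastRun) {
        return toEpoch(raw) > lastRun;
    }

    public static void main(String[] args) {
        byte[] stored = fromEpoch(1457938189000L);
        System.out.println(toEpoch(stored));
        System.out.println(isNewerThan(stored, 1457938188000L));
    }
}
```

If the field was instead written as a string of digits or as a 4-byte int, the decoding step would differ accordingly.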


#5 2017-04-07 18:14:16

srkalakonda
Member
8 posts

srkalakonda said:

Re: tHBaseInput - Incremental data read from Hbase

Hi
I have a requirement similar to the one above. Can I read the data from an HBase table based on regions?
