#1 2014-08-28 06:23:16

sk_19
Member
21 posts

sk_19 said:

Data sets to be tested using Talend Big Data Sandbox

Hi! Is there any way to get new data sets (other than those already provided as examples) to test with the Talend Big Data Sandbox? Where can I get them? Can I only upload .TSV files to the Sandbox, or can files with other extensions also be used?
Please help.
Regards,
Sumbul Khan

Last edited by sk_19 (2014-08-28 06:29:29)


#2 2014-08-28 19:50:42

mbalkenende
Talend Team


mbalkenende said:

Re: Data sets to be tested using Talend Big Data Sandbox

You can FTP any dataset you would like to the Sandbox. There are no limits. To FTP a file, you can use the users/passwords found in the cookbook. Please keep in mind that the Sandbox is a single-node Hadoop virtual machine, so it will not be representative of the performance you can gain using Hadoop. For that you would need to create a much larger cluster consisting of 3 or more servers. Hortonworks, Cloudera and MapR would all recommend the same, if not even more nodes.
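As a rough illustration of the FTP step, here is a minimal Python sketch using the standard-library ftplib module. The host name, credentials, file name and remote directory in the usage note are placeholders, not values from the cookbook; substitute the ones listed there. The helper names upload_dataset and upload_to_sandbox are hypothetical.

```python
from ftplib import FTP
from pathlib import Path


def upload_dataset(ftp, local_path, remote_dir="."):
    """Upload one local file into remote_dir over an already-open FTP session.

    Returns the file name used on the remote side.
    """
    local_path = Path(local_path)
    ftp.cwd(remote_dir)  # change to the target directory on the server
    with local_path.open("rb") as fh:
        # Binary mode keeps compressed and non-text files intact.
        ftp.storbinary(f"STOR {local_path.name}", fh)
    return local_path.name


def upload_to_sandbox(host, user, password, local_path, remote_dir="."):
    """Open a connection, log in, upload one file, then close the session.

    host, user and password are assumptions here; take the real values
    from the Sandbox cookbook.
    """
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return upload_dataset(ftp, local_path, remote_dir)
```

A call would look something like upload_to_sandbox("sandbox-host", "some_user", "some_password", "my_data.json", "/tmp"), where every argument is a placeholder. The transfer is done in binary mode, so the same function should work for any file type.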

Also, you can use any file format you would like. For example, the Twitter example uses JSON, and the Data Warehouse example works on compressed files. There is also an example that processes the Apache Web Log format.
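To give a feel for what a web log record contains before you load your own files, here is a small Python sketch that splits one line into fields. It assumes the Apache Common Log Format; the Sandbox example may use a different variant, and the function name parse_log_line and the sample line are only illustrations.

```python
import re

# Fields of the Apache Common Log Format:
# host ident user [time] "request" status size
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Apache writes "-" when no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
record = parse_log_line(sample)
print(record["host"], record["status"], record["size"])
```

Lines that do not match return None, so malformed records in a test data set can be counted or skipped rather than stopping the run.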

