#1 2008-12-23 16:08:25

LinuxChap
Member
1 post

LinuxChap said:

Parallel Processing - fixed or dynamic node clustered environment

I have been following this product for the past year but never had a chance to evaluate it thoroughly. One of the reasons I admire a product such as Talend is its use of Perl as a de facto language. To me, Perl is a battle-tested, front-line tool that can manipulate zillions of rows of fact tables in a warehouse environment. In the enterprise world, zillions is the name of the game, not mega anymore, especially if warehouse data has accumulated over the past 10 years.

I would like to see a Talend version that can do parallel processing in a node-clustered environment, where any file or component can distribute the load equally across nodes, in either a dynamic or a fixed setup, just as Teradata and Netezza databases do. Right now only two official ETL or EAI vendors use parallel processing for clustered environments. I don't want to name those two vendors; the rest are purely database-dependent architectures, not standalone like Talend.

#2 2009-01-05 19:00:04

plegall
Talend Team


plegall said:

Re: Parallel Processing - fixed or dynamic node clustered environment

Hello LinuxChap :-)

Thanks for your post. Maybe it's a good opportunity to summarize what currently exists in Talend in terms of parallelization and clustering.

Talend Open Studio, the free of charge version, provides parallelization inside a job with:

- (new in 2.1) Job view > Extra > Multi Thread execution (threads with Java, child processes with Perl), which runs in parallel the subjobs that have no trigger links between them, for example 2 subjobs (2 components are green in your job designer editor)

- (new in 2.4) iterate with parallelize option (the number of parallel executions can be dynamic with a context variable for example)
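
As a rough illustration of the "iterate with parallelize" idea above, here is a minimal plain-Java sketch. It is not code generated by Talend: the class name, the `processAll` method and the `nb_threads` system property (standing in for a context variable) are all assumptions made for the example. It only shows a number of parallel executions chosen at run time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not Talend-generated code.
class ParallelIterateSketch {

    // Runs one task per item on a pool whose size is decided at run time,
    // similar in spirit to setting the number of parallel executions
    // from a context variable.
    static List<String> processAll(List<String> items, int parallelExecutions) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelExecutions);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (String item : items) {
                futures.add(CompletableFuture.supplyAsync(() -> "processed " + item, pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> future : futures) {
                results.add(future.join()); // waits for this iteration; keeps input order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Assumed stand-in for a context variable holding the thread count.
        int parallelExecutions = Integer.parseInt(System.getProperty("nb_threads", "4"));
        List<String> files = List.of("file1.csv", "file2.csv", "file3.csv");
        System.out.println(processAll(files, parallelExecutions));
    }
}
```

Changing the property (for example `-Dnb_threads=8`) changes the degree of parallelism without touching the loop itself.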

Talend Integration Suite, the paid version, adds a higher level of parallelization inside a job:

- (new in 2.4) tParallelize component to orchestrate parallel execution so that you can run A and B in parallel and C once A and B are finished. It's parallelization at subjob level.

- (new in 3.0) parallel execution on database output components (concurrency managed by the database server). It's parallelization at data flow level.
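
The orchestration described above for tParallelize (run A and B in parallel, then C once both are finished) can be sketched in plain Java as follows. This is only an illustration of the pattern, not how the component is implemented; the class name and the "A done" / "B done" / "C done" placeholders are assumptions.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of "A and B in parallel, then C"; not Talend code.
class ParallelThenJoinSketch {

    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>(); // thread-safe event log

        // A and B are started independently and may run at the same time.
        CompletableFuture<Void> a = CompletableFuture.runAsync(() -> log.add("A done"));
        CompletableFuture<Void> b = CompletableFuture.runAsync(() -> log.add("B done"));

        // C is scheduled only after both A and B have completed.
        CompletableFuture.allOf(a, b)
                .thenRun(() -> log.add("C done"))
                .join(); // block until C has finished

        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The relative order of A and B in the log can vary between runs, but C is always last.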

Talend Integration Suite also provides clustering features:

- ability to start a job on a cluster of servers (we call them "virtual servers"), and the most available server at scheduled time is selected.
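
To make "the most available server is selected" concrete, here is a small Java sketch. The thread does not say how availability is measured, so the load value per server (0.0 = idle, 1.0 = saturated) is purely an assumption, as are the class, method and server names.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; the real selection criteria are not described in this thread.
class VirtualServerSketch {

    // Returns the name of the server with the lowest reported load,
    // or an empty Optional if no server is registered.
    static Optional<String> pickMostAvailable(Map<String, Double> loadByServer) {
        return loadByServer.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Assumed load figures sampled at the scheduled time.
        Map<String, Double> loads = Map.of("server1", 0.9, "server2", 0.2, "server3", 0.5);
        System.out.println(pickMostAvailable(loads).orElse("no server available"));
    }
}
```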


As I'm responsible for development related to parallel processing, I've studied the Map/Reduce algorithm (Google Labs has worked on this). This is the ultimate level of parallelization/clustering we would obviously like to reach. It's not planned for the near future (not for 3.1, I think), but it would be "nice to have".
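
For readers unfamiliar with Map/Reduce, the two phases can be illustrated with a word count in plain Java. This sketch runs inside a single JVM using parallel streams, so it shows the shape of the algorithm only; it is not a cluster implementation and does not use the Hadoop API. The class and method names are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical single-JVM illustration of the map and reduce phases.
class MapReduceSketch {

    static Map<String, Integer> wordCount(List<String> lines) {
        return lines.parallelStream()
                // Map phase: each line is split into words independently of the others.
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                .filter(word -> !word.isEmpty())
                // Reduce phase: each word counts as 1 and counts for the same word are summed.
                .collect(Collectors.toMap(word -> word, word -> 1, Integer::sum, TreeMap::new));
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("a b a", "b c")));
    }
}
```

On a real cluster the map tasks would run on different nodes and the partial counts would be shuffled to reducers; here the stream framework merges the partial maps in memory.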

#3 2012-05-20 08:04:21

amitkhatua
Member
51 posts

amitkhatua said:

Re: Parallel Processing - fixed or dynamic node clustered environment

Hi plegall,
Could you please tell me about those features in the TOS MDM 5.0.2 community version, or provide some useful links for finding that functionality in TOS 5.0.2? This is fairly urgent, as I need to present a solution to my client about Talend features.
Thanks,
Amit

#4 2012-05-20 14:19:13

ccarbone
Member
1223 posts

ccarbone said:

Re: Parallel Processing - fixed or dynamic node clustered environment

amitkhatua wrote:

Could you please tell me about those features in the TOS MDM 5.0.2 community version, or provide some useful links for finding that functionality in TOS 5.0.2?

Hi Amit,

Thanks for your post. You replied to a post about parallelization that was written more than 3 years ago. Over those 3+ years, we have enhanced our products a lot in various areas such as clustering.

If you are looking for parallelization on multiple nodes, I suggest you look into TOS for BigData (Apache license). We have leveraged Hadoop Map/Reduce since 4.0 (2 years ago).

amitkhatua wrote:

This is fairly urgent, as I need to present a solution to my client about Talend features.
Thanks,
Amit

If you need an urgent answer for your customer, you can also use Talend consulting.

Regards,
Cedric Carbone
Talend CTO
https://twitter.com/carbone

#5 2014-05-21 07:40:14

shanshan
Member
6 posts

shanshan said:

Re: Parallel Processing - fixed or dynamic node clustered environment

Hi, I'm looking for a feature in the Talend Integration Suite that enables a cluster of servers, where the most available server at the scheduled time is selected.

Can you please send relevant links about it? In my new job I was asked to add one more server and make it work in parallel with the second one, and I'm looking for an existing Talend solution that does this.

Thanks a lot,

Shani Moyal

#6 2014-05-21 08:13:41

xdshi
Talend Team


xdshi said:

Re: Parallel Processing - fixed or dynamic node clustered environment

Hi shanshan,

Are you looking for the cluster setup of Talend Administration Center, described in the document TalendHelpCenter:Working+in+cluster+mode?

Best regards
Sabrina


What we can do is to make sure that Talend will be your best choice!
