

Important! This site has been replaced. All content here is read-only. Please visit our brand-new community at https://community.talend.com/. We look forward to hearing from you there!



#1 2014-07-31 10:03:44

kleinmat
Member
130 posts

kleinmat said:

Aggregation manager

Hi everyone,

For the past few years I have been working with mass-data applications in business intelligence, customer scoring and the like. All these applications have one thing in common: they must aggregate lots of data, and they must do it fast.

Typical scenario:
A customer database consists of 200 tables that hold 10 million to 10 billion rows each, so we are talking about several TB of data.
Every day, 500-1000 KPIs must be calculated for each customer, such as "number of active contracts" or "average billing amount", but also things like "product number with the highest rank" (which requires consulting a lookup ranking table). All in all, it is a collection of aggregation functions.
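To make the three example KPIs concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (contract, billing, product_rank), the column names and the sample rows are all hypothetical, and the sketch assumes that rank 1 is the "highest" rank; a real schema would differ.

```python
import sqlite3

# Hypothetical miniature schema; all names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract (customer_id INTEGER, product_no TEXT, active INTEGER);
CREATE TABLE billing (customer_id INTEGER, amount REAL);
CREATE TABLE product_rank (product_no TEXT, rank INTEGER);
INSERT INTO contract VALUES (1, 'P1', 1), (1, 'P2', 1), (1, 'P3', 0), (2, 'P2', 1);
INSERT INTO billing VALUES (1, 10.0), (1, 30.0), (2, 50.0);
INSERT INTO product_rank VALUES ('P1', 3), ('P2', 1), ('P3', 2);
""")

# One row per customer; each KPI is its own correlated subquery so that
# the one-to-many tables do not multiply each other's rows.
kpi_sql = """
SELECT c.customer_id,
       (SELECT COUNT(*) FROM contract x
         WHERE x.customer_id = c.customer_id AND x.active = 1) AS active_contracts,
       (SELECT AVG(b.amount) FROM billing b
         WHERE b.customer_id = c.customer_id) AS avg_billing_amount,
       -- Assumption: rank 1 is the best rank, so sort ascending.
       (SELECT x.product_no FROM contract x
          JOIN product_rank r ON r.product_no = x.product_no
         WHERE x.customer_id = c.customer_id
         ORDER BY r.rank ASC LIMIT 1) AS top_ranked_product
FROM (SELECT DISTINCT customer_id FROM contract) c
ORDER BY c.customer_id
"""
rows = conn.execute(kpi_sql).fetchall()
for row in rows:
    print(row)
```

With hundreds of KPIs, a query written in this style grows by one expression per KPI, which is how such scripts reach the lengths described below.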

For lack of alternatives, these aggregates are calculated by several SQL queries that easily span 20-40 pages each (if printed). So you can imagine that maintenance is a nightmare. The problem is that, due to fast-changing business demands, these SQL scripts are always under construction: new aggregates, changes to how existing KPIs are calculated, deletion of obsolete aggregations... you name it.

So I would love to see a component that lets me drag in all the tables needed for a given aggregation (often 40-50 tables are used in one giant join), graphically describe their relationships (which results in a join), and then describe all required aggregations in tabular form: name, datatype, calculation formula (which source fields are used and how), description etc.
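The idea of a tabular aggregation definition that gets turned into SQL could be sketched roughly as follows. This is not an existing Talend component or API; the Join and Aggregate classes and the build_aggregation_sql function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Join:
    """One relationship between the base table and another table (hypothetical)."""
    table: str
    on: str

@dataclass
class Aggregate:
    """One row of the tabular KPI definition (hypothetical)."""
    name: str
    datatype: str
    formula: str
    description: str = ""

def build_aggregation_sql(base_table, group_by, joins, aggregates):
    """Render a single SELECT ... GROUP BY statement from the definitions."""
    select_items = [group_by] + [f"{a.formula} AS {a.name}" for a in aggregates]
    lines = ["SELECT " + ",\n       ".join(select_items), f"FROM {base_table}"]
    lines += [f"LEFT JOIN {j.table} ON {j.on}" for j in joins]
    lines.append(f"GROUP BY {group_by}")
    return "\n".join(lines)

joins = [Join("contract", "contract.customer_id = customer.customer_id")]
aggregates = [
    Aggregate("active_contracts", "INTEGER", "SUM(contract.active)",
              "Number of active contracts per customer"),
]
sql = build_aggregation_sql("customer", "customer.customer_id", joins, aggregates)
print(sql)
```

Adding, changing or deleting a KPI then means editing one Aggregate entry instead of a long script. Note that this naive generator joins everything into one statement, so joining several one-to-many tables at once would multiply rows and distort sums and counts; a real generator would have to pre-aggregate each table or split the work into several statements.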

With a mouse click, I would also like the component to create an output table that can hold exactly the results of such an aggregation run.
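Since each aggregation already carries a name and a datatype, the output table can in principle be derived from the same definitions. The following is a minimal sketch under that assumption; build_output_table_ddl, the table name customer_kpi and the column names are hypothetical, and SQLite is used only to check that the generated statement is valid.

```python
import sqlite3

def build_output_table_ddl(table_name, key_column, key_type, aggregates):
    """Render CREATE TABLE with one key column plus one column per aggregate.

    aggregates is a list of (name, datatype) pairs taken from the KPI definitions.
    """
    columns = [f"{key_column} {key_type} PRIMARY KEY"]
    columns += [f"{name} {datatype}" for name, datatype in aggregates]
    return f"CREATE TABLE {table_name} (\n    " + ",\n    ".join(columns) + "\n)"

ddl = build_output_table_ddl(
    "customer_kpi", "customer_id", "INTEGER",
    [("active_contracts", "INTEGER"), ("avg_billing_amount", "REAL")],
)
print(ddl)

# Create the table in an in-memory database and read back its column names.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(customer_kpi)")]
print(column_names)
```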

The component should then generate SQL code that runs very fast, so optimizer hints and similar tuning should be part of the deal out of the box.

Thanks
Matt


#2 2014-07-31 12:41:54

sanvaibhav
Member
1719 posts

sanvaibhav said:

Re: Aggregation manager

Hi Matt,

Based on what you describe in the paragraph below:
>>
So I would love to see a component which allows for me to drag into it all tables that need to be used in a given aggregation (often 40-50 tables are used in a giant join), graphically describe their relationship (which results in a join) and then have a tabular way of describing all needed aggregations with name, datatype, calculation formula (which source fields are used and how are they used), description etc.

I think this is achievable without going through the complex and difficult cycle of component development, by using a single tMap and a tAggregate component ahead of it.

Let us know if you run into any difficulties with tMap or with implementing your business logic.

Furthermore, the aggregates are better calculated in views than with conventional SQL queries. There is less overhead in using aggregate views than in using Talend components or packages for aggregation.
You could even consider federated tables, which give you the aggregated result directly from different tables, databases and servers.
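As a rough illustration of an aggregate view, the sketch below defines one in SQLite through Python's sqlite3 module. The billing table, the view name v_customer_billing and the sample rows are hypothetical. A plain view like this one is recomputed on every query; whether stored (materialized) views are available depends on the database in use.

```python
import sqlite3

# Hypothetical table and view; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE billing (customer_id INTEGER, amount REAL);
INSERT INTO billing VALUES (1, 10.0), (1, 30.0), (2, 50.0);

CREATE VIEW v_customer_billing AS
SELECT customer_id,
       COUNT(*)    AS invoice_count,
       AVG(amount) AS avg_billing_amount
FROM billing
GROUP BY customer_id;
""")

# Consumers read the view like a table; the aggregation logic lives in one place.
rows = conn.execute(
    "SELECT * FROM v_customer_billing ORDER BY customer_id"
).fetchall()
for row in rows:
    print(row)
```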

Vaibhav

Last edited by sanvaibhav (2014-07-31 12:43:39)


Talend Certified Consultant

