#1 2008-10-06 11:24:06

scorreia
Talend Team


scorreia said:

Need a tutorial on Open Profiler?

Tags: [doc, top]

First, check the documentation section: http://www.talend.com/resources/documentation.php
For version 1.0.0, we only had a "Getting Started" guide; with version 1.1.0, we now have a complete user guide.

Official tutorials are still missing, but Dylan Jones at Data Quality Pro has done a great job of showing how to use Talend Open Profiler on a real data sample. Check it out at http://www.dataqualitypro.com/data-qual … al-in.html

Dylan's tutorial comes with a data sample and explains how to profile it. It also gives useful tips on how to interpret the results. TOP easily answers questions such as: Do you need to know whether a column could serve as a primary key? Do you need to identify your duplicate data? Do you want to see unexpected data ("outliers")? Do you need to plan your data quality tasks?
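
To make the primary-key and duplicate questions concrete, here is a minimal Python sketch of the kind of counts involved. It is not how TOP computes its indicators; the function name `uniqueness_profile`, the result keys and the sample values are all assumptions made up for illustration.

```python
from collections import Counter


def uniqueness_profile(values):
    """Summarise one column: row, null and distinct counts plus duplicated values.

    Hypothetical helper for illustration only, not part of Talend Open Profiler.
    """
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    # Values that occur more than once, with their number of occurrences.
    duplicates = {v: n for v, n in counts.items() if n > 1}
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(counts),
        "duplicates": duplicates,
        # A column can only serve as a primary key if it is non-empty,
        # has no nulls and has no repeated value.
        "key_candidate": len(values) > 0
        and len(non_null) == len(values)
        and not duplicates,
    }


# Made-up sample column: one repeated id and one missing value.
customer_ids = [101, 102, 103, 103, None]
profile = uniqueness_profile(customer_ids)
print(profile)
```

In this sample the column is not a key candidate, because 103 appears twice and one value is missing.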

Not all features of Talend Open Profiler are described in Dylan's tutorial, but it's a very good start.

Other valuable information and tools on data quality in general are available on the Data Quality Pro website. Give it a look.


Among the features not covered by the tutorial, here are a few worth mentioning if you need to go further:
- You can set thresholds on indicators, and TOP will highlight the results that fall outside your expected ranges.
- You can study slices of numeric data with the frequency indicator (for example, what is the distribution of your customers' ages across given slices: 10-20, 20-40, 40-65 years?).
- You can evaluate how many of your email addresses (or any other kind of data) are well formed and see which ones are invalid by using the "Pattern" indicators.
- You can add your own patterns (regular expressions) to the list of the existing ones.
...
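
For readers who want to see what these indicators boil down to, here is a rough Python sketch of slice frequencies, a pattern check and a threshold test. It only illustrates the ideas and is not TOP's implementation. The regular expression is a deliberately simple assumption (not one of the patterns shipped with TOP), and the function names, sample data and the 90-100% expected range are made up.

```python
import re

# Hypothetical, deliberately simple e-mail pattern for illustration only.
EMAIL_PATTERN = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def slice_frequencies(values, bounds):
    """Count values per half-open slice [low, high) built from consecutive bounds."""
    slices = list(zip(bounds, bounds[1:]))
    counts = {f"{low}-{high}": 0 for low, high in slices}
    for v in values:
        for low, high in slices:
            if low <= v < high:
                counts[f"{low}-{high}"] += 1
                break  # each value belongs to at most one slice
    return counts


def pattern_match(values, pattern):
    """Return the number of matching values and the list of invalid ones."""
    invalid = [v for v in values if not pattern.match(v)]
    return len(values) - len(invalid), invalid


def within_threshold(value, low, high):
    """True when an indicator value lies inside the expected range (inclusive)."""
    return low <= value <= high


# Made-up sample data.
ages = [12, 25, 31, 47, 64, 70]
age_counts = slice_frequencies(ages, [10, 20, 40, 65])

emails = ["a@example.com", "bad.address", "b.c@example.org"]
matching, invalid = pattern_match(emails, EMAIL_PATTERN)
match_rate = 100.0 * matching / len(emails)

# Assumed expectation: at least 90% of the addresses should be well formed.
rate_ok = within_threshold(match_rate, 90.0, 100.0)

print(age_counts, matching, invalid, rate_ok)
```

Note that the value 70 falls outside every slice and is simply not counted here; a real profiling run would need to decide how to report such values.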


Thank you for your support,
Sebastiao Correia.


#2 2008-10-21 14:33:50

Dylan Jones
Guest

Dylan Jones said:

Re: Need a tutorial on Open Profiler?

Hi Sebastiao!

Thanks for sharing the tutorial information on talend forge. I hope others here find it useful too.

We've got more Talend Open Profiler special tutorials coming in the next few weeks and it's really positive to see both communities interacting like this.

I think you've done a great job with the product so far. It has some really neat features, not just for analysing and detecting DQ problems but also for generating a DQ workflow in the organisation, which is what it's all about. Well done, and I wish you continued success.

We're open to suggestions for the focus of future DQ tutorials, so I would ask others in this community to send me their ideas and we'll put them into the pipeline.

Just drop me a line here: http://www.dataqualitypro.com/data-quality-dylan-jones/ if you want to see anything specific on DQ in a tutorial.

Thank you again,
Dylan Jones

Founder/Editor
Data Quality Pro (http://www.dataqualitypro.com)
Data Migration Pro (http://www.datamigrationpro.com)
