Connection Analysis (Exercise 1)

Learning how to connect to data sources for profiling, performing an overview analysis and taking a quick look at data with the Data Explorer.

In this exercise, you will perform a Connection Analysis, which helps you sort out which schemas you need to profile. The exercise comprises three steps: creating a database connection, creating a Database Structure Overview analysis, and discovering the structure and content of the tables in the Data Explorer.

Prerequisites:
To follow this tutorial, you need to extract and install the cif and crm databases zipped in the exampleFile.zip file available for download in the Download it! section of this tutorial.

Download it!

You want to practice?

Download exampleFile.zip to get the files used for this tutorial.

You can also download tutorialProject.zip containing all the jobs needed to carry out this tutorial.

Next Step: Performing more precise [Schema/Catalog Analysis (Exercise 2)]

 


Create a database connection


Create a new connection to the training staging database that you installed locally beforehand.

In the DQ Repository view:

Expand the Metadata node.
Right-click DB Connection.
In the menu, click New Connection to open the Database Connection wizard.

In the Database Connection wizard:

In the Name field, name your connection Staging_DB.

Click Next.

Configure the connection to the DB server where the staging databases reside.

Click Next.



By default, on a MySQL server, you do not need to specify a database name in order to connect.

Once the connection has been created, Talend Open Studio for Data Quality or Talend Enterprise Data Quality gathers information from all the databases the user has access to.
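To illustrate why the database name is optional, here is a minimal sketch that builds a connection URL in the standard MySQL JDBC form. It assumes the connection is ultimately expressed as such a URL; the function name, the host `localhost` and the port `3306` are illustrative values, not settings taken from this tutorial.

```python
def mysql_jdbc_url(host: str, port: int = 3306, database: str = "") -> str:
    """Build a MySQL JDBC URL; the database name may be left empty."""
    # With no database name, the URL targets the server as a whole,
    # so every database the user is allowed to see can be listed.
    return f"jdbc:mysql://{host}:{port}/{database}"


# Hypothetical host and port, for illustration only.
print(mysql_jdbc_url("localhost"))                  # server-level connection
print(mysql_jdbc_url("localhost", database="crm"))  # restricted to one database
```

Leaving the database name empty is what allows a single connection such as Staging_DB to cover both the cif and crm databases.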

Create a Database Structure Overview analysis


In the DQ Repository view:

Expand the Data Profiling node.
Right-click Analyses.
In the menu, click New analysis to open the Create New Analysis wizard.

In the Create New Analysis wizard:

Expand the Connection Analysis node.
Select the Database Structure Overview analysis.
Click Next.



For each analysis, the right panel displays online help describing the purpose of the selected analysis.

In the Name field, name your analysis structure_analysis.

Click Next.

Expand the DB Connections node.

Select the connection to Staging_DB.

Click Next.

Do not set any filter on tables or views, and click Finish. A tab opens and displays the analysis.

Save the analysis by clicking the Floppy Disk button (1), then execute it by clicking the Run button (2).

Analysis Parameters: in this zone, you can add filters on tables and views, which is useful for restricting the scope of the analysis.


Analysis Summary: gives basic information regarding the last execution of the analysis.


Statistical Information: provides the analysis results. Elements in red indicate that the corresponding item contains no records.
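The kind of information reported in these zones can be approximated outside the studio. The sketch below is not how the product computes its results; it uses an in-memory SQLite database as a stand-in for the MySQL staging server, and the `customer` table, the `contract_labels` view and the sample rows are invented for illustration. It lists tables and views, counts their rows, applies an optional name filter, and flags empty items.

```python
import sqlite3
from fnmatch import fnmatch


def structure_overview(conn, name_filter="*"):
    """Return (name, type, row_count) for each table or view matching the filter."""
    items = conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    overview = []
    for name, kind in items:
        if not fnmatch(name, name_filter):
            continue  # outside the analysis scope
        count = conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        overview.append((name, kind, count))
    return overview


# Stand-in data: an in-memory database with invented content.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contract (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE VIEW contract_labels AS SELECT label FROM contract;
    INSERT INTO contract (label) VALUES ('A'), ('B');
""")

for name, kind, count in structure_overview(conn):
    flag = "  <- empty" if count == 0 else ""
    print(f"{kind:5} {name:16} {count:3}{flag}")
```

Calling `structure_overview(conn, "cust*")` restricts the scope to matching names, which is the role the table and view filters play in the Analysis Parameters zone.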

During this training, we will use the cif and crm databases.
Click the cif or crm database to display additional information.

You can see the tables (left table) and views (right table) contained in each database.



Explore the structure and content of the databases and tables using the Data Explorer


Right-click the #keys column of the contract table and select View keys to display the defined primary key and open the Data Explorer perspective.


In this perspective, you can select a table in the right panel to display its columns (Columns tab) and primary key (Primary Keys tab), and to preview its contents (Preview tab).
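The three tabs correspond to three simple metadata and data queries. As a rough sketch, again using an in-memory SQLite database in place of the staging server, the function below returns a table's columns, its primary key and a short preview. The column names and sample rows of `contract` are assumptions made for this example, not the actual schema of the cif database.

```python
import sqlite3


def describe_table(conn, table, preview_rows=5):
    """Return (columns, primary_key, preview) for one table."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    columns = [(row[1], row[2]) for row in info]            # Columns tab
    key_rows = sorted((r for r in info if r[5] > 0), key=lambda r: r[5])
    primary_key = [r[1] for r in key_rows]                  # Primary Keys tab
    preview = conn.execute(
        f'SELECT * FROM "{table}" LIMIT ?', (preview_rows,)
    ).fetchall()                                            # Preview tab
    return columns, primary_key, preview


# Stand-in data with an invented schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contract (contract_id INTEGER PRIMARY KEY, label TEXT);
    INSERT INTO contract (label) VALUES ('A'), ('B');
""")

columns, primary_key, preview = describe_table(conn, "contract")
print("Columns:     ", columns)
print("Primary key: ", primary_key)
print("Preview:     ", preview)
```

Limiting the preview to a few rows keeps the query cheap on large tables, which is the same reason a preview tab shows only a sample rather than the full content.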
