  |
 | Simple date validation on Contract table
|
Our first data quality rule analysis consists in creating a simple Data Quality (DQ) rule to compare the contract begin and end dates. Rule - contract end date should be after the begin date. To do so, we will use the Begin_dt and End_dt date columns of the contract table in the crm catalog.
Next
|
  |
To create the DQ rule:Expand the Libraries and Rules nodes. Right-click the SQL folder and select New DQ Rule in the menu. Name it contract_dates. Click Next. Set the Where clause with the following value: Begin_dt < End_dtClick Finish.
Next
|
  |
Tips:You can directly drag and drop the columns in the Where clause field.
Next
|
  |
Now we have to create a new Table Analysis. In the DQ Repository view:Expand the Data Profiling node. Right-clik Analyses. In the menu, click New Analysis to open the Create New Analysis wizard.
Next
|
  |
In the Create New Analysis wizard:Expand Table Analysis and select DQ Rule Analysis. Click Next. Name it contract_dates and click Next. Browse to the crm database and select the contract table, on which you want to perform the analysis on. Click Next.
Next
|
  |
Expand the Rules and SQL nodes. Select the contract_dates DQ rule that we just created. Click Finish.
Next
|
  |
Save and run the analysis. Display the Analysis Results tab.
Next
|
  |
You can now right-click on the contract_dates line or on the chart to see invalid rows by selecting View invalid rows in the menu.
Next
|
  |
The invalid rows are displayed in the SQL Editor. You can export them into a file to correct the data. To do so: Right-click the data, select Export... in the menu.
Next
|
  |
 | Verify that Customers are 18 years old or more
|
Rule: Legal constraints do not allow a person under 18 to have a contract, which means no customer can be less than 18 years old (but a minor person can be a contract beneficiary). To do so, we will work with the Cust_type and DOB columns of the cust table.
Next
|
  |
Expand the columns of the cust table of the crm database. Right-click the Cust_type column and select Preview in the menu. A preview of the Cust_type data displays on the SQL Editor. We can see that Cust_type can have 3 different values: beneficiary, customer and prospect.
Next
|
  |
Create a new DQ Rule named cust_age with the following Where clause: 18<=((TO_DAYS(NOW())-TO_DAYS(DOB))/365)
Create a new DQ Rule Analysis named cust_table. Apply on the cust table and select the cust_age DQ Rule. Click Finish.
Next
|
  |
In this analysis, we want to check that customers are not under 18. We need to exclude other person types: beneficiary and prospect. To do this, we will add a constraint on the cust_table analysis. Add a data filter by typing: Cust_type="customer"Save and run this analysis.
Next
|
  |
Click the Analysis Result tab to display the results of the analysis. We can see that 16.56% of the customers are under aged.
Next
|
  |
 | Phone number should not be empty for Customers
|
Rule: If a person is a customer, his phone number must be known. 1) We need to define the Where clause on Cust_type and Phone_num. 2) Create a new DQ Rule named cust_phone with the following Where Clause: Phone_num!=""
Next
|
  |
1) Drag and drop the cust_phone DQ Rule on the cust table in the cust_table analysis. 2) Save and 3) run the analysis.
Next
|
  |
Click the Analysis Results tab to display the results of the analysis. We can see that we don't have the phone number of 7.50% of our customers.
|