Data quality rules can be run during different phases of the pipeline. This makes it possible to:
Distribute the load throughout the system.
Apply different scenarios for running data quality rules.
The following phases are available:
ORIGIN. Runs first. If at least one validation quality rule produces a critical error, the process of inserting / updating records stops. If this phase contains enrichment quality rules, the records will also be updated until the first validation error occurs.
ETALON. Any validation errors in this phase are only indexed. The quality rules update the records in enrichment mode. The phase is started after the validity periods have been saved in the system.
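The difference between the two phases can be illustrated with a minimal Python sketch. It is not the product's API: the Rule class, the run_origin and run_etalon functions, and the sample rules are hypothetical names introduced only for illustration. The sketch assumes that only errors marked as critical stop the ORIGIN phase.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Error = Tuple[str, str]  # (rule name, error message)


@dataclass
class Rule:
    """Hypothetical rule: `apply` may modify the record (enrichment) and
    returns an error message for a failed validation, or None."""
    name: str
    apply: Callable[[dict], Optional[str]]
    critical: bool = False  # assumption: only critical errors stop ORIGIN


def run_origin(record: dict, rules: List[Rule]) -> Tuple[bool, List[Error]]:
    """ORIGIN-like behavior: stop at the first critical validation error."""
    errors: List[Error] = []
    for rule in rules:
        message = rule.apply(record)
        if message is not None:
            errors.append((rule.name, message))
            if rule.critical:
                return False, errors  # the upsert stops; later rules never run
    return True, errors


def run_etalon(record: dict, rules: List[Rule], error_index: List[Error]) -> None:
    """ETALON-like behavior: errors are only indexed, all rules still run."""
    for rule in rules:
        message = rule.apply(record)
        if message is not None:
            error_index.append((rule.name, message))


def trim_name(record: dict) -> Optional[str]:
    record["name"] = record["name"].strip()  # enrichment
    return None


def require_name(record: dict) -> Optional[str]:
    return None if record["name"] else "name is empty"  # validation


def uppercase_name(record: dict) -> Optional[str]:
    record["name"] = record["name"].upper()  # enrichment
    return None


rules = [
    Rule("trim_name", trim_name),
    Rule("require_name", require_name, critical=True),
    Rule("uppercase_name", uppercase_name),
]

record = {"name": "   "}
ok, origin_errors = run_origin(record, rules)

index: List[Error] = []
etalon_record = {"name": "   "}
run_etalon(etalon_record, rules, index)
```

In this sketch, the ORIGIN-style run applies the first enrichment rule, hits the critical validation error, and stops before the last rule. The ETALON-style run applies every rule and only records the error in the index.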
How Execution Phases Work
Phases are added to the RECORD_UPSERT_START pipeline.
The description of the phases is created in the “Data Quality” section → Execution Phase tab.
After the phase description is created, columns with the names of the execution phases appear in the “Data Quality” section → Assignment tab.
When assigning quality rules, select the entity (lookup entity). Then specify one or more quality rule sets for that entity (lookup entity) in the corresponding phase columns.
When data is loaded into the selected entity (lookup entity), the RECORD_UPSERT_START pipeline is started.
The ORIGIN phase starts first. During data loading, all quality rule sets specified for this phase and this entity (lookup entity) are triggered. If the phase ends without critical errors, the insertion of records continues.
The ETALON phase starts after all data of the record validity periods has been loaded. The ETALON phase uses validity periods, so this sequence is mandatory. This phase does not stop the load on errors; it ends by storing all error data in the indexes.
As a result, the data is loaded into the system (including all quality errors, transformations, etc.).
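The sequence above can be sketched in Python as a lookup in an assignment table followed by an ordered run of the phases. This is only an illustration under stated assumptions: the assignments dictionary, the run_phase and upsert functions, the "customer" entity, and its checks are all hypothetical, and every ORIGIN error is treated as critical to keep the example short.

```python
from typing import Callable, Dict, List, Optional

Check = Callable[[dict], Optional[str]]  # returns an error message or None

# Hypothetical stand-in for the Assignment tab:
# entity -> phase column -> assigned rule sets (each set is a list of checks).
assignments: Dict[str, Dict[str, List[List[Check]]]] = {
    "customer": {
        "origin": [[lambda r: None if r.get("id") else "id is missing"]],
        "etalon": [[lambda r: None if "@" in r.get("email", "") else "email looks invalid"]],
    }
}


def run_phase(entity: str, phase: str, record: dict) -> List[str]:
    """Run every rule set assigned to this entity for this phase."""
    errors: List[str] = []
    for rule_set in assignments.get(entity, {}).get(phase, []):
        for check in rule_set:
            message = check(record)
            if message is not None:
                errors.append(message)
    return errors


def upsert(entity: str, record: dict, storage: list, error_index: list) -> List[str]:
    """Ordering sketch: ORIGIN first, then save, then ETALON."""
    log = ["origin"]
    if run_phase(entity, "origin", record):
        # Assumption: every ORIGIN error in this sketch is critical.
        log.append("stopped")
        return log
    storage.append(record)  # stands in for saving the validity periods
    log.append("saved")
    # ETALON errors never stop the load; they are only indexed.
    error_index.extend(run_phase(entity, "etalon", record))
    log.append("etalon")
    return log


storage: list = []
error_index: list = []
good = upsert("customer", {"id": 1, "email": "nobody"}, storage, error_index)
bad = upsert("customer", {"email": "a@example.com"}, storage, error_index)
```

The first record passes ORIGIN, is saved, and gets its ETALON error indexed. The second record fails ORIGIN and is never saved.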
Before you start:
Make sure that the phases are added to the pipelines.
Go to the “Quality Rules” section if you are not already there.
Select the “Switch to advanced mode” option.
Make sure that draft mode is enabled and a draft is selected.
Go to the “Execution Phase” tab.
To add a phase:
Click “Add Execution Phase” in the upper right corner of the table.
This will open a drawer with the settings for the new phase.
Specify the logical name for the phase. Currently available: origin, etalon.
Specify the display name of the phase (any name that makes the phase easy to recognize).
Specify the description, if necessary.
Save the changes by clicking “Save” in the upper right corner of the drawer.
Figure 1. “Execution phase” tab
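The fields filled in when adding a phase can be pictured as a small record with a restricted logical name. The Python sketch below is a hypothetical model, not the product's storage format or API: the ExecutionPhase class and its field names are assumptions, and only the two logical names listed above are accepted.

```python
from dataclasses import dataclass

# The two logical names listed as currently available.
ALLOWED_LOGICAL_NAMES = {"origin", "etalon"}


@dataclass(frozen=True)
class ExecutionPhase:
    """Hypothetical model of one row in the execution phase table."""
    logical_name: str
    display_name: str
    description: str = ""  # optional, as in the drawer

    def __post_init__(self) -> None:
        if self.logical_name not in ALLOWED_LOGICAL_NAMES:
            raise ValueError(f"unsupported logical name: {self.logical_name!r}")


phase = ExecutionPhase("origin", "Pre-save checks")
```

Any other logical name, for example ExecutionPhase("golden", "Golden record checks"), raises ValueError in this sketch.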