DQ Execution Phase Setup

General Info

Execution phases can only be configured for the org.unidata.mdm.data[RECORD_UPSERT_START] pipeline. Phases let you group quality rules into collections that run at different stages of the pipeline, which spreads the processing load across the system. Each phase handles quality rules in its own way.

The following phases are available:

  • ORIGIN. Runs first. Throws a systemValidationException (validation error) if at least one quality rule in validation mode returns an error with the RED indicator (high criticality). In this case, the insertion of records stops. While the phase is running, quality rules in enrichment mode update the records; updating stops when the first validation error occurs.

  • ETALON. Runs last. Does not raise errors of its own: any validation errors in this phase are only indexed. Quality rules in enrichment mode update the records.
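The difference between the two phases can be illustrated with a minimal Python sketch. It is not the platform's implementation: the names Rule, run_phase and SystemValidationError are hypothetical stand-ins, and the handling of non-RED errors in the ORIGIN phase (collected here, the same way as in ETALON) is an assumption, since the text above only describes RED errors.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class SystemValidationError(Exception):
    """Hypothetical stand-in for the systemValidationException described above."""


@dataclass
class Rule:
    name: str
    mode: str  # "validation" or "enrichment"
    # Validation rules return a severity such as "RED", or None when the record passes.
    # Enrichment rules modify the record in place and return None.
    apply: Callable[[dict], Optional[str]]


def run_phase(phase: str, rules: list, record: dict) -> list:
    """Run rules in order and return the validation errors that would be indexed."""
    indexed_errors = []
    for rule in rules:
        if rule.mode == "enrichment":
            rule.apply(record)
            continue
        severity = rule.apply(record)
        if severity is None:
            continue
        if phase == "ORIGIN" and severity == "RED":
            # ORIGIN: a RED validation error stops processing, so later
            # enrichment rules never touch the record.
            raise SystemValidationError(rule.name)
        # ETALON (and, by assumption, non-RED errors in ORIGIN): only record the error.
        indexed_errors.append((rule.name, severity))
    return indexed_errors


# Hypothetical rules used only for illustration.
example_rules = [
    Rule("trim_name", "enrichment", lambda r: r.update(name=r["name"].strip())),
    Rule("name_required", "validation", lambda r: "RED" if not r["name"] else None),
    Rule("upper_name", "enrichment", lambda r: r.update(name=r["name"].upper())),
]
```

With these example rules, a record whose name is blank raises the error in the ORIGIN phase before the last enrichment rule runs, while the same record in the ETALON phase passes through all rules and the error is only returned for indexing.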

The internal API of the DQ model allows collections of rule sets to be built in two ways:

  • Use the entity/lookup entity name and the phase name. Used by the ORIGIN and ETALON connectors. These connectors build a collection of quality rules according to how the rules are marked in the “Data Quality” section → “Assignments” tab. For example, if several sets are assigned to the ORIGIN phase and the ORIGIN phase is included in the pipeline, all quality rules in those sets are selected for the ORIGIN phase.

  • Use only the entity/lookup entity name. Used by the DQ connector that is contained in the default pipeline. The default connector selects all rule sets, regardless of how they are marked in the “Data Quality” section → “Assignments” tab.
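The two selection modes can be sketched as a single lookup with an optional phase filter. This is an illustration only: Assignment and select_rule_sets are hypothetical names, not the internal API of the DQ model, and the flat list of assignments is an assumed simplification of what the “Assignments” tab stores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assignment:
    entity: str    # entity or lookup entity name
    rule_set: str  # name of the assigned rule set
    phase: str     # "ORIGIN" or "ETALON"


def select_rule_sets(assignments, entity: str, phase: Optional[str] = None) -> list:
    """Return rule set names for an entity, optionally restricted to one phase.

    With a phase, this mimics the ORIGIN/ETALON connectors; without one, it
    mimics the default DQ connector, which ignores the phase marking.
    """
    matches = (
        a.rule_set
        for a in assignments
        if a.entity == entity and (phase is None or a.phase == phase)
    )
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(matches))


# Hypothetical assignments used only for illustration.
example_assignments = [
    Assignment("person", "checks_a", "ORIGIN"),
    Assignment("person", "checks_b", "ORIGIN"),
    Assignment("person", "checks_c", "ETALON"),
    Assignment("company", "checks_d", "ORIGIN"),
]
```

Calling select_rule_sets(example_assignments, "person", "ORIGIN") returns only the sets marked for ORIGIN, while omitting the phase returns every set assigned to the entity.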

Pipeline Configuration

To configure quality rule phases:

  1. Open the org.unidata.mdm.data[RECORD_UPSERT_START] pipeline.

  2. Remove the default Connector-type segment from the pipeline: [RECORD_QUALITY_CONNECTOR]. This connector is located in the DRAFT selector of the REGULAR branch. It must be removed because otherwise the standard DQ connector conflicts with the phase connectors.

  3. After the segment of Point type: [RECORD_UPSERT_MODBOX], add another one: [RECORD_UPSERT_QUALITY_ORIGIN].

  4. After the segment of Point type: [RECORD_UPSERT_TIMELINE], add another one: [RECORD_UPSERT_QUALITY_ETALON].

  5. Click “Save” in the upper right corner of the screen.

To use the added phases, configure them in the “Data Quality” section → “Execution phase” tab.
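The effect of steps 2 to 4 on the segment order can be sketched in Python. The sketch treats the pipeline as a flat list of segment IDs, which is a simplification (the real pipeline has branches and selectors and is edited in the UI); the function name configure_phases and the placeholder segment SOME_OTHER_SEGMENT are hypothetical.

```python
def configure_phases(segments: list) -> list:
    """Return a new segment list with the default DQ connector replaced by phase segments."""
    # Step 2: drop the default connector so it does not conflict with the phase connectors.
    result = [s for s in segments if s != "RECORD_QUALITY_CONNECTOR"]

    # Steps 3 and 4: insert each phase segment directly after its anchor segment.
    insertions = (
        ("RECORD_UPSERT_MODBOX", "RECORD_UPSERT_QUALITY_ORIGIN"),
        ("RECORD_UPSERT_TIMELINE", "RECORD_UPSERT_QUALITY_ETALON"),
    )
    for anchor, phase_segment in insertions:
        if phase_segment in result:
            continue  # already configured, keep the list unchanged
        # list.index raises ValueError if the anchor segment is missing.
        result.insert(result.index(anchor) + 1, phase_segment)
    return result


# Hypothetical starting order; only the bracketed IDs from the steps above are real.
before = [
    "RECORD_UPSERT_MODBOX",
    "RECORD_QUALITY_CONNECTOR",
    "RECORD_UPSERT_TIMELINE",
    "SOME_OTHER_SEGMENT",
]
after = configure_phases(before)
```

In the resulting list, the ORIGIN segment follows [RECORD_UPSERT_MODBOX], the ETALON segment follows [RECORD_UPSERT_TIMELINE], and the default connector is gone. Running the function again on its own output leaves the list unchanged.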

Phase Configuration

Figure 1. Phase Configuration.