Matching Rules

Figure: Matching Rules section of the data administrator interface.

General Info

Matching rules search for duplicate data by selected attributes and group the duplicates into clusters. You can configure which items are matched, which source of information is used, and which algorithms compare the data.

Matching of entity/lookup entity records is based on first-level attributes (simple and code attributes). Records that contain relations can be matched only by creating a custom pipeline.
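As an illustration of matching on first-level attributes, the following minimal sketch groups flat records whose selected attributes are equal after simple normalization. It is not the platform's algorithm: the function names, the record layout, and the normalization step are all assumptions made for this example.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivially different values compare equal."""
    return " ".join(str(value).lower().split())


def find_clusters(records, attributes):
    """Group record ids whose selected first-level attributes are equal.

    Hypothetical helper: only groups with two or more records are returned,
    because a single record is not a duplicate of anything.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(normalize(record.get(name, "")) for name in attributes)
        groups[key].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Illustrative data only; the attribute names are invented for this sketch.
records = [
    {"id": 1, "name": "Acme Corp", "tax_code": "123"},
    {"id": 2, "name": "acme  corp", "tax_code": "123"},
    {"id": 3, "name": "Globex", "tax_code": "456"},
]
clusters = find_clusters(records, ["name", "tax_code"])
print(clusters)
```

Records 1 and 2 fall into one cluster because both attributes agree after normalization. Record 3 has no duplicate, so it does not form a cluster.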

The section contains several tabs:

  • Matching Tables - contains matching columns, typed to correspond to entity/lookup entity attributes, that are used for matching data model objects.
  • Matching Rules - contains the list of algorithms by which matching is performed on a specific table.
  • Matching Rule Sets - contains the list of rules by which matching is performed on the table columns.
  • Rules Assignment - binds previously configured matching tables and rule sets to specific attributes of entities/lookup entities.
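The relationship between the four tabs can be pictured with a small data model. This is a sketch under stated assumptions, not the platform's internal model or API: every class, field, and algorithm label below is hypothetical and serves only to show how tables, rules, rule sets, and assignments refer to each other.

```python
from dataclasses import dataclass, field


@dataclass
class MatchingTable:
    name: str
    columns: dict  # matching column name -> attribute type (hypothetical labels)


@dataclass
class MatchingRule:
    name: str
    table: MatchingTable
    algorithms: dict  # matching column name -> algorithm label (hypothetical)


@dataclass
class RuleSet:
    name: str
    rules: list = field(default_factory=list)


@dataclass
class Assignment:
    entity: str
    table: MatchingTable
    rule_set: RuleSet
    column_to_attribute: dict  # matching column -> entity attribute name


def validate(assignment):
    """Return a list of configuration problems; an empty list means consistent."""
    problems = []
    for rule in assignment.rule_set.rules:
        if rule.table is not assignment.table:
            problems.append(f"rule {rule.name!r} targets another table")
        for column in rule.algorithms:
            if column not in assignment.table.columns:
                problems.append(f"rule {rule.name!r} uses unknown column {column!r}")
    for column in assignment.table.columns:
        if column not in assignment.column_to_attribute:
            problems.append(f"column {column!r} is not bound to an attribute")
    return problems


table = MatchingTable("persons", {"full_name": "String", "tax_code": "String"})
rule = MatchingRule("exact_person", table, {"full_name": "exact", "tax_code": "exact"})
rule_set = RuleSet("person_rules", [rule])
# The tax_code column is deliberately left unbound to show a reported problem.
assignment = Assignment("Person", table, rule_set, {"full_name": "fullName"})
problems = validate(assignment)
print(problems)
```

The check mirrors the configuration order described in this section: a rule depends on a table, a rule set depends on rules, and an assignment ties both to entity attributes.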

Clusters of duplicate data model objects are formed according to the rule settings. The list of clusters and their contents can be viewed in the Duplicates section of the data steward interface. Cluster contents are updated in real time when you save changes to a record or delete it (depending on which pipelines are configured), or when you run the data reindexing operation.
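The two update paths, per-record updates on save/delete and a full rebuild on reindexing, can be sketched with a small in-memory index. This is only a conceptual model: the `ClusterIndex` class, its methods, and the exact-match key are assumptions for illustration and do not describe how the platform stores or computes clusters.

```python
from collections import defaultdict


class ClusterIndex:
    """Hypothetical in-memory index that keeps duplicate clusters up to date."""

    def __init__(self, attributes):
        self.attributes = attributes
        self.key_by_id = {}                 # record id -> matching key
        self.ids_by_key = defaultdict(set)  # matching key -> record ids

    def _key(self, record):
        return tuple(str(record.get(a, "")).strip().lower() for a in self.attributes)

    def upsert(self, record):
        """Real-time path: called when a record is saved."""
        self.delete(record["id"])  # drop the stale key, if any
        key = self._key(record)
        self.key_by_id[record["id"]] = key
        self.ids_by_key[key].add(record["id"])

    def delete(self, record_id):
        """Real-time path: called when a record is deleted."""
        key = self.key_by_id.pop(record_id, None)
        if key is not None:
            self.ids_by_key[key].discard(record_id)
            if not self.ids_by_key[key]:
                del self.ids_by_key[key]

    def rebuild(self, records):
        """Batch path: recompute everything, as a reindexing run would."""
        self.key_by_id.clear()
        self.ids_by_key.clear()
        for record in records:
            self.upsert(record)

    def clusters(self):
        return sorted(sorted(ids) for ids in self.ids_by_key.values() if len(ids) > 1)


index = ClusterIndex(["email"])
index.upsert({"id": 1, "email": "a@example.org"})
index.upsert({"id": 2, "email": "A@Example.org "})
after_save = index.clusters()

index.delete(2)
after_delete = index.clusters()

index.rebuild([
    {"id": 1, "email": "a@example.org"},
    {"id": 3, "email": "a@example.org"},
])
after_rebuild = index.clusters()
```

In this sketch both paths end in the same cluster state for the same data. The real-time path touches only the changed record, whereas the rebuild recomputes every key.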

See also the Duplicate Search Concept.

Launch of Matching Rules

Launch with Pipelines:

  • In the Platform parameters section of the system administrator interface, enable the Real-time matching checkbox in the Data matching settings module.
  • In the “Pipelines” section of the system administrator interface, configure the data matching pipeline.
  • Configure the matching mechanisms in the following sequence: Create matching tables → Create matching rules → Create rule sets → Assign rules to the entity/lookup entity.
  • See the results in the Duplicates section.

Launch with the data reindexing operation:

  • Configure the matching mechanisms in the following sequence: Create matching tables → Create matching rules → Create rule sets → Assign rules to the entity/lookup entity.
  • In the “Operations” section of the system administrator interface, create and run the reindexDataJob operation with the Update matching tables data parameter enabled.
  • See the results in the Duplicates section.