About Record Consolidation Rules for RealTime Data Products
Limited Release: RealTime data products are in limited release. Some functionality is still in development and is subject to change. If you are interested in using these data products, contact Tamr Support ([email protected]).
Record consolidation rules determine the values in the single golden record that best represents a cluster of similar source records.
You can configure a rule for each attribute to ensure the selected value is appropriate for your downstream users and systems.
Each record consolidation rule is a set of filters, or conditions, that determine the set of records from which Tamr can select the golden record value. You can specify any number of conditions, which Tamr applies in sequence to filter the records.
You can filter by simple conditions, such as most common, smallest, or non-null values. You can also prioritize specific sources, so that a value is chosen from a record in a prioritized source, provided it meets the other conditions.
For example, to select the most common non-null Company value from a preferred source for the Company attribute, you would define the source priority with that source as the highest priority, and then apply a condition that keeps only records with the most common Company value. If the cluster includes records from the preferred source with non-null values for Company, the most common Company value from those records is chosen. If the cluster does not include records from that source, or if the records from that source have null values for Company, the most common non-null value across all clustered records is selected as the golden record value.
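The Company example can be sketched in Python. This is only an illustration of the behavior described here, not Tamr's implementation. The record layout (dictionaries with `source` and `Company` keys), the function names, and the source names are all hypothetical. Ties are broken by taking the smallest value, which mirrors the default Final Value Condition described later on this page.

```python
from collections import Counter


def most_common_non_null(records, attribute):
    """Keep only records whose value is the most frequent non-null value."""
    values = [r[attribute] for r in records if r.get(attribute) is not None]
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    winners = {value for value, count in counts.items() if count == top}
    return [r for r in records if r.get(attribute) in winners]


def consolidate_company(cluster, preferred_source):
    """Pick a golden Company value, preferring one source (hypothetical helper)."""
    # Source priority: use preferred-source records that have a Company value.
    preferred = [
        r for r in cluster
        if r["source"] == preferred_source and r.get("Company") is not None
    ]
    # Fall back to the whole cluster when the preferred source has nothing usable.
    candidates = preferred if preferred else cluster
    remaining = most_common_non_null(candidates, "Company")
    if not remaining:
        return None
    # Assumption: resolve ties with the smallest value.
    return min(r["Company"] for r in remaining)
```

With a cluster in which the hypothetical `crm` source contributes "Acme Inc" twice and another source contributes "ACME" three times, `consolidate_company(cluster, "crm")` returns "Acme Inc", because the most-common filter only sees the preferred source's records.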
Tamr creates default record consolidation rules for each attribute in your data product. See the topic for your data product for the default rules:
- RealTime B2B Customers Data Product
- RealTime B2C Customers Data Product
- RealTime Contacts Data Product
- RealTime Healthcare Providers Data Product
Simple Conditions
| Condition Type | Description |
|---|---|
| Most Common Value | Select the record with the most frequently occurring non-null value for the specified attribute. |
| Smallest Value | Select the record with the minimum (smallest) non-null lexicographic or numeric value. |
| Largest Value | Select the record with the maximum (largest) non-null lexicographic or numeric value. |
| Longest Value | Select the record with the longest non-null string value (by character count). |
| Shortest Value | Select the record with the shortest non-null string value (by character count). |
| Exists | Select records where the specified attribute value is, or is not, null. Choose Has Value to consider only records with non-null values for the attribute. Choose IS EMPTY to select records where the attribute is null. |
| Match Value | Select records where the specified attribute equals or does not equal a supplied value. Choose Equals to only consider records with matching values. Choose Does not equal to exclude records with matching values. |
| Match Attribute | Select records with matching or non-matching values for the two specified attributes. Null values are not considered. Choose Equals to only consider records with matching attribute values. Choose Does not equal to exclude records with matching attribute values. |
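To make the filtering behavior concrete, here is a minimal Python sketch of a few of these condition types as plain functions that take a list of records and return the records that pass. The function names, the record layout, and the `apply_conditions` helper are assumptions made for illustration. They are not part of any Tamr API.

```python
def has_value(records, attr):
    """Exists / Has Value: keep records with a non-null value."""
    return [r for r in records if r.get(attr) is not None]


def smallest_value(records, attr):
    """Smallest Value: keep records holding the minimum non-null value."""
    present = has_value(records, attr)
    if not present:
        return []
    target = min(r[attr] for r in present)
    return [r for r in present if r[attr] == target]


def longest_value(records, attr):
    """Longest Value: keep records with the longest non-null string."""
    present = has_value(records, attr)
    if not present:
        return []
    target = max(len(r[attr]) for r in present)
    return [r for r in present if len(r[attr]) == target]


def match_value(records, attr, value, equals=True):
    """Match Value: keep records that equal (or do not equal) a value.

    Assumption: with equals=False, records with a null value are kept.
    """
    return [r for r in records if (r.get(attr) == value) == equals]


def match_attribute(records, attr_a, attr_b, equals=True):
    """Match Attribute: compare two attributes; nulls are not considered."""
    return [
        r for r in records
        if r.get(attr_a) is not None
        and r.get(attr_b) is not None
        and (r[attr_a] == r[attr_b]) == equals
    ]


def apply_conditions(records, conditions):
    """Apply each condition in sequence, narrowing the candidate records."""
    for condition in conditions:
        records = condition(records)
    return records


cluster = [
    {"name": "Acme", "status": "active", "city": "Boston", "mailing_city": "Boston"},
    {"name": "Acme Corporation", "status": "active", "city": "Boston", "mailing_city": None},
    {"name": "Acme Corp", "status": "inactive", "city": "Boston", "mailing_city": "Austin"},
]
result = apply_conditions(cluster, [
    lambda rs: match_value(rs, "status", "active"),
    lambda rs: longest_value(rs, "name"),
])
print([r["name"] for r in result])  # ['Acme Corporation']
```

Each function returns a list rather than a single record, so several conditions can be chained and any remaining ties can be left to a final tie-breaking step.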
Source Priority
Define priority tiers to use records from preferred sources when determining golden record values. The golden record value is chosen from a record in the prioritized source provided it meets the other conditions.
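One way to picture priority tiers is as an ordered list of source groups, where the first tier that contains a qualifying record wins. The sketch below assumes that model. The `select_by_source_priority` function, the `meets_conditions` callback, and the source names are hypothetical, and the fallback to all qualifying records when no tier matches is an assumption based on the Company example earlier on this page.

```python
def select_by_source_priority(records, tiers, meets_conditions):
    """Return the qualifying records from the highest-priority tier that has any.

    tiers: list of sets of source names, highest priority first (hypothetical shape).
    meets_conditions: callable returning True when a record passes the other conditions.
    """
    qualifying = [r for r in records if meets_conditions(r)]
    for tier in tiers:
        in_tier = [r for r in qualifying if r["source"] in tier]
        if in_tier:
            return in_tier
    # Assumption: no prioritized source qualifies, so use every qualifying record.
    return qualifying


records = [
    {"source": "web", "phone": "111"},
    {"source": "erp", "phone": "222"},
    {"source": "crm", "phone": None},
]
tiers = [{"crm"}, {"erp", "billing"}]
chosen = select_by_source_priority(
    records, tiers, lambda r: r.get("phone") is not None
)
print(chosen)  # [{'source': 'erp', 'phone': '222'}]
```

Here the top-tier `crm` record has no phone value, so it fails the other conditions and the second tier supplies the value.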
Final Value Condition
Each attribute has a default Final Value Condition, which cannot be deleted.
When rules resolve to multiple values, this condition determines the final value in the golden record.
In most cases, the default for this condition is Smallest Value, which selects the smallest value (numerically or lexicographically) from the clustered source records. If needed, you can change it to Largest Value instead.
For custom attributes, you can also choose Collect Distinct, which creates a list of all unique values for this attribute from the source records that meet the rule conditions.
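The three Final Value Condition options can be sketched as a single tie-breaking function. This is a minimal illustration, not Tamr's implementation. The `final_value` name and the `mode` strings are hypothetical. Two details are assumptions: Python compares strings by code point, which may differ from the product's lexicographic ordering, and the distinct values are returned sorted because this page does not state the order of the list.

```python
def final_value(values, mode="smallest"):
    """Resolve several candidate values to one golden record value.

    values: candidate values left after the rule conditions were applied.
    mode: "smallest", "largest", or "collect_distinct" (hypothetical names).
    """
    candidates = [v for v in values if v is not None]
    if not candidates:
        return None
    if mode == "smallest":
        return min(candidates)
    if mode == "largest":
        return max(candidates)
    if mode == "collect_distinct":
        # Assumption: sorted order, chosen only to make the output deterministic.
        return sorted(set(candidates))
    raise ValueError(f"unknown mode: {mode}")


print(final_value([30, 7, 7, 12]))                      # 7
print(final_value([30, 7, 7, 12], "largest"))           # 30
print(final_value([30, 7, 7, 12], "collect_distinct"))  # [7, 12, 30]
```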