Steps Completed by Contacts
When you create a data product using the contacts template, Tamr Cloud creates a mastering flow with steps specific to contacts mastering.
The following table describes each step in the contacts flow, and explains which steps usually need to be edited for your data.
If you need to make changes to the mastering flow beyond those described in this documentation, contact Support at [email protected] for assistance.
Step | Description | Usually Requires Changes |
---|---|---|
Add Data | You verify that source data meets both general requirements and template-specific requirements. Then, you add source data. | |
Align to Contacts Data Model | You map input columns
to attributes in the supplied schema. If you add attributes to the unified schema, you will need to update the Consolidate Records and Configure Attributes steps, as described in the rows below, to ensure that these attributes appear in your final mastered entities.
|
|
Create tamr_record_id
|
This transformation step ensures that each source record has a unique primary key across all
source datasets by adding a new primary key field: tamr_record_id. The tamr_record_id
is a 128-bit hash value of the source dataset name and the source primary
key. See the Tamr Core documentation for a description of the function used to generate the hash value.
Important: If records within the same source dataset have duplicate primary key values, the tamr_record_id values for those records will also be duplicates.
If you mapped an empty placeholder column to the trusted_id attribute, add the following transformation: SELECT *, '' as trusted_id;
|
|
Prepare Data for Org Address Enrichment | This step transforms the data in the unified dataset to match the expected inputs for the address enrichment service. | |
Enrich Org Address | This step standardizes and validates address information for the contact’s organization, stored in the
org_address_line_1, org_address_line_2, org_city, org_region,
org_postal_code, and org_country fields in the unified dataset.
Additionally, this step enriches organization addresses with latitude, longitude, and detailed address information. See Address Standardization, Validation, and Geocoding. |
|
Rename Organization Address Fields | This step renames the address enrichment fields returned for the contact’s organization
address, prepending org_ to each field name. For example,
enriched_address_city becomes org_enriched_address_city. These
changes allow you to easily identify enrichment fields related to the organization address.
|
|
Prepare Organization Address Fields | This step transforms the data in the unified dataset to create the fields used by the trained clustering model to identify similar and matching records. | |
Prepare Data for Contact Address Enrichment | This step transforms the data in the unified dataset to match the expected inputs for the address enrichment service. | |
Enrich Contact Address | This step standardizes and validates address information for the contact,
stored in the contact_address_line_1, contact_address_line_2,
contact_city, contact_region, contact_postal_code, and
contact_country fields in the unified dataset.
Additionally, this step enriches contact addresses with latitude, longitude, and detailed address information. See Address Standardization, Validation, and Geocoding. |
|
Rename Contact Address Fields | This step renames the address enrichment fields returned for the contact’s
address, prepending contact_ to each field name. For example,
enriched_address_city becomes contact_enriched_address_city. These
changes allow you to easily identify enrichment fields related to the contact address.
|
|
Prepare Contact Address Fields | This step transforms the data in the unified dataset to create the fields used by the trained clustering model to identify similar and matching records. | |
Prepare Data for Primary Phone Enrichment | This step transforms the data in the unified dataset to match the expected inputs to the phone number data quality service included in the mastering flow. This step replaces empty country values in source records with the country returned by the Enrich Contact Address step. | |
Enrich Primary Phone | This step validates, standardizes, and enriches phone number data for each contact’s primary
phone number, stored in the phone_number field in the unified dataset.
See Phone Number Enrichment.
|
|
Prepare Data for Alt Phone Enrichment | This step prepares the unified dataset ahead of enriching phone number data for each
contact’s alternate phone number, stored in the phone_number_alt field in the unified dataset.
This step also renames the phone number enrichment fields returned for the primary phone number, prepending primary_ to each field name. For example, enriched_phone_carrier
becomes primary_enriched_phone_carrier. These
changes allow you to easily identify enrichment fields related to the primary phone number.
|
|
Enrich Alt Phone | This step validates, standardizes, and enriches phone number data for each contact’s
alternate phone number, stored in the phone_number_alt field in the unified dataset.
See Phone Number Enrichment.
Important: By default, this enriched alternate phone number data is used in the clustering model, but is not included in the final mastering flow output. You can modify the Consolidate Records step and the Configure Attributes step to include additional fields. |
|
Prepare for Clustering | This step transforms the data in the unified dataset to create the fields used by the
trained clustering model to identify similar and matching records.
The fields created as input to the model are prefixed with ml_. Many of these ml_ fields are created as arrays of unified source
fields and fields added by the enrichment services. The model identifies
the most similar values across the arrays and assigns weights based on these similarities.
This step also renames the phone number enrichment fields returned for the alternate phone number, prepending alt_ to each field name. For example, enriched_phone_carrier
becomes alt_enriched_phone_carrier. These
changes allow you to easily identify enrichment fields related to the alternate phone number.
|
|
Apply Clustering Model | This step groups records that refer to the same entity into a cluster, using the trained model. See Features of Contacts. | |
Consolidate Records | This step applies rules to produce a single record, called the mastered entity record, that
best represents a cluster. For most fields, these rules select the most common value from the
clustered records.
Additionally, this step adds a Tamr ID (tamr_id) to each mastered entity record. The Tamr ID
is a unique, persistent ID.
If you added new attributes in the Schema Mapping step, or would like additional phone enrichment fields to be included in the mastering output, add lines in the transformations to tell Tamr Cloud what value to set for each attribute when creating the mastered entity. See Modifying Record Consolidation Transformations. |
|
Configure Attributes | You configure how mastered entity attributes appear in Tamr Cloud and published datasets. If you added new attributes in the Schema Mapping step, add and map those attributes in this step to include them in your final mastered entity output. See Configuring Data Display. |
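
The following Python sketch illustrates three behaviors described in the table: how a tamr_record_id built from the dataset name and primary key behaves, how enrichment fields are renamed with a prefix, and how a "most common value" rule picks a consolidated value. It is an illustration only, not Tamr Cloud's implementation. The function names, the sample dataset names, and the first_name and email fields are hypothetical, and BLAKE2b with a 16-byte digest is used purely as a stand-in 128-bit hash; the function Tamr actually uses is described in the Tamr Core documentation.

```python
import hashlib
from collections import Counter


def make_record_id(dataset_name, primary_key):
    """Return a 128-bit hex ID derived from dataset name and primary key.

    Assumption: BLAKE2b (16-byte digest) stands in for the real hash
    function, which is documented in the Tamr Core documentation.
    """
    # A separator keeps ("ab", "c") distinct from ("a", "bc").
    payload = f"{dataset_name}\x1f{primary_key}".encode("utf-8")
    return hashlib.blake2b(payload, digest_size=16).hexdigest()


def prefix_enrichment_fields(record, prefix):
    """Prepend prefix to every field whose name starts with 'enriched_'."""
    return {
        (prefix + name if name.startswith("enriched_") else name): value
        for name, value in record.items()
    }


def consolidate(cluster, fields):
    """Pick the most common non-empty value per field across a cluster."""
    mastered = {}
    for field in fields:
        values = [r[field] for r in cluster if r.get(field)]
        # most_common(1) returns [(value, count)]; ties go to the value seen first.
        mastered[field] = Counter(values).most_common(1)[0][0] if values else None
    return mastered


if __name__ == "__main__":
    # Same key in the same dataset gives the same ID (duplicates stay duplicates).
    print(make_record_id("crm_contacts", "1001") == make_record_id("crm_contacts", "1001"))
    # Same key in a different dataset gives a different ID.
    print(make_record_id("crm_contacts", "1001") == make_record_id("web_signups", "1001"))
    print(prefix_enrichment_fields({"enriched_address_city": "Boston"}, "org_"))
    print(consolidate([{"first_name": "Ann"}, {"first_name": "Anne"}, {"first_name": "Ann"}],
                      ["first_name"]))
```

The sketch makes the Important note in the Create tamr_record_id row concrete: because the ID depends only on the dataset name and the primary key, two records that share both will always receive the same ID.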