Contacts Mastering
Use the Contacts Mastering template to master marketing data, such as B2C contacts, using an industry-standard schema and trained machine learning model. The template also provides enrichment for phone numbers and for contact and organization addresses, helping to ensure that you have the most complete and up-to-date information.
Input Dataset Requirements
As part of the mastering flow, Tamr Cloud aligns your input datasets to a unified schema with predefined output fields. Tamr Cloud uses these predefined output fields to enrich your data and consolidate similar records into entities.
In addition to the general Requirements for Input Datasets, certain data is required for Contacts mastering.
You must map one or more input fields to each of the predefined output fields:
contact_address_line_1
contact_address_line_2
contact_city
contact_country
contact_postal_code
contact_region
email
fax_number
first_name
last_name
middle_name
name_prefix
name_suffix
org_address_line_1
org_address_line_2
org_city
org_country
org_name
org_name_alt
org_postal_code
org_region
phone_number
phone_number_alt
professional_title
trusted_id
This is a unique identifier such as a customer or contact ID. The clustering model considers records with the same value for this field to be a match. Map a field to this output field only if records that have the same value for this field should be considered a match. If the values in the input do not represent a definite match, map an empty placeholder field to trusted_id, then add the following transformation to the Create tamr_record_id step: select *, null as trusted_id
unique_key
This is the primary key for the dataset. See About Primary Keys for more information.
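As a rough illustration of the mapping requirement above, the following Python sketch checks that every predefined output field has at least one input field mapped to it. The function name `missing_mappings` and the input field names are hypothetical; in Tamr Cloud the mapping is done in the Align to Contacts Data Model step, not in Python.

```python
# Predefined output fields listed in this section.
REQUIRED_OUTPUT_FIELDS = [
    "contact_address_line_1", "contact_address_line_2", "contact_city",
    "contact_country", "contact_postal_code", "contact_region", "email",
    "fax_number", "first_name", "last_name", "middle_name", "name_prefix",
    "name_suffix", "org_address_line_1", "org_address_line_2", "org_city",
    "org_country", "org_name", "org_name_alt", "org_postal_code",
    "org_region", "phone_number", "phone_number_alt", "professional_title",
    "trusted_id", "unique_key",
]


def missing_mappings(mapping: dict) -> list:
    """Return the output fields that have no input field mapped to them."""
    return [field for field in REQUIRED_OUTPUT_FIELDS if not mapping.get(field)]


# Hypothetical mapping: output field -> list of input fields.
mapping = {field: [f"src_{field}"] for field in REQUIRED_OUTPUT_FIELDS}
mapping["fax_number"] = []  # nothing mapped yet

unmapped = missing_mappings(mapping)
print(unmapped)
```

An empty placeholder field mapped to trusted_id still counts as a mapping here, which mirrors the guidance above for inputs that have no definite-match identifier.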
Clustering Model
For Contacts mastering, the clustering model considers the similarity of values for the following fields, and then uses decision-tree logic to accurately identify records that refer to the same entity:
- Name
- Street address
- Phone number
- User email
Except in rare edge cases, records with the same trusted ID are clustered together, while records with different trusted IDs are not clustered together.
Tamr Cloud looks for similarities in these fields, not exact matches. For example, two addresses on the same street can correspond to the same person.
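To make the difference between similarity and exact matching concrete, here is a minimal Python sketch. It uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure; this is an assumption for illustration only and is not the trained model or decision-tree logic that Tamr Cloud applies.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


address_a = "123 Main Street"
address_b = "123 Main St."

exact_match = address_a == address_b      # strict equality fails
score = similarity(address_a, address_b)  # but the strings are still close

print(exact_match, round(score, 2))
```

The two strings are not equal, yet they score as highly similar, which is the kind of signal a similarity-based model can use where an exact-match rule would miss the pair.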
Modifying the Mastering Flow
Designer Flow Step Overview
When you create an entity type using the Contacts mastering template, Tamr Cloud creates a mastering flow in Designer with steps specific to Contacts mastering:
- Add Data
- Align to Contacts Data Model
- Create tamr_record_id
- Prepare Data for Primary Phone Enrichment
- Enrich Primary Phone
- Prepare Data for Alt Phone Enrichment
- Enrich Alt Phone
- Prepare Data for Org Address Enrichment
- Enrich Org Address
- Prepare Data for Contact Address Enrichment
- Enrich Contact Address
- Prepare for Clustering
- Apply Clustering Model
- Consolidate Records
- Deliver to Studio
Add Data
Ensure your input data meets both the general requirements for input datasets and the specific requirements for this template.
Then, see Adding a Data Source.
Align to Contacts Data Model
See Mapping Input Fields to a Unified Schema.
Create tamr_record_id
This transformation step ensures that each source record has a unique primary key across all source datasets. It concatenates the source dataset name and the record's unique key, separated by an underscore, into a new primary key field: tamr_record_id.
Important: If records within the same source dataset have duplicate primary key values, the tamr_record_id values for those records will also be duplicates.
You do not need to modify this step.
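The concatenation described for this step can be sketched in Python as follows. The function name and the sample dataset names are hypothetical, and the actual step is implemented as a Tamr Cloud transformation rather than Python.

```python
def make_tamr_record_id(source_dataset: str, unique_key: str) -> str:
    """Join the source dataset name and the unique key with an underscore."""
    return f"{source_dataset}_{unique_key}"


records = [
    {"source_dataset": "crm_contacts", "unique_key": "1001"},
    {"source_dataset": "web_signups", "unique_key": "1001"},
    # Duplicate primary key within the same source dataset:
    {"source_dataset": "crm_contacts", "unique_key": "1001"},
]

ids = [make_tamr_record_id(r["source_dataset"], r["unique_key"]) for r in records]
print(ids)
```

The first two records share a unique key but come from different datasets, so their IDs differ. The first and third records come from the same dataset with the same key, so their IDs collide, which is the situation the Important note describes.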
Prepare Data for Primary Phone Enrichment
This step transforms the data in the unified dataset to match the expected inputs for the phone number enrichment service.
You do not need to make changes to this step.
Enrich Primary Phone
This step validates, standardizes, and enriches phone number data for each contact’s primary phone number, stored in the phone_number field in the unified dataset. You do not need to make changes to this step.
See Phone Number Enrichment for information about this enricher, including the output fields it adds to your data.
Prepare Data for Alt Phone Enrichment
This step prepares the unified dataset ahead of enriching phone number data for each contact’s alternate phone number, stored in the phone_number_alt field in the unified dataset.
This step renames the phone number enrichment fields returned for the primary phone number, prepending primary_ to each field name. For example, enriched_phone_carrier becomes primary_enriched_phone_carrier. These changes allow you to easily identify enrichment fields related to the primary phone number.
You do not need to make changes to this step.
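The prefix renaming used in this step, and in the later steps that add alt_, org_, and contact_ prefixes, can be sketched in Python as below. The helper name and the sample values are hypothetical; only the field names enriched_phone_carrier and primary_enriched_phone_carrier come from this page.

```python
def prefix_enrichment_fields(record: dict, prefix: str,
                             marker: str = "enriched_phone_") -> dict:
    """Return a copy of record where keys starting with marker gain the prefix."""
    return {
        (f"{prefix}{key}" if key.startswith(marker) else key): value
        for key, value in record.items()
    }


record = {
    "phone_number": "617-555-0100",              # unified field, left untouched
    "enriched_phone_carrier": "Example Carrier",  # enrichment output, renamed
}

renamed = prefix_enrichment_fields(record, "primary_")
print(renamed)
```

Renaming before the next enrichment runs keeps the primary phone results distinguishable once the same enricher returns fields with the same names for the alternate phone number.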
Enrich Alt Phone
This step validates, standardizes, and enriches phone number data for each contact’s alternate phone number, stored in the phone_number_alt field in the unified dataset.
You do not need to make changes to this step.
See Phone Number Enrichment for information about this enricher, including the output fields it adds to your data.
Prepare Data for Org Address Enrichment
This step transforms the data in the unified dataset to match the expected inputs for the address enrichment service.
Additionally, the step renames the phone number enrichment fields returned for the alternate phone number, prepending alt_ to each field name. For example, enriched_phone_carrier becomes alt_enriched_phone_carrier. These changes allow you to easily identify enrichment fields related to the alternate phone number.
By default, this enriched alternate phone number data is used in the clustering model, but is not included in the final mastering flow output. You can modify the Consolidate Records step and the Deliver to Studio step to include additional fields.
Enrich Org Address
This step standardizes and enriches address data for the contact’s organization, stored in the org_address_line_1, org_address_line_2, org_city, org_region, org_postal_code, and org_country fields in the unified dataset. You do not need to make changes to this step.
See Address Enrichment for information about this enricher, including the output fields it adds to your data.
Prepare Data for Contact Address Enrichment
This step transforms the data in the unified dataset to match the expected inputs for the address enrichment service.
Additionally, the step renames the address enrichment fields returned for the contact’s organization address, prepending org_ to each field name. For example, enriched_address_city becomes org_enriched_address_city. These changes allow you to easily identify enrichment fields related to the organization address.
Enrich Contact Address
This step standardizes and enriches the contact’s address data, stored in the contact_address_line_1, contact_address_line_2, contact_city, contact_region, contact_postal_code, and contact_country fields in the unified dataset. You do not need to make changes to this step.
See Address Enrichment for information about this enricher, including the output fields it adds to your data.
Prepare for Clustering
This step transforms the data in the unified dataset to create the fields used by the trained clustering model to identify similar and matching records. The fields created as input to the model are prefixed with ml_. Many of these ml_ fields are created as arrays of unified source fields and fields added by the enrichment services. The model identifies the most similar values across the arrays and assigns weights based on these similarities.
Additionally, the step renames the address enrichment fields returned for the contact’s address, prepending contact_ to each field name. For example, enriched_address_city becomes contact_enriched_address_city. These changes allow you to easily identify enrichment fields related to the contact’s address.
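As a loose illustration of how an ml_ field might be assembled as an array of unified and enriched values, consider the Python sketch below. The field name ml_phone, the helper function, and the choice to drop empty and repeated values are all assumptions made for this example; the actual ml_ fields and their construction are defined by the template's transformation.

```python
def build_ml_array(record: dict, source_fields: list) -> list:
    """Collect distinct non-empty values from source_fields, preserving order."""
    values = []
    for field in source_fields:
        value = record.get(field)
        if value and value not in values:
            values.append(value)
    return values


record = {
    "phone_number": "617-555-0100",
    "phone_number_alt": "",  # empty, so skipped in this sketch
    "primary_enriched_phone_national_format": "(617) 555-0100",
}

# "ml_phone" is a hypothetical field name used only for illustration.
record["ml_phone"] = build_ml_array(
    record,
    ["phone_number", "phone_number_alt", "primary_enriched_phone_national_format"],
)
print(record["ml_phone"])
```

Gathering raw and enriched variants into one array gives a model several candidate values to compare per record, so that two records can match on whichever pair of values is most similar.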
Apply Clustering Model
This read-only step groups records that refer to the same entity into a cluster, using the trained model.
Note: You can publish the output of this step by publishing the "Source Records by Entity" dataset. See Available Published Datasets.
Consolidate Records
This step applies rules to produce a single entity record that best represents a cluster. For most fields, these rules select the most common value from the clustered records.
Additionally, this step adds a Tamr ID (tamr_id) to each entity. The Tamr ID is a unique, persistent ID.
You do not need to modify this step unless:
- You added output fields in the Align to Contacts Data Model step and want those fields to be included in the final output for the entity type.
- You want to add fields returned by the phone enrichment service for alternate phone numbers.
See Modifying Transformations for Your Data for instructions on adding fields in this step.
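The "most common value" rule that this step applies to most fields can be sketched in Python as follows. This is a simplified illustration, not the template's actual rules: the function names are hypothetical, and ignoring empty values is an assumption made for this example.

```python
from collections import Counter


def most_common_value(values):
    """Return the most frequent non-empty value, or None if there is none."""
    non_empty = [v for v in values if v not in (None, "")]
    if not non_empty:
        return None
    return Counter(non_empty).most_common(1)[0][0]


def consolidate(cluster, fields):
    """Build one entity record by taking the most common value per field."""
    return {f: most_common_value([r.get(f) for r in cluster]) for f in fields}


cluster = [
    {"first_name": "Ana", "contact_city": "Boston"},
    {"first_name": "Ana", "contact_city": "Cambridge"},
    {"first_name": "Anna", "contact_city": "Boston"},
]

entity = consolidate(cluster, ["first_name", "contact_city"])
print(entity)
```

Each field is resolved independently, so the resulting entity record can combine values that originate from different source records in the cluster.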
Deliver to Studio
This step allows you to configure how entity data appears in Studio, Curator, and published datasets. See Configuring Data Display in Studio.
The Deliver to Studio step is configured by group as follows by default:
Contact:
Unified Field | Mapped Display Name |
---|---|
tamr_persistent_id | Tamr ID |
full_name | Full Name |
professional_title | Title |
company_name | Company |
most_common_origin_email | |
phone_number | Phone |
contact_address_line_1 | Address Line 1 |
contact_city | City |
contact_region | Region |
contact_postal_code | Postal Code |
contact_country | Country |
entityID | Entity ID |
Organization:
Unified Field | Mapped Display Name |
---|---|
org_name | Organization Name |
org_address_line_1 | Organization Address |
org_city | Organization City |
org_region | Organization Region |
org_postal_code | Organization Postal Code |
org_country | Organization Country |
Enriched Phone Information:
Unified Field | Mapped Display Name |
---|---|
primary_enriched_phone_national_format | Enriched Phone |
primary_enriched_phone_type | Enriched Phone Type |
Enriched Contact Address Information:
Unified Field | Mapped Display Name |
---|---|
contact_enriched_full_address | Enriched Contact Address |
contact_enriched_address_city | Enriched Contact City |
contact_enriched_address_region | Enriched Contact Region |
Enriched Organization Information:
Unified Field | Mapped Display Name |
---|---|
org_enriched_full_address | Enriched Organization Address |
org_enriched_address_city | Enriched Organization City |
org_enriched_address_region | Enriched Organization Region |
org_enriched_address_postal_code_primary | Enriched Organization Postal Code |
org_enriched_address_country_name | Enriched Organization Country |