Steps Completed by B2B Customers with D&B

When you create a data product using the B2B customers with D&B template, Tamr Cloud creates a mastering flow with steps specific to mastering and enriching company data.

This topic describes each step in the B2B customers with D&B flow, and explains which steps usually need to be edited for your data.

If you need to make changes to the mastering flow beyond those described in this documentation, contact Support at [email protected] for assistance.

Add Data

Ensure your input data meets both the general requirements for source datasets and the specific requirements for this template.

Then, add your source data to the mastering flow. See Adding Data to Your Data Product.

Align to Customer Model

Map input columns to attributes in the supplied schema. See Mapping Input Fields to a Unified Schema.

If you add attributes to the unified schema, you will need to update the following steps, as described in the sections below, to ensure that these attributes appear in your final mastered entities:

  • Consolidate Records
  • Consolidate Record Fields
  • Configure Attributes

Create input_record_id

This transformation step ensures that each source record has a unique primary key across all source datasets by adding a new primary key field: input_record_id. The input_record_id is a 128-bit hash value of the source dataset name and the source primary key. See the Tamr Core documentation for a description of the function used to generate the hash value.

If you mapped an empty placeholder column to the trusted_id attribute, add the following transformation: SELECT *, '' as trusted_id;. Otherwise, you do not need to modify this step.

Important: If records within the same source dataset have duplicate primary key values, the input_record_id values for those records will also be duplicated.
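The following Python sketch illustrates the idea behind input_record_id. It is not the function Tamr uses: MD5 stands in here only because it also produces a 128-bit digest, and the function name, separator, and sample dataset names are assumptions made for illustration.

```python
import hashlib


def input_record_id(dataset_name: str, primary_key: str) -> str:
    """Return a 128-bit hex digest of the dataset name and source primary key.

    Assumption: MD5 is an illustrative stand-in for the real hash function.
    """
    # A separator keeps ("ab", "c") distinct from ("a", "bc").
    payload = f"{dataset_name}\x1f{primary_key}".encode("utf-8")
    return hashlib.md5(payload).hexdigest()  # 32 hex characters = 128 bits


# The same primary key in two different datasets yields two different IDs.
crm_id = input_record_id("crm.csv", "1001")
erp_id = input_record_id("erp.csv", "1001")

# A duplicate primary key within one dataset yields the same ID again,
# which is the situation the Important note above warns about.
dup_id = input_record_id("crm.csv", "1001")
```

Because the dataset name is part of the hashed value, key collisions across datasets are avoided, but duplicates inside a single dataset are not.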

Enrich Country Code

This step enriches records with standardized ISO 3166-1 alpha-2 two-character country code values (enriched_country_code_2_character). You do not need to make changes to this step.

The enriched_country_code_2_character field is used as the country code input in the Enrich Company Name and D-U-N-S Match steps later in the mastering flow. When this field is available, it is also used in place of the input country field in clustering and in the mastered output.
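As a rough illustration of the fallback behavior described above, the Python sketch below prefers the enriched code and falls back to the input country when no code is available. The lookup table, the function names, and the input field name country are hypothetical; the actual enricher is a Tamr service and covers the full ISO 3166-1 list.

```python
from typing import Optional

# Tiny illustrative lookup; not the enricher's real data.
ALPHA2_BY_NAME = {
    "united states": "US",
    "germany": "DE",
    "japan": "JP",
}


def enrich_country_code(country: Optional[str]) -> Optional[str]:
    """Return an ISO 3166-1 alpha-2 code, or None if the value is unknown."""
    if not country:
        return None
    value = country.strip()
    # Accept values that are already a known two-character code.
    if len(value) == 2 and value.upper() in ALPHA2_BY_NAME.values():
        return value.upper()
    return ALPHA2_BY_NAME.get(value.lower())


def country_for_clustering(record: dict) -> Optional[str]:
    """Prefer the enriched code; fall back to the raw input country."""
    return record.get("enriched_country_code_2_character") or record.get("country")
```

For example, a record whose country is "Germany" would be clustered on "DE", while a record with an unrecognized country keeps its original input value.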

See Country Code Enrichment for information about this enricher, including the output fields it adds to your data.

Enrich Company Name

This step cleans and enriches company name data. You do not need to make changes to this step.

See Company Name Enrichment for information about this enricher, including the output fields it adds to your data.

D-U-N-S Match

D-U-N-S Match enrichment provides the D-U-N-S number and additional information from Dun & Bradstreet (D&B) for each record for which a D-U-N-S match is identified.

This step supports the D&B Asian Language Matching feature, set to auto-detect the language for name and address lookup. See the D&B description of Asian Language Matching for more information.

All input fields for this enrichment step are optional. If you need to remove an input field from this step, remove the mapping for that field in the D-U-N-S Match step and save the change. This field will no longer be included in the input match parameters when you run the flow. You do not need to modify this step unless you want to remove an input field from D-U-N-S Match enrichment.

The clustering model uses the output fields listed below to identify high-confidence, accepted matches. If a record does not meet the match acceptance criteria, these fields are not used by the clustering model for that record. You specify the match acceptance criteria in the next step: Set Confidence Code Thresholds and Match Grade Patterns.

  • dnb_match_candidates_match_quality_information_confidence_code
  • dnb_match_candidates_organization_duns
  • dnb_match_candidates_organization_primary_name
  • dnb_match_candidates_organization_telephone_telephone_number_0
  • dnb_match_candidates_organization_website_address_domain_name_0
  • dnb_match_candidates_organization_primary_address_street_address_line1
  • dnb_match_candidates_organization_primary_address_street_address_line2
  • dnb_match_candidates_organization_primary_address_address_region_name
  • dnb_match_candidates_organization_primary_address_postal_code
  • dnb_match_candidates_organization_primary_address_address_country_code_iso_alpha2_code
  • dnb_match_candidates_organization_primary_address_locality_name
  • dnb_match_candidates_organization_mailing_address__street_address_line1
  • dnb_match_candidates_organization_mailing_address__street_address_line2
  • dnb_match_candidates_organization_mailing_address_address_region_name
  • dnb_match_candidates_organization_mailing_address__postal_code
  • dnb_match_candidates_organization_mailing_address_address_country_iso_alpha2_code
  • dnb_match_candidates_organization_mailing_address_address_locality_name
  • dnb_match_candidates_organization_corporate_linkage_familytree_roles_played_description_0

See D-U-N-S Match for information about all of the output fields provided by this enricher.

Set Confidence Code Thresholds and Match Grade Patterns

In the Set Confidence Code Thresholds and Match Grade Patterns transformation step, you can adjust the criteria used to determine the D-U-N-S match quality and whether the match is accepted. These criteria include the Confidence Code threshold and the D&B Match Grade Patterns.

By default, the D&B Match Grade Patterns are disabled in the transformation logic. If you enable Match Grade Patterns:

  • If the Confidence Code threshold is not met, Tamr Cloud uses the Match Grade Patterns to determine whether to apply the D-U-N-S match result.
  • If neither the Confidence Code threshold nor a Match Grade Pattern is met, Tamr Cloud does not accept or apply the D-U-N-S match result.
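One reading of the acceptance rules above can be sketched in Python as follows. This is not the transformation logic shipped in the step: the function name, the default threshold of 8, the sample patterns, and the use of ? as a single-character wildcard are all assumptions chosen for illustration.

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence


def accept_duns_match(
    confidence_code: Optional[int],
    match_grade: Optional[str],
    min_confidence: int = 8,                  # hypothetical threshold
    patterns: Optional[Sequence[str]] = None,  # None or empty = patterns disabled
) -> bool:
    """Decide whether a D-U-N-S match result is accepted."""
    # A match that meets the Confidence Code threshold is accepted outright.
    if confidence_code is not None and confidence_code >= min_confidence:
        return True
    # Otherwise, fall back to Match Grade Patterns, but only if enabled.
    if patterns and match_grade:
        return any(fnmatchcase(match_grade, p) for p in patterns)
    # Neither criterion is met: the match is not applied.
    return False


enabled_patterns = ["AA??A??????"]  # hypothetical pattern list

high_confidence = accept_duns_match(9, "AAZZAZZFAZZ")
low_no_patterns = accept_duns_match(6, "AAZZAZZFAZZ")
low_pattern_hit = accept_duns_match(6, "AAZZAZZFAZZ", patterns=enabled_patterns)
low_pattern_miss = accept_duns_match(6, "BFZZFZZFFZZ", patterns=enabled_patterns)
```

With patterns disabled (the default), only the Confidence Code decides acceptance. Enabling patterns can only accept additional matches; it never rejects one that already meets the threshold.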

Follow the instructions in the step to change the Confidence Code, enable Match Grade Patterns, and add or remove Match Grade Patterns.

See the D&B Entity Matching Guide for detailed information about setting these criteria.

Prepare for Clustering

This step transforms the data in the unified dataset to create the fields used by the trained clustering model to identify similar and matching records. The fields created as input to the model are prefixed with ml_. Many of these ml_ fields are created as arrays of unified source fields and fields added by the enrichment services. The model identifies the most similar values across the arrays and assigns weights based on these similarities.
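To make the array-based ml_ fields more concrete, the Python sketch below collects values from several fields into one array and then scores two arrays by their best-matching pair of values. It is only an analogy: the helper names, the field names ml_company_name, company_name, and enriched_company_name, and the use of difflib as a similarity measure are assumptions, and the trained model's actual features and weights are not described here.

```python
from difflib import SequenceMatcher


def build_ml_array(record: dict, source_fields: list) -> list:
    """Collect distinct, non-empty values from several fields into one array."""
    values = []
    for field in source_fields:
        value = record.get(field)
        if value and value not in values:
            values.append(value)
    return values


def best_similarity(left: list, right: list) -> float:
    """Return the highest pairwise similarity between two arrays of strings."""
    scores = [
        SequenceMatcher(None, a.lower(), b.lower()).ratio()
        for a in left
        for b in right
    ]
    return max(scores, default=0.0)


record = {
    "company_name": "Acme Corp.",
    "enriched_company_name": "ACME",  # hypothetical enrichment output field
    "dnb_match_candidates_organization_primary_name": "Acme Corporation",
}
record["ml_company_name"] = build_ml_array(
    record,
    [
        "company_name",
        "enriched_company_name",
        "dnb_match_candidates_organization_primary_name",
    ],
)
score = best_similarity(record["ml_company_name"], ["acme"])
```

Comparing arrays rather than single values means two records can be judged similar if any one of their name variants lines up, even when the raw input names differ.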

You do not need to modify this step.

Apply Clustering Model

This read-only step groups records that refer to the same entity into a cluster, using the trained model.

Note: You can publish the output of this step by publishing the "Source Records by Entity" dataset. See Available Published Datasets.

Consolidate Records

This step applies rules to produce a single record, called the mastered entity record, that best represents a cluster. For most fields, these rules select the most common value from the clustered records.
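The "most common value" rule can be sketched in Python as shown below. This is an illustration of the rule only, not the step's transformation script; the function names and sample field names are hypothetical, and how the real step treats ties and empty values is not stated in this topic.

```python
from collections import Counter
from typing import Iterable, Optional


def most_common_value(values: Iterable[Optional[str]]) -> Optional[str]:
    """Return the most frequent non-empty value, or None if there is none."""
    non_empty = [v for v in values if v not in (None, "")]
    if not non_empty:
        return None
    # Assumption: ties go to the value that appears first in the cluster.
    return Counter(non_empty).most_common(1)[0][0]


def consolidate(cluster: list, fields: list) -> dict:
    """Build one mastered entity record from the records in a cluster."""
    return {f: most_common_value(r.get(f) for r in cluster) for f in fields}


cluster = [
    {"company_name": "Acme", "city": "Boston"},
    {"company_name": "Acme", "city": None},
    {"company_name": "ACME Inc", "city": "Boston"},
]
entity = consolidate(cluster, ["company_name", "city"])
```

Here the mastered entity record takes "Acme" as the company name and "Boston" as the city, because each is the most frequent non-empty value in the cluster.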

Additionally, this step adds a Tamr ID (tamr_id) to each entity. The Tamr ID is a unique, persistent ID.

If you added a new field in the Align to Customer Model step, add a line in the transformations to tell Tamr Cloud what value to set for that field when creating the mastered entity record. See Modifying Record Consolidation Transformations.

D&B Enrich

This step returns enriched data for records for which a D-U-N-S number was identified in the D-U-N-S Match step. D&B Enrich enrichment provides data from the following D&B Data Blocks:

  • Company Information L2
  • Hierarchies and Connections L1

This step is pre-configured based on the D-U-N-S Match step; you do not need to modify this step.

See D&B Enrich for information about this enricher, including the complete list and descriptions of the output fields it adds to your data.

The final mastering flow output includes a subset of the D&B Enrich output fields. You can modify the Consolidate Record Fields step (next in the flow) to include additional fields. The fields included by default are:

  • dnb_duns_control_status_operating_status_description
  • dnb_industry_codes_0_code
  • dnb_industry_codes_0_description
  • dnb_primary_industry_code_us_sic_v4
  • dnb_primary_industry_code_us_sic_v4_description
  • dnb_unspsc_codes_0_code
  • dnb_unspsc_codes_0_description
  • dnb_corporate_linkage_familytree_roles_played_0_description
  • dnb_corporate_linkage_head_quarter_duns
  • dnb_corporate_linkage_head_quarter_primary_name
  • dnb_corporate_linkage_head_quarter_primary_address_address_country_iso_alpha2_code
  • dnb_corporate_linkage_parent_duns
  • dnb_corporate_linkage_parent_primary_name
  • dnb_corporate_linkage_parent_primary_address_address_country_iso_alpha2_code
  • dnb_corporate_linkage_domestic_ultimate_duns
  • dnb_corporate_linkage_domestic_ultimate_primary_name
  • dnb_corporate_linkage_domestic_ultimate_primary_address_address_country_iso_alpha2_code
  • dnb_corporate_linkage_global_ultimate_duns
  • dnb_corporate_linkage_global_ultimate_primary_name
  • dnb_corporate_linkage_global_ultimate_primary_address_address_country_iso_alpha2_code
  • dnb_corporate_linkage_global_ultimate_family_tree_members_count

Consolidate Record Fields

The transformation script in this step determines the output fields that are included in the final mastered entity records for this data product.

For records that match to a D-U-N-S number, this transformation sets the values for specific fields to the values supplied by the D&B Enrich step. For these records, the following field values are replaced with values supplied by D&B Enrich:

Unified Dataset Field | Replaced with D&B Field Value

You do not need to modify this step unless:

  • You added output fields in the Align to Customer Model step and want those fields to be included in the final output for the data product.
  • You want to add additional fields returned by the D&B Enrich step.

For each new output field that you want to include in the output, add an entry for the field name in the comma-separated field list before this line:

dnb_match_candidates_match_quality_information_confidence_code as confidence_code;

Configure Attributes

This step allows you to configure how data appears in Tamr Cloud and in published datasets. If you added new attributes in the Align to Customer Model step, add and map those attributes in this step to include them in your final mastered entity output. See Configuring Data Display and Mastered Entity Attributes for B2B Customers with D&B.