Supplier Mastering with D&B

You use the Supplier Mastering with D&B template to master site-level supplier data.

The Supplier Mastering with D&B template uses an industry-standard schema and trained machine learning model. During the mastering flow, Tamr Cloud identifies source records that refer to the same entity. Each entity is a mastered record that represents a group of related source records across your source datasets.

For each field in the entity, Tamr Cloud identifies the most appropriate value from these source records, ensuring that the entity has the most accurate and up-to-date information.

This template also enriches your data with the following:

  • D-U-N-S number and additional information provided by Dun & Bradstreet (D&B).
  • Data from the following D&B Data Blocks:
    • Company Information L2
    • Hierarchies and Connections L1
    • Diversity Insights L1
    • Third Party Risk Insights L1
  • Country code and company name enrichment data.

Input Dataset Requirements

As part of the mastering flow, Tamr Cloud aligns your source datasets to a unified schema with predefined output fields. Tamr Cloud uses these predefined output fields to enrich your data and consolidate similar records into entities.

In addition to the general Requirements for Input Datasets, certain data is required for Supplier Mastering with D&B.

You must map one or more source fields to each of the predefined output fields:

  • uniqueKey
    This is the primary key used in the source dataset to uniquely identify each record. See About Primary Keys for more information.
  • company_name
  • address_line_1
  • address_line_2
  • city
  • region
  • postal_code
  • country
  • phone
  • registration_number
  • registration_number_type
  • url
  • email
  • source_duns_number
    If you trust that the D-U-N-S number available in your source data is correct, map the field with that number to the source_duns_number output field. Otherwise, map an empty placeholder field to source_duns_number. Records that have the source_duns_number populated will not be rematched to a D-U-N-S number in the D-U-N-S Match step; instead, the source_duns_number will be used.
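The source_duns_number rule above can be pictured as a simple per-record check. The following is a minimal sketch, not Tamr Cloud's implementation: the function name needs_duns_match is hypothetical, and treating whitespace-only values as empty is an assumption.

```python
def needs_duns_match(record: dict) -> bool:
    """Return True when a record should go through D-U-N-S matching.

    Assumption: a missing, empty, or whitespace-only source_duns_number
    counts as "not populated". Populated values are trusted and reused.
    """
    value = record.get("source_duns_number")
    return value is None or str(value).strip() == ""


# A record with a trusted D-U-N-S number skips rematching.
trusted = {"company_name": "Acme Inc", "source_duns_number": "123456789"}
# A record mapped to an empty placeholder field is sent to matching.
unknown = {"company_name": "Acme Inc", "source_duns_number": ""}
```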

Clustering Model

For Supplier Mastering with D&B, the clustering model identifies records that represent the same entity by first considering the D-U-N-S number. If two records have been enriched with the same D-U-N-S number, they are clustered together, provided that the D-U-N-S match meets the Confidence Code threshold or matches one of the Match Grade Patterns specified in the Set Confidence Code Thresholds and Match Grade Patterns step.

Then, the clustering model considers the similarity of values for the following fields, and uses decision-tree logic to accurately identify records that refer to the same entity:

  • Company name variations
  • All address fields
  • Phone number
  • Website

For low-confidence (unaccepted) D-U-N-S matches, the above fields only contain information from the source fields. For high-confidence (accepted) D-U-N-S matches, the above fields contain information from both the source fields and the returned fields from the D-U-N-S Match step.

Additionally, the model will not cluster together records with different high-confidence D-U-N-S numbers, except in rare edge cases.

Clustering Examples

The clustering model used by the Supplier Mastering with D&B template is the same model used by the Company Mastering with D&B template. See Company Mastering with D&B for examples of source records that the model clusters and does not cluster.

Modifying the Mastering Flow

Designer Flow Step Overview

When you create a data product using the Supplier Mastering with D&B template, Tamr Cloud creates a mastering flow in Designer with steps specific to supplier mastering:

Add Data

Ensure your source data meets both the general requirements for input datasets and the specific requirements for this template.

Then, see Adding a Dataset.

Align to Customer Model

See Mapping Input Fields to a Unified Schema.

Create tamr_record_id

This transformation step ensures that each source record has a unique primary key across all source datasets by adding a new primary key field: tamr_record_id.

For data products created before June 1, 2023, this step produces a tamr_record_id for each source record by concatenating the source dataset name and the source primary key, separated by an underscore.

For data products created on June 1, 2023 or later, this step produces a tamr_record_id for each source record by creating a 128-bit hash value of the source dataset name and the source primary key. See the Tamr Core documentation for a description of the function used to generate the hash value.

Important: If records within the same source dataset have duplicate primary key values, the tamr_record_id values for those records will also be duplicated.
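The two ID schemes can be sketched as follows. This is an illustration under stated assumptions, not the function Tamr Cloud uses: the function names are hypothetical, BLAKE2b with a 16-byte digest is only a stand-in that happens to produce 128 bits, and the way the dataset name and key are combined before hashing is assumed. See the Tamr Core documentation for the actual hash function.

```python
import hashlib


def legacy_record_id(dataset_name: str, primary_key: str) -> str:
    """Scheme described for data products created before June 1, 2023:
    dataset name and primary key joined by an underscore."""
    return f"{dataset_name}_{primary_key}"


def hashed_record_id(dataset_name: str, primary_key: str) -> str:
    """Illustrative 128-bit hash of dataset name plus primary key.

    Assumption: BLAKE2b (16-byte digest) and the "\x1f" separator are
    placeholders chosen for this sketch only.
    """
    payload = f"{dataset_name}\x1f{primary_key}".encode("utf-8")
    return hashlib.blake2b(payload, digest_size=16).hexdigest()


# The same key in two datasets yields two distinct IDs.
erp_id = legacy_record_id("erp_suppliers", "42")
crm_id = legacy_record_id("crm_vendors", "42")

# Duplicate keys within one dataset collide under either scheme.
dup_a = hashed_record_id("erp_suppliers", "42")
dup_b = hashed_record_id("erp_suppliers", "42")
```

Either scheme is deterministic, which is why duplicate primary keys within one source dataset always produce duplicate tamr_record_id values.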

You do not need to modify this step.

Enrich Country Code

This step enriches records with standardized ISO 3166-1 alpha-2 two-character country code values (enriched_country_code_2_character). You do not need to make changes to this step.

The enriched_country_code_2_character field is used as the country code input in Company Name Enrichment and the D-U-N-S Match steps later in the mastering flow. When this field is available, it is also used in place of the input country field in clustering and in the mastered output.
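The fallback between the enriched and input country values amounts to a coalesce. A minimal sketch, assuming records are plain dictionaries and that an empty string counts as "not available" (the function name effective_country is hypothetical):

```python
from typing import Optional


def effective_country(record: dict) -> Optional[str]:
    """Prefer the enriched two-character code; otherwise use the input country."""
    enriched = record.get("enriched_country_code_2_character")
    if enriched:  # assumption: None and "" both mean "not available"
        return enriched
    return record.get("country")


enriched_record = {"country": "Germany", "enriched_country_code_2_character": "DE"}
unenriched_record = {"country": "Germany", "enriched_country_code_2_character": None}
```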

See Country Code Enrichment for information about this enricher, including the output fields it adds to your data.

Enrich Company Name

This step cleans and enriches company name data. You do not need to make changes to this step.

See Company Name Enrichment for information about this enricher, including the output fields it adds to your data.

D-U-N-S Match

D-U-N-S Match enrichment provides the D-U-N-S number and additional information from Dun & Bradstreet (D&B) for each record for which a D-U-N-S match is identified.

This step supports the D&B Asian Language Matching feature, set to auto-detect the language for name and address lookup. See the D&B description of Asian Language Matching for more information.

All input fields for this enrichment step are optional. If you need to remove an input field from this step, remove the mapping for that field in the D-U-N-S Match step and save the change. This field will no longer be included in the input match parameters when you run the flow. You do not need to modify this step unless you want to remove an input field from D-U-N-S Match enrichment.

The clustering model uses the output fields listed below only for high-confidence, accepted matches. If a record does not meet the match acceptance criteria, the clustering model does not use these fields for that record. You specify the match acceptance criteria in the next step: Set Confidence Code Thresholds and Match Grade Patterns.

  • dnb_match_candidates_match_quality_information_confidence_code
  • dnb_match_candidates_organization_duns
  • dnb_match_candidates_organization_primary_name
  • dnb_match_candidates_organization_telephone_telephone_number_0
  • dnb_match_candidates_organization_website_address_domain_name_0
  • dnb_match_candidates_organization_primary_address_street_address_line1
  • dnb_match_candidates_organization_primary_address_street_address_line2
  • dnb_match_candidates_organization_primary_address_address_region_name
  • dnb_match_candidates_organization_primary_address_postal_code
  • dnb_match_candidates_organization_primary_address_address_country_code_iso_alpha2_code
  • dnb_match_candidates_organization_primary_address_locality_name
  • dnb_match_candidates_organization_mailing_address__street_address_line1
  • dnb_match_candidates_organization_mailing_address__street_address_line2
  • dnb_match_candidates_organization_mailing_address_address_region_name
  • dnb_match_candidates_organization_mailing_address__postal_code
  • dnb_match_candidates_organization_mailing_address_address_country_iso_alpha2_code
  • dnb_match_candidates_organization_mailing_address_address_locality_name
  • dnb_match_candidates_organization_corporate_linkage_familytree_roles_played_description_0

See D-U-N-S Match for information about this enricher, including the complete list and descriptions of the output fields it adds to your data.

Set Confidence Code Thresholds and Match Grade Patterns

In the Set Confidence Code Thresholds and Match Grade Patterns transformation step, you can adjust the criteria used to determine the D-U-N-S match quality and whether the match is accepted. These criteria include the Confidence Code threshold and, optionally, Match Grade Patterns.

By default, the D&B Match Grade Patterns are disabled in the transformation logic. If you enable Match Grade Patterns:

  • If the Confidence Code is not met, Tamr Cloud uses the Match Grade Pattern to determine whether to apply the D-U-N-S match result.
  • If neither the Confidence Code nor a Match Grade Pattern is met, Tamr Cloud does not accept or apply the D-U-N-S match results.
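The acceptance rules above can be expressed as a small predicate. This is a hedged sketch rather than the transformation logic shipped in the step: the function name, the use of "greater than or equal to" for meeting the threshold, exact string comparison for patterns, and the sample threshold and Match Grade strings are all assumptions made for illustration.

```python
from typing import Collection, Optional


def is_match_accepted(
    confidence_code: Optional[int],
    match_grade: Optional[str],
    threshold: int,
    patterns: Collection[str] = (),
    patterns_enabled: bool = False,
) -> bool:
    """Decide whether a D-U-N-S match result is accepted."""
    # Criterion 1: the Confidence Code meets the threshold.
    if confidence_code is not None and confidence_code >= threshold:
        return True
    # Criterion 2 (only when enabled): the Match Grade equals an allowed pattern.
    if patterns_enabled and match_grade is not None:
        return match_grade in patterns
    return False


# Hypothetical values, for illustration only.
allowed = {"AAZZAZZ"}
by_confidence = is_match_accepted(9, None, threshold=8)
patterns_off = is_match_accepted(6, "AAZZAZZ", threshold=8, patterns=allowed)
by_pattern = is_match_accepted(6, "AAZZAZZ", 8, allowed, patterns_enabled=True)
rejected = is_match_accepted(6, "BBZZAZZ", 8, allowed, patterns_enabled=True)
```

With patterns disabled (the default), only the Confidence Code decides the outcome.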

Follow the instructions in the step to change the Confidence Code, enable Match Grade Patterns, and add or remove Match Grade Patterns.

See the D&B Entity Matching Guide for detailed information about setting these criteria.

Prepare for Clustering

This step transforms the data in the unified dataset to create the fields used by the trained clustering model to identify similar and matching records. The fields created as input to the model are prefixed with ml_. Many of these ml_ fields are created as arrays of unified source fields and fields added by the enrichment services. The model identifies the most similar values across the arrays and assigns weights based on these similarities.
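As a rough picture of how one ml_ array might be assembled: the Clustering Model section states that accepted D-U-N-S matches contribute D&B values alongside source values, while unaccepted matches contribute source values only. The sketch below assumes that behavior; the function name, the removal of empty values, and the order-preserving de-duplication are assumptions, not documented details of the step.

```python
from typing import List, Optional


def build_ml_values(
    source_values: List[Optional[str]],
    dnb_values: List[Optional[str]],
    match_accepted: bool,
) -> List[str]:
    """Combine candidate values for one ml_ array field."""
    candidates = list(source_values)
    if match_accepted:
        # D&B values are included only for accepted matches.
        candidates += dnb_values
    result: List[str] = []
    for value in candidates:
        # Assumption: skip empty values and keep the first occurrence only.
        if value and value not in result:
            result.append(value)
    return result


accepted_names = build_ml_values(["Acme Inc"], ["ACME INCORPORATED"], True)
unaccepted_names = build_ml_values(["Acme Inc"], ["ACME INCORPORATED"], False)
```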

You do not need to modify this step.

Apply Clustering Model

This read-only step groups records that refer to the same entity into a cluster, using the trained model.

Note: You can publish the output of this step by publishing the "Source Records by Entity" dataset. See Available Published Datasets.

Consolidate Records

This step applies rules to produce a single record, called the mastered entity record, that best represents a cluster. For most fields, these rules select the most common value from the clustered records.
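The "most common value" rule can be sketched with a frequency count. This is an illustration only: the function name is hypothetical, ignoring empty values is an assumption, and the actual consolidation rules for tie-breaking are not described here.

```python
from collections import Counter
from typing import Iterable, Optional


def most_common_value(values: Iterable[Optional[str]]) -> Optional[str]:
    """Return the most frequent non-empty value, or None if there is none."""
    counts = Counter(v for v in values if v)  # assumption: skip None and ""
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# City values from three clustered source records plus one missing value.
city = most_common_value(["Boston", "Boston", "Bostn", None])
```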

Additionally, this step adds a Tamr ID (tamr_id) to each entity. The Tamr ID is a unique, persistent ID.

If you added a new field in the Align to Customer Data Model step, add a line in the transformations to tell Tamr Cloud what value to set for that field when creating the mastered entity record. See Modifying Record Consolidation Transformations.

D&B Enrich

This step returns enriched data for records for which a D-U-N-S number was identified in the D-U-N-S Match step. D&B Enrich enrichment provides data from the following D&B Data Blocks:

  • Company Information L2
  • Hierarchies and Connections L1
  • Diversity Insights L1
  • Third Party Risk Insights L1

This step is pre-configured based on the D-U-N-S Match step; you do not need to modify this step.

See D&B Enrich for information about this enricher, including the complete list and descriptions of the output fields it adds to your data.

The final mastering flow output includes a subset of the D&B Enrich output fields. You can modify the Consolidate Record Fields step (next in the flow) to include additional fields. The fields included by default are:

  • dnb_duns_control_status_operating_status_description
  • dnb_industry_codes_0_code
  • dnb_industry_codes_0_description
  • dnb_primary_industry_code_us_sic_v4
  • dnb_primary_industry_code_us_sic_v4_description
  • dnb_unspsc_codes_0_code
  • dnb_unspsc_codes_0_description
  • dnb_corporate_linkage_familytree_roles_played_0_description
  • dnb_corporate_linkage_head_quarter_duns
  • dnb_corporate_linkage_head_quarter_primary_name
  • dnb_corporate_linkage_head_quarter_primary_address_address_country_iso_alpha2_code
  • dnb_corporate_linkage_parent_duns
  • dnb_corporate_linkage_parent_primary_name
  • dnb_corporate_linkage_parent_primary_address_address_country_iso_alpha2_code
  • dnb_corporate_linkage_domestic_ultimate_duns
  • dnb_corporate_linkage_domestic_ultimate_primary_name
  • dnb_corporate_linkage_domestic_ultimate_primary_address_address_country_iso_alpha2_code
  • dnb_corporate_linkage_global_ultimate_duns
  • dnb_corporate_linkage_global_ultimate_primary_name
  • dnb_corporate_linkage_global_ultimate_primary_address_address_country_iso_alpha2_code
  • dnb_corporate_linkage_global_ultimate_family_tree_members_count
  • dnb_dnb_assessment_supplier_evaluation_risk_score_raw_score
  • dnb_dnb_assessment_supplier_stability_index_score_class_score
  • dnb_socio_economic_information_is_minority_owned
  • dnb_socio_economic_information_is_veteran_owned
  • dnb_socio_economic_information_is_woman_owned
  • dnb_socio_economic_information_ownership_primary_ethnicity_type_description
  • dnb_is_small_business
  • dnb_organization_size_category_description

Consolidate Record Fields

The transformation script in this step determines the output fields that are included in the final entities for this data product.

This transformation also sets the values for specific fields to the values supplied by the D&B Enrich step, for records that match to a D-U-N-S number. For these records, the following field values are replaced with values supplied by D&B Enrich:

Unified Dataset Field | Replaced with D&B Field Value
company_name | dnb_primary_name
address_line_1 | dnb_primary_address_street_address_line_1
address_line_2 | dnb_primary_address_street_address_line_2
city | dnb_primary_address_address_locality_name
region | dnb_primary_address_region_name
postal_code | dnb_primary_address_postal_code
country_code (enriched country code value when available, otherwise input country value) | dnb_primary_address_address_country_iso_alpha2_code
phone | dnb_telephone_0_telephone_number
email | dnb_email_0_address
url | dnb_website_address_0_url
registration_number | dnb_registration_numbers_0_registration_number
registration_number_type | dnb_registration_numbers_0_type_description
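The replacement table can be read as a field-to-field mapping applied to entities that matched a D-U-N-S number. The sketch below is not the transformation script in this step. It assumes entities are dictionaries, uses duns_number as the indicator that a match exists, and keeps the unified value when the D&B value is missing; the text does not state how missing D&B values are handled, so that fallback is an assumption.

```python
# Mapping copied from the table above: unified field -> D&B Enrich field.
DNB_OVERRIDES = {
    "company_name": "dnb_primary_name",
    "address_line_1": "dnb_primary_address_street_address_line_1",
    "address_line_2": "dnb_primary_address_street_address_line_2",
    "city": "dnb_primary_address_address_locality_name",
    "region": "dnb_primary_address_region_name",
    "postal_code": "dnb_primary_address_postal_code",
    "country_code": "dnb_primary_address_address_country_iso_alpha2_code",
    "phone": "dnb_telephone_0_telephone_number",
    "email": "dnb_email_0_address",
    "url": "dnb_website_address_0_url",
    "registration_number": "dnb_registration_numbers_0_registration_number",
    "registration_number_type": "dnb_registration_numbers_0_type_description",
}


def apply_dnb_overrides(entity: dict) -> dict:
    """Return a copy of the entity with D&B values applied where matched."""
    result = dict(entity)
    if not entity.get("duns_number"):
        return result  # no D-U-N-S match: leave unified values untouched
    for unified_field, dnb_field in DNB_OVERRIDES.items():
        dnb_value = entity.get(dnb_field)
        if dnb_value:  # assumption: fall back to the unified value if missing
            result[unified_field] = dnb_value
    return result


matched = apply_dnb_overrides({
    "duns_number": "123456789",
    "company_name": "Acme",
    "dnb_primary_name": "ACME CORP",
    "email": "ap@acme.example",
})
unmatched = apply_dnb_overrides({
    "company_name": "Acme",
    "dnb_primary_name": "ACME CORP",
})
```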

You do not need to modify this step unless:

  • You added output fields in the Align to Customer Data Model step and want those fields to be included in the final output for the data product.
  • You want to add additional fields returned by the D&B Enrich step.

For each new output field that you want to include in the output, follow the instructions in the step to add an entry for the field name in the comma-separated field list before this line:

dnb_match_candidates_match_quality_information_confidence_code as confidence_code;

Deliver Data to Studio

This step allows you to configure how data appears in Studio, Curator, and published datasets. See Configuring Data Display in Studio.

The Deliver Data to Studio step is configured as follows by default:

Organization Identifiers

Unified Field | Mapped Display Name
tamr_id | Tamr ID
company_name | Company Name
address_line_1 | Address Line 1
address_line_2 | Address Line 2
city | City
region | Region
postal_code | Postal Code
phone | Telephone
country_code | Country Code
email | Email Address
url | URL
duns_number | DUNS number
duns_control_operating_status_description | Operating Status
registration_number | Registration Number
registration_number_type | Registration Number Description
entity_id | Entity ID

Note: entity_id has the same value as tamr_id.

D&B Enriched Industry Classification

Unified Field | Mapped Display Name
dnb_industry_codes_0_code | Industry Code
dnb_industry_codes_0_description | Industry Code Description
dnb_primary_industry_code_us_sic_v4 | SIC Code
dnb_primary_industry_code_us_sic_v4_description | SIC Code Description
dnb_unspsc_codes_0_code | UNSPSC Code
dnb_unspsc_codes_0_description | UNSPSC Code Description

D&B Enriched Corporate Linkage

Unified Field | Mapped Display Name
dnb_corporate_linkage_familytree_roles_played_0_description | Family Tree Role
dnb_corporate_linkage_head_quarter_duns | Headquarter DUNS Number
dnb_corporate_linkage_head_quarter_primary_name | Headquarter Business Name
dnb_corporate_linkage_head_quarter_primary_address_address_country_iso_alpha2_code | Headquarter Country Code
dnb_corporate_linkage_parent_duns | Parent DUNS Number
dnb_corporate_linkage_parent_primary_name | Parent Business Name
dnb_corporate_linkage_parent_primary_address_address_country_iso_alpha2_code | Parent Country Code
dnb_corporate_linkage_domestic_ultimate_duns | Domestic Ultimate DUNS Number
dnb_corporate_linkage_domestic_ultimate_primary_name | Domestic Ultimate Business Name
dnb_corporate_linkage_domestic_ultimate_primary_address_address_country_iso_alpha2_code | Domestic Ultimate Country Code
dnb_corporate_linkage_global_ultimate_duns | Global Ultimate DUNS Number
dnb_corporate_linkage_global_ultimate_primary_name | Global Ultimate Business Name
dnb_corporate_linkage_global_ultimate_primary_address_address_country_iso_alpha2_code | Global Ultimate Country Code
dnb_corporate_linkage_global_ultimate_family_tree_members_count | Corporate Linkage Global Ultimate Family Tree Member Count

D&B Enriched Supplier Risk

Unified Field | Mapped Display Name
dnb_dnb_assessment_supplier_evaluation_risk_score_raw_score | Supplier Evaluation Risk Score
dnb_dnb_assessment_supplier_stability_index_score_class_score | Supplier Stability Indicator Class Score

D&B Enriched Diversity Insights

Unified Field | Mapped Display Name
dnb_socio_economic_information_is_minority_owned | Minority Owned Business
dnb_socio_economic_information_is_veteran_owned | Veteran Owned Business
dnb_socio_economic_information_is_woman_owned | Woman Owned Business
dnb_socio_economic_information_ownership_primary_ethnicity_type_description | Ownership Primary Ethnicity
dnb_is_small_business | Small Business Indicator
dnb_organization_size_category_description | Organization Size

D&B Match Quality

Unified Field | Mapped Display Name
confidence_code | Confidence Code