Source Dataset Requirements for B2B Site Mastering

Align your source data with the industry-standard schema for company data that this template supplies.

The B2B site mastering template includes a predefined, standardized schema for company data. The mastering flow for data products produced by this template includes a schema mapping step in which you identify how columns in your source datasets correspond to the attributes in the supplied schema.

To prepare, review the general Requirements for Source Datasets. Then, identify the column or columns in each of your source datasets that you will map to the B2B site schema:

Unified Attribute Description
Address_Line_1 Line 1 of the company’s address.
Address_Line_2 Line 2 of the company's address.
Alternative_Names Alternate names for the company.
Associated_Persons Officers, directors, and other people associated with the company.
City City of the company’s address.
Company_Name Company's name.
Company_Registration_Number Legal entity ID registered with the local government.
Company_Type Legal status of the company (for example, LLC or LTD).
Country Country of the company’s address.
Founding_Year Year of company formation.
Phone Company’s phone number.
Postal_Code Postal (ZIP) code of the company’s address.
Previous_Names Previous legal names for the company.
primaryKey The primary key used in the source dataset to uniquely identify each record. See About Primary Keys for more information.
Region Region/state of the company’s address.
Stock_Exchange Stock exchange where the company is listed (if public).
Tax_IDs Tax identification numbers, such as EINs.
Ticker_Symbol Company’s ticker symbol on the stock exchange.
trusted_id A non-unique key, such as a customer identification number used by your internal systems. The clustering model always clusters together records that have the same trusted_id. If the values in your source column do not represent a definite match, map an empty placeholder field to trusted_id instead, and then add the following transformation to the Create tamr_record_id step of the mastering flow: SELECT *, '' as trusted_id;
Type_Of_Address Whether the company address represents the headquarters or a branch location.
URL Primary website domain for the company.
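
Before you map a column to trusted_id, it can help to check whether its values really identify definite matches. The following Python sketch is one possible pre-check and is not part of the template: it groups hypothetical source records by a candidate key and reports the keys whose records carry different company names. The column names customer_id and company_name and the helper conflicting_keys are assumptions made for illustration.

```python
from collections import defaultdict


def conflicting_keys(records, key_column, name_column):
    """Return candidate key values whose records disagree on the company name.

    A heuristic only: differing names do not prove a bad key (a company can
    be listed under alternate names), but such keys are worth reviewing
    before the column is mapped to trusted_id. Values are assumed to be
    strings or None.
    """
    names_by_key = defaultdict(set)
    for record in records:
        key = (record.get(key_column) or "").strip()
        if not key:
            continue  # skip records with no candidate key value
        # Compare names case-insensitively and ignore surrounding whitespace.
        name = (record.get(name_column) or "").strip().lower()
        names_by_key[key].add(name)
    return {
        key: sorted(names)
        for key, names in names_by_key.items()
        if len(names) > 1
    }


# Hypothetical source rows; customer_id is the candidate for trusted_id.
rows = [
    {"customer_id": "C-100", "company_name": "Acme Corp"},
    {"customer_id": "C-100", "company_name": "ACME CORP"},
    {"customer_id": "C-200", "company_name": "Globex"},
    {"customer_id": "C-200", "company_name": "Initech"},
    {"customer_id": "", "company_name": "Umbrella"},
]

print(conflicting_keys(rows, "customer_id", "company_name"))
```

In this sample, only C-200 is reported, because it covers two differently named companies. A column that produces many such keys is a candidate for the empty placeholder approach described for trusted_id.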

After you map your source data fields to these attributes, Tamr Cloud can enrich your data and consolidate similar records into entities.

Tip: To include other columns in the mastered data product, add attributes to the unified schema and map those columns to them. The template does not use these additional attributes as part of the mastering process.
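
You perform the mapping itself in the schema mapping step of the mastering flow, but drafting it as data first can make gaps easier to see. The Python sketch below lists the attributes from the table in this section and sorts a hypothetical mapping into template attributes, additional attributes, and template attributes that are still unmapped. The source column names, the Industry_Code attribute, and the summarize_mapping helper are assumptions made for illustration and are not part of Tamr Cloud.

```python
# Attributes in the B2B site schema, as listed in the table in this section.
TEMPLATE_ATTRIBUTES = {
    "Address_Line_1", "Address_Line_2", "Alternative_Names",
    "Associated_Persons", "City", "Company_Name",
    "Company_Registration_Number", "Company_Type", "Country",
    "Founding_Year", "Phone", "Postal_Code", "Previous_Names",
    "primaryKey", "Region", "Stock_Exchange", "Tax_IDs",
    "Ticker_Symbol", "trusted_id", "Type_Of_Address", "URL",
}


def summarize_mapping(mapping):
    """Split a {source_column: unified_attribute} draft into three groups."""
    targets = set(mapping.values())
    return {
        # Targets that the template defines.
        "template": sorted(targets & TEMPLATE_ATTRIBUTES),
        # Targets that would have to be added to the unified schema.
        "additional": sorted(targets - TEMPLATE_ATTRIBUTES),
        # Template attributes that no source column maps to yet.
        "unmapped": sorted(TEMPLATE_ATTRIBUTES - targets),
    }


# Hypothetical source columns on the left, unified attributes on the right.
draft = {
    "id": "primaryKey",
    "name": "Company_Name",
    "street": "Address_Line_1",
    "town": "City",
    "zip": "Postal_Code",
    "country_code": "Country",
    "naics": "Industry_Code",  # additional attribute, not used for mastering
}

summary = summarize_mapping(draft)
print("Additional attributes:", summary["additional"])
print("Unmapped template attributes:", len(summary["unmapped"]))
```

An unmapped attribute is not necessarily a problem, since a source dataset may simply lack that information. The summary only makes the gap visible before you start the mapping step.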