Mastered Entity Attributes for B2B Customers
By default, the B2B customers data product includes organization identifier attributes and provider-specific enrichment attributes.
By default, data products created with the B2B customers data product include mastered entity attributes and provider-specific enrichment attributes. You can also add custom attributes.
You can configure which attributes are included in the data product's entities table, and their order, from the entities page. See Configuring Entity Attribute Display.
When configuring publish destinations and datasets, you can also select which attributes are included in the published output, as well as attribute names and order. See Publishing Data Products.
Customer (Golden Record) Attributes
The customers (golden records) dataset includes the following types of attributes:
- Primary attributes, which are the unified schema attributes to which source columns are mapped.
- Curation attributes, which are calculated values for attributes that can help with data curation, such as the number of similar clusters and cluster size.
- Uniformity attributes, which are the calculated uniformity scores for clusters and selected attributes.
- Enrich attributes, which are the firmographic attributes added by selected external data providers.
Primary Customer (Golden Records) Dataset Attributes
The following attributes are provided by default in the unified schema:
Attribute Name |
---|
tamr_id |
company_name |
alternative_names |
previous_names |
revenue |
company_type |
industry_type |
founding_year |
address_type |
address |
address_line_2 |
city |
state |
postal_code |
country |
latitude |
longitude |
phone |
registration_number |
tax_ids |
ticker_symbol |
stock_exchange |
email_domain |
website |
associated_persons |
Curation Attributes
Curation attributes are calculated metrics that can aid in data review and curation. These include:
Attribute | Description |
---|---|
number_of_source_records | For each golden record, the number of source records in the cluster. |
number_of_source_datasets | For each golden record, the number of source datasets from which source records in a cluster originated. |
number_of_similar_entities | For each golden record, the number of similar golden records (entities). |
maximum_similarity | For each golden record, the maximum similarity score for similar golden records. |
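As a rough illustration of how these metrics might support review, the sketch below flags golden records whose most similar neighbor scores at or above a threshold. It assumes the published customers dataset has been loaded as a list of dictionaries keyed by the attribute names above. The sample rows, the `needs_review` helper, the 0.9 threshold, and the use of `None` when a record has no similar entities are all hypothetical.

```python
# Hypothetical sample rows; only the attribute names come from the table above.
golden_records = [
    {"tamr_id": "a1", "number_of_source_records": 5,
     "number_of_similar_entities": 2, "maximum_similarity": 0.95},
    {"tamr_id": "b2", "number_of_source_records": 1,
     "number_of_similar_entities": 0, "maximum_similarity": None},
    {"tamr_id": "c3", "number_of_source_records": 3,
     "number_of_similar_entities": 1, "maximum_similarity": 0.62},
]


def needs_review(record, threshold=0.9):
    """Return True when a record has a similar entity scoring at or above the threshold."""
    score = record.get("maximum_similarity")
    has_similar = record.get("number_of_similar_entities", 0) > 0
    return has_similar and score is not None and score >= threshold


# Collect the identifiers of records that a curator may want to inspect first.
review_queue = [r["tamr_id"] for r in golden_records if needs_review(r)]
print(review_queue)
```

Lowering the threshold widens the review queue; the right value depends on your data and is not prescribed here.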
Uniformity Attributes
Uniformity attributes are calculated values that measure the similarity (uniformity) of records within a cluster, and of the values of selected attributes within a cluster. By default, an overall cluster_uniformity_score is calculated for each golden record. On the Configure Data Product page, you can also choose to calculate a uniformity score for specific attributes.
Uniformity scores range from 0 to 1. For example, a uniformity score of 1 for an attribute means that all records in the cluster have the same value for that attribute, while a uniformity score of 0 indicates that all records in the cluster have different values for that attribute.
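To make the two endpoints concrete, the following sketch computes one simple measure with the same boundary behavior. This is an illustrative assumption only: this page does not specify the formula the product uses, and `toy_uniformity` is a hypothetical helper, not a product function.

```python
def toy_uniformity(values):
    """Illustrative only: 1.0 when all values match, 0.0 when all values differ."""
    n = len(values)
    if n <= 1:
        # Assumption: a cluster with a single record is treated as fully uniform.
        return 1.0
    distinct = len(set(values))
    # Scale the number of extra distinct values into the range 0 to 1.
    return 1.0 - (distinct - 1) / (n - 1)


print(toy_uniformity(["Boston", "Boston", "Boston"]))  # all values the same
print(toy_uniformity(["Boston", "Austin", "Denver"]))  # all values different
print(toy_uniformity(["Boston", "Boston", "Austin"]))  # partially uniform
```

Values between the endpoints depend entirely on the chosen formula, so treat intermediate scores from this sketch as illustrative rather than comparable to the scores in a data product.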
Enrichment Attributes
The Customers dataset contains attributes added by the selected data quality services and firmographic enrichment providers, as well as Tamr Enrich ID and Firmographic Match Status. See Tamr Enrich ID.
See Phone Number and Address Standardization, Validation, and Geocoding for the attributes added by these data quality services.
See Tamr Firmographic Enrichment for the attributes added by each provider.
If you are using CMS enrichment, the enrichment attributes are available only in the CMS enrichment datasets, and are not included in the Customers dataset. You can join the CMS datasets with the Customers dataset after publishing, using the tamr_id. See CMS Enrichment for the attributes provided for each healthcare organization type.
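A minimal sketch of such a post-publish join on tamr_id follows. It assumes both datasets have been exported and loaded as lists of dictionaries, and that there is at most one CMS row per tamr_id. The `cms_attribute` column and all sample values are hypothetical placeholders, not actual CMS enrichment attributes.

```python
# Hypothetical rows from the published Customers dataset.
customers = [
    {"tamr_id": "a1", "company_name": "Example Health"},
    {"tamr_id": "b2", "company_name": "Sample Clinic"},
]

# Hypothetical rows from a published CMS enrichment dataset.
cms_rows = [
    {"tamr_id": "a1", "cms_attribute": "placeholder value"},
]

# Index CMS rows by tamr_id (assumes at most one CMS row per tamr_id).
cms_by_id = {row["tamr_id"]: row for row in cms_rows}

# Left join: keep every customer and add CMS attributes where a match exists.
joined = [
    {**customer, **cms_by_id.get(customer["tamr_id"], {})}
    for customer in customers
]

for row in joined:
    print(row)
```

A left join is used here so that customers without a CMS match are kept; an inner join would instead drop them.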
Source Records Dataset Attributes
In addition to the attributes in the unified schema for your source datasets, this dataset includes clustering metadata attributes to help you understand how each record was clustered. See Persistent Identifier Attributes in Data Products for more information about the clustering metadata attributes.
Attribute Name |
---|
tamr_id |
company_name |
alternative_names |
previous_names |
revenue |
company_type |
industry_type |
founding_year |
address_type |
address |
address_line_2 |
city |
state |
country |
postal_code |
latitude |
longitude |
phone |
registration_number |
tax_ids |
ticker_symbol |
stock_exchange |
email_domain |
website |
associated_persons |
clustering_metadata.ml_cluster_id |
clustering_metadata.rule_cluster_id |
clustering_metadata.verified_cluster_id |
clustering_metadata.applied_clustering_rules |
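To see how source records roll up into entities, the sketch below counts source records per tamr_id. It assumes that each source record's tamr_id identifies the golden record for its cluster, and that the dataset has been loaded as a list of dictionaries. The sample values are hypothetical.

```python
from collections import Counter

# Hypothetical rows from the published source records dataset.
source_records = [
    {"tamr_id": "a1", "company_name": "Example Co"},
    {"tamr_id": "a1", "company_name": "Example Company"},
    {"tamr_id": "b2", "company_name": "Sample LLC"},
]

# Count how many source records share each tamr_id.
records_per_entity = Counter(r["tamr_id"] for r in source_records)

for tamr_id, count in records_per_entity.items():
    print(tamr_id, count)
```

Under the stated assumption, these counts could be compared with number_of_source_records in the customers dataset as a sanity check after publishing.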
Enhanced Source Records Dataset Attributes
In this dataset, source record values have been standardized and enhanced using the data quality services provided in the data product. Additionally, this dataset includes attributes added by the Phone Number and Address Standardization, Validation, and Geocoding data quality services and by the Tamr Enrich ID enrichment service.
Attribute Name |
---|
tamr_id |
company_name |
alternative_names |
previous_names |
revenue |
company_type |
industry_type |
founding_year |
address_type |
address |
address_line_2 |
city |
state |
country |
postal_code |
latitude |
longitude |
phone |
registration_number |
tax_ids |
ticker_symbol |
stock_exchange |
email_domain |
website |
associated_persons |
enriched_phone.primary.national_format |
enriched_phone.primary.carrier |
enriched_phone.primary.success |
enriched_phone.primary.valid |
enriched_phone.primary.country_code |
enriched_phone.primary.international_format |
enriched_phone.primary.region |
enriched_phone.primary.type |
enriched_address.primary.premise |
enriched_address.primary.thoroughfare |
enriched_address.primary.city |
enriched_address.primary.region |
enriched_address.primary.postal_code_primary |
enriched_address.primary.postal_code_secondary |
enriched_address.primary.country_name |
enriched_address.primary.country_code_2_character |
enriched_address.primary.full_address |
enriched_address.primary.latitude |
enriched_address.primary.longitude |
enriched_address.primary.success |
enriched_address.primary.google_place_id |
enriched_address.primary.location_type |
enriched_address.primary.match_type |
enriched_address.primary.data_provider |
firmographic_match.tamr_enrich_id |
firmographic_match.tamr_firmographic_match_status |
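As an example of working with these attributes, the sketch below keeps only records whose primary phone number was validated. It assumes the published output exposes the dotted names above as flat column names and that `enriched_phone.primary.valid` is a boolean; depending on the publish destination, the values may instead be nested or encoded as strings. The sample rows are hypothetical.

```python
# Hypothetical rows from the published enhanced source records dataset.
enhanced_records = [
    {"tamr_id": "a1",
     "enriched_phone.primary.valid": True,
     "enriched_phone.primary.international_format": "<formatted number 1>"},
    {"tamr_id": "b2",
     "enriched_phone.primary.valid": False,
     "enriched_phone.primary.international_format": None},
]

VALID = "enriched_phone.primary.valid"
FORMATTED = "enriched_phone.primary.international_format"

# Map tamr_id to the standardized phone number for validated records only.
valid_phones = {
    row["tamr_id"]: row[FORMATTED]
    for row in enhanced_records
    if row.get(VALID) is True
}
print(valid_phones)
```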