Mastered Entity Attributes for Healthcare Providers

By default, Healthcare Providers data products include mastered entity attributes and provider-specific enrichment attributes. You can also add custom attributes.

You can configure which attributes are included in the data product's entities table, and their order, from the entities page. See Configuring Entity Attribute Display.

When configuring publish destinations and datasets, you can also select which attributes are included in the published output, as well as attribute names and order. See Publishing Data Products.

Providers (Golden Records) Dataset Attributes

The providers (golden records) dataset includes the following types of attributes:

  • Primary attributes, which are the unified schema attributes to which source columns are mapped.
  • Curation attributes, which are calculated values that can help with data curation, such as the number of similar clusters and cluster size.
  • Uniformity attributes, which are the calculated uniformity scores for clusters and selected attributes.
  • NPPES enrichment attributes, which are the provider attributes added by NPPES enrichment.

Providers (Golden Records) Attributes

The following attributes are provided by default in the unified schema:

Attribute Name
tamr_id
first_name
middle_name
family_name
full_name
name_prefix
name_suffix
gender
organization_name
address
address_line_2
address_type
city
state
postal_code
country
latitude
longitude
npi
credentials
degree
graduation_year
license_state
license_number
license_status
provider_specialty
phone
email
date_of_birth

Curation Attributes

Curation attributes are calculated metrics that can aid in data review and curation. These include:

Attribute | Description
curation.number_of_source_records | For each golden record, the number of source records in the cluster.
curation.number_of_source_datasets | For each golden record, the number of source datasets from which source records in a cluster originated.
curation.number_of_similar_entities | For each golden record, the number of similar golden records (entities).
curation.maximum_similarity | For each golden record, the maximum similarity score for similar golden records.
curation.number_of_verified_source_records | For each golden record, the number of source records verified in the cluster.
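
As an illustration of what the two count attributes represent, the following Python sketch recomputes them from a handful of source records. It is not how the product calculates these values. It assumes that each source record is a dictionary, that the tamr_id on a source record identifies the golden record (cluster) it belongs to, and that source_name identifies the source dataset. The function name and sample rows are hypothetical.

```python
from collections import defaultdict


def curation_counts(source_records):
    """Return {tamr_id: (number_of_source_records, number_of_source_datasets)}.

    Assumption: tamr_id on a source record identifies its golden record,
    and source_name identifies the dataset the record came from.
    """
    record_counts = defaultdict(int)
    datasets = defaultdict(set)
    for record in source_records:
        entity_id = record["tamr_id"]
        record_counts[entity_id] += 1
        datasets[entity_id].add(record["source_name"])
    return {
        entity_id: (record_counts[entity_id], len(datasets[entity_id]))
        for entity_id in record_counts
    }


# Hypothetical sample rows, for illustration only.
rows = [
    {"tamr_id": "e1", "source_name": "crm", "primary_key": "1"},
    {"tamr_id": "e1", "source_name": "crm", "primary_key": "2"},
    {"tamr_id": "e1", "source_name": "claims", "primary_key": "9"},
    {"tamr_id": "e2", "source_name": "claims", "primary_key": "10"},
]
counts = curation_counts(rows)
```

Here, the golden record e1 is built from three source records drawn from two datasets, and e2 from a single record in one dataset.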

Uniformity Attributes

Uniformity attributes are calculated values that measure the similarity (uniformity) of records within a cluster, and of the values of selected attributes within a cluster. By default, an overall cluster_uniformity_score is calculated for each golden record. On the Configure Data Product page, you can also choose to calculate a uniformity score for specific attributes.

Uniformity scores range from 0 to 1. For example, a uniformity score of 1 for an attribute means that all records in the cluster have the same value for this attribute, while a uniformity score of 0 indicates that all records in the cluster have different values for this attribute.
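
To make the two endpoints concrete, the following Python sketch shows one possible score that behaves the same way at 0 and 1. The formula is an assumption chosen for illustration. This page does not state how the product calculates uniformity, so treat the function (a hypothetical name) only as a way to build intuition for the range.

```python
def illustrative_uniformity(values):
    """Toy uniformity score in [0, 1] for one attribute within one cluster.

    Assumption: score = (n - distinct) / (n - 1). This is NOT the product's
    formula; it only reproduces the documented endpoints:
      all values identical -> 1.0
      all values different -> 0.0
    """
    n = len(values)
    if n <= 1:
        # A single record (or none) has nothing to disagree with.
        return 1.0
    distinct = len(set(values))
    return (n - distinct) / (n - 1)


same = illustrative_uniformity(["CA", "CA", "CA"])   # all records agree
mixed = illustrative_uniformity(["CA", "CA", "NY"])  # partial agreement
none = illustrative_uniformity(["CA", "NY", "TX"])   # no records agree
```

Under this toy formula, the partially agreeing cluster falls between the two extremes.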

Enrichment Attributes

This dataset contains attributes added by the data quality services and NPPES enrichment.

See NPPES Enrichment for the additional attributes added by this enrichment service.

See Phone Number and Address Standardization, Validation, and Geocoding for the attributes added by these data quality services.

Source Records Dataset Attributes

In addition to the attributes in the unified schema for your source datasets, this dataset includes clustering metadata fields to help you understand how each record was clustered. See Persistent Identifier Attributes in Data Products for more information about the clustering metadata attributes.

Attribute Name
tamr_id
tamr_record_id
source_name
primary_key
first_name
middle_name
family_name
full_name
name_prefix
name_suffix
gender
organization_name
address
address_line_2
address_type
city
state
postal_code
country
latitude
longitude
npi
credentials
degree
graduation_year
license_state
license_number
license_status
provider_specialty
phone
email
date_of_birth
clustering_metadata.ml_cluster_id
clustering_metadata.rule_cluster_id
clustering_metadata.verified_cluster_id
clustering_metadata.applied_clustering_rules

Enhanced Source Records Dataset Attributes

In this dataset, source record values have been standardized and enhanced using the data quality services provided in the data product. Additionally, this dataset includes attributes added by the Phone Number and Address Standardization, Validation, and Geocoding data quality services.

Attribute Name
tamr_id
tamr_record_id
source_name
primary_key
first_name
middle_name
family_name
full_name
name_prefix
name_suffix
gender
organization_name
address
address_line_2
address_type
city
state
postal_code
country
latitude
longitude
npi
credentials
degree
graduation_year
license_state
license_number
license_status
provider_specialty
phone
email
date_of_birth
enriched_phone.primary.national_format
enriched_phone.primary.carrier
enriched_phone.primary.success
enriched_phone.primary.valid
enriched_phone.primary.country_code
enriched_phone.primary.international_format
enriched_phone.primary.region
enriched_phone.primary.type
enriched_address.primary.premise
enriched_address.primary.thoroughfare
enriched_address.primary.city
enriched_address.primary.region
enriched_address.primary.postal_code_primary
enriched_address.primary.postal_code_secondary
enriched_address.primary.country_name
enriched_address.primary.country_code_2_character
enriched_address.primary.full_address
enriched_address.primary.latitude
enriched_address.primary.longitude
enriched_address.primary.success
enriched_address.primary.google_place_id
enriched_address.primary.location_type
enriched_address.primary.match_type
enriched_address.primary.data_provider
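
The dotted prefixes in the attribute names above (for example, enriched_phone and enriched_address) make it easy to separate enrichment attributes from unified schema attributes when working with published output. The following Python sketch groups attribute names by their first dotted segment. The function name and the "unified" label for attributes without a prefix are hypothetical and chosen only for this example.

```python
def group_by_namespace(attribute_names):
    """Group attribute names by the segment before the first dot.

    Names without a dot are placed under the hypothetical label "unified".
    """
    groups = {}
    for name in attribute_names:
        namespace = name.split(".", 1)[0] if "." in name else "unified"
        groups.setdefault(namespace, []).append(name)
    return groups


# A few attribute names taken from the list above.
columns = [
    "npi",
    "phone",
    "enriched_phone.primary.valid",
    "enriched_address.primary.city",
    "enriched_address.primary.latitude",
]
grouped = group_by_namespace(columns)
```

With these inputs, npi and phone are grouped together, and each enrichment service's attributes are collected under its own prefix.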