Healthcare Providers Data Product

You use the Healthcare Providers template to master healthcare provider and practitioner data.

This data product provides a consolidated view of Type 1 (individual) healthcare providers and includes National Plan & Provider Enumeration System (NPPES) enrichment, so that you have the most complete and up-to-date information for each provider. You can use this information, for example, to help streamline operations and provide optimal patient care.

This data product is not designed to master Type 2 (organization) healthcare data. To master healthcare organization data, use the B2B Customers Data Product.

This data product provides:

  • An industry-standard schema for healthcare provider data.
  • A machine learning model that deduplicates entities within, and across, your data sources.
  • Data quality services for address, phone number, and first name data.
  • Provider identification, license, specialty, and credential details.
  • Insight into organizations with which a provider is affiliated.

Healthcare Providers Data Processing

The following diagram explains how your source records are prepared for clustering, and how Tamr creates and enriches the golden record for each healthcare provider.

Source record preparation includes:

  • Aligning source columns to data product attributes.
  • Cleaning, validating, and enriching source record values.
  • Assigning each record a unique primary key, the tamr_record_id. This ID is a 128-bit hash value of the source dataset name and the source primary key.
  • Matching each source record to an NPI number, as described in the section below on NPPES Enrichment.
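The tamr_record_id described in the list above can be pictured with a short sketch. This is only an illustration: the source states that the ID is a 128-bit hash of the source dataset name and the source primary key, but it does not name the hash function or how the two values are combined. The BLAKE2b function, the "|" separator, and the make_record_id name below are assumptions, so IDs produced by this sketch will not match the IDs that Tamr generates.

```python
import hashlib


def make_record_id(dataset_name: str, primary_key: str) -> str:
    """Return a 128-bit hash of dataset name + primary key as 32 hex characters.

    Assumption: BLAKE2b truncated to 16 bytes and a "|" separator are
    illustrative choices only; the actual hash function is not documented here.
    """
    payload = f"{dataset_name}|{primary_key}".encode("utf-8")
    # digest_size=16 bytes gives a 128-bit digest.
    return hashlib.blake2b(payload, digest_size=16).hexdigest()
```

Because the dataset name is part of the hashed input, two sources that reuse the same primary key value still receive different record IDs, and reloading the same source record yields the same ID.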

Source record clustering includes:

  • Applying the clustering model and any clustering rules to group source records that refer to the same healthcare provider. (This model is described in the section below.) Each record in a given cluster is assigned the same Tamr ID. This ID is a unique, persistent identifier that links records in a cluster with each other and with the generated golden record for that cluster.
  • Applying any previous cluster overrides and verifications.

Golden record creation includes:

  • Applying logic to select the best value for each golden record attribute, and associating a Tamr ID with the golden record.
  • Enriching golden records with provider data from the NPPES enrichment service.

Healthcare Providers Clustering Model

By default, the healthcare providers model groups records based on similarities between values for these attributes:

  • Full address
  • Street address
  • City
  • Region
  • Country code
  • Postal code
  • First three digits of the postal code
  • Phone number
  • First name, including common first name variations and nicknames
  • Last name
  • Middle name
  • Name suffix
  • Organization name
  • Specialty
  • Credentials

Additionally, Tamr applies any custom clustering rules you have added for the data product when clustering source records.

NPPES Enrichment

The Healthcare Providers data product matches each provider in your source data to an NPI number in the NPPES database based on the following attributes, and prioritizes matches in the following order:

Priority  Attributes                      nppes_match_status
1         Existing NPI                    MATCH_NPI
2         Provider name and full address  MATCH_NAME_ADDRESS
2         License number                  MATCH_LICENSE_NUMBER
3         Provider name and city          MATCH_NAME_CITY
3         Provider name and state         MATCH_NAME_STATE
3         Phone number                    MATCH_PHONE
3         Email address                   MATCH_EMAIL

When there is a tie in priority order, records are ranked based on the following order:

  1. By the average similarity in descending (DESC) order.
  2. By the sum of average similarities in descending order.
  3. By the NPI in descending order. This is the final tie breaker.
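The priority order and tie-breaking rules above can be expressed as a single sort key. The following is a minimal sketch, not the service's implementation: the Candidate class, its field names, and the rank_candidates function are hypothetical, and the source does not define how the similarity values are computed. Only the priority values and the nppes_match_status names come from the table above.

```python
from dataclasses import dataclass

# Priority for each nppes_match_status value, as listed in the table above.
MATCH_PRIORITY = {
    "MATCH_NPI": 1,
    "MATCH_NAME_ADDRESS": 2,
    "MATCH_LICENSE_NUMBER": 2,
    "MATCH_NAME_CITY": 3,
    "MATCH_NAME_STATE": 3,
    "MATCH_PHONE": 3,
    "MATCH_EMAIL": 3,
}


@dataclass
class Candidate:
    """A candidate NPPES match (hypothetical structure for illustration)."""
    npi: str                   # 10-digit NPI number
    match_status: str          # one of the MATCH_PRIORITY keys
    avg_similarity: float      # assumed to be precomputed
    sum_avg_similarity: float  # assumed to be precomputed


def rank_candidates(candidates):
    """Sort candidates best-first: priority, then the three tie breakers."""
    return sorted(
        candidates,
        key=lambda c: (
            MATCH_PRIORITY[c.match_status],  # lower priority number wins
            -c.avg_similarity,               # 1. average similarity, descending
            -c.sum_avg_similarity,           # 2. sum of average similarities, descending
            -int(c.npi),                     # 3. NPI, descending (final tie breaker)
        ),
    )
```

In this sketch, the first element of the returned list is the candidate that would be selected. Two candidates at the same priority with identical similarity values are separated only by NPI, so the ordering is always deterministic.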

The NPPES enrichment service then adds detailed practice, credential, specialty, name, and other information from the National Plan & Provider Enumeration System (NPPES) to each golden record, based on the NPI.

Data Quality Services

This data product includes these data quality services:

  • Address Standardization, Validation, and Geocoding. This service examines values for the template's address attributes, and adds any resulting validated, standardized values to each record in new enrichment-specific attributes.
  • Phone Enrichment. This service validates and standardizes phone numbers, and enriches phone numbers with type, carrier, and region.
  • First name. This service examines first name values, and, for clustering purposes only, identifies common first name variations and nicknames. For example, common variations for Robert include Rob, Robbie, and Bob. The clustering model uses the original first name value and the enriched values when evaluating first name similarity.
  • Name Prefix Standardization. This service improves match results by standardizing approximately 300 prefixes, such as Miss, Mister, Doctor, Engineer, and so on, to common abbreviations, such as Ms., Mr., Dr., and Eng.
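To make the first name and name prefix services in the list above more concrete, here is a small sketch of the two lookups. It is an assumption-laden illustration: the lookup tables contain only the examples given in the text (the real prefix service covers approximately 300 prefixes), and the function names and the set-overlap comparison are hypothetical stand-ins for whatever the clustering model actually does.

```python
# Lookup tables limited to the examples in the text above.
NICKNAMES = {"robert": {"rob", "robbie", "bob"}}
PREFIXES = {"miss": "Ms.", "mister": "Mr.", "doctor": "Dr.", "engineer": "Eng."}


def first_name_variants(first_name: str) -> set:
    """Return the original first name plus any known variations, lowercased."""
    key = first_name.strip().lower()
    return {key} | NICKNAMES.get(key, set())


def share_variant(name_a: str, name_b: str) -> bool:
    """Hypothetical comparison: True if the two variant sets overlap."""
    return bool(first_name_variants(name_a) & first_name_variants(name_b))


def standardize_prefix(prefix: str) -> str:
    """Map a spelled-out prefix to its abbreviation; leave unknown values as-is."""
    key = prefix.strip().rstrip(".").lower()
    return PREFIXES.get(key, prefix.strip())
```

With these tables, share_variant("Robert", "Bob") is true because "bob" appears in both variant sets, while a prefix that is not in the table passes through unchanged.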

See the linked topics above for processing details and added attributes.

Using this Template

To learn more about this data product's requirements, processing, and resulting golden records, see: