Features of Healthcare Providers

This template enhances your data with NPPES enrichment and standardized address values, and consolidates similar records into grouped entities.

About the Data Quality and Enrichment Services

The healthcare providers template includes these data quality and enrichment services:

  • Address Standardization, Validation, and Geocoding. This service examines values for the template's address attributes, and adds any resulting validated, standardized values to each record in new enrichment-specific attributes. The original values mapped from your source datasets remain present and unchanged.
  • Phone Enrichment. This service validates and standardizes phone numbers, and enriches phone numbers with type, carrier, and region.
  • NPPES Enrichment. This service matches each mastered healthcare provider to a National Provider Identifier (NPI), and adds detailed practice, credential, specialty, name, and other information from the National Plan & Provider Enumeration System (NPPES).

See the linked topics above for processing details and added attributes.

About the Clustering Model

The healthcare providers model groups records as follows:

First, by trusted_id. Records with the same trusted_id are clustered together. Records with different trusted_ids are not clustered together.

Records with a null or empty trusted_id are clustered based on their middle name value and on similarity, meaning that they may be clustered with records that have a trusted_id.

Then, by middle name. If records have not been clustered together based on trusted_id values, then records with different values for middle name are not clustered together.

For example, if two records have the same trusted_id, they are clustered together regardless of middle name value. However, if two records do not have trusted_id values and have different values for middle name, these records will not be clustered together, regardless of the similarity in other attributes.
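The order of the trusted_id and middle name rules can be sketched as a pairwise check. This is only an illustration of the rules described above, not the template's actual implementation: the function name pair_decision, the record keys trusted_id and middle_name, the returned labels, and the case-insensitive middle name comparison are all assumptions made for this example.

```python
from typing import Optional


def _blank(value: Optional[str]) -> bool:
    """Treat None, empty, and whitespace-only values as missing."""
    return value is None or value.strip() == ""


def pair_decision(a: dict, b: dict) -> str:
    """Return a hypothetical label for how two records relate.

    "cluster"            -> same trusted_id, middle name is ignored
    "do_not_cluster"     -> different trusted_ids, or conflicting middle names
    "compare_similarity" -> left to the similarity step
    """
    id_a, id_b = a.get("trusted_id"), b.get("trusted_id")

    # Rule 1: when both records carry a trusted_id, it decides the outcome.
    if not _blank(id_a) and not _blank(id_b):
        return "cluster" if id_a == id_b else "do_not_cluster"

    # Rule 2: otherwise, two different non-empty middle names block the pair.
    # Case-insensitive comparison is an assumption of this sketch.
    mid_a, mid_b = a.get("middle_name"), b.get("middle_name")
    if not _blank(mid_a) and not _blank(mid_b):
        if mid_a.strip().casefold() != mid_b.strip().casefold():
            return "do_not_cluster"

    # Rule 3: matching or empty middle names fall through to similarity.
    return "compare_similarity"
```

In this sketch, a record without a trusted_id can still reach the similarity step when compared with a record that has one, which reflects the statement that such records may be clustered together.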

Then, by similarity. Records with empty trusted_ids and with matching or empty middle names are clustered based on similarities between these attributes:

  • Full address
  • Street address
  • City
  • Region
  • Country code
  • Postal code
  • First three digits of the postal code
  • Phone number
  • First name
  • Last name
  • Middle name
    Note: Records that are highly similar but have different middle name values will not be grouped together, unless these records have the same trusted_id.
  • Name suffix
  • Organization name
  • Specialty
  • Credentials

Note: Generic descriptions, rather than specific attribute names, are listed to represent both the standard schema and the attributes added by the enrichers and other data transformations.
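As a rough illustration of comparing records attribute by attribute, the sketch below derives the postal code prefix and computes a simple string similarity for each attribute that both records populate. The source does not state which similarity measure or weighting the model uses, so the use of Python's difflib here is a stand-in, and the attribute keys and function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical keys standing in for the generic attribute descriptions above.
ATTRIBUTES = [
    "full_address", "street_address", "city", "region", "country_code",
    "postal_code", "postal_code_prefix", "phone_number", "first_name",
    "last_name", "middle_name", "name_suffix", "organization_name",
    "specialty", "credentials",
]


def with_postal_prefix(record: dict) -> dict:
    """Copy the record and add the first three characters of the postal code."""
    out = dict(record)
    postal = str(record.get("postal_code") or "").strip()
    out["postal_code_prefix"] = postal[:3] if postal else None
    return out


def attribute_similarities(a: dict, b: dict) -> dict:
    """Return a 0..1 similarity for each attribute present in both records."""
    a, b = with_postal_prefix(a), with_postal_prefix(b)
    scores = {}
    for name in ATTRIBUTES:
        va, vb = a.get(name), b.get(name)
        if not va or not vb:
            continue  # skip attributes that are missing on either side
        # SequenceMatcher is a stand-in metric, not the template's own.
        matcher = SequenceMatcher(None, str(va).casefold(), str(vb).casefold())
        scores[name] = matcher.ratio()
    return scores
```

How the per-attribute scores would be combined into a clustering decision is not described in this topic, so the sketch stops at the individual scores.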