Source Dataset Requirements for Healthcare Providers Mastering
Align your source data with the industry-standard healthcare provider schema supplied by this template.
The healthcare providers mastering template includes a predefined, standardized schema for healthcare provider data. The mastering flow for data products produced by this template includes a schema mapping step in which you identify how columns in your source datasets correspond to the attributes in the supplied schema.
To prepare, review the general Requirements for Source Datasets. Then, identify the column or columns in each of your source datasets that you will map to the attributes in the healthcare providers schema. After you map your source data fields to these attributes, Tamr Cloud can enrich your data and consolidate similar records into entities.
The table below describes these attributes and explains which are:
- Required: The Schema Mapping step is marked as incomplete, and you cannot run the flow, until you map source columns to these attributes.
- Recommended: For optimal data quality, enrichment, and clustering results, map source columns to these attributes.
- Optional: These attributes have minimal impact on your clustering and enrichment results. If your source data includes columns that match these attributes, map them to include that source data in your completed data product.
Unified Attribute | Description | Requirement Level |
---|---|---|
primaryKey | The primary key used in the source dataset to uniquely identify each record. See About Primary Keys for more information. | Required |
trusted_id | A non-unique key, such as a provider identification number used by your internal systems. The clustering model always clusters together records that have the same trusted_id. If the values in this field do not represent a definite match, map an empty placeholder field to trusted_id, and then add the following transformation in the Create tamr_record_id step in the mastering flow: SELECT *, '' as trusted_id; | Optional |
address_line_1 | Line 1 of the provider’s address. | Recommended |
address_line_2 | Line 2 of the provider's address. | Optional |
city | City of the provider’s address. | Recommended |
country | Country of the provider's address. | Optional |
credentials | The provider's credentials. | Optional |
first_name | The provider's first name. | Required |
last_name | The provider's last name. | Required |
middle_name | The provider's middle name. | Optional |
name_suffix | The provider's suffix, such as Jr. or Senior. | Optional |
gender | The provider's gender. | Optional |
organization_name | The name of the provider's organization. | Optional |
provider_specialty | The provider's specialty. | Recommended |
phone_number | The provider's phone number. | Optional |
postal_code | Postal (zip) code of the provider’s address. | Optional |
region | The region of the provider’s address, such as the state or territory. | Recommended |
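Because records that share a trusted_id are always clustered together, it can help to check, before mapping, whether that key really represents a definite match in your data. The following is a minimal sketch and is not part of Tamr Cloud. It assumes that source rows are available as Python dictionaries. The function name find_conflicting_trusted_ids, the sample rows, and the choice to compare last_name are illustrative assumptions.

```python
from collections import defaultdict


def find_conflicting_trusted_ids(rows, id_field="trusted_id", compare_field="last_name"):
    """Return trusted_id values whose records disagree on compare_field.

    Hypothetical helper for illustration only; it is not a Tamr Cloud API.
    """
    values_by_id = defaultdict(set)
    for row in rows:
        key = (row.get(id_field) or "").strip()
        if not key:
            continue  # this sketch ignores records with an empty key
        # Normalize case and whitespace so trivial differences are not flagged.
        values_by_id[key].add((row.get(compare_field) or "").strip().lower())
    # A key is suspicious when it spans more than one distinct value.
    return {key: sorted(vals) for key, vals in values_by_id.items() if len(vals) > 1}


# Made-up sample rows, for illustration only.
sample_rows = [
    {"trusted_id": "P-100", "last_name": "Nguyen"},
    {"trusted_id": "P-100", "last_name": "nguyen "},
    {"trusted_id": "P-200", "last_name": "Okafor"},
    {"trusted_id": "P-200", "last_name": "Silva"},
    {"trusted_id": "", "last_name": "Silva"},
]
conflicts = find_conflicting_trusted_ids(sample_rows)
print(conflicts)
```

If many keys show conflicts like this, the values may not represent a definite match, and the empty placeholder approach described for trusted_id may be the safer choice.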
Tip: You can also add attributes to the unified schema and map columns that you want to include in the mastered data product to them. The template does not use these additional attributes as part of the mastering process.
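Before you start the schema mapping step, a quick check of a planned mapping against the table above can show which attributes are still unmapped. The sketch below is illustrative only. The mapping dictionary format, the source column names, and the function check_mapping are assumptions and are not part of Tamr Cloud. The required and recommended attribute lists are copied from the table.

```python
# Attribute names taken from the table of unified attributes.
REQUIRED = {"primaryKey", "first_name", "last_name"}
RECOMMENDED = {"address_line_1", "city", "provider_specialty", "region"}


def check_mapping(mapping):
    """Report required and recommended attributes with no mapped source column.

    `mapping` is a hypothetical dict of unified attribute -> list of source columns.
    """
    # An attribute counts as mapped only if it has at least one source column.
    mapped = {attr for attr, cols in mapping.items() if cols}
    return {
        "missing_required": sorted(REQUIRED - mapped),
        "missing_recommended": sorted(RECOMMENDED - mapped),
    }


# Made-up planned mapping for one source dataset.
planned = {
    "primaryKey": ["npi_row_id"],
    "first_name": ["fname"],
    "last_name": [],  # no source column chosen yet
    "city": ["practice_city"],
    "region": ["practice_state"],
    "internal_notes": ["notes"],  # added attribute; ignored by this check
}
report = check_mapping(planned)
print(report)
```

In this example, the report lists last_name as a missing required attribute, so the flow could not run until a source column is mapped to it. Attributes that you add to the unified schema, such as internal_notes here, do not affect the result.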