Source Dataset Requirements for Patient Mastering
Align your source data with the industry-standard patient data schema that this template supplies.
The patient mastering template includes a predefined, standardized schema for patient data. The mastering flow for data products produced by this template includes a schema mapping step in which you identify how columns in your source datasets correspond to the attributes in the supplied schema.
To prepare, review the general Requirements for Source Datasets. Then, identify the column or columns in each of your source datasets that you will map to the patient schema:
primaryKey
: This is the primary key used in the source dataset to uniquely identify each record. See About Primary Keys for more information.
address_line_1
address_line_2
city
country
dob
: Date of birth.
email
fax_number
first_name
last_name
middle_name
name_prefix
name_suffix
gender
patient_national_identification_number
: For example, Social Security Number.
phone_number
phone_number_alt
: Alternate phone number.
postal_code
region
trusted_id
: This is a non-unique key, such as a patient identification number used by your internal systems. The clustering model always clusters together records that have the same trusted_id. If the values in this field do not represent a definite match, map an empty placeholder field to trusted_id, and then add the following transformation in the Create tamr_record_id step: SELECT *, '' as trusted_id;
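Because records that share a trusted_id are always clustered together, it can help to review which source records share a value before you map the field. The following is a minimal Python sketch of such a review, not part of Tamr Cloud: the function name shared_trusted_ids and the sample rows are hypothetical, and the rows are assumed to be already keyed by the schema attribute names.

```python
from collections import defaultdict


def shared_trusted_ids(records, key="trusted_id"):
    """Return {trusted_id: [primaryKey, ...]} for IDs shared by two or more records."""
    groups = defaultdict(list)
    for record in records:
        value = (record.get(key) or "").strip()
        if value:  # blank values are skipped; only real IDs are reviewed
            groups[value].append(record["primaryKey"])
    return {tid: keys for tid, keys in groups.items() if len(keys) > 1}


# Hypothetical sample rows, for illustration only.
sample = [
    {"primaryKey": "a1", "first_name": "Ana", "trusted_id": "MRN-100"},
    {"primaryKey": "a2", "first_name": "Anna", "trusted_id": "MRN-100"},
    {"primaryKey": "a3", "first_name": "Ben", "trusted_id": ""},
    {"primaryKey": "a4", "first_name": "Cy", "trusted_id": "MRN-200"},
]

print(shared_trusted_ids(sample))
```

If any group returned by this check contains records that are not a definite match, that suggests mapping the empty placeholder field to trusted_id as described above.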
After you map your source data fields to these attributes, Tamr Cloud can enrich your data and consolidate similar records into entities.
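Before you run the flow, a planned mapping can be sanity-checked against the attribute names in the patient schema. The sketch below is an illustration only and does not use any Tamr API: the mapping dictionary, its source column names, and the check_mapping helper are all hypothetical. The attribute list is copied from the schema described on this page.

```python
# Attribute names from the patient schema described on this page.
PATIENT_ATTRIBUTES = [
    "primaryKey", "address_line_1", "address_line_2", "city", "country",
    "dob", "email", "fax_number", "first_name", "last_name", "middle_name",
    "name_prefix", "name_suffix", "gender",
    "patient_national_identification_number", "phone_number",
    "phone_number_alt", "postal_code", "region", "trusted_id",
]


def check_mapping(mapping):
    """Compare a {source_column: schema_attribute} mapping with the schema.

    Returns schema attributes that no column maps to ("unmapped") and
    mapping targets that are not in the supplied schema ("unknown").
    """
    targets = set(mapping.values())
    known = set(PATIENT_ATTRIBUTES)
    return {
        "unmapped": [a for a in PATIENT_ATTRIBUTES if a not in targets],
        "unknown": sorted(targets - known),
    }


# Hypothetical source columns; "last_nmae" is a deliberate typo.
example_mapping = {
    "patient_id": "primaryKey",
    "birth_date": "dob",
    "zip": "postal_code",
    "surname": "last_nmae",
}

report = check_mapping(example_mapping)
print(report["unknown"])   # targets not found in the patient schema
print(report["unmapped"])  # schema attributes with no source column yet
```

An entry under "unknown" is not necessarily an error, since it may be an attribute you added to the unified schema yourself. Likewise, an entry under "unmapped" only indicates that no source column has been assigned to that attribute yet.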
Tip: You can also add attributes to the unified schema and map columns that you want to include in the mastered data product to them. The template does not use these additional attributes as part of the mastering process.