Persistent Identifier Attributes in Data Products
Tamr Cloud assigns a unique identifier to each entity. The same identifier is added to each source record in a cluster and to the golden record.
Identifier for Golden Records
Each golden record, or mastered entity, is assigned a Tamr ID (tamr_id). This same ID is assigned to each source record in the golden record's cluster.
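Because every source record in a cluster carries the same tamr_id as its golden record, the members of a cluster can be recovered by grouping on that attribute. The following is a minimal sketch, assuming source records have been exported as a list of dictionaries. Only the attribute names tamr_id and tamr_record_id come from this page; the record values and the group_by_tamr_id helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical exported source records. The attribute names tamr_id and
# tamr_record_id are from the documentation; the values are made up.
source_records = [
    {"tamr_record_id": "rec-a", "tamr_id": "T1"},
    {"tamr_record_id": "rec-b", "tamr_id": "T1"},
    {"tamr_record_id": "rec-c", "tamr_id": "T2"},
]


def group_by_tamr_id(records):
    """Map each Tamr ID to the source record IDs that share it."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record["tamr_id"]].append(record["tamr_record_id"])
    return dict(clusters)


clusters = group_by_tamr_id(source_records)
print(clusters)
```

Each key in the result is also the Tamr ID of the golden record for that cluster, so the same value can be used to join source records to golden records.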
Identifiers for Source Records
The following attributes store unique identifiers for source records. These attributes can help you understand how a source record was clustered.
Attribute | Display Name | Description |
---|---|---|
entityId, tamr_record_id | Entity ID, Tamr Record ID | Both attributes store the unique identifier that Tamr generates for the source record. The identifier is a 128-bit hash of the source dataset name and the source primary key. Note: This is not the same as the Tamr ID. |
tamr_id | Tamr ID | The final Tamr ID for this source record; all records in a cluster, along with the golden record for that cluster, are assigned the same Tamr ID. |
clustering_metadata.ml_cluster_id | Suggested Cluster ID | This is the Tamr ID assigned to the record by the clustering model. This suggested Tamr ID can be overridden by clustering rules and manual source record curation. |
clustering_metadata.rule_cluster_id | Rule Clustered ID | The Tamr ID for a record after clustering rules are applied. If clustering rules have been applied, the Applied Clustering Rules attribute (clustering_metadata.applied_clustering_rules) includes the numbers of the applied rules, as shown on the Configure Data Product page. |
clustering_metadata.verified_cluster_id | Verified Cluster ID | The Tamr ID for a record if it is manually moved to a new cluster through curation. If a source record has been moved to (verified in) a new cluster, the verification type is always suggest, meaning that an override was applied. |
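The clustering_metadata attributes above describe a chain of possible overrides: the model suggests an ID, clustering rules can replace it, and manual curation can replace it again. The sketch below illustrates that reading. It assumes that clustering_metadata is available as a dictionary, that an attribute is missing or empty when its stage did not apply, and that manual curation takes precedence over rules. The relative order of rules and curation and the resolve_cluster_id helper are assumptions for illustration, not documented Tamr behavior.

```python
from typing import Optional

# Assumed precedence, highest first: manual curation, then clustering
# rules, then the model's suggestion. This order is an assumption.
PRECEDENCE = ("verified_cluster_id", "rule_cluster_id", "ml_cluster_id")


def resolve_cluster_id(clustering_metadata: dict) -> Optional[str]:
    """Return the first non-empty cluster ID in the assumed precedence order."""
    for key in PRECEDENCE:
        value = clustering_metadata.get(key)
        if value:
            return value
    return None


# Hypothetical metadata for three records.
model_only = {"ml_cluster_id": "T1"}
rule_applied = {"ml_cluster_id": "T1", "rule_cluster_id": "T2"}
curated = {
    "ml_cluster_id": "T1",
    "rule_cluster_id": "T2",
    "verified_cluster_id": "T3",
}

for metadata in (model_only, rule_applied, curated):
    print(resolve_cluster_id(metadata))
```

Comparing the resolved value with the record's tamr_id attribute, and checking which of the three attributes supplied it, is one way to see whether a record landed in its cluster through the model, a rule, or curation.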