Persistent Identifier Attributes in Data Products

Tamr Cloud assigns a unique, persistent identifier to each entity. The same identifier is added to each source record in a cluster and to the mastered entity record.

Identifier for Golden Records

Each golden record, or mastered entity, is assigned a Tamr ID (tamr_id). This same ID is assigned to each source record in the golden record's cluster.
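
Because a golden record and the source records in its cluster share one Tamr ID, cluster membership can be recovered by grouping source records on tamr_id. The following is a minimal sketch, assuming the source records have been exported as a list of dictionaries. The sample values and the helper name group_by_tamr_id are hypothetical and are not part of Tamr Cloud.

```python
from collections import defaultdict


def group_by_tamr_id(source_records):
    """Group source records by their shared tamr_id (one group per cluster)."""
    clusters = defaultdict(list)
    for record in source_records:
        clusters[record["tamr_id"]].append(record)
    return dict(clusters)


# Hypothetical exported rows; real identifiers will look different.
source_records = [
    {"tamr_record_id": "rec-1", "tamr_id": "T-100"},
    {"tamr_record_id": "rec-2", "tamr_id": "T-100"},
    {"tamr_record_id": "rec-3", "tamr_id": "T-200"},
]

clusters = group_by_tamr_id(source_records)
for tamr_id, members in clusters.items():
    print(tamr_id, [m["tamr_record_id"] for m in members])
```

Each key in the result corresponds to one golden record, and each value lists the source records that were clustered into it.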

Identifiers for Source Records

The following attributes store unique identifiers for source records. These attributes can help you understand how a source record was clustered.

| Attribute | Display Name | Description |
|---|---|---|
| entityId, tamr_record_id | Entity ID, Tamr Record ID | Both of these attributes store the unique identifier for the source record, generated by Tamr. This is a 128-bit hash value of the source dataset name and the source primary key. Note: This is not the same as the Tamr ID. |
| tamr_id | Tamr ID | The final Tamr ID for this source record; all records in a cluster, along with the golden record for that cluster, are assigned the same Tamr ID. |
| clustering_metadata.ml_cluster_id | Suggested Cluster ID | This is the Tamr ID assigned to the record by the clustering model. This suggested Tamr ID can be overridden by clustering rules and manual source record curation. |
| clustering_metadata.rule_cluster_id | Rule Clustered ID | The Tamr ID for a record after clustering rules are applied. If clustering rules have been applied, the Applied Clustering Rules attribute (clustering_metadata.applied_clustering_rules) includes the numbers of the rules applied, available on the Configure Data Product page. |
| clustering_metadata.verified_cluster_id | Verified Cluster ID | The Tamr ID for a record if it is manually moved to a new cluster through curation. If a source record has been moved to (verified in) a new cluster, the verification type is always suggest, meaning that an override was applied. |
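
The clustering metadata attributes listed above can be read together to see which step most likely determined a source record's cluster. The sketch below is illustrative only. It assumes each record is a flat dictionary keyed by the attribute names above, that an attribute is missing or None when the corresponding step did not apply, and that manual curation takes precedence over clustering rules, which take precedence over the model's suggestion. That precedence order is an assumption inferred from the descriptions, not documented behavior, and explain_cluster_assignment is a hypothetical helper.

```python
def explain_cluster_assignment(record):
    """Return (step, cluster_id) for the step that set the record's cluster.

    Assumed precedence (not confirmed by the documentation):
    manual curation > clustering rules > clustering model.
    """
    steps = [
        ("manual curation", "clustering_metadata.verified_cluster_id"),
        ("clustering rule", "clustering_metadata.rule_cluster_id"),
        ("clustering model", "clustering_metadata.ml_cluster_id"),
    ]
    for step, attribute in steps:
        cluster_id = record.get(attribute)
        if cluster_id:
            return step, cluster_id
    return "unknown", None


# Hypothetical record: the model suggested T-100, then a rule moved it to T-200.
record = {
    "tamr_id": "T-200",
    "clustering_metadata.ml_cluster_id": "T-100",
    "clustering_metadata.rule_cluster_id": "T-200",
    "clustering_metadata.verified_cluster_id": None,
}

print(explain_cluster_assignment(record))
```

If your export populates rule_cluster_id even when no rule fired, check clustering_metadata.applied_clustering_rules as well before attributing the assignment to a rule.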