Managing Clusters
You can override Tamr Cloud-computed clusters.
When you run a mastering flow, Tamr Cloud groups records that refer to the same entity into a cluster, using a trained model. Each entity can represent a cluster of one to thousands of records.
In Curator, you can review the entities and their clustered source records for data products to which you have access. If you determine that several entities should be combined into a single entity, you can easily merge them. You can also perform more advanced entity source record management. For each entity, you can review similar entities and move individual source records between them. You can also create entirely new entities from the source records of the primary entity which you are viewing.
When you merge entities, move records between entities, or create new entities, your changes are applied by the Clustering step in Designer the next time the flow is run. Changes are shown as pending until the flow is run. These changes then persist on future mastering flow runs.
When you change the source records for an entity, note that different source record values may be selected for entity fields when the flow is run.
Tip: From a record in Studio, you can open the entity directly in Curator by selecting Override Value.
Merging Entities
If you know that two entities refer to the same real-world entity, you can quickly merge them. When you merge two or more entities, you select the surviving entity (the entity into which the others will be merged). The Tamr ID for the surviving entity remains the same, and all source records from the merged entities move to the surviving entity. The entities merged into the surviving entity are removed, along with their Tamr IDs.
To merge entities:
- Navigate to Curator.
- Open the data product tile.
- In the Entities tab, select the checkboxes for two or more entities to merge.
You can sort and filter the table to find specific entities.
- From the dropdown, choose Actions > Merge.
- In the confirmation dialog, review the list of entities being merged and then select Next.
- Select the entity into which the other entities will be merged. This is the surviving entity.
- Select Merge & Close.
In the table, changes are shown as pending until the flow is run.
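The effect of a merge on entities and their source records can be sketched in a few lines of Python. This is only an illustrative model of the behavior described above, not Tamr Cloud code: the `merge_entities` function, the dictionary layout, and the `tamr-1` style IDs are all hypothetical.

```python
from typing import Dict, List


def merge_entities(
    clusters: Dict[str, List[str]], surviving_id: str, merged_ids: List[str]
) -> Dict[str, List[str]]:
    """Return a new mapping in which the surviving entity absorbs the merged ones.

    `clusters` maps a (hypothetical) Tamr ID to its list of source record IDs.
    """
    if surviving_id not in clusters:
        raise KeyError(f"unknown surviving entity: {surviving_id}")
    # Copy so the input mapping is left untouched.
    result = {entity_id: list(records) for entity_id, records in clusters.items()}
    for entity_id in merged_ids:
        if entity_id == surviving_id:
            continue
        # The merged entity and its ID disappear; its records move to the survivor.
        result[surviving_id].extend(result.pop(entity_id))
    return result


clusters = {"tamr-1": ["r1", "r2"], "tamr-2": ["r3"], "tamr-3": ["r4"]}
merged = merge_entities(clusters, surviving_id="tamr-1", merged_ids=["tamr-2"])
print(merged)
```

In this model the surviving ID `tamr-1` is unchanged, `tamr-2` no longer exists, and the unrelated entity `tamr-3` is unaffected.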
Watch the video below to learn how to quickly merge entities that represent the same real-world entity.
Moving Source Records between Entities
If you need to compare the source records of several entities to determine whether they have been clustered into the correct entity, you can use the Move option to review the source records and move them between entities if necessary. When you move source records between entities, you select a primary (target) entity to review and additional entities to compare to this entity. You can move records from the primary entity into any of these additional entities, as well as into any other similar entities. You can also move source records from the selected or similar entities into the primary entity. Your changes will be pending until the next time the flow is run.
Important: If you move all source records out of a selected or similar entity, that entity, along with its Tamr ID, will be removed the next time the flow is run.
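As a rough illustration of this behavior, the sketch below models moving records between two entities and drops an entity once it has no source records left. The `move_records` function, the dictionary layout, and the IDs are hypothetical; the sketch also simplifies by removing any emptied entity immediately, whereas Tamr Cloud applies the change the next time the flow is run.

```python
from typing import Dict, List


def move_records(
    clusters: Dict[str, List[str]],
    record_ids: List[str],
    source_id: str,
    target_id: str,
) -> Dict[str, List[str]]:
    """Return a new mapping with `record_ids` moved from source to target."""
    # Copy so the input mapping is left untouched.
    result = {entity_id: list(records) for entity_id, records in clusters.items()}
    for record_id in record_ids:
        result[source_id].remove(record_id)  # raises ValueError if not present
        result[target_id].append(record_id)
    # An entity with no remaining source records is removed along with its ID.
    if not result[source_id]:
        del result[source_id]
    return result


clusters = {"tamr-1": ["r1", "r2"], "tamr-2": ["r3"]}
emptied = move_records(clusters, ["r3"], source_id="tamr-2", target_id="tamr-1")
partial = move_records(clusters, ["r1"], source_id="tamr-1", target_id="tamr-2")
print(emptied)
print(partial)
```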
To move source records between clusters:
- Navigate to Curator.
- Open the data product tile.
- In the Entities tab, select the rows for the entities whose source records you want to review.
You can sort and filter the table to find specific entities.
- From the dropdown, choose Actions > Move.
- In the confirmation dialog, review the list of entities to review for cluster overrides. Select Next.
- In the Choose a Target Entity dialog, select the primary entity for review.
The Cluster Overrides page opens. The primary entity is shown at the top of the page, with the list of source records for the entity. The other selected entities are listed in the Selected & Similar Entities panel in the bottom left corner, along with any other similar entities.
- In the Selected & Similar Entities panel, select one or more entities to view the related source records in the Selected & Similar Source Records panel.
You can sort, filter, and configure the table to find specific records or view specific fields.
- To move source records from a similar entity into the primary entity, select one or more records in the Selected & Similar Source Records panel, and either:
- Drag and drop the selected records into the Primary Entity Source Records list. Confirm the move.
- Choose Move from the Actions dropdown above the Primary Entity Source Records table. Confirm the move.
- If you need to undo your changes, select Reset.
- To move source records from the primary entity into a similar entity, select one or more records to move from the Source Records pane and select Move. Select Submit to confirm.
- Drag and drop the selected records into the list. Confirm the move, and then select the target entity to which to move the records.
- Choose Move from the Actions dropdown above the Primary Entity Source Records table. Confirm the move, and then select the target entity to which to move the records.
- If you need to undo your changes, select Reset.
- When you have completed all of your changes, select Save to save your changes.
- Navigate back to the entities table. Changes are shown as pending until the flow is run.
Watch the video below to learn how to compare source record clusters for selected and similar entities, and move source records between clusters.
Creating a New Entity
You can create a new entity from one or more source records of the primary entity that you are viewing, moving those records from the current entity to the new entity. This feature is useful when clustered source records represent neither the entity to which they currently belong nor any other existing entity, but rather a new entity. Your changes will be pending until the next time the flow is run.
To create a new entity:
- Navigate to Curator.
- Open the data product tile.
- In the Entities tab, select source records to review, then navigate to the Cluster Overrides tab.
You can sort and filter the table to find specific entities.
- Select one or more source records from the primary entity to move to a new entity.
- Select Actions > Create New Entity.
- Confirm the source records you want to move into a new entity, then select Create.
These records are shown in a pending new state.
- When you have completed all of your changes, select Close to save your changes.
- Navigate back to the entities table. Changes are shown in pending state until the flow is run.
You must run the flow in Designer to update your changes. Until you run the flow, your changes are in a pending state.
Note: After confirming that you want to create the new entity, you cannot use the Reset option to undo this change. If you do need to revert this change, you can merge the new entity with its previous entity after running the flow.
Watch the video below to learn how to move source records to a newly created entity.
Viewing Cluster Override History
After you run the flow to apply cluster overrides, you can view both the cluster assigned to a source record by the clustering model and the cluster to which the record was moved by an override in the following places:
- Apply Clustering step output in Designer
- Source record tables in Curator and Studio
- Source Records by Entity published dataset
The cluster assigned by the clustering model is stored in the suggestedClusterId field.
The cluster to which a record has been moved through overrides is stored in the verifiedClusterId field.
See Persistent Identifier Fields in Tamr Cloud for more information on these fields.
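If you export source records (for example, from the Source Records by Entity published dataset) and load them as a list of dictionaries, a short script can summarize which overrides were applied. The sketch below is a minimal example under stated assumptions: only the suggestedClusterId and verifiedClusterId field names come from this page, and it assumes that verifiedClusterId is null (`None`) for records without an override. The `summarize_overrides` function and the sample rows are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def summarize_overrides(
    rows: Iterable[Dict[str, Optional[str]]]
) -> "Counter[Tuple[Optional[str], str]]":
    """Count overridden records per (model-assigned cluster, override cluster) pair."""
    moves: "Counter[Tuple[Optional[str], str]]" = Counter()
    for row in rows:
        verified = row.get("verifiedClusterId")
        # Assumption: a null verifiedClusterId means no override was applied.
        if verified is not None:
            moves[(row.get("suggestedClusterId"), verified)] += 1
    return moves


# Hypothetical sample rows; real exports contain many more fields.
rows = [
    {"suggestedClusterId": "c1", "verifiedClusterId": None},
    {"suggestedClusterId": "c1", "verifiedClusterId": "c2"},
    {"suggestedClusterId": "c1", "verifiedClusterId": "c2"},
    {"suggestedClusterId": "c3", "verifiedClusterId": "c4"},
]
summary = summarize_overrides(rows)
for (model_cluster, override_cluster), count in summary.items():
    print(f"{count} record(s) moved from {model_cluster} to {override_cluster}")
```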
Confirming Cluster Overrides
After running a flow, you can confirm that your cluster overrides were applied by checking several field values.
- Navigate to Studio.
- Open the entity containing the source records that you applied changes to.
- Open the Source Records tab.
- Scroll across the source record table until you reach the persistentId, suggestedClusterId, ruleClusterId, verifiedClusterId, and verificationType columns.
- Reference the cases below to confirm that your intended override was applied.
Case 1: No Overrides Applied
If no cluster overrides were applied, you see the following:
- persistentId, suggestedClusterId, and ruleClusterId have the same value.
- verifiedClusterId and verificationType values are both null.

Case 2: New Entity Created in Curator
If you created a new entity from clustered source records, you see the following in the source records for the new entity:
- persistentId matches the verifiedClusterId.
This is the Tamr ID for the newly created entity.
- suggestedClusterId matches the ruleClusterId.
The value for these columns is the original cluster for the source records.
- verificationType is SUGGEST.
SUGGEST indicates that an override was made.

Case 3: Source Records Moved to Different Cluster
If you moved source records to a different cluster, you see the following in the source records that were moved:
- persistentId and verifiedClusterId are the same.
This is the Tamr ID of the entity to which the source records have been moved.
- persistentId and suggestedClusterId are different.
The suggestedClusterId value is the original cluster for the source records.
- verificationType is SUGGEST.
SUGGEST indicates that an override was made.
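The checks in the cases above can also be expressed as a small function, which may help when reviewing many exported source records at once. This is a sketch, not a Tamr Cloud API: the `override_status` function and its return labels are hypothetical, and each row is assumed to be a dictionary keyed by the five column names listed on this page. Because Cases 2 and 3 share the same field pattern, the function tests only the conditions they have in common and reports both as `override_applied`.

```python
from typing import Dict, Optional


def override_status(row: Dict[str, Optional[str]]) -> str:
    """Classify one source record row as no_override, override_applied, or unexpected."""
    persistent = row.get("persistentId")
    suggested = row.get("suggestedClusterId")
    rule = row.get("ruleClusterId")
    verified = row.get("verifiedClusterId")
    verification_type = row.get("verificationType")

    # Case 1: the three IDs agree and both verification fields are null.
    if verified is None and verification_type is None:
        if persistent == suggested == rule:
            return "no_override"
        return "unexpected"

    # Cases 2 and 3: the record now lives in the cluster named by the override.
    if verification_type == "SUGGEST" and persistent == verified:
        return "override_applied"

    return "unexpected"


# Hypothetical sample rows.
untouched = {
    "persistentId": "c1", "suggestedClusterId": "c1", "ruleClusterId": "c1",
    "verifiedClusterId": None, "verificationType": None,
}
moved = {
    "persistentId": "c2", "suggestedClusterId": "c1", "ruleClusterId": "c1",
    "verifiedClusterId": "c2", "verificationType": "SUGGEST",
}
print(override_status(untouched))
print(override_status(moved))
```

A result of `unexpected` only means the row does not match any pattern described on this page, for example when the flow has not yet been run since the override was made.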
