Managing Entity Record Clusters

When you run a mastering flow, Tamr Cloud groups records that refer to the same entity into a cluster, using a trained model. Each entity can represent a cluster of one to thousands of records.

In Curator, you can review the entities and their clustered source records for entity types to which you have access. If you determine that several entities should be combined into a single entity, you can easily merge them. You can also perform more advanced entity source record management; for each entity, you can review similar entities and move individual source records between them.

When you merge entities or move records between similar entities, your changes are applied by the Clustering step in Designer the next time the flow is run. Changes are shown as pending until the flow is run. These changes then persist on future mastering flow runs.

When you change the source records for an entity, note that different source record values may be selected for entity fields when the flow is run.

Tip: From an entity record in Studio, you can open the entity directly in Curator by selecting Override Value.

Merging Entities

If you know that two entities refer to the same real-world entity, you can quickly merge them. When you merge two or more entities, you select the surviving entity (the entity into which the others will be merged). The Tamr ID for the surviving entity remains the same, and all source records from the merged entities move to the surviving entity. The entities merged into the surviving entity are removed, along with their Tamr IDs.
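The merge rules above can be illustrated with a small, self-contained model. This is only a sketch of the described behavior, not Tamr Cloud code: the Entity class, the merge_entities function, and the sample IDs are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical stand-in for an entity: a Tamr ID plus its source records."""
    tamr_id: str
    source_record_ids: list[str] = field(default_factory=list)


def merge_entities(
    entities: dict[str, Entity],
    surviving_id: str,
    merged_ids: list[str],
) -> dict[str, Entity]:
    """Return a new mapping in which merged_ids are folded into surviving_id.

    The surviving entity keeps its Tamr ID and gains every source record
    from the merged entities; the merged entities and their IDs disappear.
    """
    if surviving_id in merged_ids:
        raise ValueError("the surviving entity cannot also be merged away")

    # Copy so the caller's data is left untouched.
    result = {
        tid: Entity(e.tamr_id, list(e.source_record_ids))
        for tid, e in entities.items()
    }
    survivor = result[surviving_id]
    for tid in merged_ids:
        survivor.source_record_ids.extend(result.pop(tid).source_record_ids)
    return result


# Example with made-up IDs: merge B into A, leave C alone.
before = {
    "A": Entity("A", ["r1"]),
    "B": Entity("B", ["r2", "r3"]),
    "C": Entity("C", ["r4"]),
}
after = merge_entities(before, surviving_id="A", merged_ids=["B"])
```

In this model, after contains only A and C, and A holds the records r1, r2, and r3. In Tamr Cloud itself, the equivalent change stays pending until the flow is run.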

To merge entities:

  1. Navigate to Curator.
  2. Open the entity type tile.
  3. In the Entities tab, select the checkboxes for two or more entities to merge.
    You can sort and filter the table to find specific entities.
  4. Choose Merge from the Actions dropdown above the entities table.
  5. In the confirmation dialog, review the list of entities being merged and then select Next.
  6. Select the entity into which the other entities will be merged. (This is the surviving entity.)
  7. Select Merge & Close.

In the entities table, changes are shown as pending until the flow is run.

Moving Source Records between Entities

If you need to compare the source records of several entities to determine whether they have been clustered into the correct entity, you can use the Move option to review the source records and move them between entities if necessary. When you move source records between entities, you select a primary (target) entity to review and additional entities to compare to this entity. You can move records from the primary entity into any of these additional entities, as well as into any other similar entities. You can also move source records from the selected or similar entities into the primary entity. Your changes will be pending until the next time the flow is run.

Important: If you move all source records out of a selected or similar entity, that entity, along with its Tamr ID, will be removed the next time the flow is run.
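As a rough illustration of the move behavior and the warning above, the sketch below models entities as sets of source record IDs and drops any entity left empty. The function name apply_moves, the data layout, and the sample IDs are assumptions made for this example; they do not reflect how Tamr Cloud stores or applies overrides.

```python
from __future__ import annotations


def apply_moves(
    clusters: dict[str, set[str]],
    moves: list[tuple[str, str]],
) -> dict[str, set[str]]:
    """Apply (record_id, target_entity_id) moves and drop emptied entities.

    clusters maps a hypothetical entity ID to the set of its source record IDs.
    """
    # Work on a copy so pending moves do not alter the input.
    result = {eid: set(records) for eid, records in clusters.items()}

    for record_id, target in moves:
        if target not in result:
            raise KeyError(f"unknown target entity: {target}")
        # Remove the record from whichever entity currently holds it.
        for records in result.values():
            records.discard(record_id)
        result[target].add(record_id)

    # An entity with no remaining source records is removed with its ID.
    return {eid: records for eid, records in result.items() if records}


# Example with made-up IDs: moving r3 empties entity E2, so E2 disappears.
clusters = {"E1": {"r1", "r2"}, "E2": {"r3"}}
updated = apply_moves(clusters, [("r3", "E1")])
```

Here updated contains only E1, now holding r1, r2, and r3. The input mapping is unchanged, which mirrors the idea that moves are pending until the flow is run.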

To move source records between entities:

  1. Navigate to Curator.
  2. Open the entity type tile.
  3. In the Entities tab, select the rows for the entities whose source records you want to review.
    You can sort and filter the table to find specific entities.
  4. Choose Move from the Actions dropdown above the entities table.
  5. In the confirmation dialog, review the list of entities selected for source record overrides, and then select Next.
  6. In the Choose a Target Entity dialog, select the primary entity for review.
    The Source Records Overrides page opens. The primary entity is shown at the top of the page, with the list of source records for the entity. The other selected entities are listed in the Selected & Similar Entities panel in the bottom left corner, along with any other similar entities.
  7. In the Selected & Similar Entities panel, select one or more entities to view the related source records in the Selected & Similar Source Records panel.
    You can sort, filter, and configure the table to find specific records or view specific fields.
  8. To move source records from a similar entity into the primary entity, select one or more records in the Selected & Similar Source Records panel, and either:
    • Drag and drop the selected records into the Primary Entity Source Records list. Confirm the move.
    • Choose Move from the Actions dropdown above the source records table. Confirm the move.
    • If you need to undo your changes, select Reset.
  9. To move source records from the primary entity into a similar entity, select one or more records in the Primary Entity Source Records table, and either:
    • Drag and drop the selected records into the list. Confirm the move, and then select the target entity to which to move the records.
    • Choose Move from the Actions dropdown above the Primary Entity Source Records table. Confirm the move, and then select the target entity to which to move the records.
    • If you need to undo your changes, select Reset.
  10. When you have completed all of your changes, select Save.
  11. Navigate back to the entities table. Changes are shown as pending until the flow is run.

Viewing Source Record Override History

After you run the flow to apply cluster overrides, you can view both the cluster assigned to a source record by the clustering model and the cluster to which the record was moved by an override in the following places:

  • Apply Clustering step output in Designer
  • Source record tables in Curator and Studio
  • Source Records by Entity published dataset

The cluster assigned by the clustering model is stored in the suggestedClusterId field.
The cluster to which a record has been moved through overrides is stored in the verifiedClusterId field.

See Persistent Identifier Fields in Tamr Cloud for more information on these fields.
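If you export one of these outputs, the two fields can be compared to list the records that were moved by an override. The following sketch assumes each record is available as a Python dictionary and that a record with no override has either an empty verifiedClusterId or one equal to suggestedClusterId; both are assumptions to check against your own data. The recordId key, the function name, and the sample values are hypothetical.

```python
from __future__ import annotations


def find_overridden_records(records: list[dict]) -> list[dict]:
    """Return records whose verified cluster differs from the model's suggestion.

    Assumes an unset verifiedClusterId (None) means no override was applied.
    """
    return [
        record
        for record in records
        if record.get("verifiedClusterId") is not None
        and record["verifiedClusterId"] != record.get("suggestedClusterId")
    ]


# Made-up sample rows; "recordId" is a hypothetical key used for illustration.
rows = [
    {"recordId": "r1", "suggestedClusterId": "c1", "verifiedClusterId": "c1"},
    {"recordId": "r2", "suggestedClusterId": "c1", "verifiedClusterId": "c2"},
    {"recordId": "r3", "suggestedClusterId": "c3", "verifiedClusterId": None},
]
overridden = find_overridden_records(rows)
moved_ids = [row["recordId"] for row in overridden]
```

With the sample rows above, only r2 is reported, because its verified cluster differs from the cluster suggested by the model.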