We are pleased to share the latest updates to Tamr Cloud, including significant improvements to source dataset management, a new connection to AWS S3 for source data, and the ability to create new entities from source records in Curator.

Centrally Add and Manage Source Datasets

In the Admin > Sources app, you can now centrally add and manage all source datasets. Use this new feature to add source datasets for use in one or more mastering flows. For each source dataset, you can:

View the connection type and other metadata, as shown in the image below.
Select the number in the Rows column to open a dataset preview.
Select the number in the Used By column to see which entity types are using the dataset.
Select More to edit, delete, refresh, or share the dataset with other users.

Sources app with several source datasets

Existing sources used in mastering flows will remain available and your flows will continue to run. Add any new source datasets in Admin > Sources.

Tamr will contact customers to assist with migrating existing source datasets used in mastering flows to take advantage of this new feature. Contact us at [email protected] with any questions.

See Managing Data Sources for more information on adding new source datasets and using them in your mastering flows.

Schedule Source Data Refresh

In Admin > Scheduler, you can now schedule recurring tasks to refresh source datasets. This task updates the dataset with the latest version of the data from its cloud storage location. In the example below, the Customer source dataset is scheduled to be refreshed every Monday at 6:00 am UTC. Once the data has been refreshed, the updated data is available for any mastering flows that use this dataset as a source.

Note: At this time, these tasks are not listed in the Jobs table.

Recurring task to refresh a source dataset

See Scheduling Recurring Tasks for more information.
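
To make the weekly schedule in the example concrete, the following minimal Python sketch computes the next run time for a task that recurs every Monday at 6:00 am UTC. It only illustrates the schedule arithmetic and is not how Tamr Cloud implements scheduling; the function name `next_weekly_run` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_run(now: datetime, weekday: int = 0, hour: int = 6, minute: int = 0) -> datetime:
    """Return the next run strictly after `now` for a weekly UTC schedule.

    `now` must be timezone-aware. `weekday` follows datetime.weekday():
    Monday is 0 and Sunday is 6. The defaults model "every Monday at 06:00 UTC".
    """
    now_utc = now.astimezone(timezone.utc)
    # Start from today's date at the scheduled time of day.
    candidate = now_utc.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Move forward to the scheduled weekday (0 days if it is today).
    candidate += timedelta(days=(weekday - candidate.weekday()) % 7)
    # If that moment has already passed, the next run is one week later.
    if candidate <= now_utc:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    print(next_weekly_run(datetime.now(timezone.utc)).isoformat())
```

In cron notation, a schedule like this is commonly written as `0 6 * * 1`; whether the Scheduler accepts cron syntax is not stated here.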

Connect to AWS S3 for Source Data

You can now create connections to AWS S3 to add source datasets to Tamr Cloud. See Connecting to Cloud Storage Locations for more information.

Create New Entity in Curator

When reviewing the clustered source records for a mastered entity in Curator, you can now choose to create a new entity from one or more records, as shown in the image below. This feature is useful when clustered records represent neither the entity to which they currently belong nor any other existing entity. The selected records are shown in a pending state until the mastering flow is run. When the flow is run, they are moved out of their original entity cluster and into a new entity.

New Create option in the Source Record Override page in Curator

See Managing Entity Record Clusters for more information.

Patient Mastering Template

Use the Patient Mastering template to master patient data, delivering patient identity resolution across disparate sources and systems. The template provides enrichment for phone numbers and addresses, helping to ensure that you have the most complete and up-to-date information. This template is HIPAA-compliant, as it does not master PHI (Protected Health Information) data.

See Patient Mastering Template for more information.

Fixed Issues and Other Improvements

Our latest release of Tamr Cloud includes general usability improvements and bug fixes, as well as:

(Fix) Source dataset rows were dropped in the Designer mastering flow if the primary key was not unique. This issue has been resolved: for steps that require a unique primary key, the flow now fails with an error message if the primary key is not unique.
(Improvement) Users can now add UTF-8 files with or without a byte order mark (BOM) as source datasets.
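
Both items above can be checked locally before you add a file as a source dataset. The sketch below is a hypothetical pre-flight check in Python, not part of Tamr Cloud: it decodes a CSV file whether or not it starts with a BOM and reports any primary key values that occur more than once. The function name `duplicate_primary_keys` and the `id` column in the usage example are assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def duplicate_primary_keys(raw: bytes, key_column: str) -> list[str]:
    """Return the primary key values that occur more than once in a UTF-8 CSV."""
    # "utf-8-sig" strips a leading BOM if one is present and otherwise
    # behaves like plain "utf-8", so both file variants decode identically.
    text = raw.decode("utf-8-sig")
    reader = csv.DictReader(io.StringIO(text, newline=""))
    if reader.fieldnames is None or key_column not in reader.fieldnames:
        raise KeyError(f"column {key_column!r} not found in header")
    counts = Counter(row[key_column] for row in reader)
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    sample = b"\xef\xbb\xbfid,name\n1,Ann\n2,Bob\n1,Ann B\n"  # file with a BOM
    print(duplicate_primary_keys(sample, "id"))
```

Decoding with plain `utf-8` would leave the BOM attached to the first header name, so the key column lookup would fail for files that include one.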

For the latest information on known issues, see Troubleshooting and Known Issues.