End-to-End Process

At a high level, configuring, running, and managing a data product involves the following steps.

Note: For legacy data products, see Running an End-to-End Flow for Legacy Data Products.

Connect Data

The first step is connecting your data to Tamr Cloud. To do this, you configure connections to your data storage locations, and then add your source datasets.

Configure the Data Product

Next, add and configure your data product. Map columns from your source data to the industry-standard schema for your selected data product, and then configure record consolidation rules, clustering rules, and data cleaning.
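As a rough illustration of what column mapping means, the sketch below renames source columns to target schema attributes. This is not how Tamr Cloud is configured, which happens in the product itself. The column names, the `COLUMN_MAPPING` dictionary, and the `map_to_schema` helper are all hypothetical.

```python
# Hypothetical mapping: source column name -> target schema attribute.
# Neither side reflects an actual Tamr schema; the names are for illustration only.
COLUMN_MAPPING = {
    "cust_nm": "company_name",
    "addr_1": "address_line_1",
    "zip": "postal_code",
}


def map_to_schema(row, column_mapping):
    """Rename mapped columns; source columns without a mapping are dropped."""
    return {target: row.get(source) for source, target in column_mapping.items()}


source_row = {
    "cust_nm": "Acme Corp",
    "addr_1": "1 Main St",
    "zip": "02139",
    "internal_id": "A-17",  # not mapped, so it does not appear in the result
}
mapped_row = map_to_schema(source_row, COLUMN_MAPPING)
```

Mapping every source dataset onto one shared set of attribute names is what makes records from different sources comparable in the later steps.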

For more details on configuring your data product, see:

Run the Data Product

Running the data product deduplicates your data by grouping similar records together into clusters, and cleans and enriches it with third-party data. Tamr produces the single best record representing each entity (the golden record).
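To make the ideas of clusters and golden records concrete, here is a deliberately simplified sketch. It does not reproduce Tamr's matching or consolidation logic. It assumes records match when their names are identical after crude normalization, and it assumes a "most common non-empty value" consolidation rule. The helper names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def normalize(name: str) -> str:
    """Crude matching key: lowercase, keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def cluster_records(records):
    """Group records whose normalized names are identical."""
    clusters = defaultdict(list)
    for record in records:
        clusters[normalize(record["name"])].append(record)
    return list(clusters.values())


def golden_record(cluster):
    """Assumed consolidation rule: most common non-empty value per attribute."""
    golden = {}
    attributes = {key for record in cluster for key in record}
    for attribute in sorted(attributes):
        values = [r[attribute] for r in cluster if r.get(attribute)]
        golden[attribute] = Counter(values).most_common(1)[0][0] if values else None
    return golden


records = [
    {"name": "Acme Corp.", "city": "Boston", "phone": ""},
    {"name": "ACME Corp", "city": "Boston", "phone": "555-0100"},
    {"name": "acme corp", "city": "", "phone": "555-0100"},
    {"name": "Globex", "city": "Springfield", "phone": "555-0199"},
]

clusters = cluster_records(records)
golden_records = [golden_record(cluster) for cluster in clusters]
```

In this toy data, the three Acme variants fall into one cluster and Globex into another, so four source records yield two golden records. The Acme golden record fills the city and phone from whichever source records supply them.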

Review and Curate Results

When the data product run completes, review your golden records and source record clusters.

You can configure which attributes are included in the golden records table, and use review tools such as sorting, filtering, and table views to look over your data. Insight reports provide detail to help you understand your results.

You can verify records that you are sure are clustered correctly. Verified records are always clustered together, regardless of changes in the source data. If necessary, manually adjust record clusters and correct any incorrect or incomplete attribute values.

Publish Data

Export data to your storage systems to use in downstream applications. When you publish data, any data already published to the destination for the data product is overwritten.
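The overwrite behavior matters for downstream consumers, so here is a minimal sketch of what "overwritten" means in practice. It uses a local CSV file as a stand-in for a publish destination. The `publish` and `read_published` helpers and the file name are hypothetical and unrelated to Tamr's export mechanism.

```python
import csv
import tempfile
from pathlib import Path


def publish(records, destination: Path) -> None:
    """Write records to destination, replacing whatever was there before."""
    fieldnames = sorted({key for record in records for key in record})
    # Mode "w" truncates the file, so earlier output does not survive.
    with destination.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)


def read_published(destination: Path):
    """Read back the rows currently stored at destination."""
    with destination.open(newline="") as handle:
        return list(csv.DictReader(handle))


destination = Path(tempfile.mkdtemp()) / "golden_records.csv"

# First publish writes two records.
publish([{"id": "1", "name": "Acme Corp"}, {"id": "2", "name": "Globex"}], destination)

# Second publish replaces the file; the Globex row is gone afterwards.
publish([{"id": "1", "name": "Acme Corporation"}], destination)

remaining = read_published(destination)
```

Because each publish replaces the previous output instead of appending to it, a downstream application that needs history should copy or snapshot the destination before the next publish.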

Automate

Automate your process using the Jobs API.

Optional: Configuring Your Data Product for Tamr RealTime

If you are using the Tamr RealTime offering with this data product, see About Tamr RealTime for instructions on configuring data products for real-time use cases.