Managing Data Product Lifecycle
Best practices for moving a data product from development to testing, and then to production.
Typically, the data product lifecycle involves three stages:
- Development
- User acceptance testing (UAT)
- Production
Promoting a data product from development to UAT to production requires careful planning and a methodical approach to promoting changes. The steps in this topic describe how to manage the lifecycle of a data product in Tamr Cloud. Vetting and testing changes at each stage reduces risk and helps keep the production data product stable and reliable.
Important: Tamr IDs are not retained when promoting a data product from development to UAT, or from UAT to production.
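Because Tamr IDs change on promotion, any comparison of results between stages has to join on something other than the Tamr ID. The following is a minimal sketch, not a Tamr API: it assumes you have exported records from two stages as lists of dictionaries and that each record carries a stable key from the source system. The field names `source_key` and `tamr_id` and the function name are hypothetical.

```python
from collections import defaultdict


def map_ids_across_stages(earlier_records, later_records,
                          key_field="source_key", id_field="tamr_id"):
    """Map each Tamr ID in the earlier stage to the Tamr IDs that the same
    source records received in the later stage.

    Field names are assumptions for illustration, not a Tamr export schema.
    """
    # Index the later stage by the stable source key.
    later_by_key = {r[key_field]: r[id_field] for r in later_records}

    mapping = defaultdict(set)
    for record in earlier_records:
        later_id = later_by_key.get(record[key_field])
        if later_id is not None:
            mapping[record[id_field]].add(later_id)
    return dict(mapping)


# Example with made-up records: two DEV records share one ID and map to one UAT ID.
dev = [
    {"source_key": "a", "tamr_id": "d1"},
    {"source_key": "b", "tamr_id": "d1"},
    {"source_key": "c", "tamr_id": "d2"},
]
uat = [
    {"source_key": "a", "tamr_id": "u9"},
    {"source_key": "b", "tamr_id": "u9"},
    {"source_key": "d", "tamr_id": "u7"},
]
print(map_ids_across_stages(dev, uat))
```

An earlier-stage ID that maps to more than one later-stage ID indicates that the source records were grouped differently after promotion, which may be worth reviewing.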
Development Stage
This stage is critical for testing and refining your product's functionality within a controlled environment.
- Configure data connections.
  Configure a development connection and add the corresponding source datasets for the data product.
- Configure the data product flow.
  Add a new data product. As a best practice, include “DEV” in the name and description. Add and map the sources and apply any other necessary configuration. Run the data product.
- Review the results.
  When the data product run completes, open the data product. Validate your results by reviewing the golden records and source record clusters within your development data product.
- Verify the end-to-end pipeline.
  Confirm the functionality of the entire pipeline, from connection through to publishing, within the development environment.
- Implement and validate changes.
  Make any necessary changes within the data product to align with the final expected outcomes for UAT and production. This might include adjusting the connections, modifying the source data, or editing the data product configuration. Then, rerun the data product to validate and save the changes.
User Acceptance Testing (UAT) Stage
This stage is critical for conducting user acceptance tests to ensure that the product meets all requirements and operates as expected in a setting that closely mimics the production data product.
- Copy the DEV data product.
  Use the Save As option to save a new version of the DEV data product for UAT. As a best practice, include “TEST” in the name and description of the new data product.
  The Save As option creates a new data product; it does not replace the DEV data product.
- Configure and run the data product.
  Review the data product configuration and make any necessary adjustments. For example, you might need to remove data sources used for development purposes only or add new sources. Then run the data product.
- Review the results.
  When the data product run completes, open the data product. Validate your results by reviewing the golden records and source record clusters within your UAT data product.
- Verify the end-to-end pipeline.
  Confirm the functionality of the entire pipeline, from connection through to publishing, within the UAT environment.
- Perform user acceptance testing.
  Ask team members to review the data product and gather their feedback.
- Implement and validate changes.
  Make any necessary changes within the data product to align with the final expected outcomes for production. Then, rerun the data product to validate and save the changes.
Production Stage
Following successful validation in the UAT stage, the next step is to promote these changes to the production data product. This stage involves careful planning to ensure that the transition is smooth and does not impact existing users or systems negatively.
- Copy the TEST data product.
  Use the Save As option to save a new version of the TEST data product for production.
  The Save As option creates a new data product; it does not replace the TEST data product.
- Configure and run the data product flow.
  Review the data product configuration and make any necessary adjustments. For example, you might need to remove data sources used for testing purposes only or add new sources. Then run the data product.
- Validate the results.
  When the flow completes, open the data product. Validate your results by reviewing the golden records and source record clusters within your production data product.
- Verify the end-to-end pipeline.
  Confirm the functionality of the entire pipeline, from connection through to publishing, within the production environment.
- Perform ongoing monitoring and adjustment.
  On an ongoing basis, monitor the results of your data product runs and curation changes. Doing so helps you quickly identify and address any issues that arise, and helps maintain the long-term reliability and performance of the data product.
For example, you can create views to quickly review:
- Key accounts.
- Clusters with low uniformity scores.
- Similar clusters.
See Utilizing Review Tools for more information on views and filtering.
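If you also export cluster data for offline monitoring, a check similar to the "clusters with low uniformity scores" view can be approximated in a few lines. This is only a sketch under stated assumptions: the field names `cluster_id` and `uniformity_score`, the function name, and the 0.5 threshold are all hypothetical and do not reflect a documented Tamr export format or recommended cutoff.

```python
def low_uniformity_clusters(clusters, threshold=0.5,
                            score_field="uniformity_score"):
    """Return clusters whose score is below the threshold, lowest first.

    Clusters without a score are skipped. The field name and the default
    threshold are illustrative assumptions.
    """
    flagged = [
        c for c in clusters
        if c.get(score_field) is not None and c[score_field] < threshold
    ]
    return sorted(flagged, key=lambda c: c[score_field])


# Example with made-up data.
clusters = [
    {"cluster_id": "c1", "uniformity_score": 0.9},
    {"cluster_id": "c2", "uniformity_score": 0.2},
    {"cluster_id": "c3", "uniformity_score": 0.4},
    {"cluster_id": "c4"},  # no score available
]
for cluster in low_uniformity_clusters(clusters):
    print(cluster["cluster_id"], cluster["uniformity_score"])
```

Sorting lowest-first puts the clusters most likely to need curation at the top of the review list.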
You can also review Insight metrics to understand how your results have changed over time. See Gaining Insights with Data Product Metrics.
Updated 3 months ago