Adding an Entity Type

When adding an entity type, you select the appropriate template for your mastering use case. See Tamr Cloud Templates for details on each template.

Tamr Cloud creates the new entity type in Studio and creates the mastering flow for that entity type in Designer. This process may take several minutes to complete. The Designer flow includes two sample datasets that you can use to explore the flow, run it as a test, and view its output. You can then replace the sample datasets with your own datasets and modify the flow for your input data.

Before You Begin

  1. Ensure that your input datasets for the entity type meet the requirements.
  2. Provide your input datasets to Tamr. Your Tamr representative uploads these files to Google Cloud Storage (GCS) for you, and provides you with access to these files in Tamr Cloud through a shared connection.

Adding a New Entity Type

To add a new entity type:

  1. Navigate to Studio.
  2. Select Add Entity Type in the top right.
  3. Select the appropriate template for your use case, and then select Next.
  4. Enter a name and optional description for the entity type.
    The entity type name must be unique.
    When the mastered datasets for this entity type are published, spaces and hyphens in the name are converted to underscores.
  5. Select Add Entity Type.

Tamr Cloud creates the new entity type in Studio and creates the mastering flow for that entity type in Designer. This process may take several minutes to complete.
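
The naming rule in step 4 can be sketched in a few lines of Python, which may help when predicting the names of published datasets. This is an illustration only, not Tamr Cloud's implementation: the function name published_name is hypothetical, and the sketch assumes that each space or hyphen is replaced one-for-one by an underscore and that nothing else in the name changes.

```python
def published_name(entity_type_name: str) -> str:
    """Hypothetical helper: replace each space or hyphen with an underscore.

    Assumption: one-for-one replacement, with no other changes to the name.
    """
    return entity_type_name.replace(" ", "_").replace("-", "_")


print(published_name("Customer Accounts"))  # Customer_Accounts
print(published_name("b2b-customers EU"))   # b2b_customers_EU
```

Under this assumption, two names that differ only by a space versus a hyphen (for example, "b2b customers" and "b2b-customers") would map to the same published name, so choosing visibly distinct names avoids confusion downstream.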

Running the Mastering Flow with Sample Data

To run the mastering flow and explore Designer:

  1. Navigate to Designer.
  2. Select the entity type tile to open the flow.
  3. Explore the steps created by the template:
    • Select each step to open it and view its configuration. Note that the Add Data step includes two sample datasets.
    • Return to the flow by selecting the back arrow next to the step description at the top of the screen.
  4. Select Run Flow to run the mastering flow with the sample datasets.
    Note: Running a flow takes approximately 20-30 minutes.
  5. Monitor the job status:
    • In Designer, the status for each step changes as the flow runs. The overall flow status is available at the bottom of the page.
    • In Admin > Jobs, the Jobs table contains details for each Tamr Cloud job.
  6. When the flow completes, view the mastered data in Studio:
    • Navigate to Studio. The metrics on the tile for this entity type update to reflect the mastered data. (See Viewing the Latest Entity Type Data for more details.)
    • Select the entity type to view the mastered entities and fields.

Modifying the Flow for Your Input Data and Downstream Requirements

See Modifying Mastering Flows for Your Data.