Mapping Source Columns to a Unified Schema
You map columns in your source records to the attributes in the industry-standard schema for your selected data product.
When you create a data product, Tamr Cloud automatically creates predefined output attributes for the unified schema. After adding source datasets to a flow, map the columns from these datasets to the unified output attributes.
- You must map at least one input field to each required unified attribute for your data product. If a required output field has no mapping, the flow will not run. See the Data Product Templates topics for the required attributes for each data product.
- You can add new output fields if your input dataset includes a field not represented by the predefined output fields.
- A single input field can be mapped to multiple output fields.
- Do not map multiple fields from the same dataset to the same output field.
- Multiple input fields from different datasets can be mapped to the same output field.
- If your input datasets contain fields that you do not want to include in the unified schema, you do not need to map those fields to an output field.
Important: Do not delete or rename the predefined output fields in the right pane.
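The mapping rules above can be expressed as a small pre-flight check. The following is a minimal sketch, not part of Tamr Cloud: the function `check_mappings`, the `(dataset, input_field, output_attribute)` tuple layout, and all dataset and field names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def check_mappings(mappings, required_attributes):
    """Return a list of problems; an empty list means the rules hold.

    mappings: iterable of (dataset, input_field, output_attribute) tuples
              (a hypothetical layout, not a Tamr Cloud format).
    """
    problems = []

    # Rule: every required output attribute needs at least one mapping.
    mapped_outputs = {output for _, _, output in mappings}
    for attribute in sorted(required_attributes):
        if attribute not in mapped_outputs:
            problems.append(f"required attribute '{attribute}' has no mapping")

    # Rule: do not map multiple fields from the same dataset to one output.
    fields_by_target = defaultdict(set)
    for dataset, field, output in mappings:
        fields_by_target[(dataset, output)].add(field)
    for (dataset, output), fields in sorted(fields_by_target.items()):
        if len(fields) > 1:
            problems.append(
                f"dataset '{dataset}' maps {sorted(fields)} to the same output '{output}'"
            )
    return problems


mappings = [
    ("crm", "company", "company_name"),
    ("crm", "company", "legal_name"),     # one input field, two outputs: allowed
    ("erp", "org_name", "company_name"),  # different dataset, same output: allowed
    ("erp", "alt_name", "company_name"),  # same dataset, same output: not allowed
]
problems = check_mappings(mappings, required_attributes={"company_name", "company_id"})
for problem in problems:
    print(problem)
```

In this example the check reports two problems: the required attribute `company_id` has no mapping, and the `erp` dataset maps two of its fields to `company_name`.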
Updating Schema Mapping
- Open the data product from the home page.
- Select the Configure Flow page.
- Select the Align to Customer Data Model (Schema Mapping) step.
- To change the input datasets for the step, select Settings to open the Edit Step dialog.
- In the Edit Step dialog, add your input datasets:
  - In the Input section, select Add.
  - From the Dataset dropdown, select your input dataset.
  - Repeat for each of the input datasets that you added to your flow.
  - Select Update.
- On the Align source fields page, map input fields to output fields as follows:
  - To map a source column, drag it from the input fields panel (left) to the appropriate attribute in the output fields panel (right).
  - To map columns automatically, select the source dataset columns and then select Actions > AutoMap. See AutoMapping Fields below for more detail.
  - To remove a mapping, select the input column in the input fields panel (left) and then select Actions > Unmap.
  - To add a new output attribute, drag and drop a field from the input fields panel (left) to the Add New section at the top of the output fields panel (right).
    You can also add a new output attribute by selecting Actions > Create in the output panel (right) and entering a name for the attribute. Then, drag and drop a column from the input fields panel (left) to the new attribute in the output fields panel.
    Note: Attribute names can contain only alphanumeric characters and underscores.
  - To remove an output attribute, select it in the output fields panel (right) and then select Actions > Remove. The output attribute and its related mappings are removed.
  Tip: You can sort columns and attributes in ascending or descending order to more easily find specific fields. To sort, select the up or down arrow at the top of the field column.
- When you have finished the schema mapping updates, navigate back to the flow by selecting the back arrow next to the step description.
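The naming rule in the note above (attribute names can contain only alphanumeric characters and underscores) can be checked before you create an attribute. This is a minimal sketch under two assumptions: that "alphanumeric" means ASCII letters and digits, and that a name must not be empty. Both helper functions are hypothetical and are not part of Tamr Cloud.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
VALID_ATTRIBUTE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_attribute_name(name: str) -> bool:
    """True if the whole name consists of letters, digits, and underscores."""
    return VALID_ATTRIBUTE_NAME.fullmatch(name) is not None


def suggest_attribute_name(column: str) -> str:
    """Hypothetical helper: replace each run of disallowed characters with
    one underscore, then trim leading and trailing underscores."""
    return re.sub(r"[^A-Za-z0-9_]+", "_", column).strip("_")


print(is_valid_attribute_name("address_line_1"))   # True
print(is_valid_attribute_name("Address Line (1)")) # False
print(suggest_attribute_name("Address Line (1)"))  # Address_Line_1
```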
AutoMapping Fields
The AutoMap feature can help you quickly map source columns to appropriate attributes in the unified schema, by:
- Identifying source columns that match previously mapped source columns. AutoMap applies the same mapping for the matching columns.
- Identifying source columns that match unified schema attributes. AutoMap maps these columns to their matching unified schema attributes.
AutoMap considers columns and attributes to be a match when they contain the same words once delimiters are ignored. Plural forms and partial matches are not treated as matches. Recognized delimiters include camel-case boundaries; a name written entirely in lowercase letters is not split into words. AutoMap does not map two columns from a single source to the same output attribute. Additionally, unlike the Map option, AutoMap does not create an output attribute if no match is found for the selected column.
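The behavior described above can be sketched in a few lines. This is an illustration only, not Tamr's implementation: the function `automap`, its data layout, and the sample names are hypothetical. Name matching is delegated to a `key` function, and the default (`str.lower`) is a deliberately crude stand-in for the matching rules described in this section. Checking previously mapped columns before schema attributes is also an assumption; the documentation does not state an order.

```python
def automap(selected, attributes, mappings, key=str.lower):
    """Return new mappings {(source, column): attribute} for the selected columns.

    selected:   list of (source, column) pairs chosen by the user
    attributes: names of existing unified schema attributes
    mappings:   existing mappings, {(source, column): attribute}
    key:        turns a name into a comparable match key (placeholder logic)
    """
    attribute_by_key = {key(name): name for name in attributes}
    previous_by_key = {key(col): attr for (_, col), attr in mappings.items()}
    # (source, attribute) pairs already in use, to enforce one column per source.
    taken = {(src, attr) for (src, _), attr in mappings.items()}

    new_mappings = {}
    for source, column in selected:
        if (source, column) in mappings:
            continue  # already mapped
        k = key(column)
        # Assumed order: reuse a previous mapping first, then match an attribute.
        target = previous_by_key.get(k) or attribute_by_key.get(k)
        if target is None:
            continue  # no match: AutoMap does not create an output attribute
        if (source, target) in taken:
            continue  # never two columns from one source on the same attribute
        new_mappings[(source, column)] = target
        taken.add((source, target))
    return new_mappings


result = automap(
    selected=[("erp", "cust_region"), ("erp", "REGION"),
              ("erp", "Company_ID"), ("erp", "fax")],
    attributes=["company_id", "region"],
    mappings={("crm", "cust_region"): "region"},
)
print(result)
```

Here `cust_region` reuses the mapping already made for the `crm` source, `REGION` is skipped because the `erp` source already has a column on `region`, and `fax` is skipped because nothing matches it.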
Two names are considered a match if the names are an exact match when split on:
- The following characters:
- \ _ ( ) / \
- The boundary between lowercase and uppercase letters.
- The boundary between letters and numbers.
- Whitespace characters.
Case | Example names | Resulting Action |
---|---|---|
Match cases | addressLine1, address_line_1, and Address Line (1) | Match. These 3 columns would be mapped to the unified schema attribute address_line_1. |
Delimiters accepted | company_id and company id | Match. These 2 columns would be mapped to the unified schema attribute company_id. |
Delimiters not accepted | address\|Line\|1 and addressLine1 | No match, no mapping. Pipe character delimiters are not accepted. |
Plurals | region and regions | No match, no mapping. |
Partials | Primary Street Address and Street Address | No match, no mapping. |
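The splitting rules and the table above can be reproduced with a short tokenizer. This is a minimal sketch rather than Tamr's implementation, and it rests on two assumptions: that the delimiter characters are hyphen, underscore, parentheses, forward slash, and backslash, and that words are compared without regard to case (which the `Address Line (1)` example suggests). The function names are hypothetical.

```python
import re

# Assumed delimiter set: - _ ( ) / \ plus whitespace. The pipe is not included.
DELIMITERS = re.compile(r"[-_()/\\\s]+")


def split_name(name: str) -> list[str]:
    """Split a column or attribute name into lowercase words."""
    # Boundary between a lowercase and an uppercase letter (camel case).
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    # Boundary between letters and numbers, in either direction.
    spaced = re.sub(r"(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])", " ", spaced)
    # Assumption: comparison ignores case, so words are lowercased.
    return [word.lower() for word in DELIMITERS.split(spaced) if word]


def names_match(a: str, b: str) -> bool:
    """Two names match only if their word lists are identical."""
    return split_name(a) == split_name(b)


print(split_name("Address Line (1)"))                            # ['address', 'line', '1']
print(names_match("addressLine1", "address_line_1"))             # True
print(names_match("address|Line|1", "addressLine1"))             # False
print(names_match("region", "regions"))                          # False
print(names_match("Primary Street Address", "Street Address"))   # False
```

Because the comparison is on whole word lists, plurals and partial names fail to match without any special handling.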