Automatically Cleaned Data Values

The data cleaning feature in data product configuration enables you to replace known bad values in your source data with null when Tamr finds an exact, case-insensitive match for an attribute value. This helps ensure that these values are not used for matching or included in your golden records.

In addition to the values you specify for data cleaning, Tamr automatically replaces the following values with null wherever it identifies an exact, case-insensitive match in the mapped source record fields:

  • ABSENT
  • BLANK
  • CC ONLY DO NOT INACTIVATE
  • CCCCCCCCCCC
  • CCCCCCCCCCCC
  • CCCCCCCCCCCCC
  • COPYFROM
  • COPY FROM
  • DO NOT INACTIVATE
  • DO NOT DEACTIVATE
  • EMPTY
  • INVALID
  • MISSING
  • N/A
  • NA
  • NAN
  • NIL
  • NO CO SPECIFIED
  • NO DATA
  • NO INFO
  • NO STREET
  • NONE
  • NONEXISTENT
  • NOT APPLICABLE
  • NOT AVAILABLE
  • NOT FOUND
  • NOT PROVIDED
  • NOT PROVIDED ACCOUNT - UNKNOWN
  • NOT SET
  • NOT VALID
  • NULL
  • [email protected]
  • STREET
  • TBD
  • UNAVAILABLE
  • UNDEFINED
  • UNKNOWN
  • VACANT
  • [[UNKNOWN]]
  • [UNKNOWN STREET]
  • [UNKNOWN STREET)
  • UNKNOWN VALUE
  • UNKNOWN ZIP
  • UNRESOLVED
  • UNSET
  • UNSPECIFIED
  • VOID

Example

Consider this example, in which Tamr automatically replaces attribute values of STREET with null.

The following attribute values are replaced with null:

  • street
  • Street
  • STREET

The following values are not cleaned; the full value remains in the attribute:

  • streets
  • 123 main street
  • ELM STREET
  • St.
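
The matching rule in this example can be sketched in a few lines of Python. This is only an illustration of "exact, case-insensitive match", not Tamr's implementation: the names AUTO_CLEANED and clean_value are hypothetical, the set holds just a small subset of the values listed above, and the sketch assumes that surrounding whitespace is not trimmed before comparison.

```python
from typing import Optional

# Hypothetical subset of the automatically cleaned values (stored in uppercase).
AUTO_CLEANED = {"STREET", "N/A", "UNKNOWN", "NOT PROVIDED"}


def clean_value(value: Optional[str]) -> Optional[str]:
    """Return None when the entire value matches a cleaned value, ignoring case.

    Values that merely contain a cleaned value (for example "ELM STREET")
    are returned unchanged, because only exact matches are replaced.
    """
    if value is None:
        return None
    # Compare the whole string, uppercased, against the cleaned values.
    if value.upper() in AUTO_CLEANED:
        return None
    return value


if __name__ == "__main__":
    samples = ["street", "Street", "STREET", "streets",
               "123 main street", "ELM STREET", "St."]
    for sample in samples:
        print(repr(sample), "->", repr(clean_value(sample)))
```

In this sketch the first three samples become None (null), while the remaining four keep their full value, which mirrors the two lists in the example above.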