Publishing the Tamr RealTime Datastore

You can use the Jobs API create operation to publish the records or relationships from the RealTime datastore to a configured S3, ADLS2, GCS, BigQuery or Snowflake publish destination.

To publish the datastore via the API, first complete these prerequisites:

  1. Work with Tamr to configure a publish destination. Contact Tamr Support for assistance.
  2. Obtain the pub_ ID for the publish destination from Tamr.

Format for Published Records

Publishing the RealTime datastore exports the records to the configured destination as tables, multi-part CSV files, or multi-part NDJSON files. CSV and NDJSON files are UTF-8 encoded and are named part-<ID>.csv and part-<ID>.ndjson, respectively.

Each output includes the following columns:

  • recordId
  • tableId
  • versionId
  • data
  • createdMs (milliseconds since Unix epoch)
  • updatedMs (milliseconds since Unix epoch)
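
As an illustration of consuming this output, the following minimal Python sketch reads every part-<ID>.ndjson file from a local copy of the publish destination and converts the two timestamp columns to UTC datetimes. The function names, the local directory, and the added createdAt and updatedAt keys are hypothetical and not part of the Tamr API. The sketch also assumes that each NDJSON line is a single JSON object keyed by the column names listed above.

```python
import glob
import json
import os
from datetime import datetime, timedelta, timezone

# Reference point for the createdMs / updatedMs columns.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_datetime(ms):
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return UNIX_EPOCH + timedelta(milliseconds=int(ms))


def read_published_records(directory):
    """Yield one dict per record from every part-<ID>.ndjson file in directory.

    Assumption: the files were already downloaded from the publish
    destination to a local directory.
    """
    pattern = os.path.join(directory, "part-*.ndjson")
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # tolerate blank lines between records
                row = json.loads(line)
                # Hypothetical convenience keys, added by this sketch only.
                row["createdAt"] = ms_to_datetime(row["createdMs"])
                row["updatedAt"] = ms_to_datetime(row["updatedMs"])
                yield row
```

Because the function is a generator, it processes one line at a time, so a large multi-part export does not have to fit in memory.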

Format for Published Relationships

Publishing relationships from the RealTime datastore exports them to the configured destination as tables, multi-part CSV files, or multi-part NDJSON files. CSV and NDJSON files are UTF-8 encoded and are named part-<ID>.csv and part-<ID>.ndjson, respectively.

Each output includes the following columns:

  • relationshipId
  • fromTableId
  • fromRecordId
  • toTableId
  • toRecordId
  • relationshipTypeId
  • versionId
  • relationshipDetails
  • createdMs (milliseconds since Unix epoch)
  • updatedMs (milliseconds since Unix epoch)
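
The relationship columns above describe directed edges between records, so one way to use the CSV output is to group relationships by their source record. The Python sketch below does this for a local copy of the part-<ID>.csv files. The function name and the local directory are hypothetical. The sketch also assumes that every part file begins with a header row containing the column names listed above; if your files have no header, pass the column names to csv.DictReader through its fieldnames argument.

```python
import csv
import glob
import os
from collections import defaultdict


def load_relationships(directory):
    """Group published relationships by their source record.

    Returns a dict that maps (fromTableId, fromRecordId) to a list of
    (relationshipTypeId, (toTableId, toRecordId)) tuples.
    Assumption: each part-<ID>.csv file starts with a header row.
    """
    outgoing = defaultdict(list)
    pattern = os.path.join(directory, "part-*.csv")
    for path in sorted(glob.glob(pattern)):
        # newline="" lets the csv module handle quoted line breaks itself.
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                source = (row["fromTableId"], row["fromRecordId"])
                target = (row["toTableId"], row["toRecordId"])
                outgoing[source].append((row["relationshipTypeId"], target))
    return outgoing
```

Keying on the pair of table ID and record ID, rather than on the record ID alone, avoids making any assumption about whether record IDs are unique across tables.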