Publishing the Tamr RealTime Datastore

You can use the Jobs API create operation to publish the dataset from the RealTime datastore to a configured AWS S3 publish destination.

Before you can publish the dataset via the API, complete these prerequisites:

  1. Work with Tamr to configure an AWS S3 publish destination. Contact Tamr Support for assistance.
  2. Obtain the pub_ ID for the publish destination from Tamr.

File Format for Published Dataset

Publishing the RealTime datastore exports the data as multi-part newline-delimited JSON (NDJSON) files, written to a directory in the configured S3 destination.

The files are UTF-8 encoded and are named part-<ID>.json. Each file includes the following columns:

Column       Data Type
recordId     string
tableId      string
versionId    integer
data         stringified JSON
createdMs    integer (milliseconds since Unix epoch)
updatedMs    integer (milliseconds since Unix epoch)
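
The following Python sketch shows one way to read a published part file once you have downloaded it from S3. It assumes that each NDJSON line is a JSON object keyed by the column names listed above. The function names, the added createdAt and updatedAt fields, and the sample row are hypothetical and used only for illustration. Because data is stringified JSON, it is decoded a second time.

```python
import json
from datetime import datetime, timezone


def parse_published_line(line: str) -> dict:
    """Parse one NDJSON line from a part-<ID>.json file (hypothetical helper)."""
    row = json.loads(line)
    # `data` is stringified JSON, so it needs a second decode.
    row["data"] = json.loads(row["data"])
    # createdMs and updatedMs are integer milliseconds since the Unix epoch.
    # Add timezone-aware datetimes next to them (createdAt, updatedAt).
    for ms_key, dt_key in (("createdMs", "createdAt"), ("updatedMs", "updatedAt")):
        row[dt_key] = datetime.fromtimestamp(row[ms_key] / 1000, tz=timezone.utc)
    return row


def read_part_file(path: str):
    """Yield parsed rows from one downloaded, UTF-8 encoded part file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():  # skip blank lines, such as a trailing newline
                yield parse_published_line(line)


# Hypothetical sample row; real values come from your published dataset.
sample_line = json.dumps({
    "recordId": "rec-1",
    "tableId": "tbl-1",
    "versionId": 3,
    "data": json.dumps({"name": "Acme"}),
    "createdMs": 1000,
    "updatedMs": 2000,
})

parsed = parse_published_line(sample_line)
print(parsed["data"]["name"], parsed["createdAt"].isoformat())
```

To process a whole export, call read_part_file on each part-<ID>.json file in the published directory and combine the results.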