You can schedule the following recurring jobs:
- Load source: This job refreshes the data in Tamr Cloud for a specified source dataset. For example, you might schedule a recurring job to refresh the data stored in a specific Snowflake table each Monday morning.
- Update data product or workflow: This job runs the specified data product or workflow. For example, you might schedule a recurring job to run the mastering flow each Monday afternoon. You can schedule both legacy and non-legacy data product runs.
- Publish: This job publishes data to a specified destination. For example, you might schedule a recurring job to publish data each Monday evening.
Specifying Job Time and Frequency
When configuring a scheduled job:
- You must leave at least 60 minutes between jobs for the same object.
- Specify the time zone. This must be a valid time zone from the IANA Time Zone Database (for example, America/New_York), and the value is case-sensitive.
- Specify the recurring timetable, following Unix-style crontab format. Valid crontab values are:
  - Minute: 0-59
  - Hour: 0-23
  - Day: 1-31
  - Month: 1-12 or JAN-DEC
  - Week day: 0-7 (0 and 7 are both Sunday) or SUN-SAT
- In addition, you can match multiple values, as follows:
  - Match all values: *
  - Match a range of values: 1-5
  - Match every nth value in a range, using a step: */2 (every second value)
  - Match a list of values: 1,2,3 (comma separated, no spaces)
Pausing Scheduled Jobs
If needed, you can pause scheduled jobs. See update for more information.
Job Notifications
The user who scheduled the job receives notifications when the job runs and completes.
Error Handling
Like any event in a distributed system, scheduled events may fail to be submitted if an underlying service is unavailable. The system automatically retries failed scheduled events after a backoff period of one or more minutes.
The history of all scheduled events, including failed ones, can be queried using the ScheduleEvents API by calling SchedulesEvents list.
Jobs launched by scheduled events may also fail. Job status is available from the Schedules Events API or directly from the Jobs API.
If the scheduled job cannot run because too many other jobs are already running, the scheduled job queues until capacity is available.
Schedule History
You can review the event history for a scheduled job, including the job time and status. See list events for more information.