Schedule cron-based pipelines
Schedules enable automated execution of jobs at specified intervals. These intervals can range from common frequencies like hourly, daily, or weekly, to more intricate patterns defined using cron expressions.
Prerequisites
To follow the steps in this guide, you'll need:
- Familiarity with Assets
- Familiarity with Ops and Jobs
Basic schedule
A basic schedule is defined by a JobDefinition and a cron_schedule using the ScheduleDefinition class. A job can be thought of as a selection of assets or operations executed together.
Run schedules in a different timezone
By default, schedules without a timezone will run in Coordinated Universal Time (UTC). To run a schedule in a different timezone, set the timezone parameter:
daily_schedule = ScheduleDefinition(
    job=daily_refresh_job,
    cron_schedule="0 0 * * *",
    timezone="America/Los_Angeles",
)
Create schedules from partitions
If you use partitions with jobs, you can create a schedule from the partitioned job with build_schedule_from_partitioned_job. The schedule will execute at the same cadence specified by the partition definition.
If you have a partitioned asset and job:
If you have a partitioned op job:
Next steps
By understanding and effectively using these automation methods, you can build more efficient data pipelines that respond to your specific needs and constraints:
- Learn more about schedules in Understanding automation
- React to events with sensors
- Explore Declarative Automation as an alternative to schedules