Run executors
Executors are responsible for executing steps within a job run. Once a run has launched and the process for the run (the run worker) is allocated and started, the executor assumes responsibility for execution.
Executors can range from single-process serial executors to managing per-step computational resources with a sophisticated control plane.
Relevant APIs
Name | Description |
---|---|
@dg.executor | The decorator used to define executors. Defines an ExecutorDefinition . |
ExecutorDefinition | An executor definition. |
Specifying executors
Directly on jobs
Every job has an executor. The default executor is the multi_or_in_process_executor
, which by default executes each step in its own process. This executor can be configured to execute each step within the same process.
An executor can be specified directly on a job by supplying an ExecutorDefinition
to the executor_def
parameter of @dg.job
or GraphDefinition
:
from dagster import graph, job, multiprocess_executor
# Providing an executor using the job decorator
@job(executor_def=multiprocess_executor)
def the_job(): ...
@graph
def the_graph(): ...
# Providing an executor using graph_def.to_job(...)
other_job = the_graph.to_job(executor_def=multiprocess_executor)
For a code location
To specify a default executor for all jobs and assets provided to a code location, supply the executor
argument to the Definitions
object.
If a job explicitly specifies an executor, then that executor will be used. Otherwise, jobs that don't specify an executor will use the default provided to the code location:
from dagster import multiprocess_executor, define_asset_job, asset, Definitions
@asset
def the_asset():
pass
asset_job = define_asset_job("the_job", selection="*")
@job
def op_job(): ...
# op_job and asset_job will both use the multiprocess_executor,
# since neither define their own executor.
defs = Definitions(
assets=[the_asset], jobs=[asset_job, op_job], executor=multiprocess_executor
)
Executing a job via JobDefinition
overrides the job's executor and uses in_process_executor
instead.
Example executors
Name | Description |
---|---|
in_process_executor | Execution plan executes serially within the run worker itself. |
multiprocess_executor | Executes each step within its own spawned process. Has a configurable level of parallelism. |
dask_executor | Executes each step within a Dask task. |
celery_executor | Executes each step within a Celery task. |
docker_executor | Executes each step within an ephemeral Kubernetes pod. |
k8s_job_executor | Executes each step within an ephemeral Kubernetes pod. |
celery_k8s_job_executor | Executes each step within a ephemeral Kubernetes pod, using Celery as a control plane for prioritization and queuing. |
celery_docker_executor | Executes each step within a Docker container, using Celery as a control plane for prioritization and queueing. |
Custom executors
The executor system is pluggable, meaning it's possible to write your own executor to target a different execution substrate. Note that this is not currently well-documented and the internal APIs continue to be in flux.