Unit testing assets and ops
Unit testing is essential for ensuring that computations function as intended. In the context of data pipelines, this can be particularly challenging. However, Dagster streamlines the process by enabling direct invocation of computations with specified input values and mocked resources, making it easier to verify that data transformations behave as expected.
While unit tests can't fully replace integration tests or manual review, they can catch a variety of errors with a significantly faster feedback loop.
This guide covers how to write unit tests for assets and ops with different input requirements.
Prerequisites
To follow the steps in this guide, you'll need familiarity with:
Before you start
Before you begin implementing unit tests, note that:
- Testing individual assets or ops is generally recommended over unit testing entire jobs.
- Unit testing isn't recommended when most of the business logic lives in an external system, such as an asset that directly invokes an external Databricks job.
Assets and ops without arguments
The simplest assets and ops to test are those with no arguments. In these cases, you can invoke the definition directly, as you would a regular Python function.
Assets and ops with upstream dependencies
If an asset or op has an upstream dependency, you can directly pass a value for that dependency when invoking the definition.
Assets and ops with config
If an asset or op uses config, you can construct an instance of the required config object and pass it in directly.
Assets and ops with resources
If an asset or op uses a resource, it can be useful to create a mock instance of the resource to avoid interacting with external services.
Assets and ops with context
If an asset or op uses a context argument, you can use build_asset_context() or build_op_context() to construct a context object.