Serverless runtime environment

By default, Dagster+ Serverless packages your code as PEX files and deploys them on Docker images. Using PEX files significantly reduces deploy time because it avoids building a new Docker image and provisioning a new container for every code change. However, you can customize the Serverless runtime environment in several ways:

Add dependencies

To add dependencies, list the corresponding Python packages in the install_requires section of your Dagster project's setup.py file. Each entry should be a PEP 508 dependency specifier.

Example setup.py
from setuptools import find_packages, setup

setup(
    name="quickstart_etl",
    packages=find_packages(exclude=["quickstart_etl_tests"]),
    install_requires=[
        "dagster",
        # when possible, add additional dependencies in setup.py
        "boto3",
        "pandas",
        "matplotlib",
    ],
    extras_require={"dev": ["dagster-webserver", "pytest"]},
)

You can also install a dependency from a tarball, which is useful when pip is unable to resolve a package using dependency_links. For example, soda and soda-snowflake provide tarballs that you can reference in the install_requires section:

from setuptools import find_packages, setup

setup(
    name="quickstart_etl",
    packages=find_packages(exclude=["quickstart_etl_tests"]),
    install_requires=[
        "dagster",
        "boto3",
        "pandas",
        "matplotlib",
        "soda @ https://pypi.cloud.soda.io/packages/soda-1.6.2.tar.gz",
        "soda-snowflake @ https://pypi.cloud.soda.io/packages/soda_snowflake-1.6.2.tar.gz",
    ],
    extras_require={"dev": ["dagster-webserver", "pytest"]},
)
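
The tarball entries above use the PEP 508 direct-reference form, `name @ url`. As a minimal sketch, a small helper can keep the package name and URL paired when building such entries. The helper name `direct_reference` and the `SODA_INDEX` constant are hypothetical and are not part of setuptools:

```python
def direct_reference(name: str, url: str) -> str:
    """Build a PEP 508 direct-reference requirement of the form 'name @ url'."""
    return f"{name} @ {url}"


# Base URL taken from the example above; the constant name is illustrative.
SODA_INDEX = "https://pypi.cloud.soda.io/packages"

install_requires = [
    "dagster",
    direct_reference("soda", f"{SODA_INDEX}/soda-1.6.2.tar.gz"),
    direct_reference("soda-snowflake", f"{SODA_INDEX}/soda_snowflake-1.6.2.tar.gz"),
]
```

The resulting list could be passed as the install_requires argument to setup().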

To add a package from a private GitHub repository, see Use private Python packages.

Use a different Python version

The default Python version for Dagster+ Serverless is Python 3.9. Python versions 3.10 through 3.12 are also supported. You can specify the Python version you want to use in your GitHub or GitLab workflow, or by using the dagster-cloud CLI.

In your .github/workflows/deploy.yml file, update the PYTHON_VERSION environment variable with your desired Python version:

Updating the Python version in deploy.yml
env:
  DAGSTER_CLOUD_URL: "http://jamie-test-1.canary.dagster.cloud"
  DAGSTER_CLOUD_API_TOKEN: ${{ secrets.DAGSTER_CLOUD_API_TOKEN }}
  ENABLE_FAST_DEPLOYS: 'true'
  PYTHON_VERSION: '3.11'
  DAGSTER_CLOUD_FILE: 'dagster_cloud.yaml'
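
Before changing PYTHON_VERSION, it can help to confirm that your local interpreter falls in the supported range, so that local runs match the deployed environment. The following is a minimal sketch based only on the versions listed above (3.9 through 3.12); the function name `is_supported_python` is hypothetical and not part of any Dagster package:

```python
import sys

# Minor versions listed as supported in this section: 3.9 (default) through 3.12.
SUPPORTED_MINORS = range(9, 13)


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, ...) tuple is in the supported range."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in SUPPORTED_MINORS


if not is_supported_python():
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is outside the range listed for Serverless")
```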

Use a different base image

Dagster+ runs your code on a Docker image that we build as follows:

  • The standard Python "slim" Docker image, such as python:3.8-slim, is used as the base.
  • The dagster-cloud[serverless] module is installed in the image.

You can add dependencies in your setup.py file, but when that is not possible you can build and upload a custom base image that will be used to run your Python code:

Note: Setting a custom base image isn't supported for GitLab CI/CD workflows out of the box, but you can write a custom GitLab CI/CD YAML file that implements the manual steps described below.

  1. Include dagster-cloud[serverless] as a dependency in your Docker image by adding the following line to your Dockerfile:

    RUN pip install "dagster-cloud[serverless]"
  2. Build your Docker image, using your usual Docker toolchain.

  3. Upload your Docker image to Dagster+ using the upload-base-image command. This command will print out the tag used in Dagster+ to identify your image:

    $ dagster-cloud serverless upload-base-image local-image:tag

    ...
    To use the uploaded image run: dagster-cloud serverless deploy-python-executable ... --base-image-tag=sha256_518ad2f92b078c63c60e89f0310f13f19d3a1c7ea9e1976d67d59fcb7040d0d6
  4. Specify this base image tag in your GitHub workflow, or by using the dagster-cloud CLI:

    In your .github/workflows/deploy.yml file, add the SERVERLESS_BASE_IMAGE_TAG environment variable and set it to the tag printed out in the previous step:

    Setting a custom base image in deploy.yml
    env:
      DAGSTER_CLOUD_URL: "http://jamie-test-1.canary.dagster.cloud"
      DAGSTER_CLOUD_API_TOKEN: ${{ secrets.DAGSTER_CLOUD_API_TOKEN }}
      SERVERLESS_BASE_IMAGE_TAG: "sha256_518ad2f92b078c63c60e89f0310f13f19d3a1c7ea9e1976d67d59fcb7040d0d6"
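
If you script steps 3 and 4, you need to carry the tag from the upload-base-image output into your workflow configuration. The sketch below pulls the tag out of captured output with a regular expression. It assumes the output contains a `--base-image-tag=` token as in the sample line shown in step 3; the function name `extract_base_image_tag` is hypothetical, and the exact output format may differ between CLI versions:

```python
import re
from typing import Optional

# Matches the tag that follows "--base-image-tag=" up to the next whitespace.
TAG_PATTERN = re.compile(r"--base-image-tag=(\S+)")


def extract_base_image_tag(cli_output: str) -> Optional[str]:
    """Return the base image tag found in the CLI output, or None if absent."""
    match = TAG_PATTERN.search(cli_output)
    return match.group(1) if match else None


# Sample line copied from the output shown in step 3.
sample_output = (
    "To use the uploaded image run: dagster-cloud serverless "
    "deploy-python-executable ... "
    "--base-image-tag=sha256_518ad2f92b078c63c60e89f0310f13f19d3a1c7ea9e1976d67d59fcb7040d0d6"
)
print(extract_base_image_tag(sample_output))
```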

Include data files

To add data files to your deployment, use the data files support built into setuptools. This requires adding a package_data or include_package_data keyword to the setup() call in setup.py. For example, given this directory structure:

- setup.py
- quickstart_etl/
  - __init__.py
  - definitions.py
  - data/
    - file1.txt
    - file2.csv

If you want to include the data folder, modify your setup.py to add the package_data line:

Loading data files in setup.py
from setuptools import find_packages, setup

setup(
    name="quickstart_etl",
    packages=find_packages(exclude=["quickstart_etl_tests"]),
    # Here "data/*" is relative to the quickstart_etl sub directory.
    package_data={"quickstart_etl": ["data/*"]},
    install_requires=["dagster"],
)
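
Once the data files are packaged, your code still needs to locate them at runtime without relying on the current working directory. One way to do this, sketched here under the assumption that you run Python 3.9 or later, is the standard library's importlib.resources. The helper name `read_packaged_text` is hypothetical:

```python
from importlib.resources import files


def read_packaged_text(package: str, *parts: str) -> str:
    """Read a text file that ships inside an installed package.

    `parts` are the path segments relative to the package directory.
    """
    resource = files(package)
    for part in parts:
        resource = resource.joinpath(part)
    return resource.read_text(encoding="utf-8")


# Usage inside the project, assuming the directory layout shown above:
# contents = read_packaged_text("quickstart_etl", "data", "file1.txt")
```

Resolving the file through the package name keeps the lookup independent of where the code is unpacked in the deployed environment.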

Disable PEX deploys

You have the option to disable PEX-based deploys and deploy using a Docker image instead of PEX. You can disable PEX in your GitHub or GitLab workflow, or by using the dagster-cloud CLI.

In your .github/workflows/deploy.yml file, update the ENABLE_FAST_DEPLOYS environment variable to false:

Disable PEX deploys in deploy.yml
env:
  DAGSTER_CLOUD_URL: "http://jamie-test-1.canary.dagster.cloud"
  DAGSTER_CLOUD_API_TOKEN: ${{ secrets.DAGSTER_CLOUD_API_TOKEN }}
  ENABLE_FAST_DEPLOYS: 'false'

You can customize the Docker image using lifecycle hooks or by customizing the base image.

Lifecycle hooks are the easiest method to set up and don't require any additional infrastructure.

In the root of your repo, you can provide two optional shell scripts: dagster_cloud_pre_install.sh and dagster_cloud_post_install.sh. These will run before and after Python dependencies are installed. They're useful for installing any non-Python dependencies or otherwise configuring your environment.

Use private Python packages

If you use PEX deploys in your workflow (ENABLE_FAST_DEPLOYS: 'true'), the following steps can install a package from a private GitHub repository, e.g. my-org/private-repo, as a dependency:

  1. In your deploy.yml file, add the following to the top of the steps: section in the dagster-cloud-default-deploy job:

    - name: Checkout internal repository
      uses: actions/checkout@v3
      with:
        token: ${{ secrets.GH_PAT }}
        repository: my-org/private-repo
        path: deps/private-repo
        ref: some-branch # optional to check out a specific branch

    - name: Build a wheel
      # adjust the `cd` command to cd into the directory with setup.py
      run: >
        cd deps/private-repo &&
        python setup.py bdist_wheel &&
        mkdir -p $GITHUB_WORKSPACE/deps &&
        cp dist/*whl $GITHUB_WORKSPACE/deps

    # If you have multiple private packages, repeat the two steps above for each one.
    # The following step is only needed once.
    - name: Configure dependency resolution to use the wheel built above
      run: >
        echo "[global]" > $GITHUB_WORKSPACE/deps/pip.conf &&
        echo "find-links = " >> $GITHUB_WORKSPACE/deps/pip.conf &&
        echo " file://$GITHUB_WORKSPACE/deps/" >> $GITHUB_WORKSPACE/deps/pip.conf &&
        echo "PIP_CONFIG_FILE=$GITHUB_WORKSPACE/deps/pip.conf" >> $GITHUB_ENV
  2. Create a GitHub personal access token and set it as the GH_PAT secret for your Actions.

  3. In your Dagster project's setup.py file, add your package name to the install_requires section:

    install_requires=[
        "dagster",
        "dagster-cloud",
        "private-package",  # add this line - must match your private Python package name
    ],

Once you update deploy.yml and push the changes to your repo, each subsequent code deploy checks out your private repository, builds the package, and installs it as a dependency in your Dagster+ project. Repeat the above steps for your branch_deployments.yml if needed.
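
To make the last workflow step in step 1 easier to follow, the sketch below renders the same pip.conf content in Python and reads it back. pip configuration files use an INI-style format, so this uses the standard library's configparser as a stand-in for pip's own loading, which is an assumption made for illustration. The function names and the sample workspace path are hypothetical:

```python
import configparser


def render_pip_conf(workspace: str) -> str:
    """Render the pip.conf that the workflow step writes with echo commands."""
    return (
        "[global]\n"
        "find-links =\n"
        f"    file://{workspace}/deps/\n"
    )


def parse_find_links(conf_text: str) -> list:
    """Return the find-links entries from INI-style pip configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Multi-line values are joined with newlines; split() yields one entry per line.
    return parser.get("global", "find-links").split()


# Illustrative workspace path; in the workflow this comes from $GITHUB_WORKSPACE.
conf_text = render_pip_conf("/home/runner/work/repo")
print(parse_find_links(conf_text))
```

With PIP_CONFIG_FILE pointing at this file, pip also searches the deps directory for wheels, which is how the private package named in install_requires gets resolved.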