Libraries

You can integrate Dagster with external services using our libraries and libraries supported by the community.

Airbyte

2 items

Anthropic

The dagster-anthropic library allows you to easily interact with the Anthropic REST API using the Anthropic Python API to build AI steps into your Dagster pipelines. You can also log Anthropic API usage metadata in Dagster Insights, giving you detailed observability on API call credit consumption.

AWS

10 items

Azure Data Lake Storage Gen 2

Dagster helps you use Azure Storage Accounts as part of your data pipeline. Azure Data Lake Storage Gen 2 (ADLS2) is our primary focus but we also provide utilities for Azure Blob Storage.

Census

With the dagster-census integration you can execute a Census sync and poll until that sync completes, raising an error if it's unsuccessful.
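
The trigger-and-poll pattern described above can be sketched in plain Python. This is a minimal illustration, not the dagster-census API: `run_sync_and_wait`, `trigger_sync`, `get_status`, `SyncFailedError`, and the status strings are all hypothetical names chosen for the example, and the "service" is an in-memory stand-in.

```python
import time


class SyncFailedError(Exception):
    """Raised when the (hypothetical) sync run reports a failure."""


def run_sync_and_wait(trigger_sync, get_status, poll_interval=0.1, timeout=5.0):
    """Trigger a sync, then poll its status until it finishes.

    trigger_sync and get_status are hypothetical callables standing in for
    real API calls; the status strings below are assumptions as well.
    """
    run_id = trigger_sync()
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(run_id)
        if status == "completed":
            return run_id
        if status == "failed":
            raise SyncFailedError(f"sync run {run_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"sync run {run_id} did not finish in {timeout}s")
        time.sleep(poll_interval)


# In-memory stand-in for a remote service: two "working" polls, then success.
_statuses = iter(["working", "working", "completed"])
finished_run = run_sync_and_wait(
    trigger_sync=lambda: 42,
    get_status=lambda run_id: next(_statuses),
    poll_interval=0.01,
)
```

Raising on failure, rather than returning a status, means a failed sync would surface as a failed step in whatever pipeline calls the function.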

Chroma

The dagster-chroma library allows you to easily interact with Chroma's vector database capabilities to build AI-powered data pipelines in Dagster. You can perform vector similarity searches, manage schemas, and handle data operations directly from your Dagster assets.

Cube

With the dagster_cube integration, you can set up Cube and Dagster to work together so that Dagster can push changes from upstream data sources to Cube using its integration API.

Databricks

The dagster-databricks integration library provides the PipesDatabricksClient resource, enabling you to launch Databricks jobs directly from Dagster assets and ops. This integration allows you to pass parameters to Databricks code while Dagster receives real-time events, such as logs, asset checks, and asset materializations, from the initiated jobs. With minimal code changes required on the job side, this integration is both efficient and easy to implement.

Datadog

While Dagster provides comprehensive monitoring and observability of the pipelines it orchestrates, many teams look to centralize all their monitoring across apps, processes and infrastructure using Datadog's 'Cloud Monitoring as a Service'. The dagster-datadog integration allows you to publish metrics to Datadog from within Dagster ops.

dbt

3 items

Delta Lake

Delta Lake is a great storage format for Dagster workflows. With this integration, you can use the Delta Lake I/O Manager to read and write your Dagster assets.

dlt

This integration allows you to use dlt to easily ingest and replicate data between systems through Dagster.

Docker

The dagster-docker integration library provides the PipesDockerClient resource, enabling you to launch Docker containers and execute external code directly from Dagster assets and ops. This integration allows you to pass parameters to Docker containers while Dagster receives real-time events, such as logs, asset checks, and asset materializations, from the initiated jobs. With minimal code changes required on the job side, this integration is both efficient and easy to implement.

DuckDB

This library provides an integration with the DuckDB database and includes an out-of-the-box I/O Manager, so that you can make DuckDB your storage of choice.

Fivetran

This guide provides instructions for using Dagster with Fivetran using the dagster-fivetran library. Your Fivetran connector tables can be represented as assets in the Dagster asset graph, allowing you to track lineage and dependencies between Fivetran assets and data assets you are already modeling in Dagster. You can also use Dagster to orchestrate Fivetran connectors, allowing you to trigger syncs for these on a cadence or based on upstream data changes.

GCP

3 items

Gemini

The dagster-gemini library allows you to easily interact with the Gemini REST API using the Gemini Python API to build AI steps into your Dagster pipelines. You can also log Gemini API usage metadata in Dagster Insights, giving you detailed observability on API call credit consumption.

GitHub

This library provides an integration with GitHub Apps through a thin wrapper on the GitHub v4 GraphQL API. This lets you automate operations within your GitHub repositories with the tighter permission scopes that GitHub Apps allow, compared with using a personal token.

HashiCorp Vault

Package for integrating HashiCorp Vault into Dagster so that you can securely manage tokens and passwords.

Hightouch

With this integration, you can trigger Hightouch syncs and monitor them from within Dagster. Fine-tune when Hightouch syncs kick off, visualize their dependencies, and monitor the steps in your data activation workflow.

Jupyter Notebooks

About Jupyter

Kubernetes

The dagster-k8s integration library provides the PipesK8sClient resource, enabling you to launch Kubernetes pods and execute external code directly from Dagster assets and ops. This integration allows you to pass parameters to Kubernetes pods while Dagster receives real-time events, such as logs, asset checks, and asset materializations, from the initiated jobs. With minimal code changes required on the job side, this integration is both efficient and easy to implement.

LakeFS

By integrating with lakeFS, a version control system that operates at big data scale, you can use its versioning capabilities to track changes to your data. This integration gives you a complete lineage of your data, from the initial raw data to the transformed and processed data, making it easier to understand and reproduce data transformations.

Looker

Dagster allows you to represent your Looker project as assets, alongside your other technologies like dbt and Sling. This allows you to see how your Looker assets are connected to your other data assets, and how changes to other data assets might impact your Looker project.

Meltano

The dagster-meltano library allows you to run Meltano using Dagster. Design and configure ingestion jobs using the popular Singer.io specification.

Microsoft Teams

By configuring this resource, you can post messages to MS Teams from any Dagster op or asset.

Open Metadata

With this integration, you can create an Open Metadata service to ingest metadata produced by the Dagster application. View the Ingestion Pipeline running from the Open Metadata Service Page.

OpenAI

The dagster-openai library allows you to easily interact with the OpenAI REST API using the OpenAI Python API to build AI steps into your Dagster pipelines. You can also log OpenAI API usage metadata in Dagster Insights, giving you detailed observability on API call credit consumption.

PagerDuty

This library provides an integration between Dagster and PagerDuty to support creating alerts from your Dagster code.

Pandas

Perform data validation, emit summary statistics, and enable reliable DataFrame serialization/deserialization. The dagster_pandas library provides you with the utilities for implementing validation on Pandas DataFrames. The Dagster type system generates documentation of your DataFrame constraints and makes it accessible in the Dagster UI.

Pandera

The dagster-pandera integration library provides an API for generating Dagster Types from Pandera DataFrame schemas.

Prometheus

This integration allows you to push metrics to the Prometheus gateway from within a Dagster pipeline.

SDF

SDF can integrate seamlessly with your existing Dagster projects, providing a best-in-class transformation layer while enabling you to schedule, orchestrate, and monitor your DAGs in Dagster.

Secoda

Connect Dagster to Secoda and see metadata related to your Dagster assets, asset groups and jobs right in Secoda. Simplify your team's access, and remove the need to switch between tools.

Bash / Shell

Dagster comes with a native PipesSubprocessClient resource that enables you to launch shell commands directly from Dagster assets and ops. This integration allows you to pass parameters to external shell scripts while Dagster receives real-time events, such as logs, asset checks, and asset materializations, from the initiated external execution. With minimal code changes required on the job side, this integration is both efficient and easy to implement.
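
The general idea of passing parameters to a shell script and receiving structured events back can be sketched with the standard library alone. This is a simplified illustration and not the Dagster Pipes protocol or the PipesSubprocessClient API: `run_with_events`, the `SKETCH_PARAMS` and `SKETCH_EVENTS_PATH` environment variables, and the event shapes are hypothetical, and events are read after the command exits rather than in real time. It assumes a POSIX `sh` is available.

```python
import json
import os
import subprocess
import tempfile


def run_with_events(command, params):
    """Run `command`, pass `params` via the environment, and return its events.

    The child process appends one JSON object per line to the file named by
    SKETCH_EVENTS_PATH (a hypothetical convention for this sketch).
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        events_path = os.path.join(tmp_dir, "events.jsonl")
        env = dict(
            os.environ,
            SKETCH_PARAMS=json.dumps(params),
            SKETCH_EVENTS_PATH=events_path,
        )
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(command, env=env, check=True)
        if not os.path.exists(events_path):
            return []
        with open(events_path) as events_file:
            return [json.loads(line) for line in events_file if line.strip()]


# A tiny shell script that reports a log line, then echoes back its parameters.
SCRIPT = """
echo '{"event": "log", "message": "started"}' >> "$SKETCH_EVENTS_PATH"
printf '{"event": "params", "value": %s}\\n' "$SKETCH_PARAMS" >> "$SKETCH_EVENTS_PATH"
"""

events = run_with_events(["sh", "-c", SCRIPT], {"rows": 3})
```

Keeping the contract to two environment variables and a line-delimited JSON file means the external script needs very little code to participate, which mirrors the "minimal code changes" point made above.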

Slack

This library provides an integration with Slack to support posting messages in your company's Slack workspace.

Sling

This integration allows you to use Sling to extract and load data from popular data sources to destinations with high performance and ease.

Snowflake

This library provides an integration with the Snowflake data warehouse. Connect to Snowflake as a resource, then use the integration-provided functions to construct an op to establish connections and execute Snowflake queries. Read and write natively to Snowflake from Dagster assets.

Spark

Spark jobs typically execute on infrastructure that's specialized for Spark. Spark applications are typically not containerized or executed on Kubernetes.

SSH/SFTP

This integration provides a resource for SSH remote execution using Paramiko. It allows you to establish secure connections to networked resources and execute commands remotely. The integration also provides an SFTP client for secure file transfers between the local and remote systems.

Twilio

Use your Twilio Account SID and Auth Token to build Twilio tasks right into your Dagster pipeline.

Weights & Biases

Use Dagster and Weights & Biases (W&B) to orchestrate your MLOps pipelines and maintain ML assets. The integration with W&B makes it easy within Dagster to:

Weaviate

The dagster-weaviate library allows you to easily interact with Weaviate's vector database capabilities to build AI-powered data pipelines in Dagster. You can perform vector similarity searches, manage schemas, and handle data operations directly from your Dagster assets.