
Managing multiple projects and teams with multi-tenancy

In this article, we'll cover some strategies for managing multiple projects/code bases and teams in a Dagster+ account.

Separating code bases

note

In this section, repository refers to a version control repository, such as one managed with Git or Mercurial.

If you want to manage complexity or divide your work into areas of responsibility, consider isolating your code bases into multiple projects with:

  • Multiple directories in a single repository, or
  • Multiple repositories

The following comparison summarizes how each approach works, along with its pros and cons.

Approach: Multiple directories in a single repository

How it works: You can use a single repository to manage multiple projects by placing each project in a separate directory. Depending on your VCS, you may be able to set code owners to restrict who can modify each project.

Pros:

  • Easy to implement
  • Facilitates code sharing between projects

Cons:

  • All projects share the same CI/CD pipeline and cannot be deployed independently
  • Shared dependencies between projects may cause conflicts and require coordination between teams

Approach: Multiple repositories

How it works: For stronger isolation, you can use multiple repositories to manage multiple projects.

Pros:

  • Stronger isolation between projects and teams
  • Each project has its own CI/CD pipeline and can be deployed independently
  • Dependencies between projects can be managed independently

Cons:

  • Code sharing between projects requires additional coordination to publish and reuse packages

Deployment configuration

Whether you use a single repository or multiple, you can use a dagster_cloud.yaml file to define the code locations to deploy. For each repository, follow the steps appropriate to your CI/CD provider and include only the code locations that are relevant to the repository in your CI/CD workflow.

Example with GitHub CI/CD on Hybrid deployment

  1. For each repository, use the CI/CD workflow provided in the Dagster+ Hybrid quickstart repository.
  2. For each project in the repository, configure a code location in the dagster_cloud.yaml file:

    # dagster_cloud.yaml

    locations:
      - location_name: project_a
        code_source:
          package_name: project_a
        build:
          # ...
      - location_name: project_b
        code_source:
          package_name: project_b
        build:
          # ...
  3. In the repository's dagster-cloud-deploy.yml file, modify the CI/CD workflow to deploy all code locations for the repository:

    # .github/workflows/dagster-cloud-deploy.yml

    jobs:
      dagster-cloud-deploy:
        # ...
        steps:
          - name: Update build session with image tag for "project_a" code location
            id: ci-set-build-output-project-a
            if: steps.prerun.outputs.result != 'skip'
            uses: dagster-io/dagster-cloud-action/actions/utils/dagster-cloud-cli@v0.1
            with:
              command: "ci set-build-output --location-name=project_a --image-tag=$IMAGE_TAG"

          - name: Update build session with image tag for "project_b" code location
            id: ci-set-build-output-project-b
            if: steps.prerun.outputs.result != 'skip'
            uses: dagster-io/dagster-cloud-action/actions/utils/dagster-cloud-cli@v0.1
            with:
              command: "ci set-build-output --location-name=project_b --image-tag=$IMAGE_TAG"
          # ...
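
When the projects live in separate directories of a single repository, each code location can point its build at its own directory. The following is a minimal sketch, not a definitive configuration: it assumes the build.directory key of dagster_cloud.yaml, and the directory names ./project_a and ./project_b are hypothetical. Check the key names against the dagster_cloud.yaml reference for your Dagster+ version.

```yaml
# dagster_cloud.yaml (sketch; directory paths are hypothetical)

locations:
  - location_name: project_a
    code_source:
      package_name: project_a
    build:
      # Assumed: build this location from project_a's own directory
      directory: ./project_a
      # ... registry and other build settings omitted
  - location_name: project_b
    code_source:
      package_name: project_b
    build:
      # Assumed: build this location from project_b's own directory
      directory: ./project_b
      # ... registry and other build settings omitted
```

Keeping each project's build context in its own directory lets each project carry its own dependency files, although the projects still share one CI/CD pipeline.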

Isolating execution context between projects

Separating execution context between projects can have several motivations:

  • Facilitating separation of duties between teams to prevent access to sensitive data
  • Differing compute environments and requirements, such as different architecture, cloud provider, etc.
  • Reducing impact on other projects. For example, a project with a large number of runs can impact the performance of other projects

There are three levels of isolation, described below in order from least to most isolated: code location, agent, and deployment.

Code location isolation

If you have no specific requirements for isolation beyond the ability to deploy and run multiple projects, you can use a single agent and deployment to manage all your projects as individual code locations.

Diagram of isolation at the code location level

Pros:

  • Simplest and most cost-effective solution
  • User access control can be set at the code location level
  • Single pane of glass to view all assets

Cons:

  • No isolation between execution environments

Agent isolation

note

Agent queues are only available in Hybrid deployments.

Using the agent routing feature, you can effectively isolate execution environments between projects by using a separate agent for each project.

Motivations for utilizing this approach could include:

  • Different compute requirements, such as different cloud providers or architectures
  • Optimizing for locality or access, such as running data processing closer to the storage locations or in an environment that has access to them

Diagram of isolation at the agent level

Pros:

  • Isolation between execution environments
  • User access control can be set at the code location level
  • Single pane of glass to view all assets

Cons:

  • Extra work to set up additional agents and agent queues
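
As a rough sketch of how agent routing might look in configuration: each code location is assigned to a queue, and each agent is configured to serve only its own queue. This assumes the agent_queue key in dagster_cloud.yaml and the agent_queues settings in the agent's dagster.yaml; the queue names project-a-queue and project-b-queue are hypothetical, and other required agent settings are omitted. Verify the key names against the agent routing documentation for your Dagster+ version.

```yaml
# dagster_cloud.yaml (sketch; queue names are hypothetical)
locations:
  - location_name: project_a
    code_source:
      package_name: project_a
    # Assumed: route requests for this location to a dedicated queue
    agent_queue: project-a-queue
  - location_name: project_b
    code_source:
      package_name: project_b
    agent_queue: project-b-queue
---
# dagster.yaml for the agent that serves project_a (sketch)
dagster_cloud_api:
  # ... agent token and other settings omitted
  agent_queues:
    # Assumed: do not pick up locations without an assigned queue
    include_default_queue: false
    additional_queues:
      - project-a-queue
```

A second agent, running in the environment intended for project_b, would list project-b-queue in the same way.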

Deployment isolation

note

Multiple deployments are only available in Dagster+ Pro.

Of the approaches outlined in this guide, multiple deployments are the most isolated solution. The typical motivation for this isolation level is to separate production and non-production environments.

Diagram of isolation at the Dagster+ deployment level

Pros:

  • Isolation between assets and execution environments
  • User access control can be set at the code location and deployment level

Cons:

  • No single pane of glass to view all assets (requires switching between multiple deployments in the UI)