Managing multiple projects and teams with multi-tenancy

In this article, we'll cover some strategies for managing multiple projects/code bases and teams in a Dagster+ account.

Separating code bases

note

In this section, repository refers to a version control repository, such as one managed with Git or Mercurial.

If you want to manage complexity or divide your work into areas of responsibility, consider isolating your code bases into multiple projects with:

Multiple directories in a single repository, or
Multiple repositories

Refer to the following table for more information, including the pros and cons of each approach.

Approach	How it works	Pros	Cons
Multiple directories in a single repository	You can use a single repository to manage multiple projects by placing each project in a separate directory. Depending on your VCS, you may be able to set code owners to restrict who can modify each project.	Easy to implement; facilitates code sharing between projects	All projects share the same CI/CD pipeline and cannot be deployed independently; shared dependencies between projects may cause conflicts and require coordination between teams
Multiple repositories	For stronger isolation, you can use multiple repositories to manage multiple projects.	Stronger isolation between projects and teams; each project has its own CI/CD pipeline and can be deployed independently; dependencies between projects can be managed independently	Code sharing between projects requires additional coordination to publish and reuse packages between projects

Deployment configuration

Whether you use a single repository or multiple, you can use a dagster_cloud.yaml file to define the code locations to deploy. For each repository, follow the steps appropriate to your CI/CD provider and include only the code locations that are relevant to the repository in your CI/CD workflow.
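
As a minimal sketch of the multiple-repository case, each repository could carry its own dagster_cloud.yaml that lists only its own code location. The repository names in the comments are hypothetical, and build settings are omitted here because they depend on your deployment type and CI/CD provider.

```yaml
# dagster_cloud.yaml in the (hypothetical) project-a repository
locations:
  - location_name: project_a
    code_source:
      package_name: project_a
---
# dagster_cloud.yaml in the (hypothetical) project-b repository
locations:
  - location_name: project_b
    code_source:
      package_name: project_b
```

The two documents above are separate files shown together for comparison. Each repository's CI/CD workflow then deploys only the location defined in its own file.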

Example with GitHub CI/CD on Hybrid deployment

For each repository, use the CI/CD workflow provided in the Dagster+ Hybrid quickstart repository.

For each project in the repository, configure a code location in the dagster_cloud.yaml file:

# dagster_cloud.yaml

locations:
  - location_name: project_a
    code_source:
      package_name: project_a
    build:
      # ...
  - location_name: project_b
    code_source:
      package_name: project_b
    build:
      # ...

In the repository's dagster-cloud-deploy.yml file, modify the CI/CD workflow to deploy all code locations for the repository:

# .github/workflows/dagster-cloud-deploy.yml

jobs:
  dagster-cloud-deploy:
    # ...
    steps:
      - name: Update build session with image tag for "project_a" code location
        id: ci-set-build-output-project-a
        if: steps.prerun.outputs.result != 'skip'
        uses: dagster-io/dagster-cloud-action/actions/utils/dagster-cloud-cli@v0.1
        with:
          command: "ci set-build-output --location-name=project_a --image-tag=$IMAGE_TAG"

      - name: Update build session with image tag for "project_b" code location
        id: ci-set-build-output-project-b
        if: steps.prerun.outputs.result != 'skip'
        uses: dagster-io/dagster-cloud-action/actions/utils/dagster-cloud-cli@v0.1
        with:
          command: "ci set-build-output --location-name=project_b --image-tag=$IMAGE_TAG"
      # ...

Isolating execution context between projects

Separating execution context between projects can have several motivations:

Facilitating separation of duties between teams to prevent access to sensitive data
Differing compute environments and requirements, such as different architectures or cloud providers
Reducing impact on other projects. For example, a project with a large number of runs can degrade the performance of other projects.

In order from least to most isolated, there are three levels of isolation:

Code location
Agent
Deployment

Code location isolation

If you have no specific requirements for isolation beyond the ability to deploy and run multiple projects, you can use a single agent and deployment to manage all your projects as individual code locations.

Diagram of isolation at the code location level

Pros	Cons
Simplest and most cost-effective solution; user access control can be set at the code location level; single pane of glass to view all assets	No isolation between execution environments

Agent isolation

note

Agent queues are only available in Hybrid deployments.

Using the agent routing feature, you can effectively isolate execution environments between projects by using a separate agent for each project.

Motivations for utilizing this approach could include:

Different compute requirements, such as different cloud providers or architectures
Optimizing for locality or access, such as running data processing closer to the storage locations or in an environment that has access to them
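
The routing described above can be sketched roughly as follows, assuming the agent_queue key in dagster_cloud.yaml and the agent_queues block in the agent's dagster.yaml work as described in the Dagster+ agent queue documentation. The queue name project-b-queue is hypothetical, and build settings are omitted.

```yaml
# dagster_cloud.yaml
locations:
  - location_name: project_a
    code_source:
      package_name: project_a
    # No agent_queue set: requests go to the default queue
  - location_name: project_b
    code_source:
      package_name: project_b
    # Hypothetical queue name; requests go to agents serving this queue
    agent_queue: project-b-queue
---
# dagster.yaml for the agent dedicated to project_b (a separate file)
agent_queues:
  # Do not pick up requests from code locations without an agent_queue
  include_default_queue: False
  additional_queues:
    - project-b-queue
```

In this sketch, a second agent that keeps serving the default queue would still be needed to handle project_a.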

Diagram of isolation at the agent level

Pros	Cons
Isolation between execution environments; user access control can be set at the code location level; single pane of glass to view all assets	Extra work to set up additional agents and agent queues

Deployment isolation

note

Multiple deployments are only available in Dagster+ Pro.

Of the approaches outlined in this guide, multiple deployments are the most isolated solution. The typical motivation for this isolation level is to separate production and non-production environments.
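
For Hybrid deployments, one way to picture this separation is to point each agent at a single deployment. The fragment below is a sketch that assumes the dagster_cloud_api.deployment key of the agent's dagster.yaml; the deployment names prod and staging and the environment variable name are placeholders.

```yaml
# dagster.yaml for the production agent
dagster_cloud_api:
  agent_token:
    env: DAGSTER_CLOUD_AGENT_TOKEN
  deployment: prod
---
# dagster.yaml for the non-production agent (a separate file)
dagster_cloud_api:
  agent_token:
    env: DAGSTER_CLOUD_AGENT_TOKEN
  deployment: staging
```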

Diagram of isolation at the Dagster+ deployment level

Pros	Cons
Isolation between assets and execution environments; user access control can be set at the code location and deployment level	No single pane of glass to view all assets (requires switching between multiple deployments in the UI)
