Creating alerts in Dagster+
You can create alerts in the Dagster+ UI or using the dagster-cloud CLI.
Before you create alerts, you must configure an alert notification service.
Alerting when a run fails
You can set up alerts to notify you when a run fails.
By default, these alerts will target all runs in the deployment, but they can be scoped to runs with a specific tag.
In the UI:
- In the Dagster UI, click Deployment.
- Click the Alerts tab.
- Click Add alert policy.
- Select Run alert from the dropdown.
- Select Job failure.
If desired, add tags in the format {key}:{value} to filter the runs that will be considered.
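As a rough illustration of the tag scoping described above, the sketch below shows how a {key}:{value} filter might be compared against a run's tags. The names parse_tag_filter and run_matches are hypothetical, and requiring every filter to match is an assumption; this is not Dagster+'s implementation.

```python
def parse_tag_filter(text: str) -> tuple[str, str]:
    """Split a 'key:value' filter string at the first colon."""
    key, sep, value = text.partition(":")
    if not sep or not key:
        raise ValueError(f"expected 'key:value', got {text!r}")
    return key.strip(), value.strip()


def run_matches(run_tags: dict[str, str], filters: list[str]) -> bool:
    """Return True when the run carries every filtered tag with the same value.

    Assumption: all filters must match. With no filters, every run matches,
    which mirrors the default of targeting all runs in the deployment.
    """
    return all(run_tags.get(k) == v for k, v in map(parse_tag_filter, filters))


if __name__ == "__main__":
    print(run_matches({"important": "true", "team": "data"}, ["important:true"]))
```

With no filters the function accepts every run, so an unscoped policy behaves like the default described above.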
Using the CLI:
Define the alert policy in a YAML file, as in the examples that follow, then run this command to sync it to your Dagster+ deployment:
dagster-cloud deployment alert-policies sync -a /path/to/alert_policies.yaml
- Email
# alert_policies.yaml
alert_policies:
  description: Sends an email when a run fails.
  event_types:
    - JOB_FAILURE
  name: run-failure-email
  notification_service:
    email:
      email_addresses:
        - richard.hendricks@hooli.com
        - nelson.bighetti@hooli.com
- Microsoft Teams
# alert_policies.yaml
alert_policies:
  description: Sends a Microsoft Teams webhook when a run fails.
  event_types:
    - JOB_FAILURE
  name: run-failure-microsoft_teams
  notification_service:
    microsoft_teams:
      webhook_url: https://yourdomain.webhook.office.com/...
- PagerDuty
# alert_policies.yaml
alert_policies:
  description: Sends a PagerDuty alert when a run fails.
  event_types:
    - JOB_FAILURE
  name: run-failure-pagerduty
  notification_service:
    pagerduty:
      integration_key: <pagerduty_integration_key>
- Slack
# alert_policies.yaml
alert_policies:
  description: Sends a Slack message when a run fails.
  event_types:
    - JOB_FAILURE
  name: run-failure-slack
  notification_service:
    slack:
      slack_channel_name: notifications
      slack_workspace_name: hooli
Alerting when a run is taking too long to complete
You can set up alerts to notify you whenever a run has been running longer than a threshold you specify.
By default, these alerts will target all runs in the deployment, but they can be scoped to runs with a specific tag.
In the UI:
- In the Dagster UI, click Deployment.
- Click the Alerts tab.
- Click Add alert policy.
- Select Run alert from the dropdown.
- Select Job running over and how many hours to alert after.
If desired, add tags in the format {key}:{value} to filter the runs that will be considered.
Using the CLI:
Define the alert policy in a YAML file, as in the examples that follow, then run this command to sync it to your Dagster+ deployment:
dagster-cloud deployment alert-policies sync -a /path/to/alert_policies.yaml
- Email
# alert_policies.yaml
alert_policies:
  alert_targets:
    - long_running_job_threshold_target:
        threshold_seconds: 3600
  description: Sends an email when a run is taking too long to complete.
  event_types:
    - JOB_LONG_RUNNING
  name: job-running-over-one-hour-email
  notification_service:
    email:
      email_addresses:
        - richard.hendricks@hooli.com
        - nelson.bighetti@hooli.com
  tags:
    important: 'true'
- Microsoft Teams
# alert_policies.yaml
alert_policies:
  alert_targets:
    - long_running_job_threshold_target:
        threshold_seconds: 3600
  description: Sends a Microsoft Teams webhook when a run is taking too long to complete.
  event_types:
    - JOB_LONG_RUNNING
  name: job-running-over-one-hour-microsoft_teams
  notification_service:
    microsoft_teams:
      webhook_url: https://yourdomain.webhook.office.com/...
  tags:
    important: 'true'
- PagerDuty
# alert_policies.yaml
alert_policies:
  alert_targets:
    - long_running_job_threshold_target:
        threshold_seconds: 3600
  description: Sends a PagerDuty alert when a run is taking too long to complete.
  event_types:
    - JOB_LONG_RUNNING
  name: job-running-over-one-hour-pagerduty
  notification_service:
    pagerduty:
      integration_key: <pagerduty_integration_key>
  tags:
    important: 'true'
- Slack
# alert_policies.yaml
alert_policies:
  alert_targets:
    - long_running_job_threshold_target:
        threshold_seconds: 3600
  description: Sends a Slack message when a run is taking too long to complete.
  event_types:
    - JOB_LONG_RUNNING
  name: job-running-over-one-hour-slack
  notification_service:
    slack:
      slack_channel_name: notifications
      slack_workspace_name: hooli
  tags:
    important: 'true'
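The UI asks for a number of hours while the YAML examples use threshold_seconds, so it can help to see the conversion and the comparison side by side. The following is a minimal sketch under that reading; hours_to_threshold_seconds and is_long_running are hypothetical helpers, not part of any Dagster API.

```python
from datetime import datetime, timedelta, timezone


def hours_to_threshold_seconds(hours: float) -> int:
    """Convert an hour count, as entered in the UI, to a threshold_seconds value."""
    return int(hours * 3600)


def is_long_running(started_at: datetime, now: datetime, threshold_seconds: int) -> bool:
    """Return True once the run has been going for longer than the threshold."""
    return (now - started_at).total_seconds() > threshold_seconds


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    threshold = hours_to_threshold_seconds(1)
    print(threshold, is_long_running(start, start + timedelta(minutes=61), threshold))
```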
Alerting when an asset fails to materialize
You can set up alerts to notify you when an asset materialization attempt fails.
By default, these alerts will target all assets in the deployment, but they can be scoped to a specific asset or group of assets.
If the asset uses a RetryPolicy, an alert is sent only after all retries have completed.
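The retry behavior can be pictured with a small simulation: an alert is warranted only when the first attempt and every retry have failed. This is a plain-Python sketch rather than Dagster code; should_alert_after_retries and make_flaky are hypothetical names, and max_retries here simply counts retries after the first attempt.

```python
from typing import Callable


def should_alert_after_retries(materialize: Callable[[], None], max_retries: int) -> bool:
    """Run the first attempt plus up to max_retries retries.

    Returns True (alert) only if every attempt raised, False as soon as one succeeds.
    """
    for _ in range(max_retries + 1):
        try:
            materialize()
            return False  # an attempt succeeded, so no alert is needed
        except Exception:
            continue  # failed attempt; try again if retries remain
    return True


def make_flaky(failures: int) -> Callable[[], None]:
    """Build a stand-in materialization that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def materialize() -> None:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise RuntimeError("materialization failed")

    return materialize


if __name__ == "__main__":
    print(should_alert_after_retries(make_flaky(2), max_retries=2))
    print(should_alert_after_retries(make_flaky(2), max_retries=1))
```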
In the UI:
- In the Dagster UI, click Deployment.
- Click the Alerts tab.
- Click Add alert policy.
- Select Asset alert from the dropdown.
- Select Failure under the Materializations heading.
If desired, select a target from the dropdown menu to scope this alert to a specific asset or group.
Using the CLI:
Define the alert policy in a YAML file, as in the examples that follow, then run this command to sync it to your Dagster+ deployment:
dagster-cloud deployment alert-policies sync -a /path/to/alert_policies.yaml
- Email
# alert_policies.yaml
alert_policies:
  description: Sends an email when an asset fails to materialize.
  event_types:
    - ASSET_MATERIALIZATION_FAILED  # assumed event type name; confirm in the alert policy reference
  name: asset-materialization-failure-email
  notification_service:
    email:
      email_addresses:
        - richard.hendricks@hooli.com
        - nelson.bighetti@hooli.com
- Microsoft Teams
# alert_policies.yaml
alert_policies:
  description: Sends a Microsoft Teams webhook when an asset fails to materialize.
  event_types:
    - ASSET_MATERIALIZATION_FAILED  # assumed event type name; confirm in the alert policy reference
  name: asset-materialization-failure-microsoft_teams
  notification_service:
    microsoft_teams:
      webhook_url: https://yourdomain.webhook.office.com/...
- PagerDuty
# alert_policies.yaml
alert_policies:
  description: Sends a PagerDuty alert when an asset fails to materialize.
  event_types:
    - ASSET_MATERIALIZATION_FAILED  # assumed event type name; confirm in the alert policy reference
  name: asset-materialization-failure-pagerduty
  notification_service:
    pagerduty:
      integration_key: <pagerduty_integration_key>
- Slack
# alert_policies.yaml
alert_policies:
  description: Sends a Slack message when an asset fails to materialize.
  event_types:
    - ASSET_MATERIALIZATION_FAILED  # assumed event type name; confirm in the alert policy reference
  name: asset-materialization-failure-slack
  notification_service:
    slack:
      slack_channel_name: notifications
      slack_workspace_name: hooli
Alerting when an asset check fails
You can set up alerts to notify you when an asset check fails.
By default, these alerts will target all assets in the deployment, but they can be scoped to checks on a specific asset or group of assets.
In the UI:
- In the Dagster UI, click Deployment.
- Click the Alerts tab.
- Click Add alert policy.
- Select Asset alert from the dropdown.
- Select Failed (ERROR) under the Asset Checks heading.
If desired, select a target from the dropdown menu to scope this alert to a specific asset or group.
Using the CLI:
Define the alert policy in a YAML file, as in the examples that follow, then run this command to sync it to your Dagster+ deployment:
dagster-cloud deployment alert-policies sync -a /path/to/alert_policies.yaml
- Email
# alert_policies.yaml
alert_policies:
  alert_targets:
    - asset_key_target:
        asset_key:
          - s3
          - report
    - asset_group_target:
        asset_group: transformed
        location_name: prod
        repo_name: __repository__
  description: Sends an email when an asset check fails.
  event_types:
    - ASSET_CHECK_SEVERITY_ERROR
  name: asset-check-failed-email
  notification_service:
    email:
      email_addresses:
        - richard.hendricks@hooli.com
        - nelson.bighetti@hooli.com
- Microsoft Teams
# alert_policies.yaml
alert_policies:
  alert_targets:
    - asset_key_target:
        asset_key:
          - s3
          - report
    - asset_group_target:
        asset_group: transformed
        location_name: prod
        repo_name: __repository__
  description: Sends a Microsoft Teams webhook when an asset check fails.
  event_types:
    - ASSET_CHECK_SEVERITY_ERROR
  name: asset-check-failed-microsoft_teams
  notification_service:
    microsoft_teams:
      webhook_url: https://yourdomain.webhook.office.com/...
- PagerDuty
# alert_policies.yaml
alert_policies:
  alert_targets:
    - asset_key_target:
        asset_key:
          - s3
          - report
    - asset_group_target:
        asset_group: transformed
        location_name: prod
        repo_name: __repository__
  description: Sends a PagerDuty alert when an asset check fails.
  event_types:
    - ASSET_CHECK_SEVERITY_ERROR
  name: asset-check-failed-pagerduty
  notification_service:
    pagerduty:
      integration_key: <pagerduty_integration_key>
- Slack
# alert_policies.yaml
alert_policies:
  alert_targets:
    - asset_key_target:
        asset_key:
          - s3
          - report
    - asset_group_target:
        asset_group: transformed
        location_name: prod
        repo_name: __repository__
  description: Sends a Slack message when an asset check fails.
  event_types:
    - ASSET_CHECK_SEVERITY_ERROR
  name: asset-check-failed-slack
  notification_service:
    slack:
      slack_channel_name: notifications
      slack_workspace_name: hooli
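To make the scoping in these examples more concrete, the sketch below checks whether an asset falls under a list of alert_targets shaped like the YAML above. It assumes that matching any one target is enough and it ignores location_name and repo_name for brevity; asset_in_scope is a hypothetical helper, and Dagster+ may evaluate targets differently.

```python
def asset_in_scope(asset_key: list[str], group: str, targets: list[dict]) -> bool:
    """Return True when the asset matches at least one target.

    An empty target list means the policy is unscoped and covers every asset.
    Assumption: targets are combined with OR; location and repository are not compared.
    """
    if not targets:
        return True
    for target in targets:
        key_target = target.get("asset_key_target")
        if key_target and key_target["asset_key"] == asset_key:
            return True
        group_target = target.get("asset_group_target")
        if group_target and group_target["asset_group"] == group:
            return True
    return False


if __name__ == "__main__":
    targets = [
        {"asset_key_target": {"asset_key": ["s3", "report"]}},
        {"asset_group_target": {"asset_group": "transformed",
                                "location_name": "prod",
                                "repo_name": "__repository__"}},
    ]
    print(asset_in_scope(["s3", "report"], "raw", targets))
```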
Alerting when a schedule or sensor tick fails
You can set up alerts to fire when any schedule or sensor tick across your entire deployment fails.
Alerts are sent only when a schedule/sensor transitions from success to failure, so only the initial failure will trigger the alert.
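The success-to-failure rule can be sketched as a scan over a tick history that reports only the failures directly preceded by a success. The function alerting_ticks and the two status strings are hypothetical, and treating a schedule or sensor with no history as previously successful is an assumption.

```python
def alerting_ticks(statuses: list[str]) -> list[int]:
    """Return the indexes of ticks that would trigger an alert.

    Only 'SUCCESS' and 'FAILURE' are modeled. A failure alerts only when the
    previous tick succeeded, so repeated failures stay silent after the first.
    """
    fired = []
    previous = "SUCCESS"  # assumption: no history counts as a prior success
    for index, status in enumerate(statuses):
        if status == "FAILURE" and previous == "SUCCESS":
            fired.append(index)
        previous = status
    return fired


if __name__ == "__main__":
    history = ["SUCCESS", "FAILURE", "FAILURE", "SUCCESS", "FAILURE"]
    print(alerting_ticks(history))
```

In this history the second consecutive failure is skipped, and a new alert fires only after the schedule or sensor has recovered and failed again.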
In the UI:
- In the Dagster UI, click Deployment.
- Click the Alerts tab.
- Click Add alert policy.
- Select Schedule/Sensor alert from the dropdown.
Using the CLI:
Define the alert policy in a YAML file, as in the examples that follow, then run this command to sync it to your Dagster+ deployment:
dagster-cloud deployment alert-policies sync -a /path/to/alert_policies.yaml
- Email
# alert_policies.yaml
alert_policies:
  description: Sends an email when a schedule or sensor tick fails.
  event_types:
    - TICK_FAILURE
  name: schedule-sensor-failure-email
  notification_service:
    email:
      email_addresses:
        - richard.hendricks@hooli.com
        - nelson.bighetti@hooli.com
- Microsoft Teams
# alert_policies.yaml
alert_policies:
  description: Sends a Microsoft Teams webhook when a schedule or sensor tick fails.
  event_types:
    - TICK_FAILURE
  name: schedule-sensor-failure-microsoft_teams
  notification_service:
    microsoft_teams:
      webhook_url: https://yourdomain.webhook.office.com/...
- PagerDuty
# alert_policies.yaml
alert_policies:
  description: Sends a PagerDuty alert when a schedule or sensor tick fails.
  event_types:
    - TICK_FAILURE
  name: schedule-sensor-failure-pagerduty
  notification_service:
    pagerduty:
      integration_key: <pagerduty_integration_key>
- Slack
# alert_policies.yaml
alert_policies:
  description: Sends a Slack message when a schedule or sensor tick fails.
  event_types:
    - TICK_FAILURE
  name: schedule-sensor-failure-slack
  notification_service:
    slack:
      slack_channel_name: notifications
      slack_workspace_name: hooli
Alerting when a code location fails to load
You can set up alerts to fire when any code location fails to load due to an error.
In the UI:
- In the Dagster UI, click Deployment.
- Click the Alerts tab.
- Click Add alert policy.
- Select Code location error alert from the dropdown.
Using the CLI:
Define the alert policy in a YAML file, as in the examples that follow, then run this command to sync it to your Dagster+ deployment:
dagster-cloud deployment alert-policies sync -a /path/to/alert_policies.yaml
- Email
# alert_policies.yaml
alert_policies:
  description: Sends an email when a code location fails to load.
  event_types:
    - CODE_LOCATION_ERROR
  name: code-location-error-email
  notification_service:
    email:
      email_addresses:
        - richard.hendricks@hooli.com
        - nelson.bighetti@hooli.com
- Microsoft Teams
# alert_policies.yaml
alert_policies:
  description: Sends a Microsoft Teams webhook when a code location fails to load.
  event_types:
    - CODE_LOCATION_ERROR
  name: code-location-error-microsoft_teams
  notification_service:
    microsoft_teams:
      webhook_url: https://yourdomain.webhook.office.com/...
- PagerDuty
# alert_policies.yaml
alert_policies:
  description: Sends a PagerDuty alert when a code location fails to load.
  event_types:
    - CODE_LOCATION_ERROR
  name: code-location-error-pagerduty
  notification_service:
    pagerduty:
      integration_key: <pagerduty_integration_key>
- Slack
# alert_policies.yaml
alert_policies:
  description: Sends a Slack message when a code location fails to load.
  event_types:
    - CODE_LOCATION_ERROR
  name: code-location-error-slack
  notification_service:
    slack:
      slack_channel_name: notifications
      slack_workspace_name: hooli
Alerting when a Hybrid agent becomes unavailable
This is only available for Hybrid deployments.
You can set up alerts to fire if your Hybrid agent hasn't sent a heartbeat in the last 5 minutes.
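The heartbeat rule amounts to a simple timeout comparison. The sketch below illustrates it with the five-minute window from the text; agent_unavailable and HEARTBEAT_TIMEOUT are hypothetical names, and Dagster+ may handle the exact boundary differently.

```python
from datetime import datetime, timedelta, timezone

# Five-minute window taken from the description above.
HEARTBEAT_TIMEOUT = timedelta(minutes=5)


def agent_unavailable(last_heartbeat: datetime, now: datetime,
                      timeout: timedelta = HEARTBEAT_TIMEOUT) -> bool:
    """Return True when the last heartbeat is older than the timeout."""
    return now - last_heartbeat > timeout


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(agent_unavailable(now - timedelta(minutes=6), now))
```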
In the UI:
- In the Dagster UI, click Deployment.
- Click the Alerts tab.
- Click Add alert policy.
- Select Agent downtime alert from the dropdown.
Using the CLI:
Define the alert policy in a YAML file, as in the examples that follow, then run this command to sync it to your Dagster+ deployment:
dagster-cloud deployment alert-policies sync -a /path/to/alert_policies.yaml
- Email
# alert_policies.yaml
alert_policies:
  description: Sends an email when a Hybrid agent becomes unavailable.
  event_types:
    - AGENT_UNAVAILABLE
  name: agent-unavailable-email
  notification_service:
    email:
      email_addresses:
        - richard.hendricks@hooli.com
        - nelson.bighetti@hooli.com
- Microsoft Teams
# alert_policies.yaml
alert_policies:
  description: Sends a Microsoft Teams webhook when a Hybrid agent becomes unavailable.
  event_types:
    - AGENT_UNAVAILABLE
  name: agent-unavailable-microsoft_teams
  notification_service:
    microsoft_teams:
      webhook_url: https://yourdomain.webhook.office.com/...
- PagerDuty
# alert_policies.yaml
alert_policies:
  description: Sends a PagerDuty alert when a Hybrid agent becomes unavailable.
  event_types:
    - AGENT_UNAVAILABLE
  name: agent-unavailable-pagerduty
  notification_service:
    pagerduty:
      integration_key: <pagerduty_integration_key>
- Slack
# alert_policies.yaml
alert_policies:
  description: Sends a Slack message when a Hybrid agent becomes unavailable.
  event_types:
    - AGENT_UNAVAILABLE
  name: agent-unavailable-slack
  notification_service:
    slack:
      slack_channel_name: notifications
      slack_workspace_name: hooli
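If you sync alert policies from a script or CI job, the sync command used throughout this page can be wrapped in a few lines of Python. This is a minimal sketch that assumes the dagster-cloud CLI is installed and already authenticated; build_sync_command and sync_alert_policies are hypothetical helper names, and only the subcommand and -a flag shown on this page are used.

```python
import subprocess


def build_sync_command(policy_file: str) -> list[str]:
    """Assemble the argument list for the alert policy sync command."""
    return ["dagster-cloud", "deployment", "alert-policies", "sync", "-a", policy_file]


def sync_alert_policies(policy_file: str) -> None:
    """Run the sync and raise CalledProcessError if the CLI exits with an error."""
    subprocess.run(build_sync_command(policy_file), check=True)


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(" ".join(build_sync_command("/path/to/alert_policies.yaml")))
```

Passing the arguments as a list avoids shell quoting problems when the policy file path contains spaces.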