9 files changed, +11 -11 lines changed
@@ -111,9 +111,9 @@ DAGSTER_CODE_SERVER_HOST=
 
 # max runtime for a job in dagster
 # https://docs.dagster.io/deployment/run-monitoring#general-run-timeouts
-ANOMSTACK_MAX_RUNTIME_SECONDS_TAG=3600
+ANOMSTACK_MAX_RUNTIME_SECONDS_TAG=900
 # kill runs that exceed this many minutes
-ANOMSTACK_KILL_RUN_AFTER_MINUTES=60
+ANOMSTACK_KILL_RUN_AFTER_MINUTES=15
 
 # postgres related env vars
 ANOMSTACK_POSTGRES_USER=postgres_user
@@ -699,7 +699,7 @@ Below you see an example of an LLM alert via email. In this case we add a descri
 
 Sometimes Dagster runs can get stuck. Anomstack ships with a sensor that
 terminates any run exceeding a configurable timeout. By default runs are killed
-after 60 minutes. You can override this in your `dagster.yaml` or via the
+after 15 minutes. You can override this in your `dagster.yaml` or via the
 `ANOMSTACK_KILL_RUN_AFTER_MINUTES` environment variable. You can also invoke the
 cleanup manually with:
 
@@ -12,7 +12,7 @@
 )
 from dagster._core.errors import DagsterUserCodeUnreachableError
 
-DEFAULT_MINUTES = 60
+DEFAULT_MINUTES = 15
 
 def _load_config_timeout_minutes() -> int:
     env_val = os.getenv("ANOMSTACK_KILL_RUN_AFTER_MINUTES")
@@ -11,7 +11,7 @@ run_monitoring:
   enabled: true
   start_timeout_seconds: 300 # 5 minutes to start
   cancel_timeout_seconds: 180 # 3 minutes to cancel
-  max_runtime_seconds: 3600 # 1 hour max runtime per run
+  max_runtime_seconds: 900 # 15 minutes max runtime per run
   poll_interval_seconds: 60 # Check every minute
 
 storage:
@@ -72,7 +72,7 @@ run_monitoring:
   enabled: true
   start_timeout_seconds: 300 # 5 minutes to start
   cancel_timeout_seconds: 180 # 3 minutes to cancel
-  max_runtime_seconds: 3600 # 1 hour max runtime per run
+  max_runtime_seconds: 900 # 15 minutes max runtime per run
   max_resume_run_attempts: 2 # Resume runs after worker crashes (DockerRunLauncher only)
   poll_interval_seconds: 60 # Check every minute
 
@@ -44,7 +44,7 @@ run_monitoring:
   enabled: true
   start_timeout_seconds: 300 # 5 minutes to start (increased for cold starts)
   cancel_timeout_seconds: 180 # 3 minutes to cancel (increased)
-  max_runtime_seconds: 3600 # 1 hour max runtime per run
+  max_runtime_seconds: 900 # 15 minutes max runtime per run
   poll_interval_seconds: 30 # Check every 30 seconds (more frequent)
 
 # Disable telemetry
@@ -181,8 +181,8 @@ Lightweight defaults to prevent disk space issues.
 
 | Variable | Required | Description | Default | Example |
 |----------|----------|-------------|---------|---------|
-| `ANOMSTACK_MAX_RUNTIME_SECONDS_TAG` | No | Max job runtime in seconds | `3600` | `7200` |
-| `ANOMSTACK_KILL_RUN_AFTER_MINUTES` | No | Kill long-running jobs after N minutes | `60` | `120` |
+| `ANOMSTACK_MAX_RUNTIME_SECONDS_TAG` | No | Max job runtime in seconds | `900` | `1800` |
+| `ANOMSTACK_KILL_RUN_AFTER_MINUTES` | No | Kill long-running jobs after N minutes | `15` | `30` |
 
 ## 🐳 Docker & Deployment
 
@@ -60,7 +60,7 @@ run_monitoring:
   enabled: true
   start_timeout_seconds: 300
   cancel_timeout_seconds: 180
-  max_runtime_seconds: 3600
+  max_runtime_seconds: 900
   poll_interval_seconds: 60
 
 # Disabled telemetry to reduce disk writes
@@ -17,7 +17,7 @@ ANOMSTACK_MODEL_PATH=local:///data/models
 
 # max runtime for a job in dagster
 # https://docs.dagster.io/deployment/run-monitoring#general-run-timeouts
-ANOMSTACK_MAX_RUNTIME_SECONDS_TAG=3600
+ANOMSTACK_MAX_RUNTIME_SECONDS_TAG=900
 
 # Enable Netdata
 ANOMSTACK__NETDATA__INGEST_DEFAULT_SCHEDULE_STATUS=RUNNING
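
Review note: this change touches two related settings that are expressed in different units, the minute-based `ANOMSTACK_KILL_RUN_AFTER_MINUTES` (15) and the second-based `max_runtime_seconds` / `ANOMSTACK_MAX_RUNTIME_SECONDS_TAG` (900). The sketch below shows one way the environment override and the unit conversion could be checked. It is an illustration only: `resolve_timeout_minutes` and `minutes_to_seconds` are hypothetical names, and the fallback on malformed or non-positive values is an assumption, since the body of `_load_config_timeout_minutes` is not visible in this diff.

```python
import os

DEFAULT_MINUTES = 15  # new default introduced by this diff


def resolve_timeout_minutes(env=None) -> int:
    """Hypothetical helper: read the timeout from the environment, else use the default."""
    env = os.environ if env is None else env
    raw = env.get("ANOMSTACK_KILL_RUN_AFTER_MINUTES")
    if raw is None or not raw.strip():
        return DEFAULT_MINUTES
    try:
        minutes = int(raw)
    except ValueError:
        # Assumption: malformed values fall back to the default.
        return DEFAULT_MINUTES
    # Assumption: zero or negative values are treated as unset.
    return minutes if minutes > 0 else DEFAULT_MINUTES


def minutes_to_seconds(minutes: int) -> int:
    """Convert the minute-based setting to the unit used by max_runtime_seconds."""
    return minutes * 60


if __name__ == "__main__":
    minutes = resolve_timeout_minutes()
    print(f"kill after {minutes} min = {minutes_to_seconds(minutes)} s")
```

With no override set, the resolved value converts to 900 seconds, which matches the `max_runtime_seconds` value used in the `dagster.yaml` hunks above.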