-
Notifications
You must be signed in to change notification settings - Fork 12
Refactor Dagster configuration and environment setup #176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Removed DAGSTER_HOME from .example.env and set it directly in docker-compose.yaml. - Updated dagster_docker.yaml to enable threading for schedules and sensors with specified worker counts. - Cleaned up dagster.yaml by uncommenting and consolidating schedule and sensor configurations. - Enhanced Makefile to streamline local development and added a new target for running Docker in development mode. - Modified kill_long_running_tasks.py to utilize a configurable timeout for long-running tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors Dagster configuration management to improve consistency between local and Docker environments. The changes standardize environment variable usage, enable threading for schedules and sensors, and make timeout handling configurable.
- Standardized
DAGSTER_HOMEconfiguration by removing it from environment files and setting it directly in Docker configurations - Enabled threading for schedules and sensors with specified worker counts in both local and Docker configurations
- Made long-running task timeout configurable by integrating with sensor configuration instead of using hardcoded values
Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/kill_long_running_tasks.py | Updated to use configurable timeout from sensor configuration instead of hardcoded 1-hour value |
| docker-compose.yaml | Added dagster_docker.yaml volume mount and hardcoded DAGSTER_HOME path for consistency |
| dagster_docker.yaml | Enabled threading for schedules/sensors and removed kill_sensor configuration |
| dagster.yaml | Uncommented and enabled schedules/sensors threading configuration, removed kill_sensor section |
| Makefile | Added docker-dev-env target and simplified local target to set DAGSTER_HOME inline |
| .example.env | Removed DAGSTER_HOME environment variable |
| # Cutoff for long-running (1 hour ago) | ||
| cutoff_time = datetime.now(timezone.utc) - timedelta(hours=1) | ||
| # Add the parent directory to sys.path to import the sensor module | ||
| sys.path.append(str(script_dir.parent)) |
Copilot
AI
Jul 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modifying sys.path at runtime can lead to import conflicts and makes the code harder to maintain. Consider using relative imports or installing the package in development mode with 'pip install -e .' instead.
| from anomstack.sensors.timeout import get_kill_after_minutes | ||
|
|
||
| # Use the same configurable timeout as the sensor | ||
| kill_after_minutes = get_kill_after_minutes() | ||
| cutoff_time = datetime.now(timezone.utc) - timedelta(minutes=kill_after_minutes) | ||
| print(f"Using {kill_after_minutes} minute timeout") |
Copilot
AI
Jul 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This import could fail if the anomstack.sensors.timeout module doesn't exist or doesn't contain the get_kill_after_minutes function. Consider adding error handling to gracefully fall back to a default timeout value if the import fails.
| from anomstack.sensors.timeout import get_kill_after_minutes | |
| # Use the same configurable timeout as the sensor | |
| kill_after_minutes = get_kill_after_minutes() | |
| cutoff_time = datetime.now(timezone.utc) - timedelta(minutes=kill_after_minutes) | |
| print(f"Using {kill_after_minutes} minute timeout") | |
| try: | |
| from anomstack.sensors.timeout import get_kill_after_minutes | |
| # Use the same configurable timeout as the sensor | |
| kill_after_minutes = get_kill_after_minutes() | |
| print(f"Using {kill_after_minutes} minute timeout") | |
| except (ImportError, AttributeError) as e: | |
| print(f"⚠️ Warning: Could not import 'get_kill_after_minutes' or call it: {e}") | |
| kill_after_minutes = 60 # Default timeout value | |
| print(f"Using default timeout of {kill_after_minutes} minutes") | |
| cutoff_time = datetime.now(timezone.utc) - timedelta(minutes=kill_after_minutes) |
| - ./tmp:/opt/dagster/app/tmp | ||
| - anomstack_metrics_duckdb:/metrics_db/duckdb | ||
| - ./dagster_home:/opt/dagster/dagster_home | ||
| - ./dagster_docker.yaml:/opt/dagster/dagster_home/dagster.yaml |
Copilot
AI
Jul 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Mounting dagster_docker.yaml directly over dagster.yaml could overwrite any existing dagster.yaml file in the container. This tight coupling makes it difficult to have different configurations per service. Consider using a more explicit configuration approach or documenting this behavior clearly.
| - ./dagster_docker.yaml:/opt/dagster/dagster_home/dagster.yaml | |
| - ./dagster_docker.yaml:/opt/dagster/dagster_home/dagster_docker.yaml |
📊 Test Coverage ReportCoverage: 63% (yellow) ✅ Coverage maintained or improved!
|
This pull request introduces several updates to improve Dagster configuration, Docker setup, and the handling of long-running tasks. The changes standardize environment variable usage, enhance configuration management, and align timeout handling across components.
Dagster Configuration Updates:
schedulesandsensorswith threading and worker configurations in bothdagster.yamlanddagster_docker.yaml, while removing outdatedkill_sensorconfigurations. [1] [2]Docker Configuration Enhancements:
docker-compose.yamlto includedagster_docker.yamlas a mounted volume for consistent configuration across environments and removed the use of theDAGSTER_HOMEenvironment variable fallback. [1] [2] [3] [4]Makefile Improvements:
docker-dev-envtarget for running Docker in development mode with the appropriate environment and reintroduced adocker-stoptarget with updated behavior. [1] [2]localtarget to directly setDAGSTER_HOMEinline, removing dependency on external configuration.Script Enhancements:
scripts/kill_long_running_tasks.pyto dynamically fetch the timeout value from the sensor configuration, ensuring consistent behavior across the system. [1] [2]Cleanup:
DAGSTER_HOMEvariable from.example.envto align with the new configuration approach.