Skip to content

Conversation

@andrewm4894
Copy link
Owner

  • Removed DAGSTER_HOME from .example.env and set it directly in docker-compose.yaml.
  • Updated dagster_docker.yaml to enable threading for schedules and sensors with specified worker counts.
  • Cleaned up dagster.yaml by uncommenting and consolidating schedule and sensor configurations.
  • Enhanced Makefile to streamline local development and added a new target for running Docker in development mode.
  • Modified kill_long_running_tasks.py to utilize a configurable timeout for long-running tasks.

This pull request introduces several updates to improve Dagster configuration, Docker setup, and the handling of long-running tasks. The changes standardize environment variable usage, enhance configuration management, and align timeout handling across components.

Dagster Configuration Updates:

  • Enabled schedules and sensors with threading and worker configurations in both dagster.yaml and dagster_docker.yaml, while removing outdated kill_sensor configurations. [1] [2]

Docker Configuration Enhancements:

  • Updated docker-compose.yaml to include dagster_docker.yaml as a mounted volume for consistent configuration across environments and removed the use of the DAGSTER_HOME environment variable fallback. [1] [2] [3] [4]

Makefile Improvements:

  • Added a new docker-dev-env target for running Docker in development mode with the appropriate environment and reintroduced a docker-stop target with updated behavior. [1] [2]
  • Simplified the local target to directly set DAGSTER_HOME inline, removing dependency on external configuration.

Script Enhancements:

  • Updated scripts/kill_long_running_tasks.py to dynamically fetch the timeout value from the sensor configuration, ensuring consistent behavior across the system. [1] [2]

Cleanup:

  • Removed the unused DAGSTER_HOME variable from .example.env to align with the new configuration approach.

- Removed DAGSTER_HOME from .example.env and set it directly in docker-compose.yaml.
- Updated dagster_docker.yaml to enable threading for schedules and sensors with specified worker counts.
- Cleaned up dagster.yaml by uncommenting and consolidating schedule and sensor configurations.
- Enhanced Makefile to streamline local development and added a new target for running Docker in development mode.
- Modified kill_long_running_tasks.py to utilize a configurable timeout for long-running tasks.
@andrewm4894 andrewm4894 requested a review from Copilot July 17, 2025 14:31
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors Dagster configuration management to improve consistency between local and Docker environments. The changes standardize environment variable usage, enable threading for schedules and sensors, and make timeout handling configurable.

  • Standardized DAGSTER_HOME configuration by removing it from environment files and setting it directly in Docker configurations
  • Enabled threading for schedules and sensors with specified worker counts in both local and Docker configurations
  • Made long-running task timeout configurable by integrating with sensor configuration instead of using hardcoded values

Reviewed Changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
scripts/kill_long_running_tasks.py Updated to use configurable timeout from sensor configuration instead of hardcoded 1-hour value
docker-compose.yaml Added dagster_docker.yaml volume mount and hardcoded DAGSTER_HOME path for consistency
dagster_docker.yaml Enabled threading for schedules/sensors and removed kill_sensor configuration
dagster.yaml Uncommented and enabled schedules/sensors threading configuration, removed kill_sensor section
Makefile Added docker-dev-env target and simplified local target to set DAGSTER_HOME inline
.example.env Removed DAGSTER_HOME environment variable

# Cutoff for long-running (1 hour ago)
cutoff_time = datetime.now(timezone.utc) - timedelta(hours=1)
# Add the parent directory to sys.path to import the sensor module
sys.path.append(str(script_dir.parent))
Copy link

Copilot AI Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modifying sys.path at runtime can lead to import conflicts and makes the code harder to maintain. Consider using relative imports or installing the package in development mode with 'pip install -e .' instead.

Copilot uses AI. Check for mistakes.
Comment on lines +23 to +28
from anomstack.sensors.timeout import get_kill_after_minutes

# Use the same configurable timeout as the sensor
kill_after_minutes = get_kill_after_minutes()
cutoff_time = datetime.now(timezone.utc) - timedelta(minutes=kill_after_minutes)
print(f"Using {kill_after_minutes} minute timeout")
Copy link

Copilot AI Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import could fail if the anomstack.sensors.timeout module doesn't exist or doesn't contain the get_kill_after_minutes function. Consider adding error handling to gracefully fall back to a default timeout value if the import fails.

Suggested change
from anomstack.sensors.timeout import get_kill_after_minutes
# Use the same configurable timeout as the sensor
kill_after_minutes = get_kill_after_minutes()
cutoff_time = datetime.now(timezone.utc) - timedelta(minutes=kill_after_minutes)
print(f"Using {kill_after_minutes} minute timeout")
try:
from anomstack.sensors.timeout import get_kill_after_minutes
# Use the same configurable timeout as the sensor
kill_after_minutes = get_kill_after_minutes()
print(f"Using {kill_after_minutes} minute timeout")
except (ImportError, AttributeError) as e:
print(f"⚠️ Warning: Could not import 'get_kill_after_minutes' or call it: {e}")
kill_after_minutes = 60 # Default timeout value
print(f"Using default timeout of {kill_after_minutes} minutes")
cutoff_time = datetime.now(timezone.utc) - timedelta(minutes=kill_after_minutes)

Copilot uses AI. Check for mistakes.
- ./tmp:/opt/dagster/app/tmp
- anomstack_metrics_duckdb:/metrics_db/duckdb
- ./dagster_home:/opt/dagster/dagster_home
- ./dagster_docker.yaml:/opt/dagster/dagster_home/dagster.yaml
Copy link

Copilot AI Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Mounting dagster_docker.yaml directly over dagster.yaml could overwrite any existing dagster.yaml file in the container. This tight coupling makes it difficult to have different configurations per service. Consider using a more explicit configuration approach or documenting this behavior clearly.

Suggested change
- ./dagster_docker.yaml:/opt/dagster/dagster_home/dagster.yaml
- ./dagster_docker.yaml:/opt/dagster/dagster_home/dagster_docker.yaml

Copilot uses AI. Check for mistakes.
@github-actions
Copy link

📊 Test Coverage Report

Coverage: 63% (yellow)

✅ Coverage maintained or improved!

💡 See detailed coverage report in the tests README

@andrewm4894 andrewm4894 marked this pull request as ready for review July 17, 2025 15:37
@andrewm4894 andrewm4894 merged commit 70f3a52 into main Jul 17, 2025
2 checks passed
@andrewm4894 andrewm4894 deleted the improve-docker-and-long-running-sensors branch July 17, 2025 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants