-
Notifications
You must be signed in to change notification settings - Fork 13
General dev work #178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
General dev work #178
Conversation
- Reduce retention periods in dagster.yaml (3 days schedule, 1-3 days sensor runs) - Lower concurrent run limits in dagster_docker.yaml (10 max concurrent, reduced tag limits) - Add run monitoring with timeouts and telemetry disabled - Update .example.env with lightweight storage defaults (tmp_light directory structure) - Reduce worker counts and concurrency limits to prevent excessive storage This addresses issues with 200K+ run directories and 63GB+ storage accumulation during development.
…nd documentation Makefile improvements: - Add reset operations (gentle, medium, nuclear, full-nuclear) with interactive mode - Add Dagster cleanup targets (status, minimal, standard, aggressive) - Expand documentation targets (install, start, build, serve, clear) - All targets properly integrated with utility scripts Documentation additions: - Complete Makefile.md with detailed target documentation and usage examples - Updated scripts/README.md with utilities documentation - Clear categorization and quick reference for all available commands This provides developers with comprehensive tooling for Docker management, Dagster maintenance, and documentation workflow.
… references Documentation additions: - New storage-optimization.md with comprehensive guide on preventing Dagster storage buildup - Covers root causes, solutions, monitoring, and maintenance best practices - Added to sidebar navigation for easy access Documentation improvements: - Updated README.md with Makefile.md reference and quick command examples - Enhanced docker deployment docs with storage management best practices - Cross-referenced storage optimization guide from deployment documentation These changes provide users with essential guidance for managing Dagster storage issues and highlight the available developer tooling.
Update yarn.lock with latest documentation package dependencies.
Added essential utility scripts referenced by Makefile targets: scripts/utils/reset_docker.sh: - Interactive Docker reset utility with multiple cleanup levels - Supports gentle, medium, nuclear, and full-nuclear reset options - Comprehensive cleanup of containers, networks, volumes, and images - Interactive mode with guided user prompts scripts/utils/cleanup_dagster_storage.sh: - Dagster storage cleanup utility with configurable retention policies - Multiple cleanup levels: minimal, standard, aggressive - Storage usage monitoring and reporting - Batch cleanup operations for runs, logs, and artifacts scripts/utils/README.md: - Documentation for all utility scripts and usage patterns - Examples and best practices for each cleanup level These scripts provide the foundation for the Makefile's reset and cleanup targets.
Updated the README.md to include detailed descriptions and usage instructions for new utility scripts: - `kill_long_running_tasks.py`: A Dagster utility for managing long-running jobs. - SQLite utilities: `create_index.py`, `list_tables.py`, `list_indexes.py`, and `qry.py` for database performance optimization and inspection. Added quick examples for each script to enhance user guidance and improve overall documentation clarity.
|
Caution Review failedThe pull request is closed. WalkthroughThe changes introduce a comprehensive hot reload and storage management system for Anomstack, including new scripts, Makefile targets, and documentation. Hot reload of configuration is now possible via manual commands, scheduled jobs, or a smart file watcher, all integrated with Dagster's API. Storage optimization is addressed through new retention policies, cleanup scripts, and detailed best practices documentation. Numerous utility scripts, Makefile enhancements, and Docker configuration updates support these features. Changes
Sequence Diagram(s)Hot Reload Workflow via File WatchersequenceDiagram
participant User
participant Config File
participant Sensor (config_file_watcher)
participant Dagster API
participant Code Server
User->>Config File: Edit YAML config
Sensor-->>Config File: Monitor for changes (hash)
Sensor->>Sensor: Detects change (hash differs)
Sensor->>Dagster API: POST reload_code_location mutation
Dagster API->>Code Server: Reload code location
Code Server-->>Dagster API: Reload result (success/failure)
Dagster API-->>Sensor: Response
Sensor->>Sensor: Update stored hash if success
Manual Config Reload via ScriptsequenceDiagram
participant User
participant reload_config.py
participant Dagster API
participant Code Server
User->>reload_config.py: Run script
reload_config.py->>Dagster API: Check /server_info
reload_config.py->>reload_config.py: Validate config files
reload_config.py->>Dagster API: POST reload_code_location mutation
Dagster API->>Code Server: Reload code location
Code Server-->>Dagster API: Reload result
Dagster API-->>reload_config.py: Response
reload_config.py-->>User: Success/failure message
Scheduled Config Reload JobsequenceDiagram
participant Dagster Scheduler
participant Config Reload Job
participant Dagster API
participant Code Server
Dagster Scheduler->>Config Reload Job: Trigger per cron schedule
Config Reload Job->>Dagster API: POST reload_code_location mutation
Dagster API->>Code Server: Reload code location
Code Server-->>Dagster API: Reload result
Dagster API-->>Config Reload Job: Response
Config Reload Job->>Dagster Scheduler: Log result
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
- Add reload job for dynamic config updates - Add config reload sensor to monitor changes - Add reload script for manual config refresh - Add documentation for hot reload feature - Update config and main modules to support reloading - Update documentation sidebar to include hot reload docs
- Update docker-compose files for dev and production - Update Dagster Docker configuration - Update anomstack code Dockerfile
- Update Makefile with new commands and improvements - Update example environment configuration - Improve Dagster storage cleanup script
- Update python_ingest_simple.yaml with latest configuration format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🔭 Outside diff range comments (1)
.example.env (1)
128-128: Extraneous spaces break shellexportparsing.
ANOMSTACK_DASHBOARD_PORT = 5001contains a space before=.
Most.envloaders treat this as part of the variable name and fail to parse.-ANOMSTACK_DASHBOARD_PORT = 5001 +ANOMSTACK_DASHBOARD_PORT=5001
♻️ Duplicate comments (1)
scripts/reload_config.py (1)
21-113: Well-implemented reload function with excellent user feedback.The function provides comprehensive error handling and clear console output for different scenarios.
This function duplicates the logic in
anomstack/config.py. Consider the refactoring suggestion made for that file to reduce code duplication.
🧹 Nitpick comments (21)
.example.env (2)
72-75: Comment values (“Reduced from …”) risk drifting – make them code, not docs.Hard-coding previous values in comments becomes stale quickly. Prefer deriving from a single source (defaults file,
make print-env, etc.) or remove the inline deltas.
145-147: Default combo (AUTO_CONFIG_RELOAD=false,CONFIG_WATCHER=true) is counter-intuitive.If the goal is to reload only when files change, disable the watcher by default
and let users opt-in, otherwise the daemon polls forever on first run.docker-compose.dev.yaml (1)
10-12: ExposeDAGSTER_HOST/PORTvia.envto avoid hard-coding service names.When users override the webserver service name or port, these in-file literals
break hot-reload. Make them configurable:environment: DAGSTER_HOST: ${DAGSTER_HOST:-anomstack_webserver} DAGSTER_PORT: ${DAGSTER_PORT:-3000}README.md (1)
481-488: Minor markdown nit – align code block fence with list bullet.Indenting the back-ticks by two spaces renders the block as plain text on some
parsers. Remove the leading spaces:-```bash +```bashdagster.yaml (1)
33-35: Trailing whitespace violates yamllint and pollutes diffsA minor style nit – lines 33-35 carry stray spaces after the values.
- skipped: 1 # Keep only 1 day of skipped runs·· - failure: 3 # Keep 3 days of failures for debugging·· - success: 1 # Keep only 1 day of successful runs·· + skipped: 1 # Keep only 1 day of skipped runs + failure: 3 # Keep 3 days of failures for debugging + success: 1 # Keep only 1 day of successful runsdagster_docker.yaml (1)
59-67: Aggressive retention policies look appropriate for storage optimization.The retention configuration effectively manages storage growth with reasonable timeframes for different run types.
Fix the trailing whitespace on line 66:
- failure: 3 # Keep 3 days of failed runs for debugging + failure: 3 # Keep 3 days of failed runs for debuggingscripts/README.md (1)
69-95: Well-documented Docker reset utility section.The documentation clearly explains the different reset levels and safety features, which is important for such a potentially destructive tool.
Consider adding a subject to improve the sentence on line 78:
-Can be run interactively or with specific reset levels via Makefile targets. +It can be run interactively or with specific reset levels via Makefile targets.anomstack/config.py (2)
92-183: Well-implemented hot reload functionality with comprehensive error handling.The GraphQL mutation approach and detailed response parsing provide robust configuration reload capabilities.
Consider extracting the common reload logic to avoid duplication with
scripts/reload_config.py. The functions are nearly identical except for logging vs print statements:# Common module approach from .reload_common import reload_code_location_impl def reload_code_location(location_name="anomstack_code"): return reload_code_location_impl(location_name, get_dagster_logger())
94-94: Move imports to top level for better performance.The
import requestsinside the function causes repeated import overhead on each call.Move the import to the top of the file:
import os from pathlib import Path import yaml from dagster import get_dagster_logger +import requestsscripts/reload_config.py (1)
10-11: Remove unused imports to clean up the code.The
jsonandtimeimports are not used in this script.Remove the unused imports:
import os import sys -import json -import time import requests from pathlib import Pathscripts/utils/README.md (1)
90-102: Good interactive mode documentation with clear example.The example session helps users understand what to expect from the interactive interface.
Add language specification to the code block for better formatting:
-**Example Interactive Session:** -``` +**Example Interactive Session:** +```text 🐳 Anomstack Docker Reset Toolanomstack/sensors/config_reload.py (4)
9-9: Remove unused importsStatic analysis correctly identifies unused imports that should be removed for cleaner code:
-from typing import Dict, Optional +from typing import Optional-from dagster import ( - RunRequest, - SensorEvaluationContext, +from dagster import ( + SensorEvaluationContext,Also applies to: 12-12
26-63: LGTM: Robust hash calculation with good error handlingThe hash calculation correctly combines file path, modification time, and content for comprehensive change detection. The error handling ensures the sensor continues operating even with file access issues.
Consider adding debug logging for file access errors to aid troubleshooting:
except (OSError, IOError) as e: # File might have been deleted or is unreadable, skip it + logger = get_dagster_logger() + logger.debug(f"Skipping file {yaml_file} due to access error: {e}") hash_md5.update(f"ERROR:{e}".encode())
83-101: Consider simplifying the wrapper functionWhile the function works correctly, it's essentially a thin wrapper around
execute_config_reload(). Consider whether the sensor-specific logging justifies the extra layer, or if you could callexecute_config_reload()directly and enhance its logging instead.The current implementation is functional and provides good traceability for sensor executions.
176-184: Consider consistent sensor namingThe disabled sensor uses a different name (
config_file_watcher_disabledvsconfig_file_watcher), which could cause confusion. Consider using the same name with different behavior:@sensor( - name="config_file_watcher_disabled", + name="config_file_watcher", minimum_interval_seconds=3600, # Check once per hour ) def config_file_watcher(context: SensorEvaluationContext):This ensures consistent sensor names regardless of enabled state.
Makefile (1)
35-35: Consider preventing duplicate environment variablesThe new Docker operations are well-implemented, but the
enable-*targets could create duplicate entries if run multiple times:enable-auto-reload: @echo "🤖 Enabling automatic configuration reloading..." - @echo "ANOMSTACK_AUTO_CONFIG_RELOAD=true" >> .env - @echo "ANOMSTACK_CONFIG_RELOAD_STATUS=RUNNING" >> .env + @grep -q "ANOMSTACK_AUTO_CONFIG_RELOAD=" .env || echo "ANOMSTACK_AUTO_CONFIG_RELOAD=true" >> .env + @grep -q "ANOMSTACK_CONFIG_RELOAD_STATUS=" .env || echo "ANOMSTACK_CONFIG_RELOAD_STATUS=RUNNING" >> .env @echo "✅ Auto reload enabled! Restart containers: make docker-restart"This prevents duplicate entries while maintaining the same user experience.
Also applies to: 127-147
docs/docs/storage-optimization.md (1)
97-97: Remove the trailing colon from the heading.Markdown-lint complains (
MD026) and most renderers display the extra colon awkwardly.-#### Cleanup Levels: +#### Cleanup LevelsMakefile.md (2)
136-142: Specify a language for fenced code blocks.Two blocks beginning at these lines use plain back-ticks without a language tag, triggering
MD040and losing syntax highlighting.-``` +# bash +```bashApply the same change to the following block a few lines below.
263-269: Convert bare URLs to Markdown links.Bare links violate
MD034and reduce readability. Example fix:-https://docs.docker.com/engine/reference/commandline/system_prune/ +[Docker system-prune docs](https://docs.docker.com/engine/reference/commandline/system_prune/)Repeat for the other instance on line 268.
scripts/utils/cleanup_dagster_storage.sh (2)
15-24: Deduplicate common helpers across utility scripts.
print_*andconfirmare byte-for-byte identical to the versions inreset_docker.sh. Factor them into a singlescripts/utils/common.sh(andsourceit) to avoid future drift.
362-366:clearbreaks non-interactive usage.Running the script from cron or CI with no TTY will fail.
Guard the call:-if tty -s; then clear; fi +if tty -s; then clear; fi
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
docs/yarn.lockis excluded by!**/yarn.lock,!**/*.lock
📒 Files selected for processing (23)
.example.env(2 hunks)Makefile(4 hunks)Makefile.md(1 hunks)README.md(1 hunks)anomstack/config.py(1 hunks)anomstack/jobs/reload.py(1 hunks)anomstack/main.py(3 hunks)anomstack/sensors/config_reload.py(1 hunks)dagster.yaml(1 hunks)dagster_docker.yaml(3 hunks)docker-compose.dev.yaml(1 hunks)docker-compose.yaml(4 hunks)docker/Dockerfile.anomstack_code(2 hunks)docs/docs/configuration/hot-reload.md(1 hunks)docs/docs/deployment/docker.md(1 hunks)docs/docs/storage-optimization.md(1 hunks)docs/sidebars.js(1 hunks)metrics/examples/python/python_ingest_simple/python_ingest_simple.yaml(1 hunks)scripts/README.md(2 hunks)scripts/reload_config.py(1 hunks)scripts/utils/README.md(1 hunks)scripts/utils/cleanup_dagster_storage.sh(1 hunks)scripts/utils/reset_docker.sh(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (5)
anomstack/config.py (1)
scripts/reload_config.py (1)
reload_code_location(21-113)
anomstack/sensors/config_reload.py (1)
anomstack/config.py (1)
execute_config_reload(186-236)
anomstack/jobs/reload.py (1)
anomstack/config.py (1)
execute_config_reload(186-236)
scripts/utils/reset_docker.sh (1)
scripts/utils/cleanup_dagster_storage.sh (5)
print_info(16-16)print_success(17-17)confirm(22-40)print_error(19-19)usage(430-442)
scripts/utils/cleanup_dagster_storage.sh (1)
scripts/utils/reset_docker.sh (6)
print_info(16-16)print_warning(18-18)print_success(17-17)confirm(22-40)print_error(19-19)usage(220-230)
🪛 YAMLlint (1.37.1)
dagster.yaml
[error] 33-33: trailing spaces
(trailing-spaces)
dagster_docker.yaml
[error] 66-66: trailing spaces
(trailing-spaces)
🪛 LanguageTool
scripts/README.md
[style] ~78-~78: To form a complete sentence, be sure to include a subject.
Context: ...reset + complete Docker system cleanup Can be run interactively or with specific r...
(MISSING_IT_THERE)
🪛 markdownlint-cli2 (0.17.2)
docs/docs/storage-optimization.md
97-97: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
scripts/utils/README.md
91-91: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
Makefile.md
136-136: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
141-141: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
263-263: Bare URL used
(MD034, no-bare-urls)
268-268: Bare URL used
(MD034, no-bare-urls)
🪛 Ruff (0.12.2)
anomstack/sensors/config_reload.py
9-9: typing.Dict imported but unused
Remove unused import: typing.Dict
(F401)
12-12: dagster.RunRequest imported but unused
Remove unused import: dagster.RunRequest
(F401)
scripts/reload_config.py
10-10: json imported but unused
Remove unused import: json
(F401)
11-11: time imported but unused
Remove unused import: time
(F401)
🔇 Additional comments (34)
.example.env (1)
69-77: Ensuretmp_light/*directories exist in containers or switch to absolute pathsThe
tmp_light/*paths in.example.env(lines 69–77) won’t be present in a fresh container image, and we didn’t find any Dockerfiles or entrypoint scripts that create them. To prevent Dagster runtime failures:Either update to an absolute, pre-existing path:
# Docker-friendly defaults -ANOMSTACK_DAGSTER_LOCAL_ARTIFACT_STORAGE_DIR=tmp_light/artifacts +ANOMSTACK_DAGSTER_LOCAL_ARTIFACT_STORAGE_DIR=/opt/dagster/artifactsOr ensure directory creation in your image build/entrypoint, for example:
mkdir -p tmp_light/{artifacts,compute_logs,storage} chown dagster:dagster tmp_light/*Please verify that these directories are created (and owned correctly) in your container environment.
metrics/examples/python/python_ingest_simple/python_ingest_simple.yaml (1)
4-5: Staggered cron values are good – check divisibility with downstream jobs.
ingest */9/train */11desynchronise nicely, but verify that
score_cron_schedule */6still fires after an ingest has produced data and a
train has completed; otherwise score may run on stale data and generate empty
alerts. Consider a short buffer sensor instead of pure cron.docs/sidebars.js (1)
70-79: Sidebar additions look correct.No logical issues – new docs will render in the expected categories.
docs/docs/deployment/docker.md (1)
42-50: Check relative link path to the Storage Optimization guideThe new cross-reference points to
../storage-optimization.md, but the file was added underdocs/docs/.... Please confirm the relative path renders correctly once the docs are built; broken links slip through CI easily.anomstack/main.py (1)
13-20: Ensure the reload artefacts have unique Dagster names
reload_jobs,reload_schedules, andconfig_file_watcherwill be merged into the global registry alongside many existing definitions. Dagster will raise aDagsterInvalidDefinitionErrorif any names collide. Double-check the underlying modules export uniquename=arguments.docker/Dockerfile.anomstack_code (1)
25-27: Verify gRPC health-check still succeeds withcode-serverThe container now starts
dagster code-server, but the health-check still pingsdagster api grpc-health-check.code-servershould keep the gRPC endpoint alive, yet if it’s disabled in future versions the container will be marked unhealthy.Please confirm the probe passes; if not, switch to the GraphQL
/server_infocurl used elsewhere.dagster_docker.yaml (3)
9-16: LGTM! Concurrency reduction will help manage storage growth.The reduced limits from 25→10 concurrent runs and tag-specific reductions are appropriate for preventing storage issues while maintaining reasonable throughput.
69-75: Run monitoring configuration provides good lifecycle management.The timeout settings (5min start, 3min cancel, 1hr max runtime) with 1-minute polling provide effective stuck run detection and cleanup.
77-79: Good optimization to disable telemetry for reduced disk writes.This aligns well with the storage optimization goals of this PR.
scripts/README.md (3)
11-24: Excellent documentation for the kill_long_running_tasks utility.Clear description of features and usage instructions will help users understand this important maintenance tool.
29-67: Comprehensive SQLite utilities documentation.The detailed sections for each utility script with usage examples provide excellent guidance for database management tasks.
109-135: Great addition of quick examples section.The practical command-line examples will help users quickly accomplish common tasks without reading through all the detailed documentation.
anomstack/config.py (1)
186-236: Excellent validation wrapper for the reload functionality.The pre-flight checks for Dagster accessibility and configuration files provide good user experience and prevent common failure scenarios.
scripts/reload_config.py (1)
146-180: Excellent main function with comprehensive validation and user guidance.The script provides clear feedback, troubleshooting tips, and proper exit codes for automation use.
scripts/utils/README.md (3)
7-18: Excellent documentation structure for the Docker reset utility.The clear feature list with emojis and descriptions provides users with a quick understanding of the utility's capabilities and safety features.
19-76: Comprehensive documentation of reset levels with clear use cases.The progressive reset levels (gentle → medium → nuclear → full-nuclear) are well-explained with specific use cases and command examples.
104-158: Excellent safety documentation and usage guidance.The sections on preserved files, safety features, and usage tips provide crucial information for users of this potentially destructive utility.
anomstack/jobs/reload.py (4)
1-19: LGTM: Clean imports and configurationThe imports are well-organized and the environment variable pattern with sensible defaults is consistent with the codebase.
82-95: LGTM: Clean configuration logicThe function correctly implements the enabling logic with proper boolean conversion and reasonable defaults for development mode.
98-128: LGTM: Well-structured conditional feature activationThe module-level conditional execution properly handles the optional nature of this feature. The environment variable handling with sensible defaults and clear logging provides good operational visibility.
The pattern of appending to
reload_jobsandreload_scheduleslists suggests these are consumed elsewhere in the application, which is a clean approach for modular feature management.
21-79: Defaults match Docker Compose configurationVerified that the hardcoded defaults in
reload_configuration(DAGSTER_HOST="anomstack_webserver"andDAGSTER_PORT="3000") align with the values defined in bothdocker-compose.yamlanddocker-compose.dev.yaml. No changes required.anomstack/sensors/config_reload.py (4)
21-24: LGTM: Clean utility functionThe
get_metrics_directory()function correctly usesPath.resolve()to get an absolute path, which is good practice for file operations.
66-80: LGTM: Clean cursor management utilitiesThe cursor management functions are well-designed with appropriate error handling for JSON operations and follow Dagster sensor patterns correctly.
103-117: LGTM: Consistent enabling logicThe
should_enable_config_watcher()function correctly mirrors the approach used in the reload job, maintaining consistency across the configuration reload system.
123-175: LGTM: Well-structured sensor logic with smart cursor managementThe sensor implementation correctly handles the baseline establishment, change detection, and reload workflow. The decision to only update the cursor on successful reload ensures failed reloads will be retried, which is excellent for reliability.
The comprehensive error handling and detailed logging provide good operational visibility.
docs/docs/configuration/hot-reload.md (1)
1-337: LGTM: Comprehensive and accurate documentationThis documentation excellently covers the hot reload system with:
- ✅ Clear explanation of all three reload methods (manual, scheduled, smart file watcher)
- ✅ Accurate environment variable configurations that match the code implementation
- ✅ Proper distinction between what supports hot reload vs. requires container restart
- ✅ Comprehensive troubleshooting section
- ✅ Best practices for safe configuration management
- ✅ Good examples and command references
The documentation structure is logical and user-friendly, making it easy for developers to understand and implement the hot reload capabilities effectively.
scripts/utils/reset_docker.sh (4)
1-40: LGTM: Solid script foundation with good practicesThe script header follows bash best practices with
set -efor error handling, robust directory detection, and well-structured utility functions. Theconfirm()function properly handles default values and user input validation.
42-76: LGTM: Excellent user safety and communication featuresThe data size reporting and preservation messaging provide crucial user awareness before destructive operations. The defensive programming with existence checks and error suppression prevents script failures while still providing useful information.
78-217: LGTM: Well-designed reset levels with excellent safety measuresThe progressive reset levels with increasing destructiveness are well-implemented:
- ✅ Appropriate warnings and confirmations for each level
- ✅ Double confirmation for nuclear options
- ✅ Comprehensive cleanup of relevant directories
- ✅ Proper error handling with fallback commands
- ✅ Good integration with Makefile targets
- ✅ Defensive programming with existence checks before
rm -rfThe safety measures effectively prevent accidental data loss while providing the necessary cleanup capabilities.
219-283: LGTM: Clean main function with comprehensive option handlingThe main function provides both command-line and interactive interfaces effectively. The usage information is clear, error handling is appropriate, and the interactive menu makes the script accessible to users who prefer guided operation.
Makefile (3)
162-186: LGTM: Well-organized reset operations sectionThe reset operations are cleanly organized with clear descriptions and proper integration with the reset script. The interactive default option provides a user-friendly entry point while still supporting direct command-line usage.
188-212: LGTM: Comprehensive Dagster cleanup integrationThe Dagster cleanup section provides excellent granular control over storage cleanup with clear descriptions and proper script integration. The status option is particularly valuable for understanding storage usage before cleanup.
262-286: LGTM: Comprehensive documentation workflow supportThe enhanced documentation targets provide full lifecycle management for the Docusaurus site with clear, standard naming conventions. The backward-compatible alias for the
docstarget is thoughtful.scripts/utils/cleanup_dagster_storage.sh (1)
133-141: Unsafe deletion when directory names contain spaces.
find … -exec rm -rf {} \;splits on whitespace. Use a null-delimited pipeline, mirroring the approach in the aggressive cleanup:-find tmp/ -maxdepth 1 -type d -name "*-*-*-*-*" -mtime +30 -exec rm -rf {} \; 2>/dev/null || true +find tmp/ -maxdepth 1 -type d -name "*-*-*-*-*" -mtime +30 -print0 2>/dev/null | xargs -0 rm -rf -- 2>/dev/null || trueReplicate the change in the nuclear cleanup block below.
Likely an incorrect or invalid review comment.
| # automatic configuration reloading | ||
| # Enable scheduled config reload job (runs every N minutes) | ||
| ANOMSTACK_AUTO_CONFIG_RELOAD=false | ||
| ANOMSTACK_CONFIG_RELOAD_CRON="*/5 * * * *" | ||
| ANOMSTACK_CONFIG_RELOAD_STATUS=STOPPED | ||
|
|
||
| # Enable smart config file watcher sensor (reloads only when files change) | ||
| ANOMSTACK_CONFIG_WATCHER=true | ||
| ANOMSTACK_CONFIG_WATCHER_INTERVAL=30 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watcher variable name mismatch will disable the feature completely.
Code & docs previously referenced ANOMSTACK_CONFIG_WATCHER_POLL_INTERVAL,
but the file introduces ANOMSTACK_CONFIG_WATCHER_INTERVAL.
-ANOMSTACK_CONFIG_WATCHER_INTERVAL=30
+ANOMSTACK_CONFIG_WATCHER_POLL_INTERVAL=30 # keep in sync with anomstack/config.pySearch the repo for the constant to confirm.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # automatic configuration reloading | |
| # Enable scheduled config reload job (runs every N minutes) | |
| ANOMSTACK_AUTO_CONFIG_RELOAD=false | |
| ANOMSTACK_CONFIG_RELOAD_CRON="*/5 * * * *" | |
| ANOMSTACK_CONFIG_RELOAD_STATUS=STOPPED | |
| # Enable smart config file watcher sensor (reloads only when files change) | |
| ANOMSTACK_CONFIG_WATCHER=true | |
| ANOMSTACK_CONFIG_WATCHER_INTERVAL=30 | |
| # automatic configuration reloading | |
| # Enable scheduled config reload job (runs every N minutes) | |
| ANOMSTACK_AUTO_CONFIG_RELOAD=false | |
| ANOMSTACK_CONFIG_RELOAD_CRON="*/5 * * * *" | |
| ANOMSTACK_CONFIG_RELOAD_STATUS=STOPPED | |
| # Enable smart config file watcher sensor (reloads only when files change) | |
| ANOMSTACK_CONFIG_WATCHER=true | |
| ANOMSTACK_CONFIG_WATCHER_POLL_INTERVAL=30 # keep in sync with anomstack/config.py |
🤖 Prompt for AI Agents
In the .example.env file around lines 143 to 151, the environment variable for
the watcher interval is named ANOMSTACK_CONFIG_WATCHER_INTERVAL, but the code
and documentation expect ANOMSTACK_CONFIG_WATCHER_POLL_INTERVAL. Rename
ANOMSTACK_CONFIG_WATCHER_INTERVAL to ANOMSTACK_CONFIG_WATCHER_POLL_INTERVAL to
match the expected variable name and ensure the watcher feature works correctly.
Verify by searching the repo for all references to this constant to confirm
consistency.
| build: | ||
| context: . | ||
| dockerfile: docker/Dockerfile.anomstack_code | ||
| container_name: anomstack_code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DAGSTER_CURRENT_IMAGE now points at a different image – runs may launch stale code
The service is locally built (build:) but DAGSTER_CURRENT_IMAGE is still hard-coded to the remote andrewm4894/anomstack_code:latest. Dagster will spin up new run containers from that tag, not the image you just built, defeating hot-reload consistency.
- DAGSTER_CURRENT_IMAGE: "andrewm4894/anomstack_code:latest"
+# Use the freshly-built image so launched runs match this container
+# (Docker Compose will tag it as "<project>_anomstack_code").
+ DAGSTER_CURRENT_IMAGE: "${DAGSTER_CURRENT_IMAGE:-anomstack_code}"Also applies to: 54-59
🤖 Prompt for AI Agents
In docker-compose.yaml around lines 38 to 41 and 54 to 59, the
DAGSTER_CURRENT_IMAGE environment variable is hard-coded to a remote image tag,
causing new run containers to use stale code instead of the locally built image.
Update DAGSTER_CURRENT_IMAGE to reference the locally built image by using the
container_name or build context image tag, ensuring that new runs launch with
the latest local build for hot-reload consistency.
| # Remove run directories older than 1 day | ||
| print_info "Removing run directories older than 24 hours..." | ||
| find tmp/ -maxdepth 1 -type d -name "*-*-*-*-*" -mtime +1 -exec rm -rf {} \; 2>/dev/null || true | ||
|
|
||
| # Remove all logs older than 1 day | ||
| find tmp/ -name "*.log" -mtime +1 -delete 2>/dev/null || true | ||
| fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Same whitespace-safe deletion issue as above.
Apply the -print0 | xargs -0 pattern here as well to avoid accidental partial path deletion.
🤖 Prompt for AI Agents
In scripts/utils/cleanup_dagster_storage.sh around lines 274 to 280, the find
commands deleting run directories and log files use -exec and -delete directly,
which can cause issues with paths containing whitespace. Replace these with the
safer pattern using -print0 piped to xargs -0 rm -rf (for directories) and xargs
-0 rm (for files) to ensure paths with spaces are handled correctly and prevent
accidental partial deletions.
- Adjusted ingest, train, score, and alert cron schedules for improved timing: - Ingest schedule changed from "*/9 * * * *" to "*/10 * * * *" - Train schedule changed from "*/11 * * * *" to "*/15 * * * *" - Score schedule updated from "*/6 * * * *" to "*/11 * * * *" - Alert schedule modified from "*/5 * * * *" to "*/12 * * * *"
- Added ANOMSTACK_HOME variable to .example.env for home directory configuration. - Updated dagster_docker.yaml to use ANOMSTACK_HOME for volume mounts, improving flexibility and portability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
dagster_docker.yaml (1)
67-67: Strip trailing spaces (YAMLLint error).Line 67 has two trailing spaces that violate the project’s lint rules.
- failure: 3 # Keep 3 days of failed runs for debugging·· + failure: 3 # Keep 3 days of failed runs for debugging
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.example.env(3 hunks)dagster_docker.yaml(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .example.env
🧰 Additional context used
🪛 YAMLlint (1.37.1)
dagster_docker.yaml
[error] 67-67: trailing spaces
(trailing-spaces)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (3)
dagster_docker.yaml (3)
31-39: VerifyANOMSTACK_HOMEavailability & permissions.
ANOMSTACK_HOMEis now propagated into launched containers and used for three host-path mounts.
If this variable is not set (or points to a non-absolute / non-existent path) Dagster will fail to start runs because Docker will refuse the bind mounts.Please confirm that:
ANOMSTACK_HOMEis exported in every environment wheredagster-daemonanddagitrun (dev, CI, prod).- The directory structure (
tmp,dagster_home,metrics) exists and is readable / writable by the container UID.A quick smoke test with
docker run --rm -v "${ANOMSTACK_HOME}/tmp:/data busybox ls /data"is enough to catch mis-configuration early.
60-68: Retention section may not match Dagster’s expected schema.The YAML keys look correct at first glance, but Dagster’s retention config is still marked “experimental” and the accepted field names changed twice in recent releases.
Make sure the current Dagster version recognises:retention: schedule: purge_after_days: … sensor: purge_after_days: skipped: … failure: … success: …Otherwise the daemon will silently ignore the block and disk usage will grow again.
Double-check withdagster debug print-configor start the daemon in verbose mode to see whether the policy is loaded.
70-76: Confirmcancel_timeout_secondsis supported.
run_monitoringdefinitely supportsstart_timeout_seconds,max_runtime_seconds, andpoll_interval_seconds.
For the cancel timeout Dagster currently usescancel_grace_period_seconds(at least up to 1.7.x). Using an unknown key would again be silently ignored.Please validate against the Dagster version you pin in
requirements*.txtor runtime image.
If the correct key iscancel_grace_period_seconds, patch as follows:- cancel_timeout_seconds: 180 # 3 minutes to cancel + cancel_grace_period_seconds: 180 # 3 minutes to cancel
- Updated the environment variable handling in get_specs to ensure overrides only occur if parameters are not specified in defaults or YAML files. - Improved logging to reflect the changes made by environment variable overrides.
📊 Test Coverage ReportCoverage: 59% (yellow) ✅ Coverage maintained or improved!
|
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores