
Commit e04ccb0

docs: update documentation and tests for disk cleanup solution
- Add comprehensive cleanup_disk_space.py documentation to maintenance README
- Document new Fly.io cleanup Makefile commands in Makefile.md
- Update test expectations to account for new cleanup job/schedule
- Include usage examples and safety features for all cleanup tools

Provides complete documentation for the disk space management solution including automated and manual cleanup options.
1 parent 4ca6bc2 commit e04ccb0

3 files changed: 85 additions, 4 deletions

Makefile.md

Lines changed: 37 additions & 0 deletions
@@ -562,6 +562,43 @@ make posthog-example
 make kill-long-runs
 ```
 
+### Fly.io Disk Space Management
+
+#### `make fly-cleanup-preview`
+**Preview disk cleanup on Fly instance (dry run)**
+- Shows what files would be removed
+- Safe way to check cleanup impact
+- Requires `FLY_APP` environment variable
+
+```bash
+export FLY_APP=anomstack-demo
+make fly-cleanup-preview
+```
+
+#### `make fly-cleanup`
+**Clean up disk space on Fly instance**
+- Removes old artifacts (6+ hours)
+- Removes old logs (24+ hours)
+- Cleans database and runs VACUUM
+- Reports disk usage before/after
+
+```bash
+export FLY_APP=anomstack-demo
+make fly-cleanup
+```
+
+#### `make fly-cleanup-aggressive`
+**Emergency disk cleanup (aggressive mode)**
+- Removes artifacts older than 1 hour
+- Removes ALL log files
+- Use only when disk is critically full
+- More thorough than normal cleanup
+
+```bash
+export FLY_APP=anomstack-demo
+make fly-cleanup-aggressive
+```
+
 ### Legacy Targets
 
 #### `make docker-dev-env`
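
The retention rules documented in the hunk above (artifacts older than 6 hours and logs older than 24 hours in normal mode, 1 hour and all logs in aggressive mode, plus a dry-run preview) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the code in `cleanup_disk_space.py`: `RetentionPolicy`, `policy_for`, and `remove_stale_files` are hypothetical names, and file age is assumed to be judged by modification time.

```python
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical container for the documented age thresholds."""
    artifact_max_age_hours: float
    log_max_age_hours: float  # 0 is used here to mean "remove every log file"


def policy_for(aggressive: bool) -> RetentionPolicy:
    # Thresholds taken from the documentation in this commit.
    if aggressive:
        return RetentionPolicy(artifact_max_age_hours=1, log_max_age_hours=0)
    return RetentionPolicy(artifact_max_age_hours=6, log_max_age_hours=24)


def remove_stale_files(
    root: Path,
    max_age_hours: float,
    dry_run: bool = True,
    now: Optional[float] = None,
) -> int:
    """Return the bytes freed (or that would be freed when dry_run is True)."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    freed = 0
    # Materialize the listing first so deletions do not disturb the traversal.
    for path in list(root.rglob("*")):
        try:
            if not path.is_file():
                continue
            stat = path.stat()
            if stat.st_mtime > cutoff:
                continue  # newer than the threshold, keep it
            size = stat.st_size
            if not dry_run:
                path.unlink()
            freed += size
        except OSError:
            # Assumed behaviour: keep going when a single file fails.
            continue
    return freed
```

Defaulting `dry_run` to `True` in this sketch means a caller has to opt in to deletion explicitly, which mirrors the preview-first workflow the Makefile targets describe.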

scripts/maintenance/README.md

Lines changed: 44 additions & 0 deletions
@@ -39,6 +39,50 @@ python kill_long_running_tasks.py
 - **Validation**: Checks job status before taking action
 - **Error Handling**: Handles unreachable user code servers gracefully
 
+### `cleanup_disk_space.py`
+Standalone script for managing disk space by cleaning up old artifacts, logs, and metrics.
+
+**Features:**
+- **Artifact Cleanup**: Removes old Dagster run artifacts
+- **Log Cleanup**: Removes old log files from multiple directories
+- **Database Cleanup**: Removes old metrics and vacuums database
+- **Disk Usage Reporting**: Shows before/after disk usage statistics
+- **Dry Run Mode**: Preview cleanup without making changes
+- **Aggressive Mode**: More thorough cleanup for emergency situations
+
+**Use Cases:**
+- **Emergency Cleanup**: Free disk space when volume is full
+- **Scheduled Maintenance**: Regular cleanup to prevent disk issues
+- **Deployment Optimization**: Optimize Fly.io volume usage
+- **Development**: Clean up after testing
+
+**Usage:**
+```bash
+# Preview what would be cleaned up
+python cleanup_disk_space.py --dry-run
+
+# Normal cleanup (6h artifacts, 24h logs)
+python cleanup_disk_space.py
+
+# Aggressive cleanup (1h artifacts, all logs)
+python cleanup_disk_space.py --aggressive
+
+# Emergency cleanup with preview
+python cleanup_disk_space.py --dry-run --aggressive
+```
+
+**Cleanup Targets:**
+- **Artifacts**: Dagster run artifacts older than 6 hours (1 hour in aggressive mode)
+- **Logs**: Log files older than 24 hours (all logs in aggressive mode)
+- **Database**: Metrics older than 90 days + VACUUM operation
+- **Locations**: `/data/artifacts`, `/tmp/dagster`, `/data/dagster_storage`
+
+**Safety Features:**
+- **Dry Run Mode**: Safe preview of cleanup actions
+- **Detailed Reporting**: Shows exactly what will be/was removed
+- **Error Handling**: Continues cleanup even if individual files fail
+- **Size Calculation**: Reports space freed by cleanup operations
+
 ## Common Maintenance Tasks
 
 ### Regular Cleanup Operations
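
The "Database" cleanup target above (metrics older than 90 days, followed by a VACUUM) could look something like the sketch below. The commit does not say which database engine is involved, so this assumes SQLite purely for illustration; the `metrics` table, its `metric_timestamp` column (ISO 8601 UTC text), and the `purge_old_metrics` helper are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def purge_old_metrics(
    conn: sqlite3.Connection,
    retention_days: int = 90,
    now: Optional[datetime] = None,
) -> int:
    """Delete rows older than the retention window, then reclaim the space."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    # Assumed schema: ISO 8601 UTC strings sort chronologically as text.
    cur = conn.execute(
        "DELETE FROM metrics WHERE metric_timestamp < ?", (cutoff,)
    )
    deleted = cur.rowcount
    # SQLite refuses to VACUUM inside an open transaction, so commit first.
    conn.commit()
    conn.execute("VACUUM")
    return deleted
```

Deleting rows alone does not shrink an SQLite file; the freed pages are only returned to the filesystem by the VACUUM step, which is presumably why the documentation lists the two together.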

tests/test_main.py

Lines changed: 4 additions & 4 deletions
@@ -9,19 +9,19 @@
 
 
 def test_jobs_len():
-    assert len(jobs) == 185
+    assert len(jobs) == 186 # Updated to include cleanup job
 
 
 def test_jobs_len_ingest():
-    assert len(ingest_jobs) == (len(jobs)-1) / 8
+    assert len(ingest_jobs) == (len(jobs)-2) / 8 # Updated to account for cleanup job (2 non-metric jobs total)
 
 
 def test_schedules_len():
-    assert len(schedules) == 185
+    assert len(schedules) == 186 # Updated to include cleanup schedule
 
 
 def test_schedules_len_ingest():
-    assert len(ingest_schedules) == (len(schedules)-1) / 8
+    assert len(ingest_schedules) == (len(schedules)-2) / 8 # Updated to account for cleanup schedule
 
 
 def test_jobs_schedules_len_match():
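
The arithmetic behind the updated expectations can be checked with a short sketch. It assumes, as the divisor in the tests suggests but the commit does not state outright, that every metric batch contributes eight jobs of which exactly one is an ingest job; `JOBS_PER_METRIC_BATCH` and `expected_ingest_count` are hypothetical names used only here.

```python
JOBS_PER_METRIC_BATCH = 8  # assumption inferred from the "/ 8" in the tests


def expected_ingest_count(total_jobs: int, non_metric_jobs: int) -> int:
    """Number of ingest jobs implied by the total job count."""
    metric_jobs = total_jobs - non_metric_jobs
    if metric_jobs % JOBS_PER_METRIC_BATCH:
        raise ValueError("metric job count is not a multiple of 8")
    return metric_jobs // JOBS_PER_METRIC_BATCH


before = expected_ingest_count(185, 1)  # (185 - 1) / 8
after = expected_ingest_count(186, 2)   # (186 - 2) / 8
print(before, after)  # prints "23 23"
```

Both counts come out the same, so the commit adds one non-metric job (and one schedule) without changing the number of ingest jobs the tests expect.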
