
Commit 3a7d028

Simplify Docker architecture and remove Docker Hub dependencies

- Remove all andrewm4894/anomstack image references, force local builds
- Switch from DockerRunLauncher to DefaultRunLauncher for simpler deployment
- Replace PostgreSQL with SQLite storage for reduced complexity
- Remove Docker socket and complex volume mounts
- Update Makefile to remove Docker Hub commands (tag, push, pull)
- Simplify docker-smart command to build locally instead of pulling
- Update DOCKER.md to reflect simplified 4-container architecture
- Align Docker setup with Fly.io approach for consistency

Benefits:
- Fewer resources required (no PostgreSQL container)
- Simpler troubleshooting (fewer moving parts)
- Users always get latest code (local builds)
- Easier deployment and development setup
1 parent 6a626b1 commit 3a7d028

4 files changed: +111 -190 lines changed


DOCKER.md

Lines changed: 44 additions & 46 deletions
@@ -4,15 +4,21 @@ This document provides comprehensive information about the Docker setup for the
 
 ## Overview
 
-Anomstack uses Docker to containerize all components for easy deployment and development. The setup includes:
+Anomstack uses a **simplified Docker architecture** with local image building and SQLite storage for easy deployment and development. The setup includes:
 
 - **Dagster Code Server**: Runs your data pipelines
-- **Dagster UI (Dagit)**: Web interface for pipeline management
-- **Dagster Daemon**: Background process for scheduling and execution
+- **Dagster UI**: Web interface for pipeline management
+- **Dagster Daemon**: Background process for scheduling and execution (uses DefaultRunLauncher)
 - **Dashboard**: FastHTML-based metrics dashboard
-- **PostgreSQL**: Database for Dagster metadata
+- **SQLite Storage**: All Dagster metadata stored in SQLite files (no separate database)
 - **DuckDB Volume**: Persistent storage for metrics data
 
+### Key Simplifications
+- **No PostgreSQL**: Uses SQLite for simpler deployment
+- **No Docker Socket**: DefaultRunLauncher runs jobs in same container
+- **Local Builds**: All images built locally (no Docker Hub dependency)
+- **Fewer Resources**: Reduced memory and complexity
+
 ## Quick Start
 
 ```bash
@@ -27,39 +33,32 @@ make docker
 
 ## Services
 
-### 1. PostgreSQL Database (`anomstack_postgresql`)
-- **Image**: `postgres:11`
-- **Purpose**: Stores Dagster run history, schedules, and metadata
-- **Port**: 5432 (configurable via `ANOMSTACK_POSTGRES_FORWARD_PORT`)
-- **Environment Variables**:
-  - `ANOMSTACK_POSTGRES_USER` (default: postgres_user)
-  - `ANOMSTACK_POSTGRES_PASSWORD` (default: postgres_password)
-  - `ANOMSTACK_POSTGRES_DB` (default: postgres_db)
-
-### 2. Code Server (`anomstack_code`)
-- **Image**: `andrewm4894/anomstack_code:latest`
+### 1. Code Server (`anomstack_code`)
+- **Image**: Built locally from `docker/Dockerfile.anomstack_code`
 - **Purpose**: Runs Dagster user code via gRPC
 - **Port**: 4000 (internal)
 - **Restart Policy**: `always`
 - **Volumes**:
   - `./tmp:/opt/dagster/app/tmp` (temporary files)
   - `anomstack_metrics_duckdb:/data` (DuckDB data)
+  - `./dagster_home:/opt/dagster/dagster_home` (Dagster storage - includes SQLite)
 
-### 3. Dagster UI (`anomstack_dagit`)
-- **Image**: `andrewm4894/anomstack_dagster:latest`
-- **Purpose**: Web interface for pipeline management
+### 2. Dagster UI (`anomstack_webserver`)
+- **Image**: Built locally from `docker/Dockerfile.dagster`
+- **Purpose**: Web interface for pipeline management
 - **Port**: 3000 (external)
 - **Restart Policy**: `on-failure`
 - **Access**: http://localhost:3000
+- **Storage**: Uses SQLite files in mounted dagster_home volume
 
-### 4. Dagster Daemon (`anomstack_daemon`)
-- **Image**: `andrewm4894/anomstack_dagster:latest`
+### 3. Dagster Daemon (`anomstack_daemon`)
+- **Image**: Built locally from `docker/Dockerfile.dagster`
 - **Purpose**: Background process for scheduling and run execution
 - **Restart Policy**: `on-failure`
-- **Volumes**: Same as dagit (for Docker socket access)
+- **Run Launcher**: DefaultRunLauncher (runs jobs in same container process)
 
-### 5. Dashboard (`anomstack_dashboard`)
-- **Image**: `andrewm4894/anomstack_dashboard:latest`
+### 4. Dashboard (`anomstack_dashboard`)
+- **Image**: Built locally from `docker/Dockerfile.anomstack_dashboard`
 - **Purpose**: FastHTML-based metrics dashboard
 - **Port**: 5001 (external) → 8080 (internal)
 - **Restart Policy**: `on-failure`
@@ -76,8 +75,13 @@ make docker
 
 ### Bind Mounts
 - **`./tmp`**: Temporary files and local artifacts
-- **`/var/run/docker.sock`**: Docker socket for container management
-- **`/tmp/io_manager_storage`**: Dagster I/O manager storage
+- **`./dagster_home`**: Dagster configuration, SQLite storage, and logs
+- **`./metrics`**: Hot-reloadable metric configurations
+
+### Storage Simplification
+- **SQLite Storage**: All Dagster metadata (runs, schedules, events) stored in SQLite files
+- **No PostgreSQL**: Eliminated separate database container for simpler deployment
+- **No Docker Socket**: DefaultRunLauncher removes need for Docker socket access
 
 ## Environment Configuration
 
@@ -87,15 +91,8 @@ Create a `.env` file based on `.example.env`:
 ```bash
 # Database paths
 ANOMSTACK_DUCKDB_PATH=/data/anomstack.db
-ANOMSTACK_SQLITE_PATH=tmpdata/anomstack-sqlite.db
-
-# PostgreSQL
-ANOMSTACK_POSTGRES_USER=postgres_user
-ANOMSTACK_POSTGRES_PASSWORD=postgres_password
-ANOMSTACK_POSTGRES_DB=postgres_db
-ANOMSTACK_POSTGRES_FORWARD_PORT=5432
 
-# Dagster
+# Dagster (uses SQLite for metadata storage)
 DAGSTER_HOME=
 DAGSTER_LOG_LEVEL=DEBUG
 DAGSTER_CONCURRENCY=4
@@ -117,10 +114,7 @@ make docker-prune # Remove everything including volumes (⚠️ dele
 ### Image Management
 ```bash
 make docker-build # Build all images locally
-make docker-tag # Tag images for Docker Hub
-make docker-push # Push images to Docker Hub
-make docker-build-push # Build, tag, and push in one command
-make docker-pull # Pull latest images from Docker Hub
+make docker-smart # Build and start services (recommended)
 ```
 
 ### Debugging
@@ -300,17 +294,21 @@ make docker-clean
 - Set up log aggregation
 - Monitor volume usage
 
-## Docker Hub Images
+## Building Images
 
-All images are available on Docker Hub:
-- `andrewm4894/anomstack_code:latest`
-- `andrewm4894/anomstack_dagster:latest`
-- `andrewm4894/anomstack_dashboard:latest`
+All images are built locally from their respective Dockerfiles:
+- `anomstack_code_image` - Built from `docker/Dockerfile.anomstack_code`
+- `anomstack_dagster_image` - Built from `docker/Dockerfile.dagster`
+- `anomstack_dashboard_image` - Built from `docker/Dockerfile.anomstack_dashboard`
 
-### Building and Pushing
-Images are automatically tagged and pushed using the Makefile:
+### Building Images
+Images are built automatically when using Docker Compose:
 ```bash
-make docker-build-push
+# Build and start all services
+make docker-smart
+
+# Or build manually
+docker compose build --no-cache
 ```
 
-This ensures consistent deployments across environments.
+This ensures you're always running the latest code and dependencies.
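
Because this change drops the `ANOMSTACK_POSTGRES_*` entries from the `.env` template, an existing `.env` copied from an older `.example.env` may still carry them. The following is a minimal sketch for spotting such leftovers, assuming a Bash shell. `find_stale_postgres_vars` is a hypothetical helper name and is not part of the repository; it also matches the `DAGSTER_POSTGRES_*` names that the old `dagster_docker.yaml` passed to launched containers.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration; not part of this commit.
# Prints "<line number>:<assignment>" for every PostgreSQL-related
# variable still assigned in the given env file (default: .env).
find_stale_postgres_vars() {
  local env_file="${1:-.env}"
  # Match an optional "export", then ANOMSTACK_POSTGRES_* or DAGSTER_POSTGRES_*.
  # "|| true" keeps a no-match result from being reported as a failure.
  grep -nE '^[[:space:]]*(export[[:space:]]+)?(ANOMSTACK|DAGSTER)_POSTGRES_[A-Z0-9_]+=' "$env_file" || true
}

# Example usage:
#   find_stale_postgres_vars .env
```

An empty result suggests the file no longer references PostgreSQL. Any printed lines are candidates for removal, since nothing in the new configuration reads them.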

Makefile

Lines changed: 22 additions & 24 deletions
@@ -30,25 +30,20 @@ dev:
 # DOCKER OPERATIONS
 # =============================================================================
 
-.PHONY: docker docker-dev docker-smart docker-build docker-dev-build docker-tag docker-push docker-build-push
-.PHONY: docker-pull docker-clean docker-logs docker-logs-code docker-logs-dagit docker-logs-daemon docker-logs-dashboard
+.PHONY: docker docker-dev docker-smart docker-build docker-dev-build docker-clean
+.PHONY: docker-logs docker-logs-code docker-logs-dagit docker-logs-daemon docker-logs-dashboard
 .PHONY: docker-shell-code docker-shell-dagit docker-shell-dashboard docker-restart-dashboard docker-restart-code docker-restart reload-config enable-auto-reload enable-config-watcher
 .PHONY: docker-stop docker-down docker-rm docker-prune
 
 # start docker containers (now uses pre-built images)
 docker:
 	docker compose up -d
 
-# smart docker start: try to pull, fallback to build if images don't exist
+# smart docker start: build images locally and start containers
 docker-smart:
-	@echo "🔄 Attempting to pull pre-built images..."
-	@if docker compose pull 2>/dev/null; then \
-		echo "✅ Successfully pulled images, starting containers..."; \
-		docker compose up -d; \
-	else \
-		echo "⚠️ Pull failed, building images locally..."; \
-		make docker-dev-build && make docker-dev; \
-	fi
+	@echo "🔄 Building images locally and starting containers..."
+	docker compose build --no-cache
+	docker compose up -d
 
 # start docker containers with local development images
 docker-dev:
@@ -64,26 +59,29 @@ docker-build:
 docker-dev-build:
 	docker compose -f docker-compose.yaml -f docker-compose.dev.yaml build --no-cache
 
+# The following commands are commented out as we no longer publish to Docker Hub
+# Users should build images locally instead
+
 # tag docker images for Docker Hub
-docker-tag:
-	docker tag anomstack_code_image andrewm4894/anomstack_code:latest
-	docker tag anomstack_dagster_image andrewm4894/anomstack_dagster:latest
-	docker tag anomstack_dashboard_image andrewm4894/anomstack_dashboard:latest
+# docker-tag:
+# docker tag anomstack_code_image andrewm4894/anomstack_code:latest
+# docker tag anomstack_dagster_image andrewm4894/anomstack_dagster:latest
+# docker tag anomstack_dashboard_image andrewm4894/anomstack_dashboard:latest
 
 # push docker images to Docker Hub
-docker-push:
-	docker push andrewm4894/anomstack_code:latest
-	docker push andrewm4894/anomstack_dagster:latest
-	docker push andrewm4894/anomstack_dashboard:latest
+# docker-push:
+# docker push andrewm4894/anomstack_code:latest
+# docker push andrewm4894/anomstack_dagster:latest
+# docker push andrewm4894/anomstack_dashboard:latest
 
 # build, tag, and push all images in one command
-docker-build-push: docker-build docker-tag docker-push
+# docker-build-push: docker-build docker-tag docker-push
 
 # pull latest images from Docker Hub
-docker-pull:
-	docker pull andrewm4894/anomstack_code:latest
-	docker pull andrewm4894/anomstack_dagster:latest
-	docker pull andrewm4894/anomstack_dashboard:latest
+# docker-pull:
+# docker pull andrewm4894/anomstack_code:latest
+# docker pull andrewm4894/anomstack_dagster:latest
+# docker pull andrewm4894/anomstack_dashboard:latest
 
 # clean up unused docker resources
 docker-clean:
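
The new `docker-smart` recipe reduces to two Compose commands run in sequence: an uncached build followed by a detached start. The sketch below mirrors that sequence as a shell function so the order can be inspected without touching Docker. `docker_smart`, `run`, and the `DRY_RUN` switch are hypothetical names introduced here for illustration and are not part of the Makefile.

```bash
#!/usr/bin/env bash
# Hypothetical illustration of the docker-smart recipe; not part of this commit.

# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Same two steps as the Makefile target: rebuild every image from scratch,
# then start the containers in the background.
docker_smart() {
  run docker compose build --no-cache &&
    run docker compose up -d
}

# Example usage (prints the commands without running them):
#   DRY_RUN=1 docker_smart
```

Because `--no-cache` rebuilds every layer on each invocation, this favors fresh images over startup time. `docker compose up -d --build` would reuse cached layers instead; whether that trade-off suits this project is a judgment call the commit does not address.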

dagster_docker.yaml

Lines changed: 29 additions & 68 deletions
@@ -6,58 +6,30 @@ run_coordinator:
   module: dagster.core.run_coordinator
   class: QueuedRunCoordinator
   config:
-    max_concurrent_runs: 10 # Reduced from 25 to prevent excessive storage
+    max_concurrent_runs: 8 # Reduced for better stability in single container
     tag_concurrency_limits:
       - key: "dagster/concurrency_key"
         value: "database"
-        limit: 2 # Reduced from 3
+        limit: 1 # Reduced to prevent resource contention
       - key: "dagster/concurrency_key"
         value: "ml_training"
-        limit: 1 # Reduced from 2
+        limit: 1
 
+# Use DefaultRunLauncher instead of DockerRunLauncher for simpler Docker setup
 run_launcher:
-  module: dagster_docker
-  class: DockerRunLauncher
-  config:
-    env_vars:
-      - DAGSTER_POSTGRES_USER
-      - DAGSTER_POSTGRES_PASSWORD
-      - DAGSTER_POSTGRES_DB
-      - ANOMSTACK_DUCKDB_PATH
-      - ANOMSTACK_TABLE_KEY
-      - ANOMSTACK_MODEL_PATH
-      - ANOMSTACK_IGNORE_EXAMPLES
-      - DAGSTER_HOME
-      - ANOMSTACK_HOME
-    network: anomstack_network
-    container_kwargs:
-      volumes: # Make docker client accessible to any launched containers as well
-        - /var/run/docker.sock:/var/run/docker.sock
-        - /tmp/io_manager_storage:/tmp/io_manager_storage
-        - ${ANOMSTACK_HOME}/tmp:/opt/dagster/app/tmp
-        - ${ANOMSTACK_HOME}/dagster_home:/opt/dagster/dagster_home
-        - ${ANOMSTACK_HOME}/metrics:/opt/dagster/app/metrics
-        - anomstack_metrics_duckdb:/data
+  module: dagster.core.launcher.default_run_launcher
+  class: DefaultRunLauncher
 
-run_storage:
-  module: dagster_postgres.run_storage
-  class: PostgresRunStorage
-  config:
-    postgres_db:
-      hostname: anomstack_postgresql
-      username:
-        env: DAGSTER_POSTGRES_USER
-      password:
-        env: DAGSTER_POSTGRES_PASSWORD
-      db_name:
-        env: DAGSTER_POSTGRES_DB
-      port: 5432
+# Use SQLite storage instead of PostgreSQL for simpler deployment
+storage:
+  sqlite:
+    base_dir: "/opt/dagster/dagster_home/storage"
 
 run_retries:
   enabled: true
-  max_retries: 1
+  max_retries: 2 # Increased for better reliability
 
-# Aggressive retention policies to prevent disk space issues
+# Retention policies to prevent disk space issues
 retention:
   schedule:
     purge_after_days: 3 # Keep schedule runs for 3 days
@@ -73,45 +45,34 @@ run_monitoring:
   start_timeout_seconds: 300 # 5 minutes to start
   cancel_timeout_seconds: 180 # 3 minutes to cancel
   max_runtime_seconds: 900 # 15 minutes max runtime per run
-  max_resume_run_attempts: 2 # Resume runs after worker crashes (DockerRunLauncher only)
   poll_interval_seconds: 60 # Check every minute
 
 # Disable telemetry to reduce disk writes
 telemetry:
   enabled: false
 
+# Optimized for container environment
 schedules:
   use_threads: true
-  num_workers: 8
+  num_workers: 4 # Reasonable for Docker container
 
 sensors:
-  use_threads: true
-  num_workers: 4
+  use_threads: true
+  num_workers: 2 # Reasonable for Docker container
 
-schedule_storage:
-  module: dagster_postgres.schedule_storage
-  class: PostgresScheduleStorage
+compute_logs:
+  module: dagster.core.storage.local_compute_log_manager
+  class: LocalComputeLogManager
   config:
-    postgres_db:
-      hostname: anomstack_postgresql
-      username:
-        env: DAGSTER_POSTGRES_USER
-      password:
-        env: DAGSTER_POSTGRES_PASSWORD
-      db_name:
-        env: DAGSTER_POSTGRES_DB
-      port: 5432
+    base_dir: "/opt/dagster/dagster_home/compute_logs"
 
-event_log_storage:
-  module: dagster_postgres.event_log
-  class: PostgresEventLogStorage
+# Local artifact storage
+local_artifact_storage:
+  module: dagster.core.storage.root
+  class: LocalArtifactStorage
   config:
-    postgres_db:
-      hostname: anomstack_postgresql
-      username:
-        env: DAGSTER_POSTGRES_USER
-      password:
-        env: DAGSTER_POSTGRES_PASSWORD
-      db_name:
-        env: DAGSTER_POSTGRES_DB
-      port: 5432
+    base_dir: "/opt/dagster/dagster_home/artifacts"
+
+# Enhanced logging for debugging
+code_servers:
+  reload_timeout: 60 # Give code servers more time to reload
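
With `storage.sqlite.base_dir` set to `/opt/dagster/dagster_home/storage` and `./dagster_home` bind-mounted to `/opt/dagster/dagster_home`, the SQLite files should appear on the host under `./dagster_home/storage` once the services have run. The sketch below is one way to confirm that from the host. `list_sqlite_files` is a hypothetical helper, and the `*.db` extension is an assumption about how Dagster names its SQLite files rather than something this diff states.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration; not part of this commit.
# Lists SQLite database files under the Dagster storage directory on the host.
# Assumption: Dagster's SQLite files use a ".db" extension.
list_sqlite_files() {
  local storage_dir="${1:-./dagster_home/storage}"
  if [ ! -d "$storage_dir" ]; then
    echo "storage directory not found: $storage_dir" >&2
    return 1
  fi
  # Sorted output keeps the listing stable between runs.
  find "$storage_dir" -type f -name '*.db' | sort
}

# Example usage:
#   list_sqlite_files ./dagster_home/storage
```

An empty listing after the daemon and webserver have started would suggest the bind mount or `base_dir` is not lining up as expected.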
