ARCHITECTURE.md: 75 additions & 14 deletions
@@ -13,6 +13,7 @@ Anomstack is a distributed anomaly detection platform built on modern data stack
 - **Flexibility**: Support for multiple data sources, ML algorithms, and deployment patterns
 - **Observability**: Built-in monitoring, logging, and visualization
 - **Simplicity**: Minimal configuration required to get started
+- **Reliability**: Direct Python module loading eliminates gRPC network overhead and improves system stability

 ## High-Level Architecture

@@ -379,29 +380,34 @@ graph LR

 ### Docker Containerized

+Simplified 3-container architecture with direct Python module loading (no gRPC server):
+
 ```mermaid
 graph TB
     subgraph "Docker Environment"
-        subgraph "Dagster Container"
-            DG[Dagster]
+        subgraph "Consolidated Container"
+            DGW[Dagster Webserver + User Code<br/>(Direct Python Module Loading)]
         end

-        subgraph "Dashboard Container"
-            DASH[FastHTML Dashboard]
+        subgraph "Daemon Container"
+            DGD[Dagster Daemon]
         end

-        subgraph "Database Container"
-            PG[(PostgreSQL)]
+        subgraph "Dashboard Container"
+            DASH[FastHTML Dashboard]
         end

         subgraph "Storage"
-            VOL[Docker Volumes]
+            SQLITE[(SQLite Files)]
+            DUCKDB[(DuckDB Volume)]
         end
     end

-    DG --> PG
-    DG --> VOL
-    DASH --> DG
+    DGW --> SQLITE
+    DGW --> DUCKDB
+    DGD --> SQLITE
+    DGD --> DUCKDB
+    DASH --> DUCKDB
 ```

 ### Cloud Native (Dagster Cloud)
@@ -434,6 +440,61 @@ graph TB
     JOBS --> EMAIL
 ```

+### Advanced: gRPC Code Server (Optional)
+
+By default, Anomstack uses direct Python module loading for simplicity and reliability. However, users can optionally configure separate gRPC code servers for advanced deployment scenarios such as:
+
+- **Multi-language environments**: When integrating with non-Python services
+- **Container isolation**: When user code needs strict separation from Dagster services
+- **High-security environments**: When code execution requires additional sandboxing
+- **Distributed deployments**: When code servers need to run on separate machines
+
+#### Enabling gRPC Setup
+
+To enable gRPC code servers, modify your `workspace.yaml`:
+
+```yaml
+load_from:
+  - grpc_server:
+      host: localhost
+      port: 4000
+      location_name: "anomstack_code"
+```
+
+And update your deployment to include a separate code server:
| Code Loading | Direct Python Modules | User code integration (no gRPC server needed) |

 This architecture enables Anomstack to provide a robust, scalable, and maintainable anomaly detection platform that can adapt to various deployment scenarios and organizational requirements.
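
For comparison with the optional gRPC setup, the default direct Python module loading named in the `Code Loading` row would typically be declared in `workspace.yaml` with a `python_module` entry. The fragment below is a minimal sketch under stated assumptions, not the project's actual file: the module name `anomstack.main` is hypothetical, and `location_name` simply reuses the name from the gRPC example.

```yaml
# Sketch only: the module path is an assumption, not taken from this diff.
load_from:
  - python_module:
      module_name: anomstack.main        # hypothetical module exposing the Dagster definitions
      location_name: "anomstack_code"    # same location name as the gRPC example
```

With an entry like this, the webserver and daemon import the user code in-process, so no separate code server host or port needs to be reachable.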
**ALWAYS check the Makefile first** when working on Anomstack! The project includes comprehensive Make commands for all common development tasks. Before running manual commands, review the available Makefile commands.
+
 ### Code Style
 - Uses ruff for linting (line length: 100)
-- Star imports allowed in dashboard modules
+- Star imports allowed in dashboard modules (F403/F405 rules ignored)
 - Pre-commit hooks enforce code quality
+- Per-file lint ignores configured for dashboard routes and maintenance scripts

 ### Testing
 - Tests in `tests/` directory
-- Use pytest for running tests
+- Use `pytest` or `make tests` for running tests
+  - `make test-examples` - Run only example ingest function tests
DOCKER.md: 25 additions & 37 deletions
@@ -4,20 +4,20 @@ This document provides comprehensive information about the Docker setup for the

 ## Overview

-Anomstack uses a **simplified Docker architecture** with local image building and SQLite storage for easy deployment and development. The setup includes:
+Anomstack uses a **simplified 3-container Docker architecture** with direct Python module loading and SQLite storage for improved reliability and easy deployment. The setup includes:

-- **Dagster Code Server**: Runs your data pipelines
-- **Dagster UI**: Web interface for pipeline management
+- **Dagster Webserver + User Code** (consolidated): Web interface and data pipelines in a single container using direct Python module loading
 - **Dagster Daemon**: Background process for scheduling and execution (uses DefaultRunLauncher)
 - **Dashboard**: FastHTML-based metrics dashboard
 - **SQLite Storage**: All Dagster metadata stored in SQLite files (no separate database)
 - **DuckDB Volume**: Persistent storage for metrics data

 ### Key Simplifications
+- ✅ **No gRPC Code Server**: User code loaded directly as Python module (eliminates network overhead and reliability issues)
 - ✅ **No PostgreSQL**: Uses SQLite for simpler deployment
 - ✅ **No Docker Socket**: DefaultRunLauncher runs jobs in same container
 - ✅ **Local Builds**: All images built locally (no Docker Hub dependency)
-- ✅ **Fewer Resources**: Reduced memory and complexity
+- ✅ **Fewer Resources**: Reduced memory and complexity with consolidated architecture

 ## Quick Start

@@ -26,38 +26,31 @@ Anomstack uses a **simplified Docker architecture** with local image building an
 make docker

 # Access services
-# - Dagster UI: http://localhost:3000
+# - Dagster UI: http://localhost:3000 (with embedded user code)
 # - Dashboard: http://localhost:5001
-# - PostgreSQL: localhost:5432 (if port forwarding enabled)
 ```

 ## Services

-### 1. Code Server (`anomstack_code`)
-- **Image**: Built locally from `docker/Dockerfile.anomstack_code`
-- **Purpose**: Runs Dagster user code via gRPC
-- **Port**: 4000 (internal)
-- **Restart Policy**: `always`
-- **Volumes**:
-  - `./tmp:/opt/dagster/app/tmp` (temporary files)
-  - `anomstack_metrics_duckdb:/data` (DuckDB data)
-  - `./dagster_home:/opt/dagster/dagster_home` (Dagster storage - includes SQLite)
-
-### 2. Dagster UI (`anomstack_webserver`)
+### 1. Dagster Webserver + User Code (`anomstack_webserver`)
 - **Image**: Built locally from `docker/Dockerfile.dagster`
-- **Purpose**: Web interface for pipeline management
+- **Purpose**: Web interface and user code execution using direct Python module loading (no gRPC server needed)
 - **Port**: 3000 (external)
 - **Restart Policy**: `on-failure`
 - **Access**: http://localhost:3000
 - **Storage**: Uses SQLite files in mounted dagster_home volume
+- **Volumes**:
+  - `./tmp:/opt/dagster/app/tmp` (temporary files)
+  - `anomstack_metrics_duckdb:/data` (DuckDB data)
+  - `./dagster_home:/opt/dagster/dagster_home` (Dagster storage - includes SQLite)

-### 3. Dagster Daemon (`anomstack_daemon`)
+### 2. Dagster Daemon (`anomstack_daemon`)
 - **Image**: Built locally from `docker/Dockerfile.dagster`
 - **Purpose**: Background process for scheduling and run execution
 - **Restart Policy**: `on-failure`
 - **Run Launcher**: DefaultRunLauncher (runs jobs in same container process)

-### 4. Dashboard (`anomstack_dashboard`)
+### 3. Dashboard (`anomstack_dashboard`)
 - **Image**: Built locally from `docker/Dockerfile.anomstack_dashboard`
 - **Purpose**: FastHTML-based metrics dashboard
 - **Port**: 5001 (external) → 8080 (internal)
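
The three services described in this section could be wired together in a Compose file roughly as follows. This is a hedged sketch assembled from the service names, Dockerfiles, ports, restart policies, volume mounts, and network name given in this document; it is not the project's actual compose file. The daemon and dashboard volume mounts are assumptions inferred from the architecture diagram, and `command` entries are omitted on the assumption that the Dockerfiles or the real compose file define them.

```yaml
# Sketch only: layout inferred from DOCKER.md, not copied from the repository.
services:
  anomstack_webserver:
    build:
      context: .
      dockerfile: docker/Dockerfile.dagster
    image: anomstack_dagster_image
    restart: on-failure
    ports:
      - "3000:3000"                                  # Dagster UI
    volumes:
      - ./tmp:/opt/dagster/app/tmp                   # temporary files
      - anomstack_metrics_duckdb:/data               # DuckDB data
      - ./dagster_home:/opt/dagster/dagster_home     # Dagster storage (SQLite)
    networks:
      - anomstack_network

  anomstack_daemon:
    build:
      context: .
      dockerfile: docker/Dockerfile.dagster
    image: anomstack_dagster_image
    restart: on-failure
    volumes:                                         # assumption: same mounts as the webserver
      - ./tmp:/opt/dagster/app/tmp
      - anomstack_metrics_duckdb:/data
      - ./dagster_home:/opt/dagster/dagster_home
    networks:
      - anomstack_network

  anomstack_dashboard:
    build:
      context: .
      dockerfile: docker/Dockerfile.anomstack_dashboard
    image: anomstack_dashboard_image
    ports:
      - "5001:8080"                                  # external 5001 -> internal 8080
    volumes:
      - anomstack_metrics_duckdb:/data               # assumption: dashboard reads DuckDB here
    networks:
      - anomstack_network

volumes:
  anomstack_metrics_duckdb:

networks:
  anomstack_network:
    driver: bridge
```

Because the webserver and daemon both use SQLite files under `dagster_home`, sharing that mount between them is what lets the daemon see runs and schedules without a separate database container.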
@@ -120,23 +113,22 @@ make docker-smart # Build and start services (recommended)
 ### Debugging
 ```bash
 make docker-logs # View all container logs
-make docker-logs-code # View code server logs
-make docker-logs-dagit # View Dagster UI logs
+make docker-logs-webserver # View consolidated webserver logs (with embedded user code)
 make docker-logs-daemon # View daemon logs
 make docker-logs-dashboard # View dashboard logs
 ```

 ### Shell Access
 ```bash
-make docker-shell-code # Shell into code server
-make docker-shell-dagit # Shell into Dagster UI container
+make docker-shell-webserver # Shell into consolidated webserver container
+make docker-shell-daemon # Shell into daemon container
 make docker-shell-dashboard # Shell into dashboard container
 ```

 ### Service Management
 ```bash
 make docker-restart-dashboard # Restart dashboard only
-make docker-restart-code # Restart code server only
+make docker-restart-webserver # Restart webserver (with embedded user code) only
 ```

 ### Cleanup
@@ -197,9 +189,6 @@ docker run --rm -v anomstack_metrics_duckdb:/data -v $(pwd):/backup alpine tar c
 docker run --rm -v anomstack_metrics_duckdb:/data -v $(pwd):/backup alpine tar xzf /backup/duckdb-backup.tar.gz -C /
 ```

-### PostgreSQL Database
-- **Backup**: Use standard PostgreSQL backup tools
-- **Data**: Stored in Docker volume (not explicitly named)

 ## Networking

@@ -208,9 +197,8 @@ All services run on the `anomstack_network` bridge network:
 - **External access**: Only specific ports are exposed to host

 ### Port Mapping
-- **3000**: Dagster UI
+- **3000**: Dagster UI (with embedded user code)
 - **5001**: Dashboard
-- **5432**: PostgreSQL (if enabled)

 ## Troubleshooting

@@ -268,14 +256,15 @@ make docker-clean
 - Check for port conflicts (5001)

 #### Dagster UI Not Accessible
-- Verify dagit container is running
-- Check if code server is accessible
-- Review workspace.yaml configuration
+- Verify webserver container is running (now includes embedded user code)
+- Review workspace.yaml configuration for direct Python module loading
+- Check if user code is properly embedded in consolidated container

-#### Code Server Issues
-- Check if all dependencies are installed
+#### User Code Issues
+- Check if all dependencies are installed in consolidated container
 - Verify environment variables are set
 - Check DuckDB volume mount
+- Ensure Python module loading is working (no gRPC server needed)

 ## Production Considerations

@@ -297,8 +286,7 @@ make docker-clean
 ## Building Images

 All images are built locally from their respective Dockerfiles:
-- `anomstack_code_image` - Built from `docker/Dockerfile.anomstack_code`
-- `anomstack_dagster_image` - Built from `docker/Dockerfile.dagster`
+- `anomstack_dagster_image` - Built from `docker/Dockerfile.dagster` (includes both webserver and user code)
 - `anomstack_dashboard_image` - Built from `docker/Dockerfile.anomstack_dashboard`