Commit ab304cf

Merge pull request #200 from andrewm4894/dont-use-grpc-by-default
Dont use grpc by default
2 parents f238ab3 + 019d014 commit ab304cf

25 files changed

+503
-270
lines changed

ARCHITECTURE.md

Lines changed: 75 additions & 14 deletions
@@ -13,6 +13,7 @@ Anomstack is a distributed anomaly detection platform built on modern data stack
1313
- **Flexibility**: Support for multiple data sources, ML algorithms, and deployment patterns
1414
- **Observability**: Built-in monitoring, logging, and visualization
1515
- **Simplicity**: Minimal configuration required to get started
16+
- **Reliability**: Direct Python module loading eliminates gRPC network overhead and improves system stability
1617

1718
## High-Level Architecture
1819

@@ -379,29 +380,34 @@ graph LR
379380

380381
### Docker Containerized
381382

383+
Simplified 3-container architecture with direct Python module loading (no gRPC server):
384+
382385
```mermaid
383386
graph TB
384387
subgraph "Docker Environment"
385-
subgraph "Dagster Container"
386-
DG[Dagster]
388+
subgraph "Consolidated Container"
389+
DGW["Dagster Webserver + User Code<br/>(Direct Python Module Loading)"]
387390
end
388391
389-
subgraph "Dashboard Container"
390-
DASH[FastHTML Dashboard]
392+
subgraph "Daemon Container"
393+
DGD[Dagster Daemon]
391394
end
392395
393-
subgraph "Database Container"
394-
PG[(PostgreSQL)]
396+
subgraph "Dashboard Container"
397+
DASH[FastHTML Dashboard]
395398
end
396399
397400
subgraph "Storage"
398-
VOL[Docker Volumes]
401+
SQLITE[(SQLite Files)]
402+
DUCKDB[(DuckDB Volume)]
399403
end
400404
end
401405
402-
DG --> PG
403-
DG --> VOL
404-
DASH --> DG
406+
DGW --> SQLITE
407+
DGW --> DUCKDB
408+
DGD --> SQLITE
409+
DGD --> DUCKDB
410+
DASH --> DUCKDB
405411
```
406412

407413
### Cloud Native (Dagster Cloud)
@@ -434,6 +440,61 @@ graph TB
434440
JOBS --> EMAIL
435441
```
436442

443+
### Advanced: gRPC Code Server (Optional)
444+
445+
By default, Anomstack loads user code directly as a Python module for simplicity and reliability. Separate gRPC code servers remain available as an opt-in for advanced deployment scenarios such as:
446+
447+
- **Independent Python environments**: When user code needs a different Python version or dependency set than the Dagster services
448+
- **Container isolation**: When user code needs strict separation from Dagster services
449+
- **High-security environments**: When code execution requires additional sandboxing
450+
- **Distributed deployments**: When code servers need to run on separate machines
451+
452+
#### Enabling gRPC Setup
453+
454+
To enable gRPC code servers, modify your `workspace.yaml`:
455+
456+
```yaml
457+
load_from:
458+
  - grpc_server:
459+
      host: localhost
460+
      port: 4000
461+
      location_name: "anomstack_code"
462+
```
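
Before pointing `workspace.yaml` at a gRPC location, it can help to confirm that something is actually listening on the configured host and port. The following is a minimal sketch using only the Python standard library. The function name `code_server_reachable` is hypothetical, the defaults mirror the `localhost:4000` example above, and a successful TCP connect only shows that the port is open, not that a healthy Dagster code server is behind it.

```python
import socket


def code_server_reachable(host: str = "localhost", port: int = 4000,
                          timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `code_server_reachable("localhost", 4000)` from the machine or container that runs the Dagster webserver checks the same address the webserver would use.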
463+
464+
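Then add a separate code server to your deployment. When the webserver runs in its own container, as in the example below, set `host` in `workspace.yaml` to the Compose service name (`code_server`) rather than `localhost`: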
465+
466+
```yaml
467+
# docker-compose.yaml example
468+
services:
469+
  code_server:
470+
    image: anomstack_code:latest
471+
    command: dagster code-server start -h 0.0.0.0 -p 4000 -f anomstack/main.py
472+
    ports:
473+
      - "4000:4000"
474+
475+
  dagster_webserver:
476+
    image: anomstack_dagster:latest
477+
    depends_on:
478+
      - code_server
479+
    ports:
480+
      - "3000:3000"
481+
```
482+
483+
#### Trade-offs
484+
485+
**Benefits of gRPC approach:**
486+
- Stronger isolation between user code and Dagster services
487+
- Separate Python environment and dependencies for user code
488+
- Better resource allocation control
489+
490+
**Benefits of direct module loading (default):**
491+
- Simpler deployment and configuration
492+
- Eliminates network overhead and potential connectivity issues
493+
- Faster startup times and more reliable execution
494+
- Easier debugging and development workflow
495+
496+
For most use cases, the default direct Python module loading is recommended.
497+
437498
## Security Architecture
438499
439500
### Authentication & Authorization
@@ -553,14 +614,14 @@ Anomstack provides APIs for external integration:
553614

554615
| Layer | Technology | Purpose |
555616
|-------|------------|---------|
556-
| Orchestration | Dagster | Pipeline orchestration and scheduling |
617+
| Orchestration | Dagster | Pipeline orchestration and scheduling (consolidated architecture) |
557618
| Web Framework | FastHTML | Dashboard and API development |
558619
| ML Library | PyOD | Anomaly detection algorithms |
559620
| Data Processing | Pandas, NumPy | Data manipulation and analysis |
560621
| Visualization | Plotly, Matplotlib | Chart generation and visualization |
561-
| Database | PostgreSQL, SQLite | Metadata and configuration storage |
622+
| Database | SQLite, DuckDB | Metadata and metrics storage (simplified deployment) |
562623
| Cloud Storage | S3, GCS | Model and data storage |
563-
| Containerization | Docker | Application packaging and deployment |
564-
| Monitoring | Prometheus, Grafana | System monitoring and alerting |
624+
| Containerization | Docker | Application packaging and deployment (3-container architecture) |
625+
| Code Loading | Direct Python Modules | User code integration (no gRPC server needed) |
565626

566627
This architecture enables Anomstack to provide a robust, scalable, and maintainable anomaly detection platform that can adapt to various deployment scenarios and organizational requirements.

CLAUDE.md

Lines changed: 24 additions & 3 deletions
@@ -13,13 +13,18 @@ Anomstack is an open-source anomaly detection system built on Dagster and FastHT
1313
- `make dashboard` - Start FastHTML dashboard locally (port 5003)
1414
- `make dashboard-uvicorn` - Start dashboard with uvicorn (hot reload)
1515
- `make dashboard-local-dev` - Start dashboard with seeded test data
16+
- `make dev` - Setup local development environment and install dependencies
1617

1718
### Docker Operations
1819
- `make docker` - Start all services with Docker Compose
19-
- `make docker-dev` - Start with local development images
20+
- `make docker-dev` - Start with local development images
2021
- `make docker-smart` - Build fresh images and start containers
22+
- `make docker-build` - Build all images locally
2123
- `make docker-logs` - View logs for all containers
24+
- `make docker-logs-<service>` - View logs for specific service (code, dagit, dashboard, daemon)
25+
- `make docker-shell-<service>` - Get shell access to running containers
2226
- `make docker-restart` - Restart all containers (useful for .env changes)
27+
- `make docker-stop` - Stop all containers
2328

2429
### Testing & Quality
2530
- `pytest` or `make tests` - Run test suite
@@ -31,6 +36,17 @@ Anomstack is an open-source anomaly detection system built on Dagster and FastHT
3136
- `make seed-local-db-all` - Seed with all example metric batches
3237
- `make seed-local-db-custom BATCHES='batch1,batch2' DB_PATH='path/to/db'`
3338

39+
### Configuration & Hot Reload
40+
- `make reload-config` - Reload configuration without restarting containers
41+
- `make enable-auto-reload` - Enable automatic config reloading
42+
- `make enable-config-watcher` - Enable smart config file watcher
43+
44+
### Reset & Cleanup Operations
45+
- `make reset-interactive` - Interactive reset with guided options
46+
- `make reset-gentle` - Rebuild containers (safest reset)
47+
- `make reset-nuclear` - Remove everything including data
48+
- `make dagster-cleanup-standard` - Clean up old Dagster runs
49+
3450
## Architecture
3551

3652
### Core Components
@@ -99,14 +115,19 @@ ANOMSTACK__PYTHON_INGEST_SIMPLE__ALERT_METHODS=email
99115

100116
## Development Notes
101117

118+
### Important Development Rule
119+
**ALWAYS check the Makefile first** when working on Anomstack! The project includes comprehensive Make commands for all common development tasks. Before running manual commands, review the available Makefile commands.
120+
102121
### Code Style
103122
- Uses ruff for linting (line length: 100)
104-
- Star imports allowed in dashboard modules
123+
- Star imports allowed in dashboard modules (F403/F405 rules ignored)
105124
- Pre-commit hooks enforce code quality
125+
- Per-file lint ignores configured for dashboard routes and maintenance scripts
106126

107127
### Testing
108128
- Tests in `tests/` directory
109-
- Use pytest for running tests
129+
- Use `pytest` or `make tests` for running tests
130+
- `make test-examples` - Run only example ingest function tests
110131
- Test coverage tracking with badges in README
111132

112133
### Deployment Options

DOCKER.md

Lines changed: 25 additions & 37 deletions
@@ -4,20 +4,20 @@ This document provides comprehensive information about the Docker setup for the
44

55
## Overview
66

7-
Anomstack uses a **simplified Docker architecture** with local image building and SQLite storage for easy deployment and development. The setup includes:
7+
Anomstack uses a **simplified 3-container Docker architecture** with direct Python module loading and SQLite storage for improved reliability and easy deployment. The setup includes:
88

9-
- **Dagster Code Server**: Runs your data pipelines
10-
- **Dagster UI**: Web interface for pipeline management
9+
- **Dagster Webserver + User Code** (consolidated): Web interface and data pipelines in a single container using direct Python module loading
1110
- **Dagster Daemon**: Background process for scheduling and execution (uses DefaultRunLauncher)
1211
- **Dashboard**: FastHTML-based metrics dashboard
1312
- **SQLite Storage**: All Dagster metadata stored in SQLite files (no separate database)
1413
- **DuckDB Volume**: Persistent storage for metrics data
1514

1615
### Key Simplifications
16+
- **No gRPC Code Server**: User code loaded directly as Python module (eliminates network overhead and reliability issues)
1717
- **No PostgreSQL**: Uses SQLite for simpler deployment
1818
- **No Docker Socket**: DefaultRunLauncher runs jobs in same container
1919
- **Local Builds**: All images built locally (no Docker Hub dependency)
20-
- **Fewer Resources**: Reduced memory and complexity
20+
- **Fewer Resources**: Reduced memory and complexity with consolidated architecture
2121

2222
## Quick Start
2323

@@ -26,38 +26,31 @@ Anomstack uses a **simplified Docker architecture** with local image building an
2626
make docker
2727

2828
# Access services
29-
# - Dagster UI: http://localhost:3000
29+
# - Dagster UI: http://localhost:3000 (with embedded user code)
3030
# - Dashboard: http://localhost:5001
31-
# - PostgreSQL: localhost:5432 (if port forwarding enabled)
3231
```
3332

3433
## Services
3534

36-
### 1. Code Server (`anomstack_code`)
37-
- **Image**: Built locally from `docker/Dockerfile.anomstack_code`
38-
- **Purpose**: Runs Dagster user code via gRPC
39-
- **Port**: 4000 (internal)
40-
- **Restart Policy**: `always`
41-
- **Volumes**:
42-
- `./tmp:/opt/dagster/app/tmp` (temporary files)
43-
- `anomstack_metrics_duckdb:/data` (DuckDB data)
44-
- `./dagster_home:/opt/dagster/dagster_home` (Dagster storage - includes SQLite)
45-
46-
### 2. Dagster UI (`anomstack_webserver`)
35+
### 1. Dagster Webserver + User Code (`anomstack_webserver`)
4736
- **Image**: Built locally from `docker/Dockerfile.dagster`
48-
- **Purpose**: Web interface for pipeline management
37+
- **Purpose**: Web interface and user code execution using direct Python module loading (no gRPC server needed)
4938
- **Port**: 3000 (external)
5039
- **Restart Policy**: `on-failure`
5140
- **Access**: http://localhost:3000
5241
- **Storage**: Uses SQLite files in mounted dagster_home volume
42+
- **Volumes**:
43+
- `./tmp:/opt/dagster/app/tmp` (temporary files)
44+
- `anomstack_metrics_duckdb:/data` (DuckDB data)
45+
- `./dagster_home:/opt/dagster/dagster_home` (Dagster storage - includes SQLite)
5346

54-
### 3. Dagster Daemon (`anomstack_daemon`)
47+
### 2. Dagster Daemon (`anomstack_daemon`)
5548
- **Image**: Built locally from `docker/Dockerfile.dagster`
5649
- **Purpose**: Background process for scheduling and run execution
5750
- **Restart Policy**: `on-failure`
5851
- **Run Launcher**: DefaultRunLauncher (runs jobs in same container process)
5952

60-
### 4. Dashboard (`anomstack_dashboard`)
53+
### 3. Dashboard (`anomstack_dashboard`)
6154
- **Image**: Built locally from `docker/Dockerfile.anomstack_dashboard`
6255
- **Purpose**: FastHTML-based metrics dashboard
6356
- **Port**: 5001 (external) → 8080 (internal)
@@ -120,23 +113,22 @@ make docker-smart # Build and start services (recommended)
120113
### Debugging
121114
```bash
122115
make docker-logs # View all container logs
123-
make docker-logs-code # View code server logs
124-
make docker-logs-dagit # View Dagster UI logs
116+
make docker-logs-webserver # View consolidated webserver logs (with embedded user code)
125117
make docker-logs-daemon # View daemon logs
126118
make docker-logs-dashboard # View dashboard logs
127119
```
128120

129121
### Shell Access
130122
```bash
131-
make docker-shell-code # Shell into code server
132-
make docker-shell-dagit # Shell into Dagster UI container
123+
make docker-shell-webserver # Shell into consolidated webserver container
124+
make docker-shell-daemon # Shell into daemon container
133125
make docker-shell-dashboard # Shell into dashboard container
134126
```
135127

136128
### Service Management
137129
```bash
138130
make docker-restart-dashboard # Restart dashboard only
139-
make docker-restart-code # Restart code server only
131+
make docker-restart-webserver # Restart webserver (with embedded user code) only
140132
```
141133

142134
### Cleanup
@@ -197,9 +189,6 @@ docker run --rm -v anomstack_metrics_duckdb:/data -v $(pwd):/backup alpine tar c
197189
docker run --rm -v anomstack_metrics_duckdb:/data -v $(pwd):/backup alpine tar xzf /backup/duckdb-backup.tar.gz -C /
198190
```
199191

200-
### PostgreSQL Database
201-
- **Backup**: Use standard PostgreSQL backup tools
202-
- **Data**: Stored in Docker volume (not explicitly named)
203192

204193
## Networking
205194

@@ -208,9 +197,8 @@ All services run on the `anomstack_network` bridge network:
208197
- **External access**: Only specific ports are exposed to host
209198

210199
### Port Mapping
211-
- **3000**: Dagster UI
200+
- **3000**: Dagster UI (with embedded user code)
212201
- **5001**: Dashboard
213-
- **5432**: PostgreSQL (if enabled)
214202

215203
## Troubleshooting
216204

@@ -268,14 +256,15 @@ make docker-clean
268256
- Check for port conflicts (5001)
269257

270258
#### Dagster UI Not Accessible
271-
- Verify dagit container is running
272-
- Check if code server is accessible
273-
- Review workspace.yaml configuration
259+
- Verify webserver container is running (now includes embedded user code)
260+
- Review workspace.yaml configuration for direct Python module loading
261+
- Check if user code is properly embedded in consolidated container
274262

275-
#### Code Server Issues
276-
- Check if all dependencies are installed
263+
#### User Code Issues
264+
- Check if all dependencies are installed in consolidated container
277265
- Verify environment variables are set
278266
- Check DuckDB volume mount
267+
- Ensure Python module loading is working (no gRPC server needed)
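
To check the last point, one option is to try importing the user code module inside the consolidated container and report any import-time error. This is a minimal sketch, assuming the code is importable as `anomstack.main` (inferred from the `anomstack/main.py` path used elsewhere in this commit; adjust it to match your `workspace.yaml`). `check_module_loads` is a hypothetical helper, not part of Anomstack or Dagster.

```python
import importlib


def check_module_loads(module_name: str) -> tuple[bool, str]:
    """Try to import `module_name`; return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        # Any import-time failure (missing dependency, syntax error, bad env var)
        # is reported with its exception type so the cause is visible.
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"
```

Run it from a Python session inside the container, for example in the shell opened by `make docker-shell-webserver`, and call `check_module_loads("anomstack.main")`. A `False` result carries the exception text, which usually points to a missing dependency or environment variable.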
279268

280269
## Production Considerations
281270

@@ -297,8 +286,7 @@ make docker-clean
297286
## Building Images
298287

299288
All images are built locally from their respective Dockerfiles:
300-
- `anomstack_code_image` - Built from `docker/Dockerfile.anomstack_code`
301-
- `anomstack_dagster_image` - Built from `docker/Dockerfile.dagster`
289+
- `anomstack_dagster_image` - Built from `docker/Dockerfile.dagster` (includes both webserver and user code)
302290
- `anomstack_dashboard_image` - Built from `docker/Dockerfile.anomstack_dashboard`
303291

304292
### Building Images
