| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +Anomstack is an open-source anomaly detection system built on Dagster and FastHTML. It provides ML-powered anomaly detection for metrics from various data sources (BigQuery, Snowflake, ClickHouse, DuckDB, SQLite, etc.) with built-in alerting via email/Slack. |
| 8 | + |
| 9 | +## Development Commands |
| 10 | + |
| 11 | +### Local Development |
| 12 | +- `make local` - Start Dagster locally with dev setup |
| 13 | +- `make dashboard` - Start FastHTML dashboard locally (port 5003) |
| 14 | +- `make dashboard-uvicorn` - Start dashboard with uvicorn (hot reload) |
| 15 | +- `make dashboard-local-dev` - Start dashboard with seeded test data |
| 16 | + |
| 17 | +### Docker Operations |
| 18 | +- `make docker` - Start all services with Docker Compose |
| 19 | +- `make docker-dev` - Start with local development images |
| 20 | +- `make docker-smart` - Build fresh images and start containers |
| 21 | +- `make docker-logs` - View logs for all containers |
| 22 | +- `make docker-restart` - Restart all containers (useful for .env changes) |
| 23 | + |
| 24 | +### Testing & Quality |
| 25 | +- `pytest` or `make tests` - Run test suite |
| 26 | +- `make pre-commit` - Run pre-commit hooks (ruff linting) |
| 27 | +- `make coverage` - Run tests with coverage report |
| 28 | + |
| 29 | +### Database Seeding |
| 30 | +- `make seed-local-db` - Seed local DB with python_ingest_simple data |
| 31 | +- `make seed-local-db-all` - Seed with all example metric batches |
| 32 | +- `make seed-local-db-custom BATCHES='batch1,batch2' DB_PATH='path/to/db'` |
| 33 | + |
| 34 | +## Architecture |
| 35 | + |
| 36 | +### Core Components |
| 37 | +- **anomstack/**: Main application code |
| 38 | + - `main.py`: Dagster definitions and job orchestration |
| 39 | + - `config.py`: Configuration management |
| 40 | + - `jobs/`: Dagster jobs (ingest, train, score, alert, plot) |
| 41 | + - `ml/`: Machine learning components (PyOD models) |
| 42 | + - `external/`: Database connectors (BigQuery, Snowflake, etc.) |
| 43 | + - `alerts/`: Email/Slack alerting system |
| 44 | +- **dashboard/**: FastHTML web dashboard with MonsterUI |
| 45 | +- **metrics/**: Metric batch configurations (.yaml) and SQL queries (.sql) |
| 46 | + - `defaults/`: Default configuration parameters |
| 47 | + - `examples/`: Example metric batches for various data sources |
| 48 | + |
| 49 | +### Metric Batch System |
| 50 | +Metrics are organized into "batches": collections of related metrics that share configuration. Each batch consists of: |
| 51 | +- A `.yaml` config file defining parameters (database, schedule, alert methods) |
| 52 | +- A `.sql` file with a query, or a custom Python ingest function |
| 53 | +- Custom preprocessing functions (optional) |
| 54 | + |
| 55 | +### Jobs Workflow |
| 56 | +1. **Ingest**: Run SQL/Python to collect metrics |
| 57 | +2. **Train**: Train PyOD anomaly detection models |
| 58 | +3. **Score**: Score new data points for anomalies |
| 59 | +4. **Alert**: Send email/Slack alerts for detected anomalies |
| 60 | +5. **Plot**: Generate visualizations in Dagster UI |
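
The job sequence above can be sketched in plain Python. This is only an illustration of the control flow, not Anomstack's code: the function names are hypothetical, a simple z-score stands in for a PyOD model, and the plot step is omitted.

```python
from statistics import mean, stdev


def ingest():
    """Stand-in for the ingest step: return (timestamp, value) pairs."""
    values = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8, 10.1, 25.0]
    return list(enumerate(values))


def train(history):
    """Stand-in for the train step: fit a mean/stdev 'model' (not PyOD)."""
    values = [value for _, value in history]
    return {"mean": mean(values), "stdev": stdev(values)}


def score(model, point):
    """Stand-in for the score step: absolute z-score of a single point."""
    _, value = point
    return abs(value - model["mean"]) / model["stdev"]


def alert(point, point_score, threshold=3.0):
    """Stand-in for the alert step: return a message, or None if no alert."""
    if point_score > threshold:
        return f"anomaly at t={point[0]}: value={point[1]}, score={point_score:.1f}"
    return None


data = ingest()
model = train(data[:-1])             # train on everything except the newest point
latest = data[-1]
latest_score = score(model, latest)  # score only the newest point
message = alert(latest, latest_score)
if message:
    print(message)
```

In the real system each step is a separate Dagster job run on its own schedule, so training and scoring do not have to happen in the same run as they do here.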
| 61 | + |
| 62 | +### Database Storage |
| 63 | +All data is stored in a long-format `metrics` table with these columns: |
| 64 | +- `metric_timestamp`, `metric_batch`, `metric_name`, `metric_type`, `metric_value` |
| 65 | +- `metric_type` can be: 'metric' (raw data), 'score' (anomaly score), 'alert' (alert flag) |
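
A minimal sketch of the long-format layout, using an in-memory SQLite database. The column names come from the list above; the column types, the sample rows, and the `values_of_type` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE metrics (
        metric_timestamp TEXT,  -- column types are assumed, not taken from the project
        metric_batch     TEXT,
        metric_name      TEXT,
        metric_type      TEXT,  -- 'metric', 'score', or 'alert'
        metric_value     REAL
    )
    """
)

# One observation produces up to three rows: the raw value, its score, and an alert flag.
rows = [
    ("2024-01-01 00:00:00", "python_ingest_simple", "metric1", "metric", 10.2),
    ("2024-01-01 00:00:00", "python_ingest_simple", "metric1", "score", 0.93),
    ("2024-01-01 00:00:00", "python_ingest_simple", "metric1", "alert", 1.0),
]
conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?, ?)", rows)


def values_of_type(connection, metric_type):
    """Return (metric_name, metric_value) pairs for one metric_type."""
    cursor = connection.execute(
        "SELECT metric_name, metric_value FROM metrics "
        "WHERE metric_type = ? ORDER BY metric_timestamp",
        (metric_type,),
    )
    return cursor.fetchall()


print(values_of_type(conn, "score"))
```

Because raw values, scores, and alerts share one table, adding a new metric needs no schema change. Filtering on `metric_type` separates the three kinds of rows.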
| 66 | + |
| 67 | +## Configuration |
| 68 | + |
| 69 | +### Environment Files |
| 70 | +- `.env`: Main environment configuration |
| 71 | +- `profiles/`: Environment profiles for different deployments |
| 72 | + - `local-dev.env`: Local development with simple examples |
| 73 | + - `demo.env`: Demo configuration for Fly.io deployment |
| 74 | + - `production.env`: Production settings |
| 75 | + |
| 76 | +### Override Pattern |
| 77 | +Environment variables can override metric batch config using the pattern: |
| 78 | +`ANOMSTACK__<METRIC_BATCH>__<PARAM>` (uppercase, with dashes replaced by underscores) |
| 79 | + |
| 80 | +Example: |
| 81 | +```bash |
| 82 | +ANOMSTACK__PYTHON_INGEST_SIMPLE__DB=bigquery |
| 83 | +ANOMSTACK__PYTHON_INGEST_SIMPLE__ALERT_METHODS=email |
| 84 | +``` |
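
The naming rule can be sketched in Python as follows. `override_key` and `apply_overrides` are hypothetical helpers written for this illustration, not functions from `anomstack/config.py`. The sketch only overrides parameters already present in the config and leaves values as strings.

```python
import os


def override_key(metric_batch: str, param: str) -> str:
    """Build the environment variable name for a batch-level override."""

    def normalize(part: str) -> str:
        # Uppercase, with dashes replaced by underscores.
        return part.upper().replace("-", "_")

    return f"ANOMSTACK__{normalize(metric_batch)}__{normalize(param)}"


def apply_overrides(metric_batch: str, config: dict, environ=os.environ) -> dict:
    """Return a copy of config with any matching environment overrides applied."""
    merged = dict(config)
    for param in config:
        value = environ.get(override_key(metric_batch, param))
        if value is not None:
            merged[param] = value
    return merged


if __name__ == "__main__":
    env = {"ANOMSTACK__PYTHON_INGEST_SIMPLE__DB": "bigquery"}
    defaults = {"db": "duckdb", "alert_methods": "email,slack"}
    print(apply_overrides("python_ingest_simple", defaults, env))
```

Passing `environ` as a parameter keeps the helper testable without touching the real process environment.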
| 85 | + |
| 86 | +## Key Files to Know |
| 87 | + |
| 88 | +### Configuration |
| 89 | +- `dagster.yaml`: Dagster configuration |
| 90 | +- `metrics/defaults/defaults.yaml`: Default parameters for all metric batches |
| 91 | +- `pyproject.toml`: Ruff linting configuration |
| 92 | + |
| 93 | +### Entry Points |
| 94 | +- `anomstack/main.py`: Main Dagster definitions |
| 95 | +- `dashboard/app.py`: FastHTML dashboard application |
| 96 | + |
| 97 | +### Database Connectors |
| 98 | +- `anomstack/external/`: Connectors for BigQuery, Snowflake, ClickHouse, DuckDB, SQLite |
| 99 | + |
| 100 | +## Development Notes |
| 101 | + |
| 102 | +### Code Style |
| 103 | +- Uses ruff for linting (line length: 100) |
| 104 | +- Star imports allowed in dashboard modules |
| 105 | +- Pre-commit hooks enforce code quality |
| 106 | + |
| 107 | +### Testing |
| 108 | +- Tests in `tests/` directory |
| 109 | +- Use pytest for running tests |
| 110 | +- Test coverage tracking with badges in README |
| 111 | + |
| 112 | +### Deployment Options |
| 113 | +- Local Python environment |
| 114 | +- Docker Compose (recommended for development) |
| 115 | +- Fly.io (production deployment) |
| 116 | +- Dagster Cloud (serverless) |
| 117 | +- GitHub Codespaces |
| 118 | + |
| 119 | +### Metric Examples |
| 120 | +The `metrics/examples/` directory contains ready-to-use examples: |
| 121 | +- HackerNews story metrics via API |
| 122 | +- Weather data from Open Meteo |
| 123 | +- Stock prices from Yahoo Finance |
| 124 | +- System metrics from Netdata |
| 125 | +- Simple Python-generated test metrics |
| 126 | + |
| 127 | +When adding new metrics, follow existing patterns in examples and ensure proper `.yaml` configuration with required fields like `metric_batch`, `db`, and cron schedules. |
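
A quick pre-flight check for a new batch config might look like the sketch below. It assumes the YAML has already been loaded into a dict. The `ingest_cron_schedule` key name and the `missing_fields` helper are assumptions for illustration, so check `metrics/defaults/defaults.yaml` for the actual field names.

```python
# Field names to check; "ingest_cron_schedule" is an assumed key name.
REQUIRED_FIELDS = ("metric_batch", "db", "ingest_cron_schedule")


def missing_fields(config: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent or empty in config."""
    return [field for field in required if not config.get(field)]


if __name__ == "__main__":
    candidate = {"metric_batch": "my_new_batch", "db": "duckdb"}
    print(missing_fields(candidate))
```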