| 1 | +--- |
| 2 | +sidebar_position: 1 |
| 3 | +--- |
| 4 | + |
| 5 | +# Environment Variables |
| 6 | + |
| 7 | +This page documents all environment variables available in Anomstack. Copy the [`.example.env`](https://github.com/andrewm4894/anomstack/blob/main/.example.env) file to `.env` and configure the variables you need. |
| 8 | + |
| 9 | +```bash |
| 10 | +cp .example.env .env |
| 11 | +``` |
| 12 | + |
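To illustrate what "loading" the copied `.env` file amounts to, here is a minimal standard-library sketch. It is not Anomstack's own loader (the project may rely on Docker Compose or a dotenv library instead); the function names `parse_env_text` and `load_env_file` are hypothetical and exist only for this example.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comment lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env_file(path: str = ".env", override: bool = False) -> dict[str, str]:
    """Copy values from a .env file into os.environ.

    Variables already set in the process environment win unless override=True.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    values = parse_env_text(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values
```

Calling `load_env_file()` from the project root returns the parsed values, which makes it easy to print and confirm that the variables you set were actually picked up.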
| 13 | +## 🗄️ Database & Data Sources |
| 14 | + |
| 15 | +### Google Cloud Platform |
| 16 | +Configure access to BigQuery and Google Cloud Storage. |
| 17 | + |
| 18 | +| Variable | Required | Description | Example | |
| 19 | +|----------|----------|-------------|---------| |
| 20 | +| `ANOMSTACK_GOOGLE_APPLICATION_CREDENTIALS` | No | Path to GCP service account JSON file | `/path/to/credentials.json` | |
| 21 | +| `ANOMSTACK_GOOGLE_APPLICATION_CREDENTIALS_JSON` | No | GCP credentials as JSON string (alternative to file path) | `{"type": "service_account", ...}` | |
| 22 | +| `ANOMSTACK_GCP_PROJECT_ID` | No | Google Cloud Project ID for BigQuery | `my-project-123` | |
| 23 | + |
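The two credential variables are alternatives: one points at a file, the other carries the JSON itself. The sketch below shows one way a consumer could resolve them into a single dict. The helper name `resolve_gcp_credentials` is hypothetical, and the precedence (JSON string first, then file path) is an assumption made for the example, not documented Anomstack behavior.

```python
import json
import os
from typing import Mapping, Optional


def resolve_gcp_credentials(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return service-account info as a dict, or None if neither variable is set."""
    # Assumption: the inline JSON string is checked before the file path.
    raw_json = env.get("ANOMSTACK_GOOGLE_APPLICATION_CREDENTIALS_JSON", "").strip()
    if raw_json:
        return json.loads(raw_json)
    path = env.get("ANOMSTACK_GOOGLE_APPLICATION_CREDENTIALS", "").strip()
    if path:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    return None
```

Returning `None` when nothing is set mirrors the "Required: No" column: a deployment that does not use GCP simply leaves both variables empty.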
| 24 | +### Snowflake |
| 25 | +Connect to a Snowflake data warehouse. |
| 26 | + |
| 27 | +| Variable | Required | Description | Example | |
| 28 | +|----------|----------|-------------|---------| |
| 29 | +| `ANOMSTACK_SNOWFLAKE_ACCOUNT` | No | Snowflake account identifier | `xy12345.us-east-1` | |
| 30 | +| `ANOMSTACK_SNOWFLAKE_USER` | No | Snowflake username | `anomstack_user` | |
| 31 | +| `ANOMSTACK_SNOWFLAKE_PASSWORD` | No | Snowflake password | `your-password` | |
| 32 | +| `ANOMSTACK_SNOWFLAKE_WAREHOUSE` | No | Snowflake warehouse name | `ANOMSTACK_WH` | |
| 33 | + |
| 34 | +### AWS |
| 35 | +Connect to S3 and other AWS services. |
| 36 | + |
| 37 | +| Variable | Required | Description | Example | |
| 38 | +|----------|----------|-------------|---------| |
| 39 | +| `ANOMSTACK_AWS_ACCESS_KEY_ID` | No | AWS access key ID | `AKIAIOSFODNN7EXAMPLE` | |
| 40 | +| `ANOMSTACK_AWS_SECRET_ACCESS_KEY` | No | AWS secret access key | `wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY` | |
| 41 | + |
| 42 | +### ClickHouse |
| 43 | +Connect to a ClickHouse database. |
| 44 | + |
| 45 | +| Variable | Required | Description | Default | Example | |
| 46 | +|----------|----------|-------------|---------|---------| |
| 47 | +| `ANOMSTACK_CLICKHOUSE_HOST` | No | ClickHouse host | `localhost` | `clickhouse.example.com` | |
| 48 | +| `ANOMSTACK_CLICKHOUSE_PORT` | No | ClickHouse port | `8123` | `8123` | |
| 49 | +| `ANOMSTACK_CLICKHOUSE_USER` | No | ClickHouse username | `anomstack` | `admin` | |
| 50 | +| `ANOMSTACK_CLICKHOUSE_PASSWORD` | No | ClickHouse password | `anomstack` | `your-password` | |
| 51 | +| `ANOMSTACK_CLICKHOUSE_DATABASE` | No | ClickHouse database | `default` | `metrics` | |
| 52 | + |
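As a rough illustration of how the defaults in the ClickHouse table combine with whatever is set in the environment, the following sketch builds a settings dict. The name `clickhouse_settings` and the returned key names are hypothetical; only the variable names and default values come from the table above.

```python
import os
from typing import Mapping

# Defaults copied from the ClickHouse table above.
CLICKHOUSE_DEFAULTS = {
    "HOST": "localhost",
    "PORT": "8123",
    "USER": "anomstack",
    "PASSWORD": "anomstack",
    "DATABASE": "default",
}


def clickhouse_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Merge ANOMSTACK_CLICKHOUSE_* variables over the documented defaults."""
    settings = {}
    for name, default in CLICKHOUSE_DEFAULTS.items():
        # An unset or empty variable falls back to the default.
        settings[name.lower()] = env.get(f"ANOMSTACK_CLICKHOUSE_{name}") or default
    settings["port"] = int(settings["port"])  # ports are numeric
    return settings
```

Because the default user and password are both `anomstack`, any deployment reachable from outside your machine should set `ANOMSTACK_CLICKHOUSE_USER` and `ANOMSTACK_CLICKHOUSE_PASSWORD` explicitly.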
| 53 | +### MotherDuck & Turso |
| 54 | +Hosted DuckDB (MotherDuck) and SQLite-compatible (Turso) database services. |
| 55 | + |
| 56 | +| Variable | Required | Description | Example | |
| 57 | +|----------|----------|-------------|---------| |
| 58 | +| `ANOMSTACK_MOTHERDUCK_TOKEN` | No | MotherDuck authentication token | `your-motherduck-token` | |
| 59 | +| `ANOMSTACK_TURSO_DATABASE_URL` | No | Turso database URL | `libsql://your-db.turso.io` | |
| 60 | +| `ANOMSTACK_TURSO_AUTH_TOKEN` | No | Turso authentication token | `your-turso-token` | |
| 61 | + |
| 62 | +## 💾 Storage Configuration |
| 63 | + |
| 64 | +### Database Paths |
| 65 | +Configure where metrics and metadata are stored. |
| 66 | + |
| 67 | +| Variable | Required | Description | Docker Default | Local Default | |
| 68 | +|----------|----------|-------------|---------------|---------------| |
| 69 | +| `ANOMSTACK_DUCKDB_PATH` | No | DuckDB database path | `/metrics_db/duckdb/anomstack.db` | `tmpdata/anomstack-duckdb.db` | |
| 70 | +| `ANOMSTACK_SQLITE_PATH` | No | SQLite database path | `tmpdata/anomstack-sqlite.db` | `tmpdata/anomstack-sqlite.db` | |
| 71 | +| `ANOMSTACK_TABLE_KEY` | No | Table identifier for metrics | `tmp.metrics` | `production.metrics` | |
| 72 | + |
| 73 | +### Model Storage |
| 74 | +Configure where trained ML models are stored. |
| 75 | + |
| 76 | +| Variable | Required | Description | Examples | |
| 77 | +|----------|----------|-------------|----------| |
| 78 | +| `ANOMSTACK_MODEL_PATH` | No | Model storage location | `local://./tmp/models`<br/>`gs://your-bucket/models`<br/>`s3://your-bucket/models` | |
| 79 | + |
| 80 | +**Storage Options:** |
| 81 | +- **Local**: `local://./tmp/models` (default) |
| 82 | +- **Google Cloud Storage**: `gs://your-bucket/models` |
| 83 | +- **AWS S3**: `s3://your-bucket/models` |
| 84 | + |
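The three storage options differ only in their URL scheme, so a consumer can branch on the prefix. The sketch below splits an `ANOMSTACK_MODEL_PATH` value into scheme, bucket, and path. `ModelLocation` and `parse_model_path` are hypothetical names for this example and are not taken from the Anomstack codebase.

```python
from typing import NamedTuple

SUPPORTED_SCHEMES = ("local", "gs", "s3")


class ModelLocation(NamedTuple):
    scheme: str  # "local", "gs", or "s3"
    bucket: str  # empty for local paths
    path: str    # directory (local) or object prefix (gs/s3)


def parse_model_path(model_path: str) -> ModelLocation:
    """Split a model path such as gs://your-bucket/models into its parts."""
    scheme, separator, rest = model_path.partition("://")
    if not separator or scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"Unsupported ANOMSTACK_MODEL_PATH: {model_path!r}")
    if scheme == "local":
        return ModelLocation(scheme, "", rest)
    bucket, _, prefix = rest.partition("/")
    return ModelLocation(scheme, bucket, prefix)
```

For example, `parse_model_path("gs://your-bucket/models")` yields the scheme `gs`, the bucket `your-bucket`, and the prefix `models`, while a typo such as a missing `://` fails early with a `ValueError`.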
| 85 | +### Application Paths |
| 86 | +Internal directory configuration. |
| 87 | + |
| 88 | +| Variable | Required | Description | Default | |
| 89 | +|----------|----------|-------------|---------| |
| 90 | +| `ANOMSTACK_HOME` | No | Home directory for Anomstack | `.` (current directory) | |
| 91 | + |
| 92 | +## 📧 Alert Configuration |
| 93 | + |
| 94 | +### Email Alerts |
| 95 | +Configure email notifications for anomalies. |
| 96 | + |
| 97 | +| Variable | Required | Description | Default | Example | |
| 98 | +|----------|----------|-------------|---------|---------| |
| 99 | +| `ANOMSTACK_ALERT_EMAIL_FROM` | No | Sender email address | | `alerts@example.com` | |
| 100 | +| `ANOMSTACK_ALERT_EMAIL_TO` | No | Recipient email address | | `team@example.com` | |
| 101 | +| `ANOMSTACK_ALERT_EMAIL_SMTP_HOST` | No | SMTP server host | `smtp.gmail.com` | `smtp.office365.com` | |
| 102 | +| `ANOMSTACK_ALERT_EMAIL_SMTP_PORT` | No | SMTP server port | `587` | `25` | |
| 103 | +| `ANOMSTACK_ALERT_EMAIL_PASSWORD` | No | Email password/app token | | `your-app-password` | |
| 104 | + |
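To show how the email variables fit together, here is a hedged sketch that builds and sends a message with Python's standard `smtplib` and `email` modules. The function names are hypothetical, and two details are assumptions rather than documented behavior: that the server accepts STARTTLS on the configured port, and that the sender address doubles as the SMTP login name.

```python
import os
import smtplib
from email.message import EmailMessage
from typing import Mapping


def build_alert_email(subject: str, body: str,
                      env: Mapping[str, str] = os.environ) -> EmailMessage:
    """Create a plain-text message using the configured sender and recipient."""
    message = EmailMessage()
    message["From"] = env["ANOMSTACK_ALERT_EMAIL_FROM"]
    message["To"] = env["ANOMSTACK_ALERT_EMAIL_TO"]
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_alert_email(message: EmailMessage,
                     env: Mapping[str, str] = os.environ) -> None:
    """Send the message, falling back to the defaults from the table above."""
    host = env.get("ANOMSTACK_ALERT_EMAIL_SMTP_HOST") or "smtp.gmail.com"
    port = int(env.get("ANOMSTACK_ALERT_EMAIL_SMTP_PORT") or 587)
    with smtplib.SMTP(host, port) as server:
        server.starttls()  # assumption: the server supports STARTTLS
        # Assumption: the sender address is also the SMTP login name.
        server.login(env["ANOMSTACK_ALERT_EMAIL_FROM"],
                     env["ANOMSTACK_ALERT_EMAIL_PASSWORD"])
        server.send_message(message)
```

Running `build_alert_email` on its own is a quick way to check that the sender and recipient are set before debugging SMTP connectivity.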
| 105 | +### Failure Email Alerts |
| 106 | +Separate email configuration for job failures. |
| 107 | + |
| 108 | +| Variable | Required | Description | Example | |
| 109 | +|----------|----------|-------------|---------| |
| 110 | +| `ANOMSTACK_FAILURE_EMAIL_FROM` | No | Sender for failure alerts | `alerts@example.com` | |
| 111 | +| `ANOMSTACK_FAILURE_EMAIL_TO` | No | Recipient for failure alerts | `oncall@example.com` | |
| 112 | +| `ANOMSTACK_FAILURE_EMAIL_SMTP_HOST` | No | SMTP host for failures | `smtp.gmail.com` | |
| 113 | +| `ANOMSTACK_FAILURE_EMAIL_SMTP_PORT` | No | SMTP port for failures | `587` | |
| 114 | +| `ANOMSTACK_FAILURE_EMAIL_PASSWORD` | No | Email password for failures | `your-app-password` | |
| 115 | + |
| 116 | +### Slack Alerts |
| 117 | +Configure Slack notifications. |
| 118 | + |
| 119 | +| Variable | Required | Description | Example | |
| 120 | +|----------|----------|-------------|---------| |
| 121 | +| `ANOMSTACK_SLACK_BOT_TOKEN` | No | Slack bot token | `xoxb-your-bot-token` | |
| 122 | +| `ANOMSTACK_SLACK_CHANNEL` | No | Slack channel for alerts | `#anomaly-alerts` | |
| 123 | + |
| 124 | +## 🤖 LLM Integration |
| 125 | + |
| 126 | +### OpenAI |
| 127 | +Configure AI-powered anomaly detection and alerts. |
| 128 | + |
| 129 | +| Variable | Required | Description | Default | Example | |
| 130 | +|----------|----------|-------------|---------|---------| |
| 131 | +| `ANOMSTACK_OPENAI_KEY` | No | OpenAI API key | | `sk-...` | |
| 132 | +| `OPENAI_API_KEY` | No | Alternative OpenAI API key | | `sk-...` | |
| 133 | +| `ANOMSTACK_OPENAI_MODEL` | No | OpenAI model to use | `gpt-4o-mini` | `gpt-4o` | |
| 134 | + |
| 135 | +### Anthropic |
| 136 | +Alternative LLM provider. |
| 137 | + |
| 138 | +| Variable | Required | Description | Default | Example | |
| 139 | +|----------|----------|-------------|---------|---------| |
| 140 | +| `ANOMSTACK_ANTHROPIC_KEY` | No | Anthropic API key | | `sk-ant-...` | |
| 141 | +| `ANOMSTACK_ANTHROPIC_MODEL` | No | Anthropic model | `claude-3-haiku-20240307` | `claude-3-sonnet-20240229` | |
| 142 | + |
| 143 | +### LLM Platform Selection |
| 144 | + |
| 145 | +| Variable | Required | Description | Default | Options | |
| 146 | +|----------|----------|-------------|---------|---------| |
| 147 | +| `ANOMSTACK_LLM_PLATFORM` | No | Which LLM provider to use | `openai` | `openai`, `anthropic` | |
| 148 | + |
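The platform variable decides which key and model variables are relevant. The following sketch resolves them into one dict, using the defaults from the tables above. `resolve_llm_config` is a hypothetical helper, and the choice to prefer `ANOMSTACK_OPENAI_KEY` over `OPENAI_API_KEY` when both are set is an assumption for this example.

```python
import os
from typing import Mapping

# Default model names copied from the OpenAI and Anthropic tables above.
DEFAULT_MODELS = {
    "openai": "gpt-4o-mini",
    "anthropic": "claude-3-haiku-20240307",
}


def resolve_llm_config(env: Mapping[str, str] = os.environ) -> dict:
    """Pick the provider, model, and API key based on ANOMSTACK_LLM_PLATFORM."""
    platform = (env.get("ANOMSTACK_LLM_PLATFORM") or "openai").lower()
    if platform not in DEFAULT_MODELS:
        raise ValueError(f"Unknown ANOMSTACK_LLM_PLATFORM: {platform!r}")
    if platform == "openai":
        # Assumption: the ANOMSTACK_-prefixed key wins over OPENAI_API_KEY.
        api_key = env.get("ANOMSTACK_OPENAI_KEY") or env.get("OPENAI_API_KEY")
        model = env.get("ANOMSTACK_OPENAI_MODEL") or DEFAULT_MODELS["openai"]
    else:
        api_key = env.get("ANOMSTACK_ANTHROPIC_KEY")
        model = env.get("ANOMSTACK_ANTHROPIC_MODEL") or DEFAULT_MODELS["anthropic"]
    return {"platform": platform, "model": model, "api_key": api_key}
```

An `api_key` of `None` in the result means the selected platform has no key configured, which is worth checking before enabling any LLM-backed alerts.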
| 149 | +### LangSmith Tracing |
| 150 | +Optional LLM call tracing and monitoring. |
| 151 | + |
| 152 | +| Variable | Required | Description | Default | Example | |
| 153 | +|----------|----------|-------------|---------|---------| |
| 154 | +| `LANGSMITH_TRACING` | No | Enable LangSmith tracing | `true` | `false` | |
| 155 | +| `LANGSMITH_ENDPOINT` | No | LangSmith API endpoint | `https://api.smith.langchain.com` | | |
| 156 | +| `LANGSMITH_API_KEY` | No | LangSmith API key | | `your-api-key` | |
| 157 | +| `LANGSMITH_PROJECT` | No | LangSmith project name | `anomaly-agent` | `your-project` | |
| 158 | + |
| 159 | +## ⚙️ Dagster Configuration |
| 160 | + |
| 161 | +### Core Dagster Settings |
| 162 | + |
| 163 | +| Variable | Required | Description | Default | Example | |
| 164 | +|----------|----------|-------------|---------|---------| |
| 165 | +| `DAGSTER_LOG_LEVEL` | No | Dagster logging level | `DEBUG` | `INFO`, `WARNING`, `ERROR` | |
| 166 | +| `DAGSTER_CONCURRENCY` | No | Number of concurrent jobs | `4` | `8` | |
| 167 | + |
| 168 | +### Dagster Directories |
| 169 | +Storage directories and run-queue settings, with lightweight defaults intended to limit disk usage. |
| 170 | + |
| 171 | +| Variable | Required | Description | Default | |
| 172 | +|----------|----------|-------------|---------| |
| 173 | +| `ANOMSTACK_DAGSTER_LOCAL_ARTIFACT_STORAGE_DIR` | No | Artifacts storage directory | `tmp_light/artifacts` | |
| 174 | +| `ANOMSTACK_DAGSTER_OVERALL_CONCURRENCY_LIMIT` | No | Overall concurrency limit | `5` | |
| 175 | +| `ANOMSTACK_DAGSTER_DEQUEUE_USE_THREADS` | No | Use threads for dequeuing | `false` | |
| 176 | +| `ANOMSTACK_DAGSTER_DEQUEUE_NUM_WORKERS` | No | Number of dequeue workers | `2` | |
| 177 | +| `ANOMSTACK_DAGSTER_LOCAL_COMPUTE_LOG_MANAGER_DIRECTORY` | No | Compute logs directory | `tmp_light/compute_logs` | |
| 178 | +| `ANOMSTACK_DAGSTER_SQLITE_STORAGE_BASE_DIR` | No | SQLite storage base directory | `tmp_light/storage` | |
| 179 | + |
| 180 | +### Job Timeout Configuration |
| 181 | + |
| 182 | +| Variable | Required | Description | Default | Example | |
| 183 | +|----------|----------|-------------|---------|---------| |
| 184 | +| `ANOMSTACK_MAX_RUNTIME_SECONDS_TAG` | No | Max job runtime in seconds | `3600` | `7200` | |
| 185 | +| `ANOMSTACK_KILL_RUN_AFTER_MINUTES` | No | Kill long-running jobs after N minutes | `60` | `120` | |
| 186 | + |
| 187 | +## 🐳 Docker & Deployment |
| 188 | + |
| 189 | +### PostgreSQL (Docker) |
| 190 | +Database for Dagster metadata when using Docker. |
| 191 | + |
| 192 | +| Variable | Required | Description | Default | |
| 193 | +|----------|----------|-------------|---------| |
| 194 | +| `ANOMSTACK_POSTGRES_USER` | No | PostgreSQL username | `postgres_user` | |
| 195 | +| `ANOMSTACK_POSTGRES_PASSWORD` | No | PostgreSQL password | `postgres_password` | |
| 196 | +| `ANOMSTACK_POSTGRES_DB` | No | PostgreSQL database name | `postgres_db` | |
| 197 | +| `ANOMSTACK_POSTGRES_FORWARD_PORT` | No | Local port forwarding | `5432` (leave blank to disable) | |
| 198 | + |
| 199 | +### Dashboard Configuration |
| 200 | + |
| 201 | +| Variable | Required | Description | Default | |
| 202 | +|----------|----------|-------------|---------| |
| 203 | +| `ANOMSTACK_DASHBOARD_PORT` | No | Dashboard port | `5001` | |
| 204 | + |
| 205 | +## 🔧 Advanced Configuration |
| 206 | + |
| 207 | +### Example Metrics |
| 208 | + |
| 209 | +| Variable | Required | Description | Default | Options | |
| 210 | +|----------|----------|-------------|---------|---------| |
| 211 | +| `ANOMSTACK_IGNORE_EXAMPLES` | No | Ignore example metrics | `no` | `yes`, `no` | |
| 212 | + |
| 213 | +### Auto-Reload Configuration |
| 214 | +Automatically reload configuration when files change. |
| 215 | + |
| 216 | +| Variable | Required | Description | Default | Example | |
| 217 | +|----------|----------|-------------|---------|---------| |
| 218 | +| `ANOMSTACK_AUTO_CONFIG_RELOAD` | No | Enable scheduled config reload | `false` | `true` | |
| 219 | +| `ANOMSTACK_CONFIG_RELOAD_CRON` | No | Config reload schedule | `*/5 * * * *` | `*/10 * * * *` | |
| 220 | +| `ANOMSTACK_CONFIG_RELOAD_STATUS` | No | Config reload job status | `STOPPED` | `RUNNING` | |
| 221 | +| `ANOMSTACK_CONFIG_WATCHER` | No | Enable smart file watcher | `true` | `false` | |
| 222 | +| `ANOMSTACK_CONFIG_WATCHER_INTERVAL` | No | File watcher check interval (seconds) | `30` | `60` | |
| 223 | + |
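The reload settings mix booleans, a cron string, and an integer, all of which arrive as plain strings. A minimal sketch of turning them into typed values follows. The helper names are hypothetical, and the set of strings treated as "true" is an assumption, since the table only shows `true` and `false`.

```python
import os
from typing import Mapping

# Assumption: these spellings count as "enabled"; the table only shows true/false.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Read a boolean variable, returning the default when unset or empty."""
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default
    return raw.strip().lower() in TRUTHY


def reload_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the auto-reload variables with the defaults from the table above."""
    return {
        "scheduled_reload": env_flag(env, "ANOMSTACK_AUTO_CONFIG_RELOAD", False),
        "reload_cron": env.get("ANOMSTACK_CONFIG_RELOAD_CRON") or "*/5 * * * *",
        "watcher": env_flag(env, "ANOMSTACK_CONFIG_WATCHER", True),
        "watcher_interval_seconds": int(
            env.get("ANOMSTACK_CONFIG_WATCHER_INTERVAL") or 30
        ),
    }
```

Note that the scheduled reload is off by default while the file watcher is on, so an unconfigured install already reacts to file changes.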
| 224 | +### Analytics |
| 225 | + |
| 226 | +| Variable | Required | Description | Example | |
| 227 | +|----------|----------|-------------|---------| |
| 228 | +| `POSTHOG_API_KEY` | No | PostHog analytics API key | `phc_...` | |
| 229 | + |
| 230 | +## 🎛️ Per-Metric Batch Overrides |
| 231 | + |
| 232 | +You can override any configuration parameter for specific metric batches using environment variables: |
| 233 | + |
| 234 | +```bash |
| 235 | +ANOMSTACK__<METRIC_BATCH>__<PARAMETER>=<VALUE> |
| 236 | +``` |
| 237 | + |
| 238 | +**Format Rules:** |
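| 239 | +- `<METRIC_BATCH>`: Metric batch name in uppercase, with dashes replaced by underscores |
| 240 | +- `<PARAMETER>`: Parameter name in uppercase, keeping its underscores (for example, `alert_methods` becomes `ALERT_METHODS`) |
| 240 | +- The prefix, batch name, and parameter are separated by double underscores (`__`) |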
| 241 | + |
| 242 | +**Examples:** |
| 243 | +```bash |
| 244 | +# Override database for python_ingest_simple metric batch |
| 245 | +ANOMSTACK__PYTHON_INGEST_SIMPLE__DB=bigquery |
| 246 | + |
| 247 | +# Override alert methods |
| 248 | +ANOMSTACK__PYTHON_INGEST_SIMPLE__ALERT_METHODS=email |
| 249 | + |
| 250 | +# Override schedule |
| 251 | +ANOMSTACK__PYTHON_INGEST_SIMPLE__INGEST_CRON_SCHEDULE="*/1 * * * *" |
| 252 | + |
| 253 | +# Enable specific job schedules |
| 254 | +ANOMSTACK__PYTHON_INGEST_SIMPLE__INGEST_DEFAULT_SCHEDULE_STATUS=RUNNING |
| 255 | +ANOMSTACK__PYTHON_INGEST_SIMPLE__TRAIN_DEFAULT_SCHEDULE_STATUS=RUNNING |
| 256 | +ANOMSTACK__PYTHON_INGEST_SIMPLE__SCORE_DEFAULT_SCHEDULE_STATUS=RUNNING |
| 257 | +ANOMSTACK__PYTHON_INGEST_SIMPLE__ALERT_DEFAULT_SCHEDULE_STATUS=RUNNING |
| 258 | +``` |
| 259 | + |
| 260 | +This allows you to configure different metric batches differently without modifying YAML files. |
| 261 | + |
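The naming rule for overrides can be expressed in a few lines. The sketch below derives the variable name from a batch and parameter and applies any matching overrides to a config dict. Both function names are hypothetical, and the merge logic is an illustration of the rule, not Anomstack's actual implementation. It only overrides parameters that already exist in the supplied config.

```python
import os
from typing import Mapping


def override_var_name(metric_batch: str, parameter: str) -> str:
    """Build ANOMSTACK__<METRIC_BATCH>__<PARAMETER> following the format rules."""
    batch = metric_batch.upper().replace("-", "_")
    return f"ANOMSTACK__{batch}__{parameter.upper()}"


def apply_overrides(metric_batch: str, spec: dict,
                    env: Mapping[str, str] = os.environ) -> dict:
    """Return a copy of spec with matching environment overrides applied."""
    merged = dict(spec)
    for parameter in spec:
        name = override_var_name(metric_batch, parameter)
        if name in env:
            merged[parameter] = env[name]
    return merged
```

For instance, `override_var_name("python_ingest_simple", "db")` produces `ANOMSTACK__PYTHON_INGEST_SIMPLE__DB`, matching the first example above.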
| 262 | +## 📝 Common Configuration Patterns |
| 263 | + |
| 264 | +### Development Setup |
| 265 | +```bash |
| 266 | +# Use local storage |
| 267 | +ANOMSTACK_DUCKDB_PATH=tmpdata/anomstack-duckdb.db |
| 268 | +ANOMSTACK_MODEL_PATH=local://./tmp/models |
| 269 | +ANOMSTACK_IGNORE_EXAMPLES=no |
| 270 | + |
| 271 | +# Basic email alerts |
| 272 | + |
| 273 | + |
| 274 | +``` |
| 275 | + |
| 276 | +### Production Setup |
| 277 | +```bash |
| 278 | +# Use cloud storage |
| 279 | +ANOMSTACK_MODEL_PATH=gs://company-anomstack/models |
| 280 | +ANOMSTACK_DUCKDB_PATH=/metrics_db/duckdb/anomstack.db |
| 281 | + |
| 282 | +# Production alerts |
| 283 | + |
| 284 | + |
| 285 | +ANOMSTACK_SLACK_CHANNEL=#production-alerts |
| 286 | + |
| 287 | +# Disable examples |
| 288 | +ANOMSTACK_IGNORE_EXAMPLES=yes |
| 289 | +``` |
| 290 | + |
| 291 | +### BigQuery + GCS Setup |
| 292 | +```bash |
| 293 | +# BigQuery connection |
| 294 | +ANOMSTACK_GCP_PROJECT_ID=your-project-id |
| 295 | +ANOMSTACK_GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json |
| 296 | + |
| 297 | +# Use GCS for model storage |
| 298 | +ANOMSTACK_MODEL_PATH=gs://your-bucket/models |
| 299 | + |
| 300 | +# BigQuery table for metrics |
| 301 | +ANOMSTACK_TABLE_KEY=your_dataset.metrics |
| 302 | +``` |
| 303 | + |
| 304 | +## 🔐 Security Best Practices |
| 305 | + |
| 306 | +1. **Use environment files**: Never commit `.env` files with secrets to version control |
| 307 | +2. **Rotate credentials**: Regularly rotate API keys and passwords |
| 308 | +3. **Least privilege**: Use service accounts with minimal required permissions |
| 309 | +4. **Secrets management**: Consider using proper secrets management in production (AWS Secrets Manager, Google Secret Manager, etc.) |
| 310 | +5. **File permissions**: Restrict access to your `.env` file (`chmod 600 .env`) |
| 311 | + |
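The file-permission recommendation can be checked programmatically. This sketch reports whether a file grants any access to group or others, which is what `chmod 600` removes. The function name `env_file_is_private` is hypothetical, and the check is meaningful on POSIX systems only.

```python
import os
import stat


def env_file_is_private(path: str = ".env") -> bool:
    """True if the file grants no permissions to group or others (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A deployment script could call this before starting services and refuse to continue when it returns `False`.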
| 312 | +## 🆘 Troubleshooting |
| 313 | + |
| 314 | +**Environment not loading?** |
| 315 | +- Ensure `.env` file exists in the project root |
| 316 | +- Check file permissions and syntax |
| 317 | +- Verify no extra spaces around `=` signs |
| 318 | + |
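The syntax checks listed above are easy to automate. The sketch below scans `.env` text and reports lines that are not `KEY=VALUE` assignments or that have whitespace around the `=` sign. `find_env_syntax_problems` is a hypothetical helper, and its idea of valid syntax is deliberately strict (for example, it would also flag an `export KEY=value` line).

```python
import re

# KEY=VALUE, capturing any whitespace on either side of the first "=".
_ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)(\s*)=(\s*)(.*)$")


def find_env_syntax_problems(text: str) -> list[tuple[int, str]]:
    """Return (line_number, problem) pairs for suspicious lines in .env text."""
    problems = []
    for number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are fine
        match = _ASSIGNMENT.match(line)
        if match is None:
            problems.append((number, "not a KEY=VALUE assignment"))
        elif match.group(2) or match.group(3):
            problems.append((number, "whitespace around '='"))
    return problems
```

Feeding it the contents of your `.env` file gives a short list of line numbers to inspect, and an empty list means none of these basic issues were found.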
| 319 | +**Docker not picking up changes?** |
| 320 | +- Restart containers: `make docker-stop && make docker` |
| 321 | +- Check if environment is properly mounted |
| 322 | + |
| 323 | +**Database connection issues?** |
| 324 | +- Verify credentials and network access |
| 325 | +- Test connections independently |
| 326 | +- Check firewall and VPN settings |