goal: Usage Tracking & Observability (OpenTelemetry) #219

@locnguyen1986

🎯 Goal

Usage Tracking & Observability (OpenTelemetry)

📖 Context

We need end-to-end visibility: traces, metrics, and logs that follow a request through API → gateway → vLLM → memory → files.
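To illustrate what "following a request" means in practice, here is a minimal stdlib-only sketch of W3C Trace Context propagation: each hop keeps the same trace id and mints a new span id in the `traceparent` header. The function names `new_traceparent` and `child_traceparent` are hypothetical, and a real service would most likely rely on the OpenTelemetry SDK's propagators rather than hand-rolling this.

```python
import secrets


def new_traceparent(sampled: bool = True) -> str:
    """Create a root W3C traceparent value: version-traceid-spanid-flags."""
    trace_id = secrets.token_hex(16)  # 32 hex chars
    span_id = secrets.token_hex(8)    # 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def child_traceparent(parent: str) -> str:
    """Derive the header a service would send downstream.

    The trace id and flags are preserved; only the span id changes.
    """
    parts = parent.split("-")
    if len(parts) != 4:
        raise ValueError(f"malformed traceparent: {parent!r}")
    version, trace_id, _parent_span_id, flags = parts
    if version != "00" or len(trace_id) != 32:
        raise ValueError(f"unsupported traceparent: {parent!r}")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


if __name__ == "__main__":
    # Hypothetical walk through the hops named in the issue.
    header = new_traceparent()
    print("API     ", header)
    for hop in ("gateway", "vLLM", "memory", "files"):
        header = child_traceparent(header)
        print(f"{hop:<8}", header)
```

Every printed line shares the same trace id, which is what lets a tracing backend stitch the hops into one trace.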

✅ Scope

  • OTel SDK wiring across services (traces/metrics/logs)
  • Resource & span attributes: model id, tokens in/out, latency, user/org hash, error code
  • Collector configs (dev/prod), exemplars, sampling policies
  • Dashboards (Grafana): QPS, p50/p95 latency, tokens/sec, GPU usage, 4xx/5xx rates
  • Alerts: auth failures, 429 spikes, slow p95, runner health
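As a rough sketch of the span attributes listed above, the snippet below builds an attribute dictionary with a keyed hash in place of raw user and org identifiers. Everything named here is an assumption for illustration: the attribute keys (`llm.model_id`, `user.hash`, and so on), the `pseudonymize` and `build_span_attributes` helpers, and the choice of HMAC-SHA256 are not specified by this issue, and the final key names should probably follow whatever OpenTelemetry semantic conventions the team adopts.

```python
import hashlib
import hmac
from typing import Optional


def pseudonymize(identifier: str, secret: bytes) -> str:
    """Return a stable, non-reversible hash so raw ids never reach telemetry."""
    digest = hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def build_span_attributes(
    *,
    model_id: str,
    tokens_in: int,
    tokens_out: int,
    latency_ms: float,
    user_id: str,
    org_id: str,
    secret: bytes,
    error_code: Optional[str] = None,
) -> dict:
    """Assemble the attributes from the Scope list (key names are hypothetical)."""
    attrs = {
        "llm.model_id": model_id,
        "llm.tokens_in": tokens_in,
        "llm.tokens_out": tokens_out,
        "llm.latency_ms": latency_ms,
        "user.hash": pseudonymize(user_id, secret),
        "org.hash": pseudonymize(org_id, secret),
    }
    if error_code is not None:
        # Only set on failure so dashboards can filter on presence.
        attrs["error.code"] = error_code
    return attrs


if __name__ == "__main__":
    attrs = build_span_attributes(
        model_id="example-model",  # placeholder value
        tokens_in=12,
        tokens_out=48,
        latency_ms=230.5,
        user_id="alice@example.com",
        org_id="org-123",
        secret=b"rotate-me",  # placeholder; load from a secret store
    )
    for key, value in attrs.items():
        print(f"{key} = {value}")
```

With the OpenTelemetry Python API, a dictionary like this could be applied to the active span via `span.set_attributes(attrs)`. The keyed hash keeps per-user grouping possible on dashboards without exposing identifiers.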
