Service & Infrastructure Metrics (Prometheus)

Status: ✅ Operational — GET /metrics endpoint live, 9 metrics exported

Metrics are collected from the FastAPI inference service via src/app/metrics.py and a _PrometheusMiddleware applied to all requests.

Available at: GET /metrics (Prometheus exposition format)
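
For readers who have not worked with the exposition format directly, the following is a minimal sketch of reading a scrape body using only the Python standard library. The sample text, the label names (method, path, status), and the numbers are assumptions made up for illustration; the real output of GET /metrics may use different labels. It is a simplified reader, not a full implementation of the format.

```python
import re

# Sample scrape body in Prometheus text exposition format.
# Label names and values here are invented for illustration only.
SAMPLE = """\
# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",path="/metrics",status="200"} 12.0
http_requests_total{method="GET",path="/monitoring/drift",status="200"} 3.0
http_requests_total{method="GET",path="/monitoring/drift",status="500"} 1.0
"""

# metric_name{label="value",...} value  (an optional timestamp is ignored)
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
_LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified reader cannot parse
        labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def total(samples, name, **label_filter):
    """Sum all samples of one metric whose labels match the filter."""
    return sum(
        value
        for sample_name, labels, value in samples
        if sample_name == name
        and all(labels.get(k) == v for k, v in label_filter.items())
    )


if __name__ == "__main__":
    parsed = parse_samples(SAMPLE)
    print(total(parsed, "http_requests_total"))
    print(total(parsed, "http_requests_total", status="500"))
```

In practice a Prometheus server does this parsing during the scrape; a helper like this is mainly useful for smoke tests that check the endpoint still exposes the expected metric names.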


Exported metrics

HTTP layer (✅ live)

| Metric | Type | Description |
| --- | --- | --- |
| http_requests_total | Counter | Total HTTP requests by method, path, and status code |
| http_request_duration_seconds | Histogram | End-to-end HTTP request latency by method and path |
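
The two HTTP metrics above are the kind of data an ASGI middleware can record by wrapping every request. The sketch below shows the general mechanics only: it is not the project's _PrometheusMiddleware, the class name TimingMiddleware is hypothetical, and plain dictionaries stand in for the Counter and Histogram so that the example runs without a Prometheus client library.

```python
import asyncio
import time
from collections import defaultdict

# Plain-dict stand-ins for the Counter and Histogram (illustration only).
REQUESTS_TOTAL = defaultdict(int)      # (method, path, status) -> count
REQUEST_DURATIONS = defaultdict(list)  # (method, path) -> [seconds, ...]


class TimingMiddleware:
    """Hypothetical ASGI middleware that counts and times HTTP requests."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan and websocket traffic through untouched.
            await self.app(scope, receive, send)
            return

        status = {"code": 500}  # assume failure until a response starts

        async def send_wrapper(message):
            # The status code is only visible on the response-start message.
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
            await send(message)

        start = time.perf_counter()
        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            elapsed = time.perf_counter() - start
            method, path = scope["method"], scope["path"]
            REQUESTS_TOTAL[(method, path, str(status["code"]))] += 1
            REQUEST_DURATIONS[(method, path)].append(elapsed)


async def demo_app(scope, receive, send):
    """Tiny ASGI app used only to exercise the middleware."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def run_demo():
    """Send one fake GET /metrics request through the middleware."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/metrics"}
    await TimingMiddleware(demo_app)(scope, receive, send)
    return sent


if __name__ == "__main__":
    asyncio.run(run_demo())
    print(dict(REQUESTS_TOTAL))
```

One design point worth noting: labelling by the raw request path can create a very large number of label combinations when paths contain IDs, so middlewares of this kind often label by route template instead.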

Prediction API (✅ live)

| Metric | Type | Description |
| --- | --- | --- |
| prediction_requests_total | Counter | On-demand prediction tasks dispatched to the Celery ml queue (source="sync") |
| prediction_duration_seconds | Histogram | End-to-end prediction latency including Celery queue roundtrip (sync path) |
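
The distinction between prediction_duration_seconds (measured around the whole queue roundtrip) and the worker-side inference_duration_seconds (measured around the model call only) can be illustrated with a small sketch. It makes several assumptions: a ThreadPoolExecutor stands in for the Celery ml queue, the function names run_inference and predict_sync are hypothetical, and the probabilities are made-up values.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_inference(features):
    """Worker-side stand-in: time only the model call itself."""
    start = time.perf_counter()
    time.sleep(0.01)  # placeholder for the real model call
    probabilities = {"home_win": 0.5, "draw": 0.3, "away_win": 0.2}  # made-up values
    return probabilities, time.perf_counter() - start


def predict_sync(executor, features):
    """API-side stand-in: time dispatch, queue wait, and inference together."""
    start = time.perf_counter()
    future = executor.submit(run_inference, features)  # stands in for the Celery dispatch
    probabilities, inference_seconds = future.result(timeout=5)
    prediction_seconds = time.perf_counter() - start
    return probabilities, inference_seconds, prediction_seconds


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        probs, inference_s, prediction_s = predict_sync(pool, {"example_feature": 1.0})
    print(f"inference={inference_s:.4f}s prediction={prediction_s:.4f}s")
```

Because the API-side timer starts before dispatch and stops after the result arrives, the end-to-end value is never smaller than the inference-only value; the gap between the two approximates queue wait plus transport overhead.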

ML worker / model (✅ live)

| Metric | Type | Description |
| --- | --- | --- |
| inference_duration_seconds | Histogram | Pure ML inference time inside the Celery worker (excluding queue wait) |
| prediction_confidence | Histogram | Model predicted probability per outcome class (outcome="home_win\|draw\|away_win") |
| model_info | Gauge | Metadata of the currently loaded model; value=1 when loaded (model_name, version, stage labels) |
| model_registered_at_seconds | Gauge | Unix timestamp when the currently loaded model version was last loaded by the worker |
| model_feature_drift_score | Gauge | Evidently dataset drift score (share of drifted features); updated by GET /monitoring/drift |
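
To make two of the gauges above more concrete, here is a small sketch of the arithmetic behind them. It only illustrates the definitions: Evidently computes the drift share itself, and the helper names and feature names below are hypothetical.

```python
def drift_share(drifted_by_feature):
    """Share of features flagged as drifted, as a value in [0, 1]."""
    flags = list(drifted_by_feature.values())
    if not flags:
        return 0.0  # no features means nothing can have drifted
    return sum(1 for flag in flags if flag) / len(flags)


def model_age_seconds(registered_at_seconds, now_seconds):
    """Seconds elapsed since the model_registered_at_seconds timestamp."""
    return max(0.0, float(now_seconds - registered_at_seconds))


if __name__ == "__main__":
    # Hypothetical per-feature drift flags, for illustration only.
    flags = {
        "feature_a": True,
        "feature_b": False,
        "feature_c": False,
        "feature_d": False,
    }
    print(drift_share(flags))
    print(model_age_seconds(1_700_000_000, 1_700_000_600))
```

A derived value such as model age is usually computed at query time from the raw timestamp gauge, which is why the gauge stores a Unix timestamp and not an age.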

Celery runtime status is available via REST (not Prometheus-scraped):

  • GET /monitoring/celery/queues: per-queue message count
  • GET /monitoring/celery/workers: active worker ping status
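
Since these two endpoints are not scraped, a script has to poll them directly. The following is a minimal sketch of such a client using only the standard library. The base URL, the helper name fetch_celery_status, and the assumption that the endpoints return JSON are not stated in this document and should be checked against the running service.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Endpoint paths as listed in this document.
CELERY_STATUS_PATHS = {
    "queues": "/monitoring/celery/queues",
    "workers": "/monitoring/celery/workers",
}


def _http_get(url):
    """Fetch the raw response body for a URL."""
    with urlopen(url, timeout=5) as response:
        return response.read()


def fetch_celery_status(base_url, kind, http_get=_http_get):
    """GET one of the Celery status endpoints and decode it as JSON.

    The JSON response format is an assumption; the payload is returned
    as decoded, without interpreting its fields.
    """
    url = urljoin(base_url, CELERY_STATUS_PATHS[kind])
    return json.loads(http_get(url))


if __name__ == "__main__":
    # Hypothetical local address; adjust to wherever the service runs.
    print(fetch_celery_status("http://localhost:8000", "queues"))
```

The http_get parameter exists so the helper can be exercised in tests with a fake fetcher, without a running service.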


Not yet implemented

  • RabbitMQ queue metrics via dedicated exporter
  • Kubernetes CPU / memory / pod restarts
  • PostgreSQL query latency via pg_exporter
  • Log aggregation (stdout only today)

Dashboards

Grafana dashboards for these metrics are planned; see Dashboards. Full coverage matrix: Monitoring Status.