Inference API Contract

This page is the canonical reference for the inference API surface: all implemented endpoints, their request/response schemas, and error semantics.

For concrete examples, see Examples. For the model input/output contract, see ML: Model Contract.


Authentication

All /predict/* endpoints require a valid X-API-Key header (configured via API_KEY env var). /healthcheck/, /metrics, /monitoring/*, and /livescores/ are unauthenticated.

X-API-Key: <your-api-key>
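
As a minimal sketch of attaching the header from a Python client, the snippet below uses only the standard library. The base URL and the `build_request` helper are hypothetical and not part of the service.

```python
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumption: replace with your deployment's URL


def build_request(path: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, adding X-API-Key only when a key is supplied."""
    req = urllib.request.Request(BASE_URL + path, method="GET")
    if api_key is not None:
        req.add_header("X-API-Key", api_key)
    return req


# /predict/* endpoints need the key; /healthcheck/ does not.
authed = build_request("/predict/predictions/", api_key="<your-api-key>")
public = build_request("/healthcheck/")
# To send: urllib.request.urlopen(authed, timeout=10)
```

Note that `urllib` normalises the stored header name to `X-api-key`; HTTP header names are case-insensitive, so the server still sees the expected header.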

Implemented endpoints

GET /predict/predictions/

Bulk read of all precomputed predictions from predictions.parquet (in-memory cache). Returns both future and historical matches. Feature vectors are omitted; only display columns are returned. Intended for the Streamlit UI data explorer and diagnostic tooling.

Response — 200 OK — array of objects with fields:

| Field | Type | Description |
|---|---|---|
| match_id | int | Match identifier |
| homeTeamName | str \| null | Home team name |
| awayTeamName | str \| null | Away team name |
| startTimeUtc | str (ISO 8601) | Match start time |
| proba_home | float | Home win probability |
| proba_draw | float | Draw probability |
| proba_away | float | Away win probability |
| predicted_class | int | Argmax: 0=Home Win, 1=Draw, 2=Away Win |
| predicted_label | str | Human-readable label |
| is_future | bool \| null | True if match has not started yet |
| model_stage | str \| null | MLflow alias used |
| model_run_id | str \| null | MLflow run ID for traceability |
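
A consumer of this array will often keep only upcoming matches and recompute the argmax as a sanity check. The sketch below works on already-decoded JSON; the two rows are made-up sample data and the helper names are hypothetical.

```python
# Class-index mapping taken from the predicted_class row of the field table.
LABELS = {0: "Home Win", 1: "Draw", 2: "Away Win"}


def upcoming(rows):
    """Keep rows whose is_future flag is explicitly True (null is excluded)."""
    return [r for r in rows if r.get("is_future") is True]


def argmax_class(row):
    """Index of the largest of the three probabilities (first wins on ties)."""
    probs = [row["proba_home"], row["proba_draw"], row["proba_away"]]
    return max(range(3), key=probs.__getitem__)


# Made-up sample rows, shaped like the response objects.
rows = [
    {"match_id": 1, "proba_home": 0.58, "proba_draw": 0.27, "proba_away": 0.15,
     "predicted_class": 0, "is_future": True},
    {"match_id": 2, "proba_home": 0.20, "proba_draw": 0.30, "proba_away": 0.50,
     "predicted_class": 2, "is_future": False},
]

future_rows = upcoming(rows)
consistent = all(argmax_class(r) == r["predicted_class"] for r in rows)
```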

GET /predict/precomputed/{match_id}

Returns the precomputed prediction for a single match from predictions.parquet. No Celery task or MLflow model call happens at request time; the response is read from the in-memory cache only.

Response — 200 OK

{
  "match_id": 99,
  "proba_home": 0.58,
  "proba_draw": 0.27,
  "proba_away": 0.15,
  "predicted_class": 0,
  "predicted_label": "home_win",
  "is_future": false,
  "start_time_utc": "2025-05-10T18:00:00+00:00",
  "home_team_name": "Arsenal",
  "away_team_name": "Chelsea",
  "model_run_id": "3f7a1c9d2e4b",
  "model_stage": "champion",
  "predictions_computed_at": "2025-05-10T12:00:00+00:00"
}

Error responses

| Code | Condition |
|---|---|
| 404 Not Found | match_id not in current predictions.parquet |
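
A client might model this payload with a small dataclass and treat 404 as "no prediction available". The sketch below is illustrative only: `PrecomputedPrediction` and `parse_precomputed` are hypothetical names, only a subset of the fields is mapped, and the status code and body are assumed to come from whichever HTTP client you use.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class PrecomputedPrediction:
    match_id: int
    proba_home: float
    proba_draw: float
    proba_away: float
    predicted_label: str
    is_future: Optional[bool]


def parse_precomputed(status: int, body: str) -> Optional[PrecomputedPrediction]:
    """Return the parsed prediction, or None when the match is unknown (404)."""
    if status == 404:
        return None
    if status != 200:
        raise RuntimeError(f"unexpected status {status}")
    data = json.loads(body)
    # Extra keys in the payload (model_stage, etc.) are ignored.
    return PrecomputedPrediction(**{f.name: data[f.name] for f in fields(PrecomputedPrediction)})


sample = (
    '{"match_id": 99, "proba_home": 0.58, "proba_draw": 0.27, "proba_away": 0.15, '
    '"predicted_label": "home_win", "is_future": false, "model_stage": "champion"}'
)
pred = parse_precomputed(200, sample)
missing = parse_precomputed(404, "")
```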

GET /predict/cards/

Returns all precomputed predictions merged with Fonbet 1X2 odds in a single response. Combines predictions.parquet and fonbet_odds.parquet on match_id. Each entry contains probabilities, predicted class, 1X2 odds, outcome (if finished), and Fonbet URL. Served from the in-memory cache, with no MinIO call at request time. Used by the Streamlit UI.

Response — 200 OK — array of merged match card objects.
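
The merge on match_id can be pictured as a keyed join. The sketch below uses plain dicts with made-up rows and assumes a left join (predictions without odds keep null odds); this page does not say how unmatched rows are handled, so the join type is an assumption. The odds field names follow GET /predict/odds/.

```python
ODDS_FIELDS = ("odd_home", "odd_draw", "odd_away")


def merge_cards(predictions, odds):
    """Left-join predictions with odds on match_id (join type is an assumption)."""
    odds_by_id = {o["match_id"]: o for o in odds}
    cards = []
    for p in predictions:
        o = odds_by_id.get(p["match_id"], {})
        # Missing odds become None so every card has the same keys.
        cards.append({**p, **{k: o.get(k) for k in ODDS_FIELDS}})
    return cards


# Made-up sample rows.
predictions = [
    {"match_id": 99, "proba_home": 0.58, "proba_draw": 0.27, "proba_away": 0.15},
    {"match_id": 100, "proba_home": 0.30, "proba_draw": 0.30, "proba_away": 0.40},
]
odds = [{"match_id": 99, "odd_home": 1.8, "odd_draw": 3.6, "odd_away": 4.5}]

cards = merge_cards(predictions, odds)
```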


GET /predict/region-roi/

Returns flat-stake ROI statistics per region from the live-betting simulation. The data is produced by the live-betting DVC pipeline stage and cached in memory; MinIO is re-checked for updates every 60 s. Returns an empty list when roi_by_region.csv has not been produced yet.

Response — 200 OK

[
  {
    "region_name": "England",
    "roi_pct": 12.5,
    "n_bets": 48,
    "hit_rate": 0.54,
    "region_id": 7
  }
]

| Field | Type | Description |
|---|---|---|
| region_name | str \| null | Region label |
| roi_pct | float \| null | Flat-stake ROI % |
| n_bets | int | Number of simulated bets |
| hit_rate | float \| null | Fraction of winning bets |
| region_id | int \| null | Internal region identifier |
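
For orientation, flat-stake ROI is conventionally the net profit divided by the total amount staked when every bet carries the same stake. The sketch below applies that conventional definition to made-up bets with decimal odds; the pipeline's exact formula is not documented on this page, so treat it as an assumption.

```python
def flat_stake_roi(bets, stake=1.0):
    """bets: list of (decimal_odds, won) pairs. Returns (roi_pct, hit_rate).

    Returns (None, None) for an empty list, mirroring the nullable fields.
    """
    if not bets:
        return None, None
    staked = stake * len(bets)
    # A winning bet returns stake * odds; a losing bet returns nothing.
    returned = sum(stake * odds for odds, won in bets if won)
    wins = sum(1 for _, won in bets if won)
    roi_pct = (returned - staked) / staked * 100.0
    return roi_pct, wins / len(bets)


# Made-up example: two wins out of four unit-stake bets.
bets = [(2.0, True), (3.0, False), (2.5, True), (1.5, False)]
roi_pct, hit_rate = flat_stake_roi(bets)  # 4.5 returned on 4.0 staked
```

Here 4.5 units are returned on 4.0 staked, giving a net profit of 0.5 units, an ROI of 12.5 %, and a hit rate of 0.5.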

GET /predict/odds/

Returns Fonbet 1X2 odds (odd_home, odd_draw, odd_away) for all matches. Reads from fonbet_odds.parquet in the data-raw MinIO bucket. Returns an empty list if the file has not been produced yet.


GET /predict/{match_id}

Runs a prediction for a single match synchronously via the Celery ml queue. Features are read from match_features.parquet (in-memory cache). Blocks until the Celery predict_match task completes (up to 30 s). Use ?stage=challenger to target the challenger model.

Response — 200 OK — full prediction result dict (same shape as PredictResponse).

Error responses

| Code | Condition |
|---|---|
| 404 Not Found | match_id not in current feature parquet |
| 400 Bad Request | Requested stage is not in the loaded set |
| 504 Gateway Timeout | Celery worker did not respond within 30 s |
| 500 Internal Server Error | Task failed on worker |
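
One way a client could build the request URL and react to these codes is sketched below. The helper names and action strings are hypothetical, and the 35 s client timeout is an assumption chosen only to sit above the 30 s server-side wait.

```python
from typing import Optional
from urllib.parse import urlencode

CLIENT_TIMEOUT_S = 35  # assumption: slightly above the 30 s server-side limit


def predict_url(base_url: str, match_id: int, stage: Optional[str] = None) -> str:
    """Build the on-demand prediction URL, adding ?stage=... when requested."""
    url = f"{base_url}/predict/{match_id}"
    if stage:
        url += "?" + urlencode({"stage": stage})
    return url


def action_for(status: int) -> str:
    """Map a response status to a (hypothetical) client-side action."""
    return {
        200: "use_result",
        400: "fix_stage",       # requested stage is not loaded
        404: "unknown_match",   # match_id absent from the feature parquet
        500: "report_failure",  # task failed on the worker
        504: "retry_later",     # worker did not answer within 30 s
    }.get(status, "unexpected")


url = predict_url("http://localhost:8000", 99, stage="challenger")
```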

GET /predict/model/info

Retrieves MLflow model metadata synchronously via the Celery ml queue. Use ?stage=challenger to query a specific model stage.

Response — 200 OK

{
  "model_name": "soccer-match-outcome",
  "stage": "champion",
  "version": "7",
  "run_id": "3f7a1c9d2e4b",
  "metrics": {"log_loss": 1.006, "roc_auc_ovr": 0.643},
  "params": {"n_estimators": "300"},
  "feature_names": ["diff_win_5_mean", "diff_goals_for_5_mean"],
  "created_at": "2026-03-15T10:00:00"
}

GET /livescores/

Returns matches from PostgreSQL filtered by year/month, ordered by startTimeUtc descending.

Query parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| year | int | current year | Calendar year |
| month | int | | Month (1–12); omit for full year |
| limit | int | | Max rows (1–10000) |
| offset | int | 0 | Pagination offset |
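
Paging through results amounts to stepping offset by limit. The sketch below builds query strings with the standard library and checks the documented ranges on the client side; `livescores_query` and `page_offsets` are hypothetical helpers, and the page size of 100 is arbitrary.

```python
from urllib.parse import urlencode


def livescores_query(year=None, month=None, limit=None, offset=0):
    """Return a query string for GET /livescores/, omitting unset parameters."""
    if month is not None and not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    if limit is not None and not 1 <= limit <= 10000:
        raise ValueError("limit must be in 1..10000")
    params = {"year": year, "month": month, "limit": limit, "offset": offset}
    return urlencode({k: v for k, v in params.items() if v is not None})


def page_offsets(total_rows, page_size):
    """Offsets needed to cover total_rows in pages of page_size."""
    return list(range(0, total_rows, page_size))


first_page = livescores_query(year=2025, month=5, limit=100)
offsets = page_offsets(250, 100)
```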

GET /monitoring/drift

Returns the latest feature drift summary from reports/drift/latest.json (written by the monitor_drift DVC stage / Airflow DAG). Returns {"drift_score": null} if the report does not exist yet. Also refreshes the drift_score Prometheus gauge.


GET /monitoring/celery/queues

Returns active, scheduled, and reserved task counts across all Celery workers.


GET /monitoring/celery/workers

Returns active queues and ping status for all connected workers.


GET /monitoring/task_status/{task_id}

Returns current status and result for any Celery task by ID.

Response

{
  "task_id": "abc-123-def-456",
  "status": "SUCCESS",
  "result": { ... }
}

result is null while the task is pending.
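
A caller typically polls this endpoint until the task reaches a terminal state. The sketch below injects the HTTP call as a function (`fetch`, hypothetical) so the loop can be shown without a network. The terminal set uses Celery's standard SUCCESS, FAILURE, and REVOKED state names; the interval and attempt limit are arbitrary.

```python
import time

TERMINAL_STATES = {"SUCCESS", "FAILURE", "REVOKED"}  # Celery's ready states


def wait_for_task(fetch, task_id, interval_s=1.0, max_attempts=30, sleep=time.sleep):
    """Poll fetch(task_id) until the status is terminal or attempts run out."""
    for _ in range(max_attempts):
        payload = fetch(task_id)
        if payload["status"] in TERMINAL_STATES:
            return payload
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} polls")


# Stand-in for the real HTTP call: two pending answers, then success.
_responses = iter([
    {"task_id": "abc-123-def-456", "status": "PENDING", "result": None},
    {"task_id": "abc-123-def-456", "status": "PENDING", "result": None},
    {"task_id": "abc-123-def-456", "status": "SUCCESS", "result": {"ok": True}},
])


def fake_fetch(task_id):
    return next(_responses)


final = wait_for_task(fake_fetch, "abc-123-def-456", sleep=lambda s: None)
```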


GET /healthcheck/

Liveness probe used by Kubernetes liveness checks. It also checks database connectivity.

Response — 200 OK

{
  "status": "healthy",
  "version": "...",
  "worker_pid": 42,
  "memory_usage_mb": 210.4,
  "database": true
}

GET /metrics

Prometheus-compatible metrics endpoint. Scraped by the in-cluster Prometheus instance.

Returns plain-text exposition format with 9 counters, histograms, and gauges:

  • http_requests_total{method, path, status_code}
  • http_request_duration_seconds (histogram)
  • prediction_requests_total{source="sync|async"}
  • prediction_duration_seconds (histogram)
  • inference_duration_seconds (histogram)
  • prediction_confidence{outcome} (histogram)
  • model_info{model_name, version, stage} (gauge)
  • model_registered_at_seconds{model_name} (gauge)
  • model_feature_drift_score (gauge)
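
For ad-hoc inspection outside Prometheus, the exposition text can be read with a few lines of Python. The sketch below is a deliberately simplified parser: it skips comment lines and assumes each sample line ends with the value (no trailing timestamp). The sample text is made up, not real output from this service.

```python
def parse_samples(text):
    """Map each series (name plus labels) to its float value.

    Simplification: assumes no optional timestamp follows the value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


# Made-up sample in the text exposition format.
sample_text = """\
# HELP model_feature_drift_score Latest feature drift score
# TYPE model_feature_drift_score gauge
model_feature_drift_score 0.12
http_requests_total{method="GET",path="/healthcheck/",status_code="200"} 42
"""

samples = parse_samples(sample_text)
drift = samples["model_feature_drift_score"]
```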

Planned endpoints

| Endpoint | Status | Notes |
|---|---|---|
| POST /predict/batch | 📋 Planned | HTTP batch endpoint; batch parquet exists but no HTTP API yet |

Validation semantics

  • All requests are validated against Pydantic schemas (src/app/schemas/predict.py) before any inference logic runs.
  • Path and query parameters are validated by FastAPI/Pydantic; invalid types return 422 Unprocessable Entity.
  • Input validation failures are client errors and are not retried.

Schema boundary

Features for on-demand inference (GET /predict/{match_id}) are read server-side from match_features.parquet. The serving layer does not accept caller-supplied feature dicts. See ML: Model Contract for the full input/output contract.