Pitch

v1.0 · Completed May 2026 — All Definition-of-Done criteria met. This documentation and the live deployment accurately reflect the implemented system. See Implementation Status for the canonical readiness matrix.

What

End-to-end football match outcome prediction system:

scraping → DVC pipelines → MLflow registry → FastAPI/Celery serving on K8s, with Evidently drift monitoring.

  • Raw match data scraped from WhoScored.com via Airflow ETL, stored in PostgreSQL + MinIO.
  • Features engineered deterministically through a DVC pipeline with explicit contracts.
  • Models trained, tracked in MLflow, promoted via registry (smoke → candidate → champion aliases).
  • Predictions served by FastAPI + Celery on Kubernetes with Helm charts.
  • Nine Prometheus metrics exported; Evidently drift detection operational (daily DAG); two Grafana dashboards deployed.

Why

A portfolio project to demonstrate applied ML engineering at system level — not a notebook experiment, but a production-style ML platform with reproducible pipelines, explicit contracts, and operational clarity.

It directly addresses the common gap between "I trained a model" and "I ran a system in production."


Key results

Metric | Value
--- | ---
Model log-loss (hold-out 2024+) | 1.006 (bookmaker benchmark ~0.97)
Brier score | 0.601
Calibration ECE | 0.004
ROC AUC (OVR) | 0.643 (random baseline 0.500)
Holdout set | 135 970 matches (2024+)
Serving p95 latency | 442 ms (GET /predict/{id}, 3 concurrent, local dev)
Serving p99 latency | 460 ms
Throughput at tested RPS | ~18 RPS (local dev); ~5 RPS production ceiling (single-node K8s)
Drift detection lag | < 24 h (daily Airflow DAG at 06:00 UTC)
Tests collected | 560 (unit, property, service, contract, load)
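
To put the log-loss and Brier figures in context, the sketch below computes both metrics for a forecaster that always predicts one third for each outcome. It assumes a three-outcome target (home/draw/away) and the multiclass sum-of-squares Brier definition; the source does not state which definition the reports use, so treat both as assumptions. The function names are illustrative, not taken from the project.

```python
import math
from typing import Sequence


def multiclass_log_loss(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean negative log-probability assigned to the true class."""
    eps = 1e-15  # clip to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total -= math.log(min(max(p[y], eps), 1.0))
    return total / len(labels)


def multiclass_brier(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean squared error against the one-hot outcome, summed over classes."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)


# A uniform forecaster over three outcomes, scored on one match of each class.
uniform = [[1 / 3, 1 / 3, 1 / 3]] * 3
labels = [0, 1, 2]

print(round(multiclass_log_loss(uniform, labels), 4))  # 1.0986 (= ln 3)
print(round(multiclass_brier(uniform, labels), 4))     # 0.6667
```

Under these assumptions, a log-loss of 1.006 and a Brier score of 0.601 both sit below the uniform-forecast values of about 1.099 and 0.667, while remaining above the quoted bookmaker benchmark of ~0.97.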

Resource | Link
--- | ---
Live Demo | soccer.dmitryivanov.dev (Streamlit UI)
MLflow UI | mlflow.dmitryivanov.dev
Grafana | grafana.dmitryivanov.dev
GitLab repo | gitlab.com/dmitry-ivanov-ds/soccer
Full docs | docs.soccer.dmitryivanov.dev
Architecture Tour | architecture-tour.md
Results | results.md

Evidence

All claims above are backed by Quarto reports generated from pipeline artifacts. The reports are the authoritative record of implemented and verified system behaviour:

Report | What it verifies
--- | ---
EDA & Preprocessing | Data volume, temporal span, class balance
Feature Engineering | Feature matrix completeness, ELO/rolling stats
Experiment Studies | 7 studies, 218 runs; model selection rationale
Model Analysis | Calibration curves, SHAP feature importance
Holdout Analysis | Temporal holdout metrics, slice diagnostics, ROI simulation
Live Inference & Odds | Batch predictions vs Fonbet live odds
Live Betting Strategy | Edge-threshold strategy, region-level ROI breakdown
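
For readers unfamiliar with the term, an edge-threshold strategy can be sketched as follows. This is a generic illustration rather than the logic in the Live Betting Strategy report: the `Quote` type, the `min_edge` default of 0.05, and the sample probabilities and odds are all made up for the example. The edge is taken here as expected profit per unit stake, `model_prob * decimal_odds - 1`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Quote:
    """One outcome of one match: model probability plus bookmaker decimal odds."""
    outcome: str
    model_prob: float
    decimal_odds: float


def edge(q: Quote) -> float:
    """Expected profit per unit stake if the model probability were exact."""
    return q.model_prob * q.decimal_odds - 1.0


def select_bets(quotes: List[Quote], min_edge: float = 0.05) -> List[Quote]:
    """Keep only outcomes whose edge meets the (hypothetical) threshold."""
    return [q for q in quotes if edge(q) >= min_edge]


# Illustrative numbers only, not real predictions or odds.
quotes = [
    Quote("home", 0.50, 2.20),  # edge about +0.10
    Quote("draw", 0.25, 3.60),  # edge about -0.10
    Quote("away", 0.25, 3.90),  # edge about -0.025
]

for q in select_bets(quotes):
    print(q.outcome, round(edge(q), 3))  # home 0.1
```

Raising `min_edge` trades bet volume for a larger required margin over the bookmaker's price, which is why ROI is usually reported per threshold.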