Portfolio Overview

SoccerPredictAI is a production-style end-to-end MLOps system for football match outcome prediction.

This section is designed for hiring managers, technical interviewers, and ML/MLOps engineers who want to evaluate the project quickly without navigating the full documentation.

What this system demonstrates

Capability	Where to look
One-page summary	Pitch
ML methodology & results	Results
System architecture & design decisions	Architecture Tour
Trade-offs & out-of-scope choices	Trade-offs
Key ADR digest	Decisions
AI-augmented dev workflow	AI Workflow
ML System Design course coverage map	ML System Design

Role	Recommended path	Time
Recruiter / Hiring Manager	Pitch only	2 min
ML Engineer	Pitch → Results → Architecture Tour	10 min
MLOps / Platform Engineer	Architecture Tour → Decisions → Trade-offs	15 min
Engineering Manager	Pitch → Trade-offs	5 min
Technical Interviewer	See deep-dive path below	15–20 min
Engineer reviewing code	Quickstart → Code Structure	20 min
AI/Agentic workflow	AI Workflow → Customization Layer	10 min

2-minute path

"What is this project and is it real?"

v1.0 completed May 2026. All criteria in Requirements — Definition of Done are met. Status is the canonical source.

Pitch — what the system is and what it demonstrates.
Implementation Status — what is built vs planned.
Key facts:
Data source: WhoScored.com (Airflow scraper → PostgreSQL)
ML: XGBoost classifier, temporal-split validation, MLflow tracking
Serving: FastAPI + Celery async, 564 automated tests
Infra: Docker, Kubernetes/Helm, GitLab CI, SOPS secrets
Monitoring: Prometheus /metrics; Evidently drift detection; ML quality monitor; 2 Grafana dashboards
AI workflow: GitHub Copilot customization layer with scoped instructions, skills, hooks, and audit cycles
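
The temporal-split validation listed under "ML" can be illustrated with a minimal sketch. This is not the project's code: the `Match` record, its field names, and the `temporal_split` helper are hypothetical, chosen only to show the idea that every training match must be played before every test match.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Match:
    # Hypothetical record; the real schema is not shown in this overview.
    match_id: int
    played_on: date
    outcome: str  # "home", "draw", or "away"


def temporal_split(matches, cutoff):
    """Split matches so every training match precedes every test match."""
    ordered = sorted(matches, key=lambda m: m.played_on)
    train = [m for m in ordered if m.played_on < cutoff]
    test = [m for m in ordered if m.played_on >= cutoff]
    return train, test


# Illustrative data only, deliberately out of chronological order.
matches = [
    Match(1, date(2024, 8, 17), "home"),
    Match(2, date(2025, 1, 4), "draw"),
    Match(3, date(2024, 11, 9), "away"),
    Match(4, date(2025, 3, 15), "home"),
]
train, test = temporal_split(matches, cutoff=date(2025, 1, 1))
```

A random shuffle would let the model train on matches played after the ones it is evaluated on, which is why a date cutoff is used instead.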

Technical deep-dive (15–20 minutes)

"Can this person design systems, not just use frameworks?"

Architecture Trade-offs — documented decisions with alternatives considered
ML Problem & Baseline — task formulation, why beating the bookmaker matters
Feature Engineering — leakage-safe design, offline/online parity
Model Contract & Signature — input/output schema enforced at boundary
CI/CD Quality Gates — what runs before code ships
ADR Decisions — orchestration, data versioning, serving modes
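
As a rough illustration of what "leakage-safe" feature design means in the list above, the sketch below computes a rolling form feature from strictly earlier matches only. The function name, the tuple layout, and the window size are assumptions made for this example, not the project's actual feature code.

```python
from collections import defaultdict, deque


def rolling_goals_before(matches, window=3):
    """Average goals a team scored in its previous `window` matches.

    `matches` is an iterable of (match_id, team, goals) tuples in
    chronological order. The layout is hypothetical.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    features = {}
    for match_id, team, goals in matches:
        past = history[team]
        # Compute the feature BEFORE recording the current result,
        # so a match never contributes to its own feature value.
        features[match_id] = sum(past) / len(past) if past else None
        past.append(goals)
    return features


# Illustrative data only.
matches = [(1, "A", 2), (2, "A", 0), (3, "A", 4), (4, "A", 1)]
features = rolling_goals_before(matches, window=2)
```

One way to approach offline/online parity is to call a single function like this from both the training pipeline and the serving path, so the two cannot drift apart.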

What this project proves (by competency)

Competency	Evidence
Reproducibility	`dvc repro` from clean checkout → same model. Verified by DVC lock + MLflow run IDs.
Validation rigor	Temporal split enforced in code, tested with `hypothesis` property tests.
Serving design	FastAPI with Pydantic schemas, sync + async via Celery, health endpoints.
Deployment readiness	Docker multi-stage, K8s manifests, Helm charts, GitLab CI pipeline.
Observability thinking	Prometheus `/metrics` (10 metrics), Celery queue stats, Evidently drift detection (daily DAG), Grafana dashboards deployed, alerting runbooks documented.
AI-augmented engineering	GitHub Copilot customization layer with scoped instructions, skills (`audit-system`, `error-analysis`, `train-serve-skew-check`), hooks, and audit cycles. See AI Workflow.
Operational maturity	564 tests (unit / property / service / contract / load), SOPS + age secrets.
System thinking	C4 diagrams, ADRs, explicit layer contracts, no cross-layer shortcuts.
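
The "Serving design" row mentions schemas enforced at the service boundary. The project uses Pydantic for this; the sketch below swaps in plain standard-library dataclasses so it runs without dependencies. The `Prediction` class, the `validate_output` helper, and the three outcome keys are assumptions for illustration, not the project's real contract.

```python
from dataclasses import dataclass
import math

# Assumed outcome labels; the real contract is not shown in this overview.
OUTCOMES = ("home", "draw", "away")


@dataclass(frozen=True)
class Prediction:
    home: float
    draw: float
    away: float

    def __post_init__(self):
        probs = (self.home, self.draw, self.away)
        # Reject out-of-range values (this also rejects NaN).
        if any(not 0.0 <= p <= 1.0 for p in probs):
            raise ValueError("each probability must lie in [0, 1]")
        if not math.isclose(sum(probs), 1.0, abs_tol=1e-6):
            raise ValueError("probabilities must sum to 1")


def validate_output(raw):
    """Reject model output that does not match the contract."""
    if set(raw) != set(OUTCOMES):
        raise ValueError(f"expected keys {OUTCOMES}, got {sorted(raw)}")
    return Prediction(**raw)


prediction = validate_output({"home": 0.5, "draw": 0.3, "away": 0.2})
```

Validating at the boundary means a malformed model output fails loudly in one place instead of reaching API clients.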

Full documentation

Full system documentation will be hosted at docs.soccer.dmitryivanov.dev (live deployment planned after v1.0).