Portfolio Overview
SoccerPredictAI is a production-style end-to-end MLOps system for football match outcome prediction.
This section is designed for hiring managers, technical interviewers, and ML/MLOps engineers who want to evaluate the project quickly without navigating the full documentation.
What this system demonstrates
| Capability | Where to look |
|---|---|
| One-page summary | Pitch |
| ML methodology & results | Results |
| System architecture & design decisions | Architecture Tour |
| Trade-offs & out-of-scope choices | Trade-offs |
| Key ADR digest | Decisions |
| AI-augmented dev workflow | AI Workflow |
| ML System Design course coverage map | ML System Design |
Navigation by role
| Role | Recommended path | Time |
|---|---|---|
| Recruiter / Hiring Manager | Pitch only | 2 min |
| ML Engineer | Pitch → Results → Architecture Tour | 10 min |
| MLOps / Platform Engineer | Architecture Tour → Decisions → Trade-offs | 15 min |
| Engineering Manager | Pitch → Trade-offs | 5 min |
| Technical Interviewer | See deep-dive path below | 15–20 min |
| Engineer reviewing code | Quickstart → Code Structure | 20 min |
| AI/Agentic workflow | AI Workflow → Customization Layer | 10 min |
2-minute path
"What is this project and is it real?"
v1.0 completed May 2026. All criteria in Requirements — Definition of Done are met. Status is the canonical source.
- Pitch — what the system is and what it demonstrates.
- Implementation Status — what is built vs planned.
- Key facts:
  - Data source: WhoScored.com (Airflow scraper → PostgreSQL)
  - ML: XGBoost classifier, temporal-split validation, MLflow tracking
  - Serving: FastAPI + Celery async, 564 automated tests
  - Infra: Docker, Kubernetes/Helm, GitLab CI, SOPS secrets
  - Monitoring: Prometheus /metrics; Evidently drift detection; ML quality monitor; 2 Grafana dashboards
  - AI workflow: GitHub Copilot customization layer with scoped instructions, skills, hooks, and audit cycles
Technical deep-dive (15–20 minutes)
"Can this person design systems, not just use frameworks?"
- Architecture Trade-offs — documented decisions with alternatives considered
- ML Problem & Baseline — task formulation, why beating the bookmaker matters
- Feature Engineering — leakage-safe design, offline/online parity
- Model Contract & Signature — input/output schema enforced at boundary
- CI/CD Quality Gates — what runs before code ships
- ADR Decisions — orchestration, data versioning, serving modes
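The leakage-safe design named under Feature Engineering can be illustrated with a minimal sketch. This is not the project's implementation: the `Match` record, its field names, and `rolling_goals_for` are hypothetical, and the feature (mean goals scored over a team's previous matches) is chosen only for illustration. The point is the ordering: each match's features are read before its own result is written into the history, so no row sees its own outcome or any later match.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Match:
    """Hypothetical match record; field names are illustrative only."""
    played_on: date
    home: str
    away: str
    home_goals: int
    away_goals: int


def rolling_goals_for(
    matches: list[Match], window: int = 3
) -> list[tuple[Optional[float], Optional[float]]]:
    """Return (home_form, away_form) per match, in chronological order.

    Form is the mean goals scored by a team over its previous `window`
    matches, or None when the team has no earlier match on record.
    """
    history: dict[str, deque[int]] = defaultdict(lambda: deque(maxlen=window))

    def mean_past(team: str) -> Optional[float]:
        past = history[team]
        return sum(past) / len(past) if past else None

    features = []
    for m in sorted(matches, key=lambda m: m.played_on):
        # Read features first: only strictly earlier matches are visible.
        features.append((mean_past(m.home), mean_past(m.away)))
        # Record this match's result only after its features are fixed.
        history[m.home].append(m.home_goals)
        history[m.away].append(m.away_goals)
    return features
```

Because the same function can be fed historical rows offline or the most recent rows at request time, a design along these lines is one way to keep offline and online features in step. The sketch assumes a team plays at most one match per date.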
What this project proves (by competency)
| Competency | Evidence |
|---|---|
| Reproducibility | Running dvc repro from a clean checkout reproduces the same model. Verified by the DVC lock file and MLflow run IDs. |
| Validation rigor | Temporal split enforced in code and tested with Hypothesis property tests. |
| Serving design | FastAPI with Pydantic schemas, sync + async via Celery, health endpoints. |
| Deployment readiness | Docker multi-stage, K8s manifests, Helm charts, GitLab CI pipeline. |
| Observability thinking | Prometheus /metrics (9 metrics), Celery queue stats, Evidently drift detection (daily DAG), Grafana dashboards deployed, alerting runbooks documented. |
| AI-augmented engineering | GitHub Copilot customization layer with scoped instructions, skills (audit-system, error-analysis, train-serve-skew-check), hooks, and audit cycles. See AI Workflow. |
| Operational maturity | 564 tests (unit / property / service / contract / load), SOPS + age secrets. |
| System thinking | C4 diagrams, ADRs, explicit layer contracts, no cross-layer shortcuts. |
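As a rough illustration of the validation-rigor row, the sketch below shows a temporal split and the invariant a property test would assert about it: every training row is dated strictly before every test row. The function names and the cutoff-based API are assumptions made for this example, not the project's code, and the check uses only the standard library rather than Hypothesis.

```python
from datetime import date
from typing import Sequence


def temporal_split(
    dates: Sequence[date], cutoff: date
) -> tuple[list[int], list[int]]:
    """Split row indices by date: before `cutoff` is train, the rest is test."""
    train_idx = [i for i, d in enumerate(dates) if d < cutoff]
    test_idx = [i for i, d in enumerate(dates) if d >= cutoff]
    return train_idx, test_idx


def is_leak_free(
    dates: Sequence[date], train_idx: Sequence[int], test_idx: Sequence[int]
) -> bool:
    """True when the latest training date precedes the earliest test date."""
    if not train_idx or not test_idx:
        return True  # an empty side cannot leak
    latest_train = max(dates[i] for i in train_idx)
    earliest_test = min(dates[i] for i in test_idx)
    return latest_train < earliest_test
```

A property-based test could generate arbitrary date lists and cutoffs and assert that `is_leak_free` holds for whatever `temporal_split` returns. A random shuffle split, by contrast, would generally fail the same check.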
Full documentation
Full system documentation is at docs.soccer.dmitryivanov.dev (planned: live deploy post-v1.0).