CI/CD Overview¶
The CI/CD layer automates the build and validation stages of the SoccerPredictAI MLOps system, and supports structured deployment via Helm.
Its primary goals are:
- enforce quality gates before changes reach production,
- provide deterministic and reproducible builds,
- support structured deployment with controlled promotion,
- reduce manual operational risk.
CI/CD is treated as a core part of the ML system, not as an auxiliary tool.
Current state¶
| Capability | Status |
|---|---|
| GitLab CI pipeline | Implemented |
| Docker image build and push | Implemented |
| Helm-based deployment | Implemented (semi-automated) |
| Production deployment | Manual approval required |
| Rollback (service) | Manual (helm rollback) |
| Rollback (model) | Manual (MLflow alias) |
| Rollback (data) | Manual (dvc checkout) |
Production deployments require manual approval. Rollbacks across all layers are performed manually.
Pipeline Architecture¶
Stages¶
| Stage | Purpose |
|---|---|
| base | Prepare base images and shared artifacts |
| linting | Code style and static analysis |
| build | Build Docker images for services |
| deploy-images | Push images to the container registry |
| deploy | Deploy services via Helm to Kubernetes |
| release | Tag and promote releases |
| pages | Build and publish documentation |
Pipeline philosophy¶
- fail fast on quality issues,
- separate build from deploy,
- promote artifacts, not source code,
- keep production deploys explicit and auditable.
Triggering rules:
- merge requests trigger validation (lint, test, build),
- pushes to main trigger build + image push,
- staging deploy requires a manual trigger after CI passes,
- production deploy requires manual approval and passing quality gates.
Container Image Strategy¶
Each service is packaged as a separate immutable Docker image (API service, Celery worker, Airflow components). No secrets are baked into images; dependencies come from pinned requirements-*.txt files exported from pdm.lock.
Image tagging scheme:
| Context | Tag format | Example |
|---|---|---|
| Branch build (CI) | <branch>-<short-sha> | main-a1b2c3d |
| Release | v<major>.<minor>.<patch> | v1.2.0 |
| Latest stable | latest (staging/prod only) | latest |
The same image artifact is promoted across environments — no rebuilds between staging and production.
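The tagging scheme can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual CI code: the helper names `branch_tag` and `is_release_tag` are hypothetical, and the sanitising of branch names (replacing characters that Docker does not accept in tags) is an assumption about how branches such as `feature/x` would be handled.

```python
import re

# Hypothetical helpers; only the tag formats come from the tagging table.
_RELEASE_RE = re.compile(r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")
_INVALID_TAG_CHARS = re.compile(r"[^A-Za-z0-9_.-]")
_MAX_TAG_LEN = 128  # Docker tags are limited to 128 characters


def branch_tag(branch: str, commit_sha: str, sha_len: int = 7) -> str:
    """Build a <branch>-<short-sha> tag for a CI branch build."""
    if len(commit_sha) < sha_len:
        raise ValueError("commit SHA is shorter than the requested short SHA")
    # Assumption: characters Docker rejects (e.g. '/') are replaced with '-'.
    safe_branch = _INVALID_TAG_CHARS.sub("-", branch).lstrip(".-")
    if not safe_branch:
        raise ValueError("branch name has no characters usable in a tag")
    # Truncate the branch part, never the SHA, so the tag stays traceable.
    safe_branch = safe_branch[: _MAX_TAG_LEN - sha_len - 1]
    return f"{safe_branch}-{commit_sha[:sha_len]}"


def is_release_tag(tag: str) -> bool:
    """Return True if the tag matches v<major>.<minor>.<patch>."""
    return _RELEASE_RE.fullmatch(tag) is not None
```

For example, `branch_tag("main", "a1b2c3d4e5f6")` yields `main-a1b2c3d`, matching the example in the table.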
Deployment (Helm)¶
Deployments use Helm charts for reproducible configuration, environment-specific overrides, and safe rollbacks.
Deployment flow:
1. CI decrypts secrets (SOPS)
2. Helm renders manifests with environment values
3. Kubernetes applies the release
4. Readiness probes gate traffic
Failed deployments do not receive traffic. Rollbacks do not require rebuilding images.
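As a rough sketch of the deployment flow above, the following Python builds the commands a deploy job might run, without executing them. Several details are assumptions rather than facts about this project: the `values-<environment>.yaml` file naming, the `image.tag` chart value, the namespace defaulting to the environment name, and the function names themselves. The Helm flags used (`--install`, `--namespace`, `--values`, `--set`, `--wait`, `--timeout`) are standard Helm 3 options.

```python
from typing import List, Optional


def sops_decrypt_command(encrypted_file: str) -> List[str]:
    """Step 1: decrypt a SOPS-encrypted file (output goes to stdout)."""
    return ["sops", "--decrypt", encrypted_file]


def helm_upgrade_command(
    release: str,
    chart: str,
    environment: str,
    image_tag: str,
    secrets_file: str,
    namespace: Optional[str] = None,
    timeout: str = "5m",
) -> List[str]:
    """Steps 2-4: render with environment values, apply, wait for readiness."""
    return [
        "helm", "upgrade", "--install", release, chart,
        # Assumption: one namespace per environment unless overridden.
        "--namespace", namespace or environment,
        # Assumption: environment overrides live in values-<environment>.yaml.
        "--values", f"values-{environment}.yaml",
        # Decrypted secrets produced by the SOPS step.
        "--values", secrets_file,
        # Assumption: the chart exposes the image tag as `image.tag`.
        "--set", f"image.tag={image_tag}",
        # Block until resources report ready, so a failed rollout is visible.
        "--wait",
        "--timeout", timeout,
    ]
```

The sketch deliberately leaves out automatic rollback flags, since rollbacks in this system are manual decisions.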
Release & Rollback Policy¶
Release cadence:
| Environment | Trigger | Approvals Required |
|---|---|---|
| dev | Every push to main | None |
| staging | Manual trigger after CI passes | 1 reviewer |
| production | Manual trigger + quality gates pass | 2 reviewers |
Quality gates before release:
- all tests green,
- ruff pass,
- dvc repro --dry succeeds,
- model metrics meet champion baseline,
- no HIGH severity container scan findings.
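The release policy above can be expressed as a small check that lists what still blocks a release. This is an illustrative sketch: the gate keys, the function name `release_blockers`, and the choice to apply the full quality-gate list only to production (with staging requiring just a passing CI run) are assumptions drawn from the table, not a description of the real pipeline.

```python
from typing import Dict, List, Optional

# Approval counts taken from the release cadence table.
REQUIRED_APPROVALS: Dict[str, int] = {"dev": 0, "staging": 1, "production": 2}

# Hypothetical keys, one per quality gate listed above.
QUALITY_GATES = (
    "tests_green",
    "ruff_pass",
    "dvc_repro_dry",
    "model_meets_champion_baseline",
    "no_high_severity_scan_findings",
)


def release_blockers(
    environment: str,
    approvals: int,
    ci_passed: bool,
    gate_results: Optional[Dict[str, bool]] = None,
) -> List[str]:
    """Return human-readable reasons a release cannot proceed (empty if none)."""
    if environment not in REQUIRED_APPROVALS:
        raise ValueError(f"unknown environment: {environment}")

    blockers: List[str] = []
    needed = REQUIRED_APPROVALS[environment]
    if approvals < needed:
        blockers.append(f"needs {needed} approval(s), has {approvals}")

    # Assumption: staging and production both require a passing CI run.
    if environment != "dev" and not ci_passed:
        blockers.append("CI has not passed")

    # Assumption: the full quality-gate list applies to production only.
    if environment == "production":
        results = gate_results or {}
        for gate in QUALITY_GATES:
            # A missing result counts as a failure, never as a pass.
            if not results.get(gate, False):
                blockers.append(f"quality gate failed or missing: {gate}")
    return blockers
```

Treating a missing gate result as a failure keeps the check conservative: a gate that never ran cannot silently allow a production release.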
Rollback process (manual across three independent layers):
# Service rollback (Helm)
helm rollback soccer-api
# Model rollback (MLflow — reassign champion alias to prior version)
# Via MLflow UI or mlflow CLI
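# Data rollback (DVC): restore the DVC metadata from the target commit
# (dvc.lock and any *.dvc files), then sync the workspace to it
git checkout <commit> -- dvc.lock
dvc checkout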
Rollbacks are never automated. All rollback decisions require human review. Re-run CI after any rollback to confirm system state.
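For the model layer, reassigning the alias can also be done from a short operator-run script instead of the UI. The sketch below assumes an MLflow version with registry aliases (2.3 or later) and uses the `MlflowClient` methods `get_model_version_by_alias` and `set_registered_model_alias`. The function name `rollback_champion` and the default alias name `champion` are illustrative. The client is passed in so that the function stays easy to test and a human stays in control of when it runs.

```python
from typing import Any


def rollback_champion(
    client: Any,
    model_name: str,
    target_version: str,
    alias: str = "champion",
) -> str:
    """Point `alias` at `target_version`; return the version it pointed to before.

    `client` is expected to behave like mlflow.tracking.MlflowClient.
    """
    previous = client.get_model_version_by_alias(model_name, alias)
    previous_version = str(previous.version)
    if previous_version == str(target_version):
        raise ValueError(
            f"alias '{alias}' already points to version {target_version}"
        )
    # Moving the alias does not delete or modify any model version.
    client.set_registered_model_alias(model_name, alias, str(target_version))
    return previous_version
```

An operator would call it with a real client, for example `rollback_champion(MlflowClient(), "<registered-model-name>", "3")` after `from mlflow.tracking import MlflowClient`, and record the returned previous version in the rollback notes.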
Quality Gates¶
ML systems fail not only due to bugs, but due to data issues, silent regressions, and configuration drift. Quality gates prevent unsafe changes from reaching production.
Implemented gates¶
| Gate | Category | Blocks deploy? |
|---|---|---|
| Linting and formatting (ruff) | Code quality | ✅ Yes |
| Unit + property-based tests (pytest + Hypothesis) | Testing | ✅ Yes |
| Critical Great Expectations checks | Data contracts | ✅ Yes |
| Pipeline smoke run (reduced dataset) | ML sanity | ✅ Yes |
| API contract test (happy path + invalid schema) | Serving | ✅ Yes |
Non-blocking (signal-only)¶
- Drift warnings (Evidently)
- Non-critical GE checks (distribution or advisory expectations)
- Performance regression checks (initially informational)
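The split between blocking gates and signal-only checks can be sketched as a small aggregation step. The names `GateResult` and `evaluate_gates` are hypothetical, and how results are collected from each tool is left out. The sketch only shows that a failing blocking gate stops the deploy while a failing signal-only check is reported as a warning.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class GateResult:
    name: str       # e.g. "ruff", "pytest", "drift (Evidently)"
    passed: bool
    blocking: bool  # False for signal-only checks


def evaluate_gates(
    results: Iterable[GateResult],
) -> Tuple[bool, List[str], List[str]]:
    """Return (deploy_allowed, blocking_failures, warnings)."""
    collected = list(results)  # allow generators; we iterate twice
    blocking_failures = [r.name for r in collected if not r.passed and r.blocking]
    warnings = [r.name for r in collected if not r.passed and not r.blocking]
    # Only blocking failures stop the deploy; warnings are surfaced, not enforced.
    return (not blocking_failures, blocking_failures, warnings)
```

Promoting a signal-only check to a blocking gate (for example, performance regression checks once they are no longer informational) then only requires changing its `blocking` flag.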
Artifact traceability¶
Every production deployment can be traced to:
- git commit,
- Docker image digest,
- dataset version,
- model version.
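One way to keep these four identifiers together is a small immutable record written at deploy time. The class name `DeploymentRecord`, its field names, and the JSON serialisation are assumptions for illustration. Where such a record would be stored (release notes, Helm annotations, a registry) is not specified here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DeploymentRecord:
    """Hypothetical record tying a deployment to its four source artifacts."""

    git_commit: str
    image_digest: str     # e.g. "sha256:<hex>" as reported by the registry
    dataset_version: str  # e.g. the Git revision or tag tracking the DVC data
    model_version: str    # e.g. the MLflow registered model version

    def __post_init__(self) -> None:
        # Refuse partial records: traceability needs all four identifiers.
        missing = [name for name, value in asdict(self).items() if not value]
        if missing:
            raise ValueError(f"missing traceability fields: {', '.join(missing)}")

    def to_json(self) -> str:
        """Serialise with sorted keys so records diff cleanly."""
        return json.dumps(asdict(self), sort_keys=True)
```

Because the record is frozen and rejects empty fields, a deployment either carries a complete trace or fails loudly before it is recorded.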