Development Workflow & AI Tooling
This section documents how the project is developed and maintained day-to-day. It covers two things: the engineering guardrails that keep the codebase correct and reproducible, and the AI tooling wired to operate within those guardrails.
Status: Operational. The customization layer, audit cycles, and iteration plans below are all active. See Implementation Status for system-level readiness.
Engineering discipline first
The codebase is governed by explicit rules that apply to every change — human or AI-assisted:
- No bypassing Hydra, DVC, or MLflow.
- No coupling training logic to serving logic.
- No cross-layer shortcuts (data / features / models / pipelines / app are independent layers).
- No silent change of public behavior.
- No claiming planned design as implemented.
- No new dependencies without justification.
These rules are enforced by code review, CI quality gates, and the test suite. The AI tooling is configured to operate within these same rules — it does not lower the bar.
How a request flows
flowchart TD
U[Developer request] --> A{Activation tier}
A -->|Always-on| R1[copilot-instructions.md\nproject-wide rules]
A -->|Auto-attached| R2["*.instructions.md\nscoped by file path"]
A -->|On-demand| R3[/prompts, skills, agents/]
A -->|Deterministic| R4[hooks\ntool lifecycle]
R1 --> M[Model proposes change]
R2 --> M
R3 --> M
R4 --> M
M --> H[Human review\n+ tests + DVC + MLflow]
H --> C[Commit]
The four activation tiers are the agent equivalents of IDE autocomplete, a command palette, and CI hooks. Full mechanics are documented in .github/AGENT_CUSTOMIZATION.md.
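As a rough illustration of the always-on and auto-attached tiers, the sketch below resolves which instruction files would apply to an edited path. The scoped file names and glob patterns are hypothetical placeholders, not the project's actual configuration; only copilot-instructions.md and the *.instructions.md naming come from the diagram above.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ScopedInstruction:
    """An auto-attached instruction file and the path pattern that triggers it."""
    name: str
    pattern: str  # hypothetical glob; real scoping is defined in the project's own files


# Always-on tier: applies to every request regardless of path.
ALWAYS_ON = "copilot-instructions.md"

# Hypothetical examples only; the real file names and patterns are not listed on this page.
SCOPED = [
    ScopedInstruction("python.instructions.md", "src/*.py"),
    ScopedInstruction("tests.instructions.md", "tests/*.py"),
    ScopedInstruction("docs.instructions.md", "docs/*.md"),
]


def active_instructions(edited_path: str, scoped=SCOPED) -> list[str]:
    """Return the always-on file plus every scoped file whose pattern matches the path."""
    # fnmatchcase's "*" also matches "/" so "src/*.py" covers nested modules.
    matched = [s.name for s in scoped if fnmatchcase(edited_path, s.pattern)]
    return [ALWAYS_ON, *matched]


if __name__ == "__main__":
    for path in ("src/features/build.py", "tests/test_api.py", "README.md"):
        print(path, "->", active_instructions(path))
```

On-demand artifacts (prompts, skills, agents) and hooks are left out of the sketch because they are triggered by explicit invocation or tool lifecycle events rather than by file path.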
What AI tooling is used for
| Workflow | What the AI does | Human still does |
|---|---|---|
| New feature / endpoint | Scaffolds boilerplate, checks for boundary violations | Reviews, runs tests, verifies contracts |
| Refactoring | Proposes change with scope limit | Approves scope, checks no hidden side effects |
| Test coverage | Identifies gaps, generates property tests | Reviews invariants, runs in CI |
| Architecture review | Checks against documented boundaries and contracts | Decides on design changes |
| Audit cycle | Runs audit-system skill, generates report | Reviews findings, creates prioritized plan |
| Documentation | Generates drafts from code and specs | Verifies accuracy, removes speculation |
| Debug a test failure | Traces failure to root cause | Decides on fix |
The principle: AI accelerates the typing; the engineering discipline comes from the project rules.
Real productivity examples
- A new DVC pipeline stage (e.g., add_feature_group) takes ~20 minutes with AI scaffolding vs. ~90 minutes from scratch. The AI fills in boilerplate; the human designs the contract.
- System-wide audit cycles (finding docs vs. code contradictions, status mismatches) that would take a full day manually are reduced to a 30-minute structured session using the audit-system skill.
- Generating property tests for the leakage invariant took one iteration with AI vs. multiple hours of manual research into hypothesis strategies.
AI validation rules
Every non-trivial AI-assisted change must satisfy:
- pytest tests/ -q passes with no new failures.
- ruff check src/ passes (no lint regressions).
- dvc repro still produces the same pipeline output (if pipeline-adjacent).
- Documentation updated to match any behavior change.
- No speculative or planned behavior described as implemented.
AI-generated code is never merged without these checks. The CI pipeline enforces most of them automatically.
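The command-based checks above could be bundled into a small local runner, sketched below under a few assumptions. The function names are hypothetical, and comparing dvc.lock with git diff --exit-code after dvc repro is only one possible stand-in for "same pipeline output"; the project's CI may verify this differently. Note also that a zero exit code from pytest means no failures at all, which is stricter than "no new failures".

```python
import subprocess
from typing import Callable, Sequence

# Checks that apply to every non-trivial change (commands taken from the list above).
BASE_CHECKS = [
    ["pytest", "tests/", "-q"],
    ["ruff", "check", "src/"],
]

# Extra checks for pipeline-adjacent changes.
PIPELINE_CHECKS = [
    ["dvc", "repro"],
    # Assumption: an unchanged dvc.lock after repro stands in for "same pipeline output".
    ["git", "diff", "--exit-code", "dvc.lock"],
]


def run_exit_code(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def validate(
    pipeline_adjacent: bool,
    run: Callable[[Sequence[str]], int] = run_exit_code,
) -> list[str]:
    """Run each check in order and return the failed commands as display strings."""
    checks = BASE_CHECKS + (PIPELINE_CHECKS if pipeline_adjacent else [])
    return [" ".join(cmd) for cmd in checks if run(cmd) != 0]


if __name__ == "__main__":
    failed = validate(pipeline_adjacent=True)
    print("all checks passed" if not failed else f"failed: {failed}")
```

The runner is injected as a parameter so the selection logic can be exercised without invoking the real tools. The documentation and status-honesty rules are not automatable this way and remain part of human review.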
Hard constraints enforced in every instruction file
- Hydra / DVC / MLflow: configuration, pipeline, and tracking tools are mandatory, not optional.
- Layer boundaries: src/data/, src/features/, src/models/, src/pipelines/, src/app/ are isolated.
- No opportunistic refactor: AI changes only what the task requires. Unrequested cleanup is rejected.
- Status honesty: every status claim must be backed by code or explicit task context.
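One way the layer-boundary constraint might be checked mechanically is by scanning a module's imports, as in the sketch below. It assumes modules are imported as src.<layer>... and treats any import of another layer as a violation unless explicitly allowed. Both the import style and the strict default policy are assumptions, since this page does not spell out which cross-layer dependencies, if any, are permitted.

```python
import ast
from typing import Iterator

# Layer names taken from the directories listed above.
LAYERS = ("data", "features", "models", "pipelines", "app")


def imported_modules(source: str) -> Iterator[str]:
    """Yield the dotted module name of every absolute import in the source."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                yield alias.name
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            yield node.module


def cross_layer_imports(
    source: str,
    own_layer: str,
    allowed: frozenset = frozenset(),
) -> list[str]:
    """Return imported modules that belong to a different, non-allowed layer.

    Assumption: layers are addressed as "src.<layer>". Imports written as
    "from src import models" or relative imports are not detected by this sketch.
    """
    violations = []
    for module in imported_modules(source):
        parts = module.split(".")
        if len(parts) >= 2 and parts[0] == "src" and parts[1] in LAYERS:
            target = parts[1]
            if target != own_layer and target not in allowed:
                violations.append(module)
    return violations


if __name__ == "__main__":
    example = "from src.models.train import fit\nfrom src.features import build\n"
    print(cross_layer_imports(example, own_layer="features"))
```

The allowed parameter is there because a real policy would likely permit some directions (for instance, a pipelines layer orchestrating the others); that map would have to come from the project's documented boundaries rather than from this sketch.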
Tooling inventory
| Artifact | Count | Purpose |
|---|---|---|
| Always-on rules | 1 file (copilot-instructions.md) | Project-wide guardrails |
| Scoped instruction files | 9 | Python, FastAPI, Airflow, MLflow, DVC, features, tests, docs, agent-customization |
| Subagents | 2 | Code reviewer (read-only), Docs agent (read-only) |
| Prompts | 7 | Add endpoint, add pipeline stage, add feature, register model, release checklist, debug test, sync docs |
| Skills | 5 | audit-system, plan-test-coverage, error-analysis, dvc-pipeline-optimize, train-serve-skew-check |
| Hooks | 1 config | pre-tool-checks.json |
| MCP servers | 2 | awesome-copilot-main reference catalogue, soccer-docs filesystem |
| Audit cycles completed | 5 | 2026-04-24, 2026-04-26, 2026-04-28, 2026-04-30, 2026-05-16 |
What’s in this section
| Page | What it covers |
|---|---|
| Customization Layer | .github/ contents: agents, instructions, prompts, skills, hooks. When each activates and how to invoke it. |
| Continuous System Audits | The audit-system skill and reports/validation/ artifacts. How a system-wide health check is a reproducible procedure. |
| Iteration Plans | The reports/planning/ artifacts: dated, phased plans generated and tracked with AI assistance. |