Model Quality Gate Terminology Checkpoint
A cleanup checkpoint that removes the old human-review gate contract from active production-readiness surfaces.
This checkpoint removes the old human-review gate vocabulary from the active GDPval, replay, feedback, beta-admission, deployment-readiness, and runtime-readiness surfaces, replacing it with model-quality, structured-model, semantic-review, and quality-gate language. The production Personalization Engine goal is still active; this is a cleanup and proof checkpoint that prevents future agents from reintroducing the wrong review contract while we continue toward JD-aware role context, AI Fluency maps, clean embeddings, and runtime readiness.
What Changed
The active gate terminology now reflects the current operating model.
| Before | After |
|---|---|
| human_review_decisions | model_quality_gate_decisions |
| requires_human_review_before_external_beta | requires_model_quality_gate_before_external_beta |
| human_risk_gate_status | model_risk_gate_status |
| human_review_required_for_path | model_quality_gate_required_for_path |
| awaiting_human_decision style GDPval holds | awaiting_structured_model_decision and ready_for_structured_model_calibration |
| “human-reviewed workflows” in derived outputs | “quality-reviewed workflows” |
This touched code, tests, and regenerated durable receipts under artifacts/validation, artifacts/reports, artifacts/events, and artifacts/review.
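As an illustration of the rename, here is a minimal sketch of a key normalizer built from the Before/After pairs in the table. The helper name `normalize_gate_keys` and the conflict rule are assumptions made for this example; the real normalizers under src/aina_data_engine may be structured differently.

```python
# Old-to-new key pairs copied from the Before/After table.
RENAMED_KEYS = {
    "human_review_decisions": "model_quality_gate_decisions",
    "requires_human_review_before_external_beta": "requires_model_quality_gate_before_external_beta",
    "human_risk_gate_status": "model_risk_gate_status",
    "human_review_required_for_path": "model_quality_gate_required_for_path",
}


def normalize_gate_keys(record: dict) -> dict:
    """Hypothetical helper: return a copy of `record` with legacy gate keys renamed.

    Raises ValueError when a record carries both the old and the new spelling
    of a key, because silently keeping one could hide a conflicting decision.
    """
    normalized = {}
    for key, value in record.items():
        new_key = RENAMED_KEYS.get(key, key)
        if new_key in normalized:
            raise ValueError(f"duplicate gate key after rename: {new_key}")
        normalized[new_key] = value
    return normalized
```

Keeping the old literals in one mapping like this is one way a source normalizer can still read historical receipts without the old names leaking into active report keys.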
Main Files Touched
Core changes landed across beta admission, deployment readiness, event replay, feedback, GDPval replacement flows, packet quality, reporting, and the planner. Focused tests were updated or exercised for beta admission, deployment readiness, feedback, GDPval calibration and practice, packet quality, and planner behavior.
| Path | Change |
|---|---|
| src/aina_data_engine/* | Gate names, normalizers, runtime flags, report keys. |
| tests/test_*GDPval* | Focused assertions now expect model-quality and structured-model language. |
| artifacts/* | Durable JSON, JSONL, Markdown, HTML, and event receipts regenerated. |
Verification
| Check | Result |
|---|---|
| ain-506-p0-gate | pass |
| ain-510-retrieval-promotion-gate | pass, status promotion_ready |
| production-runtime-readiness | pass, status ready_to_harden_headless_production_runtime |
| validate | pass |
| Focused pytest | 51 passed in 48.88s |
| Ruff on changed Python/test files | pass |
| Active stale-phrase scan for GDPval/replay/feedback receipts | no hits |
| Planner/runtime-readiness human_review scan | no hits |
| Artifact ignore policy | bulk DuckDB, Parquet, vector, and raw paths ignored; durable reports, receipts, and handoff not ignored |
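The stale-phrase scans in the table can be approximated with a short script. This is a sketch and not the scan the checkpoint actually ran: the phrase list is taken from the Before column of the rename table, and the function name `find_stale_phrases` and the set of file suffixes are assumptions.

```python
from pathlib import Path

# Legacy phrases taken from the Before column of the rename table.
# "human_review" also matches the longer human_review_* keys.
STALE_PHRASES = (
    "human_review",
    "human_risk_gate_status",
    "awaiting_human_decision",
    "human-reviewed workflows",
)

# Assumed set of text receipt suffixes; bulk formats such as Parquet are skipped.
TEXT_SUFFIXES = {".json", ".jsonl", ".md", ".html"}


def find_stale_phrases(root: Path) -> list:
    """Return (path, line_number, phrase) for every stale phrase under `root`."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            for phrase in STALE_PHRASES:
                if phrase in line:
                    hits.append((str(path), line_number, phrase))
    return hits
```

An empty result corresponds to the "no hits" rows. Paths that keep old literals by design, such as source normalizers, would need to be excluded before treating a hit as a regression.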
What Is Intentionally Not Solved Here
This checkpoint does not finish the full production objective. It does not complete JD-aware role-context evidence, source-authority registry v2, role-resolution decisions, AI Fluency E2E fixture expansion to 25-50 real rows, clean full-corpus embedding, archive/retirement proof, or the final production-pluggable release.
Some historical or compatibility surfaces still contain old terms by design: source normalizers may include old phrase literals, raw or bulk artifacts may still need source-family repair, and risk/accountability holds remain valid as long as they are not represented as human_review columns or gates.
Current Production Goal Status
The active goal remains open. The repo is now cleaner for the next slices, but the real end state still requires clean-start and authority reconciliation, runtime contracts, JD-aware role context, AI Fluency loop validation, clean-repair-embed progression, exact-cosine retrieval proof, and donor retirement/archive proof.
Resume Commands
git status --short --branch
git log -5 --oneline
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-506-p0-gate
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-510-retrieval-promotion-gate
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-readiness
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
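The four gate commands above can also be run in sequence, stopping at the first failure. The sketch below only wraps the commands as listed; the helper names are hypothetical, and it assumes the CLI exits non-zero when a gate fails, which this checkpoint does not confirm.

```python
import subprocess
from typing import Optional

ROOT = "/srv/aina/aina-data-engine-room"

# Gate subcommands in the order listed in this section.
GATES = (
    "ain-506-p0-gate",
    "ain-510-retrieval-promotion-gate",
    "production-runtime-readiness",
    "validate",
)


def gate_command(gate: str, root: str = ROOT) -> list:
    """Build the argv for one gate run, mirroring the commands above."""
    return ["uv", "run", "aina-data-engine", "--root", root, gate]


def run_gates(gates=GATES, root: str = ROOT) -> Optional[str]:
    """Run each gate in order; return the first failing gate name, or None.

    Assumption: a failing gate is signalled by a non-zero exit code.
    """
    for gate in gates:
        result = subprocess.run(gate_command(gate, root))
        if result.returncode != 0:
            return gate
    return None


if __name__ == "__main__":
    failed = run_gates()
    print("all gates passed" if failed is None else f"first failing gate: {failed}")
```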
Focused regression command:
uv run pytest tests/test_deployment_readiness.py tests/test_gdpval_hold_closeout.py tests/test_gdpval_calibration_packet.py tests/test_packet_quality_gate.py tests/test_gdpval_replacement_practice.py tests/test_gdpval_replacement_replay_bridge.py tests/test_gdpval_replacement_approved_intake.py tests/test_gdpval_replacement_candidate_pack.py tests/test_gdpval_replacement_candidate_practice.py tests/test_gdpval_replacement_candidate_batch_practice.py tests/test_gdpval_replacement_candidate_batch_resolution.py tests/test_beta_admission.py tests/test_feedback_loop.py tests/test_feedback_state_matrix.py tests/test_planner.py -q
Start with runtime/source-authority reconciliation, not a new embedding push; make role-context evidence JD-aware and product-consumable, then embed only clean semantic chunks that pass source-family gates.