AINA Production Runtime Readiness Handoff
A next-agent map for the new headless runtime harness, its receipts, its gates, and what remains before real production.
The completed local ICP title map now has a headless production-runtime readiness harness. The repo can run representative title requests through the beta readiness map, generate local synthetic learning decisions for allowed cases, run deterministic evaluator receipts, block held/excluded/unknown titles, and prove the runtime gates through full validation.
Current State
Scope remains VDS-local, headless, and synthetic. This milestone prepares the runtime boundary; it does not add UI, auth, public serving, production telemetry, or real-user data.
Files Added Or Changed
| Path | Purpose |
|---|---|
| src/aina_data_engine/production_runtime_readiness.py | New runtime readiness harness, golden cases, receipt writer, generated report writer, deterministic evaluator runner, and safety gates. |
| src/aina_data_engine/cli.py | Adds production-runtime-readiness command and wires the artifact into validate. |
| src/aina_data_engine/reports.py | Adds production-runtime readiness checks to full validation. |
| tests/test_production_runtime_readiness.py | Adds unit/CLI coverage for the runtime harness and blocked/planned case behavior. |
| artifacts/validation/production_runtime_readiness_v0.json | Summary receipt for the runtime rehearsal. |
| artifacts/validation/production_runtime_readiness_v0.jsonl | Row-level case receipt for all golden cases. |
| artifacts/runtime/production_runtime_readiness_v0_golden_cases.jsonl | Golden runtime title requests. |
| artifacts/validation/full_validation.json | Regenerated validation receipt with production-runtime checks. |
| docs/reports/2026-06-11-founder-production-runtime-readiness.md/html | Founder-facing report pair. |
| docs/handoff/2026-06-11-production-runtime-readiness-handoff.md/html | Technical handoff pair. |
Runtime Contract
The production-runtime-readiness command first ensures that the source-backed warehouse and beta readiness path exist, then runs the harness. Each golden case is validated against the current beta title map and assigned exactly one runtime action.
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-readiness
| Runtime action | Behavior |
|---|---|
| plan_local_synthetic | Calls generate_data_decision(..., persist=False), records packet/plan/module/exercise/rubric details, and runs evaluate_submission. |
| hold_for_review | Refuses planning because the title is a reviewed residual hold. |
| block_not_icp | Refuses planning because the title is excluded or outside the ICP path. |
| block_unknown_title | Refuses planning because the title is not in the current beta title map. |
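
The "exactly one runtime action" rule can be sketched as a lookup that fails closed. This is a minimal illustration, not the harness's actual code: the cohort and action names come from the tables in this handoff, while `RuntimeAction`, `COHORT_TO_ACTION`, `resolve_runtime_action`, and `may_plan` are hypothetical names introduced here.

```python
from enum import Enum
from typing import Optional


class RuntimeAction(str, Enum):
    PLAN_LOCAL_SYNTHETIC = "plan_local_synthetic"
    HOLD_FOR_REVIEW = "hold_for_review"
    BLOCK_NOT_ICP = "block_not_icp"
    BLOCK_UNKNOWN_TITLE = "block_unknown_title"


# Cohort and action names are taken from this handoff; the dict itself is a
# hypothetical illustration, not the harness's real data structure.
COHORT_TO_ACTION = {
    "serve_now": RuntimeAction.PLAN_LOCAL_SYNTHETIC,
    "serve_with_fallback": RuntimeAction.PLAN_LOCAL_SYNTHETIC,
    "reviewed_residual_hold": RuntimeAction.HOLD_FOR_REVIEW,
    "excluded_or_not_icp": RuntimeAction.BLOCK_NOT_ICP,
    "unknown_unmapped_title": RuntimeAction.BLOCK_UNKNOWN_TITLE,
}


def resolve_runtime_action(cohort: Optional[str]) -> RuntimeAction:
    """Return one action per cohort; anything unrecognized is blocked."""
    return COHORT_TO_ACTION.get(cohort, RuntimeAction.BLOCK_UNKNOWN_TITLE)


def may_plan(action: RuntimeAction) -> bool:
    """Only the planning action is allowed to generate a plan."""
    return action is RuntimeAction.PLAN_LOCAL_SYNTHETIC
```

Defaulting to the blocking action means a cohort label added later cannot reach planning until someone maps it explicitly.
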
Golden Cases
All nine cases passed expectation checks in the live VDS run.
| Case | Title | Expected cohort | Runtime action |
|---|---|---|---|
| serve_now_seasonal_sales_associate | seasonal sales associate | serve_now | plan_local_synthetic |
| serve_now_support_associate_soma | support associate - soma | serve_now | plan_local_synthetic |
| serve_now_director_bi | director of business intelligence | serve_now | plan_local_synthetic |
| serve_now_retail_sales_associate | part-time retail sales associate | serve_now | plan_local_synthetic |
| fallback_case_manager | case manager | serve_with_fallback | plan_local_synthetic |
| fallback_technical_support_assistant_manager | technical support assistant manager | serve_with_fallback | plan_local_synthetic |
| hold_family_law_attorney | family law attorney | reviewed_residual_hold | hold_for_review |
| excluded_registered_nurse | registered nurse - 1755724 | excluded_or_not_icp | block_not_icp |
| unknown_future_job | totally unknown future job | unknown_unmapped_title | block_unknown_title |
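
A golden-case expectation check can be sketched as below. The JSONL field names (`case_id`, `title`, `expected_cohort`, `expected_action`) and both helper functions are assumptions made for illustration; the real schema lives in artifacts/runtime/production_runtime_readiness_v0_golden_cases.jsonl and may differ.

```python
import json

# Two rows shaped like the table above. The field names are hypothetical.
SAMPLE_JSONL = """\
{"case_id": "serve_now_director_bi", "title": "director of business intelligence", "expected_cohort": "serve_now", "expected_action": "plan_local_synthetic"}
{"case_id": "unknown_future_job", "title": "totally unknown future job", "expected_cohort": "unknown_unmapped_title", "expected_action": "block_unknown_title"}
"""


def parse_jsonl(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def failed_expectations(cases, observed):
    """Compare each case with what the runtime actually resolved.

    `observed` maps case_id -> (cohort, action). A case with no observed
    result counts as a failure rather than being skipped.
    """
    failures = []
    for case in cases:
        want = (case["expected_cohort"], case["expected_action"])
        got = observed.get(case["case_id"])
        if got != want:
            failures.append(
                {"case_id": case["case_id"], "expected": want, "observed": got}
            )
    return failures


cases = parse_jsonl(SAMPLE_JSONL)
observed = {
    "serve_now_director_bi": ("serve_now", "plan_local_synthetic"),
    "unknown_future_job": ("unknown_unmapped_title", "block_unknown_title"),
}
print(failed_expectations(cases, observed))  # empty list when every case matches
```

Treating a missing observation as a failure keeps a dropped case from silently passing.
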
Validation Checks Added
| Check | Why it matters |
|---|---|
| beta_readiness_receipt_valid | Confirms the runtime rehearsal is based on the current beta title map. |
| all_golden_expectations_met | Confirms each title resolves to the intended cohort and action. |
| planned_cases_have_modules | Confirms planned cases are not empty shells. |
| planned_cases_have_exercises | Confirms practice exists for planned cases. |
| planned_cases_have_rubrics | Confirms evaluator criteria exist. |
| planned_cases_evaluator_passed | Confirms deterministic evaluator checks pass. |
| blocked_and_held_cases_do_not_plan | Confirms holds/exclusions/unknowns do not generate plans. |
| fallback_cases_caveated | Confirms fallback plans carry caveats. |
| source_grounding_visible | Confirms title/source grounding appears in the receipt. |
| domain_review_required_visible | Confirms sensitive review tags are visible. |
| no_real_production_runtime_unlocks | Confirms the harness unlocks no real production path. |
| auth_privacy_runtime_gates_declared | Confirms missing production boundaries are named. |
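
One way named checks like these could roll up into a summary receipt is sketched below. The keys `status`, `valid`, and `failed_checks` mirror the fields queried with jq under Resume Commands; the `"fail"` status value and the `build_receipt` helper are assumptions, not confirmed details of the real receipt writer.

```python
def build_receipt(checks: dict) -> dict:
    """Summarize named boolean checks into a receipt-like dict.

    Any single failing check makes the whole receipt invalid, and failing
    names are listed so the reader knows where to look first.
    """
    failed = sorted(name for name, ok in checks.items() if not ok)
    return {
        "status": "pass" if not failed else "fail",  # "fail" is an assumed value
        "valid": not failed,
        "failed_checks": failed,
    }


checks = {
    "all_golden_expectations_met": True,
    "blocked_and_held_cases_do_not_plan": True,
    "fallback_cases_caveated": False,
}
print(build_receipt(checks))
```
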
Review Outcome
Two read-only Codex review agents checked the milestone before checkpointing.
| Reviewer | Finding | Resolution |
|---|---|---|
| Correctness | Full validation did not gate the golden-cases JSONL artifact. | Fixed in CLI ensure, full validation, receipt metadata, and tests. |
| Correctness | Held/blocked and fallback safety checks could false-green through aggregate counts. | Fixed with row-level checks and a regression test. |
| Safety/claims | No concrete safety or overclaim bug found. | Auth/session/privacy gaps remain declared as missing production work. |
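
The false-green risk in the second finding can be illustrated with a constructed example. The handoff does not describe the exact defect, so the row shape (`runtime_action`, `plan_generated`) and both functions below are hypothetical; they only show why a count comparison is weaker than a per-row check.

```python
PLANNING_ACTION = "plan_local_synthetic"


def aggregate_check(rows):
    """Weak check: compares only how many plans exist with how many were expected."""
    planned = sum(1 for r in rows if r["plan_generated"])
    expected = sum(1 for r in rows if r["runtime_action"] == PLANNING_ACTION)
    return planned == expected


def row_level_check(rows):
    """Strict check: every row must have a plan if and only if its action allows one."""
    return all(
        r["plan_generated"] == (r["runtime_action"] == PLANNING_ACTION)
        for r in rows
    )


# Two errors that cancel out in the totals: a planned case with no plan, and a
# held case that wrongly received one.
rows = [
    {"case_id": "fallback_case_manager", "runtime_action": "plan_local_synthetic", "plan_generated": False},
    {"case_id": "hold_family_law_attorney", "runtime_action": "hold_for_review", "plan_generated": True},
]
print(aggregate_check(rows), row_level_check(rows))
```

Here the counts agree (one plan expected, one plan present), so the aggregate check passes while the row-level check correctly fails.
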
Validation Commands
uv run ruff check src/aina_data_engine/production_runtime_readiness.py src/aina_data_engine/cli.py src/aina_data_engine/reports.py tests/test_production_runtime_readiness.py
uv run pytest tests/test_production_runtime_readiness.py -q
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-readiness
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
uv run pytest -q
| Command | Result |
|---|---|
| Targeted ruff | Pass |
| Targeted production-runtime tests | 4 passed |
| Production-runtime command | Valid receipt, no failed checks |
| Full engine validation | status: pass |
| Full pytest | 181 passed |
What Is Still Not Production
| Gap | Why it matters |
|---|---|
| No auth/session/tenant integration | A real product must not trust client-supplied learner identity or tenant context. |
| No real-user data policy enforcement | The current harness uses synthetic profiles only. |
| No production telemetry sink | Local receipts exist; production observability does not. |
| No external write path | Correct for this repo, but production needs explicit write boundaries. |
| No real-beta allowlist | Local serviceability is broader than real learner eligibility. |
| No UI | Not needed for this milestone, but a thin internal tester could help later. |
| Domain review still required | Sensitive role classes remain gated. |
Resume Commands
git status --short
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-readiness
jq '{status, valid, metrics, failed_checks, scope}' artifacts/validation/production_runtime_readiness_v0.json
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
uv run pytest tests/test_production_runtime_readiness.py -q
uv run pytest -q
Recommended Next Build Step
The next real milestone should keep the repo self-contained but make the runtime boundary more production-shaped.
| Step | Purpose |
|---|---|
| Add a real-beta allowlist policy stricter than local synthetic serviceability | Separates internal proof from learner eligibility. |
| Add request/response JSON schema fixtures for the future runtime API | Makes integration assumptions testable before UI/auth work. |
| Add failing gates for missing auth/session/tenant assumptions | Keeps account boundaries visible. |
| Add privacy, consent, retention, deletion, and redaction fixtures | Prepares for real learners without accepting real data yet. |
| Add telemetry sink policy with local-only and production-approved modes | Prevents observability from becoming accidental data leakage. |
| Expand the golden title suite beyond 9 cases | Creates regression coverage across representative ICP slices. |
| Add a static internal tester only if it helps inspect decisions faster | Keeps UI optional rather than making it a dependency. |
Start by rerunning the production-runtime receipt, then expand the runtime boundary from local synthetic proof toward real-beta allowlist gates.
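
The first recommended step, a real-beta allowlist stricter than local serviceability, might look roughly like the sketch below. Nothing here exists in the repo yet: `EligibilityDecision`, `decide_eligibility`, the reason strings, and the whitespace/case normalization are all assumptions for illustration. Only the two locally serviceable cohort names come from the golden-case table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EligibilityDecision:
    locally_serviceable: bool
    real_beta_eligible: bool
    reason: str


# Cohorts that plan locally, per the golden-case table in this handoff.
LOCALLY_SERVICEABLE_COHORTS = frozenset({"serve_now", "serve_with_fallback"})


def decide_eligibility(title, cohort, real_beta_allowlist):
    """Keep local proof and learner eligibility as two separate answers.

    A title must be locally serviceable AND explicitly allowlisted before it
    counts as real-beta eligible; the allowlist can never widen local scope.
    """
    normalized = " ".join(title.lower().split())  # hypothetical normalization
    if cohort not in LOCALLY_SERVICEABLE_COHORTS:
        return EligibilityDecision(False, False, "not_locally_serviceable")
    if normalized not in real_beta_allowlist:
        return EligibilityDecision(True, False, "not_on_real_beta_allowlist")
    return EligibilityDecision(True, True, "allowlisted")


allowlist = frozenset({"case manager"})
print(decide_eligibility("Case  Manager", "serve_with_fallback", allowlist))
print(decide_eligibility("seasonal sales associate", "serve_now", allowlist))
```

Returning both flags in one decision keeps the distinction between internal proof and learner eligibility visible in receipts.
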