AINA data engine room - technical handoff - 2026-06-11

AINA Production Runtime Readiness Handoff

A next-agent map for the new headless runtime harness, its receipts, its gates, and what remains before real production.

Codex execution lane · next agent, technical collaborator, and Ali · branch ali/personalization-engine-mission-2026-06-09

The Single Idea

The completed local ICP title map now has a headless production-runtime readiness harness. The repo can run representative title requests through the beta readiness map, generate local synthetic learning decisions for allowed cases, run deterministic evaluator receipts, block held/excluded/unknown titles, and prove the runtime gates through full validation.

Current State

9 golden runtime cases

9 golden expectations passed

6 local synthetic plans

2 fallback plans (counted within the 6 local synthetic plans)

1 reviewed hold

2 blocked excluded/unknown cases

6 evaluator runs and passes

0 production unlocks

Scope remains VDS-local, headless, and synthetic. This milestone prepares the runtime boundary; it does not add UI, auth, public serving, production telemetry, or real-user data.

Files Added Or Changed

Path	Purpose
`src/aina_data_engine/production_runtime_readiness.py`	New runtime readiness harness, golden cases, receipt writer, generated report writer, deterministic evaluator runner, and safety gates.
`src/aina_data_engine/cli.py`	Adds `production-runtime-readiness` command and wires the artifact into `validate`.
`src/aina_data_engine/reports.py`	Adds production-runtime readiness checks to full validation.
`tests/test_production_runtime_readiness.py`	Adds unit/CLI coverage for the runtime harness and blocked/planned case behavior.
`artifacts/validation/production_runtime_readiness_v0.json`	Summary receipt for the runtime rehearsal.
`artifacts/validation/production_runtime_readiness_v0.jsonl`	Row-level case receipt for all golden cases.
`artifacts/runtime/production_runtime_readiness_v0_golden_cases.jsonl`	Golden runtime title requests.
`artifacts/validation/full_validation.json`	Regenerated validation receipt with production-runtime checks.
`docs/reports/2026-06-11-founder-production-runtime-readiness.md/html`	Founder-facing report pair.
`docs/handoff/2026-06-11-production-runtime-readiness-handoff.md/html`	Technical handoff pair.

Runtime Contract

The command ensures the source-backed warehouse and beta readiness path exist, then runs the production-runtime readiness harness. Each golden case is validated against the current beta title map and assigned exactly one runtime action.

Codex · Runtime readiness · run the headless harness

uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-readiness

Watch: do not treat a valid local receipt as real production approval.

Runtime action	Behavior
`plan_local_synthetic`	Calls `generate_data_decision(..., persist=False)`, records packet/plan/module/exercise/rubric details, and runs `evaluate_submission`.
`hold_for_review`	Refuses planning because the title is a reviewed residual hold.
`block_not_icp`	Refuses planning because the title is excluded or outside the ICP path.
`block_unknown_title`	Refuses planning because the title is not in the current beta title map.

All case records explicitly set these production gates to false: real production runtime, real-user data, external writes, production telemetry, and production claims.
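
A minimal sketch of how that rule could be asserted against a single case record. The `production_gates` key and the five gate names are assumptions derived from the sentence above; the real receipt schema may name them differently.

```python
# Hypothetical gate names taken from the prose above; not confirmed receipt keys.
PRODUCTION_GATES = (
    "real_production_runtime",
    "real_user_data",
    "external_writes",
    "production_telemetry",
    "production_claims",
)


def closed_gates() -> dict[str, bool]:
    """Build a gate mapping with every production gate explicitly False."""
    return {name: False for name in PRODUCTION_GATES}


def open_gates(case_record: dict) -> list[str]:
    """Return gates that are missing or not explicitly False.

    A missing gate counts as open, so an incomplete record fails closed.
    """
    gates = case_record.get("production_gates") or {}
    return [name for name in PRODUCTION_GATES if gates.get(name) is not False]
```

Treating a missing gate as open is a design choice in this sketch: a record that forgets to declare a gate should not pass by omission.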

Golden Cases

All nine cases passed expectation checks in the live VDS run.

Case	Title	Expected cohort	Runtime action
`serve_now_seasonal_sales_associate`	`seasonal sales associate`	`serve_now`	`plan_local_synthetic`
`serve_now_support_associate_soma`	`support associate - soma`	`serve_now`	`plan_local_synthetic`
`serve_now_director_bi`	`director of business intelligence`	`serve_now`	`plan_local_synthetic`
`serve_now_retail_sales_associate`	`part-time retail sales associate`	`serve_now`	`plan_local_synthetic`
`fallback_case_manager`	`case manager`	`serve_with_fallback`	`plan_local_synthetic`
`fallback_technical_support_assistant_manager`	`technical support assistant manager`	`serve_with_fallback`	`plan_local_synthetic`
`hold_family_law_attorney`	`family law attorney`	`reviewed_residual_hold`	`hold_for_review`
`excluded_registered_nurse`	`registered nurse - 1755724`	`excluded_or_not_icp`	`block_not_icp`
`unknown_future_job`	`totally unknown future job`	`unknown_unmapped_title`	`block_unknown_title`
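
The cohort-to-action pairing in the table above can be sketched as a lookup. The cohort and action strings come from the table; the function name and the fail-closed default for unrecognized cohorts are assumptions, not the harness's confirmed behavior.

```python
# Cohort and action names are copied from the golden-case table.
COHORT_TO_ACTION = {
    "serve_now": "plan_local_synthetic",
    "serve_with_fallback": "plan_local_synthetic",
    "reviewed_residual_hold": "hold_for_review",
    "excluded_or_not_icp": "block_not_icp",
    "unknown_unmapped_title": "block_unknown_title",
}


def resolve_runtime_action(cohort: str) -> str:
    """Map a cohort to exactly one runtime action.

    Assumption: a cohort this sketch does not recognize is treated like an
    unknown title, so nothing plans by accident.
    """
    return COHORT_TO_ACTION.get(cohort, "block_unknown_title")
```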

Validation Checks Added

Check	Why it matters
`beta_readiness_receipt_valid`	Confirms the runtime rehearsal is based on the current beta title map.
`all_golden_expectations_met`	Confirms each title resolves to the intended cohort and action.
`planned_cases_have_modules`	Confirms planned cases are not empty shells.
`planned_cases_have_exercises`	Confirms practice exists for planned cases.
`planned_cases_have_rubrics`	Confirms evaluator criteria exist.
`planned_cases_evaluator_passed`	Confirms deterministic evaluator checks pass.
`blocked_and_held_cases_do_not_plan`	Confirms holds/exclusions/unknowns do not generate plans.
`fallback_cases_caveated`	Confirms fallback plans carry caveats.
`source_grounding_visible`	Confirms title/source grounding appears in the receipt.
`domain_review_required_visible`	Confirms sensitive review tags are visible.
`no_real_production_runtime_unlocks`	Confirms the harness unlocks no real production path.
`auth_privacy_runtime_gates_declared`	Confirms missing production boundaries are named.
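
As an illustration of the four `planned_cases_*` checks, the sketch below reports which planned rows are missing a required part. The row field names (`case_id`, `runtime_action`, `modules`, `exercises`, `rubrics`, `evaluator_passed`) are hypothetical; the actual JSONL receipt may structure these differently.

```python
# Hypothetical row fields; only the action name comes from the handoff itself.
REQUIRED_PLAN_PARTS = ("modules", "exercises", "rubrics")


def incomplete_planned_cases(rows: list[dict]) -> dict[str, list[str]]:
    """Return {case_id: [missing parts]} for planned rows that are empty shells."""
    problems: dict[str, list[str]] = {}
    for row in rows:
        if row.get("runtime_action") != "plan_local_synthetic":
            continue  # held and blocked rows are covered by a different check
        missing = [part for part in REQUIRED_PLAN_PARTS if not row.get(part)]
        if row.get("evaluator_passed") is not True:
            missing.append("evaluator_passed")
        if missing:
            problems[row["case_id"]] = missing
    return problems
```

An empty result means every planned row carries modules, exercises, rubrics, and a passing evaluator run.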

Review Outcome

Two read-only Codex review agents checked the milestone before checkpointing.

Reviewer	Finding	Resolution
Correctness	Full validation did not gate the golden-cases JSONL artifact.	Fixed in the CLI ensure step, full validation, receipt metadata, and tests.
Correctness	Held/blocked and fallback safety checks could false-green through aggregate counts.	Fixed with row-level checks and a regression test.
Safety/claims	No concrete safety or overclaim bug found.	Auth/session/privacy gaps remain declared as missing production work.
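
The second correctness finding is easier to see with a small example. In the sketch below, an aggregate plan count passes even though a held case planned and a fallback case did not, while a row-level check names the offender. The row fields (`case_id`, `runtime_action`, `plan`) and the sample rows are illustrative assumptions, not data from the real receipt.

```python
NON_PLANNING_ACTIONS = frozenset(
    {"hold_for_review", "block_not_icp", "block_unknown_title"}
)


def aggregate_plan_count_ok(rows: list[dict], expected_plans: int) -> bool:
    """Aggregate check: only compares the total number of rows with a plan."""
    return sum(1 for row in rows if row.get("plan")) == expected_plans


def rows_that_planned_when_refused(rows: list[dict]) -> list[str]:
    """Row-level check: list held or blocked cases that still carry a plan."""
    return [
        row["case_id"]
        for row in rows
        if row["runtime_action"] in NON_PLANNING_ACTIONS and row.get("plan")
    ]


# Illustrative rows only: two errors that cancel out in the total.
SAMPLE_ROWS = [
    {"case_id": "fallback_case_manager",
     "runtime_action": "plan_local_synthetic", "plan": None},
    {"case_id": "hold_family_law_attorney",
     "runtime_action": "hold_for_review", "plan": {"modules": ["m1"]}},
]
```

With one plan expected, `aggregate_plan_count_ok(SAMPLE_ROWS, 1)` is true, yet `rows_that_planned_when_refused(SAMPLE_ROWS)` still flags the held case.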

Validation Commands

Codex · Validation · rerun the milestone proof

uv run ruff check src/aina_data_engine/production_runtime_readiness.py src/aina_data_engine/cli.py src/aina_data_engine/reports.py tests/test_production_runtime_readiness.py
uv run pytest tests/test_production_runtime_readiness.py -q
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-readiness
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
uv run pytest -q

Watch: rerun this block if the runtime receipt, validation gates, or reports move again.

Command	Result
Targeted ruff	Pass
Targeted production-runtime tests	`4 passed`
Production-runtime command	Valid receipt, no failed checks
Full engine validation	`status: pass`
Full pytest	`181 passed`

What Is Still Not Production

Gap	Why it matters
No auth/session/tenant integration	A real product must not trust client-supplied learner identity or tenant context.
No real-user data policy enforcement	The current harness uses synthetic profiles only.
No production telemetry sink	Local receipts exist; production observability does not.
No external write path	Correct for this repo, but production needs explicit write boundaries.
No real-beta allowlist	Local serviceability is broader than real learner eligibility.
No UI	Not needed for this milestone, but a thin internal tester could help later.
Domain review still required	Sensitive role classes remain gated.

Resume Commands

Codex · Resume · inspect current truth

git status --short
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-readiness
jq '{status, valid, metrics, failed_checks, scope}' artifacts/validation/production_runtime_readiness_v0.json
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
uv run pytest tests/test_production_runtime_readiness.py -q
uv run pytest -q

Watch: do not skip the receipt check if the repo has moved since this handoff.
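
For agents who prefer a scripted receipt check over `jq`, here is a small sketch that reads the same five keys. It assumes `valid` is a boolean and `failed_checks` is a list, which the handoff implies but does not spell out; the helper names are hypothetical.

```python
import json
from pathlib import Path

# Path and key names are taken from the jq command above.
RECEIPT_PATH = Path("artifacts/validation/production_runtime_readiness_v0.json")
SUMMARY_KEYS = ("status", "valid", "metrics", "failed_checks", "scope")


def load_receipt(path: Path = RECEIPT_PATH) -> dict:
    """Read the summary receipt from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


def summarize_receipt(receipt: dict) -> dict:
    """Mirror the jq projection: missing keys come back as None."""
    return {key: receipt.get(key) for key in SUMMARY_KEYS}


def receipt_is_green(receipt: dict) -> bool:
    """Fail closed: require valid == True and an explicitly empty failed_checks."""
    return receipt.get("valid") is True and receipt.get("failed_checks") == []
```

Run from the repo root, `receipt_is_green(load_receipt())` gives a quick yes/no before moving on to full validation. A green local receipt is still not production approval.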

Recommended Next Build Step

The next real milestone should keep the repo self-contained but make the runtime boundary more production-shaped.

Step	Purpose
Add a real-beta allowlist policy stricter than local synthetic serviceability	Separates internal proof from learner eligibility.
Add request/response JSON schema fixtures for the future runtime API	Makes integration assumptions testable before UI/auth work.
Add failing gates for missing auth/session/tenant assumptions	Keeps account boundaries visible.
Add privacy, consent, retention, deletion, and redaction fixtures	Prepares for real learners without accepting real data yet.
Add telemetry sink policy with local-only and production-approved modes	Prevents observability from becoming accidental data leakage.
Expand the golden title suite beyond 9 cases	Creates regression coverage across representative ICP slices.
Add a static internal tester only if it helps inspect decisions faster	Keeps UI optional rather than making it a dependency.
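
One possible shape for the first step, offered as a sketch only. The names `REAL_BETA_ALLOWLIST` and `is_real_beta_eligible` are hypothetical, and the allowlist starts empty so that nothing becomes eligible until someone approves a title explicitly.

```python
# Cohorts the local harness plans for, per the golden-case table.
LOCALLY_SERVICEABLE = frozenset({"serve_now", "serve_with_fallback"})

# Hypothetical policy: empty by default, so zero real-beta unlocks.
REAL_BETA_ALLOWLIST: frozenset[str] = frozenset()


def is_real_beta_eligible(
    title: str,
    cohort: str,
    allowlist: frozenset[str] = REAL_BETA_ALLOWLIST,
) -> bool:
    """A title must be locally serviceable AND explicitly allowlisted."""
    normalized = title.strip().lower()
    return cohort in LOCALLY_SERVICEABLE and normalized in allowlist
```

Because eligibility requires both conditions, the allowlist can only narrow local serviceability and never widen it.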

Where to start

Start by rerunning the production-runtime receipt, then expand the runtime boundary from local synthetic proof toward real-beta allowlist gates.

Ali Mehdi Mukadam - co-authored with Codex - 2026-06-11

topics:
  - aina-personalization-engine
  - production-runtime-readiness
  - handoff
subtopics:
  - implementation-map
  - validation
  - next-agent-resume
  - production-hardening
