AINA data engine room - VDS local execution - 2026-06-10

AINA ICP Title Coverage Goal Run Handoff

Technical handoff for the local personalization-engine title coverage lane.

Branch: ali/personalization-engine-mission-2026-06-09 - previous checkpoint: ac3e076

The Single Idea

The repo now has a self-contained, source-backed local engine that can route every title on the ICP surface into one of three lanes: serve locally now, serve with fallback caveats, or hold until stronger evidence exists. In the current working state, 46,401 title rows are serviceable locally, 17,477 rows are excluded or fall outside the ICP wedge, and 10,347 rows remain ambiguous. The future reviewer lane is Codex-Spark-only.

74,225 raw title rows

46,401 serviceable locally

17,477 excluded or not ICP

10,347 still ambiguous

01 - Current State

Verified Local Metrics

Measure	Count	Meaning
Raw input title rows	74,225	All title rows in the source-backed coverage receipt.
Deduped title count	74,213	Title-level working surface.
Current serviceable rows	46,401	45,564 base serviceable rows plus 837 adjudicated fallback promotions.
Current excluded/not ICP rows	17,477	12,572 base exclusions plus 4,905 adjudicated exclusions.
Remaining ambiguous rows	10,347	Rows still needing deterministic evidence or Codex-Spark adjudication.
Structured model decisions applied	2,545	Accepted decisions in the adjudication input.
Production unlocks	0	No public runtime, external writes, real-user data, telemetry, or deployment promotion.
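
The counts in this table can be cross-checked with a few lines of arithmetic. The sketch below is not part of the repo; the `reconcile` helper and its lane names are hypothetical. It only confirms that the three lanes partition the raw rows and derives each lane's share of the raw total.

```python
# Counts copied from the Verified Local Metrics table.
RAW_ROWS = 74_225
SERVICEABLE = 45_564 + 837   # base serviceable rows + adjudicated fallback promotions
EXCLUDED = 12_572 + 4_905    # base exclusions + adjudicated exclusions
AMBIGUOUS = 10_347


def reconcile(raw: int, **lanes: int) -> dict[str, float]:
    """Check that the lanes partition the raw rows; return each lane's share in percent."""
    total = sum(lanes.values())
    if total != raw:
        raise ValueError(f"lanes sum to {total:,}, expected {raw:,}")
    return {name: 100 * count / raw for name, count in lanes.items()}


shares = reconcile(
    RAW_ROWS,
    serviceable=SERVICEABLE,
    excluded=EXCLUDED,
    ambiguous=AMBIGUOUS,
)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Running a check like this after each adjudication refresh would catch a lane total that drifts away from the raw row count.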

62.5% serviceable locally

23.5% excluded or not ICP

13.9% still ambiguous

02 - What Changed

Reviewer Lane Moved To Codex-Spark

Before the policy change, complete Claude-era batches 012 and 014 were merged through the stricter gate and added 192 accepted decisions. Incomplete Claude outputs for 013 and 015 were not merged. After Ali asked to stop Claude reviews, the live lane moved to gpt-5.3-codex-spark only.

Change	Outcome
Stopped Claude/Haiku/Sonnet reviewer lane	No new Claude outputs should be used from here.
Killed GPT-5.4-mini job	Do not use GPT-5.4-mini unless Ali changes policy.
Promoted Codex-Spark	Default reviewer model is now `gpt-5.3-codex-spark`.
Tested 200-row Spark batch	Worked once, but the second reviewer can hit context limits.
Settled on 100-row Spark batches	Safer repeatable throughput unit.

Spark batch	Prompt rows	Consensus decisions	Gate note
`gpt_001_200`	200	97	Two Spark reviewers; 6 exact-title SOC corrections.
`gpt_002_100`	100	47	Two Spark reviewers; 1 exact-title SOC correction.
`gpt_003_100`-`gpt_007_100`	500	419	Five 100-title batches; targeted repairs; exact-prompt gate passed.
`gpt_008_100`-`gpt_012_100`	500	317	Five more 100-title batches; exact-prompt gate passed.
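
To compare throughput across batch sizes, the table can be reduced to a consensus yield per batch group (consensus decisions divided by prompt rows). This is a minimal sketch; "yield" is a term introduced here for illustration, not a metric the repo reports.

```python
# (batch label, prompt rows, consensus decisions) copied from the Spark batch table.
SPARK_BATCHES = [
    ("gpt_001_200", 200, 97),
    ("gpt_002_100", 100, 47),
    ("gpt_003_100-gpt_007_100", 500, 419),
    ("gpt_008_100-gpt_012_100", 500, 317),
]


def consensus_yield(batches):
    """Return per-batch yield (decisions / prompt rows) plus overall totals."""
    per_batch = {label: decisions / rows for label, rows, decisions in batches}
    total_rows = sum(rows for _, rows, _ in batches)
    total_decisions = sum(decisions for _, _, decisions in batches)
    return per_batch, total_rows, total_decisions


per_batch, total_rows, total_decisions = consensus_yield(SPARK_BATCHES)
for label, ratio in per_batch.items():
    print(f"{label}: {ratio:.1%}")
print(f"overall: {total_decisions}/{total_rows} = {total_decisions / total_rows:.1%}")
```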

03 - What Was Built

Repo Surfaces

File	Role
`src/aina_data_engine/title_coverage.py`	First-pass ICP title coverage receipt.
`src/aina_data_engine/title_adjudication.py`	Deterministic routing and structured model-review decisions.
`src/aina_data_engine/cli.py`	Coverage, adjudication, merge-review, HF ingest, source authority, and validation commands.
`tests/test_icp_title_adjudication.py`	Prompt-window, parser, evidence-ref, SOC-normalization, and CLI regression tests.
`artifacts/validation/`	Machine-readable proof of current state.
`artifacts/review/`	Review inputs, prompts, merge receipts, and model-output evidence.

The merge gate now supports `--expected-review-prompt`, expected-row coverage receipts, off-prompt rejection, exact-title SOC normalization, and two-reviewer consensus.
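
The gate's core ideas (off-prompt rejection, expected-row coverage, and two-reviewer consensus) can be illustrated with a small sketch. This is not the code in `title_adjudication.py`: the `MergeReceipt` shape, the `merge_reviews` function, the decision labels, and the example titles are all hypothetical, and SOC normalization is left out.

```python
from dataclasses import dataclass, field


@dataclass
class MergeReceipt:
    """Hypothetical receipt shape; the real merge receipt may differ."""
    accepted: dict[str, str] = field(default_factory=dict)  # title -> agreed decision
    disagreements: list[str] = field(default_factory=list)  # reviewers disagreed
    off_prompt: list[str] = field(default_factory=list)     # titles not in the prompt
    missing: list[str] = field(default_factory=list)        # prompt titles a reviewer skipped

    @property
    def coverage_ok(self) -> bool:
        # Exact prompt coverage: nothing extra and nothing skipped.
        return not self.off_prompt and not self.missing


def merge_reviews(expected_titles, review_a, review_b):
    """Accept a decision only when both reviewers cover the title and agree."""
    expected = set(expected_titles)
    receipt = MergeReceipt()
    # Off-prompt rejection: any title outside the prompt is recorded, never accepted.
    receipt.off_prompt = sorted((set(review_a) | set(review_b)) - expected)
    for title in expected_titles:
        a, b = review_a.get(title), review_b.get(title)
        if a is None or b is None:
            receipt.missing.append(title)
        elif a == b:
            receipt.accepted[title] = a
        else:
            receipt.disagreements.append(title)
    return receipt


# Illustrative inputs: each review maps title -> decision label.
prompt = ["Account Executive", "Forklift Operator", "Growth Lead"]
a = {"Account Executive": "serve", "Forklift Operator": "exclude",
     "Growth Lead": "serve", "Chef": "exclude"}
b = {"Account Executive": "serve", "Forklift Operator": "exclude",
     "Growth Lead": "hold"}
receipt = merge_reviews(prompt, a, b)
print(receipt.accepted, receipt.disagreements, receipt.off_prompt, receipt.coverage_ok)
```

In this sketch a disagreement simply stays out of `accepted`, so the row remains in the ambiguous queue rather than being decided by a single reviewer.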

04 - Process Now

Codex-Spark Loop

The repeatable loop: generate a 100-row prompt, run two independent Codex-Spark sessions, preserve the raw outputs, merge against the expected prompt, refresh the adjudication artifacts, validate, and commit locally.

Codex - Merge Spark Review - exact prompt coverage required

uv run aina-data-engine --root /srv/aina/aina-data-engine-room title-adjudication-merge-reviews \
  --batch-id icp_ambiguous_batch_gpt_013_100_codexspark \
  --expected-review-prompt artifacts/review/model_outputs/icp_batch_gpt_013_100_prompt.md \
  --review-output "Codex Spark A:gpt-5.3-codex-spark=artifacts/review/model_outputs/icp_batch_gpt_013_100_codexspark_a.json" \
  --review-output "Codex Spark B:gpt-5.3-codex-spark=artifacts/review/model_outputs/icp_batch_gpt_013_100_codexspark_b.json"

Watch-out: model success is not acceptance. The merge receipt is the acceptance gate.
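
Because the merge command differs between batches only in the batch tag, it can be generated rather than retyped. The sketch below assumes that later batches keep the file-naming pattern of the `gpt_013_100` command above; `merge_command` is a hypothetical helper, not part of the CLI.

```python
from pathlib import PurePosixPath

REVIEWER_MODEL = "gpt-5.3-codex-spark"
OUTPUT_DIR = PurePosixPath("artifacts/review/model_outputs")


def merge_command(root: str, batch: str) -> list[str]:
    """Build the merge argv for a batch tag such as 'gpt_013_100'."""
    argv = [
        "uv", "run", "aina-data-engine", "--root", root,
        "title-adjudication-merge-reviews",
        "--batch-id", f"icp_ambiguous_batch_{batch}_codexspark",
        "--expected-review-prompt", str(OUTPUT_DIR / f"icp_batch_{batch}_prompt.md"),
    ]
    # One --review-output per reviewer, in "label:model=path" form.
    for suffix in ("a", "b"):
        output = OUTPUT_DIR / f"icp_batch_{batch}_codexspark_{suffix}.json"
        label = f"Codex Spark {suffix.upper()}:{REVIEWER_MODEL}"
        argv += ["--review-output", f"{label}={output}"]
    return argv


argv = merge_command("/srv/aina/aina-data-engine-room", "gpt_013_100")
print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)`, which avoids shell quoting issues with the spaces in the reviewer labels.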

05 - Source Provenance

HF, O*NET, GDPval Chain

Source signal	Evidence
Hugging Face downloaded files	15
Hugging Face downloaded bytes	186,743,670
GDPval task count	220
EconomicIndex legacy signal count	821
Mapped SOC count	907
O*NET occupation rows	1,016
O*NET task rows	18,796

06 - Validation

Current Verification

Command	Result
`uv run ruff check .`	Passed.
`uv run pytest -q`	171 passed.
`uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate`	Passed.

07 - What AINA Can Serve

Current Capability Boundary

AINA can serve 46,401 title rows locally today, with the strongest coverage in white-collar knowledge-work areas: sales, marketing, operations, customer success, analytics, finance, HR, legal/compliance, product, administration, education, strategy, and consulting. It should not serve the 17,477 title rows that are excluded or outside the ICP, and 10,347 rows still need adjudication.

Boundary	Status
Local internal personalization	Allowed
Fallback precision caveats	Required for fallback-routed titles
Public runtime	Not allowed
External writes	Not allowed
Real-user data	Not allowed
Production telemetry	Not allowed
Deployment promotion	Not allowed
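
One way to keep this boundary from eroding is to encode it as a deny-by-default guard. The sketch below is illustrative only: the key names and the `require_allowed` helper are assumptions, not existing repo code.

```python
# Boundary statuses copied from the table above; key names are illustrative.
# Fallback precision caveats are a requirement on fallback-routed titles,
# not an allow/deny switch, so they are not modeled here.
BOUNDARY = {
    "local_internal_personalization": True,
    "public_runtime": False,
    "external_writes": False,
    "real_user_data": False,
    "production_telemetry": False,
    "deployment_promotion": False,
}


def require_allowed(action: str) -> None:
    """Raise unless the action is explicitly allowed; unknown actions are denied."""
    if not BOUNDARY.get(action, False):
        raise PermissionError(f"{action} is outside the current capability boundary")


require_allowed("local_internal_personalization")  # passes silently
try:
    require_allowed("external_writes")
except PermissionError as exc:
    print(exc)
```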

08 - Resume Prompt

Prompt For Next Agent

Codex - Resume ICP Title Coverage - Codex-Spark only

Continue in /srv/aina/aina-data-engine-room on branch ali/personalization-engine-mission-2026-06-09.

Goal: continue the AINA Personalization Engine ICP title-coverage milestone on the VDS, self-contained in local git, no push/merge ceremony.

Reviewer policy from Ali: stop Claude reviews; use Codex/Codex-Spark only. Do not use Haiku, Sonnet, or GPT-5.4-mini unless Ali explicitly changes the policy.

Current verified metrics:
- serviceable rows: 46,401
- excluded/not ICP rows: 17,477
- remaining adjudication queue: 10,347
- structured model decisions applied: 2,545
- production/external/real-user/deployment unlocks: 0

Use the current 100-row prompt batch, produce two independent gpt-5.3-codex-spark review outputs, merge with --expected-review-prompt, then run title-adjudication, ruff, pytest, and validate, and commit locally.

Watch-out: exact prompt coverage and two-reviewer consensus are the acceptance criteria.

Where to start

Start with the current 100-row prompt batch and keep shrinking the 10,347-row ambiguity queue through Codex-Spark-only two-reviewer consensus.

Ali Mehdi Mukadam - co-authored with Codex - 2026-06-10

topics: ['aina-data-engine', 'personalization-engine', 'icp-title-coverage']
subtopics: ['codex-spark-review', 'huggingface-provenance', 'vds-local-execution', 'handoff']
