The repo now has a self-contained, source-backed local engine that routes every title on the ICP title surface to one of three outcomes: serve locally now, serve with fallback caveats, or hold until stronger evidence exists. In its current working state it can serve 46,401 title rows locally, treats 17,477 rows as excluded or outside the ICP wedge, and leaves 10,347 rows ambiguous. From here on, the reviewer lane is Codex-Spark only.
74,225 raw title rows
46,401 serviceable locally
17,477 excluded or not ICP
10,347 still ambiguous
01 - Current State
Verified Local Metrics
| Measure | Count | Meaning |
| --- | --- | --- |
| Raw input title rows | 74,225 | All title rows in the source-backed coverage receipt. |
| Deduped title count | 74,213 | Title-level working surface. |
| Current serviceable rows | 46,401 | 45,564 base serviceable rows plus 837 adjudicated fallback promotions. |
| Current excluded/not ICP rows | 17,477 | 12,572 base exclusions plus 4,905 adjudicated exclusions. |
| Remaining ambiguous rows | 10,347 | Rows still needing deterministic evidence or Codex-Spark adjudication. |
| Structured model decisions applied | 2,545 | Accepted decisions in the adjudication input. |
| Production unlocks | 0 | No public runtime, external writes, real-user data, telemetry, or deployment promotion. |
62.5% serviceable locally
23.5% excluded or not ICP
13.9% still ambiguous
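The counts in this section can be cross-checked with a few lines of arithmetic. The sketch below is not part of the repo: the variable and function names are illustrative, and only the numbers come from the metrics table. It confirms that the three buckets partition the raw title rows and computes each bucket's share of the 74,225 raw rows.

```python
# Illustrative cross-check of the reported counts; all names are hypothetical.
RAW_ROWS = 74_225

serviceable = 45_564 + 837   # base serviceable rows + adjudicated fallback promotions
excluded = 12_572 + 4_905    # base exclusions + adjudicated exclusions
ambiguous = 10_347           # rows still awaiting evidence or adjudication

# The three buckets should account for every raw title row exactly once.
assert serviceable + excluded + ambiguous == RAW_ROWS


def share(count: int, total: int = RAW_ROWS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {
    "serviceable": share(serviceable),
    "excluded": share(excluded),
    "ambiguous": share(ambiguous),
}
print(shares)
```

Because each share is rounded independently, the three percentages need not sum to exactly 100.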
02 - What Changed
Reviewer Lane Moved To Codex-Spark
Before the policy change, the completed Claude-era batches 012 and 014 were merged through the stricter gate, adding 192 accepted decisions. The incomplete Claude outputs for batches 013 and 015 were not merged. After Ali asked to stop Claude reviews, the live lane moved to gpt-5.3-codex-spark only.
| Change | Outcome |
| --- | --- |
| Stopped Claude/Haiku/Sonnet reviewer lane | No new Claude outputs should be used from here. |
| Killed GPT-5.4-mini job | Do not use GPT-5.4-mini unless Ali changes policy. |
| Promoted Codex-Spark | Default reviewer model is now gpt-5.3-codex-spark. |
| Tested 200-row Spark batch | Worked once, but the second reviewer can hit context limits. |
| Settled on 100-row Spark batches | Safer repeatable throughput unit. |
| Spark batch | Prompt rows | Consensus decisions | Gate note |
| --- | --- | --- | --- |
| gpt_001_200 | 200 | 97 | Two Spark reviewers; 6 exact-title SOC corrections. |
| gpt_002_100 | 100 | 47 | Two Spark reviewers; 1 exact-title SOC correction. |
| gpt_003_100-gpt_007_100 | 500 | 419 | Five 100-title batches; targeted repairs; exact-prompt gate passed. |
| gpt_008_100-gpt_012_100 | 500 | 317 | Five more 100-title batches; exact-prompt gate passed. |
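To compare batch groups, it can help to express consensus decisions as a fraction of prompt rows. The sketch below is illustrative only: "consensus yield" is a name introduced here, not a metric defined by the repo, and it assumes each consensus decision corresponds to exactly one prompt row. The figures are copied from the Spark batch table.

```python
# Consensus yield per Spark batch group; the metric name is hypothetical.
batches = [
    ("gpt_001_200", 200, 97),
    ("gpt_002_100", 100, 47),
    ("gpt_003_100-gpt_007_100", 500, 419),
    ("gpt_008_100-gpt_012_100", 500, 317),
]


def consensus_yield(prompt_rows: int, consensus: int) -> float:
    """Percentage of prompted rows that ended in a consensus decision."""
    return round(100 * consensus / prompt_rows, 1)


yields = {name: consensus_yield(rows, agreed) for name, rows, agreed in batches}
total_rows = sum(rows for _, rows, _ in batches)
total_consensus = sum(agreed for _, _, agreed in batches)
overall = consensus_yield(total_rows, total_consensus)

for name, pct in yields.items():
    print(f"{name}: {pct}%")
print(f"overall: {total_consensus}/{total_rows} = {overall}%")
```

Under that assumption, the ratio shows how many rows from each batch group remain in the ambiguity queue after a two-reviewer pass.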
03 - What Was Built
Repo Surfaces
| File | Role |
| --- | --- |
| src/aina_data_engine/title_coverage.py | First-pass ICP title coverage receipt. |
| src/aina_data_engine/title_adjudication.py | Deterministic routing and structured model-review decisions. |
| src/aina_data_engine/cli.py | Coverage, adjudication, merge-review, HF ingest, source authority, and validation commands. |
| tests/test_icp_title_adjudication.py | Prompt-window, parser, evidence-ref, SOC-normalization, and CLI regression tests. |
| artifacts/validation/ | Machine-readable proof of current state. |
| artifacts/review/ | Review inputs, prompts, merge receipts, and model-output evidence. |
The merge gate now supports --expected-review-prompt, expected-row coverage receipts, off-prompt rejection, exact-title SOC normalization, and two-reviewer consensus.
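The gate behavior described above can be pictured with a minimal sketch. This is not the code in title_adjudication.py or cli.py: the `MergeReceipt` class, the `merge_two_reviewers` function, the decision labels, and the sample titles are all hypothetical, and exact-title SOC normalization is left out. It shows only two of the ideas: decisions for titles outside the expected prompt are rejected, and a title is accepted only when both reviewers return the same decision.

```python
from dataclasses import dataclass, field


@dataclass
class MergeReceipt:
    """Hypothetical receipt summarizing one two-reviewer merge."""
    accepted: dict = field(default_factory=dict)        # title -> agreed decision
    off_prompt: list = field(default_factory=list)      # titles not in the expected prompt
    disagreements: list = field(default_factory=list)   # titles where reviewers differ
    missing: list = field(default_factory=list)         # expected titles lacking two decisions


def merge_two_reviewers(expected_titles, review_a, review_b):
    """Accept a decision only when it is on-prompt and both reviewers agree."""
    expected = set(expected_titles)
    receipt = MergeReceipt()
    # Off-prompt rejection: record anything outside the expected prompt, never accept it.
    receipt.off_prompt = sorted((set(review_a) | set(review_b)) - expected)
    for title in expected_titles:
        a, b = review_a.get(title), review_b.get(title)
        if a is None or b is None:
            receipt.missing.append(title)        # expected-row coverage gap
        elif a == b:
            receipt.accepted[title] = a          # two-reviewer consensus
        else:
            receipt.disagreements.append(title)
    return receipt


# Made-up titles and labels, for illustration only.
receipt = merge_two_reviewers(
    ["Account Executive", "Data Analyst", "Welder"],
    {"Account Executive": "serve", "Data Analyst": "serve", "Chef": "exclude"},
    {"Account Executive": "serve", "Data Analyst": "hold"},
)
print(receipt)
```

In this toy run only "Account Executive" is accepted. "Chef" is off-prompt, "Data Analyst" is a disagreement, and "Welder" is uncovered, so all three would stay out of the adjudication input.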
04 - Process Now
Codex-Spark Loop
The repeatable loop is to generate a 100-row prompt, run two independent Codex-Spark sessions, preserve raw outputs, merge with the expected prompt, refresh adjudication artifacts, validate, and commit locally.
Watch-out: model success is not acceptance. The merge receipt is the acceptance gate.
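For a rough sense of scale, the remaining queue can be divided into 100-row batches. The helper below is a hypothetical planning aid, not a repo command. It assumes every ambiguous row is prompted exactly once, so the real number of loop iterations will differ: rows that fail to reach consensus need another pass, and deterministic evidence may clear others without review.

```python
import math


def plan_batches(queue_size: int, batch_size: int = 100):
    """Return (number_of_batches, size_of_last_batch) for a queue of rows."""
    if queue_size <= 0:
        return 0, 0
    count = math.ceil(queue_size / batch_size)
    last = queue_size - (count - 1) * batch_size
    return count, last


# 10,347 is the remaining ambiguous row count reported in this document.
batch_count, last_batch = plan_batches(10_347)
print(batch_count, last_batch)
```

With the current queue this gives 104 batches, the last holding 47 rows.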
05 - Source Provenance
HF, O*NET, GDPval Chain
| Source signal | Evidence |
| --- | --- |
| Hugging Face downloaded files | 15 |
| Hugging Face downloaded bytes | 186,743,670 |
| GDPval task count | 220 |
| EconomicIndex legacy signal count | 821 |
| Mapped SOC count | 907 |
| O*NET occupation rows | 1,016 |
| O*NET task rows | 18,796 |
06 - Validation
Current Verification
| Command | Result |
| --- | --- |
| uv run ruff check . | Passed. |
| uv run pytest -q | 171 passed. |
| uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate | Pass. |
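The three checks can be chained so that a failure stops the sequence before a local commit. The wrapper below is a sketch rather than something that exists in the repo. The command lines are copied from the table above, while `CHECKS`, `run_checks`, and the injectable `runner` parameter are illustrative names.

```python
import subprocess

# Commands copied from the verification table, in the order listed there.
CHECKS = [
    ["uv", "run", "ruff", "check", "."],
    ["uv", "run", "pytest", "-q"],
    ["uv", "run", "aina-data-engine", "--root",
     "/srv/aina/aina-data-engine-room", "validate"],
]


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run each check in order; return the first failing command, or None."""
    for cmd in checks:
        result = runner(cmd)
        if result.returncode != 0:
            return cmd  # stop at the first failure so later steps do not mask it
    return None
```

Calling `run_checks()` with no arguments from the repo root runs the real commands. Passing a stub `runner` allows a dry run without invoking `uv`.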
07 - What AINA Can Serve
Current Capability Boundary
AINA can serve 46,401 title rows locally today. Coverage is strongest in white-collar knowledge-work areas: sales, marketing, operations, customer success, analytics, finance, HR, legal/compliance, product, administration, education, strategy, and consulting. It should not serve the 17,477 excluded or not-ICP title rows, and 10,347 rows still need adjudication.
| Boundary | Status |
| --- | --- |
| Local internal personalization | Allowed |
| Fallback precision caveats | Required for fallback-routed titles |
| Public runtime | Not allowed |
| External writes | Not allowed |
| Real-user data | Not allowed |
| Production telemetry | Not allowed |
| Deployment promotion | Not allowed |
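One way to keep this boundary from drifting is to encode it as data and check it before any action. The sketch below is hypothetical and not taken from the repo: the key names, the `"fallback"` route label, and both functions are assumptions made for illustration. Only the allowed and not-allowed statuses come from the table above.

```python
from typing import Optional

# Statuses copied from the boundary table; key names are illustrative.
BOUNDARIES = {
    "local_internal_personalization": True,
    "public_runtime": False,
    "external_writes": False,
    "real_user_data": False,
    "production_telemetry": False,
    "deployment_promotion": False,
}


def require_allowed(capability: str) -> None:
    """Raise if a capability is outside the boundary; unknown names are denied."""
    if not BOUNDARIES.get(capability, False):
        raise PermissionError(f"{capability} is not allowed in the current state")


def caveat_for(route: str) -> Optional[str]:
    """Fallback-routed titles must carry a precision caveat; other routes need none."""
    return "fallback precision caveat" if route == "fallback" else None


# Every capability other than local personalization is closed, hence zero unlocks.
production_unlocks = sum(
    allowed for name, allowed in BOUNDARIES.items()
    if name != "local_internal_personalization"
)
print(production_unlocks)
```

Denying unknown capability names by default means a new capability has to be added to the table explicitly before anything can use it.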
08 - Resume Prompt
Prompt For Next Agent
Codex - Resume ICP Title Coverage - Codex-Spark only
Continue in /srv/aina/aina-data-engine-room on branch ali/personalization-engine-mission-2026-06-09.
Goal: continue the AINA Personalization Engine ICP title-coverage milestone on the VDS, self-contained in local git, no push/merge ceremony.
Reviewer policy from Ali: stop Claude reviews; use Codex/Codex-Spark only. Do not use Haiku, Sonnet, or GPT-5.4-mini unless Ali explicitly changes the policy.
Current verified metrics:
- serviceable rows: 46,401
- excluded/not ICP rows: 17,477
- remaining adjudication queue: 10,347
- structured model decisions applied: 2,545
- production/external/real-user/deployment unlocks: 0
Use the current 100-row prompt batch, collect two independent gpt-5.3-codex-spark outputs, merge with --expected-review-prompt, run title-adjudication, then run ruff, pytest, and validate, and commit locally.
Watch-out: exact prompt coverage and two-reviewer consensus are the acceptance criteria.
Where to start
Start with the current 100-row prompt batch and keep shrinking the 10,347-row ambiguity queue through Codex-Spark-only two-reviewer consensus.