AINA data engine room · technical handoff · Ali Mehdi Mukadam + Codex · 2026-06-11

Top Worked Title Readiness

A restartable map of the new Top 1000 title lane, its artifacts, validations, and next slice.

The Single Idea

This milestone adds a replayable Top 1000 Worked Titles lane. It ranks raw title popularity, produces an ICP-serviceable top 1000, resolves generic titles into deterministic families where possible, writes a 50-row semantic sample, wires validation, and keeps all real beta and production permissions blocked.

1,000 raw top titles
1,000 ICP top titles
370 generic unresolved rows
Validation status: pass
01

Source Truth

The repo root is /srv/aina/aina-data-engine-room, on branch ali/personalization-engine-mission-2026-06-09. The baseline before this milestone was commit cbe31bd ("Add semantic title coverage runtime readiness"). The primary validation truth is artifacts/validation/full_validation.json.

02

What Was Built

New module: src/aina_data_engine/top_worked_title_readiness.py. New command: uv run aina-data-engine --root /srv/aina/aina-data-engine-room top-worked-title-readiness --limit 1000.

The resolver keeps a title's existing function when that function is non-generic, but overrides it with a boundary family when the title clearly indicates healthcare, software delivery, engineering/hardware, or physical/frontline work.
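The override order described above can be sketched roughly as follows. This is a minimal illustration, not the code in top_worked_title_readiness.py: the keyword lists, the function names boundary_family and resolve, and the labels "general_business" and "unresolved_generic" are all hypothetical, chosen only to show a boundary family taking precedence over an existing function.

```python
from typing import Optional

# Hypothetical keyword rules, checked in order. The real resolver's rules
# are not shown in this handoff; these exist only to illustrate precedence.
BOUNDARY_RULES = [
    ("healthcare", ("nurse", "medical", "clinical", "physician")),
    ("software_delivery", ("software", "developer", "devops")),
    ("engineering_hardware", ("engineer", "hardware", "mechanical", "electrical")),
    ("physical_frontline", ("driver", "cashier", "janitor")),
]


def boundary_family(title: str) -> Optional[str]:
    """Return the first boundary family whose keywords appear in the title."""
    lowered = title.lower()
    for family, keywords in BOUNDARY_RULES:
        if any(keyword in lowered for keyword in keywords):
            return family
    return None


def resolve(title: str, existing_function: Optional[str]) -> str:
    """Boundary family wins; otherwise keep a non-generic existing function."""
    family = boundary_family(title)
    if family is not None:
        return family
    if existing_function and existing_function != "general_business":
        return existing_function
    # Nothing deterministic applies, so the row stays unresolved.
    return "unresolved_generic"
```

Because software keywords are checked before the broader "engineer" keyword, a title such as "Software Engineer" lands in software delivery while "Mechanical Engineer" lands in engineering/hardware. A title like "Store Manager" with only a generic function falls through to the unresolved bucket.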

03

Artifacts

04

Current Metrics

Metric                                  Count
Raw fallback rows                       684
Raw serve-now rows                      17
Raw excluded/not-ICP rows               295
Raw reviewed holds                      4
ICP serve-now rows                      29
ICP fallback rows                       971
ICP general-business rows               517
ICP unresolved general-business rows    370
Production unlocks                      0
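A quick arithmetic check of the counts above, as a hedged sketch. It assumes the four raw categories are mutually exclusive and together cover the raw top 1000, that ICP serve-now and ICP fallback partition the ICP top 1000, and that the unresolved rows are a subset of the general-business rows. The handoff does not state these relationships explicitly, although the totals are consistent with them. The dictionary keys are illustrative names, not fields from the validation JSON.

```python
# Counts copied from the metrics table; key names are hypothetical.
RAW = {"fallback": 684, "serve_now": 17, "excluded_not_icp": 295, "reviewed_holds": 4}
ICP = {"serve_now": 29, "fallback": 971}
ICP_GENERAL_BUSINESS = 517
ICP_UNRESOLVED_GENERAL_BUSINESS = 370
LIMIT = 1000

raw_total = sum(RAW.values())
icp_total = sum(ICP.values())

# Assumption: unresolved rows are a subset of the general-business rows.
resolved_general_business = ICP_GENERAL_BUSINESS - ICP_UNRESOLVED_GENERAL_BUSINESS

print(f"raw total: {raw_total}")
print(f"ICP total: {icp_total}")
print(f"general-business rows resolved: {resolved_general_business}")
print(f"unresolved share of ICP top {LIMIT}: {ICP_UNRESOLVED_GENERAL_BUSINESS / LIMIT:.1%}")
```

Under those assumptions both lanes sum to the --limit of 1000, and 147 of the 517 general-business rows already resolve into a family.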
05

Semantic Result

The 50-row sample confirms that clinical, software, engineering/hardware, physical/frontline, warehouse, technician, and medical assistant rows are treated as boundary surfaces: each is either blocked or caveated. It also leaves store manager and general manager as high-priority unresolved generic titles.

The semantic pass changed the implementation: engineer titles no longer collapse into software delivery, and medical assistant is no longer classified as plain administration.
06

Resume Commands

cd /srv/aina/aina-data-engine-room
git status --short
uv run aina-data-engine --root /srv/aina/aina-data-engine-room top-worked-title-readiness --limit 1000
jq '.top_worked_title_readiness_summary' artifacts/validation/full_validation.json
jq -s '.[0:50]' artifacts/validation/top_worked_title_readiness_v1_semantic_sample_50.jsonl
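For inspection from Python rather than jq, the two read commands above could be approximated as below. This is a sketch under assumptions: the helper names read_summary and read_sample are hypothetical, and the only structure relied on is what the jq commands already imply, namely a top-level top_worked_title_readiness_summary key and one JSON value per line in the sample file.

```python
import json
from pathlib import Path


def read_summary(validation_path):
    """Return the readiness summary, or None if the key is absent (like jq's null)."""
    data = json.loads(Path(validation_path).read_text(encoding="utf-8"))
    return data.get("top_worked_title_readiness_summary")


def read_sample(sample_path, limit=50):
    """Return up to `limit` parsed rows from a JSONL file, skipping blank lines."""
    rows = []
    with Path(sample_path).open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue
            rows.append(json.loads(line))
            if len(rows) >= limit:
                break
    return rows


if __name__ == "__main__":
    # Paths are relative to the repo root, matching the commands above.
    print(read_summary("artifacts/validation/full_validation.json"))
    sample = read_sample(
        "artifacts/validation/top_worked_title_readiness_v1_semantic_sample_50.jsonl"
    )
    print(len(sample))
```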
Where to start
Resolve the top 100 unresolved generic rows, then harden reusable packets by family and risk.