Top Worked Title Readiness
A restartable map of the new Top 1000 title lane, its artifacts, validations, and next slice.
This milestone adds a replayable Top 1000 Worked Titles lane. The lane ranks titles by raw popularity, produces an ICP-serviceable top 1000, resolves generic titles into deterministic families where possible, writes a 50-row semantic sample, wires in validation, and keeps all real beta and production permissions blocked.
Source Truth
Repo root is /srv/aina/aina-data-engine-room, on branch ali/personalization-engine-mission-2026-06-09. The baseline before this milestone was commit cbe31bd ("Add semantic title coverage runtime readiness"). The primary validation truth is artifacts/validation/full_validation.json.
What Was Built
New module: src/aina_data_engine/top_worked_title_readiness.py. New command: uv run aina-data-engine --root /srv/aina/aina-data-engine-room top-worked-title-readiness --limit 1000.
The resolver preserves an existing non-generic function, but overrides it with a boundary family when the title clearly indicates healthcare, software delivery, engineering/hardware, or physical/frontline work.
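The precedence described above can be sketched roughly as follows. This is an illustration only, not the code in src/aina_data_engine/top_worked_title_readiness.py: the keyword lists, the family labels, the `general_business` marker, and the `resolve_family` name are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical keyword lists per boundary family. The real rules are not
# shown in this document; these exist only to make the sketch runnable.
BOUNDARY_KEYWORDS = {
    "healthcare": ("nurse", "physician", "medical assistant", "clinical"),
    "software_delivery": ("software engineer", "developer", "devops"),
    "engineering_hardware": ("mechanical engineer", "electrical engineer", "hardware"),
    "physical_frontline": ("warehouse", "technician", "driver", "cashier"),
}

# Assumption: a missing function or "general_business" counts as generic.
GENERIC_FUNCTIONS = {None, "general_business"}


def resolve_family(title: str, existing_function: Optional[str]) -> Optional[str]:
    """Return a family for the title, or None if it stays an unresolved generic."""
    normalized = " ".join(title.lower().split())

    # 1. A clear boundary signal overrides whatever function was assigned before.
    #    Dict order is insertion order, so the result is deterministic.
    for family, keywords in BOUNDARY_KEYWORDS.items():
        if any(keyword in normalized for keyword in keywords):
            return family

    # 2. Otherwise an existing non-generic function is preserved as-is.
    if existing_function not in GENERIC_FUNCTIONS:
        return existing_function

    # 3. Generic title with no boundary signal: left unresolved.
    return None
```

Under these assumed rules, `resolve_family("Registered Nurse", "operations")` returns `"healthcare"`, `resolve_family("Account Executive", "sales")` keeps `"sales"`, and `resolve_family("General Manager", "general_business")` returns `None`. Plain substring matching is used only for brevity; it would misfire on titles such as "hardware store manager", so a real resolver would need stricter matching.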
Artifacts
- artifacts/validation/top_worked_title_readiness_v1.json
- artifacts/validation/top_worked_title_readiness_v1_raw_top_1000.jsonl
- artifacts/validation/top_worked_title_readiness_v1_icp_serviceable_top_1000.jsonl
- artifacts/validation/top_worked_title_readiness_v1_semantic_sample_50.jsonl
- artifacts/reports/top_worked_title_readiness_v1.md and .html
- tests/test_top_worked_title_readiness.py
Current Metrics
| Metric | Count |
|---|---|
| Raw fallback rows | 684 |
| Raw serve-now rows | 17 |
| Raw excluded/not-ICP rows | 295 |
| Raw reviewed holds | 4 |
| ICP serve-now rows | 29 |
| ICP fallback rows | 971 |
| ICP general-business rows | 517 |
| ICP unresolved general-business rows | 370 |
| Production unlocks | 0 |
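A quick arithmetic check on the table, as a sketch. It assumes the four raw categories partition the raw top 1000, that serve-now and fallback partition the ICP top 1000, and that the unresolved general-business rows are a subset of the general-business rows. The dictionary keys are illustrative names, not fields from the artifacts.

```python
LIMIT = 1000  # matches --limit 1000

# Counts copied from the metrics table; key names are illustrative only.
raw = {"fallback": 684, "serve_now": 17, "excluded_not_icp": 295, "reviewed_holds": 4}
icp = {"serve_now": 29, "fallback": 971}
icp_general_business = 517
icp_unresolved_general_business = 370

raw_total = sum(raw.values())
icp_total = sum(icp.values())

# Both rankings should account for every one of their 1000 rows.
assert raw_total == LIMIT, raw_total
assert icp_total == LIMIT, icp_total

# Assumed subset relation: unresolved rows are part of the general-business rows.
resolved_general_business = icp_general_business - icp_unresolved_general_business
unresolved_share = icp_unresolved_general_business / icp_general_business

print(f"resolved general-business rows: {resolved_general_business}")
print(f"unresolved share of general-business rows: {unresolved_share:.1%}")
```

Both totals come to 1000, which is consistent with the partition assumption. On the subset assumption, 147 general-business rows are resolved and 370 remain open.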
Semantic Result
The 50-row sample confirms clinical, software, engineering/hardware, physical/frontline, warehouse, technician, and medical assistant rows are blocked or caveated boundary surfaces. It also leaves store manager and general manager as high-priority unresolved generic titles.
Resume Commands
    cd /srv/aina/aina-data-engine-room
    git status --short
    uv run aina-data-engine --root /srv/aina/aina-data-engine-room top-worked-title-readiness --limit 1000
    jq '.top_worked_title_readiness_summary' artifacts/validation/full_validation.json
    jq -s '.[0:50]' artifacts/validation/top_worked_title_readiness_v1_semantic_sample_50.jsonl
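Where jq is not available, the last command can be approximated in Python. This is a minimal sketch that assumes only that the sample file holds one JSON object per line; `head_jsonl` is a hypothetical helper, not part of the aina-data-engine CLI.

```python
import json
from itertools import islice
from pathlib import Path


def head_jsonl(path, limit=50):
    """Return the first `limit` records of a JSON Lines file as a list."""
    with Path(path).open(encoding="utf-8") as handle:
        # Skip blank lines so a trailing newline does not raise a decode error.
        non_blank = (line for line in handle if line.strip())
        return [json.loads(line) for line in islice(non_blank, limit)]
```

Run from the repo root, `head_jsonl("artifacts/validation/top_worked_title_readiness_v1_semantic_sample_50.jsonl")` returns roughly what `jq -s '.[0:50]'` prints, as a list of Python objects.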
Next slice: resolve the top 100 unresolved generic rows, then harden reusable packets by family and risk.