Session Closeout
A paused local checkpoint for the personalization data engine: what changed, what is in the repo, what remains, and where Gemini embeddings fit next.
This run moved the engine from a large title map toward a real local personalization data engine. It now has 110,184 occupations, safer resolver behavior, promoted evidence packs wired into packets, and labeled evidence fan-out covering 14,114 of the 95,251 eligible non-excluded occupations (14.82%). It is still local-only and is not a production runtime.
The Engine Got Broader And More Honest
| Area | What changed | Proof |
|---|---|---|
| Title expansion | Warehouse reached 110,184 occupations. | title_expansion_v1.json |
| Runtime semantic replay | First 100 semantic rows processed through local reviewer/replay artifacts. | runtime_semantic_decision_replay_v1.json |
| Evidence packs | RoleIntelligencePacket.evidence_pack is wired. | evidence_enrichment.py |
| Resolver | Description/title serving now avoids awkward grade-fragment answers. | serving_probe_v1.md |
| Evidence fan-out | Evidence matching is labeled by trust path: title, SOC, derived SOC, family. | evidence_fanout_probe_v1.json |
| Exclusion safety | Excluded roles skip evidence-pack attachment. | packets.py |
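
The evidence fan-out and exclusion safety rows can be illustrated with a minimal sketch. Everything named here (`Occupation`, `TRUST_PATHS`, `attach_evidence`, the `packs` layout) is hypothetical and not taken from packets.py or evidence_enrichment.py. The priority order is also an assumption: it simply follows the order in which the trust paths are listed above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical trust-path labels, ordered from most to least direct (assumed order).
TRUST_PATHS = ("title", "soc", "derived_soc", "family")


@dataclass
class Occupation:
    title: str
    excluded: bool = False
    soc: Optional[str] = None
    derived_soc: Optional[str] = None
    family: Optional[str] = None


def attach_evidence(occ: Occupation, packs: dict) -> Optional[dict]:
    """Return a labeled evidence match, or None if excluded or unmatched.

    `packs` maps (trust_path, key) -> pack id; this layout is an assumption.
    """
    if occ.excluded:
        return None  # excluded roles never receive an evidence pack
    keys = {
        "title": occ.title,
        "soc": occ.soc,
        "derived_soc": occ.derived_soc,
        "family": occ.family,
    }
    # Try each trust path in order and label the first one that matches.
    for path in TRUST_PATHS:
        key = keys[path]
        if key is not None and (path, key) in packs:
            return {"pack_id": packs[(path, key)], "trust_path": path}
    return None
```

Carrying the `trust_path` label on every match is what lets a downstream consumer caveat weaker matches, such as a family-level match, differently from a direct title match.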
The Checkpoint Is Measurable
| Metric | Value |
|---|---|
| Total occupations | 110,184 |
| Eligible non-excluded occupations | 95,251 |
| Excluded occupations skipped for evidence | 14,933 |
| Eligible occupations with evidence packs | 14,114 |
| Eligible evidence coverage rate | 14.82% |
| Serving probe OOD failures | 0/34 |
| Last full suite before final report edits | 218 passed in 110.03s |
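
The counts in this table are internally consistent, which a few lines of arithmetic can confirm. The variable names are illustrative only; the numbers are the ones reported above.

```python
# Figures copied from the checkpoint table.
total = 110_184
excluded = 14_933
with_packs = 14_114

# Eligible occupations are the total minus the excluded ones.
eligible = total - excluded

# Coverage is the share of eligible occupations that have an evidence pack.
coverage_pct = round(100 * with_packs / eligible, 2)

print(eligible)      # 95251
print(coverage_pct)  # 14.82
```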
What Lives Where
| Area | Purpose |
|---|---|
| src/aina_data_engine/ | Main package: ingestion, resolver, packets, runtime, semantic gates, Hugging Face, source authority, readiness, reports. |
| artifacts/aina_data_engine.duckdb | Local warehouse used by serving and probes. |
| data/sources/title_expansion/ | Provenanced WRTMJ, gold-IP, and mapping title expansion inputs. |
| evidence/canonical/ | Canonical affordance, workflow, responsibility, task, and qualitative corpora. |
| artifacts/validation/ | Machine-readable validation and replay receipts. |
| artifacts/reports/ | Human-readable probe and validation reports. |
| docs/handoff/ | Operational memory, founder reports, and inventory reports. |
| scripts/ | Rebuild and probe scripts. |
| tests/ | Pytest coverage for contracts, runtime, evidence, semantic review, readiness, and telemetry. |
What It Can And Cannot Do
The engine can resolve many real job titles into role packets, attach promoted affordance evidence for a growing subset of roles, generate runtime/curriculum/sandbox/evaluator artifacts, and prove serving behavior with local probes.
It cannot yet serve every ICP title with high-quality evidence. It is not a production API/UI with auth, tenant isolation, observability, or live learner feedback. It has not used the quarantined responsibility_registry_v2.
Use Gemini As An Evaluated Sidecar First
| Use | Why | Guardrail |
|---|---|---|
| Title-to-pack similarity | May improve evidence-pack selection beyond token overlap. | Compare against evidence_fanout_probe_v1 before changing runtime. |
| Source mining | Can find likely responsibilities/workflows for uncovered titles. | Write sidecar artifacts only. |
| Pack clustering | Can de-dupe and prioritize similar packs. | Preserve provenance. |
| Anomaly detection | Can catch mismatched content like sales advice for engineering roles. | Use as evaluator signal, not runtime authority. |
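
To make the title-to-pack similarity row concrete, here is a minimal sketch of the two scoring styles being compared. It makes no embedding API calls: the vectors are assumed to have been produced elsewhere by an embeddings model. The Jaccard token overlap is only a stand-in for whatever the deterministic baseline actually computes, and all function names are hypothetical.

```python
import math


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase whitespace tokens (a stand-in baseline)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_packs(title_vec: list[float], pack_vecs: dict) -> list:
    """Return (pack_id, score) pairs sorted best first.

    `pack_vecs` maps a pack id to its precomputed embedding (assumed input).
    """
    scored = [(pid, cosine(title_vec, vec)) for pid, vec in pack_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Token overlap scores "Site Reliability Engineer" against "Sales Engineer" at 0.25 purely because both contain "engineer". That is the kind of surface match an embedding comparison might rank differently, and it is why the guardrail asks for a comparison against the probe before any runtime change.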
Checks Run
| Check | Result |
|---|---|
| Ruff over touched source/tests/scripts | Pass |
| Focused evidence/contract/fallback tests | 13 passed |
| Evidence fan-out probe | Pass |
| Serving probe | Pass |
| Full pytest before final report edits | 218 passed in 110.03s |
The Next Real Milestone
Run a bounded Gemini embeddings evaluation lane, promote or mine more affordance packs, continue semantic replay batches, decide the quarantine status of responsibility_registry_v2, and keep weak matches caveated.
Where To Start Next
Continue from /srv/aina/aina-data-engine-room. The goal is paused after the 2026-06-11 closeout. Read docs/handoff/2026-06-11-session-closeout-data-engine-room-handoff.md and docs/handoff/2026-06-11-founder-facing-data-engine-room-report.md first. Do not use quarantined responsibility_registry_v2. Design a bounded Gemini embeddings evaluation lane that compares embedding-based title-to-pack/source matching against the deterministic evidence_fanout_probe_v1 baseline, writes sidecar artifacts only, and does not change runtime behavior until validation proves improvement.
Start with an embeddings evaluation lane, not a runtime rewrite.
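
One possible shape for the sidecar-only comparison step is sketched below. The function names, report fields, and output location are hypothetical. The sketch assumes both the baseline and the embedding candidate can be reduced to a mapping from occupation title to chosen pack id, which may not match the real structure of evidence_fanout_probe_v1.json.

```python
import json
from pathlib import Path


def compare_to_baseline(baseline: dict, candidate: dict) -> dict:
    """Summarize where candidate picks agree with or differ from the baseline.

    Both arguments map occupation title -> chosen pack id (or None).
    """
    titles = sorted(set(baseline) | set(candidate))
    changed = [t for t in titles if baseline.get(t) != candidate.get(t)]
    # Titles the baseline left uncovered but the candidate matched.
    newly_covered = [t for t in changed if baseline.get(t) is None]
    return {
        "titles_compared": len(titles),
        "agreements": len(titles) - len(changed),
        "changed": changed,
        "newly_covered": newly_covered,
    }


def write_sidecar(report: dict, out_dir: Path, name: str) -> Path:
    """Write the report as JSON under out_dir; nothing else is modified."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / name
    path.write_text(json.dumps(report, indent=2, sort_keys=True))
    return path
```

Because the only side effect is a new JSON file, the lane can be rerun or discarded without touching the warehouse or runtime behavior. Every entry in `changed` still needs review before it is counted as an improvement.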