AINA data engine room - session closeout - 2026-06-11

Session Closeout

A paused local checkpoint for the personalization data engine: what changed, what is in the repo, what remains, and where Gemini embeddings fit next.

The Single Idea

This run moved the engine from a large title map toward a working local personalization data engine. It now has 110,184 occupations, safer resolver behavior, promoted evidence packs in packets, and labeled evidence fan-out covering 14,114 of the eligible non-excluded occupations. It is still local-only and is not a production runtime.

01 - What Changed

The Engine Got Broader And More Honest

Area | What changed | Proof
Title expansion | Warehouse reached 110,184 occupations. | title_expansion_v1.json
Runtime semantic replay | First 100 semantic rows processed through local reviewer/replay artifacts. | runtime_semantic_decision_replay_v1.json
Evidence packs | RoleIntelligencePacket.evidence_pack is wired. | evidence_enrichment.py
Resolver | Description/title serving now avoids awkward grade-fragment answers. | serving_probe_v1.md
Evidence fan-out | Evidence matching is labeled by trust path: title, SOC, derived SOC, family. | evidence_fanout_probe_v1.json
Exclusion safety | Excluded roles skip evidence-pack attachment. | packets.py
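
The trust-path labeling and the exclusion rule in the table can be sketched as follows. This is a minimal illustration, not the code in evidence_enrichment.py or packets.py: the Occupation fields, the shape of the packs index, the match_evidence name, and the title-first priority order are all assumptions made for the example. The source only states that matches are labeled by path (title, SOC, derived SOC, family) and that excluded roles skip attachment.

```python
from dataclasses import dataclass
from typing import Optional

# Trust paths named in the table, ordered here from most to least direct.
# The ordering itself is an assumption for this sketch.
TRUST_PATHS = ("title", "soc", "derived_soc", "family")


@dataclass
class Occupation:
    """Hypothetical occupation record; field names are illustrative."""
    title: str
    excluded: bool = False
    soc: Optional[str] = None
    derived_soc: Optional[str] = None
    family: Optional[str] = None


def match_evidence(occ, packs):
    """Return (trust_path, pack_id) for the first path that hits, else None.

    `packs` is a hypothetical index shaped like {trust_path: {key: pack_id}}.
    """
    if occ.excluded:
        # Exclusion safety: excluded roles never get an evidence pack.
        return None
    keys = {
        "title": occ.title.lower(),
        "soc": occ.soc,
        "derived_soc": occ.derived_soc,
        "family": occ.family,
    }
    for path in TRUST_PATHS:
        key = keys[path]
        pack_id = packs.get(path, {}).get(key) if key else None
        if pack_id is not None:
            return path, pack_id
    # No path matched: leave the title un-enriched rather than guess.
    return None


packs = {
    "title": {"data engineer": "pack-001"},
    "family": {"engineering": "pack-100"},
}
print(match_evidence(Occupation("Data Engineer"), packs))
# ('title', 'pack-001')
print(match_evidence(Occupation("Platform Wrangler", family="engineering"), packs))
# ('family', 'pack-100')
print(match_evidence(Occupation("Data Engineer", excluded=True), packs))
# None
```

Returning the path alongside the pack keeps the label attached to every match, so a weaker family-level match can be caveated differently from a direct title match.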

02 - Current Proof Numbers

The Checkpoint Is Measurable

Total occupations: 110,184
Eligible non-excluded occupations: 95,251
Excluded occupations skipped for evidence: 14,933
Eligible occupations with evidence packs: 14,114
Eligible evidence coverage rate: 14.82%
Serving probe OOD failures: 0/34
Last full suite before final report edits: 218 passed in 110.03s

The evidence fan-out deliberately values caveats over raw coverage. Some titles remain un-enriched because the nearest pack would be misleading.
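
The headline numbers are internally consistent, which a short check makes explicit. The variable names below are illustrative only; the figures are the ones reported in this section.

```python
total = 110_184        # total occupations
excluded = 14_933      # excluded occupations skipped for evidence
with_packs = 14_114    # eligible occupations with evidence packs

# Eligible occupations are everything that is not excluded.
eligible = total - excluded
# Coverage is measured against eligible occupations, not the full warehouse.
coverage = with_packs / eligible

print(eligible)           # 95251
print(f"{coverage:.2%}")  # 14.82%
```

Note that the coverage rate uses the eligible count as its denominator, so excluded occupations do not drag the percentage down.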

03 - Repo Inventory

What Lives Where

Area | Purpose
src/aina_data_engine/ | Main package: ingestion, resolver, packets, runtime, semantic gates, Hugging Face, source authority, readiness, reports.
artifacts/aina_data_engine.duckdb | Local warehouse used by serving and probes.
data/sources/title_expansion/ | Provenanced WRTMJ, gold-IP, and mapping title expansion inputs.
evidence/canonical/ | Canonical affordance, workflow, responsibility, task, and qualitative corpora.
artifacts/validation/ | Machine-readable validation and replay receipts.
artifacts/reports/ | Human-readable probe and validation reports.
docs/handoff/ | Operational memory, founder reports, and inventory reports.
scripts/ | Rebuild and probe scripts.
tests/ | Pytest coverage for contracts, runtime, evidence, semantic review, readiness, and telemetry.

04 - Current Capability

What It Can And Cannot Do

The engine can resolve many real job titles into role packets, attach promoted affordance evidence for a growing subset of roles, generate runtime/curriculum/sandbox/evaluator artifacts, and prove serving behavior with local probes.

It cannot yet serve every ICP title with high-quality evidence. It is not a production API/UI with auth, tenant isolation, observability, or live learner feedback. It has not used the quarantined responsibility_registry_v2.

05 - Gemini Embeddings

Use Gemini As An Evaluated Sidecar First

Use | Why | Guardrail
Title-to-pack similarity | May improve evidence-pack selection beyond token overlap. | Compare against evidence_fanout_probe_v1 before changing runtime.
Source mining | Can find likely responsibilities/workflows for uncovered titles. | Write sidecar artifacts only.
Pack clustering | Can de-dupe and prioritize similar packs. | Preserve provenance.
Anomaly detection | Can catch mismatched content like sales advice for engineering roles. | Use as evaluator signal, not runtime authority.
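
The title-to-pack row can be illustrated with a small comparison between a token-overlap score and an embedding-based score. This is a sketch under stated assumptions: Jaccard overlap stands in for the current token-overlap matching (its exact form is not given here), the pack names are invented, and the two-dimensional vectors are hand-written placeholders rather than real Gemini embeddings. No embedding API call is shown.

```python
import math


def token_overlap(a, b):
    """Jaccard overlap between lowercase token sets (stand-in baseline)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)


def best_pack(title, packs, score):
    """Return the pack name with the highest score for `title`."""
    return max(packs, key=lambda name: score(title, name))


# Placeholder vectors, invented for illustration. In an evaluation lane
# these would come from an embedding model instead.
TOY_VECTORS = {
    "Software Developer": [0.9, 0.1],
    "Backend Engineer": [0.8, 0.2],
    "Software Sales Representative": [0.2, 0.9],
}
PACKS = ["Backend Engineer", "Software Sales Representative"]


def embed_score(a, b):
    """Score two names by cosine similarity of their placeholder vectors."""
    return cosine(TOY_VECTORS[a], TOY_VECTORS[b])


print(best_pack("Software Developer", PACKS, token_overlap))
# Software Sales Representative
print(best_pack("Software Developer", PACKS, embed_score))
# Backend Engineer
```

In this toy case token overlap prefers the sales pack because of the shared word "Software", while the placeholder vectors prefer the engineering pack. Whether real embeddings behave this way on the warehouse is exactly what the comparison against evidence_fanout_probe_v1 would need to establish.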

06 - Validation

Checks Run

Check | Result
Ruff over touched source/tests/scripts | Pass
Focused evidence/contract/fallback tests | 13 passed
Evidence fan-out probe | Pass
Serving probe | Pass
Full pytest before final report edits | 218 passed in 110.03s

The Next Real Milestone

Run a bounded Gemini embeddings evaluation lane, promote or mine more affordance packs, continue semantic replay batches, decide the quarantine status of responsibility_registry_v2, and keep weak matches caveated.

08 - Resume Prompt

Where To Start Next

Codex - Resume paused goal - design embeddings sidecar
Continue from /srv/aina/aina-data-engine-room. The goal is paused after the 2026-06-11 closeout. Read docs/handoff/2026-06-11-session-closeout-data-engine-room-handoff.md and docs/handoff/2026-06-11-founder-facing-data-engine-room-report.md first. Do not use quarantined responsibility_registry_v2. Design a bounded Gemini embeddings evaluation lane that compares embedding-based title-to-pack/source matching against the deterministic evidence_fanout_probe_v1 baseline, writes sidecar artifacts only, and does not change runtime behavior until validation proves improvement.
Watch-out: do not turn embeddings into runtime authority before sidecar validation beats the deterministic baseline.
Where to start

Start with an embeddings evaluation lane, not a runtime rewrite.
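
One way to keep the lane bounded is to have it emit a comparison report as a sidecar file and nothing else. The sketch below assumes a simple hit-count accuracy metric, a hypothetical write_sidecar_report helper, and an arbitrary 0.02 minimum gain; none of these come from the repo, and the metric the real lane uses is still to be designed.

```python
import json
from pathlib import Path


def write_sidecar_report(path, baseline_hits, embedding_hits, total, min_gain=0.02):
    """Write a baseline-vs-embedding comparison to `path` and return it.

    Only writes the report file; it does not touch runtime configuration.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    baseline_acc = baseline_hits / total
    embedding_acc = embedding_hits / total
    report = {
        "baseline": "evidence_fanout_probe_v1",
        "baseline_accuracy": round(baseline_acc, 4),
        "embedding_accuracy": round(embedding_acc, 4),
        "min_gain": min_gain,
        # Advisory flag only: promotion to runtime stays a separate decision.
        "beats_baseline": embedding_acc - baseline_acc >= min_gain,
    }
    Path(path).write_text(json.dumps(report, indent=2), encoding="utf-8")
    return report
```

For example, with 100 labeled titles, 70 baseline hits, and 71 embedding hits, the report records beats_baseline as false because the gain is below the assumed threshold; with 80 embedding hits it records true. In both cases the only side effect is the sidecar file.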