Runtime Retrieval Proof Handoff
Local exact-cosine retrieval now runs over existing Gemini vectors, deterministic reranking improves weak matches, and the production boundary stays closed.
AIN-510 now has a local runtime retrieval proof. It reads the existing Gemini Embedding 2 vector snapshot, runs exact-cosine retrieval for the planned runtime cases, and applies a deterministic semantic reranker over that candidate pool. The result shows that vector retrieval can support the deterministic fallback without calling Gemini live or unlocking production runtime authority.
Why This Does Not Call Live Gemini
Live Gemini calls belong to the embedding-production lane, where clean eligible source-family chunks are converted into new vectors.
This slice is different. It is a runtime proof gate. Its job is to answer: given the vectors we already created, can local exact-cosine retrieval help role-to-workflow, role-to-curriculum, and role-to-evaluator decisions while the deterministic fallback remains in control?
The gate exists because runtime behavior must be replayable, stable, and inspectable. If this proof made fresh model calls, it would blur two different questions: whether the vector corpus is clean enough, and whether retrieval over that corpus is good enough.
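As a rough illustration of what local exact-cosine retrieval with a controlling fallback can look like, here is a minimal sketch. It is not the code in runtime_retrieval_proof.py: the function names, the in-memory snapshot shape, and the row IDs are all assumptions made for the example. The only ideas taken from this handoff are that scoring is a brute-force cosine over stored vectors, that results are replayable, and that the deterministic fallback keeps the decision.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Exact cosine similarity; returns 0.0 if either vector has zero norm."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_cosine_top_k(
    query: List[float], snapshot: Dict[str, List[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Score every stored vector (no approximate index) and return the top k.

    Ties are broken by row ID so repeated runs give the same order.
    """
    scored = [(row_id, cosine(query, vec)) for row_id, vec in snapshot.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


def advise(
    query: List[float],
    snapshot: Dict[str, List[float]],
    fallback_id: str,
    k: int = 3,
) -> dict:
    """Hypothetical wrapper: the fallback always decides; vectors only advise."""
    candidates = exact_cosine_top_k(query, snapshot, k)
    return {
        "decision": fallback_id,
        "vector_candidates": candidates,
        "vector_agrees": bool(candidates) and candidates[0][0] == fallback_id,
    }
```

Because nothing here calls a model, the same snapshot and query always produce the same ranking, which is the property the gate depends on.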
What Changed
The repo now has a dedicated runtime retrieval proof module, CLI command, tests, deterministic reranking, and durable proof receipts. Full validation checks the receipt, JSONL row count, local exact-cosine behavior, deterministic fallback boundary, semantic caveat reporting, rerank activity, rerank caveat reduction, no-live-Gemini flag, and external-surface blocks.
| Artifact | Purpose |
|---|---|
| src/aina_data_engine/runtime_retrieval_proof.py | Runs the proof over existing vector snapshots. |
| tests/test_runtime_retrieval_proof.py | Protects the local exact-cosine and boundary behavior. |
| artifacts/validation/ain_510_runtime_retrieval_proof_v1.json | Machine-readable receipt. |
| artifacts/validation/ain_510_runtime_retrieval_proof_v1.jsonl | Per-case proof rows. |
Current Proof
The proof status is pass_with_rerank_caveats, with valid: true and no failed checks. All 17 planned local synthetic cases received role anchors plus workflow, curriculum, and evaluator retrieval candidates. The deterministic reranker improved 4 caveated cases and left 4 unresolved for source-authority repair. The 6 blocked or held cases did not use vector authority.
| Target | Vector rows |
|---|---|
| Workflow | 33 |
| Curriculum | 252 |
| Evaluator | 220 |
What The Caveats Mean
The good news is that every planned runtime case found retrieval candidates, and reranking cut unresolved caveats from 8 cases to 4. Reranking improved the retail sales, bilingual technical support, marketing operations, and technical support assistant manager cases.
The remaining weak cases are useful signals, not failures to hide: BI, data analyst, web designer, and generic case manager still need stronger curriculum/evaluator source authority or role specificity before vector-selected recommendations should be trusted.
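To make the idea of a deterministic reranker concrete, the sketch below blends each candidate's cosine score with a simple token overlap between the role name and the candidate label. This is an assumed scoring rule chosen for illustration, not the reranker the proof actually uses; the Candidate type, the overlap_weight parameter, and the sample labels are all hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    row_id: str
    label: str
    cosine: float


def token_overlap(role: str, label: str) -> float:
    """Fraction of role tokens that also appear in the candidate label."""
    role_tokens = set(role.lower().split())
    label_tokens = set(label.lower().split())
    if not role_tokens:
        return 0.0
    return len(role_tokens & label_tokens) / len(role_tokens)


def rerank(
    role: str, candidates: List[Candidate], overlap_weight: float = 0.3
) -> List[Candidate]:
    """Reorder candidates by a blend of cosine score and lexical overlap.

    No randomness and no model calls: the same inputs always yield the
    same order, with row_id as the final tie-breaker.
    """

    def sort_key(c: Candidate):
        blended = (1.0 - overlap_weight) * c.cosine + overlap_weight * token_overlap(
            role, c.label
        )
        return (-blended, c.row_id)

    return sorted(candidates, key=sort_key)


# Illustrative data only: a role-specific row overtakes a generic one
# that had a slightly higher raw cosine score.
pool = [
    Candidate("cur_1", "customer service basics", 0.62),
    Candidate("cur_2", "retail sales floor training", 0.58),
]
reranked = rerank("retail sales", pool)
```

A rule like this can only help when the candidate pool already contains a role-specific row, which is consistent with the remaining weak cases needing source-authority repair rather than more reranking.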
Validation Added
Full validation now asserts proof existence, JSONL existence, report existence, HTML existence, receipt validity, local exact-cosine coverage, deterministic fallback retention, semantic caveat reporting, rerank activity, rerank caveat reduction, external-surface blocks, and no live Gemini invocation.
The production boundary remains closed: no public runtime, no real-user data, no external writes, no production telemetry, and no production embedding authority promotion.
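A few of these assertions can be sketched as a small receipt check. The field names valid, live_gemini_api_invoked, and failed_checks are taken from the jq command in the resume steps; their types, the expected_rows parameter, and the function itself are assumptions, and the real validate command covers more than this.

```python
import json
from typing import Any, Dict, List


def check_receipt(
    receipt: Dict[str, Any], jsonl_text: str, expected_rows: int
) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems: List[str] = []

    # Assumed to be a boolean in the receipt.
    if receipt.get("valid") is not True:
        problems.append("receipt is not valid")

    # The proof must not have called Gemini live.
    if receipt.get("live_gemini_api_invoked") is not False:
        problems.append("live Gemini flag is not false")

    # Assumed to be a list that is empty on success.
    if receipt.get("failed_checks"):
        problems.append("failed_checks is not empty")

    # Each non-blank JSONL line must parse and counts as one per-case row.
    rows = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} JSONL rows, found {len(rows)}")

    return problems
```

Returning every problem at once, rather than stopping at the first, keeps a failed run inspectable in a single pass.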
Resume Commands
cd /srv/aina/aina-data-engine-room
git status --short --branch
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-510-runtime-authority-contract --request-local-authority
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-510-runtime-retrieval-proof
jq '{status, valid, live_gemini_api_invoked, metrics, runtime_boundary, failed_checks}' artifacts/validation/ain_510_runtime_retrieval_proof_v1.json
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
Start with the four unresolved rows: they are now the clearest path from a locally useful retrieval proof to a semantically reliable personalization runtime.