Semantic Sidecar Handoff
A build-time embedding evaluation layer now exists over the real local personalization assets, while Gemini runtime generation stays deliberately off.
AIN-506 now has two layers: the P0 contract gate proves the rules are in place, and the semantic sidecar proves those rules can be exercised against real local personalization assets without calling Gemini or changing runtime behavior.
Implemented And Passing
The AIN-506 build-time semantic sidecar is implemented and passing locally on the VDS. It takes the current Personalization Engine assets, turns them into AIN-506-compatible semantic chunks and vector rows, stores them in Parquet plus DuckDB, and evaluates deterministic exact-cosine matches across title, runtime payload, Hugging Face, GDPval, and review-dashboard assets.
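The exact-cosine matching described above can be sketched in a few lines. This is a minimal illustration, not the code in semantic_sidecar.py: the function names, the (chunk_id, vector) row shape, and the tie-break on chunk id are all assumptions chosen to keep the ranking deterministic.

```python
import math


def cosine(a, b):
    """Exact cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=5):
    """Score every row (no approximate index) and return the k best.

    rows is an iterable of (chunk_id, vector) pairs; this shape is hypothetical.
    Ties are broken by chunk_id so repeated runs give the same order.
    """
    scored = [(cosine(query, vec), chunk_id) for chunk_id, vec in rows]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(chunk_id, score) for score, chunk_id in scored[:k]]


# Toy vectors for illustration only; real rows would be 768-dimensional.
rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
print(top_k([1.0, 0.0], rows, k=2))
```

Scoring every row is affordable at this corpus size (488 chunks) and avoids the nondeterminism an approximate index could introduce.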
This is not live Gemini runtime embedding generation yet. Runtime embedding authority remains off, live Gemini API calls remain blocked, and real learner data is not embedded.
From Contracts To A Measurable Sidecar
- Added src/aina_data_engine/semantic_sidecar.py.
- Added CLI command ain-506-semantic-sidecar.
- Added tests/test_semantic_sidecar.py.
- Added the sidecar report and embedding artifacts under artifacts/.
What A Future Agent Should Open First
| Artifact | Path |
|---|---|
| Receipt JSON | artifacts/validation/ain_506_semantic_sidecar_v1.json |
| Match JSONL | artifacts/validation/ain_506_semantic_sidecar_v1_matches.jsonl |
| Report | artifacts/reports/ain_506_semantic_sidecar_v1.md |
| Chunk Parquet | artifacts/embeddings/sidecar/chunks/schema_version=embedding_contract_v1/ain_506_semantic_sidecar_v1.parquet |
| Vector Parquet | artifacts/embeddings/sidecar/vectors/model=gemini-embedding-2/dim=768/schema_version=embedding_contract_v1/ain_506_semantic_sidecar_v1.parquet |
| Batch Manifest | artifacts/embeddings/sidecar/gemini_batch_manifest/ain_506_semantic_sidecar_v1.jsonl |
| DuckDB | artifacts/embeddings/sidecar/ain_506_semantic_sidecar_v1.duckdb |
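The chunk and vector Parquet paths carry key=value directory segments (model, dim, schema_version). The sketch below shows one way to read those segments back out of a path. The helper name partition_keys is hypothetical, and treating the segments as Hive-style partition keys is an interpretation of the layout, not something the sidecar is confirmed to rely on.

```python
from pathlib import PurePosixPath


def partition_keys(path: str) -> dict[str, str]:
    """Collect key=value directory segments from an artifact path."""
    keys = {}
    # Only inspect directories, so an "=" in a file name is never misread.
    for part in PurePosixPath(path).parent.parts:
        if "=" in part:
            key, _, value = part.partition("=")
            keys[key] = value
    return keys


VECTOR_PATH = (
    "artifacts/embeddings/sidecar/vectors/model=gemini-embedding-2/dim=768/"
    "schema_version=embedding_contract_v1/ain_506_semantic_sidecar_v1.parquet"
)

print(partition_keys(VECTOR_PATH))
```

A check like this lets a reader confirm which model name, dimension, and schema version a vector file was written under before comparing it with another run.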
Latest Sidecar Run
| Metric | Value |
|---|---|
| Chunks | 488 |
| Embedding records | 488 |
| Query probes | 50 |
| Hash-skip reused | 488 |
| Same-family top-1 rate | 0.94 |
| Same-SOC top-1 rate | 0.66 |
| Cross-asset top-5 rate | 1.0 |
| Sensitive top-1 mismatches | 3 |
The sidecar currently surfaces top_worked_title, title_coverage_expansion, runtime_payload, hf_role_signal, gdpval_task, and review_dashboard chunks.
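To make the rates in the table concrete: if each top-1 rate is a simple fraction of the 50 query probes, then 0.94 corresponds to 47 probes and 0.66 to 33. The sketch below computes such a rate from probe records. The record shape (a query dict and a top1 dict sharing a field such as family) is an assumption for illustration and may not match the fields in the match JSONL.

```python
def top1_rate(probes, field):
    """Fraction of probes whose best match shares `field` with the query."""
    if not probes:
        return 0.0
    hits = sum(1 for p in probes if p["query"][field] == p["top1"][field])
    return hits / len(probes)


def implied_hits(rate, probe_count):
    """Back out the hit count from a reported rate and a probe count."""
    return round(rate * probe_count)


# Synthetic probes: 47 same-family hits and 3 misses, for illustration only.
probes = [
    {"query": {"family": "title"}, "top1": {"family": "title"}} for _ in range(47)
] + [
    {"query": {"family": "title"}, "top1": {"family": "gdpval"}} for _ in range(3)
]

print(top1_rate(probes, "family"))
print(implied_hits(0.94, 50), implied_hits(0.66, 50))
```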
Commands That Passed
    uv run ruff check src/aina_data_engine/semantic_sidecar.py src/aina_data_engine/cli.py tests/test_semantic_sidecar.py pyproject.toml
    uv run pytest tests/test_embedding_contracts.py tests/test_semantic_sidecar.py -q
    uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-506-p0-gate
    uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-506-semantic-sidecar
    uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
    uv run pytest -q
Results: Ruff passed, the focused tests passed (10 tests), the P0 gate passed, the sidecar passed with 488 chunks, repo validation passed, and the full pytest suite passed (228 tests).
Controlled Gemini Batch Comparison
The next useful milestone is a paid Gemini batch comparison, still build-time only. Freeze a representative sidecar corpus, decide the maximum spend, submit the generated batch manifest, store returned vectors beside the deterministic vectors, compare quality and mismatch rates, and promote nothing unless the Gemini run beats the baseline without increasing risky matches.
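The promotion rule above could be expressed as a small gate. This is a sketch under stated assumptions: RunMetrics and should_promote are hypothetical names, and "beats the baseline" is interpreted here as no regression on any quality rate, a strict gain on at least one, and no increase in sensitive top-1 mismatches. The baseline values come from the latest sidecar run; the candidate values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMetrics:
    same_family_top1: float
    same_soc_top1: float
    cross_asset_top5: float
    sensitive_top1_mismatches: int


QUALITY_FIELDS = ("same_family_top1", "same_soc_top1", "cross_asset_top5")


def should_promote(baseline: RunMetrics, candidate: RunMetrics) -> bool:
    """Assumed rule: better quality with no increase in risky matches."""
    if candidate.sensitive_top1_mismatches > baseline.sensitive_top1_mismatches:
        return False  # more risky matches blocks promotion outright
    deltas = [getattr(candidate, f) - getattr(baseline, f) for f in QUALITY_FIELDS]
    # No rate may regress, and at least one must strictly improve.
    return all(d >= 0 for d in deltas) and any(d > 0 for d in deltas)


# Deterministic baseline from the latest sidecar run.
BASELINE = RunMetrics(0.94, 0.66, 1.0, 3)

# Hypothetical Gemini results, not real measurements.
better = RunMetrics(0.96, 0.70, 1.0, 3)
riskier = RunMetrics(0.98, 0.80, 1.0, 4)

print(should_promote(BASELINE, better), should_promote(BASELINE, riskier))
```

Checking the mismatch count first keeps the risk condition independent of the quality numbers, so a large quality gain cannot offset a rise in sensitive mismatches.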
The untracked ledger files docs/MAPPING-CHAIN-LEDGER.md, docs/TITLE-LEDGER.md, and docs/TITLE-LEDGER.html pre-existed this lane and were not touched.
Start with the sidecar receipt, then compare its matches before deciding whether a real Gemini batch run is worth spending on.