AIN-506 · VDS local lane · 2026-06-12

Semantic Sidecar Handoff

A build-time embedding evaluation layer now exists over the real local personalization assets, while Gemini runtime generation stays deliberately off.

Ali Mehdi Mukadam - co-authored with Codex - branch ali/ain-506-p0-gate-2026-06-12

The Single Idea

AIN-506 now has two layers: the P0 contract gate proves the rules are in place, and the semantic sidecar proves those rules can be exercised against real local personalization assets without calling Gemini or changing runtime behavior.

01 - Status

Implemented And Passing

The AIN-506 build-time semantic sidecar is implemented and passing locally on the VDS. It takes the current Personalization Engine assets, turns them into AIN-506-compatible semantic chunks and vector rows, stores them in Parquet plus DuckDB, and evaluates deterministic exact-cosine matches across title, runtime payload, Hugging Face, GDPval, and review-dashboard assets.
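
As a rough illustration of what "deterministic exact-cosine matches" can mean, here is a minimal brute-force sketch. It is not the code in src/aina_data_engine/semantic_sidecar.py; the function names, the (chunk_id, vector) row shape, and the tie-break on chunk id are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Exact cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=5):
    """Score every row against the query (no approximate index) and rank.

    rows is a hypothetical list of (chunk_id, vector) pairs.
    Ties are broken by chunk_id so repeated runs give the same order.
    """
    scored = [(cosine(query, vector), chunk_id) for chunk_id, vector in rows]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(chunk_id, score) for score, chunk_id in scored[:k]]
```

Scoring every row keeps the ranking exact and reproducible, which is affordable at a few hundred chunks; an approximate index would only matter at a much larger corpus size.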

This is not live Gemini runtime embedding generation yet. Runtime embedding authority remains off, live Gemini API calls remain blocked, and real learner data is not embedded.

02 - What Changed

From Contracts To A Measurable Sidecar

Contract: P0 keeps Gemini model, dimensions, schemas, storage, and no-live-runtime rules pinned.
Sidecar: Local assets become semantic chunks, vector rows, a DuckDB catalog, and match receipts.
Evaluation: Exact-cosine probes compare candidate matches without promoting runtime behavior.

03 - Artifacts

What A Future Agent Should Open First

Artifact         Path
Receipt JSON     artifacts/validation/ain_506_semantic_sidecar_v1.json
Match JSONL      artifacts/validation/ain_506_semantic_sidecar_v1_matches.jsonl
Report           artifacts/reports/ain_506_semantic_sidecar_v1.md
Chunk Parquet    artifacts/embeddings/sidecar/chunks/schema_version=embedding_contract_v1/ain_506_semantic_sidecar_v1.parquet
Vector Parquet   artifacts/embeddings/sidecar/vectors/model=gemini-embedding-2/dim=768/schema_version=embedding_contract_v1/ain_506_semantic_sidecar_v1.parquet
Batch Manifest   artifacts/embeddings/sidecar/gemini_batch_manifest/ain_506_semantic_sidecar_v1.jsonl
DuckDB           artifacts/embeddings/sidecar/ain_506_semantic_sidecar_v1.duckdb

04 - Metrics

Latest Sidecar Run

Metric                       Value
Chunks                       488
Embedding records            488
Query probes                 50
Hash-skip reused             488
Same-family top-1 rate       0.94
Same-SOC top-1 rate          0.66
Cross-asset top-5 rate       1.0
Sensitive top-1 mismatches   3
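
The top-1 rates above can be read as simple hit fractions. The sketch below shows one way such a rate could be computed from ranked match records. The record shape (a "probe" dict plus a "ranked" list of candidate dicts with "family" and "soc" keys) is hypothetical and is not taken from the match JSONL schema.

```python
def top1_rate(matches, key):
    """Fraction of probes whose top-ranked candidate shares `key` with the probe.

    Each match is assumed (hypothetically) to look like:
      {"probe": {"family": ..., "soc": ...},
       "ranked": [{"family": ..., "soc": ...}, ...]}
    Probes with no ranked candidates count as misses.
    """
    if not matches:
        return 0.0
    hits = 0
    for match in matches:
        ranked = match["ranked"]
        if ranked and ranked[0][key] == match["probe"][key]:
            hits += 1
    return hits / len(matches)
```

If each rate is taken over all 50 query probes, which the run summary does not state explicitly, 0.94 corresponds to 47 same-family hits and 0.66 to 33 same-SOC hits.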

The sidecar currently surfaces top_worked_title, title_coverage_expansion, runtime_payload, hf_role_signal, gdpval_task, and review_dashboard chunks.
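
The "Hash-skip reused" count suggests that unchanged chunks are not re-embedded. A minimal sketch of that idea follows, assuming a content hash keyed on model, dimension, and chunk text; the helper names and the hash recipe are illustrative assumptions, not the sidecar's actual scheme. The model name and dimension are taken from the Vector Parquet path.

```python
import hashlib


def content_hash(text, model, dim):
    """Stable hash of the inputs that determine a chunk's embedding (assumed recipe)."""
    payload = f"{model}\x00{dim}\x00{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def plan_embedding_work(chunks, existing_hashes, model="gemini-embedding-2", dim=768):
    """Split chunks into those whose vectors can be reused and those needing embedding.

    chunks is a hypothetical list of (chunk_id, text) pairs;
    existing_hashes is the set of hashes already present in the vector store.
    """
    reuse, embed = [], []
    for chunk_id, text in chunks:
        digest = content_hash(text, model, dim)
        if digest in existing_hashes:
            reuse.append((chunk_id, digest))
        else:
            embed.append((chunk_id, digest))
    return reuse, embed
```

Under this reading, 488 reused out of 488 chunks would mean the latest run found every vector already stored and embedded nothing new.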

05 - Proof Commands

Commands That Passed

Codex/VDS - verification - rerun the lane without enabling Gemini runtime
uv run ruff check src/aina_data_engine/semantic_sidecar.py src/aina_data_engine/cli.py tests/test_semantic_sidecar.py pyproject.toml
uv run pytest tests/test_embedding_contracts.py tests/test_semantic_sidecar.py -q
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-506-p0-gate
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-506-semantic-sidecar
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
uv run pytest -q

Watch-out: do not turn this into live Gemini runtime generation while verifying the sidecar.

Results: Ruff passed, the focused tests passed (10 tests), the P0 gate passed, the sidecar passed with 488 chunks, repo validation passed, and the full pytest suite passed (228 tests).

06 - Next Milestone

Controlled Gemini Batch Comparison

The next useful milestone is a paid Gemini batch comparison, still build-time only. Freeze a representative sidecar corpus, decide the maximum spend, submit the generated batch manifest, store returned vectors beside the deterministic vectors, compare quality and mismatch rates, and promote nothing unless the Gemini run beats the baseline without increasing risky matches.
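
The promotion rule in that last clause can be written down as a small check. This is only a sketch of one possible reading: it assumes "beats the baseline" means no regression on either top-1 rate plus a strict gain on at least one, and that "risky matches" means the sensitive top-1 mismatch count. The RunMetrics type and should_promote function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMetrics:
    same_family_top1: float
    same_soc_top1: float
    sensitive_top1_mismatches: int


def should_promote(baseline, candidate):
    """Return True only if the candidate improves quality without adding risk."""
    no_regression = (
        candidate.same_family_top1 >= baseline.same_family_top1
        and candidate.same_soc_top1 >= baseline.same_soc_top1
    )
    strict_gain = (
        candidate.same_family_top1 > baseline.same_family_top1
        or candidate.same_soc_top1 > baseline.same_soc_top1
    )
    no_added_risk = (
        candidate.sensitive_top1_mismatches <= baseline.sensitive_top1_mismatches
    )
    return no_regression and strict_gain and no_added_risk


# Baseline values copied from the latest deterministic sidecar run.
BASELINE = RunMetrics(same_family_top1=0.94, same_soc_top1=0.66, sensitive_top1_mismatches=3)
```

A Gemini run that merely ties the baseline, or that improves match rates while adding a sensitive mismatch, would not be promoted under this rule.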

The untracked ledger files docs/MAPPING-CHAIN-LEDGER.md, docs/TITLE-LEDGER.md, and docs/TITLE-LEDGER.html pre-existed this lane and were not touched.

Where To Start

Start with the sidecar receipt, then compare its matches before deciding whether a real Gemini batch run is worth spending on.