Source Authority And Runtime Reconciliation Checkpoint
The production spine now reconciles chunk/vector authority, source registry, runtime contracts, and JD-aware role context from live repo state.
This checkpoint covers the next serial production-spine slice after the terminology cleanup. The current chunk/vector snapshot, the source-authority registry, the promoted runtime contracts, and the JD-aware role-context evidence all reconcile from live repo state. The key product correction is now backed by receipts: when role context is available, titles are no longer treated in isolation. The engine has a JD-aware evidence layer that joins title, job context, company reference, responsibility snippets, tools, source refs, and explicit gaps.
Receipts Regenerated
| Receipt | Status | Why It Matters |
|---|---|---|
| source_authority_start_here_v1 | pass | Confirms clean-before-embed stance and source-authority prerequisites. |
| production_chunk_vector_reconciliation_v1 | pass | Reconciles base chunks, repaired overlays, vector rows, and stale-vector state. |
| source_authority_registry_v2 | pass | Classifies all chunk families and keeps labels as metadata, not truth. |
| production_runtime_contracts_v1 | pass | Promotes product-facing contracts above raw warehouse tables. |
| jd_aware_role_context_evidence_v1 | pass | Builds JD-aware role-context evidence and 50 real-row E2E fixtures. |
| production_embedding_semantic_qa_v1__source_family=jd_aware_role_context | pass | Spot-checks 50 JD-aware chunks before any embedding scale-up. |
Key Counts
27,844 repaired chunks
322,515 combined chunks
0 stale vectors
316,009 unvectorized chunks
44,440 clean candidates
15,104 trusted jobs-research titles
1,004 role-context rows
1,004 resolution decisions
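The chunk and vector counts above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes every combined chunk is either vectorized or unvectorized, which this checkpoint does not state explicitly, and the names `COMBINED_CHUNKS`, `UNVECTORIZED_CHUNKS`, and `vector_coverage` are hypothetical rather than taken from the repo.

```python
# Counts copied from the Key Counts list; names are illustrative, not repo identifiers.
COMBINED_CHUNKS = 322_515
UNVECTORIZED_CHUNKS = 316_009


def vector_coverage(combined: int, unvectorized: int) -> tuple[int, float]:
    """Return (vectorized count, vectorized share of all chunks).

    Assumption: each combined chunk is either vectorized or unvectorized.
    """
    if not 0 <= unvectorized <= combined:
        raise ValueError("unvectorized count must lie between 0 and the combined count")
    vectorized = combined - unvectorized
    return vectorized, vectorized / combined


vectorized, share = vector_coverage(COMBINED_CHUNKS, UNVECTORIZED_CHUNKS)
print(f"{vectorized:,} vectorized chunks ({share:.1%} of combined)")
```

Under that assumption, 6,506 chunks (about 2% of the combined set) already have vectors, which fits the clean-before-embed stance of holding back scale-up until the source is trusted.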
JD-Aware Role Context Proof
The JD-aware receipt is the direct answer to the title-only trap. It shows that the 50 E2E fixtures trace to real linkedin_jobs rows, that titles in the top 500 and top 1,000 bands carry either role context or an explicit gap, and that source rows were not mutated.
| Measure | Count |
|---|---|
| Top 500 with role context | 485 |
| Top 500 explicit gaps | 14 |
| Top 1,000 with role context | 946 |
| Top 1,000 explicit gaps | 50 |
| Rows with job context | 970 |
| Rows with JD summaries | 954 |
| Rows with responsibility snippets | 954 |
| Rows with tool mentions | 715 |
Embedding eligibility is intentionally conservative: 660 reference-only, 294 progressive-only, 34 blocked, and 16 repair-first.
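As a quick sanity check, the four eligibility buckets can be summed against the 1,004 role-context rows reported under Key Counts. This is a minimal sketch: it assumes the buckets partition the role-context rows (the counts are consistent with that, but the checkpoint does not say so outright), and the dictionary keys and the `eligibility_shares` helper are hypothetical names.

```python
# Counts copied from this checkpoint; bucket keys are illustrative labels only.
ROLE_CONTEXT_ROWS = 1_004

eligibility = {
    "reference_only": 660,
    "progressive_only": 294,
    "blocked": 34,
    "repair_first": 16,
}


def eligibility_shares(buckets: dict[str, int], total: int) -> dict[str, float]:
    """Return each bucket's share of the total after checking the buckets add up."""
    if sum(buckets.values()) != total:
        raise ValueError("eligibility buckets do not cover every role-context row")
    return {name: count / total for name, count in buckets.items()}


shares = eligibility_shares(eligibility, ROLE_CONTEXT_ROWS)

# Rows that need work before they can be considered for embedding.
needs_repair = eligibility["blocked"] + eligibility["repair_first"]

for name, value in shares.items():
    print(f"{name}: {value:.1%}")
print(f"rows needing repair: {needs_repair}")
```

The buckets sum to 1,004, and the blocked and repair-first rows together come to 50.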
Runtime Contract Proof
The runtime contract receipt confirms product consumers read promoted contracts, not raw warehouse tables. Runtime payload contracts, route contracts, role-context evidence, and role-resolution decisions are present; ambiguous roles can abstain; assessment seed remains an onboarding seed only; and runtime boundaries stay local-only.
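To make "ambiguous roles can abstain" concrete, here is one way an abstaining resolution decision could be modeled. It is a sketch under stated assumptions, not the engine's actual contract: the `ResolutionDecision` shape, the `resolve_role` function, the candidate scores, and both thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResolutionDecision:
    """Hypothetical decision record; resolved_role is None when the resolver abstains."""
    title: str
    resolved_role: Optional[str]
    reason: str


def resolve_role(
    title: str,
    candidates: dict[str, float],
    min_score: float = 0.6,   # assumed threshold, not from the source
    min_margin: float = 0.1,  # assumed threshold, not from the source
) -> ResolutionDecision:
    """Pick the best-scoring candidate role, or abstain when evidence is weak or ambiguous.

    `candidates` maps a hypothetical role id to a score in [0, 1].
    """
    if not candidates:
        return ResolutionDecision(title, None, "no candidates")
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    best_role, best_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0.0
    if best_score < min_score:
        return ResolutionDecision(title, None, "low confidence")
    if best_score - runner_up_score < min_margin:
        return ResolutionDecision(title, None, "ambiguous")
    return ResolutionDecision(title, best_role, "resolved")


decision = resolve_role("Analyst", {"data_analyst": 0.80, "business_analyst": 0.75})
print(decision)
```

Returning an explicit decision with a reason, rather than forcing a best guess, keeps an abstention visible to downstream consumers in the same way that explicit gaps are recorded in the role-context evidence.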
Commands And Results
uv run aina-data-engine --root /srv/aina/aina-data-engine-room source-authority-start-here
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-chunk-vector-reconciliation
uv run aina-data-engine --root /srv/aina/aina-data-engine-room source-authority-registry-v2
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-contracts
uv run aina-data-engine --root /srv/aina/aina-data-engine-room jd-aware-role-context-evidence --fixture-limit 50
uv run pytest tests/test_chunk_vector_reconciliation.py tests/test_source_authority_registry_v2.py tests/test_source_authority_start_here.py tests/test_production_runtime_contracts.py tests/test_runtime_source_authority_repair.py tests/test_jd_aware_role_context.py -q
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
All receipt commands passed. JD-aware semantic QA sampled 50 rows with 50 passing, 0 failing, 0 raw-JD key hits, and 0 legacy review-gate hits. Focused pytest reported 13 passed in 1.18s. Validate passed.
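The receipt commands above can also be driven from a small script that runs them in order and stops at the first failure. The following is a sketch, not part of the repo: `build_commands` and `run_serially` are hypothetical helpers, the subcommand list is copied from the commands listed in this section, and the pytest step is left out for brevity.

```python
import subprocess

# Subcommands copied from the commands listed in this checkpoint (pytest step omitted).
RECEIPT_SUBCOMMANDS = [
    ["source-authority-start-here"],
    ["production-chunk-vector-reconciliation"],
    ["source-authority-registry-v2"],
    ["production-runtime-contracts"],
    ["jd-aware-role-context-evidence", "--fixture-limit", "50"],
    ["validate"],
]


def build_commands(root: str) -> list[list[str]]:
    """Build one argv list per receipt command for the given engine-room root."""
    base = ["uv", "run", "aina-data-engine", "--root", root]
    return [base + subcommand for subcommand in RECEIPT_SUBCOMMANDS]


def run_serially(commands: list[list[str]], runner=subprocess.run) -> int:
    """Run commands in order, stop at the first non-zero exit, return how many passed."""
    passed = 0
    for argv in commands:
        result = runner(argv)
        if result.returncode != 0:
            break
        passed += 1
    return passed


commands = build_commands("/srv/aina/aina-data-engine-room")
# To execute for real: passed = run_serially(commands)
```

Passing the runner as a parameter means the ordering and stop-on-failure logic can be exercised with a stub, without invoking the real CLI.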
What Remains
This checkpoint does not complete the full Personalization Engine production goal. The next steps are to use the JD-aware evidence and runtime contracts to harden AI Fluency maps for the real-row fixture set, semantically inspect 50 real rows across risk categories, and repair the blocked and repair-first rows. Only then should clean chunks move into the next Gemini embedding ladder.
Start by semantically spot-checking the 50 JD-aware fixtures and top-band gap rows; that is the bridge between structural proof and user-facing quality.