# Linear Update Payloads - Gemini Workflow Embedding

Ali Mehdi Mukadam - co-authored with Codex

Linear posting was blocked by `401 auth_revoked`, so these ready-to-post payloads were preserved locally.

## AIN-506

```markdown
2026-06-12 update: Gemini clean-before-embed workflow slice completed locally on the VDS.

Proof:

- Scoped `jobs_research_workflow` repaired corpus: 89 repaired workflow chunks.
- 50-row adversarial semantic spot check: 37 pass, 6 caution, 7 block.
- Enforced quality exclusions: `/srv/aina/aina-data-engine-room/artifacts/validation/production_embedding_quality_exclusions_v1.json`.
- Dry run: `candidate_count: 43`, `quality_exclusion_row_count: 7`, `quality_excluded_candidate_count: 7`, `orphan_existing_vector_count_pruned: 0`, `live_gemini_api_invoked: false`.
- Live Vertex Gemini Embedding 2: 43 new `jobs_research_workflow` vectors, `failed_count: 0`.
- Current vector snapshot: 6,077 total Gemini vectors; top 1,000 = 1,000; top 500 = 500; workflow vectors = 43.

Validation:

- `pytest tests/test_production_embeddings.py tests/test_embedding_contracts.py tests/test_source_authority_start_here.py -q` -> 53 passed.
- `ruff check` on touched production embedding files/tests -> pass.
- `source-authority-start-here` -> pass.
- `ain-510-retrieval-promotion-gate` -> valid, still blocked for runtime promotion.
- `ain-506-p0-gate` -> pass.
- `validate` -> pass.
- `git diff --check` -> pass.

Handoff:

- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.md`
- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.html`

Artifact policy cleanup:

- `artifacts/embeddings/` was removed from the git index but kept on disk and ignored going forward. Bulk vectors/Parquet/DuckDB/batch manifests are local build outputs; durable JSON/report receipts remain tracked.
```

## AIN-508

```markdown
2026-06-12 update: clean-before-embed gate advanced for workflow data.

The key correction was semantic QA before embedding: 7 repaired workflow chunks were excluded by exact `(chunk_id, text_hash)`, instruction-prefix and placeholder noise were removed, and only the 43 reviewed non-blocked rows were embedded. The remaining 39 workflow rows are not batch-cleared yet.

Proof:

- Scoped `jobs_research_workflow` repaired corpus: 89 repaired workflow chunks.
- 50-row adversarial semantic spot check: 37 pass, 6 caution, 7 block.
- Enforced quality exclusions: `/srv/aina/aina-data-engine-room/artifacts/validation/production_embedding_quality_exclusions_v1.json`.
- Dry run: `candidate_count: 43`, `quality_exclusion_row_count: 7`, `quality_excluded_candidate_count: 7`, `orphan_existing_vector_count_pruned: 0`, `live_gemini_api_invoked: false`.
- Live Vertex Gemini Embedding 2: 43 new `jobs_research_workflow` vectors, `failed_count: 0`.
- Current vector snapshot: 6,077 total Gemini vectors; top 1,000 = 1,000; top 500 = 500; workflow vectors = 43.

Validation:

- `pytest tests/test_production_embeddings.py tests/test_embedding_contracts.py tests/test_source_authority_start_here.py -q` -> 53 passed.
- `ruff check` on touched production embedding files/tests -> pass.
- `source-authority-start-here` -> pass.
- `ain-510-retrieval-promotion-gate` -> valid, still blocked for runtime promotion.
- `ain-506-p0-gate` -> pass.
- `validate` -> pass.
- `git diff --check` -> pass.

Handoff:

- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.md`
- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.html`
```

## AIN-510

```markdown
2026-06-12 update: retrieval promotion gate now has workflow vectors, but remains correctly blocked for runtime promotion.

AIN-510 result:

- Status: `blocked_for_runtime_promotion`
- Promotion eligible: `false`
- Valid vectors: `6,077`
- Workflow vectors: `43`
- Stale vectors: `0`
- Known similar pairs: `50`
- Known dissimilar pairs: `50`
- Cosine gap: `0.207005`
- Remaining failed checks: `gate_4_sensitive_mismatch_fixture_suite_present`, `gate_5_runtime_rollback_proof_present`

Proof:

- Scoped `jobs_research_workflow` repaired corpus: 89 repaired workflow chunks.
- 50-row adversarial semantic spot check: 37 pass, 6 caution, 7 block.
- Enforced quality exclusions: `/srv/aina/aina-data-engine-room/artifacts/validation/production_embedding_quality_exclusions_v1.json`.
- Dry run: `candidate_count: 43`, `quality_exclusion_row_count: 7`, `quality_excluded_candidate_count: 7`, `orphan_existing_vector_count_pruned: 0`, `live_gemini_api_invoked: false`.
- Live Vertex Gemini Embedding 2: 43 new `jobs_research_workflow` vectors, `failed_count: 0`.
- Current vector snapshot: 6,077 total Gemini vectors; top 1,000 = 1,000; top 500 = 500; workflow vectors = 43.

Validation:

- `pytest tests/test_production_embeddings.py tests/test_embedding_contracts.py tests/test_source_authority_start_here.py -q` -> 53 passed.
- `ruff check` on touched production embedding files/tests -> pass.
- `source-authority-start-here` -> pass.
- `ain-510-retrieval-promotion-gate` -> valid, still blocked for runtime promotion.
- `ain-506-p0-gate` -> pass.
- `validate` -> pass.
- `git diff --check` -> pass.

Handoff:

- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.md`
- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.html`
```

## Footer

Ali Mehdi Mukadam - co-authored with Codex - 2026-06-12

```yaml
topics:
  - aina-data-engine-room
  - linear-proof
subtopics:
  - gemini-embeddings
  - workflow-vectors
  - ain-506
  - ain-508
  - ain-510
```