# Linear Update Payloads - Gemini Workflow Embedding

Ali Mehdi Mukadam - co-authored with Codex

Linear posting was blocked by `401 auth_revoked`, so these ready-to-post payloads were preserved locally.

## AIN-506

```markdown
2026-06-12 update: Gemini clean-before-embed workflow slice completed locally on the VDS.

Proof:
- Scoped `jobs_research_workflow` repaired corpus: 89 repaired workflow chunks.
- 50-row adversarial semantic spot check: 37 pass, 6 caution, 7 block.
- Enforced quality exclusions: `/srv/aina/aina-data-engine-room/artifacts/validation/production_embedding_quality_exclusions_v1.json`.
- Dry run: `candidate_count: 43`, `quality_exclusion_row_count: 7`, `quality_excluded_candidate_count: 7`, `orphan_existing_vector_count_pruned: 0`, `live_gemini_api_invoked: false`.
- Live Vertex Gemini Embedding 2: 43 new `jobs_research_workflow` vectors, `failed_count: 0`.
- Current vector snapshot: 6,077 total Gemini vectors; top 1,000 = 1,000; top 500 = 500; workflow vectors = 43.

Validation:
- `pytest tests/test_production_embeddings.py tests/test_embedding_contracts.py tests/test_source_authority_start_here.py -q` -> 53 passed.
- `ruff check` on touched production embedding files/tests -> pass.
- `source-authority-start-here` -> pass.
- `ain-510-retrieval-promotion-gate` -> valid, still blocked for runtime promotion.
- `ain-506-p0-gate` -> pass.
- `validate` -> pass.
- `git diff --check` -> pass.

Handoff:
- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.md`
- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.html`

Artifact policy cleanup:
- `artifacts/embeddings/` was removed from the git index but kept on disk and ignored going forward. Bulk vectors/Parquet/DuckDB/batch manifests are local build outputs; durable JSON/report receipts remain tracked.
```
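
The counts in the Proof section can be cross-checked with simple arithmetic. The sketch below is a local sanity check, not part of the payload. It assumes the 43 dry-run candidates are the spot-check rows marked pass or caution, which matches the reported numbers but is not confirmed by the pipeline itself. The function name `reconcile_counts` is hypothetical.

```python
def reconcile_counts(repaired_chunks, spot_check):
    """Derive reviewed, embeddable, excluded, and unreviewed row counts."""
    reviewed = sum(spot_check.values())
    # Assumption: rows marked pass or caution are the embedding candidates.
    embeddable = spot_check["pass"] + spot_check["caution"]
    return {
        "reviewed": reviewed,
        "embeddable": embeddable,
        "excluded": spot_check["block"],
        "unreviewed": repaired_chunks - reviewed,
    }


# Counts copied from the payload above.
counts = reconcile_counts(89, {"pass": 37, "caution": 6, "block": 7})
print(counts)
```

With the reported figures this gives 50 reviewed, 43 embeddable, 7 excluded, and 39 unreviewed rows, which lines up with `candidate_count: 43` and `quality_exclusion_row_count: 7`.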

## AIN-508

```markdown
2026-06-12 update: clean-before-embed gate advanced for workflow data.

The key correction was semantic QA before embedding: 7 repaired workflow chunks were excluded by exact `(chunk_id, text_hash)` match, instruction-prefix and placeholder noise were removed, and only the 43 reviewed, non-blocked rows were embedded. The remaining 39 of the 89 repaired workflow rows were outside the 50-row spot check and are not batch-cleared yet.

Proof:
- Scoped `jobs_research_workflow` repaired corpus: 89 repaired workflow chunks.
- 50-row adversarial semantic spot check: 37 pass, 6 caution, 7 block.
- Enforced quality exclusions: `/srv/aina/aina-data-engine-room/artifacts/validation/production_embedding_quality_exclusions_v1.json`.
- Dry run: `candidate_count: 43`, `quality_exclusion_row_count: 7`, `quality_excluded_candidate_count: 7`, `orphan_existing_vector_count_pruned: 0`, `live_gemini_api_invoked: false`.
- Live Vertex Gemini Embedding 2: 43 new `jobs_research_workflow` vectors, `failed_count: 0`.
- Current vector snapshot: 6,077 total Gemini vectors; top 1,000 = 1,000; top 500 = 500; workflow vectors = 43.

Validation:
- `pytest tests/test_production_embeddings.py tests/test_embedding_contracts.py tests/test_source_authority_start_here.py -q` -> 53 passed.
- `ruff check` on touched production embedding files/tests -> pass.
- `source-authority-start-here` -> pass.
- `ain-510-retrieval-promotion-gate` -> valid, still blocked for runtime promotion.
- `ain-506-p0-gate` -> pass.
- `validate` -> pass.
- `git diff --check` -> pass.

Handoff:
- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.md`
- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.html`
```
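
Excluding by exact `(chunk_id, text_hash)` means an exclusion applies only to one specific version of a chunk's text. The following is a minimal sketch of that idea, not the project's implementation: the record shape, the function names `text_hash` and `filter_candidates`, the sample chunk ids, and the choice of SHA-256 are all assumptions made for illustration. The real schema of `production_embedding_quality_exclusions_v1.json` is not shown in this note.

```python
import hashlib


def text_hash(text):
    # Assumption: SHA-256 over UTF-8 text; the real pipeline's hash may differ.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def filter_candidates(chunks, exclusions):
    """Drop chunks whose exact (chunk_id, text_hash) pair is excluded."""
    excluded = {(e["chunk_id"], e["text_hash"]) for e in exclusions}
    return [
        c for c in chunks
        if (c["chunk_id"], text_hash(c["text"])) not in excluded
    ]


# Hypothetical sample data, not taken from the real corpus.
chunks = [
    {"chunk_id": "wf-001", "text": "Review the job posting."},
    {"chunk_id": "wf-002", "text": "TODO placeholder"},
]
exclusions = [
    {"chunk_id": "wf-002", "text_hash": text_hash("TODO placeholder")},
]

kept = filter_candidates(chunks, exclusions)

# After the text is repaired, the hash changes and the old exclusion
# no longer matches, so the chunk becomes a candidate again.
repaired = [{"chunk_id": "wf-002", "text": "Draft the outreach note."}]
kept_after_repair = filter_candidates(repaired, exclusions)
```

Keying on the pair rather than on `chunk_id` alone keeps a stale exclusion from silently blocking a chunk whose text has since been fixed.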

## AIN-510

```markdown
2026-06-12 update: retrieval promotion gate now has workflow vectors, but remains correctly blocked for runtime promotion.

AIN-510 result:
- Status: `blocked_for_runtime_promotion`
- Promotion eligible: `false`
- Valid vectors: `6,077`
- Workflow vectors: `43`
- Stale vectors: `0`
- Known similar pairs: `50`
- Known dissimilar pairs: `50`
- Cosine gap: `0.207005`
- Remaining failed checks: `gate_4_sensitive_mismatch_fixture_suite_present`, `gate_5_runtime_rollback_proof_present`

Proof:
- Scoped `jobs_research_workflow` repaired corpus: 89 repaired workflow chunks.
- 50-row adversarial semantic spot check: 37 pass, 6 caution, 7 block.
- Enforced quality exclusions: `/srv/aina/aina-data-engine-room/artifacts/validation/production_embedding_quality_exclusions_v1.json`.
- Dry run: `candidate_count: 43`, `quality_exclusion_row_count: 7`, `quality_excluded_candidate_count: 7`, `orphan_existing_vector_count_pruned: 0`, `live_gemini_api_invoked: false`.
- Live Vertex Gemini Embedding 2: 43 new `jobs_research_workflow` vectors, `failed_count: 0`.
- Current vector snapshot: 6,077 total Gemini vectors; top 1,000 = 1,000; top 500 = 500; workflow vectors = 43.

Validation:
- `pytest tests/test_production_embeddings.py tests/test_embedding_contracts.py tests/test_source_authority_start_here.py -q` -> 53 passed.
- `ruff check` on touched production embedding files/tests -> pass.
- `source-authority-start-here` -> pass.
- `ain-510-retrieval-promotion-gate` -> valid, still blocked for runtime promotion.
- `ain-506-p0-gate` -> pass.
- `validate` -> pass.
- `git diff --check` -> pass.

Handoff:
- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.md`
- `/srv/aina/aina-data-engine-room/docs/handoff/2026-06-12-gemini-workflow-embedding-handoff.html`

```
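
The payload reports a cosine gap over 50 known similar and 50 known dissimilar pairs but does not define it. One common reading, assumed here, is the mean cosine similarity of the similar pairs minus the mean of the dissimilar pairs. The sketch below illustrates that reading with toy vectors. The function names are hypothetical and the gate's actual formula may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_gap(similar_pairs, dissimilar_pairs):
    # Assumed definition: mean(similar) - mean(dissimilar).
    similar = [cosine(a, b) for a, b in similar_pairs]
    dissimilar = [cosine(a, b) for a, b in dissimilar_pairs]
    return sum(similar) / len(similar) - sum(dissimilar) / len(dissimilar)


# Toy 2-D vectors, not real embeddings.
similar_pairs = [([1.0, 0.0], [1.0, 0.0])]
dissimilar_pairs = [([1.0, 0.0], [0.0, 1.0])]
gap = cosine_gap(similar_pairs, dissimilar_pairs)
```

Under this definition a larger positive gap means the embedding separates the known similar pairs from the known dissimilar ones more clearly.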

## Footer

Ali Mehdi Mukadam - co-authored with Codex - 2026-06-12

```yaml
topics:
  - aina-data-engine-room
  - linear-proof
subtopics:
  - gemini-embeddings
  - workflow-vectors
  - ain-506
  - ain-508
  - ain-510
```