PE Donor Promotion Package Checkpoint

2026-06-15

Date: 2026-06-15
Branch: codex/pe-donor-promotion-2026-06-15
Latest package command: uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-industry-taxonomy-support

The Single Idea

The old personalization-engine-aina work is now represented inside the engine room as a self-contained promotion package, not as a donor repo that future agents have to trust wholesale. Contract-shaped and evidence-shaped rows can be derived into engine-room receipts; prompt/workflow/ontology bulk stays advisory; known-bad market/K2 lineage stays quarantined.

What Changed

Added pe_donor_promotion_package_v1, a first-class artifact lane that reads the existing prior_work_source_authority_promotion_v1 receipt and extracts only the personalization_engine_aina rows into a smaller promotion queue.

Added pe_donor_derived_contracts_v1, a second artifact lane that promotes only two contract-shaped rows from that package into engine-room derived contract candidates:

pe_aina_ship_loop_operator_contract
pe_aina_evidence_roadmap_internal_preview

Added pe_donor_export_runtime_mapping_v1, a third artifact lane that maps those two derived contracts onto current engine-room consumer surfaces without mutating exports or runtime contracts.

Added pe_donor_foundation_source_mapping_v1, a fourth artifact lane that promotes the verified PE foundation-status source as repair-reduction lineage only. This is the narrow bridge for the old repo’s useful status/ontology/tool/title-mapping work: it points to current engine-room proof surfaces instead of re-importing donor rows wholesale.

Added pe_donor_workflow_grounding_mapping_v1, a fifth artifact lane that captures the post-May-15 workflow-grounding consensus batches as lineage and anti-loop proof. This is intentionally not a bulk row promotion: the donor summary reviewed 250 rows, but only 4 were production-allowed and 246 stayed not promoted or pending. The engine room now preserves that work so agents do not redo it, while still keeping workflow bulk behind current repair and semantic QA gates.

Added pe_donor_title_taxonomy_gate_v1, a sixth artifact lane that captures the deterministic PE title-taxonomy bucket output as gate-pending lineage. The donor output is useful because it canonically pooled JDs by bucket before normalization and produced 2,026 deterministic bucket records, but its bundled 50-row audit template is unscored. The engine room now records the source, provenance, SHA256s, and current replacement proof while blocking row promotion, runtime authority, embedding authority, and batch authority until a scored audit or current-repo diff proof exists.

Added pe_donor_title_taxonomy_audit_v1, a seventh artifact lane that deterministically checks the donor’s 50-row audit sample against the donor JSONL, donor hashes, and current engine-room aggregate function support. This does not use an LLM and does not promote rows. It proves the donor files are fresh against the gate receipt, confirms all 50 sampled buckets still exist and match donor JD counts, and separates 32 replacement-diff candidates from 11 generic/noisy lineage-only rows and 7 blocked rows.

Added pe_donor_prompt_workflow_ontology_inventory_v1, an eighth artifact lane that splits the old repo’s prompt/workflow/ontology bulk into source-family inventory rows. This is Ali’s “do not lose the valuable prompt, image prompt, workflow, and ontology work” concern captured as durable proof: 10 donor families are hashed and counted, but still blocked from row promotion, runtime authority, embedding authority, and batch authority until each family passes repair/diff/quality gates.

Added pe_donor_curriculum_release_lineage_v1, a ninth artifact lane that captures the old repo’s curriculum-engine release packets, curricula, polished packet, and reports as repair lineage. This preserves the valuable packet/curriculum/mastery-gate shape without importing raw profile text or promoting stale role/workflow joins. The live receipt hashes 14 files, sees 46 curriculum modules, blocks 6 packet rows for empty role tasks/workflows plus supply-safety bypass evidence, and blocks the weak founder/ecommerce role match before any runtime, export, embedding, or batch authority.

Added pe_donor_source_intelligence_scaleout_lineage_v1, a tenth artifact lane that captures the old repo’s source-intelligence scaleout package as a 30-family import-decision ledger. This closes the loop on the parallel audit reports Ali ran: the 71-file donor package is useful as lineage, deterministic script comparison input, alpha feedback support, review-packet support, and future diff candidates, but it is not direct runtime/export/embedding/batch/row authority. The receipt explicitly preserves 4 future import candidates, 7 repair-first families, 8 already-accounted-for families, 6 advisory-only families, 4 superseded families, and 1 blocked raw surface.

Added e5_source_authority_reconciliation_v1, an eleventh artifact lane that turns the E5 title ledger, E6 mapping-chain ledger, source-authority registry, Academy export manifest, LinkedIn/JD source-intake, JD-aware role context, chunk/vector reconciliation, prior-work source promotion, and full validation into an explicit anti-loop receipt. This is the machine-checkable answer to Ali’s concern that agents were re-reviewing individual titles when contextual source authority already existed. The live receipt accounts for 9 of 9 required assets, blocks fresh title-level LLM review for all accounted assets, records 129,165 LinkedIn/JD context rows and 151,983 current vectors, and still grants no runtime, embedding, batch, donor-mutation, external-write, or public-runtime authority.

Added prior_work_promotion_delta_closure_v1, a twelfth artifact lane that reconciles the original 16 promote_candidate_verify_first rows from prior_work_source_authority_promotion_v1 against the specialized gates that now exist. This closes the generic donor backlog without pretending the rows are production data: 6 are covered by current engine-room receipts, 7 by specialized donor gates, 2 are advisory-only, and 1 is a future deterministic diff lane. Fresh title-level LLM review remains blocked for all 16 rows.

Added pe_donor_alpha_feedback_support_v1, a thirteenth artifact lane that turns the source-intelligence alpha_feedback_bundle_v2 import candidate into a hashed review-support receipt. It preserves 40 examples, marks 36 as review-support candidates, blocks 4 noisy marketplace-title examples, excludes raw workflow/practice/proof text, and still grants no export, runtime, row-promotion, embedding, batch, public-runtime, real-user-data, external-write, or donor-mutation authority.

Added pe_donor_review_packet_support_v1, a fourteenth artifact lane that turns the source-intelligence review_packets_v1 import candidate into a packet-footprint support receipt. It preserves the four 500-row review packet file pairs as hashes, column shapes, and packet counts, excludes all 2,000 donor packet rows and old auto-reviewed labels, redacts the legacy status column name, and still grants no export, runtime, row-promotion, embedding, batch, public-runtime, real-user-data, external-write, or donor-mutation authority.

Added pe_donor_industry_taxonomy_support_v1, a fifteenth artifact lane that turns the source-intelligence industry_taxonomy_decisions_v1 import candidate into aggregate repair-reduction support. It verifies the full 17,118-row donor JSONL/CSV against provenance, summarizes only four decision groups, excludes individual title/category rows and old auto-reviewed labels, and still grants no export, runtime, row-promotion, embedding, batch, public-runtime, real-user-data, external-write, or donor-mutation authority.

Promoted the PE donor export/runtime mapping into the actual consumer receipts as lineage-only input:

production_runtime_contracts_v1 now requires the PE donor export/runtime mapping receipt, records two donor lineage rows and six surface refs, and keeps runtime/embedding authority false.
engine_room_export_manifest_v1 now records personalization_engine_aina_donor_lineage as a product-consumed source family while still exporting only pinned top-500/top-1,000 static rows.
source_authority_registry_v2 now carries the two audited PE donor lineage rows as advisory_lineage, never runtime authority or embedding authority.
source_authority_registry_v2 now also carries the verified PE foundation source row as source_evidence, still never runtime authority or embedding authority.
source_authority_registry_v2 now also carries the PE workflow-grounding consensus row as source_evidence, still never runtime authority, embedding authority, or batch authority.
source_authority_registry_v2 now also carries the PE title-taxonomy gate row as source_evidence, including deterministic audit candidate/freshness counts, still never row authority, runtime authority, embedding authority, or batch authority.
source_authority_registry_v2 now also carries the prompt/workflow/ontology bulk inventory row as source_evidence, including candidate-family, file, parseable-row, invalid-row, and legacy-reviewer-wording counts, still never row authority, runtime authority, embedding authority, or batch authority.
source_authority_registry_v2 now also carries the curriculum release lineage row as source_evidence, including packet/curriculum/report counts, module count, repair blockers, and weak-match evidence, still never row authority, runtime authority, embedding authority, or batch authority.
source_authority_registry_v2 now also carries the source-intelligence scaleout row as source_evidence, including asset-family, donor-file, repair-first, and import-candidate counts, still never row authority, runtime authority, embedding authority, or batch authority.
source_authority_registry_v2 now also carries the alpha-feedback support row as source_evidence, including 40 examples, 36 review-support candidates, and 4 noisy-title blocks, still never row authority, export authority, runtime authority, embedding authority, or batch authority.
source_authority_registry_v2 now also carries the review-packet support row as source_evidence, including 4 review packets, 2,000 counted CSV rows, and 4 packets with 500 rows each, still never row authority, export authority, runtime authority, embedding authority, or batch authority.
source_authority_registry_v2 now also carries the industry-taxonomy support row as source_evidence, including 17,118 verified donor decisions and 4 aggregate decision groups, still never row authority, export authority, runtime authority, embedding authority, or batch authority.
source_authority_registry_v2 now also carries the E5/E6 source-authority reconciliation row as source_evidence, including accounted-asset count, top-500/top-1,000 export counts, LinkedIn context count, and fresh-LLM-review block count, still never row authority, runtime authority, embedding authority, or batch authority.
source_authority_registry_v2 now also carries the prior-work promotion delta closure row as source_evidence, including candidate count, generic-open count, future-lane count, and fresh-title-LLM-review block count, still never row authority, runtime authority, embedding authority, or batch authority.
jd_aware_role_context_evidence_v1 now consumes the PE donor mapping as lineage while using gold-spine evidence IDs and exact local LinkedIn title matches to recover JD-aware context.

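Each lane listed below writes four durable artifacts, a JSON and a JSONL receipt under artifacts/validation/ plus a Markdown and an HTML report under artifacts/reports/: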

artifacts/validation/pe_donor_promotion_package_v1.json
artifacts/validation/pe_donor_promotion_package_v1.jsonl
artifacts/reports/pe_donor_promotion_package_v1.md
artifacts/reports/pe_donor_promotion_package_v1.html
artifacts/validation/pe_donor_foundation_source_mapping_v1.json
artifacts/validation/pe_donor_foundation_source_mapping_v1.jsonl
artifacts/reports/pe_donor_foundation_source_mapping_v1.md
artifacts/reports/pe_donor_foundation_source_mapping_v1.html
artifacts/validation/pe_donor_workflow_grounding_mapping_v1.json
artifacts/validation/pe_donor_workflow_grounding_mapping_v1.jsonl
artifacts/reports/pe_donor_workflow_grounding_mapping_v1.md
artifacts/reports/pe_donor_workflow_grounding_mapping_v1.html
artifacts/validation/pe_donor_title_taxonomy_gate_v1.json
artifacts/validation/pe_donor_title_taxonomy_gate_v1.jsonl
artifacts/reports/pe_donor_title_taxonomy_gate_v1.md
artifacts/reports/pe_donor_title_taxonomy_gate_v1.html
artifacts/validation/pe_donor_title_taxonomy_audit_v1.json
artifacts/validation/pe_donor_title_taxonomy_audit_v1.jsonl
artifacts/reports/pe_donor_title_taxonomy_audit_v1.md
artifacts/reports/pe_donor_title_taxonomy_audit_v1.html
artifacts/validation/pe_donor_prompt_workflow_ontology_inventory_v1.json
artifacts/validation/pe_donor_prompt_workflow_ontology_inventory_v1.jsonl
artifacts/reports/pe_donor_prompt_workflow_ontology_inventory_v1.md
artifacts/reports/pe_donor_prompt_workflow_ontology_inventory_v1.html
artifacts/validation/pe_donor_curriculum_release_lineage_v1.json
artifacts/validation/pe_donor_curriculum_release_lineage_v1.jsonl
artifacts/reports/pe_donor_curriculum_release_lineage_v1.md
artifacts/reports/pe_donor_curriculum_release_lineage_v1.html
artifacts/validation/pe_donor_source_intelligence_scaleout_lineage_v1.json
artifacts/validation/pe_donor_source_intelligence_scaleout_lineage_v1.jsonl
artifacts/reports/pe_donor_source_intelligence_scaleout_lineage_v1.md
artifacts/reports/pe_donor_source_intelligence_scaleout_lineage_v1.html
artifacts/validation/e5_source_authority_reconciliation_v1.json
artifacts/validation/e5_source_authority_reconciliation_v1.jsonl
artifacts/reports/e5_source_authority_reconciliation_v1.md
artifacts/reports/e5_source_authority_reconciliation_v1.html
artifacts/validation/prior_work_promotion_delta_closure_v1.json
artifacts/validation/prior_work_promotion_delta_closure_v1.jsonl
artifacts/reports/prior_work_promotion_delta_closure_v1.md
artifacts/reports/prior_work_promotion_delta_closure_v1.html
artifacts/validation/pe_donor_alpha_feedback_support_v1.json
artifacts/validation/pe_donor_alpha_feedback_support_v1.jsonl
artifacts/reports/pe_donor_alpha_feedback_support_v1.md
artifacts/reports/pe_donor_alpha_feedback_support_v1.html
artifacts/validation/pe_donor_review_packet_support_v1.json
artifacts/validation/pe_donor_review_packet_support_v1.jsonl
artifacts/reports/pe_donor_review_packet_support_v1.md
artifacts/reports/pe_donor_review_packet_support_v1.html
artifacts/validation/pe_donor_industry_taxonomy_support_v1.json
artifacts/validation/pe_donor_industry_taxonomy_support_v1.jsonl
artifacts/reports/pe_donor_industry_taxonomy_support_v1.md
artifacts/reports/pe_donor_industry_taxonomy_support_v1.html
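
The listing above follows one naming pattern per lane. A minimal sketch of that pattern, assuming it holds for every lane; the helper name lane_artifact_paths is hypothetical and not taken from the engine-room source:

```python
from pathlib import PurePosixPath


def lane_artifact_paths(lane: str) -> list[str]:
    """Return the four artifact paths a lane is expected to write.

    Hypothetical helper: it only mirrors the naming pattern visible in the
    artifact listing and is not part of the engine-room code.
    """
    validation = PurePosixPath("artifacts/validation")
    reports = PurePosixPath("artifacts/reports")
    return [
        str(validation / f"{lane}.json"),   # machine-readable receipt
        str(validation / f"{lane}.jsonl"),  # row-level receipt lines
        str(reports / f"{lane}.md"),        # Markdown report
        str(reports / f"{lane}.html"),      # HTML report
    ]


if __name__ == "__main__":
    for path in lane_artifact_paths("pe_donor_promotion_package_v1"):
        print(path)
```

A helper like this could be used to confirm that all four files exist for a lane before treating its receipt as complete.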

The lanes are also wired into the CLI as:

uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-promotion-package
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-derived-contracts
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-export-runtime-mapping
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-foundation-source-mapping
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-workflow-grounding-mapping
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-title-taxonomy-gate
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-title-taxonomy-audit
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-prompt-workflow-ontology-inventory
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-curriculum-release-lineage
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-source-intelligence-scaleout-lineage
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-alpha-feedback-support
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-review-packet-support
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-industry-taxonomy-support
uv run aina-data-engine --root /srv/aina/aina-data-engine-room e5-source-authority-reconciliation
uv run aina-data-engine --root /srv/aina/aina-data-engine-room prior-work-promotion-delta-closure
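
The commands above can be driven from one small script. This is a sketch only: the subcommand names and the --root value are copied from the list, while build_command, run_all, and the dry_run flag are hypothetical names introduced for illustration. Whether the lanes depend on this order is not stated here; the sketch simply keeps the listed order.

```python
import subprocess

ROOT = "/srv/aina/aina-data-engine-room"

# Subcommands copied from the CLI list, in the same order.
LANE_COMMANDS = [
    "pe-donor-promotion-package",
    "pe-donor-derived-contracts",
    "pe-donor-export-runtime-mapping",
    "pe-donor-foundation-source-mapping",
    "pe-donor-workflow-grounding-mapping",
    "pe-donor-title-taxonomy-gate",
    "pe-donor-title-taxonomy-audit",
    "pe-donor-prompt-workflow-ontology-inventory",
    "pe-donor-curriculum-release-lineage",
    "pe-donor-source-intelligence-scaleout-lineage",
    "pe-donor-alpha-feedback-support",
    "pe-donor-review-packet-support",
    "pe-donor-industry-taxonomy-support",
    "e5-source-authority-reconciliation",
    "prior-work-promotion-delta-closure",
]


def build_command(lane: str, root: str = ROOT) -> list[str]:
    """Build the argv list for one lane (hypothetical helper)."""
    return ["uv", "run", "aina-data-engine", "--root", root, lane]


def run_all(root: str = ROOT, dry_run: bool = True) -> None:
    """Print every lane command, or execute them when dry_run is False."""
    for lane in LANE_COMMANDS:
        command = build_command(lane, root)
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first lane that exits non-zero.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

The dry run is the default so the script never touches the engine room unless run_all(dry_run=False) is called deliberately.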

Current Proof

The live package receipt is valid and reports:

Metric	Value
PE donor rows	7
Derive-ready rows	5
Advisory-only rows	1
Quarantined rows	1
Embedding rows allowed now	0
Runtime authority rows allowed now	0

The derived-contract receipt is also valid and reports:

Metric	Value
Derived contract rows	2
Expected source rows present	2
Embedding rows allowed now	0
Runtime authority rows allowed now	0

The export/runtime mapping receipt is valid and reports:

Metric	Value
Mapping rows	2
Mapped surface refs	6
Embedding rows allowed now	0
Runtime authority rows allowed now	0

It maps the donor-derived contracts to:

engine_room_export_manifest_v1
ai_fluency_headless_loop_v1
production_runtime_contracts_v1
jd_aware_role_context_evidence_v1
source_authority_registry_v2

The verified-foundation mapping receipt is valid and reports:

Metric	Value
Mapping rows	1
Mapped source-authority surfaces	5
Embedding rows allowed now	0
Runtime authority rows allowed now	0
LLM title review allowed rows	0

It maps pe_aina_verified_data_foundation_status to:

production_source_authority_registry_v1
production_source_authority_inventory_v1
source_authority_registry_v2
jd_aware_role_context_evidence_v1
named_tool_source_authority_v1

The workflow-grounding mapping receipt is valid and reports:

Metric	Value
Mapping rows	1
Mapped proof surfaces	6
Donor reviewed rows	250
Donor production-allowed rows	4
Donor not-promoted rows	246
Donor hard-stop rows	245
Workflow-seed semantic QA failures	5
Workflow-intelligence repaired QA pass count	50
Embedding rows allowed now	0
Runtime authority rows allowed now	0
Batch rows allowed now	0

It maps pe_aina_workflow_grounding_consensus_batches to the current workflow proof surfaces:

source_authority_registry_v2
production_chunk_vector_reconciliation_v1
workflow_seed_embedding_eligibility
workflow_intelligence_embedding_eligibility
workflow_seed_semantic_qa
workflow_intelligence_repaired_semantic_qa

The title-taxonomy gate receipt is valid and reports:

Metric	Value
Mapping rows	1
Donor bucket count	2,026
Donor provenance bucket count	2,026
Donor audit sample rows	50
Donor audit verified rows	0
Donor audit unscored rows	50
Current beta title rows	74,225
Current serviceable title rows	50,053
Clean candidate rows	44,440
Trusted jobs-research titles	15,104
Top 500 titles with role context	484
Top 1,000 titles with role context	964
Row promotion allowed now	0
Embedding rows allowed now	0
Runtime authority rows allowed now	0
Batch rows allowed now	0

It maps pe_aina_title_taxonomy_bucket_outputs to the current title/source proof surfaces:

beta_readiness_path_v1
production_source_authority_registry_v1
jd_aware_role_context_evidence_v1
top_worked_title_readiness_v1
source_authority_registry_v2

The title-taxonomy donor paths are intentionally not serialized as live host paths. Receipts now use sanitized external-ref:* references plus SHA256s so the engine room preserves provenance without exposing local donor filesystem layout.
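
A rough sketch of what a sanitized reference plus SHA256 could look like. The layout external-ref:<label> is an assumption (this document only states that refs take the form external-ref:* and carry SHA256s), and both function names are hypothetical:

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large donor files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sanitized_source_ref(path: Path, label: str) -> dict[str, str]:
    """Build a provenance record that never contains the host path.

    Assumption: the caller supplies a logical label; the real receipts may
    derive the part after "external-ref:" differently.
    """
    return {
        "source_ref": f"external-ref:{label}",
        "sha256": sha256_of_file(path),
    }
```

The record carries enough to verify the file later (the hash) while the local donor filesystem layout stays out of the receipt.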

The title-taxonomy deterministic audit receipt is valid and reports:

Metric	Value
Audit rows parsed	50
Buckets found in donor JSONL	50
JD-count matches	50
Current function-supported rows	43
Replacement-diff candidates	32
Generic/noisy lineage-only rows	11
Blocked missing/mismatch rows	7
Noisy tool rows	0
Noisy responsibility rows	0
Row promotion allowed now	0
Runtime authority rows allowed now	0
Embedding rows allowed now	0
Batch rows allowed now	0

Its checks prove the donor bucket JSONL, provenance file, and audit template hashes still match the gate receipt. This closes the immediate freshness loop Claude flagged, while keeping the actual promotion threshold and replacement-diff application as a future deliberate slice.
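
The freshness check described above amounts to comparing the hashes recorded in the gate receipt with hashes recomputed from the donor files. A minimal sketch, assuming both sides are available as name-to-digest mappings; stale_files and the file labels are hypothetical:

```python
def stale_files(recorded: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return the names whose current hash is missing or differs from the recorded one.

    An empty list means every recorded file is still fresh.
    """
    return sorted(
        name for name, digest in recorded.items() if current.get(name) != digest
    )


if __name__ == "__main__":
    # Placeholder digests for illustration only, not real receipt values.
    gate_receipt = {"bucket_jsonl": "aaa", "provenance": "bbb", "audit_template": "ccc"}
    recomputed = {"bucket_jsonl": "aaa", "provenance": "bbb", "audit_template": "ccc"}
    print(stale_files(gate_receipt, recomputed))
```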

The prompt/workflow/ontology inventory receipt is valid and reports:

Metric	Value
Inventory rows	10
Existing donor source families	10
Files inventoried	7,916
JSON files	3,040
Markdown files	2,010
Schema files	13
JSONL lines counted	48,541
Parseable JSON rows	48,534
Invalid JSON rows	8
Rows with role IDs	48,513
Rows with workflow IDs	39,863
Rows with prompt instructions	39,863
Legacy reviewer wording rows	14,798
Row promotion allowed now	0
Runtime authority rows allowed now	0
Embedding rows allowed now	0
Batch rows allowed now	0

Its checks prove the coarse PE bulk row exists in the donor package, source refs are sanitized, legacy human_review fields are not serialized, every row is inventory-only, donor repos stay read-only, and no live Gemini call was made. The 8 invalid JSON rows and legacy wording count are now explicit repair inputs instead of hidden risk.
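
Counting parseable and invalid JSONL rows can be done deterministically with the standard library. The sketch below is an illustration, not the inventory lane's actual code; skipping blank lines is an assumption, and count_jsonl_rows is a hypothetical name:

```python
import json
from typing import Iterable


def count_jsonl_rows(lines: Iterable[str]) -> dict[str, int]:
    """Count non-blank lines and split them into parseable and invalid JSON rows."""
    counts = {"lines": 0, "parseable": 0, "invalid": 0}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are not counted as rows
        counts["lines"] += 1
        try:
            json.loads(line)
        except json.JSONDecodeError:
            counts["invalid"] += 1
        else:
            counts["parseable"] += 1
    return counts


if __name__ == "__main__":
    sample = ['{"role_id": "r1"}', "not json", "", "[1, 2]"]
    print(count_jsonl_rows(sample))
```

Keeping the invalid count explicit is what lets those rows become named repair inputs rather than silently dropped lines.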

The curriculum-release lineage receipt is valid and reports:

Metric	Value
Donor files inventoried	14
Packet JSON files	5
Polished packet JSON files	1
Curriculum JSON files	5
Report markdown files	3
Curriculum modules seen	46
Packet rows with empty role tasks	6
Packet rows with empty role workflows	6
Rows with supply-safety bypass evidence	6
Rows with profile or learner context excluded	11
Weak role-match rows	1
Row promotion allowed now	0
Runtime authority rows allowed now	0
Embedding rows allowed now	0
Batch rows allowed now	0

Its checks prove donor files are hashed, source refs are sanitized, raw profile text is not serialized, profile/learner context is excluded from import, the weak founder/ecommerce match is blocked, current contract receipts are present, donor repos stay read-only, and no live Gemini call was made. This is the engine-room proof that the old curriculum work is valuable lineage and shape evidence, not current product authority.

The source-intelligence scaleout lineage receipt is valid and reports:

Metric	Value
Donor files inventoried	71
Asset families classified	30
Already-accounted-for families	8
Future import candidates	4
Repair-first families	7
Advisory-only families	6
Superseded-by-engine-room families	4
Blocked raw surfaces	1
Row promotion allowed now	0
Runtime authority rows allowed now	0
Embedding rows allowed now	0
Batch rows allowed now	0

Its checks prove the real donor package is present, has the expected 71-file footprint, matches the 30-family classification ledger from the parallel source-intelligence audit, uses sanitized external source refs, removes legacy human-review wording from derived receipt rows, keeps raw JD/original source references blocked, keeps workflow bulk repair-first, treats alpha feedback as candidate-only, keeps donor repos read-only, and makes no live Gemini call.

The alpha-feedback support receipt is valid and reports:

Metric	Value
Donor examples	40
Review-support candidates	36
Blocked noisy-title examples	4
Artifact types	5
Audience buckets	6
Row promotion allowed now	0
Export allowed now	0
Runtime authority rows allowed now	0
Embedding rows allowed now	0
Batch rows allowed now	0

Its checks prove the source-intelligence lineage receipt still marks alpha_feedback_bundle_v2 as an import candidate, the six donor alpha files are present, row counts match the JSON/JSONL/summary footprint, noisy marketplace titles are blocked, raw workflow/practice/proof text is excluded from the engine-room receipt, only hashes and compact role/tool metadata are serialized, legacy review fields are absent, donor repos stay read-only, and no live Gemini call was made. This gives future review-support or curriculum-critique lanes a safe starting point without promoting the alpha examples to product authority.

The review-packet support receipt is valid and reports:

Metric	Value
Review packets	4
CSV rows counted	2,000
Packets with 500 rows	4
CSV files	4
Markdown files	4
Row promotion allowed now	0
Export allowed now	0
Runtime authority rows allowed now	0
Embedding rows allowed now	0
Batch rows allowed now	0

Its checks prove the source-intelligence lineage receipt still marks review_packets_v1 as an import candidate, all four expected packet pairs are present, each CSV has 500 rows, packet rows are not serialized, old reviewer values are absent from the derived receipt, only hashes and column shapes are carried forward, donor repos stay read-only, and no live Gemini call was made. This gives future deterministic repair, QA, or semantic comparison lanes a bounded prior-work reference without making any old packet row or label a product authority.

The industry-taxonomy support receipt is valid and reports:

Metric	Value
Decision rows verified	17,118
CSV rows verified	17,118
Decision groups serialized	4
Generic/caveated rows	14,906
Spelling-normalization rows	1,749
Accepted-industry rows	304
Role-label-not-industry rows	159
Row promotion allowed now	0
Export allowed now	0
Runtime authority rows allowed now	0
Embedding rows allowed now	0
Batch rows allowed now	0

Its checks prove the source-intelligence lineage receipt still marks industry_taxonomy_decisions_v1 as an import candidate, the full donor JSONL and CSV hashes match provenance, the packaged summary count matches the 17,118-row donor file, only aggregate decision groups are serialized, old reviewer values are absent from the derived receipt, donor repos stay read-only, and no live Gemini call was made. This gives future company/industry repair lanes a verified prior-work reference without making any donor industry label product truth.
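
As a quick consistency check on the table above, the four aggregate decision groups should account for every verified decision row. The counts are copied from the table; the dictionary keys and the function name are hypothetical labels, not receipt field names:

```python
# Counts copied from the industry-taxonomy support table.
DECISION_GROUPS = {
    "generic_or_caveated": 14_906,
    "spelling_normalization": 1_749,
    "accepted_industry": 304,
    "role_label_not_industry": 159,
}
DECISION_ROWS_VERIFIED = 17_118


def groups_cover_total(groups: dict[str, int], total: int) -> bool:
    """True when the aggregate groups sum exactly to the verified row count."""
    return sum(groups.values()) == total


if __name__ == "__main__":
    print(groups_cover_total(DECISION_GROUPS, DECISION_ROWS_VERIFIED))
```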

The E5/E6 source-authority reconciliation receipt is valid and reports:

Metric	Value
Required assets	9
Accounted assets	9
Context/JD-aware gate assets	2
Fresh LLM review allowed assets	0
Top 500 export rows	500
Top 1,000 export rows	1,000
LinkedIn/JD context rows	129,165
Current vectors accounted	151,983
Runtime authority rows allowed now	0
Embedding rows allowed now	0
Batch rows allowed now	0

Its checks prove docs/TITLE-LEDGER.md, docs/MAPPING-CHAIN-LEDGER.md, source_authority_registry_v2, engine_room_export_manifest_v1, top_band_linkedin_source_authority_intake_v1, jd_aware_role_context_evidence_v1, production_chunk_vector_reconciliation_v1, prior_work_source_authority_promotion_v1, and full_validation.json are all present, which is sufficient to block fresh title-only LLM review unless a missing, stale, broken, or conflicting source-authority signal is named.

The prior-work promotion delta closure receipt is valid and reports:

Metric	Value
Promote-candidate rows reconciled	16
Prior promote-candidate rows found	16
Covered by current engine-room receipts	6
Covered by specialized donor gates	7
Advisory-only rows	2
Future deterministic diff lanes	1
Generic promote-candidate rows still open	0
Fresh title-level LLM review allowed rows	0
Row promotion allowed now	0
Runtime authority rows allowed now	0
Embedding rows allowed now	0
Batch rows allowed now	0

Its checks prove every prior promote-candidate row is now accounted for by a current receipt or specific future lane, every row has complete proof, the generic promote-candidate backlog is closed, redundant title-level LLM review is blocked, donor repos stay read-only, and no live Gemini call was made.

The source-authority and JD-aware receipts are also valid after this slice:

Metric	Value
Source-authority registry rows	48
PE donor lineage rows in source authority	2
PE foundation source rows in source authority	1
PE workflow-grounding rows in source authority	1
PE curriculum-release rows in source authority	1
PE curriculum-release files	14
PE curriculum-release modules	46
PE curriculum-release repair blockers	6
PE prompt/workflow/ontology inventory rows in source authority	1
PE prompt/workflow/ontology candidate families	10
PE source-intelligence scaleout rows in source authority	1
E5/E6 reconciliation rows in source authority	1
E5/E6 accounted assets	9
E5/E6 fresh LLM review allowed assets	0
Prior-work delta closure rows in source authority	1
Prior-work delta candidate rows	16
Prior-work generic promote candidates still open	0
Prior-work specific future lanes	3
PE prompt/workflow/ontology files	7,916
PE prompt/workflow/ontology parseable rows	48,534
PE source-intelligence rows in source authority	1
PE source-intelligence asset families	30
PE source-intelligence repair-first families	7
PE source-intelligence import candidates	4
PE alpha-feedback rows in source authority	1
PE alpha-feedback examples	40
PE alpha-feedback review-support candidates	36
PE alpha-feedback noisy-title blocks	4
PE review-packet rows in source authority	1
PE review packets	4
PE review-packet CSV rows	2,000
PE review packets with 500 rows	4
PE industry-taxonomy rows in source authority	1
PE industry-taxonomy decision rows	17,118
PE industry-taxonomy decision groups	4
PE title-taxonomy gate rows in source authority	1
PE title-taxonomy deterministic audit candidates	32
Combined chunks covered	467,436
Gemini vectors covered	151,983
JD-aware role-context rows	1,056
Rows with JD context	1,018
Top 500 titles with role context	484 / 499
Top 1,000 titles with role context	964 / 996
E2E fixture rows	50
Teaching-ready / guardrail fixtures	34 / 16

The JD-aware recovery is deterministic: it first uses explicit source refs, then gold-spine evidence_jd_ids, then bounded exact linkedin_jobs.title_normalized matches. Consumer artifacts still redact raw job IDs, company refs, summaries, and snippets according to the artifact exposure policy.
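
The three-step recovery order can be sketched as a simple fallback cascade. Only evidence_jd_ids and title_normalized are names from this document; the source_refs key, the max_matches bound, the row and index shapes, and the function name are assumptions made for illustration. Redaction of raw job IDs in consumer artifacts is not modeled here.

```python
from typing import Optional


def recover_jd_context(
    row: dict,
    linkedin_titles: dict[str, list[str]],
    max_matches: int = 5,
) -> Optional[dict]:
    """Return the first available JD context source, or None if nothing matches."""
    # Step 1: explicit source refs win when present.
    refs = row.get("source_refs") or []
    if refs:
        return {"method": "explicit_source_ref", "jd_ids": list(refs)}

    # Step 2: fall back to gold-spine evidence IDs.
    evidence = row.get("evidence_jd_ids") or []
    if evidence:
        return {"method": "gold_spine_evidence", "jd_ids": list(evidence)}

    # Step 3: bounded exact match on the normalized title.
    matches = linkedin_titles.get(row.get("title_normalized", ""), [])
    if matches:
        # Sorting before truncating keeps the result deterministic.
        return {"method": "exact_title_match", "jd_ids": sorted(matches)[:max_matches]}

    return None
```

Because each step is a plain lookup with a fixed priority and a sorted, bounded result, the same inputs always yield the same context.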

All package checks passed:

Prior-work receipt is valid.
PE donor rows are present.
Bulk rows are not promoted.
market_v2 / context-blind K2 lineage is quarantined.
Embedding authority remains false.
Runtime authority remains false.
Public runtime and external writes remain false.
Donor repos remain read-only.
Live Gemini was not invoked.
Source refs are sanitized.
No human_review field was introduced.
Existing export rows were not mutated.
Platform runtime was not touched.
Runtime contracts now consume the PE donor mapping as lineage only.
Academy export manifest now records the PE donor lineage family without changing export rows.
Source authority now admits only the two audited PE donor lineage source IDs.
Source authority now admits the verified PE foundation-status row as source evidence, not runtime authority.
Source authority now admits the workflow-grounding consensus row as source evidence, not runtime authority, embedding authority, or batch authority.
Source authority now admits the title-taxonomy bucket output as gate-pending source evidence with deterministic audit freshness/candidate counts, not row authority, runtime authority, embedding authority, or batch authority.
Source authority now admits the source-intelligence scaleout row as source evidence, not row authority, runtime authority, embedding authority, or batch authority.
Source authority now admits the alpha-feedback support row as source evidence, not row authority, export authority, runtime authority, embedding authority, or batch authority.
Source authority now admits the review-packet support row as source evidence, not row authority, export authority, runtime authority, embedding authority, or batch authority.
Source authority now admits the industry-taxonomy support row as source evidence, not row authority, export authority, runtime authority, embedding authority, or batch authority.
Source authority now admits the E5/E6 reconciliation row as source evidence, not row authority, runtime authority, embedding authority, or batch authority.
Source authority now admits the prior-work promotion delta closure row as source evidence, not row authority, runtime authority, embedding authority, or batch authority.
A negative test proves unaudited prompt/workflow/ontology bulk cannot enter the source-authority registry from the PE donor mapping.
A new donor workflow-grounding test proves the 250-row consensus run is preserved but cannot become bulk authority: only 4 rows were production-allowed, and current workflow-seed semantic QA still blocks bulk promotion.
A new donor title-taxonomy test proves the 2,026 deterministic bucket outputs are preserved as lineage but cannot become promoted rows because the donor audit has zero verified/scored rows.
A new donor title-taxonomy audit test proves deterministic sample inspection can produce replacement-diff candidates while row/runtime/embedding/batch authority remains zero.
A new donor prompt/workflow/ontology inventory test proves prompt assets, workflow ontology, schemas, module content, mastery gates, audio prompts, prompt downloads, review queues, source-intelligence scaleout outputs, and curriculum release packets are inventoried as gated source families without serializing legacy reviewer fields or granting authority.
A new donor curriculum-release lineage test proves 14 packet/curriculum/report files are preserved as source evidence, while empty role tasks/workflows, supply-safety bypasses, profile context, and the weak founder/ecommerce role match block row/runtime/vector/batch authority.
A new donor source-intelligence scaleout lineage test proves 30 asset families across the 71-file donor package are classified without serializing legacy reviewer-gate wording or granting row/runtime/vector/batch authority.
A new donor alpha-feedback support test proves 40 alpha examples are preserved as hashed review-support lineage, 4 noisy marketplace-title examples are blocked, raw workflow/practice/proof text is excluded, and no export/runtime/vector/batch authority is granted.
A new donor review-packet support test proves four 500-row source-intelligence review packet pairs are preserved as hashed packet-footprint support, no packet rows or old auto-reviewed labels are serialized, and no export/runtime/vector/batch authority is granted.
A new donor industry-taxonomy support test proves 17,118 donor taxonomy decisions are verified against provenance and reduced to 4 aggregate decision groups without serializing title/category rows or granting export/runtime/vector/batch authority.
A new E5 source-authority reconciliation test proves E5/E6/source-authority/export/JD/vector/validation assets block redundant title-level LLM review without granting row/runtime/vector/batch authority.
A new prior-work promotion delta closure test proves the 16 old generic promote-candidate rows are closed into named current-receipt or future-lane decisions without fresh title-level LLM review or row/runtime/vector/batch authority.
JD-aware role context now uses prior gold-spine and local LinkedIn evidence before falling back to explicit gaps.
Artifact exposure scan reports active_finding_count: 0.
AIN-506, AIN-510, production runtime readiness, and full validation all pass after this slice.
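
The boundary these tests enforce (admitted as source evidence, with no other authority granted) can be sketched as a small check. This is an illustration only, not the engine room's actual test code: the receipt shape and the field names `source_evidence` and `authority` are assumptions made for the example.

```python
# Hypothetical receipt shape; real engine-room receipts may use different fields.
AUTHORITY_KINDS = ("row", "export", "runtime", "embedding", "batch")


def granted_authorities(receipt: dict) -> list[str]:
    """Return the authority kinds a receipt grants (truthy flags only)."""
    flags = receipt.get("authority", {})
    return [kind for kind in AUTHORITY_KINDS if flags.get(kind)]


def is_source_evidence_only(receipt: dict) -> bool:
    """True when the receipt is admitted as source evidence and grants nothing else."""
    return bool(receipt.get("source_evidence")) and not granted_authorities(receipt)


# Example receipt for one of the donor sources named in this checkpoint.
example_receipt = {
    "source_id": "pe_aina_industry_taxonomy_decisions_v1",
    "source_evidence": True,
    "authority": {kind: False for kind in AUTHORITY_KINDS},
}
```

A test in this style would fail as soon as any authority flag on a donor receipt flips to true, which is the failure mode the negative tests above are meant to catch.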

Why This Matters

This reduces the loop Ali flagged: we should not repeatedly review one title at a time when prior repo work already contains useful validated concepts, prompts, workflows, ontologies, and status reports. This package gives the next slice a clean way to promote good prior work into engine-room contracts without also importing unverified bulk or known-bad lineage.

Parallel Report Reconciliation

Ali ran three parallel report lanes before this closeout. I checked them before landing this checkpoint:

| Parallel report | Path | How it affects this lane |
| --- | --- | --- |
| Source intelligence scaleout audit | `/srv/aina/worktrees/aina-data-engine-room-source-intelligence-scaleout/docs/handoff/2026-06-15-source-intelligence-scaleout-audit.md` | Confirms the 71-file donor package is useful as lineage/review support only. It explicitly identifies `alpha_feedback_bundle_v2` as a future import candidate, which this checkpoint now converts into a hashed support receipt without row/runtime/export/embedding authority. |
| Curriculum release audit | `/srv/aina/worktrees/aina-data-engine-room-curriculum-release-audit/docs/reports/2026-06-15-agent-curriculum-release-audit.md` | Confirms old curriculum packets, mastery gates, and rubrics are valuable shape evidence, but must remain repair-lineage because of synthetic learner context, empty task/workflow fields, supply-safety bypass evidence, and weak role-match examples. |
| Academy static export consumption report | `/srv/aina/worktrees/aina-academy-engine-room-export-consumption/ops/reports/2026-06-15-engine-room-static-export-consumption.md` | Confirms Academy should consume pinned, static, versioned export bundles only. No Cloudflare runtime should call VDS paths, DuckDB, Python, Gemini jobs, or live engine-room internals. |

The common decision across all three is consistent with the current engine-room posture: preserve and hash useful donor work, use it to reduce rework, but do not promote donor rows, raw JDs, old labels, or vector presence as production authority.
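
The "preserve and hash" posture can be illustrated with a short sketch. It assumes a donor example is a plain dictionary and that a support entry keeps only a content hash plus field names; the function name `hashed_support_entry` and the entry layout are hypothetical, not the engine room's actual receipt format.

```python
import hashlib
import json


def hashed_support_entry(source_id: str, example: dict) -> dict:
    """Reduce a donor example to a content hash; the raw values are not kept."""
    # Canonical JSON (sorted keys) so the same content always hashes the same way.
    canonical = json.dumps(example, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return {
        "source_id": source_id,
        "sha256": hashlib.sha256(canonical).hexdigest(),
        "field_names": sorted(example),  # field names only, never the values
    }
```

Because the entry carries no raw text, it can support deduplication and lineage checks without making donor rows, labels, or JD text available to export or runtime consumers.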

Exact Resume Commands

cd /srv/aina/aina-data-engine-room
git status --short --branch
uv run aina-data-engine --root /srv/aina/aina-data-engine-room prior-work-source-authority-promotion
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-promotion-package
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-derived-contracts
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-export-runtime-mapping
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-foundation-source-mapping
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-workflow-grounding-mapping
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-title-taxonomy-gate
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-title-taxonomy-audit
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-curriculum-release-lineage
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-source-intelligence-scaleout-lineage
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-alpha-feedback-support
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-review-packet-support
uv run aina-data-engine --root /srv/aina/aina-data-engine-room pe-donor-industry-taxonomy-support
uv run aina-data-engine --root /srv/aina/aina-data-engine-room e5-source-authority-reconciliation
uv run aina-data-engine --root /srv/aina/aina-data-engine-room prior-work-promotion-delta-closure
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-contracts
uv run aina-data-engine --root /srv/aina/aina-data-engine-room engine-room-export-manifest
uv run aina-data-engine --root /srv/aina/aina-data-engine-room source-authority-registry-v2
uv run aina-data-engine --root /srv/aina/aina-data-engine-room jd-aware-role-context-evidence
uv run pytest tests/test_production_runtime_contracts.py tests/test_engine_room_export_manifest.py tests/test_source_authority_registry_v2.py tests/test_jd_aware_role_context.py tests/test_pe_donor_export_runtime_mapping.py tests/test_pe_donor_foundation_source_mapping.py tests/test_pe_donor_workflow_grounding_mapping.py tests/test_pe_donor_title_taxonomy_gate.py tests/test_pe_donor_curriculum_release_lineage.py tests/test_pe_donor_source_intelligence_scaleout_lineage.py tests/test_pe_donor_alpha_feedback_support.py tests/test_pe_donor_review_packet_support.py tests/test_pe_donor_industry_taxonomy_support.py tests/test_e5_source_authority_reconciliation.py tests/test_prior_work_promotion_delta_closure.py tests/test_pe_donor_derived_contracts.py tests/test_pe_donor_promotion_package.py tests/test_prior_work_source_authority_promotion.py -q
uv run aina-data-engine --root /srv/aina/aina-data-engine-room artifact-exposure-scan
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-506-p0-gate
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-readiness
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-510-retrieval-promotion-gate
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
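
When resuming, it can help to run the commands in order and stop at the first failure. The following is a minimal sketch of such a wrapper, not part of the repository: `build_argv` and `run_in_order` are hypothetical helpers, and only a few of the subcommands listed above are included for brevity.

```python
import subprocess
from typing import Callable, Optional, Sequence

ROOT = "/srv/aina/aina-data-engine-room"

# A subset of the subcommands listed above, in the same order; extend as needed.
SUBCOMMANDS = [
    "prior-work-source-authority-promotion",
    "pe-donor-promotion-package",
    "pe-donor-derived-contracts",
    "validate",
]


def build_argv(subcommand: str, root: str = ROOT) -> list[str]:
    """Build the same invocation shape used in the resume commands."""
    return ["uv", "run", "aina-data-engine", "--root", root, subcommand]


def run_in_order(
    subcommands: Sequence[str],
    runner: Optional[Callable[[list[str]], int]] = None,
) -> Optional[str]:
    """Run each subcommand in order; return the first one that fails, else None."""
    if runner is None:
        # Default runner executes the real command from the repository root.
        runner = lambda argv: subprocess.run(argv, cwd=ROOT).returncode
    for name in subcommands:
        if runner(build_argv(name)) != 0:
            return name
    return None
```

Calling `run_in_order(SUBCOMMANDS)` returns the name of the first failing subcommand, which gives a clear point to resume from; the `runner` parameter exists so the loop can be exercised without invoking `uv`.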

Recommended Next Slice

The two most contract-shaped rows now have derived, mapping, runtime-contract, export-manifest, source-authority, and JD-aware lineage proof:

pe_aina_ship_loop_operator_contract
pe_aina_evidence_roadmap_internal_preview

The verified foundation-status source now has repair-reduction source-evidence proof:

pe_aina_verified_data_foundation_status

The workflow-grounding consensus source now has source-evidence proof and a hard boundary against bulk promotion:

pe_aina_workflow_grounding_consensus_batches

The title-taxonomy bucket source now has gate-pending source-evidence proof, freshness proof, and deterministic sample-audit proof, with a hard boundary against row/runtime/vector promotion:

pe_aina_title_taxonomy_bucket_outputs

The prompt/workflow/ontology bulk source now has source-family inventory proof, hashes, counts, parse-risk disclosure, and a hard boundary against row/runtime/vector/batch promotion:

pe_aina_prompt_workflow_ontology_bulk_candidates

The curriculum-release source now has packet/curriculum/report lineage proof, module counts, repair blockers, and a hard boundary against row/runtime/vector/batch promotion:

pe_aina_curriculum_engine_release_outputs

The source-intelligence scaleout source now has import-decision ledger proof, parallel-audit reconciliation, repair-first/import-candidate classification, legacy wording sanitization, and a hard boundary against row/runtime/vector/batch promotion:

pe_aina_source_intelligence_scaleout_outputs

The alpha-feedback bundle now has review-support proof, hashed examples, noisy-title blocking, raw-text exclusion, and a hard boundary against export/row/runtime/vector/batch promotion:

pe_aina_alpha_feedback_bundle_v2

The review-packet bundle now has packet-footprint proof, four preserved 500-row packet pairs, legacy status redaction, row serialization exclusion, and a hard boundary against export/row/runtime/vector/batch promotion:

pe_aina_review_packets_v1

The industry-taxonomy decisions now have aggregate repair-reduction proof, full-file provenance hash verification, four decision-group summaries, legacy status exclusion, row serialization exclusion, and a hard boundary against export/row/runtime/vector/batch promotion:

pe_aina_industry_taxonomy_decisions_v1

The E5/E6 reconciliation source now has anti-loop proof across current ledgers, export manifests, JD-aware context, chunk/vector receipts, prior-work promotion, and validation, with fresh title-level LLM review blocked for accounted assets:

e5_e6_source_authority_accounted_for

The prior-work promotion delta closure source now has anti-loop proof that the old generic promote-candidate backlog is closed into named gates/specific lanes:

donor_promote_candidate_delta_closure

Next, keep moving through the same evidence-first pattern:

commit this checkpoint locally
check e5_source_authority_reconciliation_v1 before any title-level LLM review or title-only repair
check prior_work_promotion_delta_closure_v1 before reopening broad donor salvage or promote-candidate searches
use pe_donor_alpha_feedback_support_v1 only for review-support or curriculum-critique examples, not runtime/export/embedding inputs
use pe_donor_review_packet_support_v1 only for deterministic repair, QA, and semantic comparison reference; do not treat old review-packet labels or rows as product authority
use pe_donor_industry_taxonomy_support_v1 only for deterministic company/industry taxonomy repair diffs; do not treat old donor industry labels as product truth
use the donor/source-authority registry to decide which remaining PE concepts should become typed engine-room contracts
repair/diff the prompt/workflow/ontology inventory families against current capability, workflow, evaluator, export, and runtime contracts before creating clean derived chunks
repair/diff the curriculum-release packet and curriculum shapes against current role-context, workflow, evaluator, proof-tail, and Academy export contracts before porting any content
keep prompt/workflow/ontology bulk, title bulk, and market/K2 lineage blocked until their own source-family gates pass; title taxonomy specifically needs a declared promotion threshold and replacement-diff application before row promotion
keep embeddings/runtime authority parked until AIN-510 and source-family gates explicitly pass
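
Two of the rules above lend themselves to simple guards: skip title-level review for assets the reconciliation receipt already accounts for, and keep authority parked unless every required gate has explicitly passed. The sketch below is illustrative only; the function names, the set of accounted asset IDs, and the `"pass"` status string are assumptions, not the actual shape of `e5_source_authority_reconciliation_v1` or the gate receipts.

```python
def title_review_allowed(asset_id: str, accounted_asset_ids: set[str]) -> bool:
    """Allow fresh title-level review only for assets not already accounted for."""
    return asset_id not in accounted_asset_ids


def authority_unparked(gate_status: dict[str, str], required_gates: list[str]) -> bool:
    """True only when every required gate is explicitly recorded as 'pass'.

    A missing gate counts as not passed, so authority stays parked by default.
    """
    return all(gate_status.get(gate) == "pass" for gate in required_gates)
```

Treating a missing gate as a failure matches the "explicitly pass" wording: authority is never unparked because a gate result is absent.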

Ali Mehdi Mukadam · co-authored with Codex · 2026-06-15

topics:
  - personalization-engine
  - source-authority
  - donor-promotion
subtopics:
  - personalization-engine-aina
  - engine-room-receipts
  - quarantine-before-runtime