Source Authority Registry v2
A source-family registry tied to the current combined corpus rather than to stale planning counts.
The Single Idea
Every current chunk family is assigned an authority class, a vector count, and a next action before any further embedding or runtime promotion.
322519 chunks covered
6510 vectors covered
25 families
Families
| Family | Authority | Chunks | Vectors | Next action |
|---|---|---|---|---|
| onet_task_evidence | source_evidence | 131095 | 0 | hold_for_source_authority_or_semantic_qa_before_embedding |
| serviceable_title | canonical | 60100 | 3440 | use_repaired_overlay_as_current_authority_then_continue_progressive_embedding |
| semantic_review | canonical | 54686 | 500 | use_repaired_overlay_as_current_authority_then_continue_progressive_embedding |
| jobs_research_responsibility | donor_clean | 43196 | 0 | run_source_family_eligibility_then_progressive_embedding |
| workflow_seed | donor_clean | 7277 | 0 | run_source_family_eligibility_then_progressive_embedding |
| jobs_research_role | donor_clean | 6656 | 0 | run_source_family_eligibility_then_progressive_embedding |
| affordance_pack | source_evidence | 6626 | 0 | hold_for_source_authority_or_semantic_qa_before_embedding |
| workflow_intelligence | source_evidence | 3152 | 0 | hold_for_source_authority_or_semantic_qa_before_embedding |
| workflow_ai_affordance | source_evidence | 3051 | 0 | hold_for_source_authority_or_semantic_qa_before_embedding |
| onet_occupation_evidence | source_evidence | 2828 | 0 | hold_for_source_authority_or_semantic_qa_before_embedding |
| top_worked_title | canonical | 1084 | 1000 | use_repaired_overlay_as_current_authority_then_continue_progressive_embedding |
| hf_role_signal | source_evidence | 907 | 826 | partially_vectorized_continue_after_quality_gate |
| iwa_evidence | source_evidence | 476 | 0 | hold_for_source_authority_or_semantic_qa_before_embedding |
| jd_aware_role_context | canonical | 292 | 292 | vectorized_current_snapshot |
| jobs_research_workflow | donor_clean | 267 | 33 | use_repaired_overlay_as_current_authority_then_continue_progressive_embedding |
| realism_corpus | source_evidence | 230 | 0 | hold_for_source_authority_or_semantic_qa_before_embedding |
| gdpval_task | source_evidence | 220 | 220 | vectorized_current_snapshot |
| jobs_research_tool | donor_clean | 146 | 73 | partially_vectorized_continue_after_quality_gate |
| jobs_research_ai_affordance | donor_clean | 89 | 0 | hold_for_source_authority_or_semantic_qa_before_embedding |
| ai_fluency_headless_loop | canonical | 48 | 48 | vectorized_current_snapshot |
| alipe_vision_doc | advisory_lineage | 32 | 32 | vectorized_current_snapshot |
| named_tool_authority | canonical | 20 | 20 | vectorized_current_snapshot |
| workflow_tool_evidence | source_evidence | 20 | 10 | partially_vectorized_continue_after_quality_gate |
| harvest_source_map | canonical | 16 | 16 | vectorized_current_snapshot |
| qualitative_corpus | source_evidence | 5 | 0 | hold_for_source_authority_or_semantic_qa_before_embedding |
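
The headline counts can be cross-checked against the per-family rows. The following is a minimal sketch, not the registry's actual tooling: the `Family` dataclass and the `reconcile` helper are hypothetical names, and the rule that a family never has more vectors than chunks is an assumption (at most one vector per chunk), not something the registry states. The row values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    """One registry row (hypothetical structure, for illustration only)."""
    name: str
    authority: str
    chunks: int
    vectors: int


# (family, authority, chunks, vectors), copied from the Families table.
ROWS = [
    ("onet_task_evidence", "source_evidence", 131095, 0),
    ("serviceable_title", "canonical", 60100, 3440),
    ("semantic_review", "canonical", 54686, 500),
    ("jobs_research_responsibility", "donor_clean", 43196, 0),
    ("workflow_seed", "donor_clean", 7277, 0),
    ("jobs_research_role", "donor_clean", 6656, 0),
    ("affordance_pack", "source_evidence", 6626, 0),
    ("workflow_intelligence", "source_evidence", 3152, 0),
    ("workflow_ai_affordance", "source_evidence", 3051, 0),
    ("onet_occupation_evidence", "source_evidence", 2828, 0),
    ("top_worked_title", "canonical", 1084, 1000),
    ("hf_role_signal", "source_evidence", 907, 826),
    ("iwa_evidence", "source_evidence", 476, 0),
    ("jd_aware_role_context", "canonical", 292, 292),
    ("jobs_research_workflow", "donor_clean", 267, 33),
    ("realism_corpus", "source_evidence", 230, 0),
    ("gdpval_task", "source_evidence", 220, 220),
    ("jobs_research_tool", "donor_clean", 146, 73),
    ("jobs_research_ai_affordance", "donor_clean", 89, 0),
    ("ai_fluency_headless_loop", "canonical", 48, 48),
    ("alipe_vision_doc", "advisory_lineage", 32, 32),
    ("named_tool_authority", "canonical", 20, 20),
    ("workflow_tool_evidence", "source_evidence", 20, 10),
    ("harvest_source_map", "canonical", 16, 16),
    ("qualitative_corpus", "source_evidence", 5, 0),
]

FAMILIES = [Family(*row) for row in ROWS]


def reconcile(families, expected_chunks, expected_vectors, expected_families):
    """Return a list of problems; an empty list means the totals reconcile."""
    problems = []
    if len(families) != expected_families:
        problems.append(f"family count {len(families)} != {expected_families}")
    total_chunks = sum(f.chunks for f in families)
    if total_chunks != expected_chunks:
        problems.append(f"chunk total {total_chunks} != {expected_chunks}")
    total_vectors = sum(f.vectors for f in families)
    if total_vectors != expected_vectors:
        problems.append(f"vector total {total_vectors} != {expected_vectors}")
    # Assumption: a family cannot have more vectors than chunks.
    for f in families:
        if f.vectors > f.chunks:
            problems.append(f"{f.name}: vectors exceed chunks")
    return problems


problems = reconcile(FAMILIES, 322519, 6510, 25)
print(problems or "totals reconcile")
```

With the figures above, the per-family rows sum to the headline counts, so the script prints `totals reconcile`.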
Checks
| Check | Status |
|---|---|
| all_chunk_families_classified | PASS |
| chunk_vector_reconciliation_valid | PASS |
| donor_repos_read_only | PASS |
| family_chunk_counts_match_reconciliation | PASS |
| family_vector_counts_match_reconciliation | PASS |
| labels_are_metadata_not_truth | PASS |
| no_live_gemini_api_invoked | PASS |
| raw_market_rows_not_embedding_authority | PASS |
| source_assets_carried_from_v1 | PASS |
| source_authority_registry_v1_valid | PASS |
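
A downstream step could treat the check table as a gate before any further embedding. The sketch below assumes the report is available as Markdown text; `parse_checks` is a hypothetical helper, not part of the registry, and treating any status other than `PASS` as a failure is also an assumption.

```python
CHECKS_TABLE = """\
| Check | Status |
|---|---|
| all_chunk_families_classified | PASS |
| chunk_vector_reconciliation_valid | PASS |
| donor_repos_read_only | PASS |
| family_chunk_counts_match_reconciliation | PASS |
| family_vector_counts_match_reconciliation | PASS |
| labels_are_metadata_not_truth | PASS |
| no_live_gemini_api_invoked | PASS |
| raw_market_rows_not_embedding_authority | PASS |
| source_assets_carried_from_v1 | PASS |
| source_authority_registry_v1_valid | PASS |
"""


def parse_checks(table: str) -> dict[str, str]:
    """Parse a two-column pipe table into {check_name: status}."""
    checks = {}
    for line in table.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 2:
            continue  # not a two-column table row
        name, status = cells
        # Skip the header row and the |---|---| separator row.
        if name == "Check" or set(name) <= {"-"}:
            continue
        checks[name] = status
    return checks


checks = parse_checks(CHECKS_TABLE)
# Assumption: anything other than PASS blocks the gate.
failed = [name for name, status in checks.items() if status != "PASS"]
gate_open = not failed
print("gate open" if gate_open else f"blocked by: {failed}")
```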