AINA Data Engine Room · Handoff · 2026-06-11

Runtime Evaluator Fixtures Handoff

Local evaluator cases for packet hardening, caveat service, source reruns, and hold mining.

Ali Mehdi Mukadam · co-authored with Codex · branch ali/personalization-engine-mission-2026-06-09

The Single Idea

The engine room now has local evaluator fixtures for the current 1,000-title runtime payload sample. This converts packet-hardening and caveat-service payloads into concrete test cases a tutor, learner agent, and evaluator can exercise locally, while keeping all real-user, external-write, and production claims blocked.

01

What Changed

Added src/aina_data_engine/runtime_evaluator_fixtures.py, wired it up as the aina-data-engine runtime-evaluator-fixtures command, extended the tests, and improved the sidecar function resolver in runtime_payloads.py.

Healthcare, finance/lending, frontline operations, field service, construction, housekeeping, material handling, and retail sales roles now get more realistic local workflow prompts.

02

Live Artifacts

| Artifact | Rows | SHA-256 |
| --- | --- | --- |
| /srv/aina/aina-data-engine-room/artifacts/validation/runtime_evaluator_fixtures_v1.jsonl | 1000 | 8fecf3b95084519ecbd022fede387980c4e680e7198362acb3d530ec9ee812be |
| /srv/aina/aina-data-engine-room/artifacts/validation/runtime_evaluator_fixtures_v1_packet_quality_fixtures.jsonl | 295 | 7432a551b3b7505535539d9e0c5d37e1a50ca93e3e2f59845c3a4d82d591bab4 |
| /srv/aina/aina-data-engine-room/artifacts/validation/runtime_evaluator_fixtures_v1_caveat_evaluator_fixtures.jsonl | 670 | 380df385e460a9fac1abf9052fe2728bc92a1b3f4e6bcc1c7bd7306108029415 |
| /srv/aina/aina-data-engine-room/artifacts/validation/runtime_evaluator_fixtures_v1_source_ref_rerun_fixtures.jsonl | 1 | 41397d96fd19aa64996ca82b0c648301828d71522de8ef3b7e2b86ea628821a0 |
| /srv/aina/aina-data-engine-room/artifacts/validation/runtime_evaluator_fixtures_v1_hold_mining_fixtures.jsonl | 34 | 4cb6bfbde6d6a023f72338044f20af399c1b2df03bc89a151828d202f2b44412 |
| /srv/aina/aina-data-engine-room/artifacts/validation/runtime_evaluator_fixtures_v1_semantic_anomalies.jsonl | 462 | 3f28997c8e18f3892473094aef6806110f0cd6a9f333e491550fb08a03917575 |

The runtime payload artifacts were regenerated too; runtime_payloads_v1.jsonl now has SHA-256 856dde3aa8130906eeb29ca57ac4b2034e448ade7475951adaf46e9a13d504d9.

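
The row counts and digests listed for the six fixture artifacts can be rechecked on the host before anyone builds on them. The following is a minimal sketch, not part of the repository: it assumes the rows column counts non-empty JSONL lines, and the names EXPECTED, sha256_and_rows, and verify are hypothetical helpers introduced for illustration. The regenerated runtime_payloads_v1.jsonl is left out because its directory is not stated here.

```python
import hashlib
from pathlib import Path

# Directory and expected values copied from the artifact list in this handoff.
ARTIFACT_DIR = Path("/srv/aina/aina-data-engine-room/artifacts/validation")
EXPECTED = {
    "runtime_evaluator_fixtures_v1.jsonl":
        (1000, "8fecf3b95084519ecbd022fede387980c4e680e7198362acb3d530ec9ee812be"),
    "runtime_evaluator_fixtures_v1_packet_quality_fixtures.jsonl":
        (295, "7432a551b3b7505535539d9e0c5d37e1a50ca93e3e2f59845c3a4d82d591bab4"),
    "runtime_evaluator_fixtures_v1_caveat_evaluator_fixtures.jsonl":
        (670, "380df385e460a9fac1abf9052fe2728bc92a1b3f4e6bcc1c7bd7306108029415"),
    "runtime_evaluator_fixtures_v1_source_ref_rerun_fixtures.jsonl":
        (1, "41397d96fd19aa64996ca82b0c648301828d71522de8ef3b7e2b86ea628821a0"),
    "runtime_evaluator_fixtures_v1_hold_mining_fixtures.jsonl":
        (34, "4cb6bfbde6d6a023f72338044f20af399c1b2df03bc89a151828d202f2b44412"),
    "runtime_evaluator_fixtures_v1_semantic_anomalies.jsonl":
        (462, "3f28997c8e18f3892473094aef6806110f0cd6a9f333e491550fb08a03917575"),
}


def sha256_and_rows(path: Path) -> tuple[str, int]:
    """Return (hex digest of the raw bytes, number of non-empty lines)."""
    digest = hashlib.sha256()
    rows = 0
    with path.open("rb") as handle:
        for line in handle:
            digest.update(line)
            if line.strip():  # assumption: blank lines are not rows
                rows += 1
    return digest.hexdigest(), rows


def verify(artifact_dir: Path = ARTIFACT_DIR) -> list[str]:
    """Return human-readable mismatches; an empty list means everything matches."""
    problems = []
    for name, (want_rows, want_sha) in EXPECTED.items():
        path = artifact_dir / name
        if not path.is_file():
            problems.append(f"{name}: file not found")
            continue
        got_sha, got_rows = sha256_and_rows(path)
        if got_rows != want_rows:
            problems.append(f"{name}: rows {got_rows} != {want_rows}")
        if got_sha != want_sha:
            problems.append(f"{name}: SHA-256 mismatch")
    return problems
```

Calling verify() on the engine-room host and printing the returned list would show any artifact that drifted since this handoff was written.
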
03

Live Result

| Fixture lane | Count |
| --- | --- |
| Packet quality fixtures | 295 |
| Caveat evaluator fixtures | 670 |
| Source-ref rerun fixtures | 1 |
| Hold-mining fixtures | 34 |
| Locally serviceable evaluator fixtures | 965 |

| Runtime function | Serviceable count |
| --- | --- |
| General business | 261 |
| Sales | 139 |
| Operations | 111 |
| Administration | 78 |
| Finance | 73 |
| Customer success | 68 |
| Data analytics | 53 |
| Healthcare | 48 |
| Legal/compliance | 38 |
| Marketing | 32 |
| People/HR | 22 |
| Design/creative | 19 |
| Product | 9 |
| Strategy consulting | 8 |
| Education | 6 |

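
The counts above are internally consistent, and the arithmetic is short enough to write down. This sketch assumes that "locally serviceable" means the packet-quality and caveat-evaluator lanes combined; the handoff does not say so explicitly, but the numbers agree with that reading. The dictionary keys are illustrative labels, not identifiers from the codebase.

```python
# Lane counts as reported in this handoff.
LANE_COUNTS = {
    "packet_quality": 295,
    "caveat_evaluator": 670,
    "source_ref_rerun": 1,
    "hold_mining": 34,
}

# Assumption: only these two lanes count as locally serviceable.
SERVICEABLE_LANES = ("packet_quality", "caveat_evaluator")

# Serviceable counts per runtime function as reported in this handoff.
FUNCTION_COUNTS = {
    "General business": 261,
    "Sales": 139,
    "Operations": 111,
    "Administration": 78,
    "Finance": 73,
    "Customer success": 68,
    "Data analytics": 53,
    "Healthcare": 48,
    "Legal/compliance": 38,
    "Marketing": 32,
    "People/HR": 22,
    "Design/creative": 19,
    "Product": 9,
    "Strategy consulting": 8,
    "Education": 6,
}

sample_total = sum(LANE_COUNTS.values())                                  # 1000
serviceable_from_lanes = sum(LANE_COUNTS[k] for k in SERVICEABLE_LANES)   # 965
serviceable_from_functions = sum(FUNCTION_COUNTS.values())                # 965

assert sample_total == 1000
assert serviceable_from_lanes == serviceable_from_functions == 965
```

The four lanes account for the full 1,000-title sample, and the per-function breakdown sums to the same 965 serviceable fixtures as the two serviceable lanes.
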
04

Anomaly Queue

| Flag | Count |
| --- | --- |
| Serviceable general-business context still broad | 261 |
| Function changed for runtime | 171 |
| Missing source refs | 35 |
| Not source backed | 35 |
| Non-runtime fixture blocked | 35 |
| Hold not runtime allowed | 34 |
| Source-ref rerun required | 1 |
| Unknown source function defaulted | 1 |

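
The flag counts sum to 573, which is more than the 462 rows listed for the semantic anomalies artifact, so some rows presumably carry several flags. A tally like the one above could be rebuilt from the JSONL with a counter. This is a sketch under assumptions: the field name semantic_flags and the flag strings in the usage example are hypothetical, and the real schema may differ.

```python
import json
from collections import Counter
from typing import Iterable


def count_flags(lines: Iterable[str], field: str = "semantic_flags") -> Counter:
    """Tally every flag across JSONL lines; one row may contribute several flags.

    Assumption: each row stores its flags as a list of strings under `field`.
    Rows without the field contribute nothing.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        counts.update(row.get(field) or [])
    return counts
```

For example, passing an open file handle for the anomalies artifact to count_flags and printing counts.most_common() would list flags from most to least frequent, which is the ordering used in the queue above.
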
05

Semantic Spot Check

I inspected 50 actual rows, checking each row's fixture lane, display title, source function, runtime function, local runtime flag, expected action, semantic flags, and artifact under test. The table lists ten of them.

| Title | Lane | Runtime function | Result |
| --- | --- | --- | --- |
| Seasonal Sales Associate | Packet quality | sales | Source-backed local packet-quality fixture. |
| Support Associate - Soma | Packet quality | customer_success | Source-backed support/customer fixture. |
| Customer Service Assistant | Packet quality | customer_success | Corrected from administration for runtime use. |
| Salesperson | Caveat evaluator | sales | Corrected from general business. |
| Business Analyst | Packet quality | data_analytics | Corrected from sales to analysis/reporting. |
| Patient Care Technician | Caveat evaluator | healthcare | Healthcare-specific, caveat-visible fixture. |
| Phlebotomist | Caveat evaluator | healthcare | Healthcare-specific, caveat-visible fixture. |
| Mortgage Loan Officer | Caveat evaluator | finance | Finance/lending-specific fixture. |
| Housekeeper | Caveat evaluator | operations | Operations/process fixture. |
| Family Law Attorney | Hold mining | legal_compliance | Not runtime-allowed; needs source mining. |

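
A spot check of this kind is easier to repeat if the sample is seeded. The sketch below draws a reproducible sample from JSONL lines and keeps only the inspected columns. It is an illustration, not the procedure actually used for this check: the key names in SPOT_CHECK_FIELDS are hypothetical guesses that mirror the column names listed above.

```python
import json
import random
from typing import Iterable

# Hypothetical key names mirroring the inspected columns; real keys may differ.
SPOT_CHECK_FIELDS = (
    "fixture_lane",
    "display_title",
    "source_function",
    "runtime_function",
    "local_runtime",
    "expected_action",
    "semantic_flags",
    "artifact_under_test",
)


def sample_rows(lines: Iterable[str], k: int = 50, seed: int = 0) -> list[dict]:
    """Return up to k randomly chosen rows, reduced to the spot-check fields.

    The same seed and input always yield the same sample, so a reviewer
    can re-inspect exactly the rows a previous reviewer saw.
    """
    rows = [json.loads(line) for line in lines if line.strip()]
    rng = random.Random(seed)
    picked = rng.sample(rows, min(k, len(rows)))
    return [{field: row.get(field) for field in SPOT_CHECK_FIELDS} for row in picked]
```

Missing keys come back as None rather than raising, which makes schema mismatches visible in the sampled output instead of stopping the check.
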
06

Validation

All fixture validation checks pass: payload validity, count preservation, lane split, evaluator assertions, caveat requirements, non-runtime blocking, removal of the human reviewer gate, and blocked production claims.

cd /srv/aina/aina-data-engine-room
.venv/bin/python -m ruff check src tests
.venv/bin/python -m pytest -q
All checks passed.
198 passed in 192.55s.
07

Resume Commands

Regenerate the evaluator fixtures only:

cd /srv/aina/aina-data-engine-room
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room runtime-evaluator-fixtures

Rerun the full chain:

cd /srv/aina/aina-data-engine-room
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room harvest-source-map
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room source-import-recipes
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room semantic-harvest-gate --sample-limit 1000
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room semantic-repair-queue
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room deterministic-semantic-repairs
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room semantic-patch-replay
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room runtime-intake
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room runtime-payloads
.venv/bin/aina-data-engine --root /srv/aina/aina-data-engine-room runtime-evaluator-fixtures

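
Because each stage feeds the next, it may help to run the chain from a small driver that stops at the first failing step. The sketch below uses only the subcommands and the --sample-limit flag that appear in the commands above. The helper names build_commands and run_chain are hypothetical, and the stop-on-first-failure behaviour is a design choice of this sketch, not something the CLI is known to provide.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

ROOT = "/srv/aina/aina-data-engine-room"

# Subcommands in the order given in this handoff.
CHAIN = [
    ["harvest-source-map"],
    ["source-import-recipes"],
    ["semantic-harvest-gate", "--sample-limit", "1000"],
    ["semantic-repair-queue"],
    ["deterministic-semantic-repairs"],
    ["semantic-patch-replay"],
    ["runtime-intake"],
    ["runtime-payloads"],
    ["runtime-evaluator-fixtures"],
]


def build_commands(root: str = ROOT) -> list[list[str]]:
    """Expand each chain step into a full argv list for the CLI under root."""
    cli = str(Path(root) / ".venv" / "bin" / "aina-data-engine")
    return [[cli, "--root", root, *step] for step in CHAIN]


def run_chain(
    commands: Sequence[Sequence[str]],
    runner: Callable = subprocess.run,
    cwd: str = ROOT,
) -> int:
    """Run commands in order and stop at the first non-zero exit code.

    Returns the number of steps that completed successfully, so a return
    value equal to len(commands) means the whole chain ran.
    """
    for done, command in enumerate(commands):
        result = runner(list(command), cwd=cwd)
        if result.returncode != 0:
            return done
    return len(commands)
```

The runner parameter exists so the ordering logic can be exercised with a stub before pointing it at the real CLI; run_chain(build_commands()) would execute the chain on the host.
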
08

Recommended Next Slices

  1. Turn the 965 local evaluator fixtures into deterministic answer/eval runs.
  2. Mine or specialize the 261 serviceable-but-broad general-business rows.
  3. Recover source references for the 35 missing-source-ref rows and rerun the chain.
  4. Expand the sample beyond 1,000 rows using the same path.
  5. Build a compact founder dashboard from the summary JSON files.