AINA production alignment · data engine, academy, platform · 2026-06-15

AINA Data Engine And Academy Alignment Board

A read-first map so future agents advance the same product spine instead of rediscovering old repos.

Ali Mehdi Mukadam · co-authored with Codex · 2026-06-15

The Single Idea

AINA's production path is aina-data-engine-room -> versioned platform-safe exports -> aina-academy -> live learner experience. aina-platform remains the live/front-door and design-pattern donor until Ali decides the convergence path. aina-core is useful, but it is a read-only donor/consolidation snapshot rather than a new center of gravity.

Old Drift Risk

Each repo becomes its own truth: titles get cleaned in isolation, embeddings run against noisy labels, and platform integration keeps waiting for one more data pass.

Aligned Path

The data engine produces trusted exports; Academy consumes them through explicit contracts; Platform remains the live/design donor until the cutover decision.

01

Current Repo Roles

Repo | Role now | What belongs there | What must not happen
/srv/aina/aina-data-engine-room | Current data/build authority | Source authority, cleaned corpora, title/role/context spine, AI Fluency Capability Map contracts, embeddings, exact-cosine retrieval proof, local runtime contracts, release-boundary receipts | Do not make public runtime or real-user claims from local proof alone
/srv/repos/aina-academy | Future product/runtime platform | Cloudflare Worker, D1/Drizzle schema, Zod contracts, learner loop, Practice Arena, tutor/evaluator, recommendations, payments test mode, admin, staging | Do not query VDS-local DuckDB/Python live; consume versioned exports only
/srv/repos/aina-platform | Current live surface and donor | Existing live site/front door, design language, auth/payment/runtime patterns, production UX/copy references | Do not fork product truth away from Academy without a convergence decision
/srv/repos/aina-core | Reference/archive/experiment donor | Stitch ledger, title expansion evidence, serving probes, evidence enrichment patterns, verified missing assets | Do not build new canonical work here by default
Donor repos and archives | Read-only quarries | Cleaned titles, jobs-research, evidence atlas, HF signals, ALIPE vision/context, Fusion task shape | Do not mutate donors, embed raw junk, or treat labels as truth
02

Product Mission

The product mission is learner/role -> AI Fluency Capability Map -> simulation/practice -> evaluation -> proof -> next curriculum move.

From AI anxiety to AI fluency: assessment, simulation, personalized curriculum, and proof.

Title coverage, embeddings, BLS/O*NET/HF data, jobs-research assets, JD-aware role context, tool maps, evaluator rubrics, and source authority are all inputs. The product object is a learner-facing AI Fluency path that can be trusted, explained, evaluated, and improved.

03

Operating Goal

The active goal is reconciliation-gated: preserve the current state, fix stale metadata, finish the Fusion/core import ledger, define the export manifest, prove Academy-safe consumption, and only then resume clean data expansion.

Autonomously preserve and reconcile the AINA Data Engine Room production spine before any new data expansion. First, create verified backup/tag proof for current head 11bf3c0, refresh the private GitHub backup if needed, fix stale board/metadata, and mark embeddings parked. Then complete the Fusion/core import-decision ledger, manually port only accepted diffs with gates green, consolidate local main into one branch truth, define engine_room_export_manifest_v1, and prove that one platform-safe top-500/top-1,000 export can be consumed by aina-academy without live VDS/DuckDB/Python coupling. Throughout, preserve source authority, AI Fluency capability-map boundaries, and exact-cosine retrieval proof, and keep public runtime, real-user data, external writes, production telemetry, runtime embedding authority, donor mutation, and deletion blocked until explicit release receipts exist.
04

Current Execution Truth

Surface | State
Active branch | codex/aina-prod-readiness-2026-06-14
Execution base head | 11bf3c0 ("Add data engine academy alignment board"), the head before this board-refresh checkpoint commit.
Main status | Current branch is 26 commits ahead of local main; main is not yet the final truth.
Remote | No git remote is configured in this checkout.
Pre-edit preservation | Archive tag archive/2026-06-15/prod-spine-board-refresh-11bf3c0; verified bundle /srv/aina/checkpoints/aina-data-engine-room/2026-06-15-production-spine-board-refresh/aina-data-engine-room-11bf3c0-prod-spine-board-refresh.bundle; SHA256 df9677cbf79ecb68730168a085c029b4c209c3f578124cea44216d2b0759f35b.
Fusion branches | 24 unmerged branch labels require import-or-decline decisions before local main consolidation.
Embeddings | Parked until reconciliation and engine_room_export_manifest_v1 identify product-consumed source families.
Release boundary | Public runtime, real-user data, external writes, production telemetry, runtime embedding authority, donor mutation, and deletion remain blocked.
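The SHA256 recorded in the pre-edit preservation row can be re-checked before anyone relies on the bundle. The sketch below is a minimal illustration rather than the repo's own tooling: `sha256_of` and `verify_checksum` are hypothetical helper names, and the 1 MiB read size is an arbitrary choice. Running `git bundle verify` on the same file is the complementary git-level check.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the lowercase hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large bundles never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Return True only when the file digest equals the recorded receipt value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

To use it, pass the bundle path and the SHA256 value from the preservation row to `verify_checksum`; a `False` result means the archive should not be treated as preservation proof.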
05

Source Truth Rules

Rule | Meaning
Current receipts win | Current repo receipts beat old handoffs and donor labels.
Data engine owns source intelligence | aina-data-engine-room is the data authority unless a newer founder decision says otherwise.
Academy owns learner runtime | Academy owns learner state, auth, D1, lessons, practice, evaluator routing, payments, and UI.
Core is not center of gravity | aina-core is reference/archive/experiment material by default.
Labels are metadata | Good text with doubtful labels can be preserved only with label authority downgraded.
Release receipts unlock production | Public runtime, real-user data, external writes, telemetry, donor deletion, and runtime embedding authority require explicit receipts.
Embeddings wait for export proof | Live Gemini work resumes only after reconciliation/export gates name clean, product-consumed source families.
06

Milestones, Slices, And Tasks

M0 | Durability and board preflight: live truth, archive tag, verified bundle, board/HTML refresh, embedding pause.
M1 | Fusion/core reconciliation: import-decision ledger, read-only review lanes, accepted manual ports, main consolidation.
M2 | Source authority and export contract: refresh registry, define engine_room_export_manifest_v1, platform-safe payloads.
M3 | Academy consumer proof: top-500 export, top-1,000 extension, local Academy import, no live VDS coupling.
M4 | AI Fluency spine: five-layer capability map, onboarding seed boundary, role/task/tool/risk joins.
M5 | Clean embeddings and retrieval: product-consumed families only, eligibility/repair/QA, progressive Gemini ladder.
M6 | Runtime and Academy readiness: 25-50 real-row fixtures, exact-cosine retrieval, Academy staging bridge.
M7/M8 | Production boundary and estate convergence: release receipts, donor retirement, platform decision support.

The first implementation move is M0/M1: preservation proof and Fusion/core import-decision ledger. The next product move is M2/M3: define engine_room_export_manifest_v1 and prove Academy can consume a pinned top-500/top-1,000 export without live coupling.
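Because engine_room_export_manifest_v1 is still to be defined, the following is only a sketch of one shape it could take, assuming the export is a directory of JSON files pinned by version and SHA-256. The field names (`schema`, `export_version`, `files`, `bytes`, `sha256`) and both function names are hypothetical, not an agreed contract.

```python
import hashlib
from pathlib import Path

# The schema name comes from the board; every field below is an assumption.
MANIFEST_SCHEMA = "engine_room_export_manifest_v1"


def build_manifest(export_dir: Path, export_version: str) -> dict:
    """Describe every JSON file in export_dir with its size and SHA-256."""
    files = []
    for path in sorted(export_dir.glob("*.json")):
        payload = path.read_bytes()
        files.append({
            "name": path.name,
            "bytes": len(payload),
            "sha256": hashlib.sha256(payload).hexdigest(),
        })
    return {"schema": MANIFEST_SCHEMA, "export_version": export_version, "files": files}


def verify_export(export_dir: Path, manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the export matches the manifest."""
    problems = []
    if manifest.get("schema") != MANIFEST_SCHEMA:
        problems.append(f"unexpected schema: {manifest.get('schema')!r}")
    for entry in manifest.get("files", []):
        path = export_dir / entry["name"]
        if not path.is_file():
            problems.append(f"missing file: {entry['name']}")
        elif hashlib.sha256(path.read_bytes()).hexdigest() != entry["sha256"]:
            problems.append(f"checksum mismatch: {entry['name']}")
    return problems
```

In this sketch the consumer side would refuse to import whenever `verify_export` returns anything, which keeps Academy pinned to one export version and independent of the live engine room.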

07

Embedding Policy In The New Map

Gemini embeddings remain important, but they are parked until M1-M3 prove what Academy actually consumes. After that, use gemini-embedding-2 at 768 dimensions through paid Vertex ADC on project aina-495702. Exact cosine stays source-of-truth retrieval until VSS/RuVector acceleration has parity proof.
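To make "exact cosine as source of truth" concrete, here is a minimal brute-force sketch in plain Python. It assumes vectors are lists of floats of equal dimension. The function names are hypothetical, and `recall_at_k` is only one assumed way to express a parity check for an accelerated index; the board does not define what the parity proof must contain.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Exact cosine similarity; returns 0.0 when either vector has zero norm."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_top_k(query: list[float], corpus: dict[str, list[float]], k: int = 5):
    """Score every row, then sort by similarity descending with id as tie-break."""
    scored = [(row_id, cosine(query, vec)) for row_id, vec in corpus.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


def recall_at_k(exact_ids: list[str], candidate_ids: list[str]) -> float:
    """Fraction of the exact top-k ids that a candidate index also returned."""
    if not exact_ids:
        return 1.0
    return len(set(exact_ids) & set(candidate_ids)) / len(exact_ids)
```

The full scan is linear in corpus size, which is the price of having a reference result that any faster index can be compared against.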

Allowed | Blocked
Clean, source-authoritative semantic chunks; repaired text with doubtful labels metadata-only; progressive foreground embedding; batch after proof | Raw market/posting dumps, raw learner artifacts, bad labels in embedding text, malformed CSV rows, quarantined rows, batch with unresolved repair queues
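The allowed/blocked split above can be read as an eligibility gate applied before any text reaches the embedding model. The sketch below assumes each chunk carries boolean flags for these conditions; the `Chunk` fields and both function names are hypothetical. Prefixing a trusted label to the embedded text is also an assumption, made only to contrast with the metadata-only treatment of doubtful labels.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str
    text: str
    label: str = ""
    source_authoritative: bool = False
    label_trusted: bool = False
    quarantined: bool = False
    malformed: bool = False
    repair_pending: bool = False


def embedding_payload(chunk: Chunk):
    """Return (text_to_embed, metadata) for an eligible chunk, or None if blocked."""
    if chunk.quarantined or chunk.malformed or chunk.repair_pending:
        return None
    if not chunk.source_authoritative or not chunk.text.strip():
        return None
    metadata = {
        "chunk_id": chunk.chunk_id,
        "label": chunk.label,
        "label_trusted": chunk.label_trusted,
    }
    # A doubtful label travels as metadata only and never enters the embedded text.
    if chunk.label_trusted and chunk.label:
        return f"{chunk.label}\n{chunk.text}", metadata
    return chunk.text, metadata


def batch_allowed(chunks: list[Chunk]) -> bool:
    """Batch embedding stays blocked while any row still sits in a repair queue."""
    return not any(chunk.repair_pending for chunk in chunks)
```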
08

No-Write Zones And Release Boundaries

Donor repos, raw source files, real learner data, secrets, env files, credentials, billing settings, and production Cloudflare/telemetry/payment writes remain no-write zones unless explicitly scoped. Data engine builds, embedding receipts, retrieval proof, headless runtime fixtures, and Academy import dry runs remain local-only by default.
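One way an agent could enforce these zones is a path guard consulted before every write. This is a sketch under stated assumptions: `NO_WRITE_ROOTS` lists only aina-core as an illustrative read-only root, the `.env` name check stands in for the wider "secrets, env files, credentials" rule, and `is_write_allowed` is a hypothetical helper rather than existing tooling.

```python
from pathlib import PurePosixPath

# Illustrative only: the real list of no-write roots would come from the board's scope.
NO_WRITE_ROOTS = (PurePosixPath("/srv/repos/aina-core"),)


def is_write_allowed(target: str, explicitly_scoped: frozenset = frozenset()) -> bool:
    """Deny writes under no-write roots or to env files unless the exact path was scoped."""
    path = PurePosixPath(target)
    if not path.is_absolute() or ".." in path.parts:
        return False  # refuse paths that cannot be checked reliably
    if target in explicitly_scoped:
        return True  # an explicit scope decision overrides the default denial
    if path.name == ".env" or path.name.startswith(".env."):
        return False
    return not any(root == path or root in path.parents for root in NO_WRITE_ROOTS)
```

Denying by default and requiring an exact scoped path mirrors the "unless explicitly scoped" wording; a guard like this complements review and does not replace filesystem permissions.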

09

Validation Stack

cd /srv/aina/aina-data-engine-room
git status --short --branch
uv run aina-data-engine --root /srv/aina/aina-data-engine-room source-authority-start-here
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-506-p0-gate
uv run aina-data-engine --root /srv/aina/aina-data-engine-room ain-510-retrieval-promotion-gate
uv run aina-data-engine --root /srv/aina/aina-data-engine-room production-runtime-readiness
uv run aina-data-engine --root /srv/aina/aina-data-engine-room platform-live-boundary
uv run aina-data-engine --root /srv/aina/aina-data-engine-room validate
cd /srv/repos/aina-academy
pnpm typecheck
pnpm test
bash ops/smoke/core-loop.sh
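The command list above can also be driven as one ordered gate run that stops at the first failure. The sketch below wraps the same commands with `subprocess.run`; `run_gates`, `engine_gate`, and `VALIDATION_STACK` are hypothetical names, and stopping on the first non-zero exit code is an assumed policy rather than documented behavior of these gates.

```python
import subprocess
from typing import Sequence

ENGINE_ROOT = "/srv/aina/aina-data-engine-room"
ACADEMY_ROOT = "/srv/repos/aina-academy"


def engine_gate(subcommand: str):
    """Build the (cwd, argv) pair for one aina-data-engine subcommand."""
    return (ENGINE_ROOT, ["uv", "run", "aina-data-engine", "--root", ENGINE_ROOT, subcommand])


# Same commands, same order as the validation stack listed above.
VALIDATION_STACK = [
    (ENGINE_ROOT, ["git", "status", "--short", "--branch"]),
    engine_gate("source-authority-start-here"),
    engine_gate("ain-506-p0-gate"),
    engine_gate("ain-510-retrieval-promotion-gate"),
    engine_gate("production-runtime-readiness"),
    engine_gate("platform-live-boundary"),
    engine_gate("validate"),
    (ACADEMY_ROOT, ["pnpm", "typecheck"]),
    (ACADEMY_ROOT, ["pnpm", "test"]),
    (ACADEMY_ROOT, ["bash", "ops/smoke/core-loop.sh"]),
]


def run_gates(gates: Sequence) -> list:
    """Run each (cwd, argv) gate in order; stop after the first non-zero exit code."""
    results = []
    for cwd, argv in gates:
        completed = subprocess.run(list(argv), cwd=cwd)
        results.append((" ".join(argv), completed.returncode))
        if completed.returncode != 0:
            break
    return results
```

Calling `run_gates(VALIDATION_STACK)` returns the commands that actually ran with their exit codes, so a short list ending in a non-zero code shows where the stack stopped.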
10

Done Means For This Mission

The next cold agent can pick up without guessing. The repo roles are recorded, milestones are recorded, older boards point to this alignment, the active goal matches the reconciliation-gated goal, and next execution starts with Fusion/core reconciliation plus the Academy export contract instead of isolated embedding or title cleanup.

11

Next Best Action

Where to start

Finish the Fusion/core import-decision ledger, manually port accepted diffs with gates green, consolidate local main, define engine_room_export_manifest_v1, build a top-500/top-1,000 platform-safe export, and prove Academy can consume it locally.
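The ordering in that first step (decide every branch, port accepted diffs with gates green, then consolidate) can be expressed as a small ledger check. This is a sketch only: the `LedgerEntry` fields, the `Decision` values, and the rule that every decision needs a recorded reason are assumptions, not the ledger format the repo uses.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PENDING = "pending"
    IMPORT = "import"
    DECLINE = "decline"


@dataclass
class LedgerEntry:
    branch: str
    decision: Decision = Decision.PENDING
    reason: str = ""
    gates_green: bool = False  # only meaningful for IMPORT entries after the manual port


def consolidation_blockers(ledger: list) -> list:
    """List what still blocks consolidating local main; empty means the ledger is closed."""
    blockers = []
    for entry in ledger:
        if entry.decision is Decision.PENDING:
            blockers.append(f"{entry.branch}: no import-or-decline decision")
        elif not entry.reason.strip():
            blockers.append(f"{entry.branch}: decision recorded without a reason")
        elif entry.decision is Decision.IMPORT and not entry.gates_green:
            blockers.append(f"{entry.branch}: accepted port has not passed gates")
    return blockers
```

Under these assumptions, main consolidation would proceed only when `consolidation_blockers` returns an empty list for all unmerged branch labels.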