Codex Roadmap

Code-grounded roadmap for turning the Reaktor Workbench prototype into a real control plane.

Use when: Read first when deciding what needs to become real and in what order.
Source: Codex
Route: /docs/codex-roadmap

1. Summary

Reaktor already has more real infrastructure than the prototype suggests. The graph runtime, port model, object DB, service abstraction, auth kernel, Cloudflare bridge, Compose desktop workbench, graph document export, and BestBuds app graph are present. The missing work is a production control plane that connects those pieces to the web prototype's screens and replaces every static scenario with live, typed state.

Keep

The existing Reaktor runtime model should remain the center: Graph, Node, RouteNode, ports, lifecycle, DI, and Feature slots. The desktop engine and server graph endpoint are the first real workbench spine.

Replace

Replace targets/reaktorWeb/src/mock-services.js, static scenario*.jsx catalogs, fake DB/auth/agent data, and no-op buttons with a WorkbenchService contract and safe command queue.

Converge

Unify the React web prototype, Compose desktop workbench, and Spring/Cloudflare service layers around one serializable WorkbenchManifest and WorkbenchEvent stream.

Definition of reality: a visible workbench feature is real only when it reads from live or recorded production-shaped data, every command is persisted and auditable, every mutating action has authorization and rollback semantics, and tests prove crash-free behavior for the screen and the underlying service contract.

2. Evidence From The Current Code

| Area | Status | Observed source | What it means |
| --- | --- | --- | --- |
| Graph runtime | Real | ../reaktor/reaktor-graph, ../reaktor/reaktor-graph-port | Lifecycle, navigation, auto-wiring, typed providers/consumers, and port events exist. The graph screen should consume this instead of hand-authored entity arrays. |
| Flow editor | Real | ../reaktor/reaktor-flow, modules/engine/src/main/kotlin/ai/bestbuds/reaktor/ReaktorWorkbench.kt | Compose desktop already builds a visual graph from a live Graph. Web should share the same manifest/model, not create a second graph language. |
| Graph document export | Partial | modules/engine/src/main/kotlin/ai/bestbuds/reaktor/ReaktorGraphDocument.kt, targets/reaktorServer/.../ServerApplication.kt | /apps/{appId}/graph exists and exports nodes/edges. It needs file ownership, stable IDs, tests, deploy targets, telemetry links, auth/service/DB metadata, and versioning. |
| BestBuds dogfood graph | Real | modules/app/src/commonMain/kotlin/ai/bestbuds/app/BestBuds.kt | The app has chat, campaign, discover, events, friends, profile, dev, onboarding, repositories, and interactors. It is a valid first production target for the workbench. |
| Auth kernel and adapters | Partial | ../reaktor/reaktor-auth | Auth models, staged login state, JWT verification, server interactor, RBAC schema, PAT contracts, and web Apple/Google providers exist. Android Apple and complete session/refresh/audit/runtime enforcement still need work. |
| Database layer | Partial | ../reaktor/reaktor-db, targets/reaktorServer/.../ServerApplication.kt | SQLite object store and GraphDb/Memgraph are real. DB console functionality is currently static and needs adapters for Postgres, D1, Durable Objects, R2, local SQLite, and Memgraph with redaction and RBAC. |
| Cloudflare bridge | Partial | ../reaktor/reaktor-cloudflare, modules/app/src/jsMain | Typed bindings and service-to-Hono bridge exist. Deploy/run/test screens still need a deployment control plane and environment-aware service registry. |
| Telemetry | Partial | ../reaktor/reaktor-telemetry, modules/app/src/commonMain/.../analytics | Graph instrumentation and analytics sinks exist. The insights UI needs ingestion, storage, query APIs, span correlation, and release comparison. |
| Agents and command queue | Facade | targets/reaktorWeb/src/screen-agent.jsx, scenario-v3.jsx | Codex, Claude, Gemini, commits, and pending commands are static. Need a persistent queue, command validators, codegen workers, git boundaries, and audit. |
| Prototype actions | Facade | targets/reaktorWeb/src/mock-services.js | Hot reload, deploy, rollback, tests, IDE open, apply/revert commands, and agent actions return mocked logs. This file must disappear from production builds. |

2A. Deeper Code-Level Analysis

This section is grounded in the actual Reaktor and BestBuds source. The main conclusion is stronger than the first scan: the prototype is a facade, but the repos already contain enough real runtime surfaces to make the facade honest without inventing a parallel architecture.

Runtime substrate that should be reused

| Layer | Actual source | Observed behavior | Workbench implication |
| --- | --- | --- | --- |
| Graph orchestration | ../reaktor/reaktor-graph/src/commonMain/.../core/Graph.kt, Node.kt, RouteNode.kt, ContainerNode.kt | Graph delegates lifecycle, DI, concurrency, navigation, and port containment. dispatch handles cross-graph navigation through containers. autoWire() connects consumers to local providers or DI fallback. | The graph tab should not infer routes from static JSON. It should render real Graph, ContainerNode, RouteNode, attached nodes, back stack roots, lifecycle state, and auto-wire failures. |
| Port and edge model | ../reaktor/reaktor-graph-port/src/commonMain/.../port, .../edge/Edge.kt, .../graph/connect.kt | Provider and consumer ports are typed by (Type, Key). Edges reject already-connected consumers, emit connect/disconnect events, and carry a deterministic qualifier. | Data edges in the workbench should show exact provider/consumer type, key, owner node, connection state, and errors from requireFullyWired(), not only a display label. |
| Typed services | ../reaktor/reaktor-graph/src/commonMain/.../service/Service.kt, RequestHandler.kt, ServiceInterceptor.kt | Service owns typed handlers, client invocation, request/response serializers, route patterns, environment propagation, and interceptors. | Deploy, testing, auth, DB, and API tabs should be derived from handler catalogs and transport metadata. Service calls should be traceable through a common interceptor rather than manually instrumented per screen. |
| Cloudflare runtime bridge | ../reaktor/reaktor-cloudflare/src/jsMain/.../Router.kt, Bindings.kt, CloudflareApi.kt | Service.toHono() maps typed handlers onto Hono routes and injects CloudflareContext. Bindings cover D1, R2, KV, Durable Objects, service bindings, secrets, vector, AI, and Hyperdrive. | The deployment and database tabs can introspect real worker bindings and routes. Production work should replace ad hoc worker UI data with a CloudflareTargetCatalog. |
| Object and graph persistence | ../reaktor/reaktor-db/src/commonMain/.../ObjectDatabase.kt, core/ObjectStore.kt, graph/GraphDbPolicy.kt | Object stores emit DB events and maintain per-key flows/cache. Graph DB policies enforce mandatory tenant parameterization and tenant filters in Cypher. | The DB tab should surface stores, keys, flows, cache policy, graph query policy, and tenant safety checks. Writes should become migration or command records. |
| Auth substrate | ../reaktor/reaktor-auth/src/commonMain/.../AuthAdapter.kt, kernel/AuthKernel.kt, jvmMain/.../LoginInteractor.kt, api/AuthServer.kt | The client adapter has explicit login states, provider registry, server sign-in, and session caching. The kernel models principals, tenants, scopes, permissions, roles, token claims, and authorization decisions. The JVM server verifies provider JWTs and mints access/PAT tokens. | The auth tab should be backed by provider health, login-state telemetry, RBAC catalog, token/session lifecycle, PAT operations, and authorization decision traces. Static role matrices are not enough. |
| Telemetry | ../reaktor/reaktor-telemetry/src/commonMain/.../GraphTelemetry.kt, bestbuds/modules/app/src/commonMain/.../analytics | Graph telemetry observes lifecycle, back stack, screen views, and existing port listeners. BestBuds has analytics event contracts and an analytics worker endpoint. | Insights should start by correlating graph node IDs, route changes, service calls, DB events, and BestBuds analytics events into one entity timeline. |
| Desktop engine | bestbuds/modules/engine/src/main/kotlin/ai/bestbuds/reaktor/ReaktorGraphDocument.kt, ReaktorComposeSemanticsInspector.kt, ReaktorRuntime.kt | The engine already initializes desktop/test/server features, exports a graph document, and captures Compose semantics/source/bounds every 350 ms while enabled. | This is the first real workbench host. The web prototype should consume engine/server contracts and the desktop workbench should consume the same manifest version. |

BestBuds is a concrete dogfood app, not a toy fixture

| Product area | Actual source | Current reality | Roadmap consequence |
| --- | --- | --- | --- |
| Root app graph | bestbuds/modules/app/src/commonMain/kotlin/ai/bestbuds/app/BestBuds.kt | App id 38d1cee3-004d-441f-8f66-ad5060859af1. Root nodes include config, messages, stickers, user, chat, social, and user interactors. Home contains Chat, Campaign, Discover, Events, Friends, Profile, and Dev graphs. | The manifest builder should use this as its baseline fixture and acceptance target. Every graph tab feature should work against this real app id. |
| Routes and screens | BestBuds.kt plus modules/app/src/commonMain/.../ui | Routes include /, /onboarding, /home, /chats, /chats/{id}, profile/friend/group routes, campaigns, events, friends, profile edit/create, and dev. | The Run App tab should support route dispatch and route-param editing against these actual route bindings. The Testing tab should map specs to these flows. |
| Chat orchestration | data/interactors/ChatInteractor.kt, data/repositories/MessageRepository.kt | Chat has cache-first open, background revalidation, WebSocket receive/send, optimistic messages, retry/discard, reaction queuing, typing, presence, pagination, and group-member hydration. | DevTools should be able to trace one chat send through UI state, interactor, cache, socket, backend echo, reaction updates, and telemetry. Gemini scouting can target TODOs around signed socket auth, heartbeat, multi-device presence, and privacy. |
| User/session orchestration | data/repositories/UserRepository.kt | User login goes through Feature.Auth, records pending provider/status in KV, handles timeout/failure, persists the user, hydrates stickers/profile, and supports logout. There are explicit TODOs around flow-based user state. | Auth and DevTools should show current login stage, pending provider, cached user, hydration work, and failure state. Auth refresh/session tests should be part of production readiness. |
| Social and messaging services | data/social/SocialService.kt, data/services/MessagingService.kt, data/analytics/AnalyticsService.kt | Typed handlers exist for friends, groups, current/discover campaigns, profile, onboarding questions/responses/complete, top messages, chat messages, send, bot send, reactions, and analytics ingest. | The API catalog should be generated from typed service classes. Mock endpoint lists in the prototype should be replaced by real handler metadata plus environment URLs. |
| Cloudflare workers | modules/app/src/jsMain/kotlin/ai/bestbuds/app/cloudflare | Messaging and social workers mount typed services through bestBudsWorker. Worker auth is explicitly disabled until JWT verification is wired. Bot agents are registered in AgentRegistry.kt. | Deploy view should know the worker service, bindings, auth posture, endpoint list, and security blockers. AI tab can start with the real Workers AI bot registry instead of invented model cards. |

Current graph document export is useful but too thin

ReaktorGraphDocument.kt currently exports app id/label, visible nodes, containment/navigation/data edges, display labels, graph names, routes attached to nodes, and provider/consumer keys. It intentionally hides RouteNode from visible nodes and normalizes IDs from graph labels plus route or node labels.

| Missing from export | Why it matters | Where to get it |
| --- | --- | --- |
| Stable source-aware IDs | IDs based on labels and routes can shift when labels change. Commands, tests, telemetry, and commits need stable identity. | Combine graph/node runtime id, class FQCN, route pattern, source file, and optional KSP metadata. |
| Route nodes and route bindings | Navigation debugging needs route payload type, path params, attached nodes, source route, target route, and container activation. | RouteNode.pattern, routeBinding, navigationTargets(), ContainerNode.graphs, back stack entries. |
| Port type metadata | Provider/consumer keys alone cannot prove type safety or explain auto-wire misses. | ProviderPort.key.type, ConsumerPort.key.type, qualifiers, edge source/destination, PortEvent. |
| Services and worker endpoints | Deploy, testing, auth, API docs, and DevTools all need real handlers. | Service.handlers, RequestHandler.method, routePattern, worker wrappers under modules/app/src/jsMain. |
| Stores, cache policy, and DB bindings | The DB tab cannot be production-ready without knowing which nodes read/write which stores and backing resources. | RepositoryNode store names, ObjectDatabase.events, Cloudflare bindings, Postgres/D1/R2 declarations, GraphDb policies. |
| Auth requirements | Users need to know whether a route/service/action is unauthenticated, user-scoped, admin-scoped, PAT-scoped, or currently unsafe. | AuthKernel, AuthServer, LoginInteractor, worker auth config, service middleware/interceptors. |
| Tests and deploy targets | Command queue, affected-tests view, and deploy gates need source-to-test and source-to-deploy mapping. | tests/ folders, Gradle tasks, npm scripts, Wrangler/Partykit targets, Playwright/Maestro specs. |
| Known incompleteness markers | The workbench should expose incomplete production behavior rather than hide it behind polished UI. | TODO/FIXME scanners over BestBuds/Reaktor, disabled KSP processor, disabled worker auth, stub platform adapters, mocked web services. |
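
The "Stable source-aware IDs" row can be illustrated with a minimal sketch. It assumes an ID is derived by hashing structural fields only, so that renaming a display label leaves the ID unchanged. `NodeIdentity`, `stableNodeId`, the choice of SHA-256, and the 16-character truncation are all hypothetical and not taken from the Reaktor source.

```kotlin
import java.security.MessageDigest

// Hypothetical input type: the identity fields the table suggests combining.
data class NodeIdentity(
    val graphPath: String,      // assumed format, e.g. "home/chat"
    val classFqcn: String,      // fully qualified class name of the node
    val routePattern: String?,  // null for nodes that have no route
    val sourceFile: String      // repo-relative source path
)

// Hashes structural fields only; display labels are deliberately excluded.
// SHA-256 and the 8-byte (16 hex chars) truncation are assumptions.
fun stableNodeId(identity: NodeIdentity): String {
    val canonical = listOf(
        identity.graphPath,
        identity.classFqcn,
        identity.routePattern ?: "",
        identity.sourceFile
    ).joinToString(separator = "\u0000") // NUL separator avoids ambiguous joins
    val digest = MessageDigest.getInstance("SHA-256")
        .digest(canonical.toByteArray(Charsets.UTF_8))
    return digest.take(8).joinToString("") { "%02x".format(it) }
}
```

Moving a source file would still change the ID under this scheme, which is one reason the table also lists optional KSP metadata as an input.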

Prototype-to-real replacement map

| Prototype file/surface | Current behavior | Replacement |
| --- | --- | --- |
| targets/reaktorWeb/src/scenario.jsx, scenario-v3.jsx, window.ENTITIES, window.BUNDLE | Static graph, commands, auth, DB, deploy, tests, agents, and metrics. | GET /workbench/apps/{appId}/manifest plus recorded fixtures generated from the same endpoint for offline tests. |
| mock-services.js | Every action sleeps and returns a fake log, including deploy, rollback, test run, command apply/revert, IDE open, and agent calls. | WorkbenchServiceClient with typed calls for command preview/apply/revert, test run, deploy dry-run/promote, source open, agent job, and runtime session control. |
| workbench-domain.jsx | Domain helpers read global browser variables and local React state. | Move to a typed client-side store fed by manifest snapshots and event streams. Keep UI-only helpers, but remove production reads from global scenario data. |
| screen-agent.jsx | Codex, Claude, Gemini, command queue, commits, and pending work are scenario objects. | Persistent command queue, agent job table, git commit mapping, read-only Gemini scout jobs, and queue items linked to commands/tests/commits. |
| screen-database.jsx | Static resource list and sample rows. | Read-only adapters for ObjectStore, Postgres/Supabase, D1, R2, Durable Objects, and Memgraph, with redaction; writes go through commands only. |
| screen-testing.jsx | Static suites and proxy rules. | Test inventory scanner plus runners for Playwright, Maestro, Gradle, and service replay. Results link to manifest entities and command gates. |
| screen-deploy.jsx | Mock release activity over static topology. | Deploy catalog from npm scripts, Wrangler/Partykit configs, server deploy scripts, k3s scripts, secrets check, dry-run plan, approval gates, and rollback records. |

Production blockers found in code

| Blocker | Evidence | Required fix before calling it production |
| --- | --- | --- |
| Worker auth is not wired | bestbuds/modules/app/src/jsMain/.../core/BestBudsWorker.kt rejects enabled auth with "BestBuds worker auth is not wired yet." | Add JWT verification middleware using reaktor-auth token verification, wire service requirements, and expose auth posture in the Deploy/Auth tabs. |
| User identity is passed as a serialized header in product services | SocialService and MessagingService use userHeaders(user); workers read currentUserOrNull. | Replace trust-on-header behavior with signed access tokens and typed auth requirements. Keep impersonation only in explicit dev/test mode. |
| Chat socket identity is raw query data | ChatInteractor.connectWebSocket passes userId and userName via query and comments that signed token auth is needed. | Issue short-lived room tokens, verify them in the PartyKit/chat worker, and add tests for rejected/expired tokens. |
| Session lifecycle is incomplete | LoginInteractor mints access tokens and creates a random refresh token, but complete refresh persistence/revoke/audit flow is not unified in the runtime. | Persist refresh sessions, add revoke/rotate endpoints, expose active sessions in Auth, and add regression tests for refresh/revoke. |
| Compiler metadata path is disabled | ../reaktor/reaktor-compiler/src/.../ReaktorProcessor.kt returns immediately. | Either restore KSP metadata generation or explicitly replace it with Gradle/source scanning for source file, ownership, command, and test mapping. |
| Secrets/config leakage risk | targets/reaktorServer/.../ServerApplication.kt contains hardcoded database configuration, and deploy scripts include sensitive command-line tokens. | Move credentials to secret stores, add a scanner gate, and prevent Deploy tab from exposing secret values in logs or manifests. |
| Platform auth parity is uneven | Android Apple login is a TODO; web providers exist but still need production configuration and parity tests. | Make provider support visible per platform and enforce parity tests before a provider is marked healthy. |
| Several BestBuds user flows are visibly incomplete | Profile edit/save, campaign join/details, event actions, image picker, and call actions still have TODOs or partial behavior. | The workbench should mark these as incomplete product capabilities and generate command/test plans rather than displaying them as finished. |
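
The short-lived room token fix for the chat socket can be sketched as follows. This is only an illustration of the issue/verify shape on the JVM using an HMAC-SHA256 signature; the roadmap points at reaktor-auth token verification for the real implementation, and the PartyKit/chat worker would need its own runtime's crypto. The function names, the `userId|roomId|expiry` payload layout, and the shared-secret approach are assumptions, and the layout assumes user and room ids contain no `|`.

```kotlin
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Signs a payload with HMAC-SHA256 and returns a URL-safe Base64 signature.
fun signPayload(payload: String, secret: ByteArray): String {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(secret, "HmacSHA256"))
    val raw = mac.doFinal(payload.toByteArray(Charsets.UTF_8))
    return Base64.getUrlEncoder().withoutPadding().encodeToString(raw)
}

// Hypothetical token layout: base64url("userId|roomId|expiry") + "." + signature.
fun issueRoomToken(userId: String, roomId: String, expiresAtEpochSec: Long, secret: ByteArray): String {
    val payload = "$userId|$roomId|$expiresAtEpochSec"
    val encoded = Base64.getUrlEncoder().withoutPadding()
        .encodeToString(payload.toByteArray(Charsets.UTF_8))
    return "$encoded.${signPayload(payload, secret)}"
}

// Returns the user id when the token is well-formed, correctly signed,
// bound to this room, and not yet expired; otherwise returns null.
fun verifyRoomToken(token: String, roomId: String, nowEpochSec: Long, secret: ByteArray): String? {
    val parts = token.split(".")
    if (parts.size != 2) return null
    val payload = try {
        String(Base64.getUrlDecoder().decode(parts[0]), Charsets.UTF_8)
    } catch (e: IllegalArgumentException) {
        return null
    }
    // Constant-time comparison of the expected and presented signatures.
    val expected = signPayload(payload, secret).toByteArray(Charsets.UTF_8)
    if (!MessageDigest.isEqual(expected, parts[1].toByteArray(Charsets.UTF_8))) return null
    val fields = payload.split("|")
    if (fields.size != 3) return null
    val expiresAt = fields[2].toLongOrNull() ?: return null
    return if (fields[1] == roomId && nowEpochSec < expiresAt) fields[0] else null
}
```

The rejected/expired token tests the table asks for map directly onto the three null-returning paths: bad signature, wrong room, and an expiry in the past.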

Concrete first implementation delta

Recommended next code milestone: extend ReaktorGraphDocument.kt into a versioned WorkbenchManifest, expose it from targets/reaktorServer, generate a checked-in fixture from the live BestBuds graph, and migrate the Graph/Search/Drawer portions of targets/reaktorWeb away from scenario*.jsx. That gives the team a real spine before touching deploy, DB, auth, agents, or testing.

3. Target Architecture

The clean structure is a layered control plane. Screens are thin. Domain logic lives in typed services. Runtime adapters are explicit and replaceable. Mutations go through commands. Low-level escape hatches exist only behind capability checks.

Source Apps
BestBuds and future Reaktor apps expose graphs, routes, services, repositories, auth, deploy targets, and tests.
Manifest Builder
Runtime reflection plus Gradle/KSP metadata produces a stable WorkbenchManifest.
Workbench Service
Typed APIs provide graph, runtime, DB, auth, testing, deploy, telemetry, agents, and commands.
Command Queue
All mutations become validated sealed commands with preview, apply, revert, commit, and audit records.
Web/Desktop UI
React prototype and Compose desktop consume the same contract and render mode-specific views.

Core packages to introduce or promote

| Package/module | Responsibility | Why it belongs here |
| --- | --- | --- |
| reaktor-workbench-core | Shared serializable models: manifest, events, commands, queue records, adapters, RBAC scopes. | This is framework-level and should serve BestBuds, Manna, and any future Reaktor app. |
| reaktor-workbench-service | Service contract and server implementation for graph/runtime/DB/auth/testing/deploy/agent APIs. | Prevents every UI surface from inventing direct transport and filesystem access. |
| bestbuds/modules/engine | Dogfood host, desktop shell, app registration, live preview, graph document builder until promoted. | Already contains real workbench code and should be the proving ground before framework extraction. |
| targets/reaktorWeb | Production web shell for remote workbench access. | Keep the UI investment, but make it a client of WorkbenchService rather than a static scenario viewer. |
| targets/reaktorServer | Local/cloud control-plane backend, app graph endpoint, GraphQL, Memgraph, auth, deploy/test bridges. | Already exposes /apps/{appId}/graph; extend it into a real workbench API. |

Shape of the shared contract

    sealed interface WorkbenchCommand {
        val id: CommandId
        val appId: AppId
        val target: WorkbenchTarget
        val author: WorkbenchActor
        val reason: String
        val risk: CommandRisk
    }

    data class AddNodeCommand(...)
    data class ConnectPortsCommand(...)
    data class UpdateRouteCommand(...)
    data class UpdateAuthPolicyCommand(...)
    data class CreateMigrationCommand(...)
    data class RunTestSuiteCommand(...)
    data class DeployReleaseCommand(...)
    data class AgentScoutCommand(...)

    data class WorkbenchManifest(
        val app: WorkbenchApp,
        val graph: GraphManifest,
        val sources: SourceIndex,
        val services: ServiceCatalog,
        val stores: StoreCatalog,
        val auth: AuthCatalog,
        val tests: TestCatalog,
        val deploy: DeployCatalog,
        val telemetry: TelemetryCatalog
    )

4. Prototype Feature Map

Every row below corresponds to visible functionality in targets/reaktorWeb/src. The target is not "make buttons click"; the target is production behavior with mock data only where explicitly labeled as recorded fixtures or test mode.

| Prototype area | Current behavior | Real substrate | Work to make it real | Acceptance criteria |
| --- | --- | --- | --- | --- |
| Graph | Static entities and edges in scenario*.jsx. | Graph, PortCapability, ReaktorFlowGraph, ReaktorGraphDocument. | Serve graph manifests from live app graph. Add stable IDs, source paths, routes, ports, DI, stores, services, tests, deploy targets, telemetry nodes, and focus queries. | Graph view renders BestBuds from /apps/{appId}/graph; selecting a node opens real metadata; no production dependency on window.ENTITIES. |
| Run App | Visual fake devices and state controls. | Compose desktop preview in modules/engine, BestBuds GraphApplication, web/app targets. | Add runtime session manager, app preview launch, hot reload, route dispatch, device profiles, network modes, logs, state snapshots, and click-to-inspect bridge. | User can launch BestBuds in phone/desktop viewport, navigate actual graph routes, inspect UI tree, and see runtime errors/logs. |
| DevTools | Static blast radius, commands, and mini device. | Graph edges, lifecycle flows, port events, telemetry spans, source index. | Capture navigation transitions, port invocations, DB/service calls, payload previews, affected tests, source jump, and diff linkage. | Clicking a live action produces a trace showing route, node, port, service, cache, network, DB, and telemetry events. |
| Deploy | Mock deploy/rollback logs and static topology. | npm deploy scripts, Wrangler targets, k3s scripts, Spring server deploy, n8n workflow definitions. | Build deploy catalog, environment inventory, CI run adapter, migration planner, canary/rollback model, secret validation, and release health monitor. | Deploy button starts a real gated release job or opens a dry-run plan; production deploy requires RBAC and passing tests. |
| Insights | Static funnels, costs, retention, error cards. | reaktor-telemetry, BestBuds analytics queue/service, Firebase/OTel adapters. | Create telemetry ingestion store, query endpoints, release correlation, entity-to-metric links, cost adapters, and trace sampling controls. | Insights cards are backed by a named data source, time range, and query; graph nodes show current p95/error/funnel impact. |
| Agent | Static Codex/Claude/Gemini personas, fake chat, fake queue, fake commits. | Codex CLI/worktrees, git, tests, graph manifest, command model. | Build agent registry, conversation store, scout jobs, command proposal API, review gates, command-to-commit grouping, and worktree isolation. | Gemini scout can run a real read-only scan; queued findings produce commands; committed groups show actual git commits and executed command IDs. |
| Database | Static catalogs for Supabase, Memgraph, D1, Durable Objects, R2, SQLite. | Reaktor object store, Memgraph/Neo4j GraphDb, Exposed/Supabase, Cloudflare bindings. | Implement adapters, schema introspection, read-only query console, migrations, relationship explorer, explain plans, PII redaction, and write approvals. | Schema/data tabs query real or recorded data source; writes create migration commands, not direct DB edits. |
| Auth | Static provider, role, session, audit views; local state toggles. | reaktor-auth AuthKernel, RBAC schema, AuthService, LoginInteractor, adapters. | Expose provider health, role/permission CRUD via commands, session lookup/revoke, PAT flows, audit stream, provider parity tests, and RBAC middleware status. | Revoking a session calls a real endpoint; role matrix changes generate audited policy commands; provider status reflects actual config/test results. |
| Testing | Static suites plus calls to mocked runTests. | tests/ structure, Playwright specs, Maestro flows, Gradle tests, Keploy config. | Build test inventory scanner, runner API, artifact store, coverage-to-graph mapping, replay/proxy manager, device lab adapter, and flaky-test tracker. | Run buttons execute named suites; result rows link to artifacts, graph nodes, source files, and command gates. |
| AI | Static model registry, notebooks, gateway, evals, costs. | Cloudflare Workers AI wrapper, agent concepts, possible OpenAI/Gemini adapters. | Implement provider registry, model usage logs, prompt/eval store, gateway policies, safety checks, cost collection, and notebook execution sandbox. | AI tab displays real configured providers, model calls, eval runs, and cost estimates from logs. |
| Shell, search, drawer | Mostly local state over static data. | Workbench manifest, command queue, source index, app registry. | Centralize state orchestration, persisted preferences, RBAC-aware actions, deep links, keyboard shortcuts, source search, and drawer tabs backed by APIs. | Every sidebar/search/drawer result comes from the manifest or a service query and survives refresh. |

5. Implementation Phases

P0
Freeze the contract and remove ambiguity

Turn the prototype into a written/API contract before adding more surface area.

1 week
  • Define WorkbenchManifest, WorkbenchEvent, WorkbenchCommand, CommandQueueRecord, and WorkbenchCapability.
  • Tag prototype-only static data and block it from production builds with a lint/test check.
  • Create API fixtures from real BestBuds graph export so UI can migrate one endpoint at a time.
  • Define RBAC scopes for read graph, run app, query DB, modify auth, run tests, deploy, and run agents.
  • Document destructive action rules: no direct writes except through commands with preview and rollback metadata.
  • Add a production readiness checklist per screen and keep it next to Playwright specs.
P1
Make the graph and shell live

Replace static graph/search/sidebar/drawer data with the real app manifest.

2-3 weeks
  • Promote and enrich ReaktorGraphDocument.kt into a reusable manifest builder.
  • Extend /apps/{appId}/graph to /workbench/apps/{appId}/manifest with version, schema, source, and capability metadata.
  • Add stable IDs that do not depend on object identity alone; include route path, graph path, node type, and source fingerprint.
  • Add a web data layer: useWorkbenchManifest, useCommandQueue, useWorkbenchSearch.
  • Keep local fixtures only as recorded offline mode and Playwright fixtures.
  • Make graph focus, blast radius, affected tests, and sidebar groups derived selectors over manifest data.
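
As one illustration of a derived selector over manifest data, blast radius can be computed as a breadth-first traversal over the manifest's edges. This is a minimal sketch: `ManifestEdge` is a hypothetical, simplified edge shape, and the real manifest would carry edge kinds, ports, and source metadata.

```kotlin
// Hypothetical simplified edge; not the actual manifest model.
data class ManifestEdge(val from: String, val to: String)

// Returns every node reachable downstream of `start`, excluding `start` itself.
fun blastRadius(edges: List<ManifestEdge>, start: String): Set<String> {
    // Adjacency list: node id -> ids of the nodes it points at.
    val outgoing = edges.groupBy({ it.from }, { it.to })
    val visited = linkedSetOf<String>()
    val queue = ArrayDeque(listOf(start))
    while (queue.isNotEmpty()) {
        val current = queue.removeFirst()
        for (next in outgoing[current].orEmpty()) {
            // `visited.add` returns false for nodes seen before, so cycles terminate.
            if (next != start && visited.add(next)) queue.addLast(next)
        }
    }
    return visited
}
```

Because the function depends only on the edge list, the same selector can run against a live manifest or a recorded offline fixture.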
P2
Build the command queue before mutating anything

The workbench must never apply hidden edits. Commands are the safety boundary.

3-4 weeks
  • Implement sealed command classes and validators for graph, source, DB, auth, testing, deploy, and agent actions.
  • Persist queue records with status: drafted, proposed, reviewed, ready, running, applied, reverted, failed, committed.
  • Support dry-run, diff preview, conflict detection, dependency ordering, and rollback metadata.
  • Group executed commands by actual git commit and show pending commands separately in Agent view.
  • Map every UI mutating button to either a command proposal or an explicitly disabled state with reason.
  • Add command audit entries with actor, RBAC scopes, source IP/session, approval chain, and resulting artifacts.
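
The queue statuses listed above suggest a small state machine. The sketch below shows one way to reject illegal status changes. The status names come from the list; the transition table, `QueueRecord`, and `transitionTo` are assumptions, since the roadmap names the statuses but not the allowed moves.

```kotlin
enum class CommandStatus {
    DRAFTED, PROPOSED, REVIEWED, READY, RUNNING, APPLIED, REVERTED, FAILED, COMMITTED
}

// Assumed transition table; adjust once the real lifecycle is decided.
val allowedTransitions: Map<CommandStatus, Set<CommandStatus>> = mapOf(
    CommandStatus.DRAFTED to setOf(CommandStatus.PROPOSED),
    CommandStatus.PROPOSED to setOf(CommandStatus.REVIEWED, CommandStatus.DRAFTED),
    CommandStatus.REVIEWED to setOf(CommandStatus.READY, CommandStatus.DRAFTED),
    CommandStatus.READY to setOf(CommandStatus.RUNNING),
    CommandStatus.RUNNING to setOf(CommandStatus.APPLIED, CommandStatus.FAILED),
    CommandStatus.APPLIED to setOf(CommandStatus.COMMITTED, CommandStatus.REVERTED),
    CommandStatus.FAILED to setOf(CommandStatus.DRAFTED, CommandStatus.REVERTED),
    CommandStatus.REVERTED to emptySet(),
    CommandStatus.COMMITTED to emptySet()
)

// Hypothetical minimal record; a persisted record would also carry audit fields.
data class QueueRecord(val commandId: String, val status: CommandStatus)

// Returns a new record in the next status, or throws if the move is not allowed.
fun QueueRecord.transitionTo(next: CommandStatus): QueueRecord {
    val allowed = allowedTransitions[status].orEmpty()
    require(next in allowed) { "Illegal transition $status -> $next for $commandId" }
    return copy(status = next)
}
```

Keeping records immutable and returning a copy makes each transition a natural place to append an audit entry.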
P3
Make Run App and DevTools production-grade

Connect the visual workbench to a real running app session and trace stream.

4-6 weeks
  • Create WorkbenchRuntimeSession for desktop, web preview, Android simulator, and iOS simulator modes.
  • Expose route dispatch, back stack, lifecycle state, selected UI element, component tree, logs, and errors.
  • Instrument port invocations, service calls, repository cache hits, DB access, websocket events, and analytics emissions.
  • Bridge Compose semantics inspector data into the shared manifest/runtime event stream.
  • Implement hot reload as a real build/reload operation with failure state and logs.
  • Make DevTools timelines replayable and link each event to graph nodes and source paths.
P4
Make Database and Auth consoles real

These screens are sensitive; implement read-first, RBAC-first, command-first behavior.

4-5 weeks
  • Build DB adapters for local SQLite object stores, Supabase/Postgres, Memgraph, D1, Durable Objects, and R2 metadata.
  • Default all database views to read-only and redact secrets/PII by policy.
  • Route schema changes through migration commands with dry-run and rollback sections.
  • Expose auth providers, RBAC roles, permissions, user-role bindings, sessions, PATs, and audit logs from reaktor-auth.
  • Finish provider parity gaps: Android Apple, session refresh lifecycle, revoke flows, and service middleware enforcement status.
  • Move hard-coded credentials and platform secret TODOs into proper secret storage before production access.
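
Redaction by policy can be sketched as a pure function applied to each row before it reaches a console view. `RedactionPolicy`, `redactRow`, and the column-name matching rule are hypothetical; a real policy would likely be per-store and RBAC-aware.

```kotlin
// Hypothetical policy: lowercase column names whose values must be masked.
data class RedactionPolicy(
    val sensitiveColumns: Set<String>,
    val mask: String = "[REDACTED]"
)

// Masks non-null values in sensitive columns; nulls stay null so that
// "no value" remains distinguishable from "hidden value".
fun redactRow(row: Map<String, Any?>, policy: RedactionPolicy): Map<String, Any?> =
    row.mapValues { (column, value) ->
        if (value != null && column.lowercase() in policy.sensitiveColumns) policy.mask else value
    }
```

For example, with `sensitiveColumns = setOf("email")`, a row containing an `Email` column would have that value masked while its other columns pass through unchanged.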
P5
Make Testing, Deploy, and Insights operational

Workbench decisions should be gated by test evidence and production telemetry.

4-6 weeks
  • Scan tests/, Gradle tasks, Playwright, Maestro, Keploy, n8n, and deploy scripts into a TestCatalog.
  • Add a runner API that can execute suites, stream logs, collect artifacts, and map failures to graph nodes.
  • Build deploy adapters for Wrangler workers, app web, analytics/sticker services, Spring server, k3s, Android, and iOS distribution.
  • Implement environment inventory, release manifests, migrations, canary gates, rollback gates, and health checks.
  • Wire telemetry ingestion to graph entities, route spans, release versions, cost data, funnel events, and crash/error cards.
  • Make the Deploy button a real pipeline starter only after RBAC, command review, tests, and environment checks pass.
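
The last item implies a gate check that runs before any pipeline starts. A minimal sketch, assuming the four gates named above; `DeployRequest`, `unmetDeployGates`, and the scope string `workbench.deploy` are hypothetical names.

```kotlin
// Hypothetical snapshot of the evidence collected for one deploy attempt.
data class DeployRequest(
    val actorScopes: Set<String>,
    val commandReviewed: Boolean,
    val failedTests: Int,
    val environmentChecksPassed: Boolean
)

// Returns the unmet gates; an empty list means the pipeline may start.
fun unmetDeployGates(
    request: DeployRequest,
    requiredScope: String = "workbench.deploy" // assumed scope name
): List<String> = buildList {
    if (requiredScope !in request.actorScopes) add("missing RBAC scope $requiredScope")
    if (!request.commandReviewed) add("command not reviewed")
    if (request.failedTests > 0) add("${request.failedTests} failing tests")
    if (!request.environmentChecksPassed) add("environment checks failed")
}
```

Returning the full list of reasons, rather than a boolean, lets the UI show a disabled Deploy button together with why it is disabled.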
P6
Make agents and AI useful, bounded, and auditable

Gemini scouts broadly, Codex proposes/builds, Claude critiques/tests, but the queue remains the source of truth.

4-6 weeks
  • Create agent registry and conversation store with model, tool, permission, trigger, and scope metadata.
  • Implement Gemini scout jobs for dead code, harness gaps, coverage mapping, inefficiencies, dependency drift, and stale generated wrappers.
  • Make agents output structured findings and commands, never untracked source mutations.
  • Run codegen in isolated worktrees and attach diffs, tests, and commit metadata back to queue records.
  • Build AI provider registry, gateway logs, eval runs, safety checks, usage costs, and notebook sandbox for the AI tab.
  • Add audit rules for every agent action: inputs, model, prompt version, files read, files changed, tests run, reviewer, and commit.

6. BestBuds Dogfood Gaps The Workbench Should Expose

The first honest proof is BestBuds. The workbench should surface these as real graph findings, not hidden TODOs.

Gap | Current source signal | Workbench behavior to implement
Edit profile is not implemented | EditProfileScreen.kt throws TODO; ProfileScreen.kt has save TODO. | Graph should mark route/screen as incomplete, Testing should show missing flow, Agent/Gemini should propose bounded implementation commands.
Campaign actions are placeholders | CampaignScreen.kt and DiscoverScreen.kt have TODO click actions. | Blast radius should link campaign UI buttons to missing interactor/service endpoints and tests.
Private events actions and image picker are incomplete | PrivateEventsScreen.kt, CreateEventScreen.kt. | Testing catalog should show absent event creation/upload flows; command queue should generate media/event service work.
Call action is not implemented | ChatScreen.kt call TODO. | Graph should expose action node as incomplete and link to media/work/notification capabilities needed.
Auth provider parity is incomplete | AndroidAppleLogin.kt TODO; auth/session lifecycle partial. | Auth tab should show provider parity status by platform and Testing should gate releases on adapter contract tests.
Secrets and credentials need hardening | ServerApplication.kt contains direct Supabase connection data; reaktor-security platform secret boxes are TODO. | Deploy screen should fail production readiness until secrets are moved into approved secret stores and platform boxes are implemented.
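
One way to turn the source signals above into graph findings is a scout pass that scans source text for TODO markers. The sketch below is a rough illustration only: `SourceFile`, `GapFinding`, and `findTodoGaps` are hypothetical names, and a real scout would also need to map each file back to its route or screen node.

```typescript
// Hypothetical input: a source file path and its text, as a scout job might load it.
interface SourceFile {
  path: string;
  text: string;
}

// Hypothetical finding shape that a graph node could attach as an "incomplete" marker.
interface GapFinding {
  path: string;
  line: number; // 1-based line number of the marker
  snippet: string;
  status: "incomplete";
}

// Emits one finding per line that contains a standalone TODO token.
function findTodoGaps(files: SourceFile[]): GapFinding[] {
  const findings: GapFinding[] = [];
  for (const file of files) {
    file.text.split("\n").forEach((lineText, index) => {
      if (/\bTODO\b/.test(lineText)) {
        findings.push({
          path: file.path,
          line: index + 1,
          snippet: lineText.trim(),
          status: "incomplete",
        });
      }
    });
  }
  return findings;
}
```

Keeping the output as plain data means the Graph, Testing, and Agent tabs can all read the same findings rather than each re-deriving them.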

7. Testing Strategy

Contract tests

  • Manifest schema golden tests for BestBuds graph export.
  • Command validation tests for every command type and failure mode.
  • Auth provider parity tests across Android, Darwin, Web, Desktop, and server verification.
  • DB adapter tests with local fake drivers and redaction assertions.
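
The redaction assertions in the last bullet could look roughly like the following. This is a sketch under stated assumptions: `DbRow`, `redactRow`, `leakedColumns`, and the `[redacted]` placeholder are hypothetical, and the real adapter may redact at the query or driver level instead.

```typescript
// Hypothetical row shape returned by a fake local driver.
type DbRow = { [column: string]: unknown };

const REDACTED = "[redacted]";

// Returns a copy of the row with every sensitive column replaced by a placeholder.
function redactRow(row: DbRow, sensitiveColumns: string[]): DbRow {
  const out: DbRow = {};
  for (const column of Object.keys(row)) {
    out[column] = sensitiveColumns.indexOf(column) !== -1 ? REDACTED : row[column];
  }
  return out;
}

// Assertion helper: lists sensitive columns that are present but still carry a raw value.
function leakedColumns(row: DbRow, sensitiveColumns: string[]): string[] {
  return sensitiveColumns.filter((column) => column in row && row[column] !== REDACTED);
}
```

A contract test would run each adapter against the fake driver and assert that `leakedColumns` is empty for every row that reaches the console.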

UI and workflow tests

  • Keep Playwright for reaktorWeb tab coverage, keyboard flows, command queue, graph interactions, and crash-free behavior.
  • Use Maestro for mobile BestBuds app flows, and for web smoke tests where it provides useful black-box coverage.
  • Add screenshot/artifact checks for visual regressions and component overlap.
  • Map every UI test to graph nodes so blast radius can show affected coverage.
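
Mapping UI tests to graph nodes makes blast-radius coverage a simple lookup. The sketch below assumes a hypothetical `UiTestMapping` record per test; the node and test IDs in any real inventory would come from the manifest and the test catalog.

```typescript
// Hypothetical mapping from one UI test to the graph nodes it exercises.
interface UiTestMapping {
  testId: string;
  graphNodeIds: string[];
}

// Tests that touch at least one changed node and therefore should be re-run.
function affectedTests(mappings: UiTestMapping[], changedNodeIds: string[]): string[] {
  return mappings
    .filter((mapping) => mapping.graphNodeIds.some((id) => changedNodeIds.indexOf(id) !== -1))
    .map((mapping) => mapping.testId);
}

// Changed nodes that no test covers, which blast radius should flag as a gap.
function uncoveredNodes(mappings: UiTestMapping[], changedNodeIds: string[]): string[] {
  return changedNodeIds.filter(
    (id) => !mappings.some((mapping) => mapping.graphNodeIds.indexOf(id) !== -1)
  );
}
```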

Production-readiness gates

Gate | Required evidence
No static production data | Build/test fails if the production bundle imports mock-services.js or uses scenario-v3.jsx as primary data.
Every button has behavior | Playwright action inventory asserts that each enabled control triggers an API call, command, navigation, or test run, or produces an explicitly visible state change.
No unsafe writes | DB/auth/deploy mutations create queue records and require RBAC plus dry-run preview before apply.
Graph is source of truth | Graph/search/sidebar/drawer data comes from manifest selectors; fixtures are only used in offline or test mode.
Dogfood coverage | BestBuds graph, run preview, tests, deploy, auth, DB, and telemetry have at least one live integration path each.
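
The "No static production data" gate could be enforced by checking the list of modules in the production bundle. The sketch below assumes the build can emit that list as plain path strings; `forbiddenBundleModules` is a hypothetical helper, and treating any import of scenario-v3.jsx as a failure is a stricter proxy for "uses it as primary data".

```typescript
// File names come from the gate above; matching on the path suffix is an assumption.
const FORBIDDEN_MODULES: RegExp[] = [
  /(^|\/)mock-services\.js$/,
  /(^|\/)scenario-v3\.jsx$/,
];

// Returns every bundled module path that the gate forbids in production.
function forbiddenBundleModules(bundleModules: string[]): string[] {
  return bundleModules.filter((modulePath) =>
    FORBIDDEN_MODULES.some((pattern) => pattern.test(modulePath))
  );
}
```

A build step would fail whenever the returned list is non-empty and print the offending paths.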

8. Risks And Mitigations

Risk: the UI outpaces the substrate again

Mitigation: no new prototype-only tabs. Every new visual component must be backed by a domain selector, API fixture, or command contract before polish work.

Risk: graph manifest becomes another static snapshot

Mitigation: include manifest versioning, runtime session IDs, event streams, and tests that compare live graph construction against exported graph documents.
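
A test that compares live graph construction against an exported document can be reduced to a set difference over node IDs. This is a minimal sketch: `GraphSnapshot` and `diffGraphs` are hypothetical, and a fuller comparison would also cover edges, ports, and manifest versions.

```typescript
// Hypothetical snapshot: just the data needed to detect drift.
interface GraphSnapshot {
  manifestVersion: number;
  nodeIds: string[];
}

interface ManifestDrift {
  missingFromExport: string[]; // present in the live graph, absent from the export
  staleInExport: string[]; // present in the export, absent from the live graph
}

// Compares a live graph against an exported document by node ID.
function diffGraphs(live: GraphSnapshot, exported: GraphSnapshot): ManifestDrift {
  return {
    missingFromExport: live.nodeIds.filter((id) => exported.nodeIds.indexOf(id) === -1),
    staleInExport: exported.nodeIds.filter((id) => live.nodeIds.indexOf(id) === -1),
  };
}
```

The test passes only when both lists are empty, so a stale export fails the build rather than quietly becoming another static snapshot.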

Risk: command queue can damage source or production

Mitigation: apply commands only inside isolated worktrees or controlled deploy jobs, require dry-run diffs, record rollback metadata, and gate destructive operations by RBAC.
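
The gating part of this mitigation can be sketched as a pure review function that runs before any command is applied. The role names, `QueuedCommand` fields, and `reviewCommand` are assumptions for illustration; the sketch does not model worktree isolation or the deploy job itself.

```typescript
// Hypothetical roles; the real RBAC model may differ.
type WorkbenchRole = "viewer" | "operator" | "admin";

// Hypothetical queue record fields needed for the review.
interface QueuedCommand {
  id: string;
  destructive: boolean;
  dryRunDiff: string | null; // output of the dry run, if one has been produced
  rollbackRef: string | null; // pointer to rollback metadata, if recorded
}

interface ApplyDecision {
  allowed: boolean;
  reasons: string[]; // every reason the command was blocked
}

// Collects all blocking reasons so a reviewer sees the full list at once.
function reviewCommand(command: QueuedCommand, role: WorkbenchRole): ApplyDecision {
  const reasons: string[] = [];
  if (command.dryRunDiff === null) reasons.push("missing dry-run diff");
  if (command.rollbackRef === null) reasons.push("missing rollback metadata");
  if (role === "viewer") {
    reasons.push("role cannot apply commands");
  } else if (command.destructive && role !== "admin") {
    reasons.push("destructive command requires admin");
  }
  return { allowed: reasons.length === 0, reasons };
}
```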

Risk: auth/DB consoles expose sensitive data

Mitigation: read-only by default, redaction policies, scoped credentials, audit logs, session-aware access, and production write lockouts until secret storage is fixed.

Risk: agents create unreviewable complexity

Mitigation: agents output findings and typed commands. Human or policy gates approve execution. Git commits are linked to commands and tests.

Risk: two workbenches diverge

Mitigation: desktop Compose and web React must consume the same manifest, command queue, and workbench services. UI differences are allowed; domain differences are not.

Near-Term Execution Order

  1. Create WorkbenchManifest and update ReaktorGraphDocument.kt to produce the first version from BestBuds.
  2. Add /workbench/apps/{appId}/manifest and a small web fetch layer in targets/reaktorWeb.
  3. Migrate Graph, sidebar, search, and inspector summary from static scenario data to the manifest.
  4. Implement command queue storage and make all mock mutating actions create queue records instead.
  5. Wire the Testing tab to the existing tests/ inventory and run at least the Playwright suites through a real runner API.
  6. Wire Auth and Database tabs read-only to real catalogs, then add command-backed mutations.
  7. Make Gemini scout real as a read-only job that reports findings and proposes commands.
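
The small web fetch layer in step 2 might begin with a URL builder and a validating parser, kept separate from the network call so both are easy to test. This is a sketch only: the `WorkbenchManifestSummary` fields are hypothetical, since the manifest schema is still to be defined in step 1.

```typescript
// Hypothetical minimal view of the manifest; the real schema comes from step 1.
interface WorkbenchManifestSummary {
  appId: string;
  version: number;
  nodeIds: string[];
}

// Builds the route from step 2 for a given app.
function manifestUrl(appId: string): string {
  return "/workbench/apps/" + encodeURIComponent(appId) + "/manifest";
}

// Validates an untrusted JSON payload before the UI consumes it.
function parseManifest(payload: unknown): WorkbenchManifestSummary {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("manifest must be an object");
  }
  const { appId, version, nodeIds } = payload as Record<string, unknown>;
  if (typeof appId !== "string") throw new Error("manifest.appId must be a string");
  if (typeof version !== "number") throw new Error("manifest.version must be a number");
  if (!Array.isArray(nodeIds) || !nodeIds.every((id) => typeof id === "string")) {
    throw new Error("manifest.nodeIds must be an array of strings");
  }
  return { appId, version, nodeIds: nodeIds as string[] };
}
```

The fetch call then reduces to requesting `manifestUrl(appId)` and passing the decoded JSON through `parseManifest`, so a malformed response fails loudly instead of rendering a partial graph.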