Interactive emulator interface with typed observability hooks in emu/, shared simulation backend with command queue, CLI REPL, and web GUI with dataflow graph execution overlay. Six implementation phases.

# OR1 Monitor — Emulator Interface & Visual Monitor UI

## Summary

The OR1 Monitor is an interactive simulation tool that wraps the existing assembler (`asm/`) and emulator (`emu/`) into a controllable, observable runtime. The core problem it solves is that the emulator today runs to completion and only exposes a post-hoc `output_log`; the monitor replaces that with tick-by-tick stepping, real-time structured events, and full state inspection at any point during execution. Two front-ends are built on top of a shared simulation backend: a thin CLI REPL for scripted or headless use, and a browser-based visual monitor that overlays execution state on the dataflow graph.

The approach has three layers. First, a small set of typed event callbacks is added to `emu/pe.py` and `emu/sm.py` — these fire at well-defined points (token received, match completed, ALU executed, cell written, and so on) and are zero-cost when unused, so all existing tests continue to pass without changes. Second, a `monitor/backend.py` simulation thread owns the SimPy environment and exposes a command queue: callers send commands (`LoadCmd`, `StepTickCmd`, `RunUntilCmd`, etc.) and receive `StepResult` messages containing both the semantic events that fired and a full state snapshot. Third, the REPL and web server are thin adapters over this shared backend — they differ only in how they present results to the user, not in any simulation logic.

## Definition of Done

1. **New `monitor/` package** that integrates the assembler and emulator into an interactive simulation tool
2. **Observability hooks in `emu/`** — event callbacks or similar mechanism so external code can observe PE/SM state changes in real time (not just post-execution)
3. **Web GUI** showing the dataflow graph with execution state overlay (active nodes, token flow, SM cell states), with tick-level stepping controls and optional per-event granularity
4. **Thin CLI REPL** for headless use — load dfasm programs, inject tokens, step, inspect state, clear/reset
5. **Shared simulation backend** powering both interfaces

**Out of scope:** Modifying dfgraph; IDE-level debugging features.

## Acceptance Criteria

### or1-monitor.AC1: Monitor package loads and runs simulations
- **or1-monitor.AC1.1 Success:** `LoadCmd` with valid dfasm source assembles, builds topology, injects seed tokens, and returns `GraphLoaded` with IR graph and initial snapshot
- **or1-monitor.AC1.2 Success:** `LoadCmd` wires event callbacks into all PEs and SMs automatically
- **or1-monitor.AC1.3 Failure:** `LoadCmd` with invalid dfasm source returns assembly errors without crashing the backend
- **or1-monitor.AC1.4 Success:** `ResetCmd` tears down the current simulation and leaves the backend ready for a new `LoadCmd`
- **or1-monitor.AC1.5 Success:** `ResetCmd` after a load can optionally reload the same program

### or1-monitor.AC2: Observability hooks fire structured events
- **or1-monitor.AC2.1 Success:** PE fires `TokenReceived` when a token is dequeued from `input_store`
- **or1-monitor.AC2.2 Success:** PE fires `Matched` when a dyadic match completes with both operands
- **or1-monitor.AC2.3 Success:** PE fires `Executed` after ALU execution with op, result, and bool_out
- **or1-monitor.AC2.4 Success:** PE fires `Emitted` for each output token placed on route_table
- **or1-monitor.AC2.5 Success:** PE fires `IRAMWritten` after processing an `IRAMWriteToken`
- **or1-monitor.AC2.6 Success:** SM fires `TokenReceived` when a token is dequeued
- **or1-monitor.AC2.7 Success:** SM fires `CellWritten` on any cell state change (includes old and new presence)
- **or1-monitor.AC2.8 Success:** SM fires `DeferredRead` when a read blocks on a non-FULL cell
- **or1-monitor.AC2.9 Success:** SM fires `DeferredSatisfied` when a subsequent write satisfies a deferred read
- **or1-monitor.AC2.10 Success:** SM fires `ResultSent` when a result token is routed back to a PE
- **or1-monitor.AC2.11 Regression:** All existing emulator tests pass unchanged when no callback is provided

### or1-monitor.AC3: Web GUI displays execution state
- **or1-monitor.AC3.1 Success:** Graph renders in physical layout (grouped by PE) by default
- **or1-monitor.AC3.2 Success:** User can switch between physical and logical layout
- **or1-monitor.AC3.3 Success:** Active nodes highlight when their PE receives a token at that IRAM offset
- **or1-monitor.AC3.4 Success:** Edges animate when a token traverses them
- **or1-monitor.AC3.5 Success:** Event log panel shows SimEvents with simulation timestamps
- **or1-monitor.AC3.6 Success:** Event log is filterable by component (pe:0, sm:1) and event type
- **or1-monitor.AC3.7 Success:** State inspector shows full PE state (IRAM, matching store, gen counters, input queue, output log)
- **or1-monitor.AC3.8 Success:** State inspector shows full SM state (cells with presence, deferred reads, T0 store)
- **or1-monitor.AC3.9 Success:** Step tick button advances simulation to the next time and updates all panels
- **or1-monitor.AC3.10 Success:** Step event button advances exactly one SimPy event
- **or1-monitor.AC3.11 Success:** Clicking a node shows its matching store state — occupied slots, stored data, which port is waiting for a partner
- **or1-monitor.AC3.12 Success:** Nodes visually indicate matching state (e.g., a half-matched node shows it has one operand waiting)

### or1-monitor.AC4: CLI REPL provides headless interaction
- **or1-monitor.AC4.1 Success:** `load <path>` assembles and loads a dfasm program
- **or1-monitor.AC4.2 Success:** `step` and `stepe` advance by tick and by event respectively
- **or1-monitor.AC4.3 Success:** `run <until>` runs simulation to target time
- **or1-monitor.AC4.4 Success:** `send <args>` injects a token via FIFO (wakes waiting processes)
- **or1-monitor.AC4.5 Success:** `inject <args>` does direct injection (pre-sim setup)
- **or1-monitor.AC4.6 Success:** `state`, `pe <id>`, `sm <id>` display readable state information
- **or1-monitor.AC4.7 Success:** `log` shows recent events, `log filter` narrows by component
- **or1-monitor.AC4.8 Failure:** Commands on an unloaded simulation return clear error messages
- **or1-monitor.AC4.9 Success:** `reset` clears simulation state and `quit` exits cleanly

### or1-monitor.AC5: Shared backend powers both interfaces
- **or1-monitor.AC5.1 Success:** REPL and web server use the same `SimulationBackend` instance (no code duplication)
- **or1-monitor.AC5.2 Success:** `StepTickCmd` processes all events at the current simulation time before returning
- **or1-monitor.AC5.3 Success:** `StepEventCmd` processes exactly one event
- **or1-monitor.AC5.4 Success:** `RunUntilCmd` batches events per tick to avoid flooding
- **or1-monitor.AC5.5 Success:** `StepResult` contains both semantic events and a full state snapshot
- **or1-monitor.AC5.6 Edge:** Stepping when simulation is finished (`env.peek() == inf`) returns `finished=True` without error

## Glossary

- **SimPy**: A Python discrete-event simulation library. The emulator models time as a series of events processed by a single-threaded `Environment`; "stepping" means advancing the environment by one event or one unit of simulation time.
- **PE (Processing Element)**: A compute node in the dataflow CPU. Each PE holds an instruction RAM (IRAM), a matching store, and generation counters. It consumes tokens from an input queue, pairs dyadic operands, executes ALU instructions, and emits result tokens.
- **SM (Structure Memory)**: A memory unit implementing I-structure semantics — each cell has a presence state (`EMPTY`, `RESERVED`, `FULL`, `WAITING`) and supports deferred reads: a read on a non-full cell blocks until a subsequent write satisfies it.
- **Token**: The unit of data movement in the dataflow machine. A `CMToken` targets a PE (carrying operand data to a specific IRAM offset and context slot); an `SMToken` targets a Structure Memory (carrying a read/write operation).
- **DyadToken / MonadToken**: Subtypes of `CMToken`. A `DyadToken` carries one operand of a two-input (dyadic) instruction and must be paired with its partner before execution proceeds. A `MonadToken` carries a complete single-operand input.
- **IRAM (Instruction RAM)**: Per-PE memory storing `ALUInst` and `SMInst` records. A token's `offset` field indexes into IRAM to select which instruction to execute.
- **Matching store**: A 2D array inside each PE (`[ctx_slots][offsets]`) that buffers one operand of a dyadic instruction while waiting for the other. When both operands arrive, the match completes and execution proceeds.
- **Context slot (`ctx`)**: An index into the matching store that provides re-entrancy — different invocations of the same IRAM instruction can be in flight simultaneously, each occupying a distinct context slot.
- **Generation counter**: A per-context counter in each PE used to detect and discard stale tokens from superseded invocations.
- **I-structure semantics**: A single-assignment memory discipline where each cell can be written exactly once; reads that arrive before the write are deferred and satisfied automatically when the write occurs.
- **T0 / T1 (memory tiers)**: Two address ranges within Structure Memory. T1 (below the tier boundary) is the normal I-structure tier with presence tracking and deferred reads. T0 (at or above the tier boundary) is shared raw storage across all SM instances.
- **SimEvent**: A frozen dataclass union (`TokenReceived | Matched | Executed | Emitted | ...`) that captures a single observable occurrence during simulation, including its simulation timestamp and the component that fired it.
- **EventCallback**: A `Callable[[SimEvent], None]` injected into PE and SM constructors. When provided, it is called synchronously at each instrumentation point. Defaults to no-op.
- **SimulationBackend**: The `monitor/backend.py` class that owns the SimPy environment in a dedicated thread, receives `SimCommand` objects via a queue, and returns `StepResult` objects via a result queue.
- **StateSnapshot**: A pure-Python dataclass capturing a complete dump of all PE and SM state at a moment in simulation time.
- **StepResult**: The response type returned by the backend after any stepping command. Contains the list of `SimEvent`s that fired, a `StateSnapshot`, the current simulation time, and a `finished` flag.
- **dfasm**: The domain-specific assembly language for the OR1 dataflow CPU. Programs are compiled by the `asm/` package into PE/SM configurations.
- **IRGraph**: The intermediate representation produced by the assembler's lowering pass, describing the dataflow graph before code generation. Rendered as JSON in the frontend.
- **Cytoscape.js**: A JavaScript graph visualisation library used for rendering the dataflow graph in the browser.
- **ELK (Eclipse Layout Kernel)**: A graph layout algorithm library (used via `cytoscape-elk`) that automatically positions graph nodes. Supports logical and physical layout modes.
- **`frontend-common/`**: A shared TypeScript library extracted from `dfgraph/frontend/` to avoid duplicating graph rendering, layout switching, and export logic.
- **`cmd.Cmd`**: Python's standard-library base class for line-oriented command interpreters. The CLI REPL subclasses it.

## Architecture

New top-level `monitor/` package that integrates the assembler (`asm/`) and emulator (`emu/`) into an interactive simulation tool. Two observation mechanisms work together: typed event callbacks in PE/SM for semantic events (what happened), and external state snapshots for full system dumps (what everything looks like now).

A persistent simulation thread runs SimPy, controlled by a thread-safe command queue. Both the CLI REPL and web GUI send commands to the same backend and receive the same event/snapshot responses — they differ only in presentation.

### Dependency Flow

```
cm_inst.py <-- tokens.py <-- emu/events.py (new, self-contained)
                                  |
                  emu/pe.py, emu/sm.py (modified: accept on_event)
                                  |
                  emu/types.py (modified: on_event in PEConfig/SMConfig)
                                  |
                  emu/network.py (modified: wire on_event through build_topology)
                                  |
        asm/ -----------------> monitor/backend.py
                                  |
                          monitor/snapshot.py
                          monitor/repl.py
                          monitor/server.py
                                  |
                 monitor/frontend/ --> frontend-common/
                 dfgraph/frontend/ --> frontend-common/
```

`emu/events.py` imports only from `cm_inst` and `tokens` (which `emu/` already depends on). No circular dependencies introduced. `monitor/` imports from both `emu/` and `asm/` but neither imports from `monitor/`.

### Observability Layer (`emu/events.py`)

Union of frozen dataclasses, pattern-matchable with `match/case`:

```python
@dataclass(frozen=True)
class TokenReceived:
    time: float
    component: str  # 'pe:0', 'sm:1'
    token: Token

@dataclass(frozen=True)
class Matched:
    time: float
    component: str
    left: int
    right: int
    ctx: int
    offset: int

@dataclass(frozen=True)
class Executed:
    time: float
    component: str
    op: ALUOp | MemOp
    result: int
    bool_out: bool

# ... Emitted, IRAMWritten, CellWritten, DeferredRead, DeferredSatisfied, ResultSent

SimEvent = TokenReceived | Matched | Executed | Emitted | IRAMWritten | CellWritten | DeferredRead | DeferredSatisfied | ResultSent
EventCallback = Callable[[SimEvent], None]
```

PE fires: `TokenReceived`, `Matched`, `Executed`, `Emitted`, `IRAMWritten`.
SM fires: `TokenReceived`, `CellWritten`, `DeferredRead`, `DeferredSatisfied`, `ResultSent`.

PE and SM constructors gain `on_event: EventCallback | None = None`. `PEConfig` and `SMConfig` gain the same field so `build_topology()` can wire callbacks through. Zero-cost when unused (defaults to no-op).

### Command/Result Protocol (`monitor/backend.py`)

Commands sent to the simulation thread:

```python
SimCommand = LoadCmd | StepTickCmd | StepEventCmd | RunUntilCmd | InjectCmd | SendCmd | ResetCmd | StopCmd
```

- `LoadCmd(source: str)` — assemble, build topology, inject seed tokens, wire callbacks
- `StepTickCmd` — loop `env.step()` while `env.peek() == current_time`
- `StepEventCmd` — call `env.step()` exactly once
- `RunUntilCmd(until: float)` — step continuously to target time, batch events per tick
- `InjectCmd(token: Token)` — direct append via `sys.inject()` (pre-sim setup, does not wake processes)
- `SendCmd(token: Token)` — spawns SimPy process calling `sys.send()` (respects FIFO, wakes waiting processes)
- `ResetCmd` — tear down environment, optionally reload last program
- `StopCmd` — terminate simulation thread
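
The `StepTickCmd` handler reduces to a small loop over `env.peek()`/`env.step()`. A runnable sketch, with a minimal `FakeEnv` standing in for `simpy.Environment` so the example is self-contained:

```python
import math

class FakeEnv:
    """Stand-in for simpy.Environment: peek() returns the time of the next
    scheduled event (inf when none remain), step() processes it."""

    def __init__(self, times: list[float]):
        self._queue = sorted(times)  # scheduled event times
        self.now = 0.0

    def peek(self) -> float:
        return self._queue[0] if self._queue else math.inf

    def step(self) -> None:
        self.now = self._queue.pop(0)

def step_tick(env) -> list[float]:
    """StepTickCmd: drain every event scheduled at the next simulation time."""
    processed: list[float] = []
    if env.peek() == math.inf:
        return processed            # simulation finished, nothing to do
    tick = env.peek()
    while env.peek() == tick:       # loop env.step() while env.peek() == current_time
        env.step()
        processed.append(env.now)
    return processed
```

Stepping a finished environment simply returns an empty batch, matching the `finished=True` edge case in AC5.6.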

Results returned via event queue:

```python
@dataclass
class StepResult:
    events: list[SimEvent]
    snapshot: StateSnapshot
    sim_time: float
    finished: bool  # True when env.peek() == inf
```
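
The thread-and-queues shape of the backend can be sketched as follows; the string commands and the `submit` helper are illustrative placeholders for the real `SimCommand`/`StepResult` types:

```python
import queue
import threading

class SimulationBackend:
    """One thread owns the simulation; callers interact only through queues."""

    def __init__(self) -> None:
        self.commands: queue.Queue = queue.Queue()
        self.results: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self) -> None:
        # The simulation thread: the only place SimPy objects would be touched.
        while True:
            cmd = self.commands.get()        # block until a command arrives
            if cmd == "stop":                # StopCmd terminates the thread
                self.results.put("stopped")
                return
            # A real backend would dispatch on the SimCommand union here
            # and put a StepResult on the results queue.
            self.results.put(f"handled:{cmd}")

    def submit(self, cmd: str) -> str:
        """Synchronous helper in the REPL style: send, then block on the result."""
        self.commands.put(cmd)
        return self.results.get()
```

The REPL uses the blocking `submit` shape directly; the web server would instead poll the result queue from an async task.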

### State Snapshots (`monitor/snapshot.py`)

Reads all PE/SM state from existing public attributes:

- PE: `iram`, `matching_store`, `gen_counters`, `input_store.items`, `output_log`
- SM: `cells` (presence + data), `deferred_read`, `t0_store`, `input_store.items`
- System: `env.now`, `env.peek()`
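
A sketch of how `capture()` might copy those attributes out, using a dict-shaped stand-in for the `System` object and illustrative field names (the SM side would be traversed the same way). The point worth showing is that the snapshot copies data rather than aliasing it, so later simulation steps cannot mutate an already-returned snapshot:

```python
from dataclasses import dataclass, field

@dataclass
class StateSnapshot:
    """Pure-Python dump of component state at one moment (fields illustrative)."""
    sim_time: float
    pes: dict[int, dict] = field(default_factory=dict)
    sms: dict[int, dict] = field(default_factory=dict)

def capture(sys_: dict, now: float) -> StateSnapshot:
    """Read-only traversal of public attributes; no SimPy objects escape."""
    snap = StateSnapshot(sim_time=now)
    for pe_id, pe in sys_["pes"].items():
        snap.pes[pe_id] = {
            "iram": list(pe["iram"]),                  # copied, so later
            "gen_counters": list(pe["gen_counters"]),  # simulation steps
            "output_log": list(pe["output_log"]),      # cannot mutate them
        }
    return snap
```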

### WebSocket Protocol (`monitor/server.py`)

Bidirectional WebSocket at `/ws`:

Server → Client: `GraphLoaded` (IR graph JSON + initial snapshot), `StepResult` (events + snapshot), `Error`.

Client → Server: `step_tick`, `step_event`, `run_until`, `pause`, `send`, `inject`.

REST endpoints for non-WebSocket clients: `POST /load`, `POST /reset`, `GET /state`.

### CLI REPL (`monitor/repl.py`)

Built on `cmd.Cmd`. Commands: `load`, `reset`, `step`, `stepe`, `run`, `send`, `inject`, `state`, `pe`, `sm`, `log`, `time`, `help`, `quit`. Synchronous queue interaction — puts commands, blocks on results. Pretty-printed output with ANSI colours.
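
A minimal sketch of the `cmd.Cmd` skeleton: the backend wiring is stubbed out, only a few commands are shown, and the `log` list exists purely so the example is observable:

```python
import cmd

class MonitorShell(cmd.Cmd):
    """Illustrative REPL skeleton; the real shell submits SimCommands."""
    intro = "OR1 monitor. Type help or ? to list commands."
    prompt = "(or1) "

    def __init__(self, backend=None):
        super().__init__()
        self.backend = backend       # would be the shared SimulationBackend
        self.log: list[str] = []

    def do_step(self, arg: str) -> None:
        """step -- advance the simulation by one tick."""
        self.log.append("step_tick")  # real version: submit StepTickCmd, print result

    def do_stepe(self, arg: str) -> None:
        """stepe -- advance the simulation by one SimPy event."""
        self.log.append("step_event")

    def do_quit(self, arg: str) -> bool:
        """quit -- exit the monitor."""
        return True                   # a truthy return stops cmd.Cmd's loop
```

`cmd.Cmd` gives command dispatch, `help`, and line editing for free; each `do_<name>` method becomes a REPL command, and its docstring becomes its help text.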

### Frontend

Shared `frontend-common/` library extracted from `dfgraph/frontend/`: Cytoscape.js setup, ELK layout engine, base node/edge styles, logical/physical layout switching, SVG/PNG export. Both `dfgraph/frontend/` and `monitor/frontend/` depend on it.

Monitor frontend has three panels:

- **Graph View** — Dataflow graph with execution overlay. Physical layout by default (grouped by PE). Nodes highlight on execution, edges animate on token traversal. Click a node to inspect PE state at that offset.
- **Event Log** — Scrolling list of `SimEvent`s, filterable by component and event type. Click to highlight the relevant graph element.
- **State Inspector** — Collapsible tree view of the full `StateSnapshot`. PE section (IRAM, matching store, gen counters), SM section (cells, deferred reads, T0 store).

Controls: load file, step tick/event, run until, run/pause, send token, inject token, reset.

## Existing Patterns

### dfgraph Server Architecture

`dfgraph/server.py` establishes the FastAPI + WebSocket pattern: `ConnectionManager` for WebSocket broadcast, lifespan context manager for startup/shutdown, background file watcher on a separate thread bridged to async via `asyncio.run_coroutine_threadsafe()`. The monitor server follows this pattern with the addition of bidirectional WebSocket messages (dfgraph is server-push-only).

### dfgraph Frontend Stack

`dfgraph/frontend/` uses Cytoscape.js with the ELK layout plugin, TypeScript types matching the JSON schema, and SVG/PNG export. These patterns are extracted into `frontend-common/` rather than duplicated.

### Graph JSON Schema

`dfgraph/graph_json.py` produces `GraphUpdate` messages with nodes (id, opcode, category, colour, pe, iram_offset, ctx) and edges (source, target, port, addr). Monitor extends this schema with execution state fields rather than replacing it.

### Emulator Test Patterns

Tests drive the emulator via `assemble() → build_topology() → inject() → env.run() → inspect output_log`. The monitor backend follows the same pipeline but replaces `env.run()` with controlled stepping and replaces post-hoc `output_log` inspection with real-time event callbacks.

### No Existing Observability

Investigation found no event hooks, callback systems, or trace facilities in `emu/` beyond `PE.output_log` (an append-only list) and standard Python `logging` calls. The typed event system in `emu/events.py` is a new pattern.

## Implementation Phases

<!-- START_PHASE_1 -->
### Phase 1: Observability Hooks
**Goal:** Add typed event system to `emu/` so PE and SM can emit structured events during simulation

**Components:**
- `emu/events.py` — SimEvent union type, all event dataclasses, `EventCallback` type alias
- `emu/pe.py` — accept `on_event` callback, fire events at token_received, matched, executed, emitted, iram_written points
- `emu/sm.py` — accept `on_event` callback, fire events at token_received, cell_written, deferred_read, deferred_satisfied, result_sent points
- `emu/types.py` — add `on_event: EventCallback | None` field to `PEConfig` and `SMConfig`
- `emu/network.py` — wire `on_event` from configs through `build_topology()`

**Dependencies:** None (first phase)

**Done when:**
- All existing tests pass unchanged (callbacks default to no-op)
- New tests verify each event type fires at the correct point with correct payload
- Covers `or1-monitor.AC2.*`
<!-- END_PHASE_1 -->

<!-- START_PHASE_2 -->
### Phase 2: Simulation Backend
**Goal:** Controllable simulation engine with command queue, stepping, and state snapshots

**Components:**
- `monitor/__init__.py` — package init
- `monitor/snapshot.py` — `StateSnapshot` dataclass, `capture(sys: System)` function
- `monitor/backend.py` — `SimulationBackend` class: persistent thread, command queue, event queue, all command handlers (`LoadCmd` through `StopCmd`)

**Dependencies:** Phase 1 (event callbacks)

**Done when:**
- Backend can load a dfasm program, step by tick and by event, run until target time, inject/send tokens, reset
- State snapshots capture all PE/SM state correctly
- Covers `or1-monitor.AC1.*`, `or1-monitor.AC5.*`
<!-- END_PHASE_2 -->

<!-- START_PHASE_3 -->
### Phase 3: CLI REPL
**Goal:** Usable command-line interface for interactive simulation

**Components:**
- `monitor/repl.py` — `cmd.Cmd` subclass with all commands
- `monitor/__main__.py` — CLI entry point (`python -m monitor`)

**Dependencies:** Phase 2 (simulation backend)

**Done when:**
- REPL can load a dfasm file, step through execution, inspect PE/SM state, inject tokens, reset
- Output is readable with ANSI colour formatting
- Covers `or1-monitor.AC4.*`
<!-- END_PHASE_3 -->

<!-- START_PHASE_4 -->
### Phase 4: Frontend Common Extraction
**Goal:** Extract shared graph rendering code from dfgraph into a reusable library

**Components:**
- `frontend-common/` — shared TypeScript library (graph renderer, types, layout switching, export)
- `dfgraph/frontend/` — refactored to import from `frontend-common/`

**Dependencies:** None (can run in parallel with Phases 1-3, but sequenced here for clarity)

**Done when:**
- `frontend-common/` builds independently
- `dfgraph/frontend/` works identically to before but imports shared code from `frontend-common/`
- dfgraph's existing functionality is unaffected
<!-- END_PHASE_4 -->

<!-- START_PHASE_5 -->
### Phase 5: Web Server
**Goal:** FastAPI server with WebSocket protocol for the monitor frontend

**Components:**
- `monitor/server.py` — FastAPI app, WebSocket endpoint, REST endpoints, async event queue polling
- `monitor/graph_json.py` — graph JSON with execution overlay fields

**Dependencies:** Phase 2 (simulation backend)

**Done when:**
- Server starts, accepts WebSocket connections, handles all client commands
- `GraphLoaded` message sent on program load with IR graph + initial snapshot
- `StepResult` messages broadcast after each step
- Covers `or1-monitor.AC3.*` (server-side), `or1-monitor.AC5.*` (shared backend)
<!-- END_PHASE_5 -->

<!-- START_PHASE_6 -->
### Phase 6: Monitor Frontend
**Goal:** Browser-based simulation monitor with graph overlay, event log, and state inspector

**Components:**
- `monitor/frontend/` — TypeScript frontend
  - Graph view with execution overlay (physical layout default, logical available)
  - Event log panel (filterable, clickable)
  - State inspector panel (collapsible tree view)
  - Controls (load, step, run, pause, send, inject, reset)

**Dependencies:** Phase 4 (frontend-common), Phase 5 (web server)

**Done when:**
- Graph renders with execution state overlay (active nodes, token flow animation)
- Event log displays and filters SimEvents
- State inspector shows full PE/SM state
- All controls functional
- Covers `or1-monitor.AC3.*` (client-side)
<!-- END_PHASE_6 -->

## Additional Considerations

**SimPy threading:** SimPy is single-threaded and synchronous. The simulation thread owns the `Environment` exclusively. All state reads from the async server or REPL happen via the command/result queues or snapshots taken within the sim thread. No concurrent access to SimPy objects.

**Backpressure during RunUntil:** When free-running to a target time, events are batched per tick before pushing to the event queue. This prevents flooding the WebSocket with thousands of individual events during a long run. The frontend receives tick-level batches and can animate them at its own pace.
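
The per-tick batching can be sketched with `itertools.groupby`, relying on events coming off the simulation in non-decreasing time order (the `(time, event)` tuple shape and the batch dict layout are assumptions for illustration):

```python
from itertools import groupby

def batch_per_tick(events: list[tuple[float, str]]) -> list[dict]:
    """Collapse a time-ordered (time, event) stream into one batch per tick,
    so a long RunUntil pushes a handful of messages instead of thousands."""
    return [
        {"tick": tick, "events": [ev for _, ev in group]}
        for tick, group in groupby(events, key=lambda te: te[0])
    ]
```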

**SendCmd implementation:** Since `sys.send()` is a SimPy generator, `SendCmd` spawns a one-shot SimPy process that calls `yield from sys.send(token)`. The backend then steps the environment to process the send event, which wakes any process waiting on the destination store.
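
Stripped of SimPy specifics, the one-shot wrapper is just `yield from` over another generator. A self-contained sketch in which plain generator driving stands in for `env.process(...)` plus stepping (all names here are illustrative):

```python
def send(store: list, token):
    """Stand-in for sys.send(): a generator function, as SimPy processes are.
    In the real emulator the yield would be `yield store.put(token)`."""
    yield "put-event"
    store.append(token)

def one_shot(store: list, token):
    """The wrapper process that SendCmd would spawn via env.process(...)."""
    yield from send(store, token)

def drive(proc) -> int:
    """Stand-in for stepping the environment until the process finishes."""
    steps = 0
    for _ in proc:
        steps += 1
    return steps
```

Because `one_shot` delegates with `yield from`, every event `sys.send()` yields is surfaced to the scheduler exactly as if `sys.send()` had been spawned directly.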