OR-1 dataflow CPU sketch
# CLAUDE.md — OR1 Dataflow CPU

## Version Control: jj (Jujutsu)

This is a **jj-colocated repository** (both `.jj` and `.git` present). You MUST use `jj` commands, NOT `git` commands, for all version control operations.

### Critical Rules

- **NEVER use `git add`, `git commit`, `git status`, `git diff`, or `git log`.**
  Using raw git commands in a jj repo creates orphan commits and corrupts working copy tracking.
- All file changes are automatically tracked by jj. There is no staging area.

### Command Mapping

| Instead of | Use |
|---|---|
| `git status` | `jj status` |
| `git diff` | `jj diff` |
| `git diff --staged` | `jj diff` (no staging concept) |
| `git log` | `jj log` |
| `git add . && git commit -m "msg"` | `jj commit -m "msg"` |
| `git add file && git commit -m "msg"` | `jj commit -m "msg"` (tracks all changes) |
| `git log --oneline -10` | `jj log --limit 10` |
| `git diff HEAD~3..HEAD` | `jj diff --from 'ancestors(@,3)'` |
| `git rev-parse HEAD` | `jj log -r @ --no-graph -T 'commit_id ++ "\n"'` |
| `git rev-parse HEAD~N` | `jj log -r 'ancestors(@,N)' --no-graph -T 'commit_id ++ "\n"' --limit 1 --reversed` |

### Commit Workflow

```bash
# Make your changes to files (no git add needed)
# Then commit all changes:
jj commit -m "feat: description of change"

# The working copy (@) is always a new empty change after committing.
# To see what you just committed:
jj log --limit 5
```

### Bookmarks (Branches)

```bash
# List bookmarks
jj bookmark list

# Current bookmark is 'emu'
# After committing, move bookmark forward:
jj bookmark set emu -r @-
```

## Project Structure

- `cm_inst.py` — Instruction set definitions (Port, MemOp, ALUOp hierarchy, Instruction, FrameDest, OutputStyle, TokenKind, FrameOp, FrameSlotValue)
- `tokens.py` — Token type hierarchy (Token -> PEToken -> CMToken -> DyadToken/MonadToken; PEToken -> PELocalWriteToken/FrameControlToken; Token -> SMToken). Imports ISA enums from cm_inst.
- `encoding.py` — Pack/unpack boundary between semantic types and 16-bit hardware words (instruction encoding, flit1 routing words)
- `sm_mod.py` — Structure Memory cell model (Presence enum, SMCell dataclass with `is_wide` metadata flag)
- `dfasm.lark` — Lark grammar for dfasm graph assembly language
- `emu/` — Behavioural emulator package (SimPy-based discrete event simulation)
  - `emu/events.py` — Simulation event types: `SimEvent` union (TokenReceived, Matched, Executed, Emitted, IRAMWritten, FrameSlotWritten, FrameAllocated, FrameFreed, TokenRejected, CellWritten, DeferredRead, DeferredSatisfied, ResultSent) and `EventCallback` type alias
  - `emu/types.py` — Config and internal types (PEConfig, SMConfig, DeferredRead). PEConfig/SMConfig accept `on_event: EventCallback | None` for simulation observability.
  - `emu/alu.py` — Pure-function ALU: `execute(op, left, right, const) -> (result, bool_out)`
  - `emu/pe.py` — ProcessingElement: matching store, IRAM fetch, output routing, event emission via `on_event` callback
  - `emu/sm.py` — StructureMemory: I-structure semantics with deferred reads, event emission via `on_event` callback
  - `emu/network.py` — `build_topology()` wiring and `System` container; passes `on_event` from configs to PE/SM constructors
  - `emu/__init__.py` — Public API: exports `System`, `build_topology`, `PEConfig`, `SMConfig`, all event types from `emu/events`
- `asm/` — Assembler package: dfasm source to emulator-ready config (see `asm/CLAUDE.md`)
  - `asm/__init__.py` — Public API: `assemble()`, `assemble_to_tokens()`, `run_pipeline()`, `round_trip()`, `serialize_graph()`
  - `asm/ir.py` — IR types (IRNode, IREdge, IRGraph, IRDataDef, IRRegion, SystemConfig, MacroDef, IRMacroCall, CallSite, etc.)
  - `asm/errors.py` — Structured error types with source context (ErrorCategory includes MACRO and CALL)
  - `asm/opcodes.py` — Opcode mnemonic mapping and arity classification
  - `asm/lower.py` — CST to IRGraph lowering pass
  - `asm/expand.py` — Macro expansion (opcode params, parameterized qualifiers, `@ret` wiring, variadic repetition) and function call wiring pass
  - `asm/builtins.py` — Built-in macro library (`#loop_counted`, `#loop_while`, `#permit_inject`, `#reduce_2`/`_3`/`_4`)
  - `asm/resolve.py` — Name resolution pass
  - `asm/place.py` — Placement validation and auto-placement
  - `asm/allocate.py` — IRAM offset and context slot allocation
  - `asm/codegen.py` — Code generation (direct mode + token stream mode)
  - `asm/serialize.py` — IRGraph to dfasm source serializer
- `dfgraph/` — Interactive dataflow graph renderer (see `dfgraph/CLAUDE.md`)
  - `dfgraph/__main__.py` — CLI: `python -m dfgraph path/to/file.dfasm [--port 8420]`
  - `dfgraph/pipeline.py` — Progressive pipeline runner (parse -> lower -> expand -> resolve -> place -> allocate with error accumulation)
  - `dfgraph/categories.py` — Opcode-to-category mapping via isinstance dispatch on ALUOp hierarchy
  - `dfgraph/graph_json.py` — IRGraph-to-JSON conversion for frontend consumption
  - `dfgraph/server.py` — FastAPI backend with WebSocket push and file watcher (watchdog, 300ms debounce)
  - `dfgraph/frontend/` — TypeScript frontend: Cytoscape.js graph with ELK layout, SVG/PNG export (uses `frontend-common/`)
- `frontend-common/` — Shared TypeScript modules extracted from dfgraph for reuse by monitor frontend
  - `frontend-common/src/layout.ts` — ELK layout configurations
  - `frontend-common/src/style.ts` — Cytoscape style definitions
  - `frontend-common/src/export.ts` — SVG/PNG export utilities
  - `frontend-common/src/types.ts` — Shared TypeScript interfaces for graph data
- `monitor/` — Interactive simulation monitor with CLI REPL and web UI (see `monitor/CLAUDE.md`)
  - `monitor/__init__.py` — Public API: exports `SimulationBackend`, command types, result types, `StateSnapshot`, `capture()`
  - `monitor/__main__.py` — CLI: `python -m monitor [path/to/file.dfasm] [--web] [--port 8421]`
  - `monitor/backend.py` — `SimulationBackend`: threaded simulation controller with command/result queue protocol
  - `monitor/commands.py` — Command types (LoadCmd, StepTickCmd, StepEventCmd, RunUntilCmd, InjectCmd, SendCmd, ResetCmd, StopCmd) and result types (GraphLoaded, StepResult, ErrorResult)
  - `monitor/snapshot.py` — `StateSnapshot` frozen dataclass and `capture(system)` function for state extraction
  - `monitor/graph_json.py` — IRGraph + StateSnapshot to JSON serialization with execution overlay
  - `monitor/server.py` — FastAPI server with bidirectional WebSocket protocol
  - `monitor/repl.py` — `MonitorREPL(cmd.Cmd)` interactive CLI for simulation control
  - `monitor/formatting.py` — ANSI colour formatting for REPL output
  - `monitor/frontend/` — TypeScript frontend: Cytoscape.js graph with execution overlay, event log, state inspector (uses `frontend-common/`)
- `tests/` — pytest + hypothesis test suite
  - `tests/conftest.py` — Hypothesis strategies for token/op generation
  - `tests/test_sm_tiers.py` — T0/T1 memory tier and EXEC bootstrap tests
  - `tests/test_exec_bootstrap.py` — EXEC opcode acceptance criteria tests
  - `tests/test_migration_cleanup.py` — Verifies removed types (SysToken, CfgOp, etc.) are absent from codebase
  - `tests/test_pe_events.py` — PE event emission tests (TokenReceived, Matched, Executed, Emitted, IRAMWritten, FrameAllocated, FrameFreed, FrameSlotWritten, TokenRejected)
  - `tests/test_pe_frames.py` — Frame-based PE matching, routing, and lifecycle tests
  - `tests/test_pe_lanes.py` — Lane-based matching tests (ALLOC_SHARED, FREE_LANE, smart FREE, lane exhaustion, pipelining)
  - `tests/test_sm_events.py` — SM event emission tests (CellWritten, DeferredRead, DeferredSatisfied, ResultSent)
  - `tests/test_cycle_timing.py` — Cycle-accurate timing verification tests
  - `tests/test_network_events.py` — Network-level event propagation tests
  - `tests/test_network_routing.py` — Network routing and connectivity tests
  - `tests/test_foundation_types.py` — Foundation type (Instruction, FrameDest, encoding) tests
  - `tests/test_encoding.py` — Pack/unpack encoding tests for instructions and flit1 words
  - `tests/test_ir_frame_types.py` — IR frame-related type tests
  - `tests/test_allocate_frames.py` — Frame layout allocation tests
  - `tests/test_codegen_frames.py` — Frame-based code generation tests
  - `tests/test_sm_t0_raw.py` — SM T0 raw storage tests
  - `tests/test_backend.py` — SimulationBackend command/result protocol tests
  - `tests/test_snapshot.py` — StateSnapshot capture tests
  - `tests/test_repl.py` — MonitorREPL command tests
  - `tests/test_monitor_graph_json.py` — Monitor graph JSON serialization tests
  - `tests/test_monitor_server.py` — Monitor FastAPI server and WebSocket protocol tests
- `docs/` — Design documents, implementation plans, test plans

## Tech Stack

- Python 3.12
- SimPy 4.1 (discrete event simulation)
- Lark (Earley parser for dfasm grammar)
- FastAPI + uvicorn (dfgraph and monitor web servers)
- watchdog (file system monitoring for dfgraph live reload)
- Cytoscape.js + cytoscape-elk (frontend graph rendering and layout)
- pytest + hypothesis (property-based testing)
- Nix flake for dev environment

## Running Tests

```bash
python -m pytest tests/ -v
```

## Architecture Contracts

### Token Hierarchy (tokens.py)

All tokens inherit from `Token(target: int)`. The hierarchy:

- `PEToken(Token)` -- base for all PE-targeted tokens (used for routing in network.py)
  - `CMToken(PEToken)` -- adds `offset: int`, `act_id: int`, `data: int` (frozen dataclass, base for data-carrying tokens)
    - `DyadToken(CMToken)` -- adds `port: Port` (dyadic operand requiring match)
    - `MonadToken(CMToken)` -- adds `inline: bool` (monadic operand, no match required)
  - `PELocalWriteToken(PEToken)` -- adds `act_id`, `region`, `slot`, `data`, `is_dest` (writes frame slots directly)
  - `FrameControlToken(PEToken)` -- adds `act_id`, `op: FrameOp`, `payload: int` (frame alloc/free)
- `SMToken(Token)` -- `addr: int`, `op: MemOp`, `flags`, `data`, `ret: Optional[CMToken]`

### Instruction Set (cm_inst.py)

- `ALUOp(IntEnum)` base with subclasses: `ArithOp`, `LogicOp`, `RoutingOp`
- `MemOp(IntEnum)` -- read/write/atomic ops (READ, WRITE, EXEC, ALLOC, FREE, EXT, CLEAR, RD_INC, RD_DEC, CMP_SW, RAW_READ, SET_PAGE, WRITE_IMM)
- `Instruction(opcode, output: OutputStyle, has_const, dest_count, wide, fref)` -- unified instruction type for both ALU and memory ops, stored in PE IRAM
- `FrameDest(target_pe: int, offset: int, act_id: int, port: Port, token_kind: TokenKind)` -- destination address resolved from frame slot
- `FrameSlotValue = int | FrameDest | None` -- type alias for frame slot contents
- `OutputStyle` enum -- INHERIT, CHANGE_TAG, SINK for output routing decisions
- `TokenKind` enum -- DYADIC, MONADIC, INLINE for token kind classification
- `FrameOp(IntEnum)` -- ALLOC, FREE, ALLOC_SHARED, FREE_LANE for frame lifecycle control tokens
- `is_monadic_alu(op: ALUOp) -> bool` -- canonical source of truth for monadic ALU op classification (used by `emu/pe.py` and `asm/opcodes.py`)

### ALU (emu/alu.py)

Pure function, no state. `execute(op, left, right, const) -> (result: int, bool_out: bool)`.

**Invariants:**
- All results masked to 16-bit unsigned (`& 0xFFFF`)
- Comparisons interpret values as signed 2's complement via `to_signed()`
- ArithOp: `bool_out` always `False`
- RoutingOp: `bool_out` drives branch/switch/gate decisions

### Processing Element (emu/pe.py)

Frame-based processing element with activation context management.

**Frame Storage:**
- `frames: list[list[FrameSlotValue]]` -- 2D array [frame_id][slot_idx] holding FrameDest objects and constants (shared across all lanes)
- `tag_store: dict[int, tuple[int, int]]` -- maps act_id → (frame_id, lane) for activation-to-frame-and-lane lookup
- `match_data: list[list[list[Optional[int]]]]` -- 3D array [frame_id][match_slot][lane] for operand values waiting for partner
- `presence: list[list[list[bool]]]` -- 3D array [frame_id][match_slot][lane] for dyadic operand waiting state
- `port_store: list[list[list[Optional[Port]]]]` -- 3D array [frame_id][match_slot][lane] for operand port metadata
- `lane_count: int` -- number of matching lanes per frame
- `lane_free: dict[int, set[int]]` -- per-frame set of available lane IDs (created on ALLOC, deleted on full FREE)
- `free_frames: list[int]` -- pool of unallocated frame IDs
- `iram: dict[int, Instruction]` -- instruction memory indexed by offset

**Token Processing Pipeline:**
- Side paths (FrameControlToken, PELocalWriteToken): 1 cycle
- Dyadic CMToken: 5 cycles (dequeue + IFETCH + MATCH + EXECUTE + EMIT)
- Monadic CMToken: 4 cycles (dequeue + IFETCH + EXECUTE + EMIT)

**Matching Logic:**
- DyadToken arrives with act_id: look up (frame_id, lane) via tag_store
- Match slot is derived from token.offset: match_slot = token.offset % matchable_offsets
- If presence[frame_id][match_slot][lane] is False: store token.data in match_data[frame_id][match_slot][lane], store token.port in port_store[frame_id][match_slot][lane], set presence bit to True, wait for partner
- If presence[frame_id][match_slot][lane] is True: retrieve partner data and port from match_data and port_store, clear presence bit, fire instruction with both operands
- Port ordering: partner with Port.L goes to left operand; Port.R to right operand
- Match data, presence, and port storage are per-lane; frame constants/destinations (in frames) remain shared across all lanes

**Output Routing** (determined by `Instruction.output`):
- `OutputStyle.INHERIT` -- routes to destinations specified in frame slots
- `OutputStyle.CHANGE_TAG` -- routes with different act_id tag (context switch)
- `OutputStyle.SINK` -- writes result to frame slot, emits no token

**Frame Initialization:**
- PE constructor loads initial_frames from PEConfig: `dict[int, dict[int, FrameSlotValue] | list[FrameSlotValue]]`
- Dict value format: `{slot_idx: value}` where int values are unpacked via `unpack_flit1()` to FrameDest
- List value format: `[slot0, slot1, ...]` raw values assigned by index
- Handles both codegen-produced packed integers and test-produced FrameDest objects

**Output logging:**
- `PE.output_log: list` records every token emitted (for testing and tracing)

**Frame Control Operations** (`_handle_frame_control`):
- `ALLOC` -- allocates a fresh frame from free_frames, assigns lane 0, initializes lane_free with remaining lanes
- `FREE` -- smart free: removes act_id from tag_store, clears lane match state. If other activations share the frame, returns lane to lane_free (frame_freed=False). If last lane, returns frame to free_frames and clears frame slots (frame_freed=True)
- `ALLOC_SHARED` -- shared allocation: looks up parent act_id (from payload) in tag_store, finds parent's frame_id, assigns next free lane from lane_free. Rejects if parent not found or no free lanes
- `FREE_LANE` -- lane-only free: removes act_id, clears lane match state, returns lane to lane_free. Never returns frame to free_frames (frame_freed always False)

**ALLOC_REMOTE** (RoutingOp in `_run` pipeline):
- Reads fref+0 (target PE), fref+1 (target act_id), fref+2 (parent act_id) from frame constants
- If fref+2 is non-zero: emits FrameControlToken with ALLOC_SHARED op and parent act_id as payload
- If fref+2 is zero: emits FrameControlToken with ALLOC op (fresh frame allocation)

**PELocalWriteToken handling:**
- Writes data to frame slot at specified region/slot within the act_id's frame (1 cycle)

### Structure Memory (emu/sm.py)

SimPy process with I-structure (single-assignment) semantics.
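
As a rough illustration of the I-structure idea, the sketch below models one cell whose read is deferred until a write arrives. It is a minimal sketch, not the `emu/sm.py` implementation: `Cell`, `read`, `write`, and the `waiter` callback are hypothetical names, the `Presence` values mirror the cell states named in this document, and timing, return routing, and writes to an already FULL cell are left out.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Optional


class Presence(Enum):
    EMPTY = auto()
    RESERVED = auto()
    FULL = auto()
    WAITING = auto()


@dataclass
class Cell:
    """Hypothetical stand-in for one structure-memory cell."""
    pres: Presence = Presence.EMPTY
    data: Optional[int] = None
    waiter: Optional[Callable[[int], None]] = None  # parked reader, if any


def read(cell: Cell, reply: Callable[[int], None]) -> None:
    """Reply at once if the cell is FULL; otherwise park the reader."""
    if cell.pres is Presence.FULL:
        reply(cell.data)
    else:
        cell.pres = Presence.WAITING
        cell.waiter = reply


def write(cell: Cell, value: int) -> None:
    """Store the value and wake the parked reader, if there is one."""
    waiter, cell.waiter = cell.waiter, None
    cell.data = value
    cell.pres = Presence.FULL
    if waiter is not None:
        waiter(value)


results: list[int] = []
cell = Cell()
read(cell, results.append)   # cell is EMPTY, so the read is deferred
deferred_state = cell.pres   # the cell is now WAITING
write(cell, 42)              # the write satisfies the deferred read
read(cell, results.append)   # cell is FULL, so this read replies immediately
```

Both reads end up observing the same value; the first one simply receives it later, when the write lands.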

**Cell states** (`Presence` enum): `EMPTY`, `RESERVED`, `FULL`, `WAITING`

**Cycle-accurate pipeline timing:**
- Dequeue: 1 cycle (yields `env.timeout(1)` after `input_store.get()`)
- Each handler yields 1 cycle for its processing stage (process/write/read-modify-write cycle)
- `_send_result()` yields 1 cycle for the response/delivery cycle (blocks the SM while delivering)
- Total varies by operation: READ on FULL = dequeue(1) + process(1) + result(1) + put = 3+ cycles; WRITE = dequeue(1) + write(1) = 2 cycles

**Deferred read contract:**
- READ on non-FULL cell: sets cell to `WAITING`, stores `DeferredRead`
- Subsequent WRITE to that cell: satisfies deferred read, sends result via return route
- Only one deferred read at a time per SM instance
- CLEAR on a WAITING cell cancels the deferred read

**Atomic operations (RD_INC, RD_DEC, CMP_SW):**
- Restricted to cell addresses < `ATOMIC_CELL_LIMIT` (256)
- Cell must be `FULL`; returns old value via return route
- CMP_SW: compares `token.flags` (expected) with current; swaps to `token.data` on match

**Memory Tiers:**
- **T1 (below tier_boundary):** Per-SM I-structure cells with presence tracking, deferred reads, atomic ops. Default tier_boundary: 256.
- **T0 (at/above tier_boundary):** Shared raw storage across all SMs. No presence tracking. `list[Token]` shared by all SM instances.
- T0 operations: READ (immediate return), WRITE (no presence check), EXEC (inject tokens from T0 into network)
- I-structure ops on T0 addresses are errors (logged and dropped)

### Network Topology (emu/network.py)

`build_topology(env, pe_configs, sm_configs, fifo_capacity) -> System`

**Wiring contract:**
- Every PE gets a `route_table` mapping `pe_id -> simpy.Store` for all PEs
- Every PE gets `sm_routes` mapping `sm_id -> simpy.Store` for all SMs
- Every SM gets a `route_table` mapping `pe_id -> simpy.Store` for all PEs
- Default is full-mesh connectivity (any PE can send to any PE or SM)
- If `PEConfig.allowed_pe_routes` or `allowed_sm_routes` is set, `build_topology` restricts routes at construction time

**System API:**
- `System.inject(token: Token)` -- route token by type: SMToken → target SM, PEToken → target PE (direct append, bypasses FIFO)
- `System.send(token: Token)` -- same routing as inject() but yields `env.timeout(1)` for 1-cycle delivery latency then `store.put()` (SimPy generator, respects FIFO backpressure)
- `System.load(tokens: list[Token])` -- spawns SimPy process that calls send() for each token in order

**PEConfig (emu/types.py):**
- `pe_id: int`, `iram: dict[int, Instruction] | None`, `frame_count: int = 8`, `frame_slots: int = 64`, `matchable_offsets: int = 8`, `lane_count: int = 4`
- `initial_frames: Optional[dict[int, list[FrameSlotValue]]]` -- pre-loaded frame data
- `initial_tag_store: Optional[dict[int, tuple[int, int]]]` -- pre-loaded act_id → (frame_id, lane) mappings
- `allowed_pe_routes: Optional[set[int]]` -- if set, restrict PE route_table to these PE IDs
- `allowed_sm_routes: Optional[set[int]]` -- if set, restrict PE sm_routes to these SM IDs
- `on_event: EventCallback | None` -- if set, PE fires `SimEvent` for every token receive, match, execute, emit, frame alloc/free, slot write, and rejection

**SMConfig (emu/types.py):**
- `sm_id: int`, `cell_count: int = 512`, `initial_cells: Optional[dict]`, `tier_boundary: int = 256`
- `on_event: EventCallback | None` -- if set, SM fires `SimEvent` for every token receive, cell write, deferred read/satisfy, and result send
- `tier_boundary` controls the T0/T1 split: addresses below are T1 (I-structure), at/above are T0 (shared raw storage)
- All SM instances share the same `t0_store: list[Token]` (wired by `build_topology`)

### Simulation Events (emu/events.py)

Frozen dataclass event types emitted by PE and SM when `on_event` callback is set.

**Event types:**
- `TokenReceived(time, component, token)` -- PE/SM received a token
- `Matched(time, component, left, right, act_id, offset, frame_id)` -- PE matched a dyadic pair
- `Executed(time, component, op, result, bool_out)` -- PE executed an ALU instruction
- `Emitted(time, component, token)` -- PE emitted an output token
- `IRAMWritten(time, component, offset, count)` -- PE wrote instructions to IRAM
- `FrameAllocated(time, component, act_id, frame_id, lane)` -- PE allocated a frame (lane indicates which matching lane was assigned)
- `FrameFreed(time, component, act_id, frame_id, lane, frame_freed)` -- PE freed a frame lane (frame_freed=True if physical frame returned to pool)
- `FrameSlotWritten(time, component, frame_id, slot, value)` -- PE wrote to a frame slot
- `TokenRejected(time, component, token, reason)` -- PE rejected a token (e.g., act_id not in tag store)
- `CellWritten(time, component, addr, old_pres, new_pres)` -- SM cell presence changed
- `DeferredRead(time, component, addr)` -- SM registered a deferred read
- `DeferredSatisfied(time, component, addr, data)` -- SM satisfied a deferred read
- `ResultSent(time, component, token)` -- SM sent a result token back

**Union type:** `SimEvent = TokenReceived | Matched | Executed | Emitted | IRAMWritten | FrameAllocated | FrameFreed | FrameSlotWritten | TokenRejected | CellWritten | DeferredRead | DeferredSatisfied | ResultSent`

**Callback type:** `EventCallback = Callable[[SimEvent], None]`

**Invariants:**
- `component` field is always `"pe:{id}"` or `"sm:{id}"` format
- All events have `time: float` matching `env.now` when the event occurred
- Events are emitted synchronously within the SimPy process step; no buffering

### Monitor (monitor/)

Interactive simulation monitor providing both CLI REPL and web UI for controlling and observing OR1 simulations.

**Command/Result protocol (monitor/commands.py):**
- Commands: `LoadCmd(source)`, `StepTickCmd()`, `StepEventCmd()`, `RunUntilCmd(until)`, `InjectCmd(token)`, `SendCmd(token)`, `ResetCmd(reload)`, `StopCmd()`
- Results: `GraphLoaded(ir_graph, snapshot)`, `StepResult(events, snapshot, sim_time, finished)`, `ErrorResult(message, errors)`
- `SimCommand` union: `LoadCmd | StepTickCmd | StepEventCmd | RunUntilCmd | InjectCmd | SendCmd | ResetCmd | StopCmd`

**SimulationBackend (monitor/backend.py):**
- Owns SimPy environment in a dedicated daemon thread
- `start()` / `stop()` lifecycle; `send_command(cmd, timeout) -> result`
- Wires `on_event` callback into all PEConfig/SMConfig during `LoadCmd` handling
- Reset with `reload=True` reloads last source; `reload=False` tears down and awaits new `LoadCmd`

**StateSnapshot (monitor/snapshot.py):**
- `capture(system) -> StateSnapshot` reads live PE/SM state into frozen dataclasses
- `StateSnapshot(sim_time, next_time, pes: dict[int, PESnapshot], sms: dict[int, SMSnapshot])`
- `PESnapshot(pe_id, iram, frames, tag_store, presence, port_store, match_data, free_frames, lane_count, input_queue, output_log)` -- frame-based PE state with 3D match storage (presence, port_store, match_data are all [frame_id][match_slot][lane]), tag_store mapping act_id → (frame_id, lane) tuples, and lane_count field
- `SMSnapshot(sm_id, cells: dict[int, SMCellSnapshot], deferred_read, t0_store, input_queue)`

**WebSocket protocol (monitor/server.py):**
- Client sends JSON commands: `{"type": "load", "source": "..."}`, `{"type": "step_tick"}`, etc.
- Server responds with JSON results containing event arrays and state snapshots
- `create_app(backend) -> FastAPI` factory function

**CLI REPL (monitor/repl.py):**
- `MonitorREPL(cmd.Cmd)` with commands: load, step, event, run, inject, send, reset, pe, sm, state, quit

### Module Dependency Graph

`cm_inst.py` defines ISA enums and instruction types (no dependencies). `tokens.py` imports from `cm_inst.py` and defines the token hierarchy. `sm_mod.py` is independent. `emu/events.py` imports from `cm_inst`, `sm_mod`, and `tokens`. The `emu/` package imports from root-level modules but root-level modules never import from `emu/`. The `asm/` package imports from both root-level modules and `emu/types.py` (for PEConfig/SMConfig), but neither root-level modules nor `emu/` import from `asm/`. The `dfgraph/` package imports from `cm_inst`, `asm/` (ir, lower, resolve, place, allocate, errors, opcodes), and internally between its own modules. The `monitor/` package imports from `cm_inst`, `tokens`, `sm_mod`, `emu/` (events, types, network), `asm/` (ir, codegen, run_pipeline), and `dfgraph/` (categories). Neither root-level modules, `emu/`, nor `asm/` import from `dfgraph/` or `monitor/`.

```
cm_inst.py <-- tokens.py <-- emu/events.py <-- emu/types.py
    |              |              |                 |
    v              v              v                 v
emu/alu.py     sm_mod.py      emu/pe.py  <-->  emu/sm.py
                                   \              /
                                    emu/network.py
                                          ^
                                          |
asm/ir.py <-- asm/opcodes.py        asm/codegen.py
    |              |                      |
    v              v                      v
asm/lower.py   asm/resolve.py       asm/allocate.py
    |              |
    |          asm/place.py
    |              |
    +--- dfgraph/pipeline.py ----------+
    |                                  |
    |   dfgraph/categories.py    dfgraph/graph_json.py
    |       (cm_inst)            (asm/ir, asm/opcodes)
    |                                  |
    |                            dfgraph/server.py
    |                                  |
    |                            dfgraph/frontend/
    |                                  |
    +--- monitor/backend.py --------+  |
    |       (asm, emu)              |  |
    |                               v  |
    +--- monitor/snapshot.py   monitor/commands.py
    |       (emu/network)              |
    |                                  v
    +--- monitor/graph_json.py -----+
    |    (asm/ir, dfgraph/categories, emu/events)
    |                                  |
    +--- monitor/server.py ---------+
    |       (FastAPI, WebSocket)
    |                                  |
    +--- monitor/repl.py       monitor/frontend/
            (cmd.Cmd)
```

<!-- freshness: 2026-03-07 -->
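
The matching logic listed under Processing Element can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the code in `emu/pe.py`: `MatchStore`, `offer`, and `_grid` are hypothetical names, the `Port` enum is redefined locally with assumed values, and the defaults (8 frames, 8 matchable offsets, 4 lanes) are taken from the `PEConfig` defaults in this document. An unknown `act_id` raises `KeyError` here, whereas the real PE is described as emitting `TokenRejected`.

```python
from enum import Enum
from typing import Optional


class Port(Enum):
    L = 0
    R = 1


def _grid(frames: int, slots: int, lanes: int, fill):
    """Build a [frame_id][match_slot][lane] array filled with `fill`."""
    return [[[fill for _ in range(lanes)] for _ in range(slots)]
            for _ in range(frames)]


class MatchStore:
    """Hypothetical, trimmed-down matching store for a single PE."""

    def __init__(self, frame_count: int = 8, matchable_offsets: int = 8,
                 lane_count: int = 4) -> None:
        self.matchable_offsets = matchable_offsets
        self.tag_store: dict[int, tuple[int, int]] = {}
        dims = (frame_count, matchable_offsets, lane_count)
        self.presence = _grid(*dims, False)
        self.match_data = _grid(*dims, None)
        self.port_store = _grid(*dims, None)

    def offer(self, act_id: int, offset: int, port: Port,
              data: int) -> Optional[tuple[int, int]]:
        """Return (left, right) once a pair completes, else None."""
        frame_id, lane = self.tag_store[act_id]
        slot = offset % self.matchable_offsets
        if not self.presence[frame_id][slot][lane]:
            # First operand: park it and wait for the partner.
            self.match_data[frame_id][slot][lane] = data
            self.port_store[frame_id][slot][lane] = port
            self.presence[frame_id][slot][lane] = True
            return None
        # Second operand: fetch the partner and clear the presence bit.
        partner = self.match_data[frame_id][slot][lane]
        partner_port = self.port_store[frame_id][slot][lane]
        self.presence[frame_id][slot][lane] = False
        # The operand that arrived on Port.L becomes the left operand.
        if partner_port is Port.L:
            return partner, data
        return data, partner


store = MatchStore()
store.tag_store[7] = (0, 0)  # act_id 7 -> frame 0, lane 0
first = store.offer(act_id=7, offset=3, port=Port.R, data=5)
second = store.offer(act_id=7, offset=3, port=Port.L, data=9)
```

The first call parks the right-hand operand and returns `None`; the second completes the pair, so the operands come back ordered by port rather than by arrival.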