···106106 - `tests/test_migration_cleanup.py` — Verifies removed types (SysToken, CfgOp, etc.) are absent from codebase
107107 - `tests/test_pe_events.py` — PE event emission tests (TokenReceived, Matched, Executed, Emitted, IRAMWritten, FrameAllocated, FrameFreed, FrameSlotWritten, TokenRejected)
108108 - `tests/test_pe_frames.py` — Frame-based PE matching, routing, and lifecycle tests
109109+ - `tests/test_pe_lanes.py` — Lane-based matching tests (ALLOC_SHARED, FREE_LANE, smart FREE, lane exhaustion, pipelining)
109110 - `tests/test_sm_events.py` — SM event emission tests (CellWritten, DeferredRead, DeferredSatisfied, ResultSent)
110111 - `tests/test_cycle_timing.py` — Cycle-accurate timing verification tests
111112 - `tests/test_network_events.py` — Network-level event propagation tests
···163164- `FrameSlotValue = int | FrameDest | None` -- type alias for frame slot contents
164165- `OutputStyle` enum -- INHERIT, CHANGE_TAG, SINK for output routing decisions
165166- `TokenKind` enum -- DYADIC, MONADIC, INLINE for token kind classification
166166-- `FrameOp(IntEnum)` -- ALLOC, FREE for frame lifecycle control tokens
167167+- `FrameOp(IntEnum)` -- ALLOC, FREE, ALLOC_SHARED, FREE_LANE for frame lifecycle control tokens
167168- `is_monadic_alu(op: ALUOp) -> bool` -- canonical source of truth for monadic ALU op classification (used by `emu/pe.py` and `asm/opcodes.py`)
168169169170### ALU (emu/alu.py)
···181182Frame-based processing element with activation context management.
182183183184**Frame Storage:**
184184-- `frames: list[list[FrameSlotValue]]` -- 2D array [frame_id][slot_idx] holding FrameDest objects and constants
185185-- `tag_store: dict[int, int]` -- maps act_id → frame_id for activation-to-frame lookup
186186-- `presence: list[list[bool]]` -- [frame_id][match_slot] for dyadic operand waiting state
187187-- `port_store: list[list[Optional[Port]]]` -- [frame_id][match_slot] for operand port metadata
185185+- `frames: list[list[FrameSlotValue]]` -- 2D array [frame_id][slot_idx] holding FrameDest objects and constants (shared across all lanes)
186186+- `tag_store: dict[int, tuple[int, int]]` -- maps act_id → (frame_id, lane) for activation-to-frame-and-lane lookup
187187+- `match_data: list[list[list[Optional[int]]]]` -- 3D array [frame_id][match_slot][lane] for operand values waiting for partner
188188+- `presence: list[list[list[bool]]]` -- 3D array [frame_id][match_slot][lane] for dyadic operand waiting state
189189+- `port_store: list[list[list[Optional[Port]]]]` -- 3D array [frame_id][match_slot][lane] for operand port metadata
190190+- `lane_count: int` -- number of matching lanes per frame
191191+- `lane_free: dict[int, set[int]]` -- per-frame set of available lane IDs (created on ALLOC, deleted on full FREE)
188192- `free_frames: list[int]` -- pool of unallocated frame IDs
189193- `iram: dict[int, Instruction]` -- instruction memory indexed by offset
190194···194198- Monadic CMToken: 4 cycles (dequeue + IFETCH + EXECUTE + EMIT)
195199196200**Matching Logic:**
197197-- DyadToken arrives with act_id: look up frame_id via tag_store, then check presence[frame_id][iram_offset]
198198-- If slot empty: store token.data and token.port, set presence bit, wait for partner
199199-- If slot occupied: retrieve partner data and port, clear presence bit, fire instruction with both operands
201201+- DyadToken arrives with act_id: look up (frame_id, lane) via tag_store
202202+- Match slot is derived from token.offset: match_slot = token.offset % matchable_offsets
203203+- If presence[frame_id][match_slot][lane] is False: store token.data in match_data[frame_id][match_slot][lane], store token.port in port_store[frame_id][match_slot][lane], set presence bit to True, wait for partner
204204+- If presence[frame_id][match_slot][lane] is True: retrieve partner data and port from match_data and port_store, clear presence bit, fire instruction with both operands
200205- Port ordering: partner with Port.L goes to left operand; Port.R to right operand
206206+- Match data, presence, and port storage are per-lane; frame constants/destinations (in frames) remain shared across all lanes
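The matching steps above can be sketched as a small standalone model. This is a minimal illustration, not the actual `emu/pe.py` code: the `Port` enum, the `match` helper, and the array sizes are stand-ins chosen for the example.

```python
from enum import Enum
from typing import Optional


class Port(Enum):  # stand-in for the emulator's Port type
    L = 0
    R = 1


FRAME_COUNT, MATCHABLE_OFFSETS, LANE_COUNT = 2, 8, 4


def make_3d(fill):
    """Build a [frame_id][match_slot][lane] array filled with `fill`."""
    return [[[fill] * LANE_COUNT for _ in range(MATCHABLE_OFFSETS)]
            for _ in range(FRAME_COUNT)]


presence = make_3d(False)
match_data = make_3d(None)
port_store = make_3d(None)
tag_store = {0: (1, 0), 1: (1, 1)}  # act_id -> (frame_id, lane)


def match(act_id: int, offset: int, port: Port, data: int) -> Optional[tuple[int, int]]:
    """Return (left, right) when the instruction fires, else None."""
    frame_id, lane = tag_store[act_id]
    slot = offset % MATCHABLE_OFFSETS
    if not presence[frame_id][slot][lane]:
        # First operand: store it in this lane and wait for the partner.
        match_data[frame_id][slot][lane] = data
        port_store[frame_id][slot][lane] = port
        presence[frame_id][slot][lane] = True
        return None
    # Second operand: retrieve the partner and clear the presence bit.
    partner_data = match_data[frame_id][slot][lane]
    partner_port = port_store[frame_id][slot][lane]
    presence[frame_id][slot][lane] = False
    # Port ordering: whichever operand arrived on Port.L is the left input.
    if partner_port is Port.L:
        return partner_data, data
    return data, partner_data


waiting_0 = match(0, 3, Port.L, 10)  # act_id 0, lane 0: waits
waiting_1 = match(1, 3, Port.L, 11)  # act_id 1, lane 1: waits independently
fired_0 = match(0, 3, Port.R, 5)     # pairs with the lane 0 operand
fired_1 = match(1, 3, Port.R, 7)     # pairs with the lane 1 operand
```

Both activations share frame 1 and use the same match slot, yet each L operand pairs only with the R operand of its own lane.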
201207202208**Output Routing** (determined by `Instruction.output`):
203209- `OutputStyle.INHERIT` -- routes to destinations specified in frame slots
···213219**Output logging:**
214220- `PE.output_log: list` records every token emitted (for testing and tracing)
215221222222+**Frame Control Operations** (`_handle_frame_control`):
223223+- `ALLOC` -- allocates a fresh frame from free_frames, assigns lane 0, initializes lane_free with remaining lanes
224224+- `FREE` -- smart free: removes act_id from tag_store, clears lane match state. If other activations share the frame, returns lane to lane_free (frame_freed=False). If last lane, returns frame to free_frames and clears frame slots (frame_freed=True)
225225+- `ALLOC_SHARED` -- shared allocation: looks up parent act_id (from payload) in tag_store, finds parent's frame_id, assigns next free lane from lane_free. Rejects if parent not found or no free lanes
226226+- `FREE_LANE` -- lane-only free: removes act_id, clears lane match state, returns lane to lane_free. Never returns frame to free_frames (frame_freed always False)
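The four operations can be sketched as bookkeeping over `tag_store`, `lane_free`, and `free_frames`. This is an illustrative model only: `FrameTable` and its method names are hypothetical, "next free lane" is assumed to mean the lowest-numbered free lane, and clearing of per-lane match state and frame slots is omitted.

```python
from typing import Optional


class FrameTable:
    """Hypothetical stand-in for the PE's frame and lane bookkeeping."""

    def __init__(self, frame_count: int = 8, lane_count: int = 4):
        self.lane_count = lane_count
        self.free_frames = list(range(frame_count))
        self.tag_store: dict[int, tuple[int, int]] = {}
        self.lane_free: dict[int, set[int]] = {}

    def alloc(self, act_id: int) -> Optional[tuple[int, int]]:
        """ALLOC: fresh frame, lane 0, remaining lanes marked free."""
        if act_id in self.tag_store or not self.free_frames:
            return None
        frame_id = self.free_frames.pop(0)
        self.lane_free[frame_id] = set(range(1, self.lane_count))
        self.tag_store[act_id] = (frame_id, 0)
        return frame_id, 0

    def alloc_shared(self, act_id: int, parent_act_id: int) -> Optional[tuple[int, int]]:
        """ALLOC_SHARED: join the parent's frame on a free lane, or reject."""
        parent = self.tag_store.get(parent_act_id)
        if parent is None:
            return None  # parent not found
        frame_id = parent[0]
        free = self.lane_free[frame_id]
        if not free:
            return None  # lane exhaustion
        lane = min(free)  # assumption: lowest free lane first
        free.remove(lane)
        self.tag_store[act_id] = (frame_id, lane)
        return frame_id, lane

    def free(self, act_id: int) -> Optional[bool]:
        """Smart FREE: return frame_freed, or None if act_id is unknown."""
        entry = self.tag_store.pop(act_id, None)
        if entry is None:
            return None
        frame_id, lane = entry
        if any(f == frame_id for f, _ in self.tag_store.values()):
            self.lane_free[frame_id].add(lane)  # others still share the frame
            return False
        del self.lane_free[frame_id]            # last lane: release the frame
        self.free_frames.append(frame_id)
        return True

    def free_lane(self, act_id: int) -> Optional[bool]:
        """FREE_LANE: release the lane only; frame_freed is always False."""
        entry = self.tag_store.pop(act_id, None)
        if entry is None:
            return None
        frame_id, lane = entry
        self.lane_free[frame_id].add(lane)
        return False


table = FrameTable(frame_count=2, lane_count=2)
first = table.alloc(0)               # (0, 0)
second = table.alloc_shared(1, 0)    # (0, 1)
rejected = table.alloc_shared(2, 0)  # None: both lanes in use
lane_only = table.free_lane(0)       # False: frame stays allocated
recycled = table.alloc_shared(2, 1)  # (0, 0): lane 0 reused
still_shared = table.free(1)         # False: act_id 2 still holds the frame
released = table.free(2)             # True: frame returned to free_frames
```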
227227+228228+**ALLOC_REMOTE** (RoutingOp in `_run` pipeline):
229229+- Reads fref+0 (target PE), fref+1 (target act_id), fref+2 (parent act_id) from frame constants
230230+- If fref+2 is non-zero: emits FrameControlToken with ALLOC_SHARED op and parent act_id as payload
231231+- If fref+2 is zero: emits FrameControlToken with ALLOC op (fresh frame allocation)
232232+216233**PELocalWriteToken handling:**
217234- Writes data to frame slot at specified region/slot within the act_id's frame (1 cycle)
218235···262279- `System.load(tokens: list[Token])` -- spawns SimPy process that calls send() for each token in order
263280264281**PEConfig (emu/types.py):**
265265-- `pe_id: int`, `iram: dict[int, Instruction] | None`, `frame_count: int = 8`, `frame_slots: int = 64`, `matchable_offsets: int = 8`
282282+- `pe_id: int`, `iram: dict[int, Instruction] | None`, `frame_count: int = 8`, `frame_slots: int = 64`, `matchable_offsets: int = 8`, `lane_count: int = 4`
266283- `initial_frames: Optional[dict[int, list[FrameSlotValue]]]` -- pre-loaded frame data
267267-- `initial_tag_store: Optional[dict[int, int]]` -- pre-loaded act_id -> frame_id mappings
284284+- `initial_tag_store: Optional[dict[int, tuple[int, int]]]` -- pre-loaded act_id → (frame_id, lane) mappings
268285- `allowed_pe_routes: Optional[set[int]]` -- if set, restrict PE route_table to these PE IDs
269286- `allowed_sm_routes: Optional[set[int]]` -- if set, restrict PE sm_routes to these SM IDs
270287- `on_event: EventCallback | None` -- if set, PE fires `SimEvent` for every token receive, match, execute, emit, frame alloc/free, slot write, and rejection
···285302- `Executed(time, component, op, result, bool_out)` -- PE executed an ALU instruction
286303- `Emitted(time, component, token)` -- PE emitted an output token
287304- `IRAMWritten(time, component, offset, count)` -- PE wrote instructions to IRAM
288288-- `FrameAllocated(time, component, act_id, frame_id)` -- PE allocated a frame
289289-- `FrameFreed(time, component, act_id, frame_id)` -- PE freed a frame
305305+- `FrameAllocated(time, component, act_id, frame_id, lane)` -- PE allocated a frame (lane indicates which matching lane was assigned)
306306+- `FrameFreed(time, component, act_id, frame_id, lane, frame_freed)` -- PE freed a frame lane (frame_freed=True if physical frame returned to pool)
290307- `FrameSlotWritten(time, component, frame_id, slot, value)` -- PE wrote to a frame slot
291308- `TokenRejected(time, component, token, reason)` -- PE rejected a token (e.g., act_id not in tag store)
292309- `CellWritten(time, component, addr, old_pres, new_pres)` -- SM cell presence changed
···321338**StateSnapshot (monitor/snapshot.py):**
322339- `capture(system) -> StateSnapshot` reads live PE/SM state into frozen dataclasses
323340- `StateSnapshot(sim_time, next_time, pes: dict[int, PESnapshot], sms: dict[int, SMSnapshot])`
324324-- `PESnapshot(pe_id, frames, tag_store, presence, port_store, free_frames, iram, input_queue, output_log)` -- frame-based PE state
341341+- `PESnapshot(pe_id, iram, frames, tag_store, presence, port_store, match_data, free_frames, lane_count, input_queue, output_log)` -- frame-based PE state with 3D match storage (presence, port_store, match_data are all [frame_id][match_slot][lane]), tag_store mapping act_id → (frame_id, lane) tuples, and lane_count field
325342- `SMSnapshot(sm_id, cells: dict[int, SMCellSnapshot], deferred_read, t0_store, input_queue)`
326343327344**WebSocket protocol (monitor/server.py):**
···11+# Frame Lanes: Concurrent Matching Within a Single Activation
22+33+Design note extending `pe-redesign-frames-and-pipeline.md` with per-frame
44+matching lanes. Addresses the problem of multiple simultaneous pending operands
55+for the same dyadic instruction within a single activation — required for loops
66+and recursion.
77+88+## Companion Documents
99+1010+- `pe-redesign-frames-and-pipeline.md` — base architecture (frames, pipeline,
1111+ approaches A/B/C)
1212+1313+---
1414+1515+## Problem Statement
1616+1717+The current frame model maps each `activation_id` to exactly one `frame_id`.
1818+Within that frame, each dyadic instruction gets one matching slot indexed by
1919+`offset % matchable_offsets`. This means at most one pending operand per
2020+instruction per activation at any time.
2121+2222+This fails for loops. Consider a counted loop where a dyadic ADD instruction
2323+receives feedback from its own INC output:
2424+2525+```
2626+iteration 1: L operand arrives at offset 3 → stored at presence[frame][3]
2727+iteration 2: L operand arrives at offset 3 → collision! presence[frame][3]
2828+ is already set, so the hardware thinks this is a MATCH
2929+ (pairing two L operands from different iterations)
3030+```
3131+3232+Without disambiguation, the second iteration's L operand is incorrectly paired
3333+with the first iteration's L operand instead of waiting for its own R partner.
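The collision can be reproduced with a simplified model of the pre-lane matching rule. This is a sketch under stated assumptions: `receive` is a hypothetical helper, ports are plain strings, and only one frame is modelled.

```python
# Pre-lane matching: one presence bit per (frame, match_slot).
MATCHABLE_OFFSETS = 8

presence = [[False] * MATCHABLE_OFFSETS]   # [frame_id][match_slot]
match_data = [[None] * MATCHABLE_OFFSETS]  # stored operand value
port_store = [[None] * MATCHABLE_OFFSETS]  # stored operand port


def receive(frame_id, offset, port, data):
    """Return None if the operand waits, else the (stored, incoming) pair that fired."""
    slot = offset % MATCHABLE_OFFSETS
    if not presence[frame_id][slot]:
        presence[frame_id][slot] = True
        match_data[frame_id][slot] = data
        port_store[frame_id][slot] = port
        return None
    presence[frame_id][slot] = False
    stored = (port_store[frame_id][slot], match_data[frame_id][slot])
    return stored, (port, data)


first = receive(0, 3, "L", 10)   # iteration 1, L operand: waits
second = receive(0, 3, "L", 11)  # iteration 2, L operand: treated as a match
```

The second call fires the instruction with two L operands from different iterations, which is exactly the mispairing described above.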
3434+3535+The original design solved this with a 2-bit generation counter per context
3636+slot, but that consumed token payload bits (6 bits total for ctx+gen) and
3737+limited the instruction offset field. The frame redesign dropped the generation
3838+field to widen offset to 8 bits and simplify the token format.
3939+4040+We need a mechanism that provides generation-like disambiguation without
4141+re-adding a token field.
4242+4343+---
4444+4545+## Proposed Solution: Matching Lanes
4646+4747+### Core Idea
4848+4949+Split the tag store mapping from one-to-one into many-to-one. Multiple
5050+`activation_id` values can map to the **same physical frame** but with different
5151+**lane indices**. Lanes share the frame's constants and destinations (written
5252+once at setup) but provide independent matching slots.
5353+5454+```
5555+tag_store[act_id] → (frame_id, lane)
5656+5757+Constants/dests: frames[frame_id][slot] — shared across all lanes
5858+Match data: match_data[frame_id][offset][lane] — per-lane
5959+Presence: presence[frame_id][offset][lane] — per-lane (1 bit)
6060+Port: port_store[frame_id][offset][lane] — per-lane (1 bit)
6161+```
6262+6363+### Frame Control Token Extensions
6464+6565+The frame control token format (prefix `011+00`) has 3 spare bits in flit 1
6666+and a 16-bit payload in flit 2:
6767+6868+```
6969+Frame control flit 1: [0][1][1][PE:2][00][op:1][act_id:3][spare:3] = 16 bits
7070+Frame control flit 2: [payload:16]
7171+```
7272+7373+The `op` field currently encodes ALLOC (0) and FREE (1). We split these into
7474+four operations using 1 spare bit:
7575+7676+```
7777+op spare[2] operation
7878+── ──────── ─────────
7979+0 0 ALLOC_NEW allocate fresh frame, assign lane 0
8080+0 1 ALLOC_SHARED share existing frame, assign next free lane
8181+1 0 FREE_FRAME release lane AND return frame to free list
8282+1 1 FREE_LANE release lane only, frame stays allocated
8383+```
8484+8585+**ALLOC_SHARED** uses flit 2 to carry the parent activation_id whose frame
8686+should be shared:
8787+8888+```
8989+ALLOC_SHARED flit 2: [parent_act_id:3][spare:13]
9090+```
9191+9292+The PE looks up `tag_store[parent_act_id] → (frame_id, _)`, picks the next
9393+free lane from `lane_free[frame_id]`, and records
9494+`tag_store[act_id] = (frame_id, lane)`.
9595+9696+**FREE_LANE** removes the act_id → (frame, lane) mapping, clears that lane's
9797+presence/port bits across all matchable offsets, and marks the lane as free.
9898+The frame remains allocated (constants/dests preserved).
9999+100100+**FREE_FRAME** does the same as FREE_LANE, then additionally returns the
101101+frame to the free list. Should only be issued when all lanes for that frame
102102+are free (or the PE can force-clear remaining lanes).
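A minimal sketch of how a decoder might map the `op` bit and `spare[2]` bit to the four operations. The function name is hypothetical, and placing `parent_act_id` in the top 3 bits of the 16-bit payload is an assumption based on the field order shown for flit 2.

```python
# (op bit, spare[2] bit) -> operation, following the table above.
FRAME_CONTROL_OPS = {
    (0, 0): "ALLOC_NEW",
    (0, 1): "ALLOC_SHARED",
    (1, 0): "FREE_FRAME",
    (1, 1): "FREE_LANE",
}


def decode_frame_control(op_bit: int, spare2: int, flit2: int):
    """Return (operation, parent_act_id), with parent None unless ALLOC_SHARED."""
    operation = FRAME_CONTROL_OPS[(op_bit & 1, spare2 & 1)]
    parent = None
    if operation == "ALLOC_SHARED":
        # Assumption: parent_act_id is the most significant 3 bits of flit 2.
        parent = (flit2 >> 13) & 0b111
    return operation, parent


shared = decode_frame_control(0, 1, 0b101 << 13)
lane_free_op = decode_frame_control(1, 1, 0)
```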
103103+104104+### Lifecycle for a Loop
105105+106106+```
107107+1. ALLOC_NEW(act_id=0) → frame 2, lane 0
108108+2. Setup: write constants/dests to frame 2
109109+3. Iteration 1 seed tokens use act_id=0
110110+111111+4. Before iteration 2:
112112+ ALLOC_SHARED(act_id=1, parent=0) → frame 2, lane 1
113113+ Iteration 2 seed tokens use act_id=1
114114+115115+5. When iteration 1 completes:
116116+ FREE_LANE(act_id=0) → lane 0 freed, frame 2 stays
117117+118118+6. Before iteration 3:
119119+ ALLOC_SHARED(act_id=2, parent=1) → frame 2, lane 0 (recycled)
120120+ Iteration 3 seed tokens use act_id=2
121121+122122+7. When all iterations done:
123123+ FREE_FRAME(act_id=last) → frame 2 returned to free list
124124+```
125125+126126+Constants are written once in step 2. Each iteration gets its own matching
127127+lanes via a different act_id sharing the same frame.
128128+129129+### ABA Safety
130130+131131+The ABA concern: a stale token from iteration 1 (act_id=0) arrives after
132132+lane 0 has been freed and re-allocated to iteration 3 (act_id=2).
133133+134134+This is safe because:
135135+- FREE_LANE removes act_id=0 from the tag store entirely
136136+- When act_id=2 is allocated for iteration 3, it uses act_id=2 (not act_id=0)
137137+- Any stale token with act_id=0 hits "act_id not in tag store" → rejected
138138+139139+With 3-bit act_id (8 values) and at most 4 lanes per frame, there are 4 IDs
140140+of ABA distance between allocation and re-use of the same act_id value. Given
141141+that stale tokens drain within single-digit cycles, this is sufficient.
142142+143143+---
144144+145145+## Hardware Impact by Approach
146146+147147+### Lane Count
148148+149149+L = number of lanes per frame. Practical values: 2 (1 bit) or 4 (2 bits).
150150+151151+With L=4 and 4 frames: 16 possible (frame, lane) pairs. With 8 matchable
152152+offsets: 128 match slots total. This provides 4 simultaneous pending operands
153153+per instruction per frame — enough for most loop depths. Deeply nested
154154+recursion beyond L would require frame splitting across PEs (the assembler
155155+already supports this).
156156+157157+### Approach C: 74LS670 Lookup (Recommended v0)
158158+159159+**Tag store changes:**
160160+161161+Currently: 2× 670, act_id → {valid:1, frame_id:2, spare:1}
162162+163163+With lanes: 2× 670, act_id → {valid:1, frame_id:2, lane:1} for L=2
164164+(the spare bit becomes the lane index). For L=4, we need
165165+{valid:1, frame_id:2, lane:2} = 5 bits, which exceeds one 670's 4-bit width.
166166+167167+Options:
168168+- **L=2 (1-bit lane):** fits in existing 670 layout. Zero additional chips for
169169+ tag store. The spare bit becomes the lane bit.
170170+- **L=4 (2-bit lane):** need a third 670 to hold the extra lane bit (and
171171+ valid moves there too). +1 chip.
172172+173173+**Presence/port metadata changes:**
174174+175175+Currently: 4× 670 indexed by frame_id, each word holds presence+port for
176176+2 offsets across 4 frames. Layout:
177177+178178+```
179179+670 chip N (offsets 2N, 2N+1):
180180+ word[frame_id] = {pres_2N:1, port_2N:1, pres_2N+1:1, port_2N+1:1}
181181+```
182182+183183+With lanes, the index becomes `[frame_id:2][lane]` instead of just
184184+`[frame_id:2]`. The 670 has 4 words, so:
185185+186186+- **L=2:** index is `[frame_id:2][lane:1]` = 3 bits, but the 670 only has
187187+ 2-bit addressing (4 words). We need to double the 670 count: 8× 670 for
188188+ presence+port, with lane as chip-select. **+4 chips.**
189189+190190+ Alternatively, re-pack: each 670 word holds presence+port for 1 offset
191191+ across 2 lanes: `{pres_L0:1, port_L0:1, pres_L1:1, port_L1:1}`. Then
192192+ we need 8 offsets × 1 chip each = 8× 670, indexed by frame_id (2 bits),
193193+ with offset selecting the chip. This is the same +4 chips but cleaner.
194194+195195+- **L=4:** index is `[frame_id:2][lane:2]` = 4 bits. The 670 has 4 words
196196+ (2-bit address), so we'd need 4 670s per offset-pair, one per frame_id.
197197+ That's 4 × 4 = 16 670s for presence+port. Impractical. At L=4, the
198198+ presence/port metadata should move to SRAM or use a different register
199199+ approach.
200200+201201+**Match operand data:**
202202+203203+Currently in frame SRAM at `[1][frame_id:2][match_slot:3]` (match slots are
204204+the low 8 offsets within the frame). With lanes, the address becomes
205205+`[1][frame_id:2][match_slot:3][lane]`.
206206+207207+- **L=2:** address is `[1][frame_id:2][match_slot:3][lane:1]` = 7 bits, of which
208208+ 6 index the match entries. 64 entries × 16 bits = 128 bytes. Well within SRAM
209209+ capacity. No additional chips.
210210+211211+- **L=4:** address is `[1][frame_id:2][match_slot:3][lane:2]` = 8 bits, of which
212212+ 7 index the match entries. 128 entries × 16 bits = 256 bytes. Still fits in SRAM.
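The entry counts follow directly from frames × match slots × lanes. A small sketch of that arithmetic, counting only the index fields and excluding the leading region-select bit; `match_region_size` is a hypothetical helper name.

```python
def match_region_size(frames: int, match_slots: int, lanes: int, word_bits: int = 16):
    """Return (entries, index_bits, size_in_bytes) for per-lane match operand storage."""
    entries = frames * match_slots * lanes
    index_bits = (entries - 1).bit_length()
    return entries, index_bits, entries * word_bits // 8


size_l2 = match_region_size(4, 8, 2)
size_l4 = match_region_size(4, 8, 4)
```

For 4 frames and 8 match slots this gives 64 entries (128 bytes) at L=2 and 128 entries (256 bytes) at L=4.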
213213+214214+**Lane free tracking:**
215215+216216+Per frame, track which lanes are free. For L=2: 1 flip-flop per frame × 4
217217+frames = 4 bits, which fits in a single 74LS174 (hex D flip-flop) or similar. **+1 chip.**
218218+For L=4: a 2-bit counter or 4-bit bitmask per frame = 8–16 bits, more than the six flip-flops of one 74LS174, so an octal register or a second chip is needed.
219219+220220+**Approach C summary (L=2):**
221221+222222+| Component | Before | After | Delta |
223223+|----------------------------|--------|--------|-------|
224224+| act_id → (frame_id, lane) | 2× 670 | 2× 670 | 0 |
225225+| Presence + port metadata | 4× 670 | 8× 670 | +4 |
226226+| Bit select mux | 1–2 | 1–2 | 0 |
227227+| Lane free tracking | 0 | 1 | +1 |
228228+| Frame SRAM | 2 | 2 | 0 |
229229+| **Total delta** | | | **+5 chips** |
230230+231231+**Approach C summary (L=4):**
232232+233233+Presence/port at L=4 exceeds practical 670 count. Two options:
234234+235235+(a) Move presence/port to SRAM. Pack all 4 lanes' presence+port for one
236236+(frame, offset) into a single 16-bit word:
237237+`{pres0:1, port0:1, pres1:1, port1:1, ..., pres3:1, port3:1, spare:8}`.
238238+Read in 1 SRAM cycle, same SRAM chip as frame data. Adds 1 cycle to matching
239239+(read presence word before reading/writing match data). **+0 chips, +1 cycle.**
240240+241241+(b) Use 74LS189 register files instead of 670s for presence/port. 189s are
242242+16-word × 4-bit, addressed by `[frame_id:2][offset:2]` = 4 bits. Two 189s
243243+(8 bits) hold presence+port for 4 lanes at one (frame, low-2-offset) combo.
244244+With offset[2] as chip-select, that's 4× 189. **+4 chips (replacing 4× 670
245245+with 4× 189)**, net change depends on baseline.
246246+247247+Option (a) is simpler and fits the v0 "minimal chips" philosophy. The extra
248248+SRAM cycle for presence is identical to Approach A's tag read — it just
249249+applies to the lane dimension instead.
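Option (a)'s packed word can be sketched as follows. The helper names are hypothetical, and the bit positions assume the fields are laid out most-significant-first in the order written (`pres0` at bit 15, `spare` in the low 8 bits); the source does not pin this down.

```python
LANES = 4


def pack_presence_word(pres: list[bool], port: list[int]) -> int:
    """Pack per-lane presence and port bits into one 16-bit word."""
    word = 0
    for lane in range(LANES):
        word |= int(pres[lane]) << (15 - 2 * lane)   # pres bit for this lane
        word |= (port[lane] & 1) << (14 - 2 * lane)  # port bit for this lane
    return word  # low 8 bits are spare and left as zero


def unpack_lane(word: int, lane: int) -> tuple[bool, int]:
    """Return (presence, port) for one lane of a packed word."""
    return bool((word >> (15 - 2 * lane)) & 1), (word >> (14 - 2 * lane)) & 1


word = pack_presence_word([True, False, True, False], [1, 0, 0, 0])
lane0 = unpack_lane(word, 0)
lane2 = unpack_lane(word, 2)
```

A single SRAM read of this word yields the state of all four lanes at one (frame, offset), which is where the extra matching cycle comes from.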
250250+251251+### Approach A: Set-Associative Tags in Frame SRAM
252252+253253+Approach A already uses SRAM for tag storage. Lanes change the tag word format.
254254+255255+Currently, each tag word packs 4-way set-associative entries:
256256+257257+```
258258+{way0_valid:1, way0_act:3, way1_valid:1, way1_act:3, ...} = 16 bits
259259+```
260260+261261+With lanes, the tag word already supports the concept — each way IS effectively
262262+a lane. The act_id comparison finds the matching way, and the way index IS the
263263+lane. The only change: ALLOC_SHARED must write the same frame region for
264264+multiple act_ids.
265265+266266+Approach A's set-associative structure already provides something
267267+very close to lanes. The ways in the tag word serve the same purpose — multiple
268268+act_ids can have pending operands at the same offset, disambiguated by act_id
269269+comparison. The 4-way associativity gives 4 simultaneous pending matches per
270270+offset across ALL activations.
271271+272272+**Key difference:** in Approach A, the ways are shared across all activations
273273+at that offset (global pool). The lane model gives per-frame isolation. Under
274274+Approach A, if two different functions both have a pending operand at offset 3,
275275+they consume 2 of the 4 ways. Under the lane model, each frame has its own L
276276+lanes — no cross-activation contention.
277277+278278+**Approach A with lanes:** the tag word becomes:
279279+280280+```
281281+{way0_valid:1, way0_act:3, way0_lane:1, way1_valid:1, way1_act:3, way1_lane:1, ...}
282282+```
283283+284284+This doesn't fit in 16 bits for 4 ways with L=2 (5 bits × 4 = 20 bits). Would
285285+require wider tag words (32-bit SRAM or 2 reads per tag lookup), or reducing
286286+to 2 ways.
287287+288288+Alternatively, since act_id already implies frame_id (via the tag store),
289289+Approach A doesn't benefit from lanes in the same way. The set-associative
290290+structure already provides the disambiguation — adding lanes on top is
291291+redundant. **Approach A doesn't need lanes; its ways serve the same purpose.**
292292+293293+The real question for Approach A is: does 4-way associativity (global, shared)
294294+provide enough concurrent matching depth? For loops: yes, as long as no more
295295+than 4 iterations have pending operands at the same offset simultaneously.
296296+For mixed workloads with multiple activations: depends on access patterns.
297297+298298+### Approach B: Full Register-File Match Pool
299299+300300+Original Approach B: 8-entry global pool with `{valid:1, act_id:3, offset:6,
301301+port:1, data:16}` per entry, fully associative.
302302+303303+**With lanes, the pool needs a lane field:** `{valid:1, act_id:3, offset:3,
304304+lane:1, port:1, data:16}` for L=2. The comparator now matches on
305305+`(act_id, offset, lane)` — but lane is derived from act_id via the tag store,
306306+not carried in the token. So the comparator actually still matches on
307307+`(act_id, offset)` as before.
308308+309309+This is the key insight: since the token carries act_id (not frame_id +
310310+lane), and different iterations use different act_ids, the existing Approach B
311311+pool already disambiguates correctly without any lane concept at all:
312312+313313+- Iteration 1 (act_id=0): L operand stored as `{act_id=0, offset=3, ...}`
314314+- Iteration 2 (act_id=1): L operand stored as `{act_id=1, offset=3, ...}`
315315+- These don't match because act_id differs.
316316+317317+**Approach B already handles concurrent matching across iterations, provided
318318+each iteration uses a distinct act_id.** The only addition is the
319319+ALLOC_SHARED/FREE_LANE mechanism to allow multiple act_ids to share one frame.
320320+No changes to the match pool hardware at all.
321321+322322+The constraint: the global pool has 8 entries total. With 4 iterations × 2
323323+pending operands each = 8 entries consumed. A tight but functional limit.
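A minimal sketch of the fully associative pool, matching on `(act_id, offset)` only. The `pool_match` helper and the dict-based entries are illustrative, and raising an error on a full pool is an assumption; the note does not say how overflow is handled.

```python
from typing import Optional

POOL_SIZE = 8
pool: list[Optional[dict]] = [None] * POOL_SIZE  # None means an invalid entry


def pool_match(act_id: int, offset: int, port: str, data: int):
    """Return (stored, incoming) operand pairs on a match, or None after storing."""
    for i, entry in enumerate(pool):
        if entry is not None and entry["act_id"] == act_id and entry["offset"] == offset:
            pool[i] = None  # matched: invalidate the entry
            return (entry["port"], entry["data"]), (port, data)
    for i, entry in enumerate(pool):
        if entry is None:
            pool[i] = {"act_id": act_id, "offset": offset, "port": port, "data": data}
            return None
    raise RuntimeError("match pool full")  # assumption: overflow is an error


iter1_left = pool_match(0, 3, "L", 10)   # iteration 1 waits
iter2_left = pool_match(1, 3, "L", 11)   # iteration 2 waits in its own entry
iter1_fire = pool_match(0, 3, "R", 5)    # pairs with iteration 1 only
occupied = sum(entry is not None for entry in pool)
```

The two L operands at offset 3 never pair with each other because their act_ids differ, with no lane field involved.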
324324+325325+**B+670 variants:** same analysis. The 670s resolve act_id → frame_id for
326326+constant/dest access. The match pool (whether fully indexed or semi-CAM) uses
327327+act_id directly and already disambiguates. **Zero additional match hardware
328328+for lanes.**
329329+330330+### Approach B+670 Indexed (Dedicated Register Slots)
331331+332332+Currently: `[frame_id:2][offset:3]` = 5-bit address, 32 entries dedicated.
333333+One entry per (frame, offset) pair.
334334+335335+With ALLOC_SHARED: multiple act_ids map to the same frame_id, but they get
336336+different lanes. The match data must be indexed by `[frame_id:2][offset:3]
337337+[lane]` instead of just `[frame_id:2][offset:3]`.
338338+339339+- **L=2:** 6-bit address, 64 entries. The original B+670 indexed design
340340+ already uses 8× 189 for 32 entries of 16-bit data, so doubling to
341341+ 64 entries means 16× 189. That's a lot. Alternatively,
342342+ use SRAM for the doubled range: `[frame_id:2][offset:3][lane:1]` = 6 bits
343343+ within the match region. 64 entries × 16 bits = 128 bytes. Easily fits in
344344+ the shared SRAM chip. But then we lose the "zero SRAM cycles for matching"
345345+ advantage.
346346+347347+ Better option: keep register file, use the 670 presence bits to encode lane.
348348+ The 670 already stores `{presence, port}` per (frame, offset). With L=2,
349349+ expand to `{presence_L0, port_L0, presence_L1, port_L1}`. This is exactly
350350+ the same as the Approach C lane expansion above: 8× 670 for
351351+ presence+port. **The match data register file doubles, the presence 670s
352352+ double. +8 register chips, +4 670 chips = +12 chips.** Steep.
353353+354354+ For B+670 indexed, the more practical approach at L>1 is to fall back to
355355+ SRAM for match data and keep the 670s only for act_id resolution and
356356+ presence tracking. This effectively converts B+670 indexed into Approach C
357357+ with lanes — SRAM match data, 670 metadata.
358358+359359+### B+670 Semi-CAM (Associative Within Frame)
360360+361361+Currently: per-frame associative pool with W ways. Tag stores
362362+`{valid:1, offset:3}` per way. Comparators search offset within frame.
363363+364364+With lanes: each entry's tag becomes `{valid:1, offset:3, lane:1}` for L=2.
365365+The comparator matches on `(offset, lane)` where lane comes from the 670
366366+lookup. **+1 bit per comparator.** For 3-bit offset + 1-bit lane = 4-bit
367367+compare, each 74LS85 (4-bit comparator) handles one entry exactly.
368368+369369+The pool's way count (W) determines how many simultaneous pending matches
370370+per frame. With L=2 and W=4: 4 pending matches, shared across 2 lanes.
371371+Each lane can use up to W entries (the pool is shared within the frame,
372372+not partitioned per lane). This is actually better than strict per-lane
373373+isolation — if lane 0 has 3 pending and lane 1 has 1, they use 4 entries
374374+total without wasting any.
375375+376376+**Semi-CAM hardware delta for L=2:**
377377+378378+| Component | Before (W=2) | After (W=2, L=2) | Delta |
379379+|-------------------|--------------|-------------------|-------|
380380+| Tag registers | 2 chips | 2 chips | 0 |
381381+| Comparators | 2 chips | 2 chips | 0 |
382382+| Data registers | 4 chips | 4 chips | 0 |
383383+| **Total** | | | **0** |
384384+385385+The only change is the tag width grows by 1 bit (offset:3 → offset:3 +
386386+lane:1 = 4 bits), which fits in the same comparator. **Zero additional chips
387387+for the semi-CAM itself.**
388388+389389+The ALLOC_SHARED / FREE_LANE logic: +1 chip (lane free tracking).
390390+The 670 tag store: +0 chips (lane bit fits in spare bit).
391391+392392+**This makes B+670 semi-CAM the most natural fit for lanes.** The
393393+associative pool already handles variable-occupancy matching; adding a
394394+lane bit to the tag is free in hardware.
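A sketch of one frame's semi-CAM pool with the lane bit folded into the compare tag. The names are hypothetical, putting the lane in bit 0 of the 4-bit tag is an assumption, and returning `"full"` stands in for whatever the hardware does on overflow.

```python
from typing import Optional

WAYS = 4  # W: entries in one frame's associative pool

# Each way holds (tag, port, data), or None when invalid.
ways: list[Optional[tuple[int, str, int]]] = [None] * WAYS


def make_tag(offset: int, lane: int) -> int:
    """4-bit compare tag: 3-bit offset plus 1-bit lane (bit order assumed)."""
    return ((offset & 0b111) << 1) | (lane & 1)


def semi_cam_receive(offset: int, lane: int, port: str, data: int) -> str:
    """Return 'fired', 'stored', or 'full'."""
    tag = make_tag(offset, lane)
    for i, entry in enumerate(ways):
        if entry is not None and entry[0] == tag:
            ways[i] = None
            return "fired"
    for i, entry in enumerate(ways):
        if entry is None:
            ways[i] = (tag, port, data)
            return "stored"
    return "full"


results = [
    semi_cam_receive(0, 0, "L", 1),
    semi_cam_receive(1, 0, "L", 2),
    semi_cam_receive(2, 0, "L", 3),
    semi_cam_receive(0, 1, "L", 4),  # same offset as the first, different lane
    semi_cam_receive(3, 1, "L", 5),  # all four ways occupied
    semi_cam_receive(0, 1, "R", 6),  # partner for lane 1, offset 0
]
```

Lane 0 holds three pending operands and lane 1 holds one, so the four ways are shared unevenly rather than partitioned per lane.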
395395+396396+---
397397+398398+## Approach Comparison with Lanes
399399+400400+| Property | A | C (L=2) | C (L=4) | B+670 semi W=2 | B+670 semi W=4 |
401401+|---------------------------|--------------|-------------|--------------|----------------|----------------|
402402+| Needs lanes at all? | no (ways) | yes | yes | yes | yes |
403403+| Extra chips for lanes | 0 | +5 | +1 (SRAM) | +1 | +1 |
404404+| Pending matches/frame | 4 (shared) | 2 per lane | 4 per lane | W (shared) | W (shared) |
405405+| Extra SRAM cycles | 0 | 0 | +1 (pres) | 0 | 0 |
406406+| Cross-activation contention | yes (global) | no | no | no | no |
407407+| Implementation complexity | none | moderate | moderate | minimal | minimal |
408408+409409+**Winner for lanes: B+670 semi-CAM.** Zero additional match hardware, lanes
410410+come free via the existing associative tag. The 670 tag store absorbs the
411411+lane bit in its spare capacity. Only cost is 1 chip for lane free tracking
412412+and the ALLOC_SHARED/FREE_LANE control logic.
413413+414414+**Runner-up: Approach C with L=2.** +5 chips (four 670s for doubled
415415+presence/port, plus one chip for lane free tracking). Simple, well-understood, but the 670 count is getting high
416416+(10 670s per PE).
417417+418418+---
419419+420420+## SRAM Address Map Update
421421+422422+With L=2 lanes, the frame SRAM match region doubles:
423423+424424+```
425425+v0 address space with lanes (L=2):
426426+427427+ IRAM region: [0][offset:8] instruction templates
428428+ capacity: 256 instructions (512 bytes)
429429+430430+ Frame region: [1][frame_id:2][slot:6] per-activation storage
431431+ capacity: 4 frames × 64 slots = 256 entries (512 bytes)
432432+ (constants, destinations, accumulators — shared across lanes)
433433+434434+ Match region: (Approach C / SRAM-based)
435435+ [1][1][frame_id:2][offset:3][lane:1] match operand data
436436+ capacity: 4 × 8 × 2 = 64 entries (128 bytes)
437437+ (carved from frame region address space, or separate region)
438438+```
439439+440440+Total: 512 + 512 + 128 = 1152 bytes. Still well under 32Kx8 capacity.
441441+442442+For B+670 semi-CAM: match data lives in register files, not SRAM. The SRAM
443443+address map is unchanged (frame region only holds shared constants/dests).
444444+445445+---
446446+447447+## Assembler Impact
448448+449449+The assembler must:
450450+451451+1. **Detect loops and recursion** that require concurrent matching. Static
452452+ analysis of feedback arcs in the dataflow graph.
453453+454454+2. **Allocate activation IDs per iteration.** The loop prologue emits
455455+ ALLOC_SHARED for each new iteration's act_id before injecting seed tokens.
456456+ The loop epilogue emits FREE_LANE when an iteration completes.
457457+458458+3. **Track lane depth.** If a loop's concurrency exceeds L (or W for
459459+ semi-CAM), the assembler must either:
460460+ - Insert synchronisation barriers (drain iteration N before starting N+L)
461461+ - Split the loop body across PEs to reduce per-PE concurrency
462462+ - Report a warning (analogous to matchable_offsets exceedance, AC5.8)
463463+464464+4. **Generate setup tokens.** ALLOC_SHARED tokens carry the parent act_id
465465+ in their payload. The codegen pass already generates frame control tokens;
466466+ this extends the format.
467467+468468+---
469469+470470+## Open Questions
471471+472472+1. **L=2 vs L=4 for v0.** L=2 is cheaper (+5 chips for Approach C, +1 for
473473+ semi-CAM) and handles 2-deep loop pipelining. L=4 handles deeper nesting
474474+ but costs more in metadata storage. Recommendation: L=2 for v0, upgradable.
475475+476476+2. **Loop iteration management.** Who manages the ALLOC_SHARED / FREE_LANE
477477+ sequence? Options:
478478+ - **Compiler-generated:** the assembler statically emits alloc/free tokens
479479+ as part of the loop control flow. Simple, but inflexible.
480480+ - **PE-internal:** a loop counter mechanism in the PE automatically
481481+ rotates lanes. More complex hardware, but simpler programs.
482482+ - **Hybrid:** compiler generates the control flow, PE provides the
483483+ lane allocation hardware. (Recommended.)
484484+485485+3. **Semi-CAM way count vs lane count.** With B+670 semi-CAM, W (ways per
486486+ frame) and L (lanes per frame) interact. W=4 with L=2 gives 4 pending
487487+ matches shared across 2 lanes — 2 pending per lane on average, more if
488488+ one lane is quiet. Is W=2 sufficient? Depends on the number of dyadic
489489+ instructions with simultaneously pending operands.
490490+491491+4. **Interaction with SC arc execution.** Strongly-connected arc blocks
492492+ execute sequential instructions within a single activation. Lanes are
493493+ orthogonal — SC arcs don't need concurrent matching (they're sequential).
494494+ But the frame_id latch for SC arcs must also latch the lane. Trivial
495495+ addition.
496496+497497+<!-- freshness: 2026-03-07 -->
+270
docs/design-plans/2026-03-07-frame-lanes.md
···11+# Frame Matching Lanes Design
22+33+## Summary
44+55+Extend the PE's frame-based matching to support multiple simultaneous pending
66+operands per instruction within a single activation. Multiple `activation_id`
77+values share one physical frame (constants/destinations) while maintaining
88+independent matching state per lane. Required for loop pipelining and recursion.
99+Changes span token types, PE internals, codegen, monitor, and tests. Assembler
1010+macro expansion for automatic loop pipelining is out of scope.
1111+1212+## Definition of Done
1313+1414+The PE emulator supports matching lanes — multiple activation IDs sharing one
1515+physical frame with independent match/presence/port storage per lane. FrameOp
1616+gains ALLOC_SHARED and FREE_LANE. FREE_FRAME auto-detects last lane.
1717+ALLOC_REMOTE is data-driven (frame constant flag for shared vs new). Existing
1818+tests pass with the updated tag_store tuple API. New tests demonstrate
1919+shared-frame matching, lane exhaustion rejection, smart free behaviour, and
2020+data-driven ALLOC_REMOTE. Assembler macro expansion for automatic loop
2121+pipelining is explicitly out of scope.
2222+2323+## Acceptance Criteria
2424+2525+### AC1: Tag Store Tuple API
2626+2727+- **frame-lanes.AC1.1:** `tag_store` maps `act_id → (frame_id, lane)` where
2828+ `lane` is an `int` in range `[0, lane_count)`.
2929+- **frame-lanes.AC1.2:** `PEConfig.initial_tag_store` type is
3030+ `dict[int, tuple[int, int]]`. PE constructor initialises tag_store from it.
3131+- **frame-lanes.AC1.3:** `PEConfig.lane_count` field exists with default 4.
3232+ Controls third dimension of match arrays.
3333+- **frame-lanes.AC1.4:** All existing tests pass with updated tuple API.
3434+3535+### AC2: Separate Match Data Storage
3636+3737+- **frame-lanes.AC2.1:** Match operand data lives in
3838+ `match_data[frame_id][offset][lane]`, separate from `frames[frame_id][slot]`.
3939+- **frame-lanes.AC2.2:** `presence[frame_id][offset][lane]` is a 3D bool
4040+ array. `port_store[frame_id][offset][lane]` likewise.
4141+- **frame-lanes.AC2.3:** `_match_frame()` uses `(frame_id, match_slot, lane)`
4242+ to read/write match data, presence, and port.
4343+- **frame-lanes.AC2.4:** `frames[frame_id][slot]` remains shared across all
4444+ lanes. Constants and destinations are NOT per-lane.
4545+4646+### AC3: FrameOp Extensions
4747+4848+- **frame-lanes.AC3.1:** `FrameOp.ALLOC_SHARED` added. When received,
4949+ PE looks up `parent_act_id` (from payload), finds parent's `frame_id`,
5050+ assigns next free lane from that frame's lane pool, records
5151+ `tag_store[act_id] = (frame_id, lane)`. Clears only that lane's
5252+ presence/port bits.
5353+- **frame-lanes.AC3.2:** `FrameOp.FREE_LANE` added. Removes tag_store entry,
5454+ clears that lane's presence/port/match_data across all matchable offsets.
5555+ Does NOT return frame to free list.
5656+- **frame-lanes.AC3.3:** `FrameOp.FREE` (existing) becomes smart: removes
5757+ tag_store entry, clears lane data. If no other tag_store entries reference
5858+ the same frame_id, returns frame to free list and clears frame slots. If
5959+ other entries exist, behaves like FREE_LANE.
6060+- **frame-lanes.AC3.4:** `FrameOp.ALLOC` (existing) unchanged — allocates
6161+ fresh frame, assigns lane 0.
6262+- **frame-lanes.AC3.5:** `FrameAllocated` event gains `lane: int` field.
6363+ `FrameFreed` event gains `lane: int` and `frame_freed: bool` fields.
6464+- **frame-lanes.AC3.6:** When all lanes for a frame are occupied and
6565+ ALLOC_SHARED is received, PE emits `TokenRejected` with reason
6666+ "no free lanes" and drops the token.
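
As a rough sketch of AC3.1 and AC3.6 together, lane assignment for ALLOC_SHARED might look like the following. The function name `alloc_shared` is hypothetical, state is passed in as plain dicts instead of living on the PE, and two points are assumptions not stated above: that "next free lane" means the lowest-numbered free lane, and that an unknown parent act_id is rejected in the same way as lane exhaustion. Clearing the lane's presence/port bits is left out.

```python
from typing import Optional


def alloc_shared(
    tag_store: dict[int, tuple[int, int]],
    lane_free: dict[int, set[int]],
    act_id: int,
    parent_act_id: int,
) -> Optional[tuple[int, int]]:
    """Assign act_id a free lane on the parent's frame.

    Returns (frame_id, lane), or None where the PE would emit TokenRejected.
    """
    if parent_act_id not in tag_store:
        return None  # assumption: unknown parent is rejected
    frame_id, _parent_lane = tag_store[parent_act_id]
    free = lane_free[frame_id]
    if not free:
        return None  # "no free lanes" (AC3.6)
    lane = min(free)  # assumption: lowest-numbered free lane first
    free.remove(lane)
    tag_store[act_id] = (frame_id, lane)
    return frame_id, lane


# Frame 2 has two lanes; act_id 0 already holds lane 0, so only lane 1 is free.
tag_store = {0: (2, 0)}
lane_free = {2: {1}}
first = alloc_shared(tag_store, lane_free, act_id=1, parent_act_id=0)
second = alloc_shared(tag_store, lane_free, act_id=2, parent_act_id=0)
```

Here the first request takes lane 1 and the second finds the pool empty, which corresponds to the rejection case in AC3.6.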
6767+6868+### AC4: ALLOC_REMOTE Data-Driven
6969+7070+- **frame-lanes.AC4.1:** ALLOC_REMOTE reads `fref+2` from frame. If value
7171+ is non-zero, emits `FrameControlToken` with `op=ALLOC_SHARED` and
7272+ `payload=parent_act_id`. If zero, emits `op=ALLOC` as before.
7373+- **frame-lanes.AC4.2:** No new opcodes. Behaviour is entirely data-driven
7474+ from frame constants.
7575+7676+### AC5: FREE_FRAME Instruction
7777+7878+- **frame-lanes.AC5.1:** `FREE_FRAME` opcode uses the smart FREE behaviour
7979+ from AC3.3. Frees the executing token's activation lane; returns frame to
8080+ free list only if last lane.
8181+8282+### AC6: Monitor and Snapshot Updates
8383+8484+- **frame-lanes.AC6.1:** `PESnapshot.tag_store` type becomes
8585+ `dict[int, tuple[int, int]]`.
8686+- **frame-lanes.AC6.2:** `PESnapshot` gains `match_data`, `lane_count` fields
8787+ reflecting the separated match storage.
8888+- **frame-lanes.AC6.3:** Monitor REPL `pe` command displays lane info in
8989+ tag_store output.
9090+- **frame-lanes.AC6.4:** Monitor graph JSON serialises lane info correctly.
9191+9292+### AC7: Codegen Updates
9393+9494+- **frame-lanes.AC7.1:** `codegen.py` generates `initial_tag_store` with
9595+ `(frame_id, lane)` tuples. Existing single-activation code uses lane 0.
9696+- **frame-lanes.AC7.2:** No codegen changes needed for ALLOC_SHARED (manual
9797+ construction only for now).
9898+9999+### AC8: Test Coverage
100100+101101+- **frame-lanes.AC8.1:** Test: two act_ids sharing a frame via ALLOC_SHARED
102102+ have independent matching — L operand for act_id 0 does not interfere with
103103+ L operand for act_id 1 at the same offset.
104104+- **frame-lanes.AC8.2:** Test: ALLOC_SHARED with all lanes occupied emits
105105+ TokenRejected.
106106+- **frame-lanes.AC8.3:** Test: FREE on a shared frame frees only the lane;
107107+ other lanes' data is preserved. FREE on last lane frees the frame.
108108+- **frame-lanes.AC8.4:** Test: ALLOC_REMOTE emits ALLOC_SHARED when
109109+ `fref+2` is non-zero.
110110+- **frame-lanes.AC8.5:** Test: ALLOC_REMOTE emits ALLOC when `fref+2` is
111111+ zero (backwards compatible).
112112+- **frame-lanes.AC8.6:** Test: full loop pipelining scenario — two
113113+ iterations of a dyadic instruction running concurrently on different
114114+ lanes, both producing correct results.
115115+116116+## Architecture
117117+118118+### Current Model
119119+120120+```
121121+tag_store[act_id] → frame_id (1:1 mapping)
122122+frames[frame_id][slot] (constants + dests + match data mixed)
123123+presence[frame_id][offset] (1 pending operand per instruction)
124124+port_store[frame_id][offset] (port of pending operand)
125125+```
126126+127127+### New Model
128128+129129+```
130130+tag_store[act_id] → (frame_id, lane) (many:1, multiple act_ids per frame)
131131+frames[frame_id][slot] (constants + dests ONLY, shared)
132132+match_data[frame_id][offset][lane] (per-lane operand storage)
133133+presence[frame_id][offset][lane] (per-lane presence bits)
134134+port_store[frame_id][offset][lane] (per-lane port metadata)
135135+lane_free[frame_id] → set[int] (available lanes per frame)
136136+```
137137+138138+### Frame Control Token Payload Convention
139139+140140+```
141141+ALLOC: payload ignored (or return routing)
142142+ALLOC_SHARED: payload = parent_act_id (low 3 bits)
143143+FREE: payload ignored
144144+FREE_LANE: payload ignored
145145+```
146146+147147+### ALLOC_REMOTE Frame Slot Convention
148148+149149+```
150150+fref+0: target_pe (int)
151151+fref+1: target_act_id (int)
152152+fref+2: parent_act_id (0 = ALLOC, non-zero = ALLOC_SHARED)
153153+```
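
The slot convention above can be illustrated with a small decoder. This is a sketch under assumptions, not the PE's code: `alloc_remote_fields` is a hypothetical helper, the frame is modelled as a flat list of ints, and the payload for a plain ALLOC is set to 0 only because the payload convention says it is ignored.

```python
from enum import IntEnum


class FrameOp(IntEnum):
    ALLOC = 0
    FREE = 1
    ALLOC_SHARED = 2
    FREE_LANE = 3


def alloc_remote_fields(
    frame: list[int], fref: int
) -> tuple[int, int, FrameOp, int]:
    """Decode (target_pe, target_act_id, op, payload) from three frame slots."""
    target_pe = frame[fref]
    target_act_id = frame[fref + 1]
    parent = frame[fref + 2]
    if parent != 0:
        # Non-zero slot: share the parent's frame
        return target_pe, target_act_id, FrameOp.ALLOC_SHARED, parent
    # Zero slot: allocate a fresh frame, payload unused (assumed 0 here)
    return target_pe, target_act_id, FrameOp.ALLOC, 0


frame = [0] * 8
frame[4:7] = [1, 3, 2]          # target_pe=1, target_act_id=3, parent_act_id=2
shared = alloc_remote_fields(frame, 4)
frame[6] = 0                    # clear the parent slot
plain = alloc_remote_fields(frame, 4)
```

One consequence of reserving 0 for the plain case is that a parent whose act_id is 0 cannot be named through this slot.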
154154+155155+### Smart FREE Behaviour
156156+157157+When FREE or FREE_FRAME executes for an act_id:
158158+1. Look up `(frame_id, lane)` from tag_store
159159+2. Remove tag_store entry for act_id
160160+3. Clear match_data/presence/port_store for that lane across all offsets
161161+4. Return lane to `lane_free[frame_id]`
162162+5. Scan tag_store: does any other entry reference frame_id?
163163+ - No → return frame to free_frames, clear all frame slots
164164+ - Yes → frame stays allocated, constants/dests preserved
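
The five steps above could be sketched like this. It is a simplified illustration and not the PE implementation: `smart_free` is a hypothetical free function, `free_frames` is assumed to be a list, and step 3 (clearing per-lane match storage) and the frame-slot clearing are omitted.

```python
def smart_free(
    tag_store: dict[int, tuple[int, int]],
    lane_free: dict[int, set[int]],
    free_frames: list[int],
    act_id: int,
) -> bool:
    """Free act_id's lane; return True if the frame itself was released."""
    frame_id, lane = tag_store.pop(act_id)       # steps 1 and 2
    # Step 3 (clear match_data/presence/port_store for this lane) omitted.
    lane_free[frame_id].add(lane)                # step 4
    # Step 5: does any remaining entry still reference this frame?
    still_used = any(fid == frame_id for fid, _ in tag_store.values())
    if not still_used:
        free_frames.append(frame_id)
    return not still_used


# Two activations share frame 2 on lanes 0 and 1.
tag_store = {0: (2, 0), 1: (2, 1)}
lane_free = {2: set()}
free_frames: list[int] = []
freed_first = smart_free(tag_store, lane_free, free_frames, act_id=0)
freed_last = smart_free(tag_store, lane_free, free_frames, act_id=1)
```

The first call only releases lane 0; the second call finds no remaining references and returns frame 2 to the free list.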
165165+166166+### Lifecycle Example: Loop Pipelining
167167+168168+```
169169+1. ALLOC(act_id=0) → frame 2, lane 0
170170+2. Setup: write constants/dests to frame 2
171171+3. Iteration 1 seeds use act_id=0
172172+173173+4. ALLOC_SHARED(act_id=1, parent=0) → frame 2, lane 1
174174+5. Iteration 2 seeds use act_id=1
175175+ (act_id=0 and act_id=1 match independently at same offsets)
176176+177177+6. Iteration 1 completes: FREE(act_id=0) → lane 0 freed, frame stays
178178+7. ALLOC_SHARED(act_id=2, parent=1) → frame 2, lane 0 (recycled)
179179+8. Iteration 3 seeds use act_id=2
180180+181181+9. All done: FREE(act_id=last) → last lane, frame returned
182182+```
183183+184184+## Existing Patterns
185185+186186+- **Frame control handling:** `_handle_frame_control()` in `emu/pe.py` already
187187+ dispatches on `FrameOp` enum values. Adding ALLOC_SHARED/FREE_LANE follows
188188+ the same pattern.
189189+- **Token rejection:** `TokenRejected` event already exists and is emitted for
190190+ invalid act_ids. Lane exhaustion follows the same pattern.
191191+- **Smart free (precedent):** The existing FREE already validates act_id
192192+ presence in tag_store before freeing. The smart-free extension adds a
193193+ scan step after removal.
194194+- **Data-driven opcode behaviour:** ALLOC_REMOTE already reads frame slots
195195+ to determine target. Reading an additional slot for shared-vs-new is the
196196+ same pattern.
197197+- **Codegen initial_tag_store:** Already generates `act_id → frame_id`
198198+ mappings. Extending to tuples is mechanical.
199199+200200+## Implementation Phases
201201+202202+### Phase 1: Foundation Types and Tag Store API (2 tasks)
203203+204204+Update FrameOp enum, PEConfig, and tag_store type across the codebase.
205205+All existing tests adapted to tuple API. No new behaviour yet.
206206+207207+### Phase 2: Separated Match Storage (2 tasks)
208208+209209+Extract match_data from frames into its own 3D array. Update _match_frame
210210+to use lane dimension (always lane 0 for now). Update presence/port_store
211211+to 3D. Verify matching still works identically.
212212+213213+### Phase 3: ALLOC_SHARED, FREE_LANE, Smart FREE (3 tasks)
214214+215215+Implement new FrameOp handlers. Add lane_free tracking. Implement smart
216216+FREE behaviour. Add events with lane fields. Write tests for all new ops.
217217+218218+### Phase 4: ALLOC_REMOTE Data-Driven and FREE_FRAME Update (2 tasks)
219219+220220+Update ALLOC_REMOTE to read fref+2 for shared-vs-new. Update FREE_FRAME
221221+opcode to use smart free. Write tests.
222222+223223+### Phase 5: Monitor, Snapshot, and Codegen Updates (2 tasks)
224224+225225+Update PESnapshot, capture(), REPL formatting, graph JSON. Update codegen
226226+initial_tag_store to emit tuples. Verify monitor displays lane info.
227227+228228+### Phase 6: Integration Tests (1 task)
229229+230230+Full loop pipelining scenario test. Two concurrent iterations on shared
231231+frame, both producing correct results. E2E verification.
232232+233233+## Additional Considerations
234234+235235+### ABA Safety
236236+237237+With 3-bit act_id (8 values) and at most 4 lanes per frame, there are 4 IDs
238238+of ABA distance between allocation and re-use of the same act_id value.
239239+FREE removes the act_id from tag_store entirely, so stale tokens with freed
240240+act_ids hit rejection. Re-allocation uses a different act_id value.
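
A toy illustration of the rejection path described above, assuming that acceptance reduces to a tag_store membership check (`accept_token` is a hypothetical name, not a PE method):

```python
def accept_token(tag_store: dict[int, tuple[int, int]], act_id: int) -> bool:
    """A token is accepted only if its act_id currently has a tag_store entry."""
    return act_id in tag_store


tag_store = {0: (2, 0)}
before = accept_token(tag_store, 0)   # act_id 0 is live
tag_store.pop(0)                      # FREE(act_id=0)
tag_store[1] = (2, 0)                 # lane 0 recycled under a different act_id
stale = accept_token(tag_store, 0)    # late token for the freed act_id
fresh = accept_token(tag_store, 1)
```

The stale token is refused even though its old lane is in use again, because the lookup is keyed on act_id and not on the lane.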
241241+242242+### Hardware Mapping
243243+244244+This design maps cleanly to Approach C (670 lookup) with L=2 at +5 chips,
245245+or to B+670 semi-CAM with zero additional match hardware (lane bit fits in
246246+existing comparator width). See `design-notes/frame-lanes-for-concurrent-
247247+matching.md` for full hardware analysis.
248248+249249+### Future Work
250250+251251+- **Assembler loop macro:** `#loop_counted` and `#loop_while` could auto-
252252+ generate ALLOC_SHARED/FREE_LANE control flow with act_id rotation.
253253+- **Lane depth analysis:** static analysis in the allocator to warn when
254254+ loop concurrency exceeds lane_count.
255255+- **SC arc interaction:** frame_id latch for strongly-connected arc execution
256256+ must also latch the lane. Trivial addition when SC arcs are implemented.
257257+258258+## Glossary
259259+260260+- **Lane:** An independent matching slot within a shared frame. Multiple
261261+ act_ids can map to the same frame_id with different lane indices,
262262+ providing concurrent matching without duplicating constants/destinations.
263263+- **Lane pool:** The set of available lanes per frame, tracked by
264264+ `lane_free[frame_id]`. Initially all lanes are free; ALLOC assigns lane 0,
265265+ ALLOC_SHARED assigns the next free lane.
266266+- **Smart free:** FREE behaviour that auto-detects whether the freed lane
267267+ is the last one using that frame. If last, returns frame to free list.
268268+ If not, preserves frame for remaining lanes.
269269+- **Parent act_id:** The activation ID whose frame should be shared during
270270+ ALLOC_SHARED. Used to look up the target frame_id.
···11+# Frame Matching Lanes Implementation Plan
22+33+**Goal:** Extend the PE's frame-based matching to support multiple simultaneous pending operands per instruction within a single activation via matching lanes.
44+55+**Architecture:** Multiple `activation_id` values share one physical frame (constants/destinations) while maintaining independent matching state per lane. Tag store maps `act_id → (frame_id, lane)`. Match data, presence, and port storage gain a lane dimension.
66+77+**Tech Stack:** Python 3.12, SimPy 4.1, pytest + hypothesis
88+99+**Scope:** 6 phases from original design (phases 1-6)
1010+1111+**Codebase verified:** 2026-03-07
1212+1313+---
1414+1515+## Acceptance Criteria Coverage
1616+1717+This phase implements and tests:
1818+1919+### frame-lanes.AC1: Tag Store Tuple API
2020+- **frame-lanes.AC1.1 Success:** `tag_store` maps `act_id → (frame_id, lane)` where `lane` is an `int` in range `[0, lane_count)`.
2121+- **frame-lanes.AC1.2 Success:** `PEConfig.initial_tag_store` type is `dict[int, tuple[int, int]]`. PE constructor initialises tag_store from it.
2222+- **frame-lanes.AC1.3 Success:** `PEConfig.lane_count` field exists with default 4. Controls third dimension of match arrays.
2323+- **frame-lanes.AC1.4 Success:** All existing tests pass with updated tuple API.
2424+2525+---
2626+2727+<!-- START_SUBCOMPONENT_A (tasks 1-2) -->
2828+2929+<!-- START_TASK_1 -->
3030+### Task 1: Update FrameOp enum with ALLOC_SHARED and FREE_LANE
3131+3232+**Verifies:** None (enum additions only, no behaviour change)
3333+3434+**Files:**
3535+- Modify: `cm_inst.py:88-90`
3636+3737+**Implementation:**
3838+3939+Add two new enum members to `FrameOp`. Existing values stay unchanged:

```python
class FrameOp(IntEnum):
    ALLOC = 0
    FREE = 1
    ALLOC_SHARED = 2
    FREE_LANE = 3
```
4848+4949+**Testing:**
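5050+5151+No tests needed: the new members are plain `IntEnum` values with no behaviour attached, and a misspelled member name raises `AttributeError` as soon as the code that uses it runs. Existing tests that use `FrameOp.ALLOC` and `FrameOp.FREE` remain unaffected.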
5252+5353+**Verification:**
5454+Run: `python -m pytest tests/ -v -x`
5555+Expected: All 1277+ existing tests pass unchanged.
5656+5757+**Commit:** `jj commit -m "feat: add ALLOC_SHARED and FREE_LANE to FrameOp enum"`
5858+<!-- END_TASK_1 -->
5959+6060+<!-- START_TASK_2 -->
6161+### Task 2: Add lane_count to PEConfig
6262+6363+**Verifies:** frame-lanes.AC1.3
6464+6565+**Files:**
6666+- Modify: `emu/types.py:17-27`
6767+6868+**Implementation:**
6969+7070+Add `lane_count` field to `PEConfig` with default 4. Place it after `matchable_offsets` (line 22) to group dimensional config together:

```python
@dataclass(frozen=True)
class PEConfig:
    pe_id: int = 0
    iram: dict[int, Instruction] | None = None
    frame_count: int = 8
    frame_slots: int = 64
    matchable_offsets: int = 8
    lane_count: int = 4
    initial_frames: Optional[dict[int, list[FrameSlotValue]]] = None
    initial_tag_store: Optional[dict[int, int]] = None
    allowed_pe_routes: Optional[set[int]] = None
    allowed_sm_routes: Optional[set[int]] = None
    on_event: EventCallback | None = None
```
8787+8888+Note: `initial_tag_store` type stays `dict[int, int]` for now — the next task (Task 3) changes it to tuples.
8989+9090+**Testing:**
9191+9292+No dedicated tests — `lane_count` has a default value so all existing PEConfig constructions remain valid. The field's effect is tested when match arrays gain the lane dimension (Phase 2).
9393+9494+**Verification:**
9595+Run: `python -m pytest tests/ -v -x`
9696+Expected: All existing tests pass unchanged.
9797+9898+**Commit:** `jj commit -m "feat: add lane_count field to PEConfig with default 4"`
9999+<!-- END_TASK_2 -->
100100+101101+<!-- END_SUBCOMPONENT_A -->
102102+103103+<!-- START_SUBCOMPONENT_B (tasks 3-5) -->
104104+105105+<!-- START_TASK_3 -->
106106+### Task 3: Update PEConfig.initial_tag_store to tuple type
107107+108108+**Verifies:** frame-lanes.AC1.2
109109+110110+**Files:**
111111+- Modify: `emu/types.py:24` — change type annotation
112112+113113+**Implementation:**
114114+115115+Change the `initial_tag_store` type from `dict[int, int]` to `dict[int, tuple[int, int]]`:
116116+117117+```python
118118+initial_tag_store: Optional[dict[int, tuple[int, int]]] = None
119119+```
120120+121121+Each entry is now `act_id → (frame_id, lane)`.
122122+123123+**Testing:**
124124+125125+No dedicated tests — this is a type change. Downstream call sites are updated in Tasks 4 and 5.
126126+127127+**Verification:**
128128+129129+This change alone will break tests that construct `PEConfig` with `initial_tag_store={0: 0}` etc. Do NOT run tests yet — proceed to Task 4 immediately.
130130+131131+**Commit:** Do not commit yet — combine with Task 4.
132132+<!-- END_TASK_3 -->
133133+134134+<!-- START_TASK_4 -->
135135+### Task 4: Update PE constructor and internals for tuple tag_store
136136+137137+**Verifies:** frame-lanes.AC1.1
138138+139139+**Files:**
140140+- Modify: `emu/pe.py:72` — tag_store initialization
141141+- Modify: `emu/pe.py:88-90` — free_frames removal from tag_store values
142142+- Modify: `emu/pe.py:169-176` — CMToken act_id lookup
143143+- Modify: `emu/pe.py:260-261` — FREE_FRAME opcode handler
144144+- Modify: `emu/pe.py:289` — ALLOC frame control handler
145145+- Modify: `emu/pe.py:304-305` — FREE frame control handler
146146+- Modify: `emu/pe.py:321-322` — PELocalWriteToken handler
147147+148148+**Implementation:**
149149+150150+The internal `tag_store` type changes from `dict[int, int]` to `dict[int, tuple[int, int]]`. Every access point must be updated.
151151+152152+**Line 72 — Initialization:**
153153+```python
154154+# Tag store: act_id → (frame_id, lane)
155155+self.tag_store: dict[int, tuple[int, int]] = dict(config.initial_tag_store or {})
156156+```
157157+158158+**Lines 88-90 — Free frames removal:**
159159+The values are now tuples `(frame_id, lane)`. Extract `frame_id`:
160160+```python
161161+for frame_id, _lane in self.tag_store.values():
162162+ if frame_id in self.free_frames:
163163+ self.free_frames.remove(frame_id)
164164+```
165165+166166+**Lines 169-176 — CMToken pipeline (act_id lookup):**
167167+Where the code currently does `frame_id = self.tag_store[token.act_id]`, change to:
168168+```python
169169+frame_id, lane = self.tag_store[token.act_id]
170170+```
171171+The `lane` value is not used yet in Phase 1 — matching still uses the 2D presence/port arrays. Phase 2 adds the lane dimension to match storage.
172172+173173+**Lines 260-261 — FREE_FRAME opcode:**
174174+Where the code does `freed_frame = self.tag_store.pop(token.act_id)`, change to:
175175+```python
176176+freed_frame, _lane = self.tag_store.pop(token.act_id)
177177+```
178178+179179+**Line 289 — ALLOC handler:**
180180+Where the code stores `self.tag_store[token.act_id] = frame_id`, change to:
181181+```python
182182+self.tag_store[token.act_id] = (frame_id, 0)
183183+```
184184+New allocations always get lane 0.
185185+186186+**Lines 304-305 — FREE handler:**
187187+Where the code does `frame_id = self.tag_store.pop(token.act_id)`, change to:
188188+```python
189189+frame_id, _lane = self.tag_store.pop(token.act_id)
190190+```
191191+192192+**Lines 321-322 — PELocalWriteToken handler:**
193193+Where the code checks `token.act_id in self.tag_store` and then does `frame_id = self.tag_store[token.act_id]`, change the lookup to:
194194+```python
195195+frame_id, _lane = self.tag_store[token.act_id]
196196+```
197197+198198+**Testing:**
199199+200200+No new tests in this task — AC1.1 is verified by the existing test suite passing with the new tuple type (Task 5 updates those tests).
201201+202202+**Verification:**
203203+204204+Do NOT run tests yet — existing tests still pass `dict[int, int]` values to `initial_tag_store`. Proceed to Task 5 immediately.
205205+206206+**Commit:** Do not commit yet — combine with Task 5.
207207+<!-- END_TASK_4 -->
208208+209209+<!-- START_TASK_5 -->
210210+### Task 5: Update all test files and downstream code for tuple tag_store API
211211+212212+**Verifies:** frame-lanes.AC1.2, frame-lanes.AC1.4
213213+214214+**Files:**
215215+- Modify: `tests/test_pe_frames.py` — ~21 `pe.tag_store[N]` value access sites need tuple unpacking (e.g., `frame_id = pe.tag_store[0]` → `frame_id, _lane = pe.tag_store[0]` or `frame_id = pe.tag_store[0][0]`). Also fix `pe.tag_store[0] in range(pe.frame_count)` at line 99 to `pe.tag_store[0][0] in range(pe.frame_count)`. Note: this file does NOT use `initial_tag_store` — changes are to value access patterns only.
216216+- Modify: `tests/test_pe_events.py` — 9 `initial_tag_store` call sites: all `{0: 0}` → `{0: (0, 0)}`
217217+- Modify: `tests/test_network_routing.py` — 2 tests with `initial_tag_store` construction and `pe.tag_store` value assertions
218218+- Modify: `tests/test_snapshot.py` — tag_store capture assertions and PESnapshot type
219219+- Modify: `tests/test_pe.py` — 18 `initial_tag_store` call sites: 17 `{0: 0}` → `{0: (0, 0)}`, one `{1: 0}` → `{1: (0, 0)}`. Note: `pe.presence` indexing changes are deferred to Phase 2 Task 2 (this task changes ONLY `initial_tag_store` values in this file).
220220+- Modify: `tests/test_monitor_graph_json.py` — PESnapshot constructions with tag_store field
221221+- Modify: `monitor/snapshot.py:25` — PESnapshot.tag_store type annotation
222222+- Modify: `monitor/snapshot.py:81` — capture() tag_store copy
223223+- Modify: `asm/codegen.py:371-422` — initial_tag_store generation
224224+- Modify: `tests/conftest.py` — frame_control_token strategy (if it constructs tag_store)
225225+226226+**Implementation:**
227227+228228+This is a mechanical find-and-replace across the codebase. Every place that constructs `initial_tag_store` must change from `{act_id: frame_id}` to `{act_id: (frame_id, lane)}` where lane is 0 for all existing code.
229229+230230+**Pattern for test files:**
231231+232232+Every `initial_tag_store={0: 0}` becomes `initial_tag_store={0: (0, 0)}`.
233233+Every `initial_tag_store={1: 0}` becomes `initial_tag_store={1: (0, 0)}`.
234234+Every `initial_tag_store={0: 2, 1: 3}` becomes `initial_tag_store={0: (2, 0), 1: (3, 0)}`.
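
For fixtures that build the mapping programmatically, the same rewrite can be expressed as a one-line conversion. This is a sketch only; `to_tuple_tag_store` is a hypothetical helper and is not part of the codebase.

```python
def to_tuple_tag_store(old: dict[int, int]) -> dict[int, tuple[int, int]]:
    """Map each act_id -> frame_id entry to act_id -> (frame_id, 0)."""
    return {act_id: (frame_id, 0) for act_id, frame_id in old.items()}


migrated = to_tuple_tag_store({0: 2, 1: 3})
```

Every migrated entry lands on lane 0, matching the rule that all existing code uses lane 0.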
235235+236236+**Pattern for assertions on tag_store values:**
237237+238238+Where tests assert `pe.tag_store[0] == 2`, change to `pe.tag_store[0] == (2, 0)`.
239239+Where tests assert `pe.tag_store[0]` (existence check), no change needed.
240240+241241+**monitor/snapshot.py line 25:**
242242+```python
243243+tag_store: dict[int, tuple[int, int]]
244244+```
245245+246246+**monitor/snapshot.py line 81:**
247247+No code change needed — `dict(pe.tag_store)` already copies tuples correctly.
248248+249249+**asm/codegen.py lines 371-422:**
250250+251251+Where `initial_tag_store[act_id] = frame_id` is set, change to:
252252+```python
253253+initial_tag_store[act_id] = (frame_id, 0)
254254+```
255255+256256+This applies at approximately lines 383 and 412.
257257+258258+**Testing:**
259259+260260+This task verifies AC1.4 — all existing tests must pass with the updated tuple API. No new test functions are needed; the existing suite IS the verification.
261261+262262+**Verification:**
263263+Run: `python -m pytest tests/ -v -x`
264264+Expected: All existing tests pass. Zero failures.
265265+266266+**Commit:** `jj commit -m "feat: update tag_store to tuple API (act_id → frame_id, lane)"`
267267+268268+This single commit covers Tasks 3, 4, and 5 together since they form an atomic change — the type, the internals, and all call sites must change together.
269269+<!-- END_TASK_5 -->
270270+271271+<!-- END_SUBCOMPONENT_B -->
···11+# Frame Matching Lanes Implementation Plan
22+33+**Goal:** Extend the PE's frame-based matching to support multiple simultaneous pending operands per instruction within a single activation via matching lanes.
44+55+**Architecture:** Multiple `activation_id` values share one physical frame (constants/destinations) while maintaining independent matching state per lane. Tag store maps `act_id → (frame_id, lane)`. Match data, presence, and port storage gain a lane dimension.
66+77+**Tech Stack:** Python 3.12, SimPy 4.1, pytest + hypothesis
88+99+**Scope:** 6 phases from original design (phases 1-6)
1010+1111+**Codebase verified:** 2026-03-07
1212+1313+---
1414+1515+## Acceptance Criteria Coverage
1616+1717+This phase implements and tests:
1818+1919+### frame-lanes.AC2: Separate Match Data Storage
2020+- **frame-lanes.AC2.1 Success:** Match operand data lives in `match_data[frame_id][offset][lane]`, separate from `frames[frame_id][slot]`.
2121+- **frame-lanes.AC2.2 Success:** `presence[frame_id][offset][lane]` is a 3D bool array. `port_store[frame_id][offset][lane]` likewise.
2222+- **frame-lanes.AC2.3 Success:** `_match_frame()` uses `(frame_id, match_slot, lane)` to read/write match data, presence, and port.
2323+- **frame-lanes.AC2.4 Success:** `frames[frame_id][slot]` remains shared across all lanes. Constants and destinations are NOT per-lane.
2424+2525+Also partially satisfies (structural type changes only, full testing in Phase 5):
2626+### frame-lanes.AC6: Monitor and Snapshot Updates (partial)
2727+- **frame-lanes.AC6.1:** `PESnapshot.tag_store` type updated to `dict[int, tuple[int, int]]`.
2828+- **frame-lanes.AC6.2:** `PESnapshot` gains `match_data`, `lane_count` fields.
2929+3030+---
3131+3232+<!-- START_SUBCOMPONENT_A (tasks 1-3) -->
3333+3434+<!-- START_TASK_1 -->
3535+### Task 1: Add match_data 3D array and convert presence/port_store to 3D
3636+3737+**Verifies:** frame-lanes.AC2.1, frame-lanes.AC2.2
3838+3939+**Files:**
4040+- Modify: `emu/pe.py:66-84` — PE constructor storage initialization
4141+- Modify: `emu/pe.py:290-296` — ALLOC handler reset logic
4242+4343+**Implementation:**
4444+4545+Add a new `match_data` 3D array and extend `presence` and `port_store` from 2D to 3D by adding the lane dimension. After Phase 1, `config.lane_count` is available.
4646+4747+**Constructor changes (replace lines 74-84):**
4848+4949+Replace the current 2D presence and port_store initialization with 3D versions, and add a match_data array:

```python
# Match data: [frame_id][match_slot][lane] - operand values waiting for partner
self.match_data: list[list[list[Optional[int]]]] = [
    [
        [None for _ in range(config.lane_count)]
        for _ in range(config.matchable_offsets)
    ]
    for _ in range(config.frame_count)
]

# Presence bits: [frame_id][match_slot][lane] - True if operand waiting for partner
self.presence: list[list[list[bool]]] = [
    [
        [False for _ in range(config.lane_count)]
        for _ in range(config.matchable_offsets)
    ]
    for _ in range(config.frame_count)
]

# Port store: [frame_id][match_slot][lane] - port of waiting operand
self.port_store: list[list[list[Optional[Port]]]] = [
    [
        [None for _ in range(config.lane_count)]
        for _ in range(config.matchable_offsets)
    ]
    for _ in range(config.frame_count)
]

self.lane_count = config.lane_count
```
8181+8282+**ALLOC handler reset (lines 293-296):**
8383+8484+Update the presence/port_store reset loop to iterate all lanes, and also clear match_data:

```python
for i in range(self.matchable_offsets):
    for ln in range(self.lane_count):
        self.match_data[frame_id][i][ln] = None
        self.presence[frame_id][i][ln] = False
        self.port_store[frame_id][i][ln] = None
```
9393+9494+**Testing:**
9595+9696+No dedicated tests for storage structure — AC2.1 and AC2.2 are verified by the existing test suite continuing to pass after Task 2 updates `_match_frame()`. The 3D structure is exercised through matching behaviour.
9797+9898+**Verification:**
9999+100100+Do NOT run tests yet — `_match_frame()` still uses 2D indexing. Proceed to Task 2 immediately.
101101+102102+**Commit:** Do not commit yet — combine with Task 2.
103103+<!-- END_TASK_1 -->
104104+105105+<!-- START_TASK_2 -->
106106+### Task 2: Update _match_frame() to use lane dimension
107107+108108+**Verifies:** frame-lanes.AC2.3, frame-lanes.AC2.4
109109+110110+**Files:**
111111+- Modify: `emu/pe.py:343-383` — `_match_frame()` method
112112+- Modify: `emu/pe.py:169-176` — CMToken pipeline where `_match_frame()` is called
113113+114114+**Implementation:**
115115+116116+Update `_match_frame()` to accept and use the `lane` parameter. After Phase 1, the CMToken pipeline already unpacks `frame_id, lane = self.tag_store[token.act_id]` — now pass `lane` through.
117117+118118+**Update the call site (in the CMToken processing pipeline):**
119119+120120+Where `_match_frame()` is currently called with `(token, inst, frame_id)`, add `lane`:
121121+```python
122122+result = self._match_frame(token, inst, frame_id, lane)
123123+```
124124+125125+**Updated `_match_frame()` signature and body:**

```python
def _match_frame(
    self,
    token: DyadToken,
    inst: Instruction,
    frame_id: int,
    lane: int,
) -> Optional[tuple[int, int]]:
    """Frame-based dyadic matching with lane support.

    Derives match slot from low bits of token.offset:
        match_slot = token.offset % matchable_offsets

    Match data, presence, and port are per-lane.
    Frame constants/destinations remain shared.
    """
    match_slot = token.offset % self.matchable_offsets

    if self.presence[frame_id][match_slot][lane]:
        # Partner already waiting: pair them
        partner_data = self.match_data[frame_id][match_slot][lane]
        partner_port = self.port_store[frame_id][match_slot][lane]
        self.presence[frame_id][match_slot][lane] = False
        self.match_data[frame_id][match_slot][lane] = None

        # Use port metadata to determine left/right ordering
        if partner_port == Port.L:
            left, right = partner_data, token.data
        else:
            left, right = token.data, partner_data

        self._on_event(Matched(
            time=self.env.now, component=self._component,
            left=left, right=right, act_id=token.act_id,
            offset=token.offset, frame_id=frame_id,
        ))
        return left, right
    else:
        # Store and wait for partner
        self.match_data[frame_id][match_slot][lane] = token.data
        self.port_store[frame_id][match_slot][lane] = token.port
        self.presence[frame_id][match_slot][lane] = True
        return None
```
171171+172172+Key changes from current code:
173173+- All `self.frames[frame_id][match_slot]` reads/writes for match data → `self.match_data[frame_id][match_slot][lane]`
174174+- All `self.presence[frame_id][match_slot]` → `self.presence[frame_id][match_slot][lane]`
175175+- All `self.port_store[frame_id][match_slot]` → `self.port_store[frame_id][match_slot][lane]`
176176+- `self.frames` is NOT touched — constants and destinations remain shared (AC2.4)
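
To make the per-lane independence concrete, here is a stripped-down stand-in for one frame's match storage. It is not the real PE class: `LaneMatcher` is a hypothetical toy, presence is folded into "stored value is not None", and the port is reduced to a boolean.

```python
from typing import Optional


class LaneMatcher:
    """Toy model of one frame's match storage, indexed [offset][lane]."""

    def __init__(self, offsets: int, lanes: int) -> None:
        self.data: list[list[Optional[int]]] = [
            [None] * lanes for _ in range(offsets)
        ]
        self.is_left: list[list[bool]] = [
            [False] * lanes for _ in range(offsets)
        ]

    def match(
        self, offset: int, lane: int, value: int, left: bool
    ) -> Optional[tuple[int, int]]:
        waiting = self.data[offset][lane]
        if waiting is None:
            # First operand for this (offset, lane): store it and wait
            self.data[offset][lane] = value
            self.is_left[offset][lane] = left
            return None
        # Partner was waiting: clear the slot and order the pair by port
        self.data[offset][lane] = None
        if self.is_left[offset][lane]:
            return waiting, value
        return value, waiting


m = LaneMatcher(offsets=8, lanes=2)
a = m.match(0, 0, 10, left=True)    # lane 0, L operand arrives
b = m.match(0, 1, 20, left=True)    # lane 1, L operand at the same offset
c = m.match(0, 0, 11, left=False)   # lane 0, R operand completes the pair
d = m.match(0, 1, 21, left=False)   # lane 1, R operand completes the pair
```

The two L operands at offset 0 do not collide because they sit in different lanes, which is the behaviour AC8.1 asks the real tests to demonstrate.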
177177+178178+**Testing:**
179179+180180+AC2.3 and AC2.4 are verified by the existing test suite passing. All existing tests use lane 0 (set by Phase 1's tuple tag_store), so matching behaviour is identical.
181181+182182+Two tests directly check `pe.presence`:
183183+- `tests/test_pe.py:160` — `assert pe.presence[frame_id][0] is True` → change to `pe.presence[frame_id][0][0]`
184184+- `tests/test_pe.py:624` — `assert pe.presence[frame_id][...] is False` → change to `pe.presence[frame_id][...][0]`
185185+186186+**Verification:**
187187+Run: `python -m pytest tests/ -v -x`
188188+Expected: All existing tests pass.
189189+190190+**Commit:** `jj commit -m "feat: separate match_data from frames, add lane dimension to presence/port_store"`
191191+192192+This single commit covers Tasks 1 and 2 since they form an atomic change.
193193+<!-- END_TASK_2 -->
194194+195195+<!-- START_TASK_3 -->
196196+### Task 3: Update snapshot capture for 3D match storage
197197+198198+**Verifies:** None (snapshot updates for AC6 are in Phase 5, but this keeps snapshot working)
199199+200200+**Files:**
201201+- Modify: `monitor/snapshot.py:18-30` — PESnapshot dataclass
202202+- Modify: `monitor/snapshot.py:82-89` — capture() presence/port_store conversion
203203+- Modify: `tests/test_snapshot.py` — snapshot assertion updates
204204+205205+**Implementation:**
207207+208208+Update `PESnapshot` to reflect the new 3D storage shapes and add the `match_data` and `lane_count` fields.
208208+209209+**PESnapshot dataclass updates:**
210210+211211+```python
212212+@dataclass(frozen=True)
213213+class PESnapshot:
214214+    pe_id: int
215215+    iram: dict[int, Instruction]
216216+    frames: tuple[tuple[FrameSlotValue, ...], ...]
217217+    tag_store: dict[int, tuple[int, int]]
218218+    presence: tuple[tuple[tuple[bool, ...], ...], ...]
219219+    port_store: tuple[tuple[tuple[Port | None, ...], ...], ...]
220220+    match_data: tuple[tuple[tuple[int | None, ...], ...], ...]
221221+    free_frames: tuple[int, ...]
222222+    lane_count: int
223223+    input_queue: tuple[Token, ...]
224224+    output_log: tuple[Token, ...]
225225+```
226226+227227+**capture() updates:**
228228+229229+Replace the 2D presence/port_store capture with 3D, and add match_data capture:
230230+231231+```python
232232+presence = tuple(
233233+    tuple(
234234+        tuple(lane_val for lane_val in offset_lanes)
235235+        for offset_lanes in frame_presence
236236+    )
237237+    for frame_presence in pe.presence
238238+)
239239+port_store = tuple(
240240+    tuple(
241241+        tuple(lane_val for lane_val in offset_lanes)
242242+        for offset_lanes in frame_ports
243243+    )
244244+    for frame_ports in pe.port_store
245245+)
246246+match_data = tuple(
247247+    tuple(
248248+        tuple(lane_val for lane_val in offset_lanes)
249249+        for offset_lanes in frame_match
250250+    )
251251+    for frame_match in pe.match_data
252252+)
253253+```
254254+255255+Pass `match_data=match_data` and `lane_count=pe.lane_count` to the PESnapshot constructor.
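
Because the three conversions in `capture()` have the same shape, they could also be expressed through one helper. The sketch below is an assumption for illustration only: `freeze_3d` is a hypothetical name and is not part of `monitor/snapshot.py`.

```python
def freeze_3d(grid):
    """Convert a [frame][offset][lane] nested list into nested tuples."""
    return tuple(
        tuple(tuple(offset_lanes) for offset_lanes in frame)
        for frame in grid
    )


# One frame, two matchable offsets, two lanes.
live_presence = [[[False, True], [False, False]]]
frozen_presence = freeze_3d(live_presence)

# Mutating the live structure afterwards does not affect the snapshot copy.
live_presence[0][0][0] = True
```

The point of the nested tuples is that a snapshot taken at one moment stays unchanged when the PE later mutates its own lists.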
256256+257257+**Testing:**
258258+259259+Update any snapshot tests that assert on `presence` or `port_store` shape to expect 3D tuples. Update tests that construct PESnapshot directly to include `match_data` and `lane_count` fields.
260260+261261+**Verification:**
262262+Run: `python -m pytest tests/ -v -x`
263263+Expected: All tests pass.
264264+265265+**Commit:** `jj commit -m "feat: update PESnapshot for 3D match storage and match_data field"`
266266+<!-- END_TASK_3 -->
267267+268268+<!-- END_SUBCOMPONENT_A -->
···11+# Frame Matching Lanes Implementation Plan
22+33+**Goal:** Extend the PE's frame-based matching to support multiple simultaneous pending operands per instruction within a single activation via matching lanes.
44+55+**Architecture:** Multiple `activation_id` values share one physical frame (constants/destinations) while maintaining independent matching state per lane. Tag store maps `act_id → (frame_id, lane)`. Match data, presence, and port storage gain a lane dimension.
66+77+**Tech Stack:** Python 3.12, SimPy 4.1, pytest + hypothesis
88+99+**Scope:** 6 phases from original design (phases 1-6)
1010+1111+**Codebase verified:** 2026-03-07
1212+1313+---
1414+1515+## Acceptance Criteria Coverage
1616+1717+This phase implements and tests:
1818+1919+### frame-lanes.AC3: FrameOp Extensions
2020+- **frame-lanes.AC3.1 Success:** `FrameOp.ALLOC_SHARED` added. When received, PE looks up `parent_act_id` (from payload), finds parent's `frame_id`, assigns next free lane from that frame's lane pool, records `tag_store[act_id] = (frame_id, lane)`. Clears only that lane's presence/port bits.
2121+- **frame-lanes.AC3.2 Success:** `FrameOp.FREE_LANE` added. Removes tag_store entry, clears that lane's presence/port/match_data across all matchable offsets. Does NOT return frame to free list.
2222+- **frame-lanes.AC3.3 Success:** `FrameOp.FREE` (existing) becomes smart: removes tag_store entry, clears lane data. If no other tag_store entries reference the same frame_id, returns frame to free list and clears frame slots. If other entries exist, behaves like FREE_LANE.
2323+- **frame-lanes.AC3.4 Success:** `FrameOp.ALLOC` (existing) unchanged — allocates fresh frame, assigns lane 0.
2424+- **frame-lanes.AC3.5 Success:** `FrameAllocated` event gains `lane: int` field. `FrameFreed` event gains `lane: int` and `frame_freed: bool` fields.
2525+- **frame-lanes.AC3.6 Success:** When all lanes for a frame are occupied and ALLOC_SHARED is received, PE emits `TokenRejected` with reason "no free lanes" and drops the token.
2626+2727+### frame-lanes.AC8: Test Coverage (partial)
2828+- **frame-lanes.AC8.1 Success:** Test: two act_ids sharing a frame via ALLOC_SHARED have independent matching — L operand for act_id 0 does not interfere with L operand for act_id 1 at the same offset.
2929+- **frame-lanes.AC8.2 Success:** Test: ALLOC_SHARED with all lanes occupied emits TokenRejected.
3030+- **frame-lanes.AC8.3 Success:** Test: FREE on a shared frame frees only the lane; other lanes' data is preserved. FREE on last lane frees the frame.
3131+3232+---
3333+3434+<!-- START_SUBCOMPONENT_A (tasks 1-2) -->
3535+3636+<!-- START_TASK_1 -->
3737+### Task 1: Add lane_free tracking and update FrameAllocated/FrameFreed events
3838+3939+**Verifies:** frame-lanes.AC3.5
4040+4141+**Files:**
4242+- Modify: `emu/events.py:84-97` — add lane fields to FrameAllocated and FrameFreed
4343+- Modify: `emu/pe.py` — add lane_free data structure, update all event emissions
4444+- Modify: `tests/test_pe_events.py` — update assertions for new event fields
4545+- Modify: `tests/test_pe_frames.py` — update assertions for new event fields
4646+4747+**Implementation:**
4848+4949+**emu/events.py — Update event dataclasses:**
5050+5151+```python
5252+@dataclass(frozen=True)
5353+class FrameAllocated:
5454+    time: float
5555+    component: str
5656+    act_id: int
5757+    frame_id: int
5858+    lane: int
5959+6060+@dataclass(frozen=True)
6161+class FrameFreed:
6262+    time: float
6363+    component: str
6464+    act_id: int
6565+    frame_id: int
6666+    lane: int
6767+    frame_freed: bool
6868+```
6969+7070+**emu/pe.py — Add lane_free tracking in constructor:**
7171+7272+After the `free_frames` initialization (line 87), add:
7373+7474+```python
7575+# Lane tracking: which lanes are free per frame
7676+self.lane_free: dict[int, set[int]] = {}
7777+```
7878+7979+`lane_free` is populated lazily — when a frame is allocated via ALLOC, its lanes are set up.
8080+8181+**emu/pe.py — Update existing ALLOC handler event emission:**
8282+8383+After Phase 1, ALLOC stores `(frame_id, 0)` in tag_store. Update to also set up lane tracking and emit `lane=0`:
8484+8585+```python
8686+    if token.op == FrameOp.ALLOC:
8787+        if self.free_frames:
8888+            frame_id = self.free_frames.pop()
8989+            self.tag_store[token.act_id] = (frame_id, 0)
9090+            # Set up lane tracking: lane 0 is taken, rest are free
9191+            self.lane_free[frame_id] = set(range(1, self.lane_count))
9292+            # Initialize frame slots to None
9393+            for i in range(self.frame_slots):
9494+                self.frames[frame_id][i] = None
9595+            # Reset all lanes' match state
9696+            for i in range(self.matchable_offsets):
9797+                for ln in range(self.lane_count):
9898+                    self.match_data[frame_id][i][ln] = None
9999+                    self.presence[frame_id][i][ln] = False
100100+                    self.port_store[frame_id][i][ln] = None
101101+            self._on_event(FrameAllocated(
102102+                time=self.env.now, component=self._component,
103103+                act_id=token.act_id, frame_id=frame_id, lane=0,
104104+            ))
105105+        else:
106106+            logger.warning(f"PE {self.pe_id}: no free frames available")
107107+```
108108+109109+**emu/pe.py — Update existing FREE handler for smart behaviour:**
110110+111111+Note: This replaces the Phase 1 FREE handler wholesale. The Phase 1 version only added tuple unpacking (`frame_id, _lane = self.tag_store.pop(...)`). This version adds lane data clearing, frame-in-use checking, and conditional frame return.
112112+113113+```python
114114+    elif token.op == FrameOp.FREE:
115115+        if token.act_id in self.tag_store:
116116+            frame_id, lane = self.tag_store.pop(token.act_id)
117117+            # Clear this lane's match state
118118+            for i in range(self.matchable_offsets):
119119+                self.match_data[frame_id][i][lane] = None
120120+                self.presence[frame_id][i][lane] = False
121121+                self.port_store[frame_id][i][lane] = None
122122+            # Check if any other activations use this frame
123123+            frame_in_use = any(fid == frame_id for fid, _ in self.tag_store.values())
124124+            if frame_in_use:
125125+                # Return lane to pool, keep frame
126126+                self.lane_free[frame_id].add(lane)
127127+                self._on_event(FrameFreed(
128128+                    time=self.env.now, component=self._component,
129129+                    act_id=token.act_id, frame_id=frame_id,
130130+                    lane=lane, frame_freed=False,
131131+                ))
132132+            else:
133133+                # Last lane: return frame to free list
134134+                self.free_frames.append(frame_id)
135135+                if frame_id in self.lane_free:
136136+                    del self.lane_free[frame_id]
137137+                # Clear frame slots
138138+                for i in range(self.frame_slots):
139139+                    self.frames[frame_id][i] = None
140140+                self._on_event(FrameFreed(
141141+                    time=self.env.now, component=self._component,
142142+                    act_id=token.act_id, frame_id=frame_id,
143143+                    lane=lane, frame_freed=True,
144144+                ))
145145+```
146146+147147+**emu/pe.py — Update FREE_FRAME opcode handler (lines 259-266):**
148148+149149+Same smart free logic applies here:
150150+151151+```python
152152+if token.act_id in self.tag_store:
153153+    freed_frame, lane = self.tag_store.pop(token.act_id)
154154+    # Clear this lane's match state
155155+    for i in range(self.matchable_offsets):
156156+        self.match_data[freed_frame][i][lane] = None
157157+        self.presence[freed_frame][i][lane] = False
158158+        self.port_store[freed_frame][i][lane] = None
159159+    # Check if any other activations use this frame
160160+    frame_in_use = any(fid == freed_frame for fid, _ in self.tag_store.values())
161161+    if frame_in_use:
162162+        self.lane_free[freed_frame].add(lane)
163163+        self._on_event(FrameFreed(
164164+            time=self.env.now, component=self._component,
165165+            act_id=token.act_id, frame_id=freed_frame,
166166+            lane=lane, frame_freed=False,
167167+        ))
168168+    else:
169169+        self.free_frames.append(freed_frame)
170170+        if freed_frame in self.lane_free:
171171+            del self.lane_free[freed_frame]
172172+        for i in range(self.frame_slots):
173173+            self.frames[freed_frame][i] = None
174174+        self._on_event(FrameFreed(
175175+            time=self.env.now, component=self._component,
176176+            act_id=token.act_id, frame_id=freed_frame,
177177+            lane=lane, frame_freed=True,
178178+        ))
179179+```
180180+181181+**Also update constructor initialisation for pre-loaded tag_store entries:**
182182+183183+After Phase 1, the constructor removes allocated frames from `free_frames`. Also initialise `lane_free` for those frames:
184184+185185+```python
186186+for act_id, (frame_id, lane) in self.tag_store.items():
187187+    if frame_id in self.free_frames:
188188+        self.free_frames.remove(frame_id)
189189+    if frame_id not in self.lane_free:
190190+        # First time seeing this frame: set up lane tracking
191191+        all_lanes = set(range(self.lane_count))
192192+        self.lane_free[frame_id] = all_lanes - {lane}
193193+    else:
194194+        self.lane_free[frame_id].discard(lane)
195195+```
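
To make the effect of that loop concrete, here is a stand-alone sketch that derives the free-lane sets from a pre-loaded tag store. The function name `build_lane_free` and the sample values are assumptions for illustration; the real constructor works on `self.tag_store` and `self.lane_free` directly.

```python
def build_lane_free(tag_store: dict[int, tuple[int, int]],
                    lane_count: int) -> dict[int, set[int]]:
    """Return {frame_id: free lanes} for every frame referenced by tag_store."""
    lane_free: dict[int, set[int]] = {}
    for _act_id, (frame_id, lane) in tag_store.items():
        # Start from "all lanes free" the first time a frame is seen,
        # then remove each lane that an activation already occupies.
        lane_free.setdefault(frame_id, set(range(lane_count))).discard(lane)
    return lane_free


# Two activations share frame 2 (lanes 0 and 1); one activation owns frame 5.
preloaded = {0: (2, 0), 1: (2, 1), 7: (5, 0)}
free_by_frame = build_lane_free(preloaded, lane_count=4)
```

Frames that no activation references get no entry, which matches the lazy population described for `lane_free`.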
196196+197197+**Test updates:**
198198+199199+All existing tests that assert on FrameAllocated or FrameFreed events need updated assertions to include the new fields. For existing single-activation tests:
200200+- `FrameAllocated` assertions add `lane=0`
201201+- `FrameFreed` assertions add `lane=0, frame_freed=True`
202202+203203+Specific test files affected:
204204+- `tests/test_pe_frames.py` lines 505-507 (test_alloc_remote), 550-552 (test_free_frame_opcode), 122-125 (test_free_frame_control_token)
205205+- `tests/test_pe_events.py` — any tests asserting on FrameAllocated/FrameFreed event fields
206206+207207+**Verification:**
208208+Run: `python -m pytest tests/ -v -x`
209209+Expected: All existing tests pass with updated event assertions.
210210+211211+**Commit:** `jj commit -m "feat: add lane tracking and update FrameAllocated/FrameFreed events with lane fields"`
212212+<!-- END_TASK_1 -->
213213+214214+<!-- START_TASK_2 -->
215215+### Task 2: Implement ALLOC_SHARED and FREE_LANE handlers
216216+217217+**Verifies:** frame-lanes.AC3.1, frame-lanes.AC3.2, frame-lanes.AC3.3, frame-lanes.AC3.4, frame-lanes.AC3.6
218218+219219+**Files:**
220220+- Modify: `emu/pe.py` — add ALLOC_SHARED and FREE_LANE cases to `_handle_frame_control()`
221221+222222+**Implementation:**
223223+224224+Add two new cases to `_handle_frame_control()` after the existing ALLOC and FREE handlers:
225225+226226+```python
227227+    elif token.op == FrameOp.ALLOC_SHARED:
228228+        # Shared allocation: find parent's frame, assign next free lane
229229+        parent_act_id = token.payload
230230+        if parent_act_id not in self.tag_store:
231231+            self._on_event(TokenRejected(
232232+                time=self.env.now, component=self._component,
233233+                token=token, reason=f"parent act_id {parent_act_id} not in tag store",
234234+            ))
235235+            return
236236+        parent_frame_id, _ = self.tag_store[parent_act_id]
237237+        free_lanes = self.lane_free.get(parent_frame_id, set())
238238+        if not free_lanes:
239239+            self._on_event(TokenRejected(
240240+                time=self.env.now, component=self._component,
241241+                token=token, reason="no free lanes",
242242+            ))
243243+            return
244244+        lane = min(free_lanes)  # Deterministic: pick lowest free lane
245245+        free_lanes.remove(lane)
246246+        self.tag_store[token.act_id] = (parent_frame_id, lane)
247247+        # Clear only this lane's match state
248248+        for i in range(self.matchable_offsets):
249249+            self.match_data[parent_frame_id][i][lane] = None
250250+            self.presence[parent_frame_id][i][lane] = False
251251+            self.port_store[parent_frame_id][i][lane] = None
252252+        self._on_event(FrameAllocated(
253253+            time=self.env.now, component=self._component,
254254+            act_id=token.act_id, frame_id=parent_frame_id, lane=lane,
255255+        ))
256256+257257+    elif token.op == FrameOp.FREE_LANE:
258258+        # Free lane only: never returns frame to free list
259259+        if token.act_id in self.tag_store:
260260+            frame_id, lane = self.tag_store.pop(token.act_id)
261261+            for i in range(self.matchable_offsets):
262262+                self.match_data[frame_id][i][lane] = None
263263+                self.presence[frame_id][i][lane] = False
264264+                self.port_store[frame_id][i][lane] = None
265265+            self.lane_free[frame_id].add(lane)
266266+            self._on_event(FrameFreed(
267267+                time=self.env.now, component=self._component,
268268+                act_id=token.act_id, frame_id=frame_id,
269269+                lane=lane, frame_freed=False,
270270+            ))
271271+```
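
The lane-selection part of ALLOC_SHARED can be traced in isolation. The sketch below models only the bookkeeping (tag store and free-lane sets) and returns `None` where the handler would emit `TokenRejected`. The function name `alloc_shared` and the plain-dict state are assumptions for illustration, not the PE's actual interface.

```python
from typing import Optional


def alloc_shared(tag_store: dict[int, tuple[int, int]],
                 lane_free: dict[int, set[int]],
                 act_id: int, parent_act_id: int) -> Optional[int]:
    """Assign the lowest free lane on the parent's frame; None means rejected."""
    if parent_act_id not in tag_store:
        return None  # handler would reject: parent not in tag store
    frame_id, _ = tag_store[parent_act_id]
    free_lanes = lane_free.get(frame_id, set())
    if not free_lanes:
        return None  # handler would reject: "no free lanes"
    lane = min(free_lanes)  # deterministic choice, as in the handler
    free_lanes.remove(lane)
    tag_store[act_id] = (frame_id, lane)
    return lane


# Frame 3 with lane_count = 3: parent act_id 0 holds lane 0, lanes 1 and 2 are free.
tags = {0: (3, 0)}
free = {3: {1, 2}}
lane_a = alloc_shared(tags, free, act_id=1, parent_act_id=0)
lane_b = alloc_shared(tags, free, act_id=2, parent_act_id=0)
lane_c = alloc_shared(tags, free, act_id=4, parent_act_id=0)   # all lanes taken
orphan = alloc_shared(tags, free, act_id=5, parent_act_id=99)  # unknown parent
```

The third request finds an empty free-lane set, which corresponds to the lane exhaustion case in AC3.6.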
272272+273273+**Testing:**
275275+276276+No new tests in this task. AC3.1, AC3.2, and AC3.6 are tested in Task 3; AC3.3 and AC3.4 are tested in Task 4.
276276+277277+**Verification:**
278278+Run: `python -m pytest tests/ -v -x`
279279+Expected: All existing tests pass (new handlers only activate for new FrameOp values).
280280+281281+**Commit:** `jj commit -m "feat: implement ALLOC_SHARED and FREE_LANE frame control handlers"`
282282+<!-- END_TASK_2 -->
283283+284284+<!-- END_SUBCOMPONENT_A -->
285285+286286+<!-- START_SUBCOMPONENT_B (tasks 3-4) -->
287287+288288+<!-- START_TASK_3 -->
289289+### Task 3: Tests for ALLOC_SHARED, lane exhaustion, and FREE_LANE
290290+291291+**Verifies:** frame-lanes.AC3.1, frame-lanes.AC3.2, frame-lanes.AC3.6, frame-lanes.AC8.1, frame-lanes.AC8.2
292292+293293+**Files:**
294294+- Create: `tests/test_pe_lanes.py`
295295+296296+**Implementation:**
297297+298298+Create a new test file dedicated to lane functionality. Follow the existing test patterns from `tests/test_pe_frames.py`:
299299+- `simpy.Environment()` setup
300300+- `PEConfig` with `on_event=events.append`
301301+- `ProcessingElement(env, pe_id, config)` construction
302302+- Token injection via `pe.input_store.put(token)` in a SimPy process
303303+- Event collection via the events list
304304+305305+**Testing:**
306306+307307+Tests must verify these specific AC cases:
308308+309309+- **frame-lanes.AC3.1 (ALLOC_SHARED):** Send `FrameControlToken(op=FrameOp.ALLOC, act_id=0, payload=0)` to allocate a frame. Then send `FrameControlToken(op=FrameOp.ALLOC_SHARED, act_id=1, payload=0)` where payload is the parent act_id. Verify `tag_store[1]` has the same `frame_id` as `tag_store[0]` but a different lane. Verify `FrameAllocated` event has correct lane.
310310+311311+- **frame-lanes.AC3.2 (FREE_LANE):** After ALLOC_SHARED, send `FrameControlToken(op=FrameOp.FREE_LANE, act_id=1, payload=0)`. Verify `tag_store` no longer has act_id 1. Verify act_id 0 is still present. Verify frame is NOT in `free_frames`. Verify `FrameFreed` event has `frame_freed=False`.
312312+313313+- **frame-lanes.AC3.6 (lane exhaustion):** Allocate a frame, then ALLOC_SHARED until all `lane_count` lanes are occupied. Send one more ALLOC_SHARED. Verify `TokenRejected` event with reason "no free lanes".
314314+315315+- **frame-lanes.AC8.1 (independent matching):** Set up two act_ids sharing a frame (ALLOC + ALLOC_SHARED). Load a dyadic instruction into IRAM. Send an L operand via DyadToken for act_id 0 and an L operand via DyadToken for act_id 1 at the same offset. Verify both presence bits are set independently: neither token triggers a match, because each is waiting for its own R partner. Then send R for act_id 0 and verify that only act_id 0 matches and fires while act_id 1's L is still pending.
316316+317317+- **frame-lanes.AC8.2 (exhaustion):** Same as AC3.6 but via the test coverage AC numbering — allocate all lanes, attempt one more, verify TokenRejected.
318318+319319+**Verification:**
320320+Run: `python -m pytest tests/test_pe_lanes.py -v`
321321+Expected: All new tests pass.
322322+323323+Run: `python -m pytest tests/ -v -x`
324324+Expected: All tests pass (new and existing).
325325+326326+**Commit:** `jj commit -m "test: add tests for ALLOC_SHARED, FREE_LANE, and lane exhaustion"`
327327+<!-- END_TASK_3 -->
328328+329329+<!-- START_TASK_4 -->
330330+### Task 4: Tests for smart FREE behaviour
331331+332332+**Verifies:** frame-lanes.AC3.3, frame-lanes.AC3.4, frame-lanes.AC8.3
333333+334334+**Files:**
335335+- Modify: `tests/test_pe_lanes.py` — add smart FREE test class
336336+337337+**Implementation:**
338338+339339+Add tests to the lane test file created in Task 3.
340340+341341+**Testing:**
342342+343343+Tests must verify these specific AC cases:
344344+345345+- **frame-lanes.AC3.3 (smart FREE on shared frame):** Allocate a frame (act_id=0, lane 0). ALLOC_SHARED (act_id=1, lane 1). Send a single unpaired operand (for example, L only) to act_id=1 so that a presence bit is set on lane 1. FREE act_id=0. Verify: act_id=0 removed from tag_store, act_id=1 still present, frame NOT in free_frames, act_id=1's pending match data is preserved (presence bit still True for lane 1). Verify `FrameFreed` event has `frame_freed=False`.
346346+347347+- **frame-lanes.AC3.3 (smart FREE on last lane):** Same setup. FREE act_id=0, then FREE act_id=1. After second FREE: frame IS returned to free_frames, `lane_free` entry for that frame is cleaned up. Verify `FrameFreed` event has `frame_freed=True`.
348348+349349+- **frame-lanes.AC3.4 (ALLOC unchanged):** Verify that regular ALLOC still works — allocates fresh frame, assigns lane 0, no parent required. This is a regression check.
350350+351351+- **frame-lanes.AC8.3 (data preservation):** Set up shared frame with two act_ids. Store a DyadToken L operand on act_id=1's lane. FREE act_id=0. Verify act_id=1's match_data and presence are untouched — the pending operand is still there.
352352+353353+**Verification:**
354354+Run: `python -m pytest tests/test_pe_lanes.py -v`
355355+Expected: All tests pass.
356356+357357+Run: `python -m pytest tests/ -v -x`
358358+Expected: All tests pass.
359359+360360+**Commit:** `jj commit -m "test: add tests for smart FREE behaviour and data preservation across lanes"`
361361+<!-- END_TASK_4 -->
362362+363363+<!-- END_SUBCOMPONENT_B -->
···11+# Frame Matching Lanes Implementation Plan
22+33+**Goal:** Extend the PE's frame-based matching to support multiple simultaneous pending operands per instruction within a single activation via matching lanes.
44+55+**Architecture:** Multiple `activation_id` values share one physical frame (constants/destinations) while maintaining independent matching state per lane. Tag store maps `act_id → (frame_id, lane)`. Match data, presence, and port storage gain a lane dimension.
66+77+**Tech Stack:** Python 3.12, SimPy 4.1, pytest + hypothesis
88+99+**Scope:** 6 phases from original design (phases 1-6)
1010+1111+**Codebase verified:** 2026-03-07
1212+1313+---
1414+1515+## Acceptance Criteria Coverage
1616+1717+This phase implements and tests:
1818+1919+### frame-lanes.AC4: ALLOC_REMOTE Data-Driven
2020+- **frame-lanes.AC4.1 Success:** ALLOC_REMOTE reads `fref+2` from frame. If value is non-zero, emits `FrameControlToken` with `op=ALLOC_SHARED` and `payload=parent_act_id`. If zero, emits `op=ALLOC` as before.
2121+- **frame-lanes.AC4.2 Success:** No new opcodes. Behaviour is entirely data-driven from frame constants.
2222+2323+### frame-lanes.AC5: FREE_FRAME Instruction
2424+- **frame-lanes.AC5.1 Success:** `FREE_FRAME` opcode uses the smart FREE behaviour from AC3.3. Frees the executing token's activation lane; returns frame to free list only if last lane.
2525+2626+### frame-lanes.AC8: Test Coverage (partial)
2727+- **frame-lanes.AC8.4 Success:** Test: ALLOC_REMOTE emits ALLOC_SHARED when `fref+2` is non-zero.
2828+- **frame-lanes.AC8.5 Success:** Test: ALLOC_REMOTE emits ALLOC when `fref+2` is zero (backwards compatible).
2929+3030+---
3131+3232+<!-- START_SUBCOMPONENT_A (tasks 1-2) -->
3333+3434+<!-- START_TASK_1 -->
3535+### Task 1: Update ALLOC_REMOTE to read fref+2 for data-driven shared allocation
3636+3737+**Verifies:** frame-lanes.AC4.1, frame-lanes.AC4.2
3838+3939+**Files:**
4040+- Modify: `emu/pe.py:231-248` — ALLOC_REMOTE handler in `_process_token()`
4141+4242+**Implementation:**
4343+4444+The current ALLOC_REMOTE handler reads `fref+0` (target PE) and `fref+1` (target act_id) from frame constants, then emits a `FrameControlToken(op=FrameOp.ALLOC)`. Extend it to also read `fref+2` (parent act_id for shared allocation).
4545+4646+**Updated handler:**
4747+4848+```python
4949+    elif inst.opcode == RoutingOp.ALLOC_REMOTE:
5050+        # PE-level: read target PE, act_id, and optional parent act_id from frame constants
5151+        # fref+0: target PE
5252+        # fref+1: target act_id
5353+        # fref+2: parent act_id (0 = fresh ALLOC, non-zero = ALLOC_SHARED)
5454+        target_pe = self.frames[frame_id][inst.fref] if inst.fref < len(self.frames[frame_id]) else 0
5555+        target_act = self.frames[frame_id][inst.fref + 1] if inst.fref + 1 < len(self.frames[frame_id]) else 0
5656+        parent_act = self.frames[frame_id][inst.fref + 2] if inst.fref + 2 < len(self.frames[frame_id]) else 0
5757+5858+        if parent_act:
5959+            alloc_op = FrameOp.ALLOC_SHARED
6060+            payload = parent_act
6161+        else:
6262+            alloc_op = FrameOp.ALLOC
6363+            payload = 0
6464+6565+        fct = FrameControlToken(
6666+            target=target_pe,
6767+            act_id=target_act,
6868+            op=alloc_op,
6969+            payload=payload,
7070+        )
7171+        self._on_event(Executed(
7272+            time=self.env.now, component=self._component,
7373+            op=inst.opcode, result=0, bool_out=False,
7474+        ))
7575+        yield self.env.timeout(1)  # EXECUTE cycle
7676+        yield self.env.timeout(1)  # EMIT cycle
7777+        self.env.process(self._deliver(self.route_table[target_pe], fct))
7878+```
7979+8080+Key changes from current code:
8181+- Added `parent_act` read from `fref+2`
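8282+- Conditional: if `parent_act` is non-zero, use `ALLOC_SHARED` with `payload=parent_act`; otherwise, use `ALLOC` with `payload=0` (backwards compatible). Because zero is the sentinel for a fresh ALLOC, an activation with act_id 0 cannot be named as a parent through ALLOC_REMOTE.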
8383+- No new opcodes (AC4.2)
8484+- Note: Frame slots at `fref+0`, `fref+1`, `fref+2` must be `int` values (not `FrameDest`). The codegen guarantees this for properly assembled programs. No runtime type check added, consistent with the existing ALLOC_REMOTE pattern at `fref+0` and `fref+1`.
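
The data-driven choice can be isolated into a few lines. The sketch below is an illustration under stated assumptions: the `FrameOp` enum here is a local stand-in whose numeric values are guesses (the real values live in the codebase), and `select_alloc_op` is a hypothetical helper, not a function in `emu/pe.py`.

```python
from enum import IntEnum


class FrameOp(IntEnum):
    # Stand-in enum: member names follow the plan, numeric values are assumed.
    ALLOC = 0
    FREE = 1
    ALLOC_SHARED = 2
    FREE_LANE = 3


def select_alloc_op(frame: list, fref: int) -> tuple[FrameOp, int]:
    """Pick the control op and payload from the frame slot at fref+2."""
    parent_act = frame[fref + 2] if fref + 2 < len(frame) else 0
    if parent_act:  # 0 and None are both falsy, so both mean a fresh ALLOC
        return FrameOp.ALLOC_SHARED, parent_act
    return FrameOp.ALLOC, 0


shared = select_alloc_op([1, 5, 3], fref=0)      # fref+2 holds parent act_id 3
fresh_zero = select_alloc_op([1, 5, 0], fref=0)  # explicit zero
fresh_none = select_alloc_op([1, 5, None], fref=0)  # slot never written
fresh_short = select_alloc_op([1, 5], fref=0)    # fref+2 is out of range
```

All three "fresh" cases yield the same result, which is why existing programs that populate only `fref+0` and `fref+1` keep their behaviour.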
8585+8686+**Testing:**
8787+8888+No new tests in this task — AC4.1 and AC4.2 are tested in Task 3.
8989+9090+**Verification:**
9191+Run: `python -m pytest tests/ -v -x`
9292+Expected: All existing tests pass. Existing ALLOC_REMOTE tests set up frames with only `fref+0` and `fref+1` populated — `fref+2` defaults to `None` which is falsy, so existing tests get `op=ALLOC` as before.
9393+9494+**Commit:** `jj commit -m "feat: ALLOC_REMOTE reads fref+2 for data-driven ALLOC_SHARED"`
9595+<!-- END_TASK_1 -->
9696+9797+<!-- START_TASK_2 -->
9898+### Task 2: Verify FREE_FRAME uses smart FREE behaviour
9999+100100+**Verifies:** frame-lanes.AC5.1
101101+102102+**Files:**
103103+- Verify: `emu/pe.py:249-267` — FREE_FRAME handler
104104+105105+**Implementation:**
106106+107107+After Phase 3, the FREE_FRAME opcode handler in `_process_token()` already uses the smart FREE behaviour (tag_store.pop with tuple unpacking, lane data clearing, frame-in-use check). This task verifies that the Phase 3 changes correctly cover the FREE_FRAME path.
108108+109109+If Phase 3 was implemented correctly, the FREE_FRAME handler at lines 260-266 should already:
110110+1. Unpack `frame_id, lane = self.tag_store.pop(token.act_id)`
111111+2. Clear the lane's match_data/presence/port_store
112112+3. Check if other activations reference the same frame_id
113113+4. Return frame to free_frames only if last lane
114114+5. Emit FrameFreed with `lane` and `frame_freed` fields
115115+116116+If the Phase 3 implementation only updated `_handle_frame_control()` FREE and forgot the FREE_FRAME opcode path, this task is where you fix it. Both paths must have identical smart FREE logic.
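
One possible way to keep the two paths from drifting apart is to route both through a single helper. The sketch below models only the bookkeeping and leaves out clearing match state and frame slots, which the real handlers also perform. The name `smart_free`, its signature, and the plain-dict state are assumptions for illustration.

```python
from typing import Optional


def smart_free(tag_store: dict[int, tuple[int, int]],
               lane_free: dict[int, set[int]],
               free_frames: list[int],
               act_id: int) -> Optional[tuple[int, int, bool]]:
    """Free act_id's lane; return (frame_id, lane, frame_freed) or None if unknown."""
    if act_id not in tag_store:
        return None
    frame_id, lane = tag_store.pop(act_id)
    frame_in_use = any(fid == frame_id for fid, _ in tag_store.values())
    if frame_in_use:
        # Other activations still share the frame: recycle only the lane.
        lane_free[frame_id].add(lane)
        return frame_id, lane, False
    # Last lane on the frame: return the whole frame to the free list.
    free_frames.append(frame_id)
    lane_free.pop(frame_id, None)
    return frame_id, lane, True


tags = {0: (2, 0), 1: (2, 1)}
lanes = {2: {2, 3}}
frames_free: list[int] = []
first = smart_free(tags, lanes, frames_free, act_id=0)
lanes_after_first = set(lanes[2])
second = smart_free(tags, lanes, frames_free, act_id=1)
missing = smart_free(tags, lanes, frames_free, act_id=9)
```

The returned triple carries exactly the fields each caller would need to emit `FrameFreed` with `lane` and `frame_freed`.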
117117+118118+**Testing:**
119119+120120+No new dedicated tests in this task. AC5.1 is AC3.3 applied to a different code path. The existing `test_free_frame_opcode` test in `tests/test_pe_frames.py` verifies the single-lane path; the AC8.3 tests in Phase 3 cover the same smart FREE logic through the FrameControlToken path, and Task 4 of this phase exercises the opcode path on a shared frame.
121121+122122+**Verification:**
123123+Run: `python -m pytest tests/test_pe_frames.py -v -k "free_frame"`
124124+Expected: All FREE_FRAME tests pass.
125125+126126+**Commit:** No commit needed if Phase 3 already handled this path. If a fix is needed: `jj commit -m "fix: ensure FREE_FRAME opcode uses smart FREE behaviour"`
127127+<!-- END_TASK_2 -->
128128+129129+<!-- END_SUBCOMPONENT_A -->
130130+131131+<!-- START_SUBCOMPONENT_B (tasks 3-4) -->
132132+133133+<!-- START_TASK_3 -->
134134+### Task 3: Tests for data-driven ALLOC_REMOTE
135135+136136+**Verifies:** frame-lanes.AC8.4, frame-lanes.AC8.5
137137+138138+**Files:**
139139+- Modify: `tests/test_pe_lanes.py` — add ALLOC_REMOTE data-driven tests
140140+141141+**Implementation:**
142142+143143+Add tests to the lane test file.
144144+145145+**Testing:**
146146+147147+Tests must verify these specific AC cases:
148148+149149+- **frame-lanes.AC8.4 (ALLOC_REMOTE emits ALLOC_SHARED):** Set up PE0 with an allocated frame. Write frame slots: `fref+0 = 1` (target PE 1), `fref+1 = 5` (target act_id), `fref+2 = 3` (parent act_id, non-zero). Load an ALLOC_REMOTE instruction. Give PE0 a route table entry for PE 1 so the emitted token can be captured, then send a MonadToken to trigger the instruction. Verify the FrameControlToken sent to PE 1 has `op=FrameOp.ALLOC_SHARED` and `payload=3`.
150150+151151+ For capturing the emitted FrameControlToken: use `output_store = simpy.Store(env)` and set `pe.route_table[1] = output_store`, then check `output_store.items[0]`.
152152+153153+- **frame-lanes.AC8.5 (ALLOC_REMOTE emits ALLOC when fref+2 is zero):** Same setup but `fref+2 = 0` (or slot is None). Verify the FrameControlToken has `op=FrameOp.ALLOC` and `payload=0`. This is the backwards-compatible path.
154154+155155+**Verification:**
156156+Run: `python -m pytest tests/test_pe_lanes.py -v -k "alloc_remote"`
157157+Expected: All new tests pass.
158158+159159+Run: `python -m pytest tests/ -v -x`
160160+Expected: All tests pass.
161161+162162+**Commit:** `jj commit -m "test: add tests for data-driven ALLOC_REMOTE (ALLOC_SHARED vs ALLOC)"`
163163+<!-- END_TASK_3 -->
164164+165165+<!-- START_TASK_4 -->
166166+### Task 4: Test FREE_FRAME with shared frame (smart FREE via opcode)
167167+168168+**Verifies:** frame-lanes.AC5.1
169169+170170+**Files:**
171171+- Modify: `tests/test_pe_lanes.py` — add FREE_FRAME smart free test
172172+173173+**Implementation:**
174174+175175+Add a test that exercises the FREE_FRAME opcode path specifically (not the FrameControlToken FREE path, which is already tested in Phase 3).
176176+177177+**Testing:**
178178+179179+- **frame-lanes.AC5.1 (FREE_FRAME smart free):** Set up PE with a shared frame (two act_ids on the same frame via initial_tag_store with different lanes). Load FREE_FRAME instruction at the offset used by act_id=0. Send a MonadToken to act_id=0 to trigger FREE_FRAME execution. Verify: act_id=0's lane is freed, act_id=1 is still in tag_store with the same frame, frame is NOT in free_frames, FrameFreed event has `frame_freed=False`. Then trigger FREE_FRAME for act_id=1. Verify: frame IS returned to free_frames, FrameFreed event has `frame_freed=True`.
180180+181181+**Verification:**
182182+Run: `python -m pytest tests/test_pe_lanes.py -v -k "free_frame"`
183183+Expected: All tests pass.
184184+185185+Run: `python -m pytest tests/ -v -x`
186186+Expected: All tests pass.
187187+188188+**Commit:** `jj commit -m "test: add test for FREE_FRAME opcode with smart free on shared frame"`
189189+<!-- END_TASK_4 -->
190190+191191+<!-- END_SUBCOMPONENT_B -->
···11+# Frame Matching Lanes Implementation Plan
22+33+**Goal:** Extend the PE's frame-based matching to support multiple simultaneous pending operands per instruction within a single activation via matching lanes.
44+55+**Architecture:** Multiple `activation_id` values share one physical frame (constants/destinations) while maintaining independent matching state per lane. Tag store maps `act_id → (frame_id, lane)`. Match data, presence, and port storage gain a lane dimension.
66+77+**Tech Stack:** Python 3.12, SimPy 4.1, pytest + hypothesis
88+99+**Scope:** 6 phases from original design (phases 1-6)
1010+1111+**Codebase verified:** 2026-03-07
1212+1313+---
1414+1515+## Acceptance Criteria Coverage
1616+1717+This phase implements and tests:
1818+1919+### frame-lanes.AC6: Monitor and Snapshot Updates
2020+- **frame-lanes.AC6.1 Success:** `PESnapshot.tag_store` type becomes `dict[int, tuple[int, int]]`.
2121+- **frame-lanes.AC6.2 Success:** `PESnapshot` gains `match_data`, `lane_count` fields reflecting the separated match storage.
2222+- **frame-lanes.AC6.3 Success:** Monitor REPL `pe` command displays lane info in tag_store output.
2323+- **frame-lanes.AC6.4 Success:** Monitor graph JSON serialises lane info correctly.
2424+2525+### frame-lanes.AC7: Codegen Updates
2626+- **frame-lanes.AC7.1 Success:** `codegen.py` generates `initial_tag_store` with `(frame_id, lane)` tuples. Existing single-activation code uses lane 0.
2727+- **frame-lanes.AC7.2 Success:** No codegen changes needed for ALLOC_SHARED (manual construction only for now).
2828+2929+---
3030+3131+Note: AC6.1 and AC6.2 (PESnapshot type changes) are structurally completed in Phase 2 Task 3 as part of the match storage separation. This phase covers the remaining monitor/codegen pieces that depend on those type changes.
3232+3333+<!-- START_SUBCOMPONENT_A (tasks 1-2) -->
3434+3535+<!-- START_TASK_1 -->
3636+### Task 1: Update Monitor REPL formatting for lane info
3737+3838+**Verifies:** frame-lanes.AC6.3
3939+4040+**Files:**
4141+- Modify: `monitor/formatting.py:194-202` — `format_pe_state()` tag_store display
4242+4343+**Implementation:**
4444+4545+The current tag_store display at `monitor/formatting.py:194-202` formats entries as `{key: value}`. After Phase 1, tag_store values are `(frame_id, lane)` tuples. Update the formatting to show lane info clearly.
4646+4747+**Current code (lines 194-202):**
4848+```python
4949+if pe_snapshot.tag_store:
5050+ tag_str = ", ".join(
5151+ f"{colour(str(k), 'white')}: {colour(str(v), 'white')}"
5252+ for k, v in sorted(pe_snapshot.tag_store.items())
5353+ )
5454+ lines.append(f" Tag store: {{{tag_str}}}")
5555+else:
5656+ lines.append(" Tag store: (empty)")
5757+```
5858+5959+**Updated code:**
6060+```python
6161+if pe_snapshot.tag_store:
6262+ tag_str = ", ".join(
6363+ f"{colour(str(k), 'white')}: frame {colour(str(fid), 'white')} lane {colour(str(lane), 'white')}"
6464+ for k, (fid, lane) in sorted(pe_snapshot.tag_store.items())
6565+ )
6666+ lines.append(f" Tag store: {{{tag_str}}}")
6767+else:
6868+ lines.append(" Tag store: (empty)")
6969+```
7070+7171+This changes the display from `{0: 0}` to `{0: frame 0 lane 0}`, making lane assignments visible at a glance.
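
A standalone sketch of the updated formatting, runnable outside the monitor. The `colour` stub and the wrapper name `format_tag_store` are assumptions made for this example; the real `colour` helper in `monitor/formatting.py` presumably adds terminal escape codes, which the stub omits.

```python
def colour(text: str, name: str) -> str:
    """Stand-in for the monitor's colour helper (assumption): returns text unchanged."""
    return text


def format_tag_store(tag_store: dict[int, tuple[int, int]]) -> str:
    """Render one tag_store line using the lane-aware format from the updated code."""
    if not tag_store:
        return "  Tag store: (empty)"
    tag_str = ", ".join(
        f"{colour(str(act_id), 'white')}: "
        f"frame {colour(str(fid), 'white')} lane {colour(str(lane), 'white')}"
        for act_id, (fid, lane) in sorted(tag_store.items())
    )
    return f"  Tag store: {{{tag_str}}}"


# Two activations sharing frame 2 on different lanes.
print(format_tag_store({1: (2, 1), 0: (2, 0)}))
# prints "  Tag store: {0: frame 2 lane 0, 1: frame 2 lane 1}"
```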
7272+7373+**Testing:**
7474+7575+The REPL tests at `tests/test_repl.py:456-473` verify that `do_pe()` produces output but don't assert on specific tag_store formatting content. The formatting change is verified by manual inspection. Existing tests remain valid because they only check `len(out) > 0`.
7676+7777+**Verification:**
7878+Run: `python -m pytest tests/test_repl.py -v`
7979+Expected: All tests pass.
8080+8181+**Commit:** `jj commit -m "feat: display lane info in monitor REPL pe command"`
8282+<!-- END_TASK_1 -->
8383+8484+<!-- START_TASK_2 -->
8585+### Task 2: Update Monitor graph JSON for lane serialisation
8686+8787+**Verifies:** frame-lanes.AC6.4
8888+8989+**Files:**
9090+- Modify: `monitor/graph_json.py:104-127` — `_serialise_pe_state()` function
9191+- Modify: `tests/test_monitor_graph_json.py` — update PESnapshot constructions
9292+9393+**Implementation:**
9494+9595+The current `_serialise_pe_state()` at `monitor/graph_json.py:124` passes `tag_store` directly to JSON:
9696+```python
9797+"tag_store": pe_snap.tag_store,
9898+```
9999+100100+After Phase 1, tag_store values are `(frame_id, lane)` tuples. Use explicit dict format for self-documenting JSON that the TypeScript frontend can easily type:
101101+102102+```python
103103+"tag_store": {
104104+ str(act_id): {"frame_id": fid, "lane": lane}
105105+ for act_id, (fid, lane) in pe_snap.tag_store.items()
106106+},
107107+"lane_count": pe_snap.lane_count,
108108+```
109109+110110+This produces JSON like `{"0": {"frame_id": 2, "lane": 0}}` instead of `{"0": [2, 0]}`, which is more explicit and easier to type in TypeScript.
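
The difference between the two shapes can be checked with the standard `json` module. This is a minimal sketch; `serialise_tag_store` is a hypothetical helper name that isolates the dict comprehension above, not a function in `monitor/graph_json.py`.

```python
import json


def serialise_tag_store(tag_store: dict[int, tuple[int, int]]) -> dict[str, dict[str, int]]:
    """Convert act_id -> (frame_id, lane) into the explicit JSON-friendly form."""
    return {
        str(act_id): {"frame_id": fid, "lane": lane}
        for act_id, (fid, lane) in tag_store.items()
    }


tag_store = {0: (2, 0)}

# Explicit form: named fields.
explicit = json.dumps(serialise_tag_store(tag_store))
# Passing the tuples straight through: json turns them into arrays.
implicit = json.dumps(tag_store)

print(explicit)  # {"0": {"frame_id": 2, "lane": 0}}
print(implicit)  # {"0": [2, 0]}
```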
111111+112112+**Test updates:**
113113+114114+Update `tests/test_monitor_graph_json.py` PESnapshot constructions to include the new fields (`match_data`, `lane_count`). Tests that construct `PESnapshot` directly with `tag_store={}` will work as-is (empty dict). Tests that use non-empty tag_store must change values from `int` to `tuple[int, int]`.
115115+116116+**Verification:**
117117+Run: `python -m pytest tests/test_monitor_graph_json.py -v`
118118+Expected: All tests pass.
119119+120120+**Commit:** `jj commit -m "feat: serialise lane info in monitor graph JSON"`
121121+<!-- END_TASK_2 -->
122122+123123+<!-- END_SUBCOMPONENT_A -->
124124+125125+<!-- START_SUBCOMPONENT_B (tasks 3-4) -->
126126+127127+<!-- START_TASK_3 -->
128128+### Task 3: Update codegen to emit tuple initial_tag_store
129129+130130+**Verifies:** frame-lanes.AC7.1
131131+132132+**Files:**
133133+- Modify: `asm/codegen.py:383` — first initial_tag_store assignment
134134+- Modify: `asm/codegen.py:412` — second initial_tag_store assignment
135135+136136+**Implementation:**
137137+138138+The codegen at `asm/codegen.py` builds `initial_tag_store` as `dict[int, int]` mapping `act_id → frame_id`. Change both assignment sites to produce `dict[int, tuple[int, int]]` mapping `act_id → (frame_id, lane)` with lane 0 for all existing single-activation code.
139139+140140+**Line 383 (empty layout path):**
141141+```python
142142+initial_tag_store[act_id] = (frame_id, 0)
143143+```
144144+145145+**Line 412 (populated layout path):**
146146+```python
147147+initial_tag_store[act_id] = (frame_id, 0)
148148+```
149149+150150+No other changes needed. The type annotation for the local variable can be updated:
151151+```python
152152+initial_tag_store: dict[int, tuple[int, int]] = {}
153153+```
154154+155155+**Testing:**
156156+157157+Existing codegen tests at `tests/test_codegen_frames.py` don't explicitly assert on `initial_tag_store` contents — they test IRAM, setup_tokens, and seed_tokens. The type change is validated by the overall test suite passing (PEConfig now expects tuple values from Phase 1).
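
If an explicit check were ever wanted, the shape of the emitted mapping is simple to assert. The sketch below is hypothetical: `check_initial_tag_store` is not part of the test suite, and the activation-to-frame pairs are made-up inputs rather than real codegen output.

```python
def check_initial_tag_store(initial_tag_store: dict) -> None:
    """Hypothetical check: every value is a (frame_id, lane) tuple with lane 0."""
    for act_id, value in initial_tag_store.items():
        assert isinstance(value, tuple) and len(value) == 2, (act_id, value)
        frame_id, lane = value
        assert isinstance(frame_id, int) and lane == 0, (act_id, value)


# Codegen-style construction: each activation gets its own frame on lane 0.
initial_tag_store: dict[int, tuple[int, int]] = {}
for act_id, frame_id in [(0, 0), (1, 1)]:  # assumed assignments, for illustration
    initial_tag_store[act_id] = (frame_id, 0)

check_initial_tag_store(initial_tag_store)  # passes silently
```

A legacy `dict[int, int]` value such as `{0: 0}` would fail the first assertion, which is the regression this kind of check would catch.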
158158+159159+**Verification:**
160160+Run: `python -m pytest tests/test_codegen_frames.py -v`
161161+Expected: All tests pass.
162162+163163+Run: `python -m pytest tests/ -v -x`
164164+Expected: All tests pass.
165165+166166+**Commit:** `jj commit -m "feat: codegen emits initial_tag_store with (frame_id, lane) tuples"`
167167+<!-- END_TASK_3 -->
168168+169169+<!-- START_TASK_4 -->
170170+### Task 4: Verify no codegen changes needed for ALLOC_SHARED
171171+172172+**Verifies:** frame-lanes.AC7.2
173173+174174+**Files:**
175175+- No files to modify
176176+177177+**Implementation:**
178178+179179+AC7.2 states: "No codegen changes needed for ALLOC_SHARED (manual construction only for now)." This is a verification task — confirm that the codegen does not attempt to generate ALLOC_SHARED tokens or any lane-related control flow. ALLOC_SHARED is invoked only via manual test construction or hand-crafted assembly.
180180+181181+**Verification:**
182182+183183+Grep codegen for any reference to ALLOC_SHARED or FREE_LANE:
184184+```bash
185185+grep -r "ALLOC_SHARED\|FREE_LANE" asm/
186186+```
187187+Expected: No results. The codegen is unaware of these new FrameOp values.
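
The same absence check could be expressed in Python, for example if it were ever folded into the test suite. This is a sketch under assumptions: the function name `find_lane_op_references` is invented here, and unlike `grep -r` it only scans `*.py` files.

```python
from pathlib import Path

# Identifiers the codegen is expected not to mention.
FORBIDDEN = ("ALLOC_SHARED", "FREE_LANE")


def find_lane_op_references(root: Path) -> list[tuple[str, int, str]]:
    """Return (path, line number, stripped line) for every forbidden mention under root."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if any(name in line for name in FORBIDDEN):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Run from the repository root, `find_lane_op_references(Path("asm"))` returning an empty list would correspond to the grep producing no results.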
188188+189189+Run: `python -m pytest tests/ -v -x`
190190+Expected: All tests pass.
191191+192192+**Commit:** No commit needed — this is a verification-only task.
193193+<!-- END_TASK_4 -->
194194+195195+<!-- END_SUBCOMPONENT_B -->
···11+# Frame Matching Lanes Implementation Plan
22+33+**Goal:** Extend the PE's frame-based matching to support multiple simultaneous pending operands per instruction within a single activation via matching lanes.
44+55+**Architecture:** Multiple `activation_id` values share one physical frame (constants/destinations) while maintaining independent matching state per lane. Tag store maps `act_id → (frame_id, lane)`. Match data, presence, and port storage gain a lane dimension.
66+77+**Tech Stack:** Python 3.12, SimPy 4.1, pytest + hypothesis
88+99+**Scope:** 6 phases from original design (phases 1-6)
1010+1111+**Codebase verified:** 2026-03-07
1212+1313+---
1414+1515+## Acceptance Criteria Coverage
1616+1717+This phase implements and tests:
1818+1919+### frame-lanes.AC8: Test Coverage (final)
2020+- **frame-lanes.AC8.6 Success:** Test: full loop pipelining scenario — two iterations of a dyadic instruction running concurrently on different lanes, both producing correct results.
2121+2222+---
2323+2424+<!-- START_TASK_1 -->
2525+### Task 1: Full loop pipelining integration test
2626+2727+**Verifies:** frame-lanes.AC8.6
2828+2929+**Files:**
3030+- Modify: `tests/test_pe_lanes.py` — add integration test class
3131+3232+**Implementation:**
3333+3434+Add an integration test that simulates the complete loop pipelining lifecycle from the design plan's Architecture section. This test exercises every Phase 1-5 feature together.
3535+3636+**Testing:**
3737+3838+The test must verify frame-lanes.AC8.6 by simulating this lifecycle:
3939+4040+```
4141+1. ALLOC(act_id=0) → frame, lane 0
4242+2. Setup: write constants/dests to frame
4343+3. Iteration 1: inject L and R DyadTokens for act_id=0
4444+4. ALLOC_SHARED(act_id=1, parent=0) → same frame, lane 1
4545+5. Iteration 2: inject L and R DyadTokens for act_id=1
4646+6. Both iterations match independently, both produce correct results
4747+7. FREE(act_id=0) → lane 0 freed, frame stays
4848+8. FREE(act_id=1) → last lane, frame returned to free list
4949+```
5050+5151+**Test structure:**
5252+5353+Follow the established test patterns from `tests/test_pe_frames.py`:
5454+- Use `simpy.Environment()` and `PEConfig(on_event=events.append)`
5555+- Direct PE construction (no `build_topology` needed for single-PE test)
5656+- Use `inject_and_run()` helper for sequential token injection
5757+- Use a `simpy.Store` as `pe.route_table[target]` to capture output tokens
5858+5959+**Detailed scenario:**
6060+6161+1. **PE setup:** Create PE with `frame_count=4`, `matchable_offsets=4`, no pre-loaded frames or tag_store. Install a dyadic ADD instruction at IRAM offset 0 with `OutputStyle.INHERIT`, `dest_count=1`, `fref=8`.
6262+6363+2. **Allocate frame for iteration 1:**
6464+ - Inject `FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)`
6565+ - Verify `FrameAllocated` event with `lane=0`
6666+ - Get `frame_id` from `pe.tag_store[0]`
6767+6868+3. **Write destination to frame:**
6969+ - Write a `FrameDest(target_pe=1, offset=0, act_id=0, port=Port.L, token_kind=TokenKind.MONADIC)` to `pe.frames[frame_id][8]`
7070+ - Set up `pe.route_table[1] = simpy.Store(env)` to capture output
7171+7272+4. **Allocate shared frame for iteration 2:**
7373+ - Inject `FrameControlToken(target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0)` (payload=0 is parent act_id)
7474+ - Verify `FrameAllocated` event with `lane=1` and same `frame_id`
7575+ - Verify `pe.tag_store[1][0] == pe.tag_store[0][0]` (same frame_id)
7676+7777+5. **Inject iteration 1 operands (act_id=0):**
7878+ - Inject `DyadToken(target=0, offset=0, act_id=0, data=100, port=Port.L)`
7979+ - Inject `DyadToken(target=0, offset=0, act_id=0, data=200, port=Port.R)`
8080+ - Verify `Matched` event for `act_id=0` with `left=100, right=200`
8181+ - Verify output token emitted with `data=300` (100+200)
8282+8383+6. **Inject iteration 2 operands (act_id=1):**
8484+ - Inject `DyadToken(target=0, offset=0, act_id=1, data=1000, port=Port.L)`
8585+ - Inject `DyadToken(target=0, offset=0, act_id=1, data=2000, port=Port.R)`
8686+ - Verify `Matched` event for `act_id=1` with `left=1000, right=2000`
8787+ - Verify output token emitted with `data=3000` (1000+2000)
8888+8989+7. **Interleaved verification:**
9090+ - Confirm that iteration 1's L operand (injected first) did NOT interfere with iteration 2's matching — they're on different lanes
9191+ - Both `Matched` events should have independent operand values
9292+9393+8. **Free iteration 1 (not last lane):**
9494+ - Inject `FrameControlToken(target=0, act_id=0, op=FrameOp.FREE, payload=0)`
9595+ - Verify `FrameFreed` with `frame_freed=False`
9696+ - Verify `0 not in pe.tag_store`
9797+ - Verify `1 in pe.tag_store` (iteration 2 still active)
9898+ - Verify `frame_id not in pe.free_frames` (frame stays allocated)
9999+100100+9. **Free iteration 2 (last lane):**
101101+ - Inject `FrameControlToken(target=0, act_id=1, op=FrameOp.FREE, payload=0)`
102102+ - Verify `FrameFreed` with `frame_freed=True`
103103+ - Verify `1 not in pe.tag_store`
104104+ - Verify `frame_id in pe.free_frames` (frame returned to pool)
105105+106106+**Key assertions for AC8.6:**
107107+- Both iterations produce mathematically correct results (100+200=300, 1000+2000=3000)
108108+- Both iterations ran on the SAME frame (shared constants/destinations)
109109+- Both iterations used DIFFERENT lanes (lane 0 and lane 1)
110110+- Freeing one iteration preserved the other's state
111111+- Freeing the last iteration returned the frame
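
The allocation and free bookkeeping behind these assertions can be modelled without SimPy or tokens. The class below is a toy sketch, not the PE: `LaneTable` and its method names are invented for illustration, operand matching is left out entirely, and picking the lowest-numbered free lane is an assumption about how the "next free lane" is chosen.

```python
class LaneTable:
    """Toy model of tag_store / lane_free / free_frames bookkeeping."""

    def __init__(self, frame_count: int, lane_count: int) -> None:
        self.lane_count = lane_count
        self.tag_store: dict[int, tuple[int, int]] = {}
        self.lane_free: dict[int, set[int]] = {}
        self.free_frames: list[int] = list(range(frame_count))

    def alloc(self, act_id: int) -> tuple[int, int]:
        """Fresh frame, lane 0; the remaining lanes become available."""
        frame_id = self.free_frames.pop()
        self.lane_free[frame_id] = set(range(1, self.lane_count))
        self.tag_store[act_id] = (frame_id, 0)
        return self.tag_store[act_id]

    def alloc_shared(self, act_id: int, parent_act_id: int) -> tuple[int, int]:
        """Join the parent's frame on a free lane."""
        frame_id, _ = self.tag_store[parent_act_id]
        free = self.lane_free[frame_id]
        if not free:
            raise RuntimeError("no free lanes")
        lane = min(free)  # assumption: lowest-numbered free lane
        free.discard(lane)
        self.tag_store[act_id] = (frame_id, lane)
        return self.tag_store[act_id]

    def free(self, act_id: int) -> bool:
        """Release a lane; return True if the frame went back to the free list."""
        frame_id, lane = self.tag_store.pop(act_id)
        if any(fid == frame_id for fid, _ in self.tag_store.values()):
            self.lane_free[frame_id].add(lane)  # frame still in use
            return False
        del self.lane_free[frame_id]
        self.free_frames.append(frame_id)
        return True


table = LaneTable(frame_count=4, lane_count=4)
frame0, lane0 = table.alloc(act_id=0)
frame1, lane1 = table.alloc_shared(act_id=1, parent_act_id=0)
first_free = table.free(0)   # not the last lane: frame stays allocated
second_free = table.free(1)  # last lane: frame returns to the pool
```

Here `first_free` and `second_free` play the role of the `frame_freed` field on the two `FrameFreed` events in steps 8 and 9.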
112112+113113+**Verification:**
114114+Run: `python -m pytest tests/test_pe_lanes.py -v -k "loop_pipelining"`
115115+Expected: Test passes.
116116+117117+Run: `python -m pytest tests/ -v -x`
118118+Expected: All tests pass.
119119+120120+**Commit:** `jj commit -m "test: add full loop pipelining integration test (AC8.6)"`
121121+<!-- END_TASK_1 -->
···11+# Test Requirements: Frame Matching Lanes
22+33+## Automated Test Coverage
44+55+| AC ID | Criterion | Test Type | Expected Test File | Implementation Phase |
66+|-------|-----------|-----------|-------------------|---------------------|
77+| frame-lanes.AC1.1 | `tag_store` maps `act_id -> (frame_id, lane)` where `lane` is an `int` in range `[0, lane_count)` | unit | `tests/test_pe_frames.py` (existing tests adapted to tuple API) | Phase 1 |
88+| frame-lanes.AC1.2 | `PEConfig.initial_tag_store` type is `dict[int, tuple[int, int]]`. PE constructor initialises tag_store from it. | unit | `tests/test_pe_frames.py`, `tests/test_pe_events.py`, `tests/test_pe.py` (all existing tests updated to pass tuple values) | Phase 1 |
99+| frame-lanes.AC1.3 | `PEConfig.lane_count` field exists with default 4. Controls third dimension of match arrays. | unit | `tests/test_pe_lanes.py` (verified structurally when match arrays gain lane dimension in Phase 2) | Phase 1 |
1010+| frame-lanes.AC1.4 | All existing tests pass with updated tuple API. | integration | `tests/` (full test suite regression run) | Phase 1 |
1111+| frame-lanes.AC2.1 | Match operand data lives in `match_data[frame_id][offset][lane]`, separate from `frames[frame_id][slot]`. | unit | `tests/test_pe_frames.py`, `tests/test_pe.py` (existing matching tests exercise 3D match_data via lane 0) | Phase 2 |
1212+| frame-lanes.AC2.2 | `presence[frame_id][offset][lane]` is a 3D bool array. `port_store[frame_id][offset][lane]` likewise. | unit | `tests/test_pe.py` (existing presence assertions updated from 2D to 3D indexing with `[0]` lane suffix) | Phase 2 |
1313+| frame-lanes.AC2.3 | `_match_frame()` uses `(frame_id, match_slot, lane)` to read/write match data, presence, and port. | unit | `tests/test_pe_frames.py`, `tests/test_pe.py` (existing matching tests pass through `_match_frame` with lane parameter) | Phase 2 |
1414+| frame-lanes.AC2.4 | `frames[frame_id][slot]` remains shared across all lanes. Constants and destinations are NOT per-lane. | unit | `tests/test_pe_lanes.py` (verified in Phase 3 AC8.1 test: two act_ids sharing a frame read the same frame slot constants) | Phase 3 |
1515+| frame-lanes.AC3.1 | `FrameOp.ALLOC_SHARED`: PE looks up `parent_act_id` from payload, finds parent's `frame_id`, assigns next free lane, records `tag_store[act_id] = (frame_id, lane)`. Clears only that lane's presence/port bits. | unit | `tests/test_pe_lanes.py` | Phase 3 |
1616+| frame-lanes.AC3.2 | `FrameOp.FREE_LANE`: Removes tag_store entry, clears that lane's presence/port/match_data. Does NOT return frame to free list. | unit | `tests/test_pe_lanes.py` | Phase 3 |
1717+| frame-lanes.AC3.3 | `FrameOp.FREE` becomes smart: removes tag_store entry, clears lane data. Returns frame to free list only if no other tag_store entries reference the same frame_id. | unit | `tests/test_pe_lanes.py` | Phase 3 |
1818+| frame-lanes.AC3.4 | `FrameOp.ALLOC` unchanged: allocates fresh frame, assigns lane 0. | unit | `tests/test_pe_lanes.py` (regression check), `tests/test_pe_frames.py` (existing ALLOC tests) | Phase 3 |
1919+| frame-lanes.AC3.5 | `FrameAllocated` event gains `lane: int` field. `FrameFreed` event gains `lane: int` and `frame_freed: bool` fields. | unit | `tests/test_pe_events.py`, `tests/test_pe_frames.py` (existing event assertions updated with new fields) | Phase 3 |
2020+| frame-lanes.AC3.6 | When all lanes for a frame are occupied and ALLOC_SHARED is received, PE emits `TokenRejected` with reason "no free lanes". | unit | `tests/test_pe_lanes.py` | Phase 3 |
2121+| frame-lanes.AC4.1 | ALLOC_REMOTE reads `fref+2` from frame. If non-zero, emits `FrameControlToken` with `op=ALLOC_SHARED` and `payload=parent_act_id`. If zero, emits `op=ALLOC`. | unit | `tests/test_pe_lanes.py` | Phase 4 |
2222+| frame-lanes.AC4.2 | No new opcodes. Behaviour is entirely data-driven from frame constants. | unit | Verified by absence: `grep -r "ALLOC_SHARED\|FREE_LANE" asm/` returns no results. No dedicated test file. | Phase 5 |
2323+| frame-lanes.AC5.1 | `FREE_FRAME` opcode uses the smart FREE behaviour from AC3.3. Frees the executing token's activation lane; returns frame to free list only if last lane. | unit | `tests/test_pe_lanes.py` | Phase 4 |
2424+| frame-lanes.AC6.1 | `PESnapshot.tag_store` type becomes `dict[int, tuple[int, int]]`. | unit | `tests/test_snapshot.py`, `tests/test_monitor_graph_json.py` (PESnapshot constructions updated) | Phase 2 |
2525+| frame-lanes.AC6.2 | `PESnapshot` gains `match_data`, `lane_count` fields reflecting the separated match storage. | unit | `tests/test_snapshot.py` (snapshot capture assertions updated for new fields) | Phase 2 |
2626+| frame-lanes.AC6.4 | Monitor graph JSON serialises lane info correctly. | unit | `tests/test_monitor_graph_json.py` (assertions on serialised tag_store JSON structure with frame_id/lane keys) | Phase 5 |
2727+| frame-lanes.AC7.1 | `codegen.py` generates `initial_tag_store` with `(frame_id, lane)` tuples. Existing single-activation code uses lane 0. | unit | `tests/test_codegen_frames.py` (existing codegen tests pass with tuple-valued initial_tag_store) | Phase 5 |
2828+| frame-lanes.AC7.2 | No codegen changes needed for ALLOC_SHARED (manual construction only for now). | unit | Verified by absence: `grep -r "ALLOC_SHARED\|FREE_LANE" asm/` returns no results. No dedicated test file. | Phase 5 |
2929+| frame-lanes.AC8.1 | Two act_ids sharing a frame via ALLOC_SHARED have independent matching: L operand for act_id 0 does not interfere with L operand for act_id 1 at the same offset. | unit | `tests/test_pe_lanes.py` | Phase 3 |
3030+| frame-lanes.AC8.2 | ALLOC_SHARED with all lanes occupied emits TokenRejected. | unit | `tests/test_pe_lanes.py` | Phase 3 |
3131+| frame-lanes.AC8.3 | FREE on a shared frame frees only the lane; other lanes' data is preserved. FREE on last lane frees the frame. | unit | `tests/test_pe_lanes.py` | Phase 3 |
3232+| frame-lanes.AC8.4 | ALLOC_REMOTE emits ALLOC_SHARED when `fref+2` is non-zero. | unit | `tests/test_pe_lanes.py` | Phase 4 |
3333+| frame-lanes.AC8.5 | ALLOC_REMOTE emits ALLOC when `fref+2` is zero (backwards compatible). | unit | `tests/test_pe_lanes.py` | Phase 4 |
3434+| frame-lanes.AC8.6 | Full loop pipelining scenario: two iterations of a dyadic instruction running concurrently on different lanes, both producing correct results. | e2e | `tests/test_pe_lanes.py` | Phase 6 |
3535+3636+## Criteria Requiring Human Verification
3737+3838+| AC ID | Criterion | Justification | Verification Approach |
3939+|-------|-----------|---------------|----------------------|
4040+| frame-lanes.AC6.3 | Monitor REPL `pe` command displays lane info in tag_store output. | The existing REPL tests at `tests/test_repl.py` only assert that `do_pe()` produces non-empty output (`len(out) > 0`); they do not assert on specific formatting content. The formatting change from `{0: 0}` to `{0: frame 0 lane 0}` is a display concern that is most reliably verified by visual inspection. While a string-matching test could be added, the REPL formatting is intentionally loosely tested to allow cosmetic changes without test churn. | Run `python -m monitor` with a loaded program, execute the `pe 0` command, and confirm tag_store entries display as `act_id: frame F lane L` format. Verify that multi-lane scenarios (after ALLOC_SHARED) show distinct lane numbers per act_id. |
+76
docs/test-plans/2026-03-07-frame-lanes.md
···11+# Frame Matching Lanes — Human Test Plan
22+33+## Overview
44+55+This test plan covers manual verification steps for the frame matching lanes implementation (28 automated acceptance criteria + 1 human verification criterion).
66+77+**Automated coverage:** 28 of the 29 acceptance criteria have automated tests across `tests/test_pe_lanes.py`, `tests/test_pe_frames.py`, `tests/test_pe_events.py`, `tests/test_pe.py`, `tests/test_snapshot.py`, `tests/test_monitor_graph_json.py`, `tests/test_codegen_frames.py`, and `tests/test_repl.py`. The remaining criterion (AC6.3) is verified manually below.
88+99+**Test count:** 1300 tests collected (20 new in `test_pe_lanes.py`, remainder updated for tuple API).
1010+1111+---
1212+1313+## Manual Verification Required
1414+1515+### frame-lanes.AC6.3: Monitor REPL Lane Display
1616+1717+**Criterion:** Monitor REPL `pe` command displays lane info in tag_store output.
1818+1919+**Why manual:** REPL tests assert non-empty output only (`len(out) > 0`), not specific formatting. Display formatting is intentionally loosely tested to allow cosmetic changes without test churn.
2020+2121+**Steps:**
2222+2323+1. Start the monitor with a program that uses frame allocation:
2424+ ```bash
2525+ python -m monitor examples/simple_add.dfasm
2626+ ```
2727+2828+2. Load and step the simulation:
2929+ ```
3030+ (monitor) load examples/simple_add.dfasm
3131+ (monitor) step
3232+ ```
3333+3434+3. Inspect PE state:
3535+ ```
3636+ (monitor) pe 0
3737+ ```
3838+3939+4. **Verify:** Tag store entries display in the format:
4040+ ```
4141+ Tag store: {0: frame 0 lane 0}
4242+ ```
4343+ Not the old format `{0: 0}`.
4444+4545+5. **Multi-lane verification** (requires manual token injection or a program that uses ALLOC_SHARED):
4646+ - After ALLOC_SHARED creates a second activation on the same frame, verify output shows distinct lane numbers:
4747+ ```
4848+ Tag store: {0: frame 2 lane 0, 1: frame 2 lane 1}
4949+ ```
5050+5151+**Expected result:** Lane info is clearly visible in tag_store display for all PE state inspections.
5252+5353+---
5454+5555+## Automated Test Summary by Acceptance Criterion
5656+5757+| AC Group | Criteria Count | Primary Test File | Key Tests |
5858+|----------|---------------|-------------------|-----------|
5959+| AC1: Tag Store Tuple API | 4 | `test_pe_frames.py`, `test_pe_events.py`, `test_pe.py` | Existing tests adapted to `dict[int, tuple[int, int]]` |
6060+| AC2: Match Data Separation | 4 | `test_pe.py`, `test_pe_frames.py` | 3D presence/port/match_data indexing |
6161+| AC3: FrameOp Extensions | 6 | `test_pe_lanes.py` | ALLOC_SHARED, FREE_LANE, smart FREE, lane exhaustion |
6262+| AC4: ALLOC_REMOTE Data-Driven | 2 | `test_pe_lanes.py` | `fref+2` read for ALLOC_SHARED vs ALLOC |
6363+| AC5: FREE_FRAME Smart Free | 1 | `test_pe_lanes.py` | FREE_FRAME opcode delegates to smart free |
6464+| AC6: Monitor/Snapshot | 4 (AC6.3 verified manually) | `test_snapshot.py`, `test_monitor_graph_json.py`, `test_repl.py` | 3D snapshot, JSON lane serialisation |
6565+| AC7: Codegen | 2 | `test_codegen_frames.py` | Tuple initial_tag_store generation |
6666+| AC8: Test Coverage | 6 | `test_pe_lanes.py` | Independent matching, lane exhaustion, loop pipelining |
6767+6868+---
6969+7070+## Regression Checklist
7171+7272+- [ ] All 1300 tests pass: `python -m pytest tests/ -v`
7373+- [ ] No FrameOp references to ALLOC_SHARED/FREE_LANE in `asm/`: `grep -r "ALLOC_SHARED\|FREE_LANE" asm/` returns empty
7474+- [ ] Existing frame allocation tests still pass with tuple API
7575+- [ ] Monitor web UI loads without errors (if available)
7676+- [ ] REPL `pe` command shows lane info (AC6.3 manual check above)
+3
emu/events.py
···8787 component: str
8888 act_id: int
8989 frame_id: int
9090+ lane: int
909191929293@dataclass(frozen=True)
···9596 component: str
9697 act_id: int
9798 frame_id: int
9999+ lane: int
100100+ frame_freed: bool
9810199102100103@dataclass(frozen=True)
+175-46
emu/pe.py
···39394040 Manages:
4141 - Frame store: [frame_count][frame_slots] dense per-activation data
4242- - Tag store: act_id → frame_id mapping
4343- - Presence bits: [frame_count][matchable_offsets] for dyadic matching
4444- - Port store: [frame_count][matchable_offsets] for port metadata
4242+ - Tag store: act_id → (frame_id, lane) mapping
4343+ - Match data: [frame_id][matchable_offsets][lane_count] for operand values
4444+ - Presence bits: [frame_id][matchable_offsets][lane_count] for dyadic matching
4545+ - Port store: [frame_id][matchable_offsets][lane_count] for port metadata
4646+ - Lane free: per-frame set of available lane IDs
4547 - Free frames: pool of available frame IDs
46484749 Pipeline (per token):
···6870 for _ in range(config.frame_count)
6971 ]
70727171- # Tag store: act_id → frame_id
7272- self.tag_store: dict[int, int] = dict(config.initial_tag_store or {})
7373+ # Tag store: act_id → (frame_id, lane)
7474+ self.tag_store: dict[int, tuple[int, int]] = dict(config.initial_tag_store or {})
73757474- # Presence bits: [frame_id][match_slot] - True if operand waiting for partner
7575- self.presence: list[list[bool]] = [
7676- [False for _ in range(config.matchable_offsets)]
7676+ # Match data: [frame_id][match_slot][lane] - operand values waiting for partner
7777+ self.match_data: list[list[list[Optional[int]]]] = [
7878+ [
7979+ [None for _ in range(config.lane_count)]
8080+ for _ in range(config.matchable_offsets)
8181+ ]
8282+ for _ in range(config.frame_count)
8383+ ]
8484+8585+ # Presence bits: [frame_id][match_slot][lane] - True if operand waiting for partner
8686+ self.presence: list[list[list[bool]]] = [
8787+ [
8888+ [False for _ in range(config.lane_count)]
8989+ for _ in range(config.matchable_offsets)
9090+ ]
7791 for _ in range(config.frame_count)
7892 ]
79938080- # Port store: [frame_id][match_slot] - port of waiting operand
8181- self.port_store: list[list[Optional[Port]]] = [
8282- [None for _ in range(config.matchable_offsets)]
9494+ # Port store: [frame_id][match_slot][lane] - port of waiting operand
9595+ self.port_store: list[list[list[Optional[Port]]]] = [
9696+ [
9797+ [None for _ in range(config.lane_count)]
9898+ for _ in range(config.matchable_offsets)
9999+ ]
83100 for _ in range(config.frame_count)
84101 ]
85102103103+ self.lane_count = config.lane_count
104104+86105 # Free frames pool
87106 self.free_frames = list(range(config.frame_count))
8888- for frame_id in self.tag_store.values():
107107+ for frame_id, _lane in self.tag_store.values():
89108 if frame_id in self.free_frames:
90109 self.free_frames.remove(frame_id)
110110+111111+ # Lane tracking: which lanes are free per frame
112112+ self.lane_free: dict[int, set[int]] = {}
113113+114114+ # Initialize lane_free for pre-loaded tag_store entries
115115+ for act_id, (frame_id, lane) in self.tag_store.items():
116116+ if frame_id not in self.lane_free:
117117+ # First time seeing this frame — set up lane tracking
118118+ all_lanes = set(range(self.lane_count))
119119+ self.lane_free[frame_id] = all_lanes - {lane}
120120+ else:
121121+ self.lane_free[frame_id].discard(lane)
9112292123 # Load initial frame data
93124 if config.initial_frames:
···173204 ))
174205 return
175206176176- frame_id = self.tag_store[token.act_id]
207207+ frame_id, lane = self.tag_store[token.act_id]
177208178209 # Determine if monadic or dyadic instruction
179210 is_monadic = (
···192223 left, right = token.data, None
193224 else:
194225 # Dyadic matching via presence bits
195195- operands = self._match_frame(token, inst, frame_id)
226226+ operands = self._match_frame(token, inst, frame_id, lane)
196227 yield self.env.timeout(1) # match cycle
197228 if operands is None:
198229 return # waiting for partner
···229260 yield self.env.timeout(1) # EMIT cycle
230261 self._do_emit_new(inst, result, False, token.act_id, frame_id)
231262 elif inst.opcode == RoutingOp.ALLOC_REMOTE:
232232- # PE-level: read target PE and act_id from frame constants
263263+ # PE-level: read target PE, act_id, and optional parent act_id from frame constants
264264+ # fref+0: target PE
265265+ # fref+1: target act_id
266266+ # fref+2: parent act_id (0 = fresh ALLOC, non-zero = ALLOC_SHARED)
233267 # Total: 4 cycles (dequeue + IFETCH + EXECUTE + EMIT)
234268 target_pe = self.frames[frame_id][inst.fref] if inst.fref < len(self.frames[frame_id]) else 0
235269 target_act = self.frames[frame_id][inst.fref + 1] if inst.fref + 1 < len(self.frames[frame_id]) else 0
270270+ parent_act = self.frames[frame_id][inst.fref + 2] if inst.fref + 2 < len(self.frames[frame_id]) else 0
271271+272272+ # Guard against None slot values
273273+ if target_pe is None or target_act is None:
274274+ logger.warning(f"PE {self.pe_id}: ALLOC_REMOTE has None at fref slots, skipping")
275275+ return
276276+277277+ if parent_act:
278278+ alloc_op = FrameOp.ALLOC_SHARED
279279+ payload = parent_act
280280+ else:
281281+ alloc_op = FrameOp.ALLOC
282282+ payload = 0
283283+236284 fct = FrameControlToken(
237285 target=target_pe,
238286 act_id=target_act,
239239- op=FrameOp.ALLOC,
240240- payload=0,
287287+ op=alloc_op,
288288+ payload=payload,
241289 )
242290 self._on_event(Executed(
243291 time=self.env.now, component=self._component,
···256304 ))
257305 yield self.env.timeout(1) # EXECUTE cycle
258306 yield self.env.timeout(1) # EMIT cycle (no output token)
259259- # Frame deallocation happens during EMIT cycle
307307+ # Frame deallocation happens during EMIT cycle with smart FREE logic
260308 if token.act_id in self.tag_store:
261261- freed_frame = self.tag_store.pop(token.act_id)
262262- self.free_frames.append(freed_frame)
263263- self._on_event(FrameFreed(
264264- time=self.env.now, component=self._component,
265265- act_id=token.act_id, frame_id=freed_frame,
266266- ))
309309+ self._smart_free(token.act_id)
310310+ else:
311311+ logger.warning(f"PE {self.pe_id}: FREE_FRAME for unknown act_id {token.act_id}")
267312 else:
268313 # Normal ALU execute
269314 # MINOR FIX: Restructure const_val handling to avoid dead code
···281326 yield self.env.timeout(1) # EMIT cycle
282327 self._do_emit_new(inst, result, bool_out, token.act_id, frame_id, left=left)
283328329329+ def _smart_free(self, act_id: int) -> None:
330330+ """Smart FREE helper: deallocate lane, possibly returning frame to free list.
331331+332332+ Does NOT yield. Caller handles timing. Emits FrameFreed event.
333333+ """
334334+ if act_id not in self.tag_store:
335335+ return # Caller should have checked, but skip silently
336336+337337+ frame_id, lane = self.tag_store.pop(act_id)
338338+ # Clear this lane's match state
339339+ for i in range(self.matchable_offsets):
340340+ self.match_data[frame_id][i][lane] = None
341341+ self.presence[frame_id][i][lane] = False
342342+ self.port_store[frame_id][i][lane] = None
343343+ # Check if any other activations use this frame
344344+ frame_in_use = any(fid == frame_id for fid, _ in self.tag_store.values())
345345+ if frame_in_use:
346346+ # Return lane to pool, keep frame
347347+ self.lane_free[frame_id].add(lane)
348348+ self._on_event(FrameFreed(
349349+ time=self.env.now, component=self._component,
350350+ act_id=act_id, frame_id=frame_id,
351351+ lane=lane, frame_freed=False,
352352+ ))
353353+ else:
354354+ # Last lane — return frame to free list
355355+ self.free_frames.append(frame_id)
356356+ if frame_id in self.lane_free:
357357+ del self.lane_free[frame_id]
358358+ # Clear frame slots
359359+ for i in range(self.frame_slots):
360360+ self.frames[frame_id][i] = None
361361+ self._on_event(FrameFreed(
362362+ time=self.env.now, component=self._component,
363363+ act_id=act_id, frame_id=frame_id,
364364+ lane=lane, frame_freed=True,
365365+ ))
366366+284367 def _handle_frame_control(self, token: FrameControlToken) -> None:
285285- """Handle ALLOC and FREE operations."""
368368+ """Handle ALLOC, FREE, ALLOC_SHARED, and FREE_LANE operations."""
286369 if token.op == FrameOp.ALLOC:
287370 if self.free_frames:
288371 frame_id = self.free_frames.pop()
289289- self.tag_store[token.act_id] = frame_id
372372+ self.tag_store[token.act_id] = (frame_id, 0)
373373+ # Set up lane tracking: lane 0 is taken, rest are free
374374+ self.lane_free[frame_id] = set(range(1, self.lane_count))
290375 # Initialize frame slots to None
291376 for i in range(self.frame_slots):
292377 self.frames[frame_id][i] = None
293293- # CRITICAL FIX: Reset stale presence bits and port_store from previous activation
378378+ # Reset all lanes' match state
294379 for i in range(self.matchable_offsets):
295295- self.presence[frame_id][i] = False
296296- self.port_store[frame_id][i] = None
380380+ for ln in range(self.lane_count):
381381+ self.match_data[frame_id][i][ln] = None
382382+ self.presence[frame_id][i][ln] = False
383383+ self.port_store[frame_id][i][ln] = None
297384 self._on_event(FrameAllocated(
298385 time=self.env.now, component=self._component,
299299- act_id=token.act_id, frame_id=frame_id,
386386+ act_id=token.act_id, frame_id=frame_id, lane=0,
300387 ))
301388 else:
302389 logger.warning(f"PE {self.pe_id}: no free frames available")
303390 elif token.op == FrameOp.FREE:
304391 if token.act_id in self.tag_store:
305305- frame_id = self.tag_store.pop(token.act_id)
306306- self.free_frames.append(frame_id)
307307- self._on_event(FrameFreed(
392392+ self._smart_free(token.act_id)
393393+ else:
394394+ logger.warning(f"PE {self.pe_id}: FREE for unknown act_id {token.act_id}")
395395+ elif token.op == FrameOp.ALLOC_SHARED:
396396+ # Shared allocation: find parent's frame, assign next free lane
397397+ # Guard against an act_id that is already live: overwriting its tag_store entry would leak its current lane
398398+ if token.act_id in self.tag_store:
399399+ self._on_event(TokenRejected(
308400 time=self.env.now, component=self._component,
309309- act_id=token.act_id, frame_id=frame_id,
401401+ token=token, reason=f"act_id {token.act_id} already in tag store",
310402 ))
403403+ return
404404+ parent_act_id = token.payload
405405+ if parent_act_id not in self.tag_store:
406406+ self._on_event(TokenRejected(
407407+ time=self.env.now, component=self._component,
408408+ token=token, reason=f"parent act_id {parent_act_id} not in tag store",
409409+ ))
410410+ return
411411+ parent_frame_id, _ = self.tag_store[parent_act_id]
412412+ free_lanes = self.lane_free.get(parent_frame_id, set())
413413+ if not free_lanes:
414414+ self._on_event(TokenRejected(
415415+ time=self.env.now, component=self._component,
416416+ token=token, reason="no free lanes",
417417+ ))
418418+ return
419419+ lane = min(free_lanes) # Deterministic: pick lowest free lane
420420+ free_lanes.remove(lane)
421421+ self.tag_store[token.act_id] = (parent_frame_id, lane)
422422+ # Clear only this lane's match state
423423+ for i in range(self.matchable_offsets):
424424+ self.match_data[parent_frame_id][i][lane] = None
425425+ self.presence[parent_frame_id][i][lane] = False
426426+ self.port_store[parent_frame_id][i][lane] = None
427427+ self._on_event(FrameAllocated(
428428+ time=self.env.now, component=self._component,
429429+ act_id=token.act_id, frame_id=parent_frame_id, lane=lane,
430430+ ))
431431+ elif token.op == FrameOp.FREE_LANE:
432432+ # Free lane with smart frame deallocation.
433433+ # If this is the last lane using the frame, the frame is returned to free_frames.
434434+ # Otherwise, just the lane is returned to the pool.
435435+ if token.act_id in self.tag_store:
436436+ self._smart_free(token.act_id)
437437+ else:
438438+ logger.warning(f"PE {self.pe_id}: FREE_LANE for unknown act_id {token.act_id}")
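The control-token handling above can be condensed into a small standalone model. This is a minimal sketch under stated assumptions, not the emulator's code: `LaneBook` and its method names are hypothetical, event emission and match-state clearing are left out, and only the `tag_store` / `lane_free` / `free_frames` bookkeeping from the diff is kept.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LaneBook:
    """Hypothetical toy model of the frame/lane bookkeeping in the diff."""
    frame_count: int = 4
    lane_count: int = 4
    tag_store: dict[int, tuple[int, int]] = field(default_factory=dict)
    lane_free: dict[int, set[int]] = field(default_factory=dict)
    free_frames: list[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every frame starts out free.
        self.free_frames = list(range(self.frame_count))

    def alloc(self, act_id: int) -> bool:
        """ALLOC: take a fresh frame, occupy lane 0, mark the other lanes free."""
        if not self.free_frames:
            return False
        frame_id = self.free_frames.pop()
        self.tag_store[act_id] = (frame_id, 0)
        self.lane_free[frame_id] = set(range(1, self.lane_count))
        return True

    def alloc_shared(self, act_id: int, parent_act_id: int) -> bool:
        """ALLOC_SHARED: take the lowest free lane of the parent's frame."""
        if act_id in self.tag_store or parent_act_id not in self.tag_store:
            return False
        frame_id, _ = self.tag_store[parent_act_id]
        free = self.lane_free.get(frame_id, set())
        if not free:
            return False
        lane = min(free)  # deterministic: lowest free lane
        free.remove(lane)
        self.tag_store[act_id] = (frame_id, lane)
        return True

    def smart_free(self, act_id: int) -> Optional[bool]:
        """FREE / FREE_LANE: True if the frame was returned, False if only
        the lane was returned, None if act_id is unknown."""
        if act_id not in self.tag_store:
            return None
        frame_id, lane = self.tag_store.pop(act_id)
        if any(fid == frame_id for fid, _ in self.tag_store.values()):
            self.lane_free[frame_id].add(lane)
            return False
        self.free_frames.append(frame_id)
        self.lane_free.pop(frame_id, None)
        return True
```

In this model, freeing the child of a two-lane frame returns only the lane, and freeing the parent afterwards returns the frame, which mirrors the `frame_freed=False` / `frame_freed=True` split in the `FrameFreed` events.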
311439
312440 def _handle_local_write(self, token: PELocalWriteToken) -> None:
313441 """Handle IRAM write and frame write."""
···319447 ))
320448 elif token.region == 1: # Frame
321449 if token.act_id in self.tag_store:
322322- frame_id = self.tag_store[token.act_id]
450450+ frame_id, _lane = self.tag_store[token.act_id]
323451 if token.is_dest:
324452 # Decode flit 1 to FrameDest
325453 dest = unpack_flit1(token.data)
···345473 token: DyadToken,
346474 inst: Instruction,
347475 frame_id: int,
476476+ lane: int,
348477 ) -> Optional[tuple[int, int]]:
349349- """Frame-based dyadic matching.
478478+ """Frame-based dyadic matching with lane support.
350479351480 Derives match slot from low bits of token.offset:
352481 match_slot = token.offset % matchable_offsets
353482354354- Both L and R tokens write to frames[frame_id][match_slot].
355355- Port metadata determines left/right ordering when second arrives.
483483+ Match data, presence, and port are per-lane.
484484+ Frame constants/destinations remain shared.
356485 """
357486 match_slot = token.offset % self.matchable_offsets
358487359359- if self.presence[frame_id][match_slot]:
488488+ if self.presence[frame_id][match_slot][lane]:
360489 # Partner already waiting — pair them
361361- partner_data = self.frames[frame_id][match_slot]
362362- partner_port = self.port_store[frame_id][match_slot]
363363- self.presence[frame_id][match_slot] = False
364364- self.frames[frame_id][match_slot] = None
490490+ partner_data = self.match_data[frame_id][match_slot][lane]
491491+ partner_port = self.port_store[frame_id][match_slot][lane]
492492+ self.presence[frame_id][match_slot][lane] = False
493493+ self.match_data[frame_id][match_slot][lane] = None
365494366495 # Use port metadata to determine left/right ordering
367496 if partner_port == Port.L:
···377506 return left, right
378507 else:
379508 # Store and wait for partner
380380- self.frames[frame_id][match_slot] = token.data
381381- self.port_store[frame_id][match_slot] = token.port
382382- self.presence[frame_id][match_slot] = True
509509+ self.match_data[frame_id][match_slot][lane] = token.data
510510+ self.port_store[frame_id][match_slot][lane] = token.port
511511+ self.presence[frame_id][match_slot][lane] = True
383512 return None
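The per-lane matching change above can be illustrated in isolation. The sketch below is an assumption-laden stand-in, not the PE implementation: `LaneMatcher` is a hypothetical class, `Port` is a local substitute for the project's enum, and the left/right ordering rule (a waiting `Port.L` partner becomes the left operand) is inferred from the surrounding context lines that the diff elides.

```python
from enum import Enum
from typing import Optional


class Port(Enum):
    """Local stand-in for the project's Port enum (assumed to have L and R)."""
    L = 0
    R = 1


class LaneMatcher:
    """Hypothetical model of 3D match state: [frame_id][match_slot][lane]."""

    def __init__(self, frame_count: int, matchable_offsets: int, lane_count: int):
        def grid(fill):
            # One independent cell per (frame, slot, lane).
            return [
                [[fill] * lane_count for _ in range(matchable_offsets)]
                for _ in range(frame_count)
            ]

        self.matchable_offsets = matchable_offsets
        self.match_data = grid(None)
        self.port_store = grid(None)
        self.presence = grid(False)

    def match(
        self, frame_id: int, lane: int, offset: int, port: Port, data: int
    ) -> Optional[tuple[int, int]]:
        """Return (left, right) when a partner was waiting, else store and wait."""
        slot = offset % self.matchable_offsets
        if self.presence[frame_id][slot][lane]:
            partner = self.match_data[frame_id][slot][lane]
            partner_port = self.port_store[frame_id][slot][lane]
            self.presence[frame_id][slot][lane] = False
            self.match_data[frame_id][slot][lane] = None
            # Assumed ordering: the operand that arrived on Port.L is "left".
            return (partner, data) if partner_port == Port.L else (data, partner)
        self.match_data[frame_id][slot][lane] = data
        self.port_store[frame_id][slot][lane] = port
        self.presence[frame_id][slot][lane] = True
        return None
```

Two tokens aimed at the same frame and offset but different lanes never pair with each other here, which is the independence property the lane tests target.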
384513385514 def _do_emit_new(
···5252- `__init__.py` -- Public API exports
5353- `backend.py` -- `SimulationBackend` class with thread lifecycle and command dispatch
5454- `commands.py` -- All command and result frozen dataclasses, `SimCommand` union type
5555-- `snapshot.py` -- `StateSnapshot`, `PESnapshot` (frame-based: frames, tag_store, presence, port_store, free_frames), `SMSnapshot`, `SMCellSnapshot`, `capture()`
5555+- `snapshot.py` -- `StateSnapshot`, `PESnapshot` (frame-based: frames; tag_store mapping act_id → (frame_id, lane); presence, port_store, and match_data, each 3D [frame_id][match_slot][lane]; lane_count; free_frames), `SMSnapshot`, `SMCellSnapshot`, `capture()`
5656- `graph_json.py` -- JSON serialization with execution overlay (extends dfgraph patterns)
5757- `server.py` -- `create_app(backend)` FastAPI factory, `ConnectionManager`, WebSocket handler
5858- `repl.py` -- `MonitorREPL(cmd.Cmd)` interactive CLI
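A tag store of the `act_id → (frame_id, lane)` shape listed above can be rendered for a CLI roughly as follows. This is a sketch only: `format_tag_store` is a hypothetical helper, colouring is omitted, and the `(empty)` text is a placeholder rather than the monitor's actual output.

```python
def format_tag_store(tag_store: dict[int, tuple[int, int]]) -> str:
    """Render act_id -> (frame_id, lane) entries in act_id order."""
    if not tag_store:
        return "Tag store: (empty)"  # placeholder wording, not the real output
    entries = ", ".join(
        f"{act_id}: frame {frame_id} lane {lane}"
        for act_id, (frame_id, lane) in sorted(tag_store.items())
    )
    # Doubled braces emit literal '{' and '}' around the entries.
    return f"Tag store: {{{entries}}}"
```

Unpacking the tuple in the `for` clause fails loudly if a pre-lane snapshot with plain `int` values is passed in, which may be useful while both shapes are still in circulation.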
+2-2
monitor/formatting.py
···194194 # Tag store
195195 if pe_snapshot.tag_store:
196196 tag_str = ", ".join(
197197- f"{colour(str(k), 'white')}: {colour(str(v), 'white')}"
198198- for k, v in sorted(pe_snapshot.tag_store.items())
197197+ f"{colour(str(k), 'white')}: frame {colour(str(fid), 'white')} lane {colour(str(lane), 'white')}"
198198+ for k, (fid, lane) in sorted(pe_snapshot.tag_store.items())
199199 )
200200 lines.append(f" Tag store: {{{tag_str}}}")
201201 else:
···2222 pe_id: int
2323 iram: dict[int, Instruction]
2424 frames: tuple[tuple[FrameSlotValue, ...], ...]
2525- tag_store: dict[int, int]
2626- presence: tuple[tuple[bool, ...], ...]
2727- port_store: tuple[tuple[Port | None, ...], ...]
2525+ tag_store: dict[int, tuple[int, int]]
2626+ presence: tuple[tuple[tuple[bool, ...], ...], ...]
2727+ port_store: tuple[tuple[tuple[Port | None, ...], ...], ...]
2828+ match_data: tuple[tuple[tuple[int | None, ...], ...], ...]
2829 free_frames: tuple[int, ...]
3030+ lane_count: int
2931 input_queue: tuple[Token, ...]
3032 output_log: tuple[Token, ...]
3133···8082 )
8183 tag_store = dict(pe.tag_store)
8284 presence = tuple(
8383- tuple(p for p in frame_presence)
8585+ tuple(
8686+ tuple(lane_val for lane_val in offset_lanes)
8787+ for offset_lanes in frame_presence
8888+ )
8489 for frame_presence in pe.presence
8590 )
8691 port_store = tuple(
8787- tuple(p for p in frame_ports)
9292+ tuple(
9393+ tuple(lane_val for lane_val in offset_lanes)
9494+ for offset_lanes in frame_ports
9595+ )
8896 for frame_ports in pe.port_store
8997 )
9898+ match_data = tuple(
9999+ tuple(
100100+ tuple(lane_val for lane_val in offset_lanes)
101101+ for offset_lanes in frame_match
102102+ )
103103+ for frame_match in pe.match_data
104104+ )
90105 free_frames = tuple(pe.free_frames)
9110692107 pes[pe_id] = PESnapshot(
···96111 tag_store=tag_store,
97112 presence=presence,
98113 port_store=port_store,
114114+ match_data=match_data,
99115 free_frames=free_frames,
116116+ lane_count=pe.lane_count,
100117 input_queue=tuple(pe.input_store.items),
101118 output_log=tuple(pe.output_log),
102119 )
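The capture code above writes out the same three-level `tuple(...)` nesting for `presence`, `port_store`, and `match_data`. One possible alternative, shown here as a hypothetical helper named `freeze` rather than anything in the PR, is a recursive conversion that works for any nesting depth.

```python
def freeze(value):
    """Recursively convert nested lists to tuples; leave other values unchanged."""
    if isinstance(value, list):
        return tuple(freeze(item) for item in value)
    return value


# Example: a [frame_id][match_slot][lane] presence array with 1 frame,
# 2 match slots, and 2 lanes.
presence = [[[False, True], [False, False]]]
snapshot_presence = freeze(presence)

# Later mutation of the live array does not affect the frozen copy.
presence[0][0][0] = True
```

The explicit nested comprehensions in the diff make the expected depth visible at the call site, so the helper trades that documentation value for less repetition.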
+11
tests/test_codegen_frames.py
···324324 assert len(alloc_tokens) == 2
325325 assert all(t.op == FrameOp.ALLOC for t in alloc_tokens)
326326327327+ # Verify that PE configs have initial_tag_store with tuple values
328328+ assert len(result.pe_configs) == 1
329329+ pe_cfg = result.pe_configs[0]
330330+ assert pe_cfg.initial_tag_store, "initial_tag_store should not be empty for PE with activations"
331331+ for act_id, val in pe_cfg.initial_tag_store.items():
332332+ assert isinstance(val, tuple) and len(val) == 2, \
333333+ f"initial_tag_store[{act_id}] should be (frame_id, lane) tuple, got {val}"
334334+ frame_id, lane = val
335335+ assert isinstance(frame_id, int), f"frame_id should be int, got {type(frame_id)}"
336336+ assert isinstance(lane, int), f"lane should be int, got {type(lane)}"
337337+
327338
328339class TestTask3SeedTokens:
329340 """Task 3: Seed token generation with act_id."""
···9696 frame_allocated = [e for e in events if isinstance(e, FrameAllocated)]
9797 assert len(token_received) > 0
9898 assert len(frame_allocated) > 0
9999- assert pe.tag_store[0] in range(pe.frame_count)
9999+ assert pe.tag_store[0][0] in range(pe.frame_count)
100100+ assert frame_allocated[0].lane == 0
100101101102 def test_free_frame_control_token(self):
102103 env = simpy.Environment()
···112113 fct_alloc = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
113114 inject_and_run(env, pe, fct_alloc)
114115115115- frame_id = pe.tag_store[0]
116116+ frame_id, _lane = pe.tag_store[0]
116117117118 # Now deallocate
118119 fct_free = FrameControlToken(target=0, act_id=0, op=FrameOp.FREE, payload=0)
···121122 # Should have FrameFreed event and tag_store should be cleared
122123 frame_freed = [e for e in events if isinstance(e, FrameFreed)]
123124 assert len(frame_freed) > 0
125125+ assert frame_freed[0].lane == 0
126126+ assert frame_freed[0].frame_freed is True
124127 assert 0 not in pe.tag_store
125128 assert frame_id in pe.free_frames
126129···143146 inject_and_run(env, pe, fct)
144147145148 # Set up: install dyadic instruction at offset 0
146146- # Mode 0: no const, dest_count=1
149149+ # Mode SINK: no output emission, just execution and matching verification
147150 inst = Instruction(
148151 opcode=ArithOp.ADD,
149149- output=OutputStyle.INHERIT,
152152+ output=OutputStyle.SINK,
150153 has_const=False,
151151- dest_count=1,
154154+ dest_count=0,
152155 wide=False,
153156 fref=0,
154157 )
···162165 port=Port.L,
163166 token_kind=TokenKind.DYADIC,
164167 )
165165- pe.frames[pe.tag_store[0]][0] = dest
168168+ pe.frames[pe.tag_store[0][0]][0] = dest
166169167170 # Inject first dyadic token (port=L, data=5)
168171 tok1 = DyadToken(
···214217 # Allocate frame
215218 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
216219 inject_and_run(env, pe, fct)
217217- frame_id = pe.tag_store[0]
220220+ frame_id, _lane = pe.tag_store[0]
218221219222 # Set up instruction: mode 0 (no const, dest_count=1), fref=8
220223 inst = Instruction(
···345348 # Allocate frame
346349 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
347350 inject_and_run(env, pe, fct)
348348- frame_id = pe.tag_store[0]
351351+ frame_id, _lane = pe.tag_store[0]
349352350353 # Set up instruction: SINK output, mode 6 (no const, dest_count=0), fref=10
351354 inst = Instruction(
···413416 port=Port.L,
414417 token_kind=TokenKind.MONADIC,
415418 )
416416- pe.frames[pe.tag_store[0]][0] = dest
419419+ pe.frames[pe.tag_store[0][0]][0] = dest
417420418421 # Wire route table
419422 pe.route_table[0] = simpy.Store(env)
···487490 pe.iram[6] = inst
488491489492 # Write target PE and target act_id to frame slots 8 and 9
490490- frame_id = pe.tag_store[0]
493493+ frame_id, _lane = pe.tag_store[0]
491494 pe.frames[frame_id][8] = 1 # target PE
492495 pe.frames[frame_id][9] = 2 # target act_id
493496···505508 frame_allocated = [e for e in pe_events if isinstance(e, FrameAllocated)]
506509 assert len(frame_allocated) > 0
507510 assert frame_allocated[0].act_id == 2
511511+ assert frame_allocated[0].lane == 0
508512509513510514class TestFreeFrameOpcode:
···523527 # Allocate frame
524528 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
525529 inject_and_run(env, pe, fct)
526526- frame_id = pe.tag_store[0]
530530+ frame_id, _lane = pe.tag_store[0]
527531528532 # Set up FREE_FRAME instruction
529533 inst = Instruction(
···550554 frame_freed = [e for e in events if isinstance(e, FrameFreed)]
551555 assert len(frame_freed) > 0
552556 assert frame_freed[0].frame_id == frame_id
557557+ assert frame_freed[0].lane == 0
558558+ assert frame_freed[0].frame_freed is True
553559554560 # tag_store should be cleared
555561 assert 0 not in pe.tag_store
···562568 assert len(emitted) == 0
563569564570571571+class TestFreeLane:
572572+ """AC3.8: FREE_LANE deallocates lane, potentially returning frame to free list."""
573573+574574+ def test_free_lane_on_last_lane_returns_frame(self):
575575+ """When FREE_LANE is called on the last remaining lane, frame should be returned."""
576576+ env = simpy.Environment()
577577+ events = []
578578+ config = PEConfig(frame_count=4, lane_count=2, on_event=events.append)
579579+ pe = ProcessingElement(
580580+ env=env,
581581+ pe_id=0,
582582+ config=config,
583583+ )
584584+585585+ # Allocate a frame with act_id=1 (gets lane 0)
586586+ fct_alloc1 = FrameControlToken(target=0, act_id=1, op=FrameOp.ALLOC, payload=0)
587587+ inject_and_run(env, pe, fct_alloc1)
588588+589589+ frame_id, lane1 = pe.tag_store[1]
590590+ assert lane1 == 0
591591+592592+ # Allocate shared child with act_id=2 (gets lane 1)
593593+ fct_alloc_shared = FrameControlToken(target=0, act_id=2, op=FrameOp.ALLOC_SHARED, payload=1)
594594+ inject_and_run(env, pe, fct_alloc_shared)
595595+596596+ frame_id2, lane2 = pe.tag_store[2]
597597+ assert frame_id2 == frame_id
598598+ assert lane2 == 1
599599+600600+ # Now FREE_LANE the child (act_id=2) — should not return frame (still in use)
601601+ fct_free_lane_child = FrameControlToken(target=0, act_id=2, op=FrameOp.FREE_LANE, payload=0)
602602+ inject_and_run(env, pe, fct_free_lane_child)
603603+604604+ # Lane should be freed, frame still in use
605605+ assert 2 not in pe.tag_store
606606+ assert frame_id not in pe.free_frames and lane2 in pe.lane_free[frame_id]
607607+ frame_freed_child = [e for e in events if isinstance(e, FrameFreed) and e.act_id == 2]
608608+ assert len(frame_freed_child) > 0
609609+ assert frame_freed_child[0].frame_freed is False # Lane freed, not frame
610610+611611+ # Now FREE_LANE the parent (act_id=1) — this is the last lane, should return frame
612612+ fct_free_lane_parent = FrameControlToken(target=0, act_id=1, op=FrameOp.FREE_LANE, payload=0)
613613+ inject_and_run(env, pe, fct_free_lane_parent)
614614+615615+ # Frame should now be in free_frames
616616+ assert 1 not in pe.tag_store
617617+ assert frame_id in pe.free_frames
618618+ frame_freed_parent = [e for e in events if isinstance(e, FrameFreed) and e.act_id == 1]
619619+ assert len(frame_freed_parent) > 0
620620+ assert frame_freed_parent[0].frame_freed is True # Last lane, frame returned
621621+
622622+
565623class TestPELocalWriteToken:
566624 """AC3.9: PELocalWriteToken with is_dest=True decodes data to FrameDest."""
567625···613671 # Allocate frame
614672 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
615673 inject_and_run(env, pe, fct)
616616- frame_id = pe.tag_store[0]
674674+ frame_id, _lane = pe.tag_store[0]
617675618676 # Write FrameDest to frame slot 15, is_dest=True
619677 dest = FrameDest(
···682740 assert len(rejected) > 0
683741 assert rejected[0].token == tok
684742685685- # Should not crash
686686- assert True
687687-688743689744class TestDualDestInherit:
690745 """IMPORTANT 2: dest_count=2 non-switch: verify both destinations receive same result."""
···702757 # Allocate frame
703758 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
704759 inject_and_run(env, pe, fct)
705705- frame_id = pe.tag_store[0]
760760+ frame_id, _lane = pe.tag_store[0]
706761707762 # Set up instruction: mode 2 (no const, dest_count=2), fref=8
708763 inst = Instruction(
···766821 # Allocate frame
767822 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
768823 inject_and_run(env, pe, fct)
769769- frame_id = pe.tag_store[0]
824824+ frame_id, _lane = pe.tag_store[0]
770825771826 # Set up SWEQ instruction with dest_count=2, fref=8
772827 inst = Instruction(
···829884 # Allocate frame
830885 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
831886 inject_and_run(env, pe, fct)
832832- frame_id = pe.tag_store[0]
887887+ frame_id, _lane = pe.tag_store[0]
833888834889 # Set up SWEQ instruction with dest_count=2, fref=8
835890 inst = Instruction(
···897952 # Allocate frame
898953 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
899954 inject_and_run(env, pe, fct)
900900- frame_id = pe.tag_store[0]
955955+ frame_id, _lane = pe.tag_store[0]
901956902957 # Set up GATE instruction
903958 inst = Instruction(
···9461001 # Allocate frame
9471002 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
9481003 inject_and_run(env, pe, fct)
949949- frame_id = pe.tag_store[0]
10041004+ frame_id, _lane = pe.tag_store[0]
95010059511006 # Set up GATE instruction
9521007 inst = Instruction(
···9991054 # Allocate frame
10001055 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
10011056 inject_and_run(env, pe, fct)
10021002- frame_id = pe.tag_store[0]
10571057+ frame_id, _lane = pe.tag_store[0]
1003105810041059 # Set up SM READ instruction with const (mode 1: const, dest with return route), fref=8
10051060 # Const slot contains the SM target, dest slot contains return route
···10901145 # Allocate frame
10911146 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
10921147 inject_and_run(env, pe, fct)
10931093- frame_id = pe.tag_store[0]
11481148+ frame_id, _lane = pe.tag_store[0]
1094114910951150 # Set up dyadic instruction
10961151 inst = Instruction(
···11721227 # Allocate frame
11731228 fct = FrameControlToken(target=0, act_id=0, op=FrameOp.ALLOC, payload=0)
11741229 inject_and_run(env, pe, fct)
11751175- frame_id = pe.tag_store[0]
12301230+ frame_id, _lane = pe.tag_store[0]
1176123111771232 # Set up monadic instruction
11781233 inst = Instruction(
···12821337 # Allocate frame
12831338 fct = FrameControlToken(target=0, act_id=5, op=FrameOp.ALLOC, payload=0)
12841339 inject_and_run(env, pe, fct)
12851285- frame_id = pe.tag_store[5]
13401340+ frame_id, _lane = pe.tag_store[5]
1286134112871342 # Set up EXTRACT_TAG instruction
12881343 inst = Instruction(
···13541409 # Allocate frame
13551410 fct = FrameControlToken(target=0, act_id=7, op=FrameOp.ALLOC, payload=0)
13561411 inject_and_run(env, pe, fct)
13571357- frame_id = pe.tag_store[7]
14121412+ frame_id, _lane = pe.tag_store[7]
1358141313591414 # Set up SM READ instruction (monadic in terms of PE pipeline)
13601415 inst = Instruction(
···14281483 # Allocate frame
14291484 fct = FrameControlToken(target=0, act_id=10, op=FrameOp.ALLOC, payload=0)
14301485 inject_and_run(env, pe, fct)
14311431- frame_id = pe.tag_store[10]
14861486+ frame_id, _lane = pe.tag_store[10]
1432148714331488 # Set up FREE_FRAME instruction
14341489 inst = Instruction(
···14911546 # Allocate frame
14921547 fct = FrameControlToken(target=0, act_id=12, op=FrameOp.ALLOC, payload=0)
14931548 inject_and_run(env, pe, fct)
14941494- frame_id = pe.tag_store[12]
15491549+ frame_id, _lane = pe.tag_store[12]
1495155014961551 # Set up ALLOC_REMOTE instruction
14971552 inst = Instruction(
+1165
tests/test_pe_lanes.py
···11+"""
22+Lane-based PE rewrite tests.
33+44+Verifies frame-lanes.AC3, frame-lanes.AC4, frame-lanes.AC5, and frame-lanes.AC8:
55+- AC3.1: FrameOp.ALLOC_SHARED assigns next free lane from parent frame
66+- AC3.2: FrameOp.FREE_LANE removes tag_store entry, clears lane data, keeps frame while other lanes still use it
77+- AC3.3: FrameOp.FREE on shared frame returns lane if frame still in use
88+- AC3.4: FrameOp.ALLOC unchanged — allocates fresh frame, assigns lane 0
99+- AC3.5: FrameAllocated event gains lane field
1010+- AC3.6: ALLOC_SHARED with all lanes occupied emits TokenRejected
1111+- AC4: ALLOC_REMOTE reads fref+2 for data-driven ALLOC_SHARED vs ALLOC
1212+- AC5.1: FREE_FRAME opcode uses smart FREE behaviour on shared frames
1313+- AC8.1: Two act_ids sharing a frame have independent matching
1414+- AC8.2: ALLOC_SHARED with exhausted lanes emits TokenRejected
1515+- AC8.3: FREE on shared frame preserves other lanes' data
1616+- AC8.4: ALLOC_REMOTE emits ALLOC_SHARED when fref+2 is non-zero
1717+- AC8.5: ALLOC_REMOTE emits ALLOC when fref+2 is zero (backwards compatible)
1818+- AC8.6: Full loop pipelining scenario — two iterations concurrent on different lanes
1919+"""
2020+2121+import pytest
2222+import simpy
2323+2424+from cm_inst import (
2525+ ArithOp, FrameDest, FrameOp, Instruction, Port, TokenKind, OutputStyle,
2626+ RoutingOp,
2727+)
2828+from emu.events import (
2929+ FrameAllocated, FrameFreed, TokenReceived, TokenRejected, Matched, Emitted,
3030+)
3131+from emu.pe import ProcessingElement
3232+from emu.types import PEConfig
3333+from tokens import DyadToken, FrameControlToken
3434+3535+3636+def inject_and_run(env, pe, token):
3737+ """Helper: inject token and run simulation."""
3838+ def _put():
3939+ yield pe.input_store.put(token)
4040+ env.process(_put())
4141+ env.run()
4242+4343+4444+class TestAllocShared:
4545+ """AC3.1: ALLOC_SHARED assigns next free lane from parent frame."""
4646+4747+ def test_alloc_shared_basic(self):
4848+ """Parent allocates frame, child allocates shared lane."""
4949+ env = simpy.Environment()
5050+ events = []
5151+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
5252+ pe = ProcessingElement(env=env, pe_id=0, config=config)
5353+5454+ # Parent ALLOC
5555+ fct_parent = FrameControlToken(
5656+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
5757+ )
5858+ inject_and_run(env, pe, fct_parent)
5959+6060+ parent_frame_id, parent_lane = pe.tag_store[0]
6161+ assert parent_lane == 0, "Parent should allocate lane 0"
6262+6363+ # Child ALLOC_SHARED with parent_act_id=0
6464+ fct_child = FrameControlToken(
6565+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
6666+ )
6767+ inject_and_run(env, pe, fct_child)
6868+6969+ child_frame_id, child_lane = pe.tag_store[1]
7070+ assert child_frame_id == parent_frame_id, "Child should share parent's frame"
7171+ assert child_lane == 1, "Child should allocate lane 1"
7272+ assert child_lane != parent_lane, "Child lane should differ from parent"
7373+7474+ # Verify FrameAllocated event for child
7575+ frame_allocated = [e for e in events if isinstance(e, FrameAllocated)]
7676+ assert len(frame_allocated) >= 2, "Should have at least 2 FrameAllocated events"
7777+ assert frame_allocated[0].lane == 0, "Parent allocated lane 0"
7878+ assert frame_allocated[1].lane == 1, "Child allocated lane 1"
7979+8080+ def test_alloc_shared_multiple_lanes(self):
8181+ """Multiple children allocate different lanes from same parent frame."""
8282+ env = simpy.Environment()
8383+ events = []
8484+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
8585+ pe = ProcessingElement(env=env, pe_id=0, config=config)
8686+8787+ # Parent ALLOC
8888+ fct_parent = FrameControlToken(
8989+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
9090+ )
9191+ inject_and_run(env, pe, fct_parent)
9292+ parent_frame_id, _parent_lane = pe.tag_store[0]
9393+9494+ # Child 1 ALLOC_SHARED
9595+ fct_child1 = FrameControlToken(
9696+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
9797+ )
9898+ inject_and_run(env, pe, fct_child1)
9999+ _child1_frame_id, child1_lane = pe.tag_store[1]
100100+101101+ # Child 2 ALLOC_SHARED
102102+ fct_child2 = FrameControlToken(
103103+ target=0, act_id=2, op=FrameOp.ALLOC_SHARED, payload=0
104104+ )
105105+ inject_and_run(env, pe, fct_child2)
106106+ _child2_frame_id, child2_lane = pe.tag_store[2]
107107+108108+ # All should share same frame
109109+ assert pe.tag_store[0][0] == parent_frame_id
110110+ assert pe.tag_store[1][0] == parent_frame_id
111111+ assert pe.tag_store[2][0] == parent_frame_id
112112+113113+ # Lanes should differ: 0, 1, 2
114114+ assert child1_lane != 0, "Child1 lane should not be 0"
115115+ assert child2_lane != 0, "Child2 lane should not be 0"
116116+ assert child1_lane != child2_lane, "Child1 and child2 lanes should differ"
117117+118118+ def test_alloc_shared_invalid_parent(self):
119119+ """ALLOC_SHARED with non-existent parent emits TokenRejected."""
120120+ env = simpy.Environment()
121121+ events = []
122122+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
123123+ pe = ProcessingElement(env=env, pe_id=0, config=config)
124124+125125+ # Try ALLOC_SHARED with non-existent parent_act_id=999
126126+ fct = FrameControlToken(
127127+ target=0, act_id=0, op=FrameOp.ALLOC_SHARED, payload=999
128128+ )
129129+ inject_and_run(env, pe, fct)
130130+131131+ rejected = [e for e in events if isinstance(e, TokenRejected)]
132132+ assert len(rejected) > 0, "Should have TokenRejected event"
133133+ assert "not in tag store" in rejected[0].reason, "Reason should mention tag_store"
134134+135135+ # Parent should not be in tag_store
136136+ assert 999 not in pe.tag_store
137137+138138+ def test_alloc_shared_self_referential_guard(self):
139139+ """ALLOC_SHARED with act_id already in tag_store emits TokenRejected."""
140140+ env = simpy.Environment()
141141+ events = []
142142+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
143143+ pe = ProcessingElement(env=env, pe_id=0, config=config)
144144+145145+ # First ALLOC to establish act_id=0 in tag_store
146146+ fct_alloc = FrameControlToken(
147147+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
148148+ )
149149+ inject_and_run(env, pe, fct_alloc)
150150+ assert 0 in pe.tag_store, "act_id=0 should be in tag_store after ALLOC"
151151+ frame_id_0, lane_0 = pe.tag_store[0]
153153+ # Allocate act_id=1 so a valid parent exists; the ALLOC_SHARED below (act_id=0, payload=1)
154154+ # must then be rejected only because act_id=0 is already in tag_store
155155+ fct_alloc_parent = FrameControlToken(
156156+ target=0, act_id=1, op=FrameOp.ALLOC, payload=0
157157+ )
158158+ inject_and_run(env, pe, fct_alloc_parent)
159159+ assert 1 in pe.tag_store, "act_id=1 should be in tag_store after ALLOC"
160160+161161+ events.clear()
162162+ fct_shared = FrameControlToken(
163163+ target=0, act_id=0, op=FrameOp.ALLOC_SHARED, payload=1
164164+ )
165165+ inject_and_run(env, pe, fct_shared)
166166+167167+ rejected = [e for e in events if isinstance(e, TokenRejected)]
168168+ assert len(rejected) > 0, "Should have TokenRejected event"
169169+ assert "already in tag store" in rejected[0].reason, "Reason should mention already in tag store"
170170+171171+ # Frame and lane should be unchanged
172172+ assert pe.tag_store[0] == (frame_id_0, lane_0), "act_id=0 state should be unchanged"
173173+174174+175175+class TestLaneExhaustion:
176176+ """AC3.6, AC8.2: Lane exhaustion and TokenRejected."""
177177+178178+ def test_alloc_shared_exhausts_all_lanes(self):
179179+ """Allocate all lanes, then ALLOC_SHARED fails with TokenRejected."""
180180+ env = simpy.Environment()
181181+ events = []
182182+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
183183+ pe = ProcessingElement(env=env, pe_id=0, config=config)
184184+185185+ # Parent ALLOC uses lane 0
186186+ fct_parent = FrameControlToken(
187187+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
188188+ )
189189+ inject_and_run(env, pe, fct_parent)
190190+191191+ # Allocate lanes 1, 2, 3
192192+ for i in range(1, 4):
193193+ fct = FrameControlToken(
194194+ target=0, act_id=i, op=FrameOp.ALLOC_SHARED, payload=0
195195+ )
196196+ inject_and_run(env, pe, fct)
197197+ assert i in pe.tag_store, f"Child {i} should be allocated"
198198+199199+ # Try to allocate one more (all lanes exhausted)
200200+ fct_fail = FrameControlToken(
201201+ target=0, act_id=4, op=FrameOp.ALLOC_SHARED, payload=0
202202+ )
203203+ inject_and_run(env, pe, fct_fail)
204204+205205+ rejected = [e for e in events if isinstance(e, TokenRejected)]
206206+ assert len(rejected) > 0, "Should have TokenRejected event"
207207+ assert "no free lanes" in rejected[0].reason, "Reason should be 'no free lanes'"
208208+209209+ # act_id=4 should not be in tag_store
210210+ assert 4 not in pe.tag_store, "Failed allocation should not add to tag_store"
211211+212212+ def test_lane_exhaustion_with_multiple_frames(self):
213213+ """Lane exhaustion is per-frame; different frames have independent lanes."""
214214+ env = simpy.Environment()
215215+ events = []
216216+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
217217+ pe = ProcessingElement(env=env, pe_id=0, config=config)
218218+219219+ # Frame 1: Parent 0 allocates lane 0
220220+ fct1 = FrameControlToken(
221221+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
222222+ )
223223+ inject_and_run(env, pe, fct1)
224224+ frame1_id, _lane = pe.tag_store[0]
225225+226226+ # Frame 2: Parent 10 allocates lane 0
227227+ fct2 = FrameControlToken(
228228+ target=0, act_id=10, op=FrameOp.ALLOC, payload=0
229229+ )
230230+ inject_and_run(env, pe, fct2)
231231+ frame2_id, _lane = pe.tag_store[10]
232232+233233+ assert frame1_id != frame2_id, "Should allocate different frames"
234234+235235+ # Frame 1: Exhaust all lanes
236236+ for i in range(1, 4):
237237+ fct = FrameControlToken(
238238+ target=0, act_id=i, op=FrameOp.ALLOC_SHARED, payload=0
239239+ )
240240+ inject_and_run(env, pe, fct)
241241+242242+ # Frame 2: Can still allocate more lanes (independent)
243243+ for i in range(11, 14):
244244+ fct = FrameControlToken(
245245+ target=0, act_id=i, op=FrameOp.ALLOC_SHARED, payload=10
246246+ )
247247+ inject_and_run(env, pe, fct)
248248+ assert i in pe.tag_store, f"Frame2 child {i} should be allocated"
249249+250250+251251+class TestFreeLane:
252252+ """AC3.2: FREE_LANE clears lane data, keeps the frame while other lanes still use it, returns lane to pool."""
253253+254254+ def test_free_lane_basic(self):
255255+ """FREE_LANE removes act_id from tag_store, clears lane data, keeps frame."""
256256+ env = simpy.Environment()
257257+ events = []
258258+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
259259+ pe = ProcessingElement(env=env, pe_id=0, config=config)
260260+261261+ # Parent ALLOC
262262+ fct_parent = FrameControlToken(
263263+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
264264+ )
265265+ inject_and_run(env, pe, fct_parent)
266266+ parent_frame_id, _parent_lane = pe.tag_store[0]
267267+268268+ # Child ALLOC_SHARED
269269+ fct_child = FrameControlToken(
270270+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
271271+ )
272272+ inject_and_run(env, pe, fct_child)
273273+ _child_frame_id, child_lane = pe.tag_store[1]
274274+275275+ # FREE_LANE for child
276276+ fct_free = FrameControlToken(
277277+ target=0, act_id=1, op=FrameOp.FREE_LANE, payload=0
278278+ )
279279+ inject_and_run(env, pe, fct_free)
280280+281281+ # Child should be removed from tag_store
282282+ assert 1 not in pe.tag_store, "Child should be removed from tag_store"
283283+284284+ # Parent should still be present
285285+ assert 0 in pe.tag_store, "Parent should still be in tag_store"
286286+287287+ # Frame should NOT be in free_frames (still used by parent)
288288+ assert parent_frame_id not in pe.free_frames, "Frame should not be free"
289289+290290+ # FrameFreed event should have frame_freed=False
291291+ frame_freed = [e for e in events if isinstance(e, FrameFreed)]
292292+ assert len(frame_freed) > 0, "Should have FrameFreed event"
293293+ assert not frame_freed[-1].frame_freed, "frame_freed should be False"
294294+ assert frame_freed[-1].lane == child_lane, "Event should report correct lane"
295295+296296+ def test_free_lane_returns_lane_to_pool(self):
297297+ """After FREE_LANE, freed lane can be reused by ALLOC_SHARED."""
298298+ env = simpy.Environment()
299299+ events = []
300300+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
301301+ pe = ProcessingElement(env=env, pe_id=0, config=config)
302302+303303+ # Parent ALLOC
304304+ fct_parent = FrameControlToken(
305305+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
306306+ )
307307+ inject_and_run(env, pe, fct_parent)
308308+ parent_frame_id, _parent_lane = pe.tag_store[0]
309309+310310+ # Child 1 ALLOC_SHARED (lane 1)
311311+ fct_child1 = FrameControlToken(
312312+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
313313+ )
314314+ inject_and_run(env, pe, fct_child1)
315315+ _child1_frame_id, child1_lane = pe.tag_store[1]
316316+ assert child1_lane == 1
317317+318318+ # FREE_LANE child 1
319319+ fct_free = FrameControlToken(
320320+ target=0, act_id=1, op=FrameOp.FREE_LANE, payload=0
321321+ )
322322+ inject_and_run(env, pe, fct_free)
323323+324324+ # Child 2 ALLOC_SHARED (should get lane 1 again)
325325+ fct_child2 = FrameControlToken(
326326+ target=0, act_id=2, op=FrameOp.ALLOC_SHARED, payload=0
327327+ )
328328+ inject_and_run(env, pe, fct_child2)
329329+ _child2_frame_id, child2_lane = pe.tag_store[2]
330330+331331+ # Lane 1 should be reused for child 2
332332+ assert child2_lane == 1, "Freed lane 1 should be reused"
333333+334334+335335+class TestIndependentMatching:
336336+ """AC8.1: Two act_ids sharing a frame have independent matching."""
337337+338338+ def test_independent_matching_same_offset(self):
339339+ """L operand for act_id 0 does not interfere with L for act_id 1."""
340340+ env = simpy.Environment()
341341+ events = []
342342+ config = PEConfig(
343343+ frame_count=4, lane_count=4, matchable_offsets=4, on_event=events.append
344344+ )
345345+ pe = ProcessingElement(env=env, pe_id=0, config=config)
346346+347347+ # Parent ALLOC
348348+ fct_parent = FrameControlToken(
349349+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
350350+ )
351351+ inject_and_run(env, pe, fct_parent)
352352+ parent_frame_id, _parent_lane = pe.tag_store[0]
353353+354354+ # Child ALLOC_SHARED
355355+ fct_child = FrameControlToken(
356356+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
357357+ )
358358+ inject_and_run(env, pe, fct_child)
359359+ _child_frame_id, child_lane = pe.tag_store[1]
360360+361361+ # Install dyadic instruction at offset 0
362362+ inst = Instruction(
363363+ opcode=ArithOp.ADD,
364364+ output=OutputStyle.SINK,
365365+ has_const=False,
366366+ dest_count=0,
367367+ wide=False,
368368+ fref=0,
369369+ )
370370+ pe.iram[0] = inst
371371+372372+ # Send L operand for act_id=0
373373+ tok_l_0 = DyadToken(
374374+ target=0, offset=0, act_id=0, data=5, port=Port.L
375375+ )
376376+ inject_and_run(env, pe, tok_l_0)
377377+378378+ # Should have 0 Matched events so far (act_id=0 is still waiting for R)
379379+ matched = [e for e in events if isinstance(e, Matched)]
380380+ assert len(matched) == 0, "Should not match yet (waiting for R)"
381381+382382+ # Send L operand for act_id=1 at same offset
383383+ tok_l_1 = DyadToken(
384384+ target=0, offset=0, act_id=1, data=7, port=Port.L
385385+ )
386386+ inject_and_run(env, pe, tok_l_1)
387387+388388+ # Should still have 0 Matched (both waiting for R)
389389+ matched = [e for e in events if isinstance(e, Matched)]
390390+ assert len(matched) == 0, "Both should be waiting for R"
391391+392392+ # Send R for act_id=0
393393+ tok_r_0 = DyadToken(
394394+ target=0, offset=0, act_id=0, data=3, port=Port.R
395395+ )
396396+ inject_and_run(env, pe, tok_r_0)
397397+398398+ # Should now have 1 Matched for act_id=0
399399+ matched = [e for e in events if isinstance(e, Matched)]
400400+ assert len(matched) == 1, "Should have 1 match for act_id=0"
401401+ assert matched[0].act_id == 0, "Match should be for act_id=0"
402402+ assert matched[0].left == 5, "Left should be 5"
403403+ assert matched[0].right == 3, "Right should be 3"
404404+405405+ # Send R for act_id=1
406406+ tok_r_1 = DyadToken(
407407+ target=0, offset=0, act_id=1, data=2, port=Port.R
408408+ )
409409+ inject_and_run(env, pe, tok_r_1)
410410+411411+ # Should now have 2 Matched
412412+ matched = [e for e in events if isinstance(e, Matched)]
413413+ assert len(matched) == 2, "Should have 2 matches total"
414414+ m1 = [m for m in matched if m.act_id == 1][0]
415415+ assert m1.left == 7, "act_id=1 left should be 7"
416416+ assert m1.right == 2, "act_id=1 right should be 2"
417417+418418+ def test_independent_matching_different_offsets(self):
419419+ """Different offsets per lane maintain independence."""
420420+ env = simpy.Environment()
421421+ events = []
422422+ config = PEConfig(
423423+ frame_count=4, lane_count=4, matchable_offsets=4, on_event=events.append
424424+ )
425425+ pe = ProcessingElement(env=env, pe_id=0, config=config)
426426+427427+ # Parent ALLOC
428428+ fct_parent = FrameControlToken(
429429+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
430430+ )
431431+ inject_and_run(env, pe, fct_parent)
432432+433433+ # Child ALLOC_SHARED
434434+ fct_child = FrameControlToken(
435435+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
436436+ )
437437+ inject_and_run(env, pe, fct_child)
438438+439439+ # Install dyadic instructions at offsets 0 and 1
440440+ inst0 = Instruction(
441441+ opcode=ArithOp.ADD, output=OutputStyle.SINK,
442442+ has_const=False, dest_count=0, wide=False, fref=0
443443+ )
444444+ inst1 = Instruction(
445445+ opcode=ArithOp.SUB, output=OutputStyle.SINK,
446446+ has_const=False, dest_count=0, wide=False, fref=0
447447+ )
448448+ pe.iram[0] = inst0
449449+ pe.iram[1] = inst1
450450+451451+ # Send L for act_id=0 at offset 0
452452+ tok_l_0_off0 = DyadToken(
453453+ target=0, offset=0, act_id=0, data=10, port=Port.L
454454+ )
455455+ inject_and_run(env, pe, tok_l_0_off0)
456456+457457+ # Send L for act_id=1 at offset 1
458458+ tok_l_1_off1 = DyadToken(
459459+ target=0, offset=1, act_id=1, data=20, port=Port.L
460460+ )
461461+ inject_and_run(env, pe, tok_l_1_off1)
462462+463463+ # Neither should match yet
464464+ matched = [e for e in events if isinstance(e, Matched)]
465465+ assert len(matched) == 0, "No matches yet"
466466+467467+ # Send R for act_id=0 at offset 0
468468+ tok_r_0_off0 = DyadToken(
469469+ target=0, offset=0, act_id=0, data=5, port=Port.R
470470+ )
471471+ inject_and_run(env, pe, tok_r_0_off0)
472472+473473+ # Should match for offset 0
474474+ matched = [e for e in events if isinstance(e, Matched)]
475475+ assert len(matched) == 1, "Should have 1 match"
476476+ assert matched[0].offset == 0, "Match should be at offset 0"
477477+478478+ # Send R for act_id=1 at offset 1
479479+ tok_r_1_off1 = DyadToken(
480480+ target=0, offset=1, act_id=1, data=15, port=Port.R
481481+ )
482482+ inject_and_run(env, pe, tok_r_1_off1)
483483+484484+ # Should match for offset 1
485485+ matched = [e for e in events if isinstance(e, Matched)]
486486+ assert len(matched) == 2, "Should have 2 matches"
487487+ m1 = [m for m in matched if m.offset == 1][0]
488488+ assert m1.act_id == 1, "Offset 1 match should be act_id=1"
489489+490490+491491+class TestSmartFree:
492492+ """AC3.3, AC8.3: Smart FREE on shared frames preserves data and manages lanes."""
493493+494494+ def test_free_on_shared_frame_preserves_other_lanes(self):
495495+ """FREE on act_id=0 when act_id=1 uses frame; lane 1 data preserved."""
496496+ env = simpy.Environment()
497497+ events = []
498498+ config = PEConfig(
499499+ frame_count=4, lane_count=4, matchable_offsets=4, on_event=events.append
500500+ )
501501+ pe = ProcessingElement(env=env, pe_id=0, config=config)
502502+503503+ # Parent ALLOC
504504+ fct_parent = FrameControlToken(
505505+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
506506+ )
507507+ inject_and_run(env, pe, fct_parent)
508508+ parent_frame_id, _parent_lane = pe.tag_store[0]
509509+510510+ # Child ALLOC_SHARED
511511+ fct_child = FrameControlToken(
512512+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
513513+ )
514514+ inject_and_run(env, pe, fct_child)
515515+ _child_frame_id, child_lane = pe.tag_store[1]
516516+517517+ # Install instruction
518518+ inst = Instruction(
519519+ opcode=ArithOp.ADD, output=OutputStyle.SINK,
520520+ has_const=False, dest_count=0, wide=False, fref=0
521521+ )
522522+ pe.iram[0] = inst
523523+524524+ # Store L operand on child's lane
525525+ tok_l_1 = DyadToken(
526526+ target=0, offset=0, act_id=1, data=7, port=Port.L
527527+ )
528528+ inject_and_run(env, pe, tok_l_1)
529529+530530+ # Verify child's match slot has data
531531+ frame_id, lane = pe.tag_store[1]
532532+ assert pe.match_data[frame_id][0][lane] == 7, "Child lane should have L operand"
533533+ assert pe.presence[frame_id][0][lane], "Child presence should be set"
534534+535535+ # FREE parent
536536+ fct_free_parent = FrameControlToken(
537537+ target=0, act_id=0, op=FrameOp.FREE, payload=0
538538+ )
539539+ inject_and_run(env, pe, fct_free_parent)
540540+541541+ # Parent should be removed, child should still be present
542542+ assert 0 not in pe.tag_store, "Parent should be removed"
543543+ assert 1 in pe.tag_store, "Child should still be present"
544544+545545+ # Frame should NOT be in free_frames
546546+ assert parent_frame_id not in pe.free_frames, "Frame should not be free"
547547+548548+ # Child's match data should be preserved
549549+ assert pe.match_data[frame_id][0][lane] == 7, "Child data should be preserved"
550550+ assert pe.presence[frame_id][0][lane], "Child presence should be preserved"
551551+552552+ # FrameFreed event should have frame_freed=False
553553+ frame_freed = [e for e in events if isinstance(e, FrameFreed)]
554554+ assert any(not e.frame_freed for e in frame_freed), "Should have frame_freed=False"
555555+556556+ def test_free_last_lane_returns_frame(self):
557557+ """FREE on last act_id using frame returns frame to free_frames."""
558558+ env = simpy.Environment()
559559+ events = []
560560+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
561561+ pe = ProcessingElement(env=env, pe_id=0, config=config)
562562+563563+ # Parent ALLOC
564564+ fct_parent = FrameControlToken(
565565+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
566566+ )
567567+ inject_and_run(env, pe, fct_parent)
568568+ parent_frame_id, _parent_lane = pe.tag_store[0]
569569+570570+ # Child ALLOC_SHARED
571571+ fct_child = FrameControlToken(
572572+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
573573+ )
574574+ inject_and_run(env, pe, fct_child)
575575+576576+ # FREE_LANE child
577577+ fct_free_child = FrameControlToken(
578578+ target=0, act_id=1, op=FrameOp.FREE_LANE, payload=0
579579+ )
580580+ inject_and_run(env, pe, fct_free_child)
581581+582582+ # Frame should still not be free (parent still using it)
583583+ assert parent_frame_id not in pe.free_frames, "Frame should not be free yet"
584584+585585+ # FREE parent
586586+ fct_free_parent = FrameControlToken(
587587+ target=0, act_id=0, op=FrameOp.FREE, payload=0
588588+ )
589589+ inject_and_run(env, pe, fct_free_parent)
590590+591591+ # Now frame should be free
592592+ assert parent_frame_id in pe.free_frames, "Frame should be free"
593593+594594+ # tag_store should be empty
595595+ assert len(pe.tag_store) == 0, "tag_store should be empty"
596596+597597+ # lane_free entry should be cleaned up
598598+ assert parent_frame_id not in pe.lane_free, "lane_free entry should be cleaned"
599599+600600+ # FrameFreed event should have frame_freed=True
601601+ frame_freed = [e for e in events if isinstance(e, FrameFreed)]
602602+ assert any(e.frame_freed for e in frame_freed), "Should have frame_freed=True"
603603+604604+ def test_alloc_unchanged_allocates_fresh_frame(self):
605605+ """Regular ALLOC still works: allocates fresh frame, lane 0."""
606606+ env = simpy.Environment()
607607+ events = []
608608+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
609609+ pe = ProcessingElement(env=env, pe_id=0, config=config)
610610+611611+ # First ALLOC
612612+ fct1 = FrameControlToken(
613613+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
614614+ )
615615+ inject_and_run(env, pe, fct1)
616616+ frame_id_0, lane_0 = pe.tag_store[0]
617617+ assert lane_0 == 0, "First ALLOC should assign lane 0"
618618+619619+ # Second ALLOC (different frame)
620620+ fct2 = FrameControlToken(
621621+ target=0, act_id=10, op=FrameOp.ALLOC, payload=0
622622+ )
623623+ inject_and_run(env, pe, fct2)
624624+ frame_id_10, lane_10 = pe.tag_store[10]
625625+ assert lane_10 == 0, "Second ALLOC should assign lane 0"
626626+627627+ # Frames should be different
628628+ assert frame_id_0 != frame_id_10, "Different ALLOC should get different frames"
629629+630630+ def test_data_preservation_across_free_lanes(self):
631631+ """Match data on one lane not affected by FREE of another lane."""
632632+ env = simpy.Environment()
633633+ events = []
634634+ config = PEConfig(
635635+ frame_count=4, lane_count=4, matchable_offsets=4, on_event=events.append
636636+ )
637637+ pe = ProcessingElement(env=env, pe_id=0, config=config)
638638+639639+ # Parent ALLOC
640640+ fct_parent = FrameControlToken(
641641+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
642642+ )
643643+ inject_and_run(env, pe, fct_parent)
644644+ frame_id, _parent_lane = pe.tag_store[0]
645645+646646+ # Child 1 ALLOC_SHARED
647647+ fct_child1 = FrameControlToken(
648648+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
649649+ )
650650+ inject_and_run(env, pe, fct_child1)
651651+ _frame_id_1, lane_1 = pe.tag_store[1]
652652+653653+ # Child 2 ALLOC_SHARED
654654+ fct_child2 = FrameControlToken(
655655+ target=0, act_id=2, op=FrameOp.ALLOC_SHARED, payload=0
656656+ )
657657+ inject_and_run(env, pe, fct_child2)
658658+ _frame_id_2, lane_2 = pe.tag_store[2]
659659+660660+ # Install instruction
661661+ inst = Instruction(
662662+ opcode=ArithOp.ADD, output=OutputStyle.SINK,
663663+ has_const=False, dest_count=0, wide=False, fref=0
664664+ )
665665+ pe.iram[0] = inst
666666+667667+ # Store L operand on lane 1
668668+ tok_l_1 = DyadToken(
669669+ target=0, offset=0, act_id=1, data=7, port=Port.L
670670+ )
671671+ inject_and_run(env, pe, tok_l_1)
672672+673673+ # Store L operand on lane 2
674674+ tok_l_2 = DyadToken(
675675+ target=0, offset=0, act_id=2, data=11, port=Port.L
676676+ )
677677+ inject_and_run(env, pe, tok_l_2)
678678+679679+ # FREE lane 1
680680+ fct_free_1 = FrameControlToken(
681681+ target=0, act_id=1, op=FrameOp.FREE_LANE, payload=0
682682+ )
683683+ inject_and_run(env, pe, fct_free_1)
684684+685685+ # Lane 2's data should be untouched
686686+ assert pe.match_data[frame_id][0][lane_2] == 11, "Lane 2 data should be preserved"
687687+ assert pe.presence[frame_id][0][lane_2], "Lane 2 presence should be preserved"
688688+689689+ # Lane 1 should be cleared
690690+ assert pe.match_data[frame_id][0][lane_1] is None, "Lane 1 data should be cleared"
691691+ assert not pe.presence[frame_id][0][lane_1], "Lane 1 presence should be cleared"
692692+693693+694694+class TestAllocRemoteDataDriven:
695695+ """AC8.4, AC8.5: ALLOC_REMOTE reads fref+2 for data-driven ALLOC_SHARED vs ALLOC."""
696696+697697+ def test_alloc_remote_emits_alloc_shared_when_parent_nonzero(self):
698698+ """AC8.4: ALLOC_REMOTE emits ALLOC_SHARED when fref+2 is non-zero."""
699699+ env = simpy.Environment()
700700+ events = []
701701+ output_store = simpy.Store(env)
702702+703703+ # PE0: source of ALLOC_REMOTE
704704+ config0 = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
705705+ pe0 = ProcessingElement(env=env, pe_id=0, config=config0)
706706+ pe0.route_table[1] = output_store # Capture emitted token
707707+708708+ # Allocate a frame for act_id=0 on PE0
709709+ fct_parent = FrameControlToken(
710710+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
711711+ )
712712+ inject_and_run(env, pe0, fct_parent)
713713+ frame_id, _lane = pe0.tag_store[0]
714714+715715+ # Set up ALLOC_REMOTE instruction with fref pointing to frame constants
716716+ # fref+0: target PE=1, fref+1: target act_id=5, fref+2: parent act_id=3
717717+ inst = Instruction(
718718+ opcode=RoutingOp.ALLOC_REMOTE,
719719+ output=OutputStyle.SINK, # Not used for ALLOC_REMOTE
720720+ has_const=False,
721721+ dest_count=0,
722722+ wide=False,
723723+ fref=10,
724724+ )
725725+ pe0.iram[0] = inst
726726+727727+ # Load frame slots with constants
728728+ pe0.frames[frame_id][10] = 1 # target PE
729729+ pe0.frames[frame_id][11] = 5 # target act_id
730730+ pe0.frames[frame_id][12] = 3 # parent act_id (non-zero = ALLOC_SHARED)
731731+732732+ # Send a single DyadToken (port L) to trigger ALLOC_REMOTE
733733+ tok = DyadToken(
734734+ target=0, offset=0, act_id=0, data=0, port=Port.L
735735+ )
736736+ inject_and_run(env, pe0, tok)
737737+738738+ # Verify FrameControlToken was emitted with ALLOC_SHARED
739739+ assert len(output_store.items) > 0, "Should have emitted a token"
740740+ emitted = output_store.items[0]
741741+ assert isinstance(emitted, FrameControlToken), "Should emit FrameControlToken"
742742+ assert emitted.op == FrameOp.ALLOC_SHARED, "Should emit ALLOC_SHARED"
743743+ assert emitted.payload == 3, "Payload should be parent act_id=3"
744744+ assert emitted.target == 1, "Should target PE 1"
745745+ assert emitted.act_id == 5, "Should target act_id 5"
746746+747747+ def test_alloc_remote_emits_alloc_when_parent_zero(self):
748748+ """AC8.5: ALLOC_REMOTE emits ALLOC when fref+2 is zero (backwards compatible)."""
749749+ env = simpy.Environment()
750750+ events = []
751751+ output_store = simpy.Store(env)
752752+753753+ # PE0: source of ALLOC_REMOTE
754754+ config0 = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
755755+ pe0 = ProcessingElement(env=env, pe_id=0, config=config0)
756756+ pe0.route_table[1] = output_store # Capture emitted token
757757+758758+ # Allocate a frame for act_id=0 on PE0
759759+ fct_parent = FrameControlToken(
760760+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
761761+ )
762762+ inject_and_run(env, pe0, fct_parent)
763763+ frame_id, _lane = pe0.tag_store[0]
764764+765765+ # Set up ALLOC_REMOTE instruction
766766+ # fref+0: target PE=1, fref+1: target act_id=5, fref+2: parent act_id=0
767767+ inst = Instruction(
768768+ opcode=RoutingOp.ALLOC_REMOTE,
769769+ output=OutputStyle.SINK,
770770+ has_const=False,
771771+ dest_count=0,
772772+ wide=False,
773773+ fref=10,
774774+ )
775775+ pe0.iram[0] = inst
776776+777777+ # Load frame slots with constants
778778+ pe0.frames[frame_id][10] = 1 # target PE
779779+ pe0.frames[frame_id][11] = 5 # target act_id
780780+ pe0.frames[frame_id][12] = 0 # parent act_id (zero = ALLOC)
781781+782782+ # Send a single DyadToken (port L) to trigger ALLOC_REMOTE
783783+ tok = DyadToken(
784784+ target=0, offset=0, act_id=0, data=0, port=Port.L
785785+ )
786786+ inject_and_run(env, pe0, tok)
787787+788788+ # Verify FrameControlToken was emitted with ALLOC (not ALLOC_SHARED)
789789+ assert len(output_store.items) > 0, "Should have emitted a token"
790790+ emitted = output_store.items[0]
791791+ assert isinstance(emitted, FrameControlToken), "Should emit FrameControlToken"
792792+ assert emitted.op == FrameOp.ALLOC, "Should emit ALLOC"
793793+ assert emitted.payload == 0, "Payload should be 0 for ALLOC"
794794+ assert emitted.target == 1, "Should target PE 1"
795795+ assert emitted.act_id == 5, "Should target act_id 5"
796796+797797+ def test_alloc_remote_fref_plus_2_missing_defaults_to_zero(self):
798798+ """ALLOC_REMOTE gracefully handles fref+2 outside frame bounds (defaults to 0)."""
799799+ env = simpy.Environment()
800800+ events = []
801801+ output_store = simpy.Store(env)
802802+803803+ # PE0: source of ALLOC_REMOTE
804804+ config0 = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
805805+ pe0 = ProcessingElement(env=env, pe_id=0, config=config0)
806806+ pe0.route_table[1] = output_store
807807+808808+ # Allocate frame
809809+ fct_parent = FrameControlToken(
810810+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
811811+ )
812812+ inject_and_run(env, pe0, fct_parent)
813813+ frame_id, _lane = pe0.tag_store[0]
814814+815815+ # Set up ALLOC_REMOTE with fref pointing near end of frame
816816+ inst = Instruction(
817817+ opcode=RoutingOp.ALLOC_REMOTE,
818818+ output=OutputStyle.SINK,
819819+ has_const=False,
820820+ dest_count=0,
821821+ wide=False,
822822+ fref=62, # frame_slots defaults to 64, so fref+2=64 is outside
823823+ )
824824+ pe0.iram[0] = inst
825825+826826+ # Load only fref+0 and fref+1 (fref+2 is beyond frame bounds)
827827+ pe0.frames[frame_id][62] = 1
828828+ pe0.frames[frame_id][63] = 7
829829+830830+ # Send a single DyadToken (port L) to trigger ALLOC_REMOTE
831831+ tok = DyadToken(
832832+ target=0, offset=0, act_id=0, data=0, port=Port.L
833833+ )
834834+ inject_and_run(env, pe0, tok)
835835+836836+ # Should emit ALLOC (not ALLOC_SHARED) because fref+2 is missing/falsy
837837+ assert len(output_store.items) > 0, "Should have emitted a token"
838838+ emitted = output_store.items[0]
839839+ assert emitted.op == FrameOp.ALLOC, "Should emit ALLOC when fref+2 is missing"
840840+841841+842842+class TestFreeFrameOpcode:
843843+ """AC5.1: FREE_FRAME opcode uses smart FREE behaviour on shared frames."""
844844+845845+ def test_free_frame_opcode_shared_frame_partial_free(self):
846846+ """FREE_FRAME smart free: partial frame free when other lanes remain."""
847847+ env = simpy.Environment()
848848+ events = []
849849+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
850850+ pe = ProcessingElement(env=env, pe_id=0, config=config)
851851+852852+ # Pre-allocate frame with two act_ids on different lanes
853853+ # This simulates ALLOC for act_id=0 and ALLOC_SHARED for act_id=1
854854+ frame_id = 0
855855+ pe.frames[frame_id] = [None] * pe.frame_slots
856856+ pe.tag_store[0] = (frame_id, 0) # act_id=0 on lane 0
857857+ pe.tag_store[1] = (frame_id, 1) # act_id=1 on lane 1
858858+ pe.lane_free[frame_id] = {2, 3} # Lanes 2 and 3 are free
859859+ # Remove frame_id from free_frames (it's in use)
860860+ if frame_id in pe.free_frames:
861861+ pe.free_frames.remove(frame_id)
862862+863863+ # Install FREE_FRAME instruction
864864+ inst = Instruction(
865865+ opcode=RoutingOp.FREE_FRAME,
866866+ output=OutputStyle.SINK,
867867+ has_const=False,
868868+ dest_count=0,
869869+ wide=False,
870870+ fref=0,
871871+ )
872872+ pe.iram[0] = inst
873873+874874+ # Send a single DyadToken (port L) for act_id=0 to trigger FREE_FRAME
875875+ tok = DyadToken(
876876+ target=0, offset=0, act_id=0, data=0, port=Port.L
877877+ )
878878+ inject_and_run(env, pe, tok)
879879+880880+ # Verify act_id=0 is removed from tag_store
881881+ assert 0 not in pe.tag_store, "act_id=0 should be removed from tag_store"
882882+883883+ # Verify act_id=1 is still in tag_store
884884+ assert 1 in pe.tag_store, "act_id=1 should still be in tag_store"
885885+886886+ # Verify frame is NOT returned to free_frames (still in use by act_id=1)
887887+ assert frame_id not in pe.free_frames, "Frame should not be in free_frames"
888888+889889+ # Verify FrameFreed event has frame_freed=False
890890+ frame_freed = [e for e in events if isinstance(e, FrameFreed)]
891891+ assert any(not e.frame_freed for e in frame_freed), \
892892+ "Should have FrameFreed event with frame_freed=False"
893893+ last_frame_freed = [e for e in frame_freed if e.act_id == 0][-1]
894894+ assert not last_frame_freed.frame_freed, "Frame should not be marked as freed"
895895+896896+ def test_free_frame_opcode_shared_frame_full_free(self):
897897+ """FREE_FRAME smart free: full frame free when last lane is freed."""
898898+ env = simpy.Environment()
899899+ events = []
900900+ config = PEConfig(frame_count=4, lane_count=4, on_event=events.append)
901901+ pe = ProcessingElement(env=env, pe_id=0, config=config)
902902+903903+ # Pre-allocate frame with two act_ids
904904+ frame_id = 0
905905+ pe.frames[frame_id] = [None] * pe.frame_slots
906906+ pe.tag_store[0] = (frame_id, 0) # act_id=0 on lane 0
907907+ pe.tag_store[1] = (frame_id, 1) # act_id=1 on lane 1
908908+ pe.lane_free[frame_id] = {2, 3}
909909+ if frame_id in pe.free_frames:
910910+ pe.free_frames.remove(frame_id)
911911+912912+ # Install FREE_FRAME instruction
913913+ inst = Instruction(
914914+ opcode=RoutingOp.FREE_FRAME,
915915+ output=OutputStyle.SINK,
916916+ has_const=False,
917917+ dest_count=0,
918918+ wide=False,
919919+ fref=0,
920920+ )
921921+ pe.iram[0] = inst
922922+923923+ # First: free act_id=0
924924+ tok0 = DyadToken(
925925+ target=0, offset=0, act_id=0, data=0, port=Port.L
926926+ )
927927+ inject_and_run(env, pe, tok0)
928928+929929+ # Verify frame still not free
930930+ assert frame_id not in pe.free_frames, "Frame should not be free after first FREE_FRAME"
931931+ assert 1 in pe.tag_store, "act_id=1 should still be present"
932932+933933+ # Second: free act_id=1 (last lane on frame)
934934+ tok1 = DyadToken(
935935+ target=0, offset=0, act_id=1, data=0, port=Port.L
936936+ )
937937+ inject_and_run(env, pe, tok1)
938938+939939+ # Verify frame is now freed
940940+ assert frame_id in pe.free_frames, "Frame should be in free_frames after last FREE_FRAME"
941941+ assert 1 not in pe.tag_store, "act_id=1 should be removed from tag_store"
942942+943943+ # Verify tag_store is empty
944944+ assert len(pe.tag_store) == 0, "tag_store should be empty"
945945+946946+ # Verify lane_free is cleaned up
947947+ assert frame_id not in pe.lane_free, "lane_free entry should be deleted"
948948+949949+ # Verify FrameFreed event has frame_freed=True
950950+ frame_freed = [e for e in events if isinstance(e, FrameFreed)]
951951+ last_frame_freed = [e for e in frame_freed if e.act_id == 1][-1]
952952+ assert last_frame_freed.frame_freed, \
953953+ "Last FREE_FRAME should emit FrameFreed with frame_freed=True"
954954+955955+956956+class TestLoopPipelining:
957957+ """AC8.6: Full loop pipelining integration test with multiple lanes."""
958958+959959+ def test_full_loop_pipelining_scenario(self):
960960+ """
961961+ Complete loop pipelining lifecycle: two iterations of a dyadic instruction
962962+ running concurrently on different lanes, both producing correct results.
963963+964964+ Simulates:
965965+ 1. ALLOC(act_id=0) → frame, lane 0
966966+ 2. Setup: write destination to frame
967967+ 3. Install ADD instruction at IRAM offset 0
968968+ 4. ALLOC_SHARED(act_id=1, parent=0) → same frame, lane 1
969969+ 5. Iteration 1: inject L and R DyadTokens for act_id=0; verify match and output
970970+ 6. Iteration 2: inject L and R DyadTokens for act_id=1; verify it matches independently
971971+ 7. FREE(act_id=0) → lane 0 freed, frame stays
972972+ 8. FREE(act_id=1) → last lane, frame returned to free list
973973+ """
974974+ env = simpy.Environment()
975975+ events = []
976976+ config = PEConfig(
977977+ frame_count=4, lane_count=4, matchable_offsets=4,
978978+ on_event=events.append
979979+ )
980980+ pe = ProcessingElement(env=env, pe_id=0, config=config)
981981+982982+ # 1. ALLOC(act_id=0) → frame, lane 0
983983+ fct_alloc_0 = FrameControlToken(
984984+ target=0, act_id=0, op=FrameOp.ALLOC, payload=0
985985+ )
986986+ inject_and_run(env, pe, fct_alloc_0)
987987+988988+ # Verify act_id=0 is allocated
989989+ assert 0 in pe.tag_store, "act_id=0 should be in tag_store"
990990+ frame_id, lane_0 = pe.tag_store[0]
991991+ assert lane_0 == 0, "First ALLOC should assign lane 0"
992992+993993+ # Verify FrameAllocated event for iteration 1
994994+ frame_allocated = [e for e in events if isinstance(e, FrameAllocated)]
995995+ assert len(frame_allocated) >= 1, "Should have FrameAllocated event"
996996+ assert frame_allocated[0].frame_id == frame_id, "Event should report correct frame_id"
997997+ assert frame_allocated[0].lane == 0, "Event should report lane 0"
998998+999999+ # 2. Setup: write destination to frame at slot 8
10001000+ dest = FrameDest(
10011001+ target_pe=1, offset=0, act_id=0, port=Port.L,
10021002+ token_kind=TokenKind.MONADIC
10031003+ )
10041004+ pe.frames[frame_id][8] = dest
10051005+10061006+ # Set up route to capture output
10071007+ pe.route_table[1] = simpy.Store(env)
10081008+10091009+ # 3. Install ADD instruction at IRAM offset 0
10101010+ inst = Instruction(
10111011+ opcode=ArithOp.ADD,
10121012+ output=OutputStyle.INHERIT,
10131013+ has_const=False,
10141014+ dest_count=1,
10151015+ wide=False,
10161016+ fref=8,
10171017+ )
10181018+ pe.iram[0] = inst
10191019+10201020+ # 4. ALLOC_SHARED(act_id=1, parent=0) → same frame, lane 1
10211021+ fct_alloc_shared = FrameControlToken(
10221022+ target=0, act_id=1, op=FrameOp.ALLOC_SHARED, payload=0
10231023+ )
10241024+ inject_and_run(env, pe, fct_alloc_shared)
10251025+10261026+ # Verify act_id=1 is allocated on same frame, different lane
10271027+ assert 1 in pe.tag_store, "act_id=1 should be in tag_store"
10281028+ frame_id_1, lane_1 = pe.tag_store[1]
10291029+ assert frame_id_1 == frame_id, "Both should share same frame"
10301030+ assert lane_1 == 1, "Second allocation should assign lane 1"
10311031+ assert lane_1 != lane_0, "Lanes should be different"
10321032+10331033+ # Verify FrameAllocated event for iteration 2
10341034+ frame_allocated = [e for e in events if isinstance(e, FrameAllocated)]
10351035+ assert len(frame_allocated) >= 2, "Should have 2 FrameAllocated events"
10361036+ assert frame_allocated[1].frame_id == frame_id, "Event should report correct frame_id"
10371037+ assert frame_allocated[1].lane == 1, "Event should report lane 1"
10381038+10391039+ # 5. Inject iteration 1 operands (act_id=0, lane 0)
10401040+ tok_l_0 = DyadToken(
10411041+ target=0, offset=0, act_id=0, data=100, port=Port.L
10421042+ )
10431043+ inject_and_run(env, pe, tok_l_0)
10441044+10451045+ tok_r_0 = DyadToken(
10461046+ target=0, offset=0, act_id=0, data=200, port=Port.R
10471047+ )
10481048+ inject_and_run(env, pe, tok_r_0)
10491049+10501050+ # Verify Matched event for iteration 1
10511051+ matched = [e for e in events if isinstance(e, Matched)]
10521052+ assert len(matched) >= 1, "Should have Matched event for iteration 1"
10531053+ match_0 = [m for m in matched if m.act_id == 0][-1]
10541054+ assert match_0.left == 100, "Iteration 1 left operand should be 100"
10551055+ assert match_0.right == 200, "Iteration 1 right operand should be 200"
10561056+ assert match_0.offset == 0, "Iteration 1 offset should be 0"
10571057+10581058+ # Verify output token with correct data (100+200=300)
10591059+ emitted = [e for e in events if isinstance(e, Emitted)]
10601060+ assert len(emitted) >= 1, "Should have Emitted event for iteration 1"
10611061+ out_tok_0 = emitted[-1].token
10621062+ assert out_tok_0.data == 300, "Iteration 1 output should be 300 (100+200)"
10631063+ assert out_tok_0.target == 1, "Output should route to target_pe=1"
10641064+10651065+ # 6. Inject iteration 2 operands (act_id=1, lane 1)
10661066+ tok_l_1 = DyadToken(
10671067+ target=0, offset=0, act_id=1, data=1000, port=Port.L
10681068+ )
10691069+ inject_and_run(env, pe, tok_l_1)
10701070+10711071+ tok_r_1 = DyadToken(
10721072+ target=0, offset=0, act_id=1, data=2000, port=Port.R
10731073+ )
10741074+ inject_and_run(env, pe, tok_r_1)
10751075+10761076+ # Verify Matched event for iteration 2
10771077+ matched = [e for e in events if isinstance(e, Matched)]
10781078+ assert len(matched) >= 2, "Should have Matched events for both iterations"
10791079+ match_1 = [m for m in matched if m.act_id == 1][-1]
10801080+ assert match_1.left == 1000, "Iteration 2 left operand should be 1000"
10811081+ assert match_1.right == 2000, "Iteration 2 right operand should be 2000"
10821082+ assert match_1.offset == 0, "Iteration 2 offset should be 0"
10831083+10841084+ # Verify output token with correct data (1000+2000=3000)
10851085+ emitted = [e for e in events if isinstance(e, Emitted)]
10861086+ assert len(emitted) >= 2, "Should have Emitted events for both iterations"
10871087+ out_tok_1 = emitted[-1].token
10881088+ assert out_tok_1.data == 3000, "Iteration 2 output should be 3000 (1000+2000)"
10891089+ assert out_tok_1.target == 1, "Output should route to target_pe=1"
10901090+10911091+ # Interleaved verification: confirm independent lanes
10921092+ matches_by_id = {}
10931093+ for m in matched:
10941094+ if m.act_id not in matches_by_id:
10951095+ matches_by_id[m.act_id] = []
10961096+ matches_by_id[m.act_id].append(m)
10971097+10981098+ assert 0 in matches_by_id, "Should have match for iteration 1 (act_id=0)"
10991099+ assert 1 in matches_by_id, "Should have match for iteration 2 (act_id=1)"
11001100+ assert matches_by_id[0][-1].left == 100, "Iteration 1 left should be 100"
11011101+ assert matches_by_id[1][-1].left == 1000, "Iteration 2 left should be 1000"
11021102+11031103+ # 7. FREE(act_id=0) → lane 0 freed, frame stays
11041104+ fct_free_0 = FrameControlToken(
11051105+ target=0, act_id=0, op=FrameOp.FREE, payload=0
11061106+ )
11071107+ inject_and_run(env, pe, fct_free_0)
11081108+11091109+ # Verify act_id=0 removed, act_id=1 still present
11101110+ assert 0 not in pe.tag_store, "act_id=0 should be removed from tag_store"
11111111+ assert 1 in pe.tag_store, "act_id=1 should still be in tag_store"
11121112+11131113+ # Verify frame not returned (still used by act_id=1)
11141114+ assert frame_id not in pe.free_frames, "Frame should not be in free_frames"
11151115+11161116+ # Verify FrameFreed event with frame_freed=False
11171117+ frame_freed = [e for e in events if isinstance(e, FrameFreed)]
11181118+ freed_0 = [f for f in frame_freed if f.act_id == 0][-1]
11191119+ assert freed_0.frame_freed is False, "frame_freed should be False (not last lane)"
11201120+ assert freed_0.lane == lane_0, "Event should report lane 0"
11211121+11221122+ # 8. FREE(act_id=1) → last lane, frame returned to free list
11231123+ fct_free_1 = FrameControlToken(
11241124+ target=0, act_id=1, op=FrameOp.FREE, payload=0
11251125+ )
11261126+ inject_and_run(env, pe, fct_free_1)
11271127+11281128+ # Verify act_id=1 removed from tag_store
11291129+ assert 1 not in pe.tag_store, "act_id=1 should be removed from tag_store"
11301130+11311131+ # Verify tag_store is now empty
11321132+ assert len(pe.tag_store) == 0, "tag_store should be empty"
11331133+11341134+ # Verify frame returned to free_frames
11351135+ assert frame_id in pe.free_frames, "Frame should be in free_frames"
11361136+11371137+ # Verify lane_free entry cleaned up
11381138+ assert frame_id not in pe.lane_free, "lane_free entry should be deleted"
11391139+11401140+ # Verify FrameFreed event with frame_freed=True
11411141+ frame_freed = [e for e in events if isinstance(e, FrameFreed)]
11421142+ freed_1 = [f for f in frame_freed if f.act_id == 1][-1]
11431143+ assert freed_1.frame_freed is True, "frame_freed should be True (last lane)"
11441144+ assert freed_1.lane == lane_1, "Event should report lane 1"
11451145+11461146+ # Summary: verify AC8.6 acceptance criteria
11471147+ # Both iterations produce mathematically correct results
11481148+ assert matches_by_id[0][-1].left + matches_by_id[0][-1].right == 300, \
11491149+ "Iteration 1 arithmetic correct"
11501150+ assert matches_by_id[1][-1].left + matches_by_id[1][-1].right == 3000, \
11511151+ "Iteration 2 arithmetic correct"
11521152+11531153+ # Both iterations ran on SAME frame (verified at allocation, re-confirmed)
11541154+ assert frame_id_1 == frame_id, "Both iterations ran on same frame"
11551155+11561156+ # Both iterations used DIFFERENT lanes
11571157+ assert lane_0 != lane_1, "Iterations used different lanes"
11581158+ assert lane_0 == 0 and lane_1 == 1, "Lanes are 0 and 1 respectively"
11591159+11601160+ # Freeing one iteration preserved the other
11611161+ frame_freed_events = [e for e in events if isinstance(e, FrameFreed)]
11621162+ assert len(frame_freed_events) >= 2, "Should have at least 2 FrameFreed events"
11631163+11641164+ # Freeing the last iteration returned the frame
11651165+ assert frame_id in pe.free_frames, "Frame returned to pool after last FREE"
+2
tests/test_repl.py
···471471 out = output.getvalue()
472472 # Should show PE state or not found message
473473 assert len(out) > 0
474474+ # Verify formatting includes lane information or empty tag store marker (case-insensitive)
475475+ assert "lane" in out.lower() or "tag store: (empty)" in out.lower()
474476475477 def test_pe_invalid_id(self, repl, temp_dfasm_file):
476478 """pe with non-integer ID should error."""
+2-2
tests/test_snapshot.py
···123123 assert isinstance(frame, tuple)
124124 # Each frame has slots
125125126126- # tag_store should be dict mapping act_id to frame_id
126126+ # tag_store should be a dict mapping act_id to (frame_id, lane) tuple
127127 assert isinstance(pe_snap.tag_store, dict)
128128129129 # presence should be a tuple of tuples (frame_count x matchable_offsets)
···183183 snapshot = capture(system)
184184185185 pe_snap = snapshot.pes[0]
186186- # tag_store should be a dict mapping act_id to frame_id
186186+ # tag_store should be a dict mapping act_id to (frame_id, lane) tuple
187187 assert isinstance(pe_snap.tag_store, dict)
188188 assert hasattr(pe_snap, 'frames')
189189 assert hasattr(pe_snap, 'free_frames')