commits
the migration queue had no dedup, unlike the resolve queue, which uses
queued_set. every message from a DID with a host mismatch duplicated
the DID string from c_allocator and appended it to the queue. with
thousands of mismatched DIDs producing messages faster than the 4
resolver threads could drain them, the queue and its duplicated strings
grew without bound (~430 MiB/hr).
- add migration_pending set to dedup migration queue entries
- on confirmed migration: remove from pending (allow re-evaluation)
- on rejected migration: leave in pending (suppress re-queueing)
- evictKey (#identity events) clears pending for that DID
- add relay_validator_migration_pending prometheus metric
- convert validator cache and DID cache to proper LRU (from lru.zig)
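the pending-set scheme above, sketched in Python (the relay is Zig; all names here are illustrative):

```python
from collections import deque

class MigrationQueue:
    """Sketch of the dedup: a pending set guards the queue so a DID
    is enqueued at most once, however many mismatched messages arrive."""

    def __init__(self):
        self.queue = deque()
        self.pending = set()  # stands in for migration_pending

    def enqueue(self, did: str) -> bool:
        # Without this guard, every mismatched message re-queued the DID.
        if did in self.pending:
            return False
        self.pending.add(did)
        self.queue.append(did)
        return True

    def on_confirmed(self, did: str):
        # Confirmed migration: allow the DID to be re-evaluated later.
        self.pending.discard(did)

    def on_rejected(self, did: str):
        # Rejected: stay in pending, suppressing re-queueing.
        pass

    def evict_key(self, did: str):
        # #identity event: identity changed, clear pending for that DID.
        self.pending.discard(did)
```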
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- add smaps_rollup + smaps parsing: relay_smaps_rss_kb, relay_smaps_anon_kb,
relay_heap_rss_kb, relay_stack_rss_kb for heap/stack/mmap attribution
- rocksdb: max_open_files=256 (was unlimited), write_buffer_size=16M per CF
(was 64M default × 3 CFs = 192M memtable overhead)
- metrics buffer 12K → 16K
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- mallinfo() → mallinfo2() for accurate metrics past 2 GiB
- add smaps_rollup + smaps parsing: relay_smaps_rss_kb, relay_smaps_anon_kb,
relay_heap_rss_kb, relay_stack_rss_kb for heap/stack/mmap attribution
- rocksdb: max_open_files=256 (was unlimited), write_buffer_size=16M per CF
(was 64M default × 3 CFs = 192M memtable overhead)
- metrics buffer 12K → 16K
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
the validator key cache was capped at 500K entries, but the pod OOMs at
~450K, so eviction never triggers. indigo uses 5M with LRU+TTL eviction;
rsky uses 262K. 250K matches rsky and fits within the 3 GiB pod limit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
4 new prometheus gauges to attribute memory usage:
- relay_history_entries (ring buffer depth)
- relay_evtbuf_entries (pending flush buffer)
- relay_did_cache_entries (DID→UID cache)
- relay_consumer_queue_depth (total consumer send buffers)
malloc_trim(0) in GC loop returns freed glibc arena pages to OS.
mallinfo stays as-is (u32 bitcast, 4 GiB range) — mallinfo2 requires
glibc 2.33 but zig cross-compiles against glibc 2.2.5 by default.
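the bitcast workaround can be illustrated outside Zig; this Python sketch shows how a 3 GiB value that has wrapped negative in one of mallinfo's int fields is recovered by reinterpreting the bits as unsigned:

```python
import struct

def bitcast_i32_to_u32(x: int) -> int:
    """Reinterpret a signed 32-bit value as unsigned, as a @bitCast
    does for mallinfo's int fields."""
    return struct.unpack("<I", struct.pack("<i", x))[0]

# A 3 GiB heap overflows a signed 32-bit counter to a negative number;
# the bitcast recovers the true value, extending the range to 4 GiB.
three_gib = 3 * 1024**3
wrapped = struct.unpack("<i", struct.pack("<I", three_gib))[0]
```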
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- skip 50K frame ring buffer when disk persist is active (always in prod)
- remove redundant outbuf in event_log; flush writes directly from evtbuf
- reduce thread stacks from 2 MiB to 1 MiB (~2,700 threads)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
_health on port 3000 competed with consumer WebSocket traffic, causing
k8s probe timeouts under load. also returned unconditional 200 without
actually checking database connectivity.
- _health now runs SELECT 1 against postgres, returns 500 on failure
- metrics port (3001) serves /_health alongside /metrics, with routing
- k8s probes move to port 3001 (in relay deploy config)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- drop commits where rev <= stored rev before persist (indigo ingest.go:114)
- verify DID→PDS host binding on first-seen accounts (async, reject on mismatch)
- on signature failure, evict cached key + re-resolve (sync spec guidance)
- add spec conformance tests for size limits and unknown frame types
- document deliberate policy divergences from indigo in design.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ignore unknown frame types instead of persisting as #identity (spec forward-compat)
- align blocks size limits with lexicon maxLength (2000000, 10000)
- update docs with current memory numbers (~1.2 GiB after CA bundle fix)
- fix websocket.zig link to point at fork
- document shared CA bundle, DID dedupe, migration interleave in design.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- add in-flight set to prevent duplicate DID entries in resolve queue
- interleave migration checks (1 per 10 DID resolutions) to prevent
starvation of the 140K+ migration queue
- bitcast mallinfo i32 fields to u32, extending useful range to 4 GiB
- extract magic numbers into named constants
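a Python sketch of the interleave (illustrative names; the real loop is Zig): every tenth slot goes to the migration queue, so a large migration backlog cannot be starved by resolver traffic:

```python
from collections import deque

def drain(resolve_q: deque, migration_q: deque, out: list) -> None:
    """One migration check per 10 DID resolutions; if the resolve
    queue empties, remaining migration checks run back to back."""
    n = 0
    while resolve_q or migration_q:
        if migration_q and (n % 10 == 9 or not resolve_q):
            out.append(("migrate", migration_q.popleft()))
        else:
            out.append(("resolve", resolve_q.popleft()))
        n += 1
```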
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
each subscriber was loading its own copy of system CA certificates via
Bundle.rescan() per connection. with ~2700 hosts, that's ~2700 copies
of the same cert data in memory. load once in slurper.start(), pass
to all subscribers via config.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The resolver threads kept a single http.Client for the process lifetime,
accumulating connection/TLS state across millions of DID resolutions.
Recreate the resolver every 1000 resolutions to shed that state.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mallinfo's int fields were overflowing at 2 GiB, making arena and
mmap metrics useless. the runtime glibc (2.36) has mallinfo2 but
zig's bundled cross-compile headers don't declare it. manual extern
struct declaration bypasses the header issue.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
the deploy target's glibc doesn't have mallinfo2 (needs 2.33+).
mallinfo uses int fields (cap at 2 GiB) but sufficient for
distinguishing leak vs fragmentation patterns.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mallinfo2() exposes in-use vs free-but-held-by-malloc bytes,
letting us distinguish real leaks from glibc fragmentation.
also adds VmHWM and RssAnon from /proc/self/status.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This reverts commit a65e8e6842ea3b6c443cf0ce8fba7c2155f3d981.
Open collection index with tuned options via raw C API:
- 256 MB shared LRU block cache across all column families
- cache_index_and_filter_blocks=true (prevents unbounded index growth)
- pin_l0_filter_and_index_blocks_in_cache=true
- 16 KB block size (4x default, reduces index size)
- 32 MB write buffers, max 2 concurrent per CF
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Change backfill_progress PK from (collection) to (collection, source)
so the same collection can be imported from multiple relays independently.
Enables backfilling from relay.waow.tech alongside bsky.network.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
zlay's thread-per-PDS architecture (2,700 concurrent subscribers) allowed
interleaving between seq assignment (under persist lock) and broadcast
(unlocked), delivering frames out of order to consumers.
adds broadcast_order mutex to Broadcaster — subscriber threads now hold
it across persist → resequence → broadcast, matching Indigo's pattern of
serializing the entire pipeline through one lock.
also fixes the error fallback that, on persist failure, broadcast the
frame with its upstream seq (mixing sequence domains); the frame is now
dropped instead.
includes a regression test that spawns 8 threads doing concurrent
broadcasts and verifies monotonic output. confirmed: the test catches
the bug when the lock is removed (seq 3 observed after seq 4).
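the locking discipline and the regression test reduce to this pattern, sketched in Python (the actual code is Zig; names are illustrative):

```python
import threading

class Broadcaster:
    """Seq assignment (persist) and broadcast happen under one mutex,
    so consumers always see monotonically increasing sequence numbers."""

    def __init__(self):
        self.broadcast_order = threading.Lock()
        self.seq = 0
        self.delivered = []  # stands in for consumer send buffers

    def handle_frame(self, frame):
        # persist -> resequence -> broadcast, all under one lock
        with self.broadcast_order:
            self.seq += 1
            self.delivered.append((self.seq, frame))

b = Broadcaster()
threads = [threading.Thread(target=lambda: [b.handle_frame("f") for _ in range(100)])
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
seqs = [s for s, _ in b.delivered]
```

without the lock spanning both the increment and the append, a thread can be preempted between them and deliver seq 3 after seq 4, which is exactly what the regression test catches.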
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
disk pressure killed the pod with no warning in metrics. uses statvfs on
the data directory to export filesystem total and available bytes.
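Python's os.statvfs exposes the same fields; a sketch of the gauges (the metric names here are assumptions, not necessarily the exported ones):

```python
import os

def fs_metrics(path: str) -> dict:
    """Filesystem total and available bytes for the volume backing
    `path`, via statvfs. Gauge names are illustrative."""
    st = os.statvfs(path)
    return {
        "relay_fs_total_bytes": st.f_frsize * st.f_blocks,
        # f_bavail counts blocks available to unprivileged users,
        # which is what matters before the pod hits disk pressure.
        "relay_fs_avail_bytes": st.f_frsize * st.f_bavail,
    }
```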
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
the DID→UID lookup cache in event_log.zig was completely unbounded —
no max size, no eviction, no TTL. every DID seen on the firehose got
cached forever. with 61M+ DIDs on the network, this grows linearly
until OOM kill.
cap at 500K entries (~40 MB). on eviction, clear the entire map —
there's no per-entry timestamp to sort by, and the postgres fallback
is fast enough (~0.5ms per miss) that a full clear is fine.
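the cap-and-clear policy, sketched in Python (names illustrative; the real cache lives in event_log.zig):

```python
class DidCache:
    """Capped DID->UID cache with no per-entry metadata: eviction is
    a full clear, and subsequent lookups fall through to the database."""

    MAX_ENTRIES = 500_000

    def __init__(self, db_lookup):
        self.map = {}
        self.db_lookup = db_lookup  # postgres fallback, ~0.5ms per miss

    def get_uid(self, did: str) -> int:
        uid = self.map.get(did)
        if uid is None:
            uid = self.db_lookup(did)
            if len(self.map) >= self.MAX_ENTRIES:
                self.map.clear()  # cheap: no timestamps to sort by
            self.map[did] = uid
        return uid
```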
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
api.zig was a 926-line grab bag mixing endpoint definitions with hex
parsing, query string decoding, CBOR frame building, and HTTP response
plumbing. now follows the Ghostty/stdlib namespace pattern:
src/api.zig — 20-line namespace (re-exports HttpContext + handleHttpRequest)
src/api/router.zig — route dispatch + static content
src/api/xrpc.zig — AT Protocol sync endpoint handlers
src/api/admin.zig — admin handlers, auth, CBOR frame building
src/api/http.zig — response helpers, query string parsing
no behavior changes. main.zig unchanged.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
main.zig was 1,174 lines doing env parsing, component init, signal
handling, and the entire HTTP API. now main.zig is 265 lines (entry
point + wiring) and api.zig has all handlers, response helpers, and
query string parsing.
adds docs/design.md covering data flow, threading model, memory model,
persistence, and an honest scaling limits table for 10x growth.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
the hard-coded metrics table was already stale and duplicated what the
public grafana dashboard shows. replaced the endpoints table with a
link to the sync spec. fixed RELAY_HTTP_PORT → RELAY_METRICS_PORT and
added RESOLVER_THREADS + VALIDATOR_CACHE_SIZE.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
prevents unbounded memory growth from caching every DID's signing key.
configurable via VALIDATOR_CACHE_SIZE env var. eviction runs in the
background resolve loop (not the hot validation path). adds
relay_validator_cache_evictions_total prometheus counter.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
serve all HTTP endpoints via websocket.zig httpFallback on the main
WebSocket port (3000). the old HttpServer becomes a minimal MetricsServer
on RELAY_METRICS_PORT (3001) serving only /metrics.
adds process-level metrics (RSS, thread count on linux) and validator
cache/migration queue gauges to the prometheus endpoint.
updates zat to v0.2.10 and websocket.zig to 9e6d732 (HTTP fallback support).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- add getHostStatus endpoint (lexicon says "implemented by relays")
- emit #account event on admin takedown so downstream consumers see it
- validate DID migrations asynchronously instead of blindly updating host_id
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
zig's default thread stack is 16 MB. with ~2,750 subscriber threads
that's 44 GB of virtual memory mapped. switch to explicit 2 MB stacks
(generous for websocket read loops and CBOR decoding) and replace the
debug allocator (GPA) with glibc malloc, which returns freed pages to
the OS instead of accumulating metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SUM() over bigint columns returns numeric in postgres, which pg.zig's
strict type checking can't read into i64. explicit ::bigint casts on
the aggregates fix the panic in getStatus.
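a sketch of the cast, with illustrative table and column names (the actual getStatus query is not shown here):

```sql
-- without the cast, SUM() comes back as numeric and pg.zig panics
-- decoding it into i64; ::bigint pins the wire type
SELECT COUNT(*)::bigint                       AS host_count,
       COALESCE(SUM(event_count), 0)::bigint  AS event_total
FROM host;
```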
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
discovers collections from lexicon garden llms.txt + RBC scan,
paginates through source relay's listReposByCollection, tracks
progress in postgres for resumability.
POST /admin/backfill-collections?source=bsky.network — trigger
GET /admin/backfill-collections — status
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
the warmCache approach queued all 1.36M active DIDs — most of which
won't post for hours — competing with actually-active DIDs for resolver
time. the on-demand queue with 4 threads is a natural activity-weighted
priority queue that resolves exactly the right DIDs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
spawn N resolver threads (default 4, configurable via RESOLVER_THREADS)
instead of 1, and pre-populate the resolve queue with all active DIDs
from the database on startup. adds com.atproto.sync.listHosts XRPC
endpoint with Indigo-compatible response format (includes seq field,
filters by last_seq > 0, default limit 200).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
previously #sync frames fell through to the identity fallback, getting
persisted with kind=identity. now they're properly routed: validated via
signature-only verification (no MST/ops), persisted with kind=sync (6),
and update account state like commits. rev ordering is intentionally not
enforced since sync is used for rollbacks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- per-host sliding window rate limits (50/s, 2500/hr, 20000/day baseline;
5000/s, 50M/hr, 500M/day for trusted *.host.bsky.network hosts)
matches Go relay's slidingwindow approach
- Server: zlay (atproto-relay) header on all HTTP responses,
enabling relay loop detection by other relays
- new relay_rate_limited_total prometheus counter
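the per-window check reduces to this sliding-log sketch in Python (the production limiter presumably uses a cheaper approximation; the relay stacks three of these per host, one each for the second/hour/day limits):

```python
import time
from collections import deque

class SlidingWindow:
    """One sliding window: timestamps older than the window are
    dropped; an event is admitted only if the count stays under
    the limit."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self.events = deque()

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        while self.events and now - self.events[0] >= self.window_s:
            self.events.popleft()
        if len(self.events) >= self.limit:
            return False
        self.events.append(now)
        return True
```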
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
relay_frames_received_total was only incremented in broadcast(), missing
events dropped by inactive-account filtering or validation failure.
move the counter to subscriber's serverMessage after frame decode, matching
the Go relay's events_received_counter which counts at processRepoEvent entry.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
the 64KB stack buffer overflows with 2800+ hosts, causing 502 responses.
switch to ArrayList writer for unbounded JSON response. also adds the
missing checkAdmin gate that other admin endpoints already have.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hand-rolled HTTP parser with zig 0.15's std.http.Server,
which provides buffered I/O that correctly handles POST bodies
split across TCP segments behind traefik. This is the proper fix
for the issue that commit 4733560 worked around with a multi-read loop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
reverse proxies (traefik) may send headers and body in separate
TCP writes. the single stream.read() only got headers, leaving
the body empty — causing all POST endpoints (requestCrawl, ban,
block/unblock) to fail with "invalid JSON" through the ingress.
now parses Content-Length from headers and keeps reading until the
full body is received.
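the read loop reduces to this pattern, sketched in Python with an injectable recv so the proxy's split can be simulated (names illustrative):

```python
def read_request(recv):
    """Keep calling recv() until the header/body separator arrives,
    then until Content-Length bytes of body have arrived. `recv` is
    any callable returning the next chunk (b"" on EOF)."""
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = recv()
        if not chunk:
            break
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n"):
        if line.lower().startswith(b"content-length:"):
            length = int(line.split(b":", 1)[1])
    while len(body) < length:
        chunk = recv()
        if not chunk:
            break
        body += chunk
    return head, body
```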
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
the Go relay's HostPerDayLimiter (50/day default) was copied
wholesale, but it only makes sense there because it gates the
public requestCrawl endpoint while admin operations bypass it.
zlay applies it uniformly, blocking legitimate bulk seeding from
cronjobs. the other validation steps (hostname format, domain bans,
dedup, describeServer liveness) provide sufficient protection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
clients may percent-encode colons in DIDs (did%3Aplc%3A...), which
broke cursor-based pagination — the encoded cursor didn't match any
RocksDB key so the scan restarted from the beginning.
adds queryParamDecoded() with a hexVal helper for percent-decoding
into a caller-provided buffer. applied to cursor in
listReposByCollection and did in getRepoStatus/getLatestCommit.
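a Python sketch of the decode path, mirroring the hexVal helper (the real code decodes into a caller-provided buffer in Zig):

```python
def hex_val(c: str) -> int:
    """Value of one hex digit, -1 if not a hex digit."""
    if "0" <= c <= "9":
        return ord(c) - ord("0")
    if "a" <= c <= "f":
        return ord(c) - ord("a") + 10
    if "A" <= c <= "F":
        return ord(c) - ord("A") + 10
    return -1

def percent_decode(s: str) -> str:
    out = []
    i = 0
    while i < len(s):
        # only consume %XY when both hex digits are valid;
        # otherwise the '%' passes through literally
        if s[i] == "%" and i + 2 < len(s):
            hi, lo = hex_val(s[i + 1]), hex_val(s[i + 2])
            if hi >= 0 and lo >= 0:
                out.append(chr(hi * 16 + lo))
                i += 3
                continue
        out.append(s[i])
        i += 1
    return "".join(out)
```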
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dockerfile.runtime assumes the binary is already built (zig build)
and just copies it into debian:bookworm-slim. Used by the remote
deploy flow where zig compiles natively on the x86_64 server.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch to zzstoatzz/websocket.zig fix/handshake-tcp-split branch
which guards the handshake parser against TCP splitting data between
\r and \n, preventing panics under load.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Write headers and body as separate writes instead of formatting
everything into a single 4KB buffer. Fixes 502 errors when
listReposByCollection returns >90 results.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
zig 0.15's C++ codegen for musl produces illegal instructions in
RocksDB's LRU cache (std::function vtable). Switch to Debian
bookworm-slim with x86_64-linux-gnu target.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
QEMU emulation exposes CPU features that don't match the deployment
target. Force baseline x86_64 to avoid illegal instructions in RocksDB.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The archive uses zig-x86_64-linux format, not zig-linux-x86_64.
Also create /data/collection-index directory.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cross-compiling RocksDB C++ from ARM64 produces illegal instructions.
build inside x86_64 container instead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
inspired by lightrail (microcosm.blue/lightrail). uses RocksDB with two
column families (rbc: collection→did, cbr: did→collection) for bidirectional
prefix-scan lookups. phase 1: live indexing from firehose commit ops.
- collection_index.zig: RocksDB-backed index with add/remove/scan/removeAll
- subscriber: tracks collections on each validated commit, removes on tombstone
- main.zig: serves /xrpc/com.atproto.sync.listReposByCollection endpoint
- build.zig: adds rocksdb-zig dependency (Syndica/rocksdb-zig)
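the two-CF layout reduces to ordered keys answered by prefix scans; an in-memory Python sketch (key encoding and names are illustrative, not the RocksDB schema):

```python
import bisect

class CollectionIndex:
    """Sketch: (collection, did) pairs kept in one sorted structure
    and (did, collection) in the other, so either direction is a
    prefix scan. The real index stores these as RocksDB keys in the
    rbc and cbr column families."""

    def __init__(self):
        self.rbc = []  # sorted (collection, did)
        self.cbr = []  # sorted (did, collection)

    def add(self, did: str, collection: str):
        for arr, key in ((self.rbc, (collection, did)),
                         (self.cbr, (did, collection))):
            i = bisect.bisect_left(arr, key)
            if i == len(arr) or arr[i] != key:
                arr.insert(i, key)

    def repos_by_collection(self, collection: str) -> list:
        # prefix scan over the rbc ordering
        i = bisect.bisect_left(self.rbc, (collection, ""))
        out = []
        while i < len(self.rbc) and self.rbc[i][0] == collection:
            out.append(self.rbc[i][1])
            i += 1
        return out

    def remove_all(self, did: str):
        # tombstone handling: drop every entry for the DID on both sides
        self.cbr = [(d, c) for d, c in self.cbr if d != did]
        self.rbc = [(c, d) for c, d in self.rbc if d != did]
```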
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bsky.network was already in the host table from the old code, so
the bootstrap condition (len == 0) never triggered. pullHosts is
idempotent — getOrCreateHost skips existing hosts — so just run
it every startup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace broken low-level HTTP request flow (req.reader() doesn't exist
in zig 0.15) with client.fetch() + Io.Writer.Allocating — same proven
pattern as xrpc/transport.zig. Move HTTP client outside pagination loop.
Also adds connected_inbound metric and removes per-host rate limiter.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>