commits
relay_frames_received_total was only incremented in broadcast(), missing
events dropped by inactive-account filtering or validation failure.
move the counter to subscriber's serverMessage after frame decode, matching
the Go relay's events_received_counter which counts at processRepoEvent entry.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
the 64KB stack buffer overflows with 2800+ hosts, causing 502 responses.
switch to ArrayList writer for unbounded JSON response. also adds the
missing checkAdmin gate that other admin endpoints already have.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hand-rolled HTTP parser with zig 0.15's std.http.Server,
which provides buffered I/O that correctly handles POST bodies
split across TCP segments behind traefik. This is the proper fix
for the issue that commit 4733560 worked around with a multi-read loop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
reverse proxies (traefik) may send headers and body in separate
TCP writes. the single stream.read() only got headers, leaving
the body empty — causing all POST endpoints (requestCrawl, ban,
block/unblock) to fail with "invalid JSON" through the ingress.
now parses Content-Length from headers and keeps reading until the
full body is received.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
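A minimal sketch of the read loop described in the commit above, written in Python for illustration rather than taken from the zlay source. The function name read_request and the read(n) callback are hypothetical; the point is only that the body is read until Content-Length bytes have arrived, instead of trusting a single read.

```python
def read_request(read):
    """Read headers and the full body from `read(n) -> bytes` (hypothetical callback)."""
    buf = b""
    # Keep reading until the blank line that terminates the headers shows up.
    while b"\r\n\r\n" not in buf:
        chunk = read(4096)
        if not chunk:
            raise ConnectionError("closed before headers completed")
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")

    # Find Content-Length; header names are case-insensitive.
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())

    # A proxy may deliver the body in later TCP writes, so loop until complete.
    while len(body) < length:
        chunk = read(length - len(body))
        if not chunk:
            raise ConnectionError("closed before body completed")
        body += chunk
    return head, body[:length]
```

With a single read, the first call would return only the headers and the JSON parser would see an empty body, which matches the "invalid JSON" symptom described above.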
the Go relay's HostPerDayLimiter (50/day default) was copied
wholesale, but it only makes sense there because it gates the
public requestCrawl endpoint while admin operations bypass it.
zlay applies it uniformly, blocking legitimate bulk seeding from
cronjobs. the other validation steps (hostname format, domain bans,
dedup, describeServer liveness) provide sufficient protection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
clients may percent-encode colons in DIDs (did%3Aplc%3A...), which
broke cursor-based pagination — the encoded cursor didn't match any
RocksDB key so the scan restarted from the beginning.
adds queryParamDecoded() with a hexVal helper for percent-decoding
into a caller-provided buffer. applied to cursor in
listReposByCollection and did in getRepoStatus/getLatestCommit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
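For illustration, percent-decoding into a caller-provided buffer can be sketched as follows. This is Python, not the Zig code from the commit; the names hex_val and percent_decode_into are stand-ins for hexVal and queryParamDecoded(), and passing malformed escapes through unchanged is an assumption, not confirmed behaviour.

```python
def hex_val(c):
    """Return the value of an ASCII hex digit byte, or None if it is not one."""
    if 48 <= c <= 57:      # '0'..'9'
        return c - 48
    if 65 <= c <= 70:      # 'A'..'F'
        return c - 55
    if 97 <= c <= 102:     # 'a'..'f'
        return c - 87
    return None


def percent_decode_into(src, out):
    """Decode %XX escapes from `src` into bytearray `out`; return bytes written."""
    n = 0
    i = 0
    while i < len(src):
        b = src[i]
        if b == 0x25 and i + 2 < len(src):  # '%' followed by two more bytes
            hi, lo = hex_val(src[i + 1]), hex_val(src[i + 2])
            if hi is not None and lo is not None:
                b = hi * 16 + lo
                i += 2
            # Assumption: a malformed escape is copied through literally.
        if n >= len(out):
            raise ValueError("output buffer too small")
        out[n] = b
        n += 1
        i += 1
    return n
```

Decoding the cursor before the scan means did%3Aplc%3A... and did:plc:... seek to the same key.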
Dockerfile.runtime assumes the binary is already built (zig build)
and just copies it into debian:bookworm-slim. Used by the remote
deploy flow where zig compiles natively on the x86_64 server.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch to zzstoatzz/websocket.zig fix/handshake-tcp-split branch
which guards the handshake parser against TCP splitting data between
\r and \n, preventing panics under load.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Write headers and body as separate writes instead of formatting
everything into a single 4KB buffer. Fixes 502 errors when
listReposByCollection returns >90 results.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
zig 0.15's C++ codegen for musl produces illegal instructions in
RocksDB's LRU cache (std::function vtable). Switch to Debian
bookworm-slim with x86_64-linux-gnu target.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
QEMU emulation exposes CPU features that don't match the deployment
target. Force baseline x86_64 to avoid illegal instructions in RocksDB.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The archive uses zig-x86_64-linux format, not zig-linux-x86_64.
Also create /data/collection-index directory.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cross-compiling RocksDB C++ from ARM64 produces illegal instructions.
build inside x86_64 container instead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
inspired by lightrail (microcosm.blue/lightrail). uses RocksDB with two
column families (rbc: collection→did, cbr: did→collection) for bidirectional
prefix-scan lookups. phase 1: live indexing from firehose commit ops.
- collection_index.zig: RocksDB-backed index with add/remove/scan/removeAll
- subscriber: tracks collections on each validated commit, removes on tombstone
- main.zig: serves /xrpc/com.atproto.sync.listReposByCollection endpoint
- build.zig: adds rocksdb-zig dependency (Syndica/rocksdb-zig)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
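The two-column-family layout can be sketched with two sorted key lists standing in for RocksDB. This is an illustrative Python model, not the collection_index.zig code: the NUL key separator and the method names are assumptions, since the log does not give the actual key encoding.

```python
import bisect

SEP = "\x00"  # assumed separator between key parts; not stated in the log


class CollectionIndex:
    def __init__(self):
        self.rbc = []  # sorted keys: collection + SEP + did
        self.cbr = []  # sorted keys: did + SEP + collection

    @staticmethod
    def _insert(keys, key):
        i = bisect.bisect_left(keys, key)
        if i == len(keys) or keys[i] != key:  # keep keys unique
            keys.insert(i, key)

    @staticmethod
    def _scan(keys, prefix):
        """Seek to the first key >= prefix, then walk while the prefix matches."""
        i = bisect.bisect_left(keys, prefix)
        out = []
        while i < len(keys) and keys[i].startswith(prefix):
            out.append(keys[i][len(prefix):])
            i += 1
        return out

    def add(self, did, collection):
        self._insert(self.rbc, collection + SEP + did)
        self._insert(self.cbr, did + SEP + collection)

    def repos_by_collection(self, collection):
        return self._scan(self.rbc, collection + SEP)

    def remove_all(self, did):
        # The reverse index tells us which forward keys to delete.
        for collection in self._scan(self.cbr, did + SEP):
            self.rbc.remove(collection + SEP + did)
        self.cbr = [k for k in self.cbr if not k.startswith(did + SEP)]
```

The reverse (did to collection) keyspace is what makes removal on tombstone cheap: without it, removing a DID would require scanning every collection.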
bsky.network was already in the host table from the old code, so
the bootstrap condition (len == 0) never triggered. pullHosts is
idempotent — getOrCreateHost skips existing hosts — so just run
it every startup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace broken low-level HTTP request flow (req.reader() doesn't exist
in zig 0.15) with client.fetch() + Io.Writer.Allocating — same proven
pattern as xrpc/transport.zig. Move HTTP client outside pagination loop.
Also adds connected_inbound metric and removes per-host rate limiter.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- host validation pipeline: format check, domain ban, SSRF protection,
describeServer liveness, relay loop detection via Server header
- dual account status: host_id + upstream_status columns, combined
local/upstream active check on XRPC endpoints
- #account event processing: updates upstream_status, drops commits
for inactive accounts
- per-host rate limiting: 100 evt/sec token bucket per subscriber
- per-day new host rate limit: 50/day (configurable)
- FutureCursor handling: per-subscriber shutdown, host set to idle
- time-based cursor flush every 4s (matches Go relay)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
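As a rough illustration of the per-subscriber token bucket listed above, here is a Python sketch. It is not the zlay implementation; in particular, a burst capacity equal to one second of events is an assumption, as the log only states the 100 evt/sec rate.

```python
class TokenBucket:
    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size (assumed, see above)
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now):
        """Refill based on elapsed time, then try to spend one token."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Passing the clock in as an argument keeps the sketch deterministic; a real subscriber would read a monotonic clock instead.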
downstream consumers now see monotonically increasing relay sequence
numbers in the CBOR payload instead of overlapping upstream PDS sequences.
resequencing happens in both the live broadcast path and disk replay path,
with graceful fallback to raw bytes on decode failure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- host table for tracking known PDS instances with cursor/status/failures
- refactor Subscriber into per-host worker with cursor persistence and
failure tracking (exhausted after 15 consecutive failures)
- Slurper manages one Subscriber thread per active host, processes crawl
requests asynchronously
- requestCrawl routes through Slurper to spawn new host workers
- admin endpoints: GET /admin/hosts, POST /admin/hosts/block, POST /admin/hosts/unblock
- RELAY_UPSTREAM becomes seed host (backward compatible single-host default)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
raw CID bytes were stored as hex — now uses zat's multibase encoder
for proper CIDv1 base32lower strings (b... prefix) matching the Go
relay's format.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
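To show the difference between the old hex output and a multibase base32lower string, here is a small Python sketch. It uses the standard library rather than zat's encoder, and the function name cid_to_base32 is made up for this example. Multibase base32lower is the RFC 4648 base32 alphabet in lowercase, without padding, prefixed with "b".

```python
import base64


def cid_to_base32(cid_bytes):
    """Encode raw CID bytes as a multibase base32lower string ("b" prefix)."""
    body = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + body


# Example input: CIDv1 (0x01), dag-cbor (0x71), sha2-256 (0x12), 32-byte digest
# (0x20 length), followed by a placeholder all-zero digest.
example = bytes([0x01, 0x71, 0x12, 0x20]) + bytes(32)
encoded = cid_to_base32(example)
```

For any dag-cbor sha2-256 CID the fixed four-byte header produces the familiar "bafyrei" prefix.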
the legacy verifyCommitCar path was returning null data_cid, causing
updateAccountState to never run. now returns commit_cid from the CAR
root. also fixes the subscriber to always call updateAccountState when
rev is available (not just when data_cid is non-null).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
implements the three high-priority sync endpoints matching the Go indigo
relay's response format. listRepos supports cursor/limit pagination,
getRepoStatus returns account status + rev, getLatestCommit checks
account status before returning CID/rev.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
replace zqlite with pg.zig for metadata storage. schema now matches
the Go indigo relay (account, account_repo, log_file_refs, domain_ban).
removes the seqToSqlite/sqliteToSeq XOR hack — Postgres BIGINT handles
u64 natively. DATABASE_URL env var replaces RELAY_DB_PATH.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
upstream firehose seq numbers exceed i64 max — switch all seq types
to u64 across subscriber, broadcaster, ring_buffer, and event_log.
SQLite storage uses XOR-with-sign-bit encoding to preserve ordering.
update zat dependency to v0.2.7 (adds cbor.getUint).
also: bind WebSocket server to 0.0.0.0 (was 127.0.0.1, unreachable
in k8s), add /favicon.svg route.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
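The XOR-with-sign-bit trick can be sketched like this. The function names echo seqToSqlite/sqliteToSeq from the log, but the bodies are an illustrative Python reconstruction, not the Zig source. Flipping the top bit maps the u64 range onto the i64 range so that signed comparison in SQLite agrees with unsigned order.

```python
SIGN_BIT = 1 << 63
MASK = (1 << 64) - 1


def seq_to_sqlite(seq):
    """Map an unsigned 64-bit seq to a signed 64-bit value, preserving order."""
    flipped = (seq ^ SIGN_BIT) & MASK
    # Reinterpret the 64-bit pattern as two's-complement signed.
    return flipped - (1 << 64) if flipped >= SIGN_BIT else flipped


def sqlite_to_seq(stored):
    """Inverse of seq_to_sqlite."""
    return (stored & MASK) ^ SIGN_BIT
```

Here 0 maps to the most negative i64 and 2**64 - 1 maps to the most positive, so ORDER BY on the stored column returns rows in sequence order.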
zig cross-compiles to x86_64-linux natively — no Docker emulation
needed. Dockerfile is now just alpine + pre-built static binary.
added Spindle CI workflow for format check + build verification.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
macOS sigset_t is an integer; on Linux it is [16]c_ulong.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
matches indigo's pattern — purge DID key cache before any other
processing so next commit triggers fresh DID doc resolution.
handles did:web key rotation (ephemeral keys on PDS restart).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
subscriber (upstream firehose), validator (DID key cache + sig verification),
broadcaster (WebSocket fan-out), event_log (SQLite persistence), ring_buffer.
depends on zat SDK for CBOR/CAR/MST/crypto primitives.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>