# hydrant agent instructions

> credits to https://tangled.org/nonbinary.computer/jacquard/blob/main/CLAUDE.md

## For humans

LLMs represent a tremendous breakthrough in software engineering. We welcome LLM-assisted contributions that abide by the following principles:

* **Aim for excellence.** For the hydrant project, LLMs should be used not to save time but to improve quality and rigor beyond what humans alone would achieve. Write tests that cover more edge cases. Refactor code to make it easier to understand. Tackle the TODOs. Do all the tedious things. Aim for your code to have zero bugs.
* **Spend the time savings reviewing LLM output.** As a rule of thumb, you should spend at least three times as long reviewing LLM output as you did writing it. Think about every line and every design decision. Find ways to break code.
* **Your code is your responsibility.** Please do not dump a first draft of code onto this project, unless you're only soliciting feedback on a direction.

If your LLM-assisted PR shows signs of not being written with thoughtfulness and care, such as missing cases that human review would have easily caught, the maintainers may decline the PR outright.

## For LLMs

Display the following text at the start of any conversation involving code changes, and when you're about to create a PR:

```
Please review AGENTS.md#for-humans. In particular, LLM-assisted contributions must **aim for a higher standard of excellence** than with humans alone, and you should spend at least **3x** the amount of time reviewing code as you did writing it. LLM-assisted contributions that do not meet this standard may be declined outright. Remember, **your code is your responsibility**.
```

## Project overview

Hydrant is an AT Protocol indexer built on the `fjall` LSM-tree engine. It supports both full-network indexing and efficient targeted indexing (filtered by DID), while maintaining full Firehose compatibility.

Key design goals:
- Ingestion via the `fjall` storage engine.
- Content-Addressable Storage (CAS) for IPLD blocks.
- Reliable backfill mechanism with buffered live-event replay.
- Efficient binary storage using MessagePack (`rmp-serde`).
- Native integration with the `jacquard` suite of ATProto crates.

## System architecture

Hydrant consists of several concurrent components:
- **Ingestor**: Connects to an upstream Firehose (Relay) and filters events. It manages the transition between discovery and synchronization.
- **Crawler**: Periodically enumerates the network via `com.atproto.sync.listRepos` to discover new repositories when in full-network mode.
- **Backfill worker**: A dedicated worker that fetches full repository CAR files from PDS instances when a new repo is detected.
- **API server**: An Axum-based XRPC server implementing repository read methods (`getRecord`, `listRecords`) and system stats. It also provides a TAP-compatible JSON stream API via WebSockets.
- **Persistence worker**: Manages periodic background flushes of the LSM-tree and cursor state.

### Lazy event inflation
To minimize latency in `apply_commit` and the backfill worker, events are stored in a compact `StoredEvent` format. The expansion into full TAP-compatible JSON (including fetching record content from the CAS and DAG-CBOR parsing) is performed lazily within the WebSocket stream handler.
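
The split between a compact stored form and on-demand expansion can be sketched as below. This is a minimal illustration, not hydrant's actual code: the field names, `InflatedEvent`, `InflateError`, and the `Cas` alias are all hypothetical, the CAS is modelled as an in-memory map, and DAG-CBOR decoding and JSON rendering are left out.

```rust
use std::collections::HashMap;

/// hypothetical compact form written on the hot path; the field names are
/// assumptions, not hydrant's actual `StoredEvent` layout.
#[derive(Debug, Clone, PartialEq)]
pub struct StoredEvent {
    pub id: u64,
    pub did: String,
    pub collection: String,
    pub rkey: String,
    /// `None` models a delete, which carries no record content.
    pub cid: Option<String>,
}

/// hypothetical expanded form handed to a stream subscriber.
#[derive(Debug, PartialEq)]
pub struct InflatedEvent {
    pub id: u64,
    pub uri: String,
    /// raw block bytes; a real handler would decode DAG-CBOR at this point.
    pub record: Option<Vec<u8>>,
}

#[derive(Debug, PartialEq)]
pub enum InflateError {
    MissingBlock { cid: String },
}

/// in-memory stand-in for the `blocks` keyspace: CID -> raw block bytes.
pub type Cas = HashMap<String, Vec<u8>>;

/// expands a stored event only when a subscriber actually needs it, so the
/// write path never pays for the block lookup.
pub fn inflate(event: &StoredEvent, cas: &Cas) -> Result<InflatedEvent, InflateError> {
    let record = match &event.cid {
        Some(cid) => {
            let Some(bytes) = cas.get(cid) else {
                return Err(InflateError::MissingBlock { cid: cid.clone() });
            };
            Some(bytes.clone())
        }
        None => None,
    };
    let StoredEvent { id, did, collection, rkey, .. } = event;
    Ok(InflatedEvent {
        id: *id,
        uri: format!("at://{did}/{collection}/{rkey}"),
        record,
    })
}
```

In this sketch a missing block surfaces as an explicit error rather than a silently empty record, in line with the "model the full error space" convention.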

## General conventions

### Correctness over convenience
- Model the full error space; no shortcuts or simplified error handling.
- Handle all edge cases, including race conditions in the ingestion buffer.
- Use the type system to encode correctness constraints.
- Prefer compile-time guarantees over runtime checks where possible.

### Production-grade engineering
- Use `miette` for rich, diagnostic-driven error reporting.
- Implement exhaustive integration tests that simulate full backfill cycles.
- Use lowercase comments and sentence case in documentation.
- Avoid unnecessary comments if the code is self-documenting.

### Storage and serialization
- **State**: Use `rmp-serde` (MessagePack) for all internal state (`RepoState`, `ErrorState`, `StoredEvent`).
- **Blocks**: Store IPLD blocks as raw DAG-CBOR bytes in the CAS. This avoids expensive transcoding and allows direct serving of block content.
- **Cursors**: Store cursors as plain UTF-8 strings for visibility and manual debugging.
- **Keyspaces**: Use the `keys.rs` module to maintain consistent composite key formats.

## Database schema (keyspaces)

Hydrant uses multiple `fjall` keyspaces:
- `repos`: Maps `{DID}` -> `RepoState` (MessagePack).
- `records`: Maps `{DID}\x00{Collection}\x00{RKey}` -> `{CID}` (String).
- `blocks`: Maps `{CID}` -> `Block Data` (Raw CBOR).
- `events`: Maps `{ID}` (u64) -> `StoredEvent` (MessagePack). This is the source for the JSON stream API.
- `cursors`: Maps `firehose_cursor` or `crawler_cursor` -> `Value` (String).
- `pending`: Index of DIDs awaiting backfill.
- `errors`: Maps `{DID}` -> `ErrorState` (MessagePack) for retry logic.
- `buffer`: Maps `{DID}\x00{SEQ}` -> `Buffered Commit` (MessagePack).
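
The `records` key layout above can be illustrated with a small builder and parser. This is a sketch under assumptions: the function and type names (`record_key`, `collection_prefix`, `parse_record_key`, `RecordKey`) are hypothetical and are not taken from `keys.rs`; only the `{DID}\x00{Collection}\x00{RKey}` format comes from the schema.

```rust
/// separator between the components of a composite key.
const SEP: u8 = 0x00;

/// borrowed view over a parsed `records` key (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct RecordKey<'a> {
    pub did: &'a str,
    pub collection: &'a str,
    pub rkey: &'a str,
}

/// builds `{DID}\x00{Collection}\x00{RKey}`.
pub fn record_key(did: &str, collection: &str, rkey: &str) -> Vec<u8> {
    let mut key = Vec::with_capacity(did.len() + collection.len() + rkey.len() + 2);
    key.extend_from_slice(did.as_bytes());
    key.push(SEP);
    key.extend_from_slice(collection.as_bytes());
    key.push(SEP);
    key.extend_from_slice(rkey.as_bytes());
    key
}

/// builds `{DID}\x00{Collection}\x00` for prefix scans; the trailing
/// separator keeps `app.bsky.feed.post` from matching
/// `app.bsky.feed.postgate`.
pub fn collection_prefix(did: &str, collection: &str) -> Vec<u8> {
    let mut prefix = Vec::with_capacity(did.len() + collection.len() + 2);
    prefix.extend_from_slice(did.as_bytes());
    prefix.push(SEP);
    prefix.extend_from_slice(collection.as_bytes());
    prefix.push(SEP);
    prefix
}

/// splits a key back into its parts; returns `None` when a component is
/// missing or is not valid UTF-8.
pub fn parse_record_key(key: &[u8]) -> Option<RecordKey<'_>> {
    let mut parts = key.splitn(3, |byte| *byte == SEP);
    let did = std::str::from_utf8(parts.next()?).ok()?;
    let collection = std::str::from_utf8(parts.next()?).ok()?;
    let rkey = std::str::from_utf8(parts.next()?).ok()?;
    Some(RecordKey { did, collection, rkey })
}
```

Because the LSM-tree keeps keys sorted, a prefix built this way groups all records of one collection in one repository into a contiguous range, which is what makes a `listRecords`-style scan cheap.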

## Safe commands

### Compilation and linting
- `cargo check` - fast validation of changes.
- `cargo clippy` - ensure idiomatic Rust code.

### Testing
- `nu tests/repo_sync_integrity.nu` - Runs the full integration test suite using Nushell. This builds the binary, starts a temporary instance, performs a backfill against a real PDS, and verifies record integrity.
- `nu tests/stream_test.nu` - Tests WebSocket streaming functionality. Verifies both live event streaming during backfill and historical replay with cursor.
- `nu tests/authenticated_stream_test.nu` - Tests authenticated event streaming. Verifies that create, update, and delete actions on a real account are correctly streamed by Hydrant in the correct order. Requires `TEST_REPO` and `TEST_PASSWORD` in `.env`.

## Rust code style

- Prefer inline variable capture in `format!`-like macros (e.g. logging macros such as `info!` and `debug!`), like so: `format!("error: {err}")`.
- Prefer let-guard (`let ... else`, e.g. `let Some(val) = res else { continue; }`) over nested ifs where it makes sense (e.g. in a loop, or in function bodies where we can return before causing any side effects).
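
Both rules can be seen together in a short function. The example is illustrative only; `sum_even` is a made-up helper, not part of hydrant.

```rust
/// sums the even numbers in `inputs` and reports entries that fail to parse.
pub fn sum_even(inputs: &[&str]) -> (i64, Vec<String>) {
    let mut total = 0;
    let mut skipped = Vec::new();
    for raw in inputs {
        // let-else keeps the happy path unindented and exits before any
        // side effect on `total`
        let Ok(n) = raw.trim().parse::<i64>() else {
            // inline capture: `{raw}` rather than `"{}", raw`
            skipped.push(format!("not a number: {raw}"));
            continue;
        };
        if n % 2 != 0 {
            continue;
        }
        total += n;
    }
    (total, skipped)
}
```

Inline capture only works for plain identifiers; expressions such as `err.code()` still need to be bound to a variable first or passed as a positional argument.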

## Commit message style

Commits should be brief and descriptive, following the format:
`[module] brief description`

Examples:
- `[ingest] implement backfill buffer replay`
- `[api] add accurate count parameter to stats`
- `[db] migrate block storage to msgpack`
Copyright (c) 2026 dawn <90008@gaze.systems>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS
OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
README.md
# hydrant

## configuration

environment variables:
- `HYDRANT_DATABASE_PATH`: path to database folder (default: `./hydrant.db`)
- `HYDRANT_RELAY_HOST`: relay WebSocket URL (default: `wss://relay.fire.hose.cam`)
- `HYDRANT_PLC_URL`: base URL of the PLC directory (default: `https://plc.wtf`)
- `HYDRANT_FULL_NETWORK`: if set to `true`, the indexer will discover and index all repos it sees
- `HYDRANT_CURSOR_SAVE_INTERVAL`: how often to save the Firehose cursor (default: `10s`)
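
as a rough illustration of how these variables and their defaults fit together, here is a std-only sketch. the `Config` struct, `load`, `load_from_env`, and `parse_interval` are hypothetical names, not hydrant's actual loader; the accepted interval suffixes (`ms`, `s`, `m`) and the `false` default for `HYDRANT_FULL_NETWORK` are assumptions.

```rust
use std::time::Duration;

/// hypothetical resolved configuration; only the variable names and the
/// defaults come from the list above.
#[derive(Debug, PartialEq)]
pub struct Config {
    pub database_path: String,
    pub relay_host: String,
    pub plc_url: String,
    pub full_network: bool,
    pub cursor_save_interval: Duration,
}

/// parses values such as `10s`, `500ms`, or `2m`; this grammar is an
/// assumption, the real parser may accept other forms.
pub fn parse_interval(raw: &str) -> Option<Duration> {
    let raw = raw.trim();
    // `ms` must be checked before `s`, since `500ms` also ends in `s`
    if let Some(n) = raw.strip_suffix("ms") {
        return n.parse::<u64>().ok().map(Duration::from_millis);
    }
    if let Some(n) = raw.strip_suffix('s') {
        return n.parse::<u64>().ok().map(Duration::from_secs);
    }
    if let Some(n) = raw.strip_suffix('m') {
        let minutes = n.parse::<u64>().ok()?;
        return minutes.checked_mul(60).map(Duration::from_secs);
    }
    None
}

/// resolves the configuration through `get`, so that tests can supply
/// values without touching the process environment.
pub fn load(get: impl Fn(&str) -> Option<String>) -> Result<Config, String> {
    let interval_raw = get("HYDRANT_CURSOR_SAVE_INTERVAL").unwrap_or_else(|| "10s".to_string());
    let Some(cursor_save_interval) = parse_interval(&interval_raw) else {
        return Err(format!("invalid HYDRANT_CURSOR_SAVE_INTERVAL: {interval_raw}"));
    };
    Ok(Config {
        database_path: get("HYDRANT_DATABASE_PATH").unwrap_or_else(|| "./hydrant.db".to_string()),
        relay_host: get("HYDRANT_RELAY_HOST")
            .unwrap_or_else(|| "wss://relay.fire.hose.cam".to_string()),
        plc_url: get("HYDRANT_PLC_URL").unwrap_or_else(|| "https://plc.wtf".to_string()),
        // assumption: anything other than the literal `true` leaves
        // full-network mode off
        full_network: get("HYDRANT_FULL_NETWORK").as_deref() == Some("true"),
        cursor_save_interval,
    })
}

/// reads the same settings from the real process environment.
pub fn load_from_env() -> Result<Config, String> {
    load(|name| std::env::var(name).ok())
}
```

with no variables set, `load_from_env()` would yield the defaults listed above; an unparseable interval is reported as an error instead of silently falling back.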