···
3939- **[`hydrant::crawler`]**: Periodically enumerates the network via `com.atproto.sync.listRepos` to discover new repositories when in full-network mode.
4040- **[`hydrant::resolver`]**: Manages DID resolution and key lookups. Supports multiple PLC directory sources with failover and caching.
4141- **[`hydrant::backfill`]**: A dedicated worker that fetches full repository CAR files. Uses LIFO prioritization and adaptive concurrency to manage backfill load efficiently.
4242-- **[`hydrant::api`]**: An Axum-based XRPC server implementing repository read methods (`getRecord`, `listRecords`) and system stats. It also provides a event stream API via WebSockets.
4242+- **[`hydrant::api`]**: An Axum-based XRPC server implementing repository read methods (`getRecord`, `listRecords`) and system stats. It also provides a WebSocket event stream and a filter management API (`GET`/`PATCH /filter`) for configuring indexing mode, DID lists, signals, and collection patterns.
4343- **Persistence worker** (in `src/main.rs`): Manages periodic background flushes of the LSM-tree and cursor state.
4444
4545### Lazy event inflation
···
8282- `resync`: Maps `{DID}` -> `ResyncState` (MessagePack) for retry logic/tombstones.
8383- `resync_buffer`: Maps `{DID}|{Rev}` -> `Commit` (MessagePack). Used to buffer live events during backfill.
8484- `counts`: Maps `k|{NAME}` or `r|{DID}|{COL}` -> `Count` (u64 BE Bytes).
8585+- `filter`: Stores filter config: mode key `m` -> `FilterMode` (MessagePack), and set entries for DIDs (`d|{DID}`), signals (`s|{NSID}`), collections (`c|{NSID}`), and excludes (`x|{DID}`) -> empty value.
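
The key layout described for the `filter` keyspace can be sketched as a pair of helpers. This is only an illustration of the `{prefix}|{item}` scheme; the names `FilterSet`, `MODE_KEY`, `set_key`, and `parse_set_key` are hypothetical and are not taken from the hydrant source.

```rust
/// Hypothetical enum naming the four set-entry kinds in the `filter` keyspace.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FilterSet {
    Did,
    Signal,
    Collection,
    Exclude,
}

impl FilterSet {
    /// One-character key prefix, as listed in the keyspace description.
    fn prefix(self) -> char {
        match self {
            FilterSet::Did => 'd',
            FilterSet::Signal => 's',
            FilterSet::Collection => 'c',
            FilterSet::Exclude => 'x',
        }
    }
}

/// Key under which the `FilterMode` value is stored.
const MODE_KEY: &[u8] = b"m";

/// Builds a set-entry key such as `d|did:plc:abc`; the stored value is empty.
fn set_key(set: FilterSet, item: &str) -> Vec<u8> {
    format!("{}|{}", set.prefix(), item).into_bytes()
}

/// Splits a set-entry key back into its kind and item.
/// Returns `None` for the mode key or any unrecognised prefix.
fn parse_set_key(key: &[u8]) -> Option<(FilterSet, &str)> {
    let text = std::str::from_utf8(key).ok()?;
    let (prefix, item) = text.split_once('|')?;
    let set = match prefix {
        "d" => FilterSet::Did,
        "s" => FilterSet::Signal,
        "c" => FilterSet::Collection,
        "x" => FilterSet::Exclude,
        _ => return None,
    };
    Some((set, item))
}
```

Because every set entry shares a one-character prefix, listing one set (for example all excludes) reduces to a prefix scan over `x|`.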
8586
8687## Safe commands
8788
+74-9
README.md
···
11# hydrant
22
33-`hydrant` is an AT Protocol indexer built on the `fjall` database. it's meant to be a flexible indexer, supporting both full-network indexing and filtered indexing (e.g., by DID), also allowing querying with XRPC's like `com.atproto.sync.getRepo`, `com.atproto.repo.listRecords`, and so on, which should allow many more usecases compared to just providing an event stream.
33+`hydrant` is an AT Protocol indexer built on the `fjall` database. it's meant to be a flexible indexer, supporting both full-network indexing and filtered indexing (e.g., by DID), also allowing querying with XRPCs and providing an ordered event stream with cursor support.
44
55## configuration
66
···
4040
4141### management
4242
4343-- `POST /repo/add`: register a DID, start backfilling, and subscribe to updates.
4444- - body: `{ "dids": ["did:plc:..."] }`
4545-- `POST /repo/remove`: unregister a DID and delete all associated data.
4646- - body: `{ "dids": ["did:plc:..."] }`
4343+- `GET /filter`: get the current filter configuration.
4444+- `PATCH /filter`: update the filter configuration.
4545+
4646+#### filter mode
4747+
4848+the `mode` field controls what gets indexed:
4949+
5050+| mode | behaviour |
5151+| :--- | :--- |
5252+| `dids` | only index repositories explicitly listed in `dids`. new accounts seen on the firehose are ignored unless they are in the list. |
5353+| `signal` | like `dids`, but also auto-discovers and backfills any account whose firehose commit touches a collection matching one of the `signals` patterns. |
5454+| `full` | index the entire network. `dids` and `signals` are ignored for discovery, but `excludes` and `collections` still apply. |
5555+
5656+#### fields
5757+
5858+| field | type | description |
5959+| :--- | :--- | :--- |
6060+| `mode` | `"dids"` \| `"signal"` \| `"full"` | indexing mode (see above). |
6161+| `dids` | set update | set of DIDs to explicitly track. in `dids` and `signal` modes, always processed regardless of signal matching. adding an untracked DID enqueues a backfill. |
6262+| `signals` | set update | NSID patterns (e.g. `app.bsky.feed.post` or `app.bsky.*`) that trigger auto-discovery in `signal` mode. |
6363+| `collections` | set update | NSID patterns used to filter which records are stored. if empty, all collections are stored. applies in all modes. |
6464+| `excludes` | set update | set of DIDs to always skip, regardless of mode. checked before any other filter logic. |
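
the interaction between `mode`, `dids`, `signals`, and `excludes` described in the two tables above can be read as a small decision function. the sketch below is one possible reading, not hydrant's actual code: `FilterMode` appears in the docs, but its variant names here, `CommitFacts`, and `should_track` are hypothetical, and `collections` filtering is left out because it decides which records are stored rather than which repositories are tracked.

```rust
/// the three indexing modes from the `mode` table (variant names are assumed).
#[derive(Debug, Clone, Copy)]
enum FilterMode {
    Dids,
    Signal,
    Full,
}

/// hypothetical summary of what is known about the author of a firehose commit.
struct CommitFacts {
    /// the DID is in `excludes`.
    excluded: bool,
    /// the DID is in `dids`.
    listed: bool,
    /// the commit touches a collection matching one of the `signals` patterns.
    touches_signal: bool,
}

/// returns whether the repository should be tracked under the given mode.
fn should_track(mode: FilterMode, commit: &CommitFacts) -> bool {
    // excludes are checked before any other filter logic.
    if commit.excluded {
        return false;
    }
    match mode {
        // only explicitly listed repositories.
        FilterMode::Dids => commit.listed,
        // listed repositories, plus any account whose commit matches a signal.
        FilterMode::Signal => commit.listed || commit.touches_signal,
        // the whole network; `dids` and `signals` play no role in discovery.
        FilterMode::Full => true,
    }
}
```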
6565+
6666+#### set updates
6767+
6868+each set field accepts one of two forms:
6969+
7070+- **replace**: an array replaces the entire set — `["did:plc:abc", "did:plc:xyz"]`
7171+- **patch**: an object maps items to `true` (add) or `false` (remove) — `{"did:plc:abc": true, "did:plc:xyz": false}`
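
to make the difference between the two forms concrete, here is a minimal in-memory sketch of how each could be applied to an existing set. `SetUpdate` and `apply_update` are hypothetical names; how hydrant deserializes and persists these updates is not shown in this README.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// hypothetical model of the two set-update forms.
enum SetUpdate {
    /// array form: the new contents of the whole set.
    Replace(Vec<String>),
    /// object form: item -> `true` (add) or `false` (remove).
    Patch(BTreeMap<String, bool>),
}

/// applies an update to the current set in place.
fn apply_update(current: &mut BTreeSet<String>, update: SetUpdate) {
    match update {
        SetUpdate::Replace(items) => {
            // everything not in the array is dropped.
            *current = items.into_iter().collect();
        }
        SetUpdate::Patch(changes) => {
            // items not mentioned in the object are left untouched.
            for (item, add) in changes {
                if add {
                    current.insert(item);
                } else {
                    current.remove(&item);
                }
            }
        }
    }
}
```

with the patch `{"did:plc:abc": true, "did:plc:xyz": false}` from above, a set containing `did:plc:xyz` and `did:plc:keep` would end up containing `did:plc:abc` and `did:plc:keep`.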
7272+
7373+#### NSID patterns
7474+
7575+`signals` and `collections` support an optional `.*` suffix to match an entire namespace:
7676+
7777+- `app.bsky.feed.post` — exact match only
7878+- `app.bsky.feed.*` — matches any collection under `app.bsky.feed`
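
the matching rule above could be implemented roughly as follows. this is a sketch under two assumptions that the README does not spell out: a wildcard matches at any depth below the namespace (so `app.bsky.*` would match `app.bsky.feed.post`), and it does not match the bare namespace itself. `nsid_matches` is a hypothetical name.

```rust
/// returns true if `collection` matches `pattern`.
/// a pattern ending in `.*` matches any NSID below that namespace;
/// any other pattern must match exactly.
fn nsid_matches(pattern: &str, collection: &str) -> bool {
    match pattern.strip_suffix(".*") {
        // wildcard: the collection must continue with `.` plus at least one
        // more character, so `app.bsky.feed.*` does not match `app.bsky.feedback.x`.
        Some(namespace) => collection
            .strip_prefix(namespace)
            .map_or(false, |rest| rest.starts_with('.') && rest.len() > 1),
        // no wildcard: exact comparison.
        None => pattern == collection,
    }
}
```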
4779
4880### data access (xrpc)
4981
5050-`hydrant` implements some AT Protocol XRPC endpoints for reading data:
8282+`hydrant` implements the following XRPC endpoints under `/xrpc/`:
8383+
8484+#### `com.atproto.repo.getRecord`
8585+
8686+retrieve a single record by its AT-URI components.
8787+
8888+| param | required | description |
8989+| :--- | :--- | :--- |
9090+| `repo` | yes | DID or handle of the repository. |
9191+| `collection` | yes | NSID of the collection. |
9292+| `rkey` | yes | record key. |
9393+
9494+returns the record value, its CID, and its AT-URI. responds with `RecordNotFound` if not present.
9595+
9696+#### `com.atproto.repo.listRecords`
9797+
9898+list records in a collection, newest-first by default.
9999+
100100+| param | required | description |
101101+| :--- | :--- | :--- |
102102+| `repo` | yes | DID or handle of the repository. |
103103+| `collection` | yes | NSID of the collection. |
104104+| `limit` | no | max records to return (default `50`, max `100`). |
105105+| `cursor` | no | opaque cursor for pagination (from a previous response). |
106106+| `reverse` | no | if `true`, iterates oldest-first. |
51107
5252-- `com.atproto.repo.getRecord`: retrieve a single record by collection and rkey.
5353-- `com.atproto.repo.listRecords`: list records in a collection, with pagination.
5454-- `systems.gaze.hydrant.countRecords`: count records in a collection.
108108+returns `{ records, cursor }`. if `cursor` is present there are more results.
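
a client can drain a collection by feeding each returned `cursor` back into the next `listRecords` call until no cursor comes back. the sketch below keeps the HTTP layer out of the picture: `Page`, `collect_all`, and the `fetch_page` callback are hypothetical stand-ins for whatever client code actually issues the request, and records are represented as plain strings.

```rust
/// hypothetical, simplified shape of one `{ records, cursor }` response.
struct Page {
    records: Vec<String>,
    cursor: Option<String>,
}

/// calls `fetch_page` repeatedly, passing the previous cursor each time,
/// and stops when a response carries no cursor.
fn collect_all<F>(mut fetch_page: F) -> Vec<String>
where
    F: FnMut(Option<&str>) -> Page,
{
    let mut all = Vec::new();
    let mut cursor: Option<String> = None;
    loop {
        // the first request is made without a cursor.
        let page = fetch_page(cursor.as_deref());
        all.extend(page.records);
        match page.cursor {
            // treat the cursor as opaque: store it and send it back unchanged.
            Some(next) => cursor = Some(next),
            None => break,
        }
    }
    all
}
```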
109109+
110110+#### `systems.gaze.hydrant.countRecords`
111111+
112112+return the total number of stored records in a collection.
113113+
114114+| param | required | description |
115115+| :--- | :--- | :--- |
116116+| `identifier` | yes | DID or handle of the repository. |
117117+| `collection` | yes | NSID of the collection. |
118118+
119119+returns `{ count }`.
55120
56121### event stream
57122