# AT Protocol Research Notes

Reference material for AT Protocol integration. For implementation details, see [todo.md](todo.md).

## OAuth 2.1 Specification

AT Protocol uses a specific profile of OAuth 2.1 for client↔PDS authorization.

### Required Components

- **Client Metadata Endpoint**: Serve `client_metadata.json` at a public HTTPS URL (this URL becomes the `client_id`)

  ```json
  {
    "client_id": "https://your-app.com/oauth/client-metadata.json",
    "application_type": "web",
    "grant_types": ["authorization_code", "refresh_token"],
    "scope": "atproto transition:generic",
    "response_types": ["code"],
    "redirect_uris": ["https://your-app.com/oauth/callback"],
    "dpop_bound_access_tokens": true,
    "client_name": "Malfestio",
    "client_uri": "https://your-app.com"
  }
  ```

- **PKCE (Mandatory)**: Generate `code_verifier` and `code_challenge` (S256 only)
- **DPoP (Mandatory)**: Bind tokens to client instances with proof-of-possession JWTs
- **Handle/DID Resolution**: Resolve user identity to discover their PDS
- **Token Exchange**: Authorization code flow with token refresh

### DPoP (Demonstrating Proof-of-Possession)

DPoP (RFC 9449) binds access tokens to a key held by a specific client instance, so a stolen token cannot be replayed without that key.

**Proof JWT Structure:**

- **Header**: `typ: dpop+jwt`, `alg: ES256` (the atproto profile requires ES256 support; other algorithms depend on what the server accepts), `jwk: <public key>`
- **Payload Claims**:
  - `jti` — Unique identifier for each proof (distinct from the server `nonce` below)
  - `htm` — HTTP method (e.g., "POST", "GET")
  - `htu` — HTTP target URI (without query/fragment)
  - `iat` — Issued-at timestamp
  - `ath` — Base64url-encoded SHA-256 hash of the access token (for resource requests)
  - `nonce` — Server-provided nonce (the atproto profile expects servers to issue one)

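As a rough sketch of how the payload claims listed above fit together, the following builds the unsigned claim set for one request. Signing the JWT (and adding the `typ`/`alg`/`jwk` header) needs a JOSE library and is deliberately left out; `dpop_claims` and `b64url` are hypothetical helper names.

```python
import base64
import hashlib
import secrets
import time
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def b64url(data: bytes) -> str:
    # base64url without "=" padding, as used throughout JOSE.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dpop_claims(method: str, url: str,
                access_token: Optional[str] = None,
                nonce: Optional[str] = None) -> dict:
    parts = urlsplit(url)
    claims = {
        "jti": secrets.token_urlsafe(16),  # fresh random identifier per proof
        "htm": method.upper(),
        # htu drops the query string and fragment.
        "htu": urlunsplit((parts.scheme, parts.netloc, parts.path, "", "")),
        "iat": int(time.time()),
    }
    if access_token is not None:
        # ath is only present on requests that carry an access token.
        claims["ath"] = b64url(hashlib.sha256(access_token.encode("ascii")).digest())
    if nonce is not None:
        claims["nonce"] = nonce
    return claims
```

A new claim set (and therefore a new `jti`) is needed for every request, including retries.
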
**Usage:**

1. Client generates a DPoP keypair per session (not reused across devices/users)
2. Each resource request includes `Authorization: DPoP <token>` and `DPoP: <proof JWT>`
3. Server validates the proof signature, checks that the claims match the request, and verifies that the token is bound to the proof key

**Server Behavior:**

- Returns a `DPoP-Nonce` header; the client must include the latest value in subsequent proofs and retry when a request is rejected with a `use_dpop_nonce` error
- Validates `jti` uniqueness to prevent replay attacks
- Checks that `ath` matches the presented access token

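The `jti` uniqueness check can be illustrated with a small in-memory cache. This is a sketch under the assumption that proofs older than a fixed window are rejected elsewhere (via `iat`), so identifiers only need to be remembered for that long; the class name and the 300-second default are illustrative, not taken from any spec.

```python
import time
from typing import Optional


class JtiReplayCache:
    """Remembers recently seen jti values and rejects repeats."""

    def __init__(self, max_age_seconds: int = 300) -> None:
        self.max_age = max_age_seconds
        self._seen: dict = {}  # jti -> time first seen

    def check_and_store(self, jti: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Forget identifiers that have aged out of the acceptance window.
        self._seen = {j: t for j, t in self._seen.items() if now - t < self.max_age}
        if jti in self._seen:
            return False  # replayed proof
        self._seen[jti] = now
        return True
```

A multi-instance server would need a shared store instead of a per-process dictionary.
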
## Record Publishing

### XRPC Endpoints

- `com.atproto.repo.putRecord` — Create or update records
- `com.atproto.repo.deleteRecord` — Remove records
- `com.atproto.repo.uploadBlob` — Upload media attachments

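For illustration, a `putRecord` call is an HTTP POST to `/xrpc/com.atproto.repo.putRecord` with a JSON body naming the repo, collection, record key, and record. The sketch below only builds the request object and does not send it; the function name, the PDS URL, and the record fields are assumptions, and the token and proof are expected to come from the OAuth flow described earlier.

```python
import json
import urllib.request


def build_put_record_request(pds_url: str, access_token: str, dpop_proof: str,
                             repo: str, collection: str, rkey: str,
                             record: dict) -> urllib.request.Request:
    body = json.dumps({
        "repo": repo,              # DID (or handle) of the repository owner
        "collection": collection,  # NSID of the record type
        "rkey": rkey,
        "record": record,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{pds_url.rstrip('/')}/xrpc/com.atproto.repo.putRecord",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"DPoP {access_token}",
            "DPoP": dpop_proof,
        },
    )


req = build_put_record_request(
    "https://pds.example.com", "ACCESS_TOKEN", "PROOF_JWT",
    repo="did:plc:abc123",
    collection="org.stormlightlabs.malfestio.deck",
    rkey="3k5abc123",
    record={"$type": "org.stormlightlabs.malfestio.deck", "title": "Example deck"},
)
```
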
### Record Keys

Use TIDs (timestamp-based identifiers) as record keys for collections whose Lexicon declares `key: "tid"`.

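A TID packs a microsecond timestamp and a 10-bit clock identifier into a 64-bit integer and encodes it as 13 characters of a sortable base32 alphabet. The following is a minimal sketch of that encoding; the function names are illustrative, and a production generator would also guarantee that successive values never go backwards.

```python
import time

B32_SORTABLE = "234567abcdefghijklmnopqrstuvwxyz"


def encode_tid(timestamp_us: int, clock_id: int) -> str:
    # Layout: top bit 0, then 53 bits of microseconds since the Unix epoch,
    # then a 10-bit clock identifier.
    value = ((timestamp_us & ((1 << 53) - 1)) << 10) | (clock_id & 0x3FF)
    # Emit 13 groups of 5 bits, most significant first.
    return "".join(B32_SORTABLE[(value >> (5 * i)) & 0x1F] for i in reversed(range(13)))


def now_tid(clock_id: int = 0) -> str:
    return encode_tid(time.time_ns() // 1000, clock_id)
```

Because the alphabet is in ascending ASCII order, comparing two TIDs as strings orders them by timestamp.
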
### AT-URIs

Format: `at://<did>/<collection>/<rkey>`

Example: `at://did:plc:abc123/org.stormlightlabs.malfestio.deck/3k5abc123`

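Splitting a record AT-URI into its three parts can be sketched as follows. This hypothetical helper only accepts the full `authority/collection/rkey` form shown above and does no further syntax validation.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    authority: str   # DID (or handle) of the repository
    collection: str  # NSID of the record type
    rkey: str        # record key


def parse_at_uri(uri: str) -> AtUri:
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT-URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected at://<did>/<collection>/<rkey>: {uri!r}")
    return AtUri(*parts)


parsed = parse_at_uri("at://did:plc:abc123/org.stormlightlabs.malfestio.deck/3k5abc123")
```
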
## Firehose / Jetstream

### Overview

The AT Protocol provides two main options for consuming real-time repository events:

1. **Raw Firehose** (`com.atproto.sync.subscribeRepos`) - Full-fidelity, CBOR-encoded, cryptographically signed
2. **Jetstream** - Simplified JSON format, lower bandwidth, easier to consume

### Raw Firehose

- **WebSocket**: Subscribe to `com.atproto.sync.subscribeRepos` from a Relay
- **CBOR Decoding**: Parse CAR files containing MST blocks
- **Cryptographic Verification**: Validate commit signatures against DID signing keys
- **Cursor Management**: Track `seq` position for reliable reconnection

**Event Types:**

- `#commit` - Repository changes (record create/update/delete)
- `#identity` - DID/handle updates
- `#account` - Account status changes (active, deactivated, etc.)

### Jetstream (Simplified)

Bluesky's simplified JSON firehose - ideal for indexing and discovery:

- **JSON format**: No CBOR decoding required
- **zstd compression**: Reduced bandwidth (enable with `compress=true`)
- **Collection filtering**: Subscribe to specific NSIDs
- **DID filtering**: Watch specific accounts
- **Cursor-based reconnection**: Microsecond timestamps

**Public Endpoints:**

- `wss://jetstream1.us-east.bsky.network/subscribe`
- `wss://jetstream2.us-west.bsky.network/subscribe`

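The filtering and reconnection options above are passed as query parameters on the subscribe URL. The sketch below assembles such a URL using the parameter names from the Jetstream documentation (`wantedCollections`, `wantedDids`, `cursor`, `compress`); the function itself is illustrative and opens no connection.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode


def jetstream_url(base: str,
                  wanted_collections: Iterable[str] = (),
                  wanted_dids: Iterable[str] = (),
                  cursor: Optional[int] = None,
                  compress: bool = False) -> str:
    # Repeated keys express multiple collections or DIDs.
    params = [("wantedCollections", c) for c in wanted_collections]
    params += [("wantedDids", d) for d in wanted_dids]
    if cursor is not None:
        params.append(("cursor", str(cursor)))  # microsecond timestamp
    if compress:
        params.append(("compress", "true"))
    query = urlencode(params)
    return f"{base}?{query}" if query else base


url = jetstream_url(
    "wss://jetstream1.us-east.bsky.network/subscribe",
    wanted_collections=["org.stormlightlabs.malfestio.deck"],
    cursor=1725911162329308,
)
```
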
**Tradeoffs:**

- ⚠️ Events are NOT cryptographically signed (trust the Jetstream operator)
- ⚠️ Not self-authenticating data
- ✅ Much simpler to implement
- ✅ Lower bandwidth and compute requirements

### Reliable Synchronization

**Cursor Tracking:**

- Store the cursor position per endpoint (microsecond timestamp for Jetstream, `seq` for the raw firehose)
- Resume from the last processed cursor on reconnect
- Handle gaps by refetching the affected repository via `com.atproto.sync.getRepo` if needed

**Per-Repo Revision Tracking:**

- Track the latest `rev` (TID) for each DID
- Compare the incoming `rev` against the stored value to detect stale or out-of-order events
- On the raw firehose, compare the commit's `since` field (the previous `rev`) against the stored value to detect gaps

**Deletion Handling:**

- Handle `operation: "delete"` in commit events
- Mark records as deleted (soft or hard delete)

**Best Practices:**

- Process events sequentially per-DID (partition by DID)
- Ignore events with `rev` ≤ stored latest rev
- Validate records against Lexicon schema before indexing

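The cursor, revision, and deletion rules above can be combined in one small consumer. This is an in-memory sketch that assumes Jetstream-style events (`did`, `time_us`, `kind`, and a `commit` object with `rev`, `operation`, `collection`, `rkey`, `record`); the class name is made up. It also assumes every event carries a distinct `rev`. If several operations can share one `rev`, the stale check would need per-record de-duplication instead.

```python
from typing import Optional


class RepoIndexer:
    """In-memory index keyed by (did, collection, rkey)."""

    def __init__(self) -> None:
        self.cursor: Optional[int] = None
        self.latest_rev: dict = {}  # did -> latest rev seen
        self.records: dict = {}     # (did, collection, rkey) -> record

    def handle(self, event: dict) -> bool:
        """Apply one event; return True if the index changed."""
        changed = self._apply(event)
        # Advance the cursor only after the event has been processed.
        self.cursor = event["time_us"]
        return changed

    def _apply(self, event: dict) -> bool:
        if event.get("kind") != "commit":
            return False
        did, commit = event["did"], event["commit"]
        # TIDs sort lexicographically, so string comparison orders revisions.
        if commit["rev"] <= self.latest_rev.get(did, ""):
            return False  # stale or replayed event
        self.latest_rev[did] = commit["rev"]
        key = (did, commit["collection"], commit["rkey"])
        if commit["operation"] == "delete":
            self.records.pop(key, None)  # hard delete; a soft delete would flag the row
        else:
            self.records[key] = commit["record"]
        return True
```

Persisting `cursor` and `latest_rev` together with the records (ideally in one transaction) is what makes reconnects safe; the dictionaries here stand in for that storage.
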
## Well-Known Endpoints

- `/.well-known/atproto-did` — Domain verification for handle claims
- `/.well-known/oauth-protected-resource` — PDS OAuth metadata
- `/.well-known/oauth-authorization-server` — Auth server metadata

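As a small illustration of the first endpoint, HTTPS handle verification fetches a plain-text DID from the handle's own domain. The helpers below build that URL and sanity-check a response body without touching the network; the names are hypothetical, and the check assumes only the `did:plc` and `did:web` methods are in use.

```python
def handle_well_known_url(handle: str) -> str:
    # Handles are domain names, so the lookup URL lives on the handle itself.
    return f"https://{handle.strip().lower()}/.well-known/atproto-did"


def parse_did_response(body: str) -> str:
    # The endpoint returns the DID as plain text, possibly with a trailing newline.
    did = body.strip()
    if not did.startswith(("did:plc:", "did:web:")):
        raise ValueError(f"unexpected DID: {did!r}")
    return did
```

A complete resolver would then fetch the DID document and confirm that it lists the same handle, so the claim is verified in both directions.
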
## Labelers

**Architecture:**

1. Labels = metadata (source DID + subject AT-URI + value string)
2. User Subscription = users subscribe to labelers; clients include in API requests
3. Label Interpretation = per-user config to hide, warn, or ignore content

**Structure:**

```json
{
  "src": "did:plc:labeler",
  "uri": "at://did:user/app.bsky.feed.post/123",
  "val": "spam",
  "cts": "2026-01-01T00:00:00Z"
}
```

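Label interpretation can be sketched as a lookup from (labeler, value) to a per-user action, keeping the most severe result. The preference shape, the `decide_action` name, and the default of `warn` for unconfigured labels are assumptions made for this example.

```python
SEVERITY = {"ignore": 0, "warn": 1, "hide": 2}


def decide_action(labels: list, subscribed: set, prefs: dict,
                  default: str = "warn") -> str:
    action = "ignore"
    for label in labels:
        # Labels from labelers the user has not subscribed to are skipped.
        if label["src"] not in subscribed:
            continue
        candidate = prefs.get((label["src"], label["val"]), default)
        # Keep the most severe action across all applicable labels.
        if SEVERITY[candidate] > SEVERITY[action]:
            action = candidate
    return action


labels = [{
    "src": "did:plc:labeler",
    "uri": "at://did:user/app.bsky.feed.post/123",
    "val": "spam",
    "cts": "2026-01-01T00:00:00Z",
}]
prefs = {("did:plc:labeler", "spam"): "hide"}
```
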
## Feeds

**Core Flow**:

1. User requests feed via at-uri of declared feed
2. PDS resolves at-uri → Feed Generator's DID doc
3. PDS sends `getFeedSkeleton` to service endpoint (authenticated by user's JWT)
4. Feed Generator returns skeleton (list of post URIs + cursor)
5. PDS hydrates skeleton with full content (via AppView)
6. Hydrated feed returned to user

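Steps 4 and 5 can be illustrated with plain dictionaries: the skeleton is a cursor plus a list of `{"post": <AT-URI>}` items, and hydration swaps each URI for the full post. The `hydrate` function and the lookup table are stand-ins for whatever the hydrating service actually queries.

```python
def hydrate(skeleton: dict, posts_by_uri: dict) -> dict:
    # Keep the generator's ordering; skip posts that cannot be found
    # (for example, ones deleted after the skeleton was built).
    feed = [
        posts_by_uri[item["post"]]
        for item in skeleton["feed"]
        if item["post"] in posts_by_uri
    ]
    return {"cursor": skeleton.get("cursor"), "feed": feed}


skeleton = {
    "cursor": "page-2",
    "feed": [
        {"post": "at://did:plc:abc123/app.bsky.feed.post/1"},
        {"post": "at://did:plc:abc123/app.bsky.feed.post/2"},
    ],
}
posts_by_uri = {
    "at://did:plc:abc123/app.bsky.feed.post/1": {"text": "hello"},
}
hydrated = hydrate(skeleton, posts_by_uri)
```
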
## AppView

**Responsibilities**:

1. Record Processing & Indexing - consume firehose, build indices for likes, threads, follows
2. Moderation Enforcement - apply labels from subscribed labelers
3. Query Interface - expose XRPC API (proxied through PDS)
4. Media CDN - fetch/cache blobs from upstream PDSes, generate thumbnails
5. Search & Discovery - full-text search, type-ahead, content ranking

## Patterns from Real AT Protocol Apps

### plyr.fm (Music)

- OAuth 2.1 via `@atproto/oauth-client` library
- Records synced to PDS: tracks, likes, playlists
- Separate moderation service (Rust labeler)

### leaflet.pub (Writing)

- React/Next.js frontend with Supabase + Replicache for sync
- Bluesky integration via dedicated `lexicons/` and `appview/` directories

### wisp.place (Static Sites)

- Stores site files as `place.wisp.fs` records in user's PDS
- Firehose consumer to index and serve sites
- CDN layer caches content from PDS

### Common Patterns

1. Local database for fast queries + PDS for portable, signed records
2. Firehose consumption for discovery/aggregation
3. OAuth 2.1 for production auth (app passwords only for development)
4. Lexicons define the public contract; internal state stays private

## References

- [AT Protocol OAuth Spec](https://atproto.com/specs/oauth)
- [Lexicon Schema Language](https://atproto.com/specs/lexicon)
- [Repository & XRPC](https://atproto.com/specs/xrpc)
- [Feed Generator Starter Kit](https://github.com/bluesky-social/feed-generator)
- [atproto TypeScript SDK](https://github.com/bluesky-social/atproto)
- [Ozone Moderation Service](https://github.com/bluesky-social/ozone)
- [Jetstream Firehose](https://docs.bsky.app/blog/jetstream)
- [Labels and Moderation Guide](https://docs.bsky.app/docs/advanced-guides/moderation)