# AT Protocol Research Notes

Reference material for AT Protocol integration. For implementation details, see [todo.md](todo.md).

## OAuth 2.1 Specification

AT Protocol uses a specific profile of OAuth 2.1 for client↔PDS authorization.

### Required Components

- **Client Metadata Endpoint**: Serve `client-metadata.json` at a public HTTPS URL (this URL becomes the `client_id`)

  ```json
  {
    "client_id": "https://your-app.com/oauth/client-metadata.json",
    "application_type": "web",
    "grant_types": ["authorization_code", "refresh_token"],
    "scope": "atproto transition:generic",
    "response_types": ["code"],
    "redirect_uris": ["https://your-app.com/oauth/callback"],
    "dpop_bound_access_tokens": true,
    "client_name": "Malfestio",
    "client_uri": "https://your-app.com"
  }
  ```

- **PKCE (Mandatory)**: Generate `code_verifier` and `code_challenge` (S256 only)
- **DPoP (Mandatory)**: Bind tokens to client instances with proof-of-possession JWTs
- **Handle/DID Resolution**: Resolve user identity to discover their PDS
- **Token Exchange**: Authorization code flow with token refresh

### DPoP (Demonstrating Proof-of-Possession)

DPoP (RFC 9449) binds access tokens to specific client instances, so a stolen token cannot be replayed without the client's private key.

**Proof JWT Structure:**

- **Header**: `typ: dpop+jwt`, `alg: ES256` (the algorithm every atproto client and authorization server must support; others such as EdDSA only if the server advertises them), `jwk` (the client's public key)
- **Payload Claims**:
  - `jti` - Unique identifier per request (prevents replay; distinct from `nonce`)
  - `htm` - HTTP method (e.g., "POST", "GET")
  - `htu` - HTTP target URI (without query/fragment)
  - `iat` - Issued-at timestamp
  - `ath` - Base64url-encoded SHA-256 hash of the access token (for resource requests)
  - `nonce` - Server-provided nonce (if required)

**Usage:**

1. Client generates a DPoP keypair per session (not reused across devices/users)
2. Each request includes `Authorization: DPoP <access_token>` and `DPoP: <proof_jwt>`
3. Server validates the signature, checks that the claims match the request, and verifies the token binding

**Server Behavior:**

- May return a `DPoP-Nonce` header; the client must include that value as the `nonce` claim in subsequent proofs
- Validates `jti` uniqueness to prevent replay attacks
- Checks that `ath` matches the provided access token

## Record Publishing

### XRPC Endpoints

- `com.atproto.repo.putRecord` - Create or update records
- `com.atproto.repo.deleteRecord` - Remove records
- `com.atproto.repo.uploadBlob` - Upload media attachments

### Record Keys

Use TID (timestamp-based identifiers) per Lexicon spec.

### AT-URIs

Format: `at://<did>/<collection>/<rkey>`

Example: `at://did:plc:abc123/org.stormlightlabs.malfestio.deck/3k5abc123`

## Firehose / Jetstream

### Overview

The AT Protocol provides two main options for consuming real-time repository events:

1. **Raw Firehose** (`com.atproto.sync.subscribeRepos`) - Full-fidelity, CBOR-encoded, cryptographically signed
2. **Jetstream** - Simplified JSON format, lower bandwidth, easier to consume

### Raw Firehose

- **WebSocket**: Subscribe to `com.atproto.sync.subscribeRepos` from a Relay
- **CBOR Decoding**: Decode CBOR frames and parse the embedded CAR files containing MST blocks
- **Cryptographic Verification**: Validate commit signatures against DID signing keys
- **Cursor Management**: Track `seq` position for reliable reconnection

**Event Types:**

- `#commit` - Repository changes (record create/update/delete)
- `#identity` - DID/handle updates
- `#account` - Account status changes (active, deactivated, etc.)

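
The cursor management and event-type dispatch described above can be sketched roughly as follows. This is a minimal illustration, not a full consumer: it assumes frames have already been decoded from CBOR into a type string and a `dict` body, and `FirehoseState`, `handle_event`, `resume_url`, and the host `relay.example` are hypothetical names introduced here. Only `seq`, the three event types, and the `cursor` query parameter come from the protocol; the other body fields are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class FirehoseState:
    """Tracks the last processed `seq` and a log of handled events."""
    cursor: int = 0
    handled: list = field(default_factory=list)


def handle_event(state: FirehoseState, kind: str, body: dict) -> bool:
    """Dispatch one decoded event; return True if it was processed.

    `kind` is the event type ("#commit", "#identity", "#account") and
    `body` is the already-decoded payload. CBOR/CAR decoding and
    signature verification are assumed to happen elsewhere.
    """
    seq = body["seq"]
    # Skip anything at or before the stored cursor (replay after reconnect).
    if seq <= state.cursor:
        return False

    if kind == "#commit":
        state.handled.append(("commit", body.get("repo")))
    elif kind == "#identity":
        state.handled.append(("identity", body.get("did")))
    elif kind == "#account":
        state.handled.append(("account", body.get("did")))
    # Unknown event types are ignored but still advance the cursor.

    state.cursor = seq
    return True


def resume_url(relay_host: str, state: FirehoseState) -> str:
    """Build a subscribe URL that resumes from the stored cursor, if any."""
    base = f"wss://{relay_host}/xrpc/com.atproto.sync.subscribeRepos"
    return f"{base}?cursor={state.cursor}" if state.cursor else base
```

In practice the cursor would be persisted (for example, in the local database) only after an event has been durably processed, so that a crash leads to a replayed event rather than a skipped one.
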
### Jetstream (Simplified)

Bluesky's simplified JSON firehose - ideal for indexing and discovery:

- **JSON format**: No CBOR decoding required
- **zstd compression**: Reduced bandwidth (enable with `compress=true`)
- **Collection filtering**: Subscribe to specific NSIDs
- **DID filtering**: Watch specific accounts
- **Cursor-based reconnection**: Microsecond timestamps

**Public Endpoints:**

- `wss://jetstream1.us-east.bsky.network/subscribe`
- `wss://jetstream2.us-west.bsky.network/subscribe`

**Tradeoffs:**

- ⚠️ Events are NOT cryptographically signed (trust the Jetstream operator)
- ⚠️ Not self-authenticating data
- ✅ Much simpler to implement
- ✅ Lower bandwidth and compute requirements

### Reliable Synchronization

**Cursor Tracking:**

- Store cursor position (microsecond timestamp) per endpoint
- Resume from last processed cursor on reconnect
- Handle gaps by fetching missing commits via `getRepo` if needed

**Per-Repo Revision Tracking:**

- Track latest `rev` (TID) for each DID
- Compare incoming `rev` against stored value to detect gaps
- Use `since` field to detect out-of-order events

**Deletion Handling:**

- Handle `operation: "delete"` in commit events
- Mark records as deleted (soft or hard delete)

**Best Practices:**

- Process events sequentially per-DID (partition by DID)
- Ignore events with `rev` ≤ stored latest rev
- Validate records against Lexicon schema before indexing

## Well-Known Endpoints

- `/.well-known/atproto-did` - Domain verification for handle claims
- `/.well-known/oauth-protected-resource` - PDS OAuth metadata
- `/.well-known/oauth-authorization-server` - Auth server metadata

## Labelers

**Architecture:**

1. Labels = metadata (source DID + subject AT-URI + value string)
2. User Subscription = users subscribe to labelers; clients include in API requests
3. Label Interpretation = per-user config to hide, warn, or ignore content

**Structure:**

```json
{
  "src": "did:plc:labeler",
  "uri": "at://did:user/app.bsky.feed.post/123",
  "val": "spam",
  "cts": "2026-01-01T00:00:00Z"
}
```

## Feeds

**Core Flow**:

1. User requests feed via at-uri of declared feed
2. PDS resolves at-uri → Feed Generator's DID doc
3. PDS sends `getFeedSkeleton` to service endpoint (authenticated by user's JWT)
4. Feed Generator returns skeleton (list of post URIs + cursor)
5. PDS hydrates skeleton with full content (via AppView)
6. Hydrated feed returned to user

## AppView

**Responsibilities**:

1. Record Processing & Indexing - consume firehose, build indices for likes, threads, follows
2. Moderation Enforcement - apply labels from subscribed labelers
3. Query Interface - expose XRPC API (proxied through PDS)
4. Media CDN - fetch/cache blobs from upstream PDSes, generate thumbnails
5. Search & Discovery - full-text search, type-ahead, content ranking

## Patterns from Real AT Protocol Apps

### plyr.fm (Music)

- OAuth 2.1 via `@atproto/oauth-client` library
- Records synced to PDS: tracks, likes, playlists
- Separate moderation service (Rust labeler)

### leaflet.pub (Writing)

- React/Next.js frontend with Supabase + Replicache for sync
- Bluesky integration via dedicated `lexicons/` and `appview/` directories

### wisp.place (Static Sites)

- Stores site files as `place.wisp.fs` records in user's PDS
- Firehose consumer to index and serve sites
- CDN layer caches content from PDS

### Common Patterns

1. Local database for fast queries + PDS for portable, signed records
2. Firehose consumption for discovery/aggregation
3. OAuth 2.1 for production auth (app passwords only for development)
4. Lexicons define the public contract; internal state stays private

## References

- [AT Protocol OAuth Spec](https://atproto.com/specs/oauth)
- [Lexicon Schema Language](https://atproto.com/specs/lexicon)
- [Repository & XRPC](https://atproto.com/specs/xrpc)
- [Feed Generator Starter Kit](https://github.com/bluesky-social/feed-generator)
- [atproto TypeScript SDK](https://github.com/bluesky-social/atproto)
- [Ozone Moderation Service](https://github.com/bluesky-social/ozone)
- [Jetstream Firehose](https://docs.bsky.app/blog/jetstream)
- [Labels and Moderation Guide](https://docs.bsky.app/docs/advanced-guides/moderation)