An easy-to-host PDS for the AT Protocol on macOS. Grandma-approved.

Initial commit: ezpds design documentation

Four reconciled spec documents covering architecture, provisioning API,
data migration, and mobile architecture — plus three new specs (OAuth,
blob handling, unified milestone map) and a cross-spec gap analysis.

All documents aligned to mobile-first, relay-signs model with four-phase
milestones (v0.1 mobile-only → v0.2 desktop enrollment → v1.0 production
→ v2.0+ signing sovereignty). Original pre-reconciliation documents
preserved in docs/archive/ for transparency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

+5501
+341
docs/archive/architecture-v8-amendments.md
# Architecture v8 Amendments

Changes required to update the PDS Architecture from v7 to v8. Each section maps to an action item from cross-spec-analysis.md.

---

## Changelog Entry (replace v7 changelog)

```
v8 Changes — Mobile-First Reconciliation

Architecture reconciled with mobile architecture spec v1.2 (canonical).
The relay is no longer just a tunnel+proxy — in the mobile-only phase,
it IS the PDS.

NEW   Four-phase milestone model (v0.1 / v0.2 / v1.0 / v2.0+)
NEW   Phase 0: Mobile-Only lifecycle (relay as full PDS)
FIX   Signing model: relay always signs, device constructs unsigned commits
FIX   Tier model: Free/Pro/Business + BYO as deployment model
FIX   Firehose: native emission (mobile-only) vs proxy (desktop-enrolled)
FIX   Shamir: basic share generation moves to v0.1 (required at onboarding)
FIX   DID keystore: Shamir split required at account creation, not v1.0
REF   See unified-milestone-map.md for phase details
```

---

## Item 1: Update Signing Model

**Current (v7):** The architecture implies the device signs commits. The Iroh Endpoint description says "Pushes signed repo commits to relay when online." The Custom PDS Shell description says "Owns XRPC surface and sync protocol."

**Change:** The relay always holds the signing key and signs all commits. The desktop constructs the Merkle tree and unsigned commits, sends them to the relay via Iroh, and the relay signs.

### Iroh Endpoint — new description

```
QUIC-based tunnel to relay. NAT traversal, connection resumption on wake.
Pushes unsigned repo commits to relay for signing when online.
```

### Custom PDS Shell — new description

```
Purpose-built embedded repo engine. SQLite-backed, local-first. Constructs
Merkle tree, builds unsigned commits, manages local repo state. Does NOT
hold signing keys — relay signs all commits. The core technical risk —
repo construction must produce valid MST structures that the relay can sign
and federate.
```

---

## Item 2: Update Data Flow

**Current (v7):** No explicit data flow diagram, but the component descriptions imply: device creates → device signs → pushes to relay → relay proxies.

**Change:** Add a data flow section or update descriptions to show:

```
Desktop-Enrolled Write Path:
  1. App creates record via XRPC (Tauri webview → Rust backend)
  2. PDS shell constructs MST diff + unsigned commit
  3. Unsigned commit sent to relay via Iroh tunnel
  4. Relay signs commit with P-256 signing key
  5. Relay stores signed commit in buffer
  6. Relay emits to firehose / serves via XRPC

Mobile-Only Write Path (v0.1):
  1. Third-party app (e.g. Bluesky) calls relay XRPC directly
  2. Relay constructs record, MST diff, signs commit
  3. Relay stores and emits to firehose

Desktop-Offline Read Path:
  1. XRPC read request hits relay
  2. Relay serves from commit buffer / repo cache
  3. Writes return 503 (relay cannot construct commits from stale state)
```

---

## Item 3: Update PDS Shell Description

**Current (v7):** "Owns XRPC surface and sync protocol."

**Change:** The PDS shell is a repo construction engine, not the XRPC authority. In mobile-only mode, the relay owns the XRPC surface. In desktop-enrolled mode, the relay proxies writes to the desktop's PDS shell, which constructs the commit, then the relay signs it.

### Custom PDS Shell — new name suggestion: "Repo Engine"

```
Purpose-built repo construction engine. SQLite-backed, local-first.
Builds MST structures, constructs unsigned commits, manages collection
storage. In desktop-enrolled mode, the relay proxies XRPC writes here,
then signs the resulting commits. Does not serve XRPC directly to the
network — the relay is always the network-facing endpoint.
```

---

## Item 4: Add Phase 0 — Mobile-Only Lifecycle

**Current (v7):** The architecture is entirely desktop-centric. Layer 01 is "Device Layer — User's Mac." There's no representation of the mobile-only phase where no desktop exists.

**Change:** Add a new lifecycle section before the layer diagram, or add a prominent callout:

### New Section: "Device Lifecycle Phases"

Place before Layer 01. This section frames the entire architecture:

```
Device Lifecycle Phases

The product launches mobile-first. The relay is a full PDS before any
desktop is involved.

Phase: Mobile-Only (v0.1)
  Relay behavior: Full PDS — hosts repo, serves XRPC, signs commits,
                  emits firehose
  Repo location:  Relay (primary and only copy)
  Phone role:     Identity wallet (key management, device admin)
  Desktop:        Does not exist yet

Phase: Desktop-Enrolled (v0.2)
  Relay behavior: XRPC proxy + signer — forwards writes to desktop,
                  signs commits, serves reads from cache
  Repo location:  Desktop (primary), relay (cache)
  Phone role:     Identity wallet + device manager
  Desktop:        Runs repo engine, constructs unsigned commits

Phase: Desktop-Offline (v0.2+)
  Relay behavior: Serves reads from cache, 503 on writes
  Repo location:  Desktop (authoritative but unreachable)
  Phone role:     Same as desktop-enrolled
  Desktop:        Sleeping / powered off
```

### Impact on Layer 01

The current "Device Layer — User's Mac" heading should change to acknowledge that in v0.1, there is no Mac. Options:

- Rename to "Device Layer — Desktop (v0.2+)" and note that v0.1 has no device layer
- Or add a "Mobile-Only" badge to relay components that serve as the PDS in v0.1

### Impact on Layer 02

The relay layer needs a dual identity:

```
In mobile-only phase:      Relay IS the PDS (not just a proxy)
In desktop-enrolled phase: Relay is proxy + signer (current description)
```

The Iroh Relay Node description should add:

```
In mobile-only mode, serves as full PDS — no tunnel needed, relay handles
all XRPC directly. In desktop-enrolled mode, acts as tunnel endpoint for
device ↔ relay communication.
```

---

## Item 5: Fix Tier Model

**Current (v7):** The tier section is titled "Relay Tier Pricing — v1.0 Launch." The Free tier includes "BYO relay supported (Nix / Docker)."

**Change:** BYO relay is a deployment model, not a tier feature. Free/Pro/Business are subscription tiers for the managed relay. BYO is orthogonal — a BYO operator runs their own relay and doesn't subscribe to any tier.

### Tier Section — changes

1. Remove "BYO relay supported (Nix / Docker)" from Free tier bullet list.
2. Add a separate section after tiers:

```
BYO Relay (Self-Hosted)

Not a subscription tier — an alternative deployment model. Operators run
their own relay binary (distributed via Nix flake or Docker image).

Includes:
- Full relay functionality (identical binary to managed relay)
- SQLite or PostgreSQL backend (operator's choice)
- Local or S3-compatible blob storage
- No subscription fees — operator provides their own infrastructure
- No managed monitoring or support

Available at v1.0 launch.
```

3. In the Free tier, replace the BYO line with:
```
- See "BYO Relay" section for self-hosted option
```

---

## Item 6: Distinguish Firehose Native vs Proxy

**Current (v7):** The Firehose Proxy component is tagged v1.0 with description: "Persistent WebSocket to BGS on behalf of PDS. Streams commits continuously. BGS sees 100% uptime. Key paid feature."

This conflates two different things:

1. **Native firehose emission** — the relay emitting events from its own repo (mobile-only mode). This is required at v0.1 for federation. Every PDS must emit a firehose.
2. **Firehose proxy** — the relay maintaining a BGS WebSocket on behalf of a sleeping desktop. This is a v0.2+ feature for desktop-enrolled mode.

**Change:** Split into two components:

### Firehose Emitter (new, v0.1)

```
Badge: v0.1
Name: Firehose Emitter
Description: Native com.atproto.sync.subscribeRepos WebSocket endpoint.
Required for federation — every PDS must emit a firehose. In mobile-only
mode, the relay is the PDS and emits directly. In desktop-enrolled mode,
emits commits as they're signed.
```

### Firehose Proxy (keep, reclassify to v0.2)

```
Badge: v0.2
Name: Firehose Proxy
Description: Maintains persistent BGS WebSocket on behalf of sleeping
desktop. Replays commits from buffer when desktop reconnects. Ensures BGS
sees continuous uptime even when desktop is offline. Desktop-enrolled
feature — not applicable in mobile-only mode. Pro/Business tier on
managed relay.
```

---

## Item 7: Move Basic Shamir to v0.1

**Current (v7):** "Recovery Share Manager" is tagged v1.0. "DID Keystore" is tagged v0.1 with note "Basic key management for v0.1. Shamir split added in v1.0."

**Change:** Mobile onboarding (mobile spec §3.1 Step 7) generates Shamir shares during account creation. Basic Shamir support is required from day one. The full Recovery Share Manager UI can stay at v1.0, but the crypto primitives and share generation must be v0.1.

### DID Keystore — new description

```
Badge: v0.1
Name: DID Keystore
Description: Signing keys in macOS Keychain (desktop) / Secure Enclave
(phone). At account creation, root rotation key is split via 2-of-3
Shamir: Share 1 = iCloud Keychain, Share 2 = relay escrow, Share 3 =
user's choice (device-local or BIP-39 paper backup). Basic key
management for v0.1. Full recovery UI in v1.0.
```

### Recovery Share Manager — clarify scope

```
Badge: v1.0
Name: Recovery Share Manager
Description: Full UI for Shamir share management and recovery ceremony.
View share status, rotate shares, initiate recovery from device loss.
Note: basic share GENERATION happens at v0.1 (during account creation).
This component adds the management and recovery interface.
```

---

## Item 8: Unified Milestones

**Current (v7):** Two milestones — v0.1 (technical preview, 3–4 months) and v1.0 (public launch, 3–4 months after).

**Change:** Four phases. See unified-milestone-map.md for full details. The architecture's milestone summary section should be updated:

### Milestone Summary — replace current 2-column grid with 4 phases

```
v0.1 — Mobile-Only PDS (~3-4 months)
  Goal: User creates ATProto identity from iPhone, logs into Bluesky.
  Relay is a full PDS. No desktop involved.

  Relay:      Axum + SQLite + repo engine + signing + XRPC + firehose emitter
  OAuth:      atproto-oauth-axum integration (blocks Bluesky login)
  Blobs:      upload/serve with local storage
  Identity:   DID creation + Shamir split at onboarding
  Federation: 25 XRPC endpoints (see unified-milestone-map.md §2.1)
  Testing:    L1 interop tests + cargo-audit

v0.2 — Desktop Enrollment (~2-3 months)
  Goal: User pairs desktop Mac, relay becomes proxy+signer.

  Device pairing via QR code + desktop promotion
  XRPC write proxying (relay → desktop → relay signs)
  Firehose proxy for sleeping desktop
  Blob forwarding via Iroh
  Desktop offline → 503 on writes, reads from cache

v1.0 — Production Launch (~3-4 months)
  Goal: Production-ready product with recovery and self-hosting.

  Shamir recovery ceremony + full share management UI
  Tier pricing (Free/Pro/Business)
  BYO relay binary (Nix/Docker)
  S3 blob backend + CDN
  PostgreSQL option
  L2 oracle suite + L3 canary
  XRPC hardening + rate limiting

v2.0+ — Signing Sovereignty (TBD)
  Goal: User's hardware signs commits directly.
  Contingent on ATProto protocol evolution (multi-key support).
```

### Milestone Legend — add v0.2

Add a v0.2 badge/swatch to the legend at the top:

```
v0.2   Desktop enrollment · device management
```

And reclassify components accordingly:
- Firehose Proxy: v1.0 → v0.2
- Provisioning API: v1.0 → split (core at v0.1, full at v1.0)
- Key Share Escrow: v1.0 → v0.1 (needed for Shamir at onboarding)

---

## Summary of Badge Reclassifications

| Component | v7 Badge | v8 Badge | Reason |
|-----------|----------|----------|--------|
| DID Keystore | v0.1 | v0.1 | Unchanged but description updated (Shamir at creation) |
| Key Share Escrow | v1.0 | v0.1 | Relay holds Share 2 from account creation |
| Recovery Share Manager | v1.0 | v1.0 | Unchanged — UI for managing/recovering shares |
| Firehose Proxy | v1.0 | v0.2 | Desktop-enrolled feature, not v1.0 |
| Firehose Emitter | — | v0.1 | NEW — native emission required for federation |
| Provisioning API | v1.0 | v0.1 (core) / v1.0 (full) | Core provisioning needed from day one |
| Commit Buffer | v1.0 | v0.2 | Feeds firehose proxy, needed at desktop enrollment |
| Custom PDS Shell | v0.1 | v0.2 | Not needed until desktop enrolls (relay is PDS in v0.1) |
| Tauri Shell | v0.1 | v0.2 | No desktop app in mobile-only phase |

### Major implication

In v7, the Custom PDS Shell and Tauri Shell were both v0.1 because the architecture assumed desktop-first. With mobile-first, these move to v0.2. The v0.1 work is all relay-side: building a federating PDS that runs on the relay, not on the desktop.

This significantly changes what v0.1 development looks like. Instead of building Tauri + PDS shell + Iroh, you're building a hosted PDS service (the relay) with an iOS companion for key management.
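
### Illustration: write routing by lifecycle phase

The write paths in Item 2 and the lifecycle phases in Item 4 come down to one routing decision at the relay. The sketch below is illustrative only: `Phase`, `WriteOutcome`, and `route_write` are hypothetical names, not part of the ezpds codebase, and the 200 status for accepted writes is an assumption (the amendments only specify 503 for the offline case).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    MOBILE_ONLY = "mobile-only"            # v0.1: relay is the full PDS
    DESKTOP_ENROLLED = "desktop-enrolled"  # v0.2: relay is proxy + signer
    DESKTOP_OFFLINE = "desktop-offline"    # v0.2+: desktop asleep or powered off


@dataclass(frozen=True)
class WriteOutcome:
    status: int                    # HTTP status returned to the XRPC caller (200 is assumed)
    constructed_by: Optional[str]  # who builds the MST diff and unsigned commit
    signed_by: Optional[str]       # who signs; always the relay for accepted writes


def route_write(phase: Phase) -> WriteOutcome:
    """Decide how the relay handles an incoming XRPC write in each phase."""
    if phase is Phase.MOBILE_ONLY:
        # Relay constructs the record and MST diff, then signs the commit itself.
        return WriteOutcome(200, "relay", "relay")
    if phase is Phase.DESKTOP_ENROLLED:
        # Desktop repo engine builds the unsigned commit; relay signs it.
        return WriteOutcome(200, "desktop", "relay")
    # Desktop offline: relay cannot construct commits from stale state.
    return WriteOutcome(503, None, None)
```

In every accepted write the signer is the relay, which is the invariant Item 1 introduces; only the party that constructs the commit changes between phases.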
docs/archive/data-migration-spec.docx

This is a binary file and will not be displayed.

+119
docs/archive/migration-v02-amendments.md
# Migration Spec v0.2 Amendments

Changes required to update the Data Migration & Recovery Spec from v0.1 to v0.2. Maps to action items 16–18 from cross-spec-analysis.md.

---

## Changelog Entry

```
v0.2 Changes — Shamir Model Update + Mobile Cross-References

FIX   Shamir share model: Share 3 is user's choice (device-local or BIP-39)
NEW   Cross-references to mobile spec §7 for phone recovery
FIX   Milestone alignment with unified-milestone-map.md
```

---

## Item 16: Update Shamir Share Model

The migration spec's current Shamir share assignments need updating to match the decision made during cross-spec review.

### Current Model (v0.1)

The migration spec describes three shares but the assignment varies by section. Some sections say device/relay/iCloud, others are inconsistent.

### New Model (v0.2)

```
2-of-3 Shamir Secret Sharing for Root Rotation Key:

Share 1: iCloud Keychain (automatic, transparent to user)
Share 2: Relay escrow (encrypted at rest, access-logged)
Share 3: User's choice at account creation:
  Option A: Device-local (stored in Secure Enclave / Keychain)
  Option B: BIP-39 mnemonic (paper backup or USB)

Reconstruction requires any 2 of the 3 shares.

Recovery scenarios:
- Lost phone, have desktop: Share 1 (iCloud) + Share 2 (relay) → reconstruct
- Lost phone, no desktop:   Share 1 (iCloud) + Share 2 (relay) → reconstruct
- Lost phone + iCloud:      Share 2 (relay) + Share 3 (user) → reconstruct
- Relay compromised:        Share 1 (iCloud) + Share 3 (user) → reconstruct
- All three lost:           DID is orphaned (permanent, by design)
```

### Sections to Update

Every section that references Shamir shares should use the Share 1/2/3 naming above. Specifically:

- Section 4 (Unplanned Device Loss): Update share descriptions
- Any table listing share locations: Use the canonical assignment
- Recovery ceremony flow: Reference shares by number and location

---

## Item 17: Cross-Reference Mobile Spec §7

The migration spec covers desktop-to-desktop migration but doesn't reference phone-to-phone migration. The mobile spec §7.2 covers phone recovery using the same Shamir infrastructure.

### Add Cross-Reference Section

Add after the unplanned device loss section:

```
4.x Phone Recovery

Phone-to-phone recovery uses the same Shamir infrastructure as
desktop recovery. The mobile architecture spec (§7) details the
iOS-specific flow:

1. New phone signs into iCloud → Share 1 is available
2. User authenticates with relay → Share 2 is available
3. Relay reconstructs rotation key from 2 shares
4. Relay re-generates signing key, updates DID document
5. New phone stores new rotation key in Secure Enclave

The key difference from desktop recovery: in phone recovery, the
relay already holds the repo (it's the PDS in mobile-only mode),
so there's no repo transfer step. Recovery is purely a key
reconstruction + DID update operation.

See: mobile-architecture-spec-v1.2 §7.2 for the complete flow.
```

---

## Item 18: Align Milestone Timing

The migration spec puts "basic Shamir" in v0.1 and "full recovery" in v1.0. This aligns with the unified milestone map, but the language should be explicit.

### Update Section 8 (Implementation Milestones)

```
v0.1 — Basic Migration + Shamir Generation
  - Planned device swap (LAN transfer via Iroh, 6-digit code)
  - Shamir share generation during account creation
  - Share 1 → iCloud Keychain storage
  - Share 2 → relay escrow
  - Share 3 → user's choice (device-local or BIP-39)
  Note: Share GENERATION is v0.1. Share RECOVERY is v1.0.

v1.0 — Full Recovery
  - Unplanned device loss recovery ceremony
  - Shamir reconstruction (2-of-3)
  - DID key rotation after recovery
  - Recovery UI in mobile app
  - Relay-side recovery session management

Later
  - Multi-device sync (share key across devices without migration)
```

Add note:
```
See unified-milestone-map.md for how these milestones align with
the architecture, provisioning API, and mobile spec phases.
```
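
### Illustration: 2-of-3 split and reconstruction

The recovery scenarios in Item 16 all rely on one property: any two of the three shares recover the rotation key. The sketch below demonstrates that property with a textbook Shamir split over a prime field. It is not the ezpds implementation. The field prime, the integer encoding of the secret, and the names `split_2_of_3` and `reconstruct` are assumptions made for illustration, and a real deployment would use a vetted secret-sharing library.

```python
import secrets
from itertools import combinations

# Mersenne prime larger than any 256-bit secret. The field choice is an
# assumption for this illustration, not something the spec defines.
PRIME = 2**521 - 1

# Share numbering follows the spec; the labels are for display only.
SHARE_LOCATIONS = {1: "iCloud Keychain", 2: "relay escrow", 3: "user's choice"}


def split_2_of_3(secret: int) -> dict:
    """Split `secret` into three shares; any two of them reconstruct it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range for the field")
    # Threshold 2 means a degree-1 polynomial: f(x) = secret + slope * x (mod PRIME).
    slope = secrets.randbelow(PRIME - 1) + 1  # non-zero, so shares differ from the secret
    return {x: (secret + slope * x) % PRIME for x in (1, 2, 3)}


def reconstruct(shares: dict) -> int:
    """Recover f(0) from exactly two shares by Lagrange interpolation."""
    if len(shares) != 2:
        raise ValueError("exactly two shares required")
    (x1, y1), (x2, y2) = shares.items()
    # f(0) = (y1 * x2 - y2 * x1) / (x2 - x1)  (mod PRIME)
    inv = pow(x2 - x1, -1, PRIME)
    return ((y1 * x2 - y2 * x1) * inv) % PRIME


if __name__ == "__main__":
    key = secrets.randbits(256)  # stand-in for the root rotation key
    shares = split_2_of_3(key)
    for a, b in combinations(shares, 2):
        ok = reconstruct({a: shares[a], b: shares[b]}) == key
        print(f"{SHARE_LOCATIONS[a]} + {SHARE_LOCATIONS[b]}: {ok}")
```

The three pairs printed by the loop correspond to the scenarios listed above: Shares 1 and 2 for a lost phone, Shares 2 and 3 when iCloud is also lost, and Shares 1 and 3 when the relay is compromised. A single share on its own reveals nothing about the key, which is why losing all but one share orphans the DID.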
docs/archive/mobile-architecture-spec-v1.2.docx

This is a binary file and will not be displayed.

+51
docs/archive/mobile-spec-amendments.md
# Mobile Architecture Spec — Minor Amendments

Changes required for the Mobile Architecture Spec v1.2. Maps to action items 19–20 from cross-spec-analysis.md. These are minor updates — the mobile spec is canonical and needs the fewest changes.

---

## Item 19: Update Shamir Share 3 — User's Choice

The mobile spec currently specifies Share 3 as BIP-39 mnemonic phrase. Per the cross-spec decision, Share 3 should be the user's choice between device-local storage or BIP-39.

### Section to Update

Find the Shamir share description (likely §3.1 Step 7 or §7) and change:

```
Current:
  Share 3: BIP-39 mnemonic phrase (paper backup)

Change to:
  Share 3: User's choice at account creation
    Option A: Device-local (Secure Enclave / Keychain on a second device)
    Option B: BIP-39 mnemonic phrase (paper backup or USB storage)

  The onboarding flow presents both options with a brief explanation:
  - Device-local is more convenient but requires a second device
  - BIP-39 is more resilient but requires physical safekeeping

  Default recommendation: BIP-39 (safer for users with only one device)
```

---

## Item 20: Cross-Reference Migration Spec

The mobile spec covers phone recovery (§7) but doesn't reference the migration spec's desktop-to-desktop transfer or the planned device swap flow.

### Add Cross-Reference

In §7 (or wherever recovery is discussed), add:

```
For planned device upgrades (e.g., new iPhone), see the Data
Migration & Recovery Spec §3 which covers the Iroh-based peer
transfer with 6-digit verification code. The same transfer
protocol works for phone-to-phone and desktop-to-desktop swaps.

For desktop-specific migration (desktop-to-desktop), see the
Data Migration & Recovery Spec §3.x. The phone acts as the
authorization device during desktop transfers — the user
confirms the transfer from the mobile app.
```
+684
docs/archive/pds-architecture-v7-final.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Desktop PDS Architecture v7 — Final</title>
<link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;500;600;700&family=DM+Sans:wght@400;500;600;700&display=swap" rel="stylesheet" />
<style>
  :root {
    --bg: #0a0e17;
    --surface: #111827;
    --surface-2: #1a2332;
    --border: #1e2d3d;
    --border-bright: #2a3f55;
    --text: #e2e8f0;
    --text-dim: #8494a7;
    --accent-blue: #38bdf8;
    --accent-green: #34d399;
    --accent-amber: #fbbf24;
    --accent-rose: #fb7185;
    --accent-purple: #a78bfa;
    --accent-cyan: #22d3ee;
  }

  * { margin: 0; padding: 0; box-sizing: border-box; }

  body {
    background: var(--bg);
    color: var(--text);
    font-family: 'DM Sans', sans-serif;
    padding: 40px 24px;
    min-height: 100vh;
  }

  .container { max-width: 1100px; margin: 0 auto; }

  /* Header */
  .header { text-align: center; margin-bottom: 48px; position: relative; }
  .header::after { content: ''; position: absolute; bottom: -24px; left: 50%; transform: translateX(-50%); width: 60px; height: 2px; background: linear-gradient(90deg, var(--accent-blue), var(--accent-purple)); border-radius: 1px; }
  .header h1 { font-family: 'JetBrains Mono', monospace; font-size: 28px; font-weight: 700; letter-spacing: -0.5px; background: linear-gradient(135deg, var(--accent-blue), var(--accent-purple)); -webkit-background-clip: text; -webkit-text-fill-color: transparent; margin-bottom: 8px; }
  .header .version { font-family: 'JetBrains Mono', monospace; font-size: 12px; color: var(--accent-green); letter-spacing: 1px; text-transform: uppercase; }
  .header .subtitle { color: var(--text-dim); font-size: 14px; margin-top: 8px; }

  /* Layer sections */
  .layer { margin-bottom: 32px; border-radius: 12px; overflow: hidden; border: 1px solid var(--border); }
  .layer-header { display: flex; align-items: center; gap: 12px; padding: 16px 24px; font-family: 'JetBrains Mono', monospace; font-size: 13px; font-weight: 600; letter-spacing: 0.5px; text-transform: uppercase; }
  .layer-header .layer-num { font-size: 11px; opacity: 0.5; font-weight: 400; }
  .layer-header .changed-badge { font-size: 10px; padding: 2px 8px; border-radius: 4px; background: rgba(251, 191, 36, 0.15); color: var(--accent-amber); text-transform: uppercase; letter-spacing: 1px; margin-left: auto; font-weight: 500; }
  .layer-device .layer-header { background: linear-gradient(135deg, rgba(12,74,110,0.4), rgba(12,74,110,0.15)); color: var(--accent-blue); }
  .layer-relay .layer-header { background: linear-gradient(135deg, rgba(6,78,59,0.4), rgba(6,78,59,0.15)); color: var(--accent-green); }
  .layer-infra .layer-header { background: linear-gradient(135deg, rgba(74,29,110,0.4), rgba(74,29,110,0.15)); color: var(--accent-purple); }
  .layer-ops .layer-header { background: linear-gradient(135deg, rgba(110,29,46,0.4), rgba(110,29,46,0.15)); color: var(--accent-rose); }
  .layer-body { background: var(--surface); padding: 24px; }

  /* Component cards */
  .components { display: grid; grid-template-columns: repeat(auto-fit, minmax(220px, 1fr)); gap: 16px; }
  .component { background: var(--surface-2); border: 1px solid var(--border); border-radius: 8px; padding: 16px; transition: border-color 0.2s; position: relative; }
  .component:hover { border-color: var(--border-bright); }
  .component.removed { border-color: rgba(251, 113, 133, 0.2); opacity: 0.35; }
  .component-icon { font-size: 20px; margin-bottom: 8px; }
  .component-name { font-family: 'JetBrains Mono', monospace; font-size: 13px; font-weight: 600; margin-bottom: 6px; color: var(--text); }
  .component-desc { font-size: 12px; color: var(--text-dim); line-height: 1.6; }

  /* Milestone badges */
  .badge {
    position: absolute;
    top: 8px;
    right: 8px;
    font-family: 'JetBrains Mono', monospace;
    font-size: 9px;
    padding: 2px 6px;
    border-radius: 3px;
    letter-spacing: 0.5px;
    font-weight: 600;
  }
  .badge-01 { background: rgba(56, 189, 248, 0.15); color: var(--accent-blue); }
  .badge-10 { background: rgba(167, 139, 250, 0.15); color: var(--accent-purple); }
  .badge-later { background: rgba(251, 191, 36, 0.1); color: rgba(251, 191, 36, 0.5); }
  .badge-removed { background: rgba(251, 113, 133, 0.15); color: var(--accent-rose); }

  /* Flow diagram */
  .flow-section { margin-top: 32px; margin-bottom: 32px; border: 1px solid var(--border); border-radius: 12px; overflow: hidden; }
  .flow-header { padding: 16px 24px; background: var(--surface-2); font-family: 'JetBrains Mono', monospace; font-size: 13px; font-weight: 600; color: var(--accent-cyan); letter-spacing: 0.5px; text-transform: uppercase; }
  .flow-body { background: var(--surface); padding: 24px; }
  .flow-diagram { display: flex; align-items: center; justify-content: center; gap: 0; flex-wrap: wrap; padding: 16px 0; }
  .flow-node { background: var(--surface-2); border: 1px solid var(--border-bright); border-radius: 8px; padding: 12px 16px; text-align: center; min-width: 130px; }
  .flow-node-label { font-family: 'JetBrains Mono', monospace; font-size: 11px; font-weight: 600; margin-bottom: 4px; }
  .flow-node-detail { font-size: 10px; color: var(--text-dim); }
  .flow-arrow { font-family: 'JetBrains Mono', monospace; font-size: 11px; color: var(--text-dim); padding: 0 8px; display: flex; flex-direction: column; align-items: center; gap: 2px; }
  .flow-arrow-line { color: var(--accent-cyan); font-size: 14px; }
  .flow-arrow-label { font-size: 9px; color: var(--text-dim); white-space: nowrap; }
  .flow-node.node-device { border-color: rgba(56, 189, 248, 0.3); }
  .flow-node.node-device .flow-node-label { color: var(--accent-blue); }
  .flow-node.node-relay { border-color: rgba(52, 211, 153, 0.3); }
  .flow-node.node-relay .flow-node-label { color: var(--accent-green); }
  .flow-node.node-network { border-color: rgba(167, 139, 250, 0.3); }
  .flow-node.node-network .flow-node-label { color: var(--accent-purple); }

  /* Tier pricing */
  .tier-section { margin-top: 32px; border: 1px solid var(--border); border-radius: 12px; overflow: hidden; }
  .tier-header { padding: 16px 24px; background: var(--surface-2); font-family: 'JetBrains Mono', monospace; font-size: 13px; font-weight: 600; color: var(--accent-amber); letter-spacing: 0.5px; text-transform: uppercase; }
  .tier-body { background: var(--surface); padding: 24px; display: grid; grid-template-columns: repeat(auto-fit, minmax(200px, 1fr)); gap: 16px; }
  .tier-card { border: 1px solid var(--border); border-radius: 8px; padding: 20px; background: var(--surface-2); }
  .tier-card.highlighted { border-color: var(--accent-green); }
  .tier-name { font-family: 'JetBrains Mono', monospace; font-size: 14px; font-weight: 700; margin-bottom: 4px; }
  .tier-price { font-family: 'JetBrains Mono', monospace; font-size: 11px; color: var(--text-dim); margin-bottom: 12px; }
  .tier-features { list-style: none; font-size: 12px; color: var(--text-dim); line-height: 2; }
  .tier-features li::before { content: '→ '; color: var(--accent-green); font-family: 'JetBrains Mono', monospace; }

  /* Security strip */
  .security-section { margin-top: 32px; border: 1px solid var(--border); border-radius: 12px; overflow: hidden; }
  .security-header { padding: 16px 24px; background: var(--surface-2); font-family: 'JetBrains Mono', monospace; font-size: 13px; font-weight: 600; letter-spacing: 0.5px; text-transform: uppercase; }
  .security-body { background: var(--surface); padding: 24px; display: grid; grid-template-columns: repeat(auto-fit, minmax(220px, 1fr)); gap: 16px; }
  .sec-card { background: var(--surface-2); border: 1px solid var(--border); border-radius: 8px; padding: 16px; }
  .sec-icon { font-size: 18px; margin-bottom: 8px; }
  .sec-title { font-family: 'JetBrains Mono', monospace; font-size: 12px; font-weight: 600; margin-bottom: 6px; }
  .sec-desc { font-size: 11px; color: var(--text-dim); line-height: 1.6; }

  /* Changelog */
  .changelog { margin-bottom: 32px; padding: 16px 20px; background: rgba(251, 191, 36, 0.05); border: 1px solid rgba(251, 191, 36, 0.15); border-radius: 8px; font-size: 12px; color: var(--text-dim); line-height: 1.8; }
  .changelog-title { font-family: 'JetBrains Mono', monospace; font-size: 11px; font-weight: 600; color: var(--accent-amber); text-transform: uppercase; letter-spacing: 1px; margin-bottom: 8px; }
  .changelog code { font-family: 'JetBrains Mono', monospace; font-size: 11px; background: rgba(251, 191, 36, 0.1); padding: 1px 5px; border-radius: 3px; color: var(--accent-amber); }

  /* Milestone legend */
  .milestone-legend {
    display: flex; gap: 24px; justify-content: center; flex-wrap: wrap;
    margin-bottom: 32px; padding: 16px;
    border: 1px solid var(--border); border-radius: 8px;
    background: var(--surface);
  }
  .legend-item { display: flex; align-items: center; gap: 8px; font-size: 12px; font-family: 'JetBrains Mono', monospace; }
  .legend-swatch { width: 10px; height: 10px; border-radius: 2px; }

  /* Sublabel */
  .sublabel { font-family: 'JetBrains Mono', monospace; font-size: 10px; text-transform: uppercase; letter-spacing: 1px; margin-bottom: 12px; opacity: 0.7; }
</style>
</head>
<body>
<div class="container">

  <!-- Header -->
  <div class="header">
    <h1>Desktop PDS — System Architecture</h1>
    <div class="version">v7 — final · all questions resolved · milestones scoped</div>
    <div class="subtitle">Sovereign AT Protocol PDS on macOS via Tauri + Custom PDS Shell + Iroh</div>
  </div>

  <!-- Milestone Legend -->
  <div class="milestone-legend">
    <div class="legend-item"><div class="legend-swatch" style="background:var(--accent-blue)"></div> <span style="color:var(--accent-blue)">v0.1</span> <span style="color:var(--text-dim)">Technical preview · validate PDS federation</span></div>
    <div class="legend-item"><div class="legend-swatch" style="background:var(--accent-purple)"></div> <span style="color:var(--accent-purple)">v1.0</span> <span style="color:var(--text-dim)">Public launch · product-ready</span></div>
    <div class="legend-item"><div class="legend-swatch" style="background:rgba(251,191,36,0.5)"></div> <span style="color:rgba(251,191,36,0.5)">LATER</span> <span style="color:var(--text-dim)">Designed, built post-launch</span></div>
  </div>

  <!-- Changelog -->
  <div class="changelog">
    <div class="changelog-title">v7 Changes — Milestone Scoping + Runtime Threats</div>
    <strong style="color: var(--text);">All open questions resolved.</strong> Solo developer reality check: two milestones. v0.1 validates PDS federation (3-4 months). v1.0 ships the product (3-4 months after). Runtime threat mitigations prioritized: XRPC hardening → Tauri sandbox → relay trust verification (designed) → supply chain auditing.<br>
    <code>NEW</code> Milestone badges on every component (v0.1 / v1.0 / LATER)<br>
    <code>NEW</code> Runtime security mitigations: XRPC fuzzing, Tauri IPC lockdown, commit ack protocol, cargo-audit<br>
    <code>RESOLVED</code> All four open questions now closed
  </div>

  <div class="changelog" style="opacity: 0.5;">
    <div class="changelog-title">Previous Versions</div>
    v2: Corrected relay model (outbound only). v3: Shamir key recovery + repo snapshots. v4: Conformance strategy. v5: Custom PDS shell + atrium/rsky deps. v6: GeoDNS + BYO relay.
  </div>

  <!-- ============================================ -->
  <!-- LAYER 1: DEVICE -->
  <!-- ============================================ -->
  <div class="layer layer-device">
    <div class="layer-header">
      <span class="layer-num">01</span>
      Device Layer — User's Mac
    </div>
    <div class="layer-body">
      <div class="components">

        <div class="component">
          <div class="badge badge-01">v0.1</div>
          <div class="component-icon">🖥</div>
          <div class="component-name">Tauri Shell</div>
          <div class="component-desc">Native macOS app. Process lifecycle, auto-updates, system tray. Minimal IPC allowlist — webview cannot access filesystem, shell, or network directly.</div>
        </div>

        <div class="component">
          <div class="badge badge-01">v0.1</div>
          <div class="component-icon">📦</div>
          <div class="component-name">Custom PDS Shell</div>
          <div class="component-desc">Purpose-built embedded PDS. SQLite-backed, local-first. Owns XRPC surface and sync protocol. The core technical risk — must federate on the live network before anything else matters.</div>
        </div>

        <div class="component">
          <div class="badge badge-01">v0.1</div>
          <div class="component-icon">🧩</div>
          <div class="component-name">Dependency Stack</div>
          <div class="component-desc"><strong>atrium-api</strong> — XRPC types, lexicon defs (auto-generated). <strong>atrium-repo</strong> — MST read/write, CAR export. <strong>rsky-crypto</strong> — P-256/K-256 commit signing.</div>
        </div>

        <div class="component">
          <div class="badge badge-01">v0.1</div>
          <div class="component-icon">🔗</div>
          <div class="component-name">Iroh Endpoint</div>
          <div class="component-desc">QUIC-based tunnel to relay. NAT traversal, connection resumption on wake.
Pushes signed repo commits to relay when online.</div> 206 + </div> 207 + 208 + <div class="component"> 209 + <div class="badge badge-01">v0.1</div> 210 + <div class="component-icon">🔐</div> 211 + <div class="component-name">DID Keystore</div> 212 + <div class="component-desc">Signing keys in macOS Keychain. Basic key management for v0.1. Shamir split added in v1.0.</div> 213 + </div> 214 + 215 + <div class="component"> 216 + <div class="badge badge-10">v1.0</div> 217 + <div class="component-icon">🛟</div> 218 + <div class="component-name">Recovery Share Manager</div> 219 + <div class="component-desc">Generic UI for managing Shamir shares. Supports iCloud Keychain, encrypted file export, relay escrow. Interface designed for future social recovery.</div> 220 + </div> 221 + 222 + <div class="component"> 223 + <div class="badge badge-10">v1.0</div> 224 + <div class="component-icon">⚠️</div> 225 + <div class="component-name">Compat Warning Banner</div> 226 + <div class="component-desc">Non-blocking in-app warning when spec drift detected. Links to update. Never blocks launch.</div> 227 + </div> 228 + 229 + <div class="component"> 230 + <div class="badge badge-10">v1.0</div> 231 + <div class="component-icon">🛡</div> 232 + <div class="component-name">XRPC Hardening</div> 233 + <div class="component-desc">Request size limits on all endpoints. Rate limiting at relay. cargo-fuzz targets for CBOR/CAR/MST parsing paths. Adversarial MST key distribution testing per ATProto spec guidance.</div> 234 + </div> 235 + 236 + <div class="component removed"> 237 + <div class="badge badge-removed">REMOVED</div> 238 + <div class="component-icon">📦</div> 239 + <div class="component-name">rsky-pds (fork)</div> 240 + <div class="component-desc">Replaced by custom PDS shell in v5. 
Now tracking a spec, not a codebase.</div> 241 + </div> 242 + 243 + </div> 244 + </div> 245 + </div> 246 + 247 + <!-- ============================================ --> 248 + <!-- LAYER 2: RELAY --> 249 + <!-- ============================================ --> 250 + <div class="layer layer-relay"> 251 + <div class="layer-header"> 252 + <span class="layer-num">02</span> 253 + Relay Layer — Managed + BYO 254 + </div> 255 + <div class="layer-body"> 256 + 257 + <div class="sublabel" style="color:var(--accent-green);">Managed Relay (Your Infrastructure)</div> 258 + <div class="components"> 259 + 260 + <div class="component"> 261 + <div class="badge badge-01">v0.1</div> 262 + <div class="component-icon">🚇</div> 263 + <div class="component-name">Iroh Relay Node</div> 264 + <div class="component-desc">Single-region for v0.1. Always-on tunnel endpoint. Receives commits from device, proxies XRPC repo reads.</div> 265 + </div> 266 + 267 + <div class="component"> 268 + <div class="badge badge-01">v0.1</div> 269 + <div class="component-icon">📣</div> 270 + <div class="component-name">requestCrawl Trigger</div> 271 + <div class="component-desc">On device reconnect, pings BGS requestCrawl so new content propagates immediately.</div> 272 + </div> 273 + 274 + <div class="component"> 275 + <div class="badge badge-10">v1.0</div> 276 + <div class="component-icon">📡</div> 277 + <div class="component-name">Firehose Proxy</div> 278 + <div class="component-desc">Persistent WebSocket to BGS on behalf of PDS. Streams commits continuously. BGS sees 100% uptime. Key paid feature. Beta/preview at launch.</div> 279 + </div> 280 + 281 + <div class="component"> 282 + <div class="badge badge-10">v1.0</div> 283 + <div class="component-icon">💾</div> 284 + <div class="component-name">Commit Buffer</div> 285 + <div class="component-desc">Rolling log of signed repo commits. Feeds firehose proxy during offline. 
Tiered retention: 7d free, 30d paid, 90d business.</div> 286 + </div> 287 + 288 + <div class="component"> 289 + <div class="badge badge-10">v1.0</div> 290 + <div class="component-icon">⚙️</div> 291 + <div class="component-name">Provisioning API</div> 292 + <div class="component-desc">Account setup, domain linking, relay config. Onboarding flow for new Tauri app connections.</div> 293 + </div> 294 + 295 + <div class="component"> 296 + <div class="badge badge-10">v1.0</div> 297 + <div class="component-icon">🔏</div> 298 + <div class="component-name">Key Share Escrow</div> 299 + <div class="component-desc">Holds one encrypted Shamir share. Cannot reconstruct alone. Encrypted at rest, access-logged.</div> 300 + </div> 301 + 302 + <div class="component"> 303 + <div class="badge badge-10">v1.0</div> 304 + <div class="component-icon">💓</div> 305 + <div class="component-name">Health Monitor</div> 306 + <div class="component-desc">Device liveness, relay uptime, ATProto spec compat. Includes canary account for silent federation failure detection.</div> 307 + </div> 308 + 309 + <div class="component"> 310 + <div class="badge badge-later">LATER</div> 311 + <div class="component-icon">🌎</div> 312 + <div class="component-name">GeoDNS Multi-Region</div> 313 + <div class="component-desc">2-3 relay nodes, route to nearest healthy. Simple failover with brief firehose gap. Cross-region replication interface designed but not built.</div> 314 + </div> 315 + 316 + <div class="component"> 317 + <div class="badge badge-later">LATER</div> 318 + <div class="component-icon">🗄</div> 319 + <div class="component-name">Repo Snapshot</div> 320 + <div class="component-desc">Full repo backup on relay. Incremental from commit buffer. Pro+ feature. 
Enables one-click device migration.</div> 321 + </div> 322 + 323 + <div class="component"> 324 + <div class="badge badge-later">LATER</div> 325 + <div class="component-icon">🌐</div> 326 + <div class="component-name">CDN / Public Cache</div> 327 + <div class="component-desc">Serves public repo content during offline windows.</div> 328 + </div> 329 + 330 + </div> 331 + 332 + <div class="sublabel" style="color:var(--accent-cyan); margin-top:24px;">BYO Relay (User-Hosted, Free)</div> 333 + <div class="components"> 334 + 335 + <div class="component" style="border-color: rgba(34, 211, 238, 0.3);"> 336 + <div class="badge badge-10">v1.0</div> 337 + <div class="component-icon">📦</div> 338 + <div class="component-name">Relay Binary</div> 339 + <div class="component-desc">Open-source. Nix flake (source of truth) → Docker image + NixOS module. Tunnel + commit forwarding + requestCrawl. No firehose proxy, no snapshots.</div> 340 + </div> 341 + 342 + <div class="component" style="border-color: rgba(34, 211, 238, 0.3);"> 343 + <div class="badge badge-10">v1.0</div> 344 + <div class="component-icon">📜</div> 345 + <div class="component-name">Device-Relay Protocol Spec</div> 346 + <div class="component-desc">Documented contract: handshake/auth, commit push, health ping, optional feature negotiation. Includes commit ack for future trust verification.</div> 347 + </div> 348 + 349 + <div class="component" style="border-color: rgba(34, 211, 238, 0.3);"> 350 + <div class="badge badge-10">v1.0</div> 351 + <div class="component-icon">🔌</div> 352 + <div class="component-name">Feature Negotiation</div> 353 + <div class="component-desc">App queries relay capabilities on connect. Gracefully degrades when extended features unavailable. 
Suggests upgrade for missing features.</div> 354 + </div> 355 + 356 + </div> 357 + 358 + <div class="components" style="margin-top: 16px;"> 359 + <div class="component removed"> 360 + <div class="badge badge-removed">REMOVED</div> 361 + <div class="component-icon">📬</div> 362 + <div class="component-name">Inbound Message Queue</div> 363 + <div class="component-desc">Not needed. ATProto records live in author's repo.</div> 364 + </div> 365 + </div> 366 + 367 + </div> 368 + </div> 369 + 370 + <!-- ============================================ --> 371 + <!-- DATA FLOW --> 372 + <!-- ============================================ --> 373 + <div class="flow-section"> 374 + <div class="flow-header">Data Flow — How Bob's Post Reaches the Network</div> 375 + <div class="flow-body"> 376 + <div class="flow-diagram"> 377 + <div class="flow-node node-device"> 378 + <div class="flow-node-label">Bob's Mac</div> 379 + <div class="flow-node-detail">PDS shell signs commit</div> 380 + </div> 381 + <div class="flow-arrow"><div class="flow-arrow-label">Iroh tunnel</div><div class="flow-arrow-line">→</div></div> 382 + <div class="flow-node node-relay"> 383 + <div class="flow-node-label">Commit Buffer</div> 384 + <div class="flow-node-detail">Persists signed commit</div> 385 + </div> 386 + <div class="flow-arrow"><div class="flow-arrow-label">streams</div><div class="flow-arrow-line">→</div></div> 387 + <div class="flow-node node-relay"> 388 + <div class="flow-node-label">Firehose Proxy</div> 389 + <div class="flow-node-detail">Stable WebSocket</div> 390 + </div> 391 + <div class="flow-arrow"><div class="flow-arrow-label">subscribeRepos</div><div class="flow-arrow-line">→</div></div> 392 + <div class="flow-node node-network"> 393 + <div class="flow-node-label">BGS / Relay</div> 394 + <div class="flow-node-detail">Network firehose</div> 395 + </div> 396 + <div class="flow-arrow"><div class="flow-arrow-label">indexes</div><div class="flow-arrow-line">→</div></div> 397 + <div class="flow-node 
node-network"> 398 + <div class="flow-node-label">AppView</div> 399 + <div class="flow-node-detail">Bluesky etc.</div> 400 + </div> 401 + </div> 402 + <div style="text-align:center; margin-top:12px; font-size:11px; color:var(--text-dim); font-family:'JetBrains Mono',monospace;"> 403 + If Bob's Mac is asleep → commit buffer feeds firehose proxy from stored commits on wake 404 + </div> 405 + </div> 406 + </div> 407 + 408 + <!-- Recovery Flow --> 409 + <div class="flow-section"> 410 + <div class="flow-header" style="color: var(--accent-green);">Recovery Flow — Device Migration / Dead SSD <span style="opacity:0.5; font-size:10px; margin-left:8px;">v1.0</span></div> 411 + <div class="flow-body"> 412 + <div class="flow-diagram"> 413 + <div class="flow-node node-device"> 414 + <div class="flow-node-label">New Mac</div> 415 + <div class="flow-node-detail">Installs Tauri app</div> 416 + </div> 417 + <div class="flow-arrow"><div class="flow-arrow-label">authenticates</div><div class="flow-arrow-line">→</div></div> 418 + <div class="flow-node node-relay"> 419 + <div class="flow-node-label">2-of-3 Shares</div> 420 + <div class="flow-node-detail">iCloud + relay escrow</div> 421 + </div> 422 + <div class="flow-arrow"><div class="flow-arrow-label">reconstructs</div><div class="flow-arrow-line">→</div></div> 423 + <div class="flow-node node-device"> 424 + <div class="flow-node-label">Rotation Key</div> 425 + <div class="flow-node-detail">Shamir recombination</div> 426 + </div> 427 + <div class="flow-arrow"><div class="flow-arrow-label">did:plc op</div><div class="flow-arrow-line">→</div></div> 428 + <div class="flow-node node-network"> 429 + <div class="flow-node-label">Key Rotation</div> 430 + <div class="flow-node-detail">New signing key</div> 431 + </div> 432 + <div class="flow-arrow"><div class="flow-arrow-label">syncs</div><div class="flow-arrow-line">→</div></div> 433 + <div class="flow-node node-relay"> 434 + <div class="flow-node-label">Repo Snapshot</div> 435 + <div 
class="flow-node-detail">Full repo restore</div> 436 + </div> 437 + </div> 438 + <div style="text-align:center; margin-top:16px; font-size:11px; color:var(--text-dim); font-family:'JetBrains Mono',monospace; line-height: 1.8;"> 439 + Share sources (any 2 of 3): &nbsp; 440 + <span style="color:var(--accent-blue)">① iCloud Keychain</span> &nbsp;·&nbsp; 441 + <span style="color:var(--accent-green)">② Relay escrow</span> &nbsp;·&nbsp; 442 + <span style="color:var(--accent-amber)">③ Exported recovery file</span><br> 443 + <span style="color:var(--text-dim); opacity:0.5">Future: ④ Trusted contact (social recovery) — interface designed, not yet shipped</span> 444 + </div> 445 + </div> 446 + </div> 447 + 448 + <!-- ============================================ --> 449 + <!-- LAYER 3: INFRASTRUCTURE --> 450 + <!-- ============================================ --> 451 + <div class="layer layer-infra"> 452 + <div class="layer-header"> 453 + <span class="layer-num">03</span> 454 + Infrastructure — ATProto Network 455 + </div> 456 + <div class="layer-body"> 457 + <div class="components"> 458 + <div class="component"> 459 + <div class="badge badge-01">v0.1</div> 460 + <div class="component-icon">🌍</div> 461 + <div class="component-name">Federation</div> 462 + <div class="component-desc">PDS participates in ATProto network via relay. DID document points to relay URL as canonical PDS endpoint.</div> 463 + </div> 464 + <div class="component"> 465 + <div class="badge badge-10">v1.0</div> 466 + <div class="component-icon">🔤</div> 467 + <div class="component-name">DNS / Domain Automation</div> 468 + <div class="component-desc">Handle-as-domain resolution. Automated DNS config for custom domain handles.</div> 469 + </div> 470 + <div class="component"> 471 + <div class="badge badge-01">v0.1</div> 472 + <div class="component-icon">🪪</div> 473 + <div class="component-name">DID Resolution</div> 474 + <div class="component-desc">did:plc or did:web pointing to relay endpoint. 
Relay always reachable, DID resolution never fails due to offline device.</div> 475 + </div> 476 + </div> 477 + </div> 478 + </div> 479 + 480 + <!-- ============================================ --> 481 + <!-- LAYER 4: OPS --> 482 + <!-- ============================================ --> 483 + <div class="layer layer-ops"> 484 + <div class="layer-header"> 485 + <span class="layer-num">04</span> 486 + Ops — Security, Conformance &amp; Updates 487 + </div> 488 + <div class="layer-body"> 489 + 490 + <div class="sublabel" style="color:var(--accent-rose);">Update &amp; Supply Chain Security</div> 491 + <div class="components"> 492 + <div class="component"> 493 + <div class="badge badge-10">v1.0</div> 494 + <div class="component-icon">🔑</div> 495 + <div class="component-name">2-of-3 Threshold Signing</div> 496 + <div class="component-desc">CI key + offline engineer key + cold storage. Compromised CI alone cannot ship malicious updates.</div> 497 + </div> 498 + <div class="component"> 499 + <div class="badge badge-later">LATER</div> 500 + <div class="component-icon">📋</div> 501 + <div class="component-name">Transparency Log</div> 502 + <div class="component-desc">Sigstore-backed. 
Every release publicly logged.</div> 503 + </div> 504 + <div class="component"> 505 + <div class="badge badge-10">v1.0</div> 506 + <div class="component-icon">🍎</div> 507 + <div class="component-name">Apple Notarization</div> 508 + <div class="component-desc">First verification layer via Tauri build pipeline.</div> 509 + </div> 510 + <div class="component"> 511 + <div class="badge badge-10">v1.0</div> 512 + <div class="component-icon">📬</div> 513 + <div class="component-name">Responsible Disclosure</div> 514 + <div class="component-desc">security@ + published PGP key from day one.</div> 515 + </div> 516 + <div class="component"> 517 + <div class="badge badge-01">v0.1</div> 518 + <div class="component-icon">📦</div> 519 + <div class="component-name">cargo-audit in CI</div> 520 + <div class="component-desc">Dependency vulnerability scanning on every build. Pin exact versions in Cargo.lock. Review diffs on dep updates. Verify atrium codegen input against upstream lexicons.</div> 521 + </div> 522 + </div> 523 + 524 + <div class="sublabel" style="color:var(--accent-rose); margin-top: 24px;">Conformance Testing</div> 525 + <div class="components"> 526 + <div class="component"> 527 + <div class="badge badge-01">v0.1</div> 528 + <div class="component-icon">🧪</div> 529 + <div class="component-name">L1: Interop Test Vectors</div> 530 + <div class="component-desc">Every commit. Official atproto-interop-tests + interop-test-files. Byte-level checks for MST, CAR, CBOR, CID, commit proofs. Strict MST validation.</div> 531 + </div> 532 + <div class="component"> 533 + <div class="badge badge-10">v1.0</div> 534 + <div class="component-icon">🔬</div> 535 + <div class="component-name">L2: Oracle Compat Suite</div> 536 + <div class="component-desc">Nightly CI. Docker Compose: reference TypeScript PDS vs your Rust PDS. 
Compare CAR output, firehose events, MST roots.</div> 537 + </div> 538 + <div class="component"> 539 + <div class="badge badge-10">v1.0</div> 540 + <div class="component-icon">🐤</div> 541 + <div class="component-name">L3: Production Canary</div> 542 + <div class="component-desc">Live account on real Bluesky via your relay. Health monitor verifies posts appear in AppView. Catches silent federation failures.</div> 543 + </div> 544 + </div> 545 + 546 + <div class="sublabel" style="color:var(--accent-rose); margin-top: 24px;">Runtime Threat Mitigations</div> 547 + <div class="components"> 548 + <div class="component"> 549 + <div class="badge badge-10">v1.0</div> 550 + <div class="component-icon">🔒</div> 551 + <div class="component-name">XRPC Input Hardening</div> 552 + <div class="component-desc">Request size limits per endpoint. Rate limiting at relay layer. cargo-fuzz targets for CBOR/CAR/MST parsing. Adversarial MST key testing per spec DoS guidance.</div> 553 + </div> 554 + <div class="component"> 555 + <div class="badge badge-01">v0.1</div> 556 + <div class="component-icon">🏗</div> 557 + <div class="component-name">Tauri IPC Lockdown</div> 558 + <div class="component-desc">Minimal allowlist: create/list/get records + status. Webview cannot access filesystem, shell, HTTP, or crypto. All sensitive ops in Rust backend only.</div> 559 + </div> 560 + <div class="component"> 561 + <div class="badge badge-later">LATER</div> 562 + <div class="component-icon">🤝</div> 563 + <div class="component-name">Relay Trust Verification</div> 564 + <div class="component-desc">Device verifies commits appear in firehose. Protocol designed now (commit ack with seq number), verification logic built later. 
Protects against censorship by relay.</div> 565 + </div> 566 + </div> 567 + 568 + </div> 569 + </div> 570 + 571 + <!-- ============================================ --> 572 + <!-- TIERS --> 573 + <!-- ============================================ --> 574 + <div class="tier-section"> 575 + <div class="tier-header">Relay Tier Pricing — v1.0 Launch</div> 576 + <div class="tier-body"> 577 + <div class="tier-card"> 578 + <div class="tier-name" style="color:var(--text-dim)">Free</div> 579 + <div class="tier-price">$0/mo</div> 580 + <ul class="tier-features"> 581 + <li>Iroh tunnel (NAT traversal)</li> 582 + <li>Basic XRPC proxy</li> 583 + <li>7-day commit buffer</li> 584 + <li>Key share escrow (1 share)</li> 585 + <li>BYO relay supported (Nix / Docker)</li> 586 + <li>Apple notarized updates</li> 587 + <li>No firehose proxy — BGS drops on sleep</li> 588 + </ul> 589 + </div> 590 + <div class="tier-card highlighted"> 591 + <div class="tier-name" style="color:var(--accent-green)">Pro</div> 592 + <div class="tier-price">$X/mo</div> 593 + <ul class="tier-features"> 594 + <li>Everything in Free</li> 595 + <li>Stable firehose proxy (always-on WebSocket)</li> 596 + <li>30-day commit buffer</li> 597 + <li>CDN cache for public content</li> 598 + <li>requestCrawl auto-trigger</li> 599 + <li>Custom domain handle</li> 600 + <li style="color:var(--accent-amber)">Multi-region GeoDNS (post-launch)</li> 601 + <li style="color:var(--accent-amber)">Full repo snapshot (post-launch)</li> 602 + <li>One-click device migration</li> 603 + </ul> 604 + </div> 605 + <div class="tier-card"> 606 + <div class="tier-name" style="color:var(--accent-purple)">Business</div> 607 + <div class="tier-price">$XX/mo</div> 608 + <ul class="tier-features"> 609 + <li>Everything in Pro</li> 610 + <li>90-day commit buffer</li> 611 + <li style="color:var(--accent-amber)">Continuous repo snapshot (post-launch)</li> 612 + <li>Admin dashboard</li> 613 + <li>Priority support</li> 614 + <li>Custom relay config</li> 615 + 
<li>Audit logs</li> 616 + </ul> 617 + </div> 618 + </div> 619 + </div> 620 + 621 + <!-- ============================================ --> 622 + <!-- RESOLVED QUESTIONS --> 623 + <!-- ============================================ --> 624 + <div class="security-section"> 625 + <div class="security-header" style="color:var(--accent-green)">All Questions Resolved</div> 626 + <div class="security-body"> 627 + <div class="sec-card" style="opacity:0.6; border-color: rgba(52, 211, 153, 0.2);"> 628 + <div class="sec-icon">✅</div> 629 + <div class="sec-title" style="color:var(--accent-green)">Availability — v2</div> 630 + <div class="sec-desc">Firehose proxy + commit buffer. Relay maintains persistent BGS WebSocket. No inbound buffering needed.</div> 631 + </div> 632 + <div class="sec-card" style="opacity:0.6; border-color: rgba(52, 211, 153, 0.2);"> 633 + <div class="sec-icon">✅</div> 634 + <div class="sec-title" style="color:var(--accent-green)">Durability — v3</div> 635 + <div class="sec-desc">2-of-3 Shamir key recovery. Tiered repo snapshots. iCloud + file export + relay escrow.</div> 636 + </div> 637 + <div class="sec-card" style="opacity:0.6; border-color: rgba(52, 211, 153, 0.2);"> 638 + <div class="sec-icon">✅</div> 639 + <div class="sec-title" style="color:var(--accent-green)">Spec Drift — v5</div> 640 + <div class="sec-desc">Custom PDS shell (atrium + rsky-crypto). 3-layer conformance: interop vectors → oracle → canary.</div> 641 + </div> 642 + <div class="sec-card" style="opacity:0.6; border-color: rgba(52, 211, 153, 0.2);"> 643 + <div class="sec-icon">✅</div> 644 + <div class="sec-title" style="color:var(--accent-green)">Relay Redundancy — v6</div> 645 + <div class="sec-desc">GeoDNS multi-region + BYO relay (Nix/Docker, free). Device-relay protocol spec. 
Feature negotiation.</div> 646 + </div> 647 + <div class="sec-card" style="opacity:0.6; border-color: rgba(52, 211, 153, 0.2);"> 648 + <div class="sec-icon">✅</div> 649 + <div class="sec-title" style="color:var(--accent-green)">Runtime Threats — v7</div> 650 + <div class="sec-desc">XRPC fuzzing + size limits. Tauri IPC lockdown. Commit ack protocol (designed). cargo-audit. Relay trust verification (designed, deferred).</div> 651 + </div> 652 + </div> 653 + </div> 654 + 655 + <!-- ============================================ --> 656 + <!-- MILESTONE SUMMARY --> 657 + <!-- ============================================ --> 658 + <div class="flow-section" style="margin-top:32px;"> 659 + <div class="flow-header" style="color:var(--accent-blue);">Milestone Summary — Solo Developer</div> 660 + <div class="flow-body" style="font-size:13px; line-height:1.9;"> 661 + <div style="display:grid; grid-template-columns: 1fr 1fr; gap: 32px;"> 662 + <div> 663 + <div style="font-family:'JetBrains Mono',monospace; font-size:14px; font-weight:700; color:var(--accent-blue); margin-bottom:12px;">v0.1 — Technical Preview</div> 664 + <div style="font-family:'JetBrains Mono',monospace; font-size:10px; color:var(--text-dim); margin-bottom:12px;">~3-4 MONTHS · YOU + 10 BETA TESTERS</div> 665 + <div style="color:var(--text-dim); font-size:12px; line-height:1.8;"> 666 + <strong style="color:var(--text)">Goal:</strong> Can the custom PDS shell federate on the live ATProto network?<br><br> 667 + Tauri shell + custom PDS shell + dependency stack + Iroh endpoint + DID keystore (basic) + single-region relay + requestCrawl + federation + DID resolution + L1 interop tests + cargo-audit + Tauri IPC lockdown 668 + </div> 669 + </div> 670 + <div> 671 + <div style="font-family:'JetBrains Mono',monospace; font-size:14px; font-weight:700; color:var(--accent-purple); margin-bottom:12px;">v1.0 — Public Launch</div> 672 + <div style="font-family:'JetBrains Mono',monospace; font-size:10px; color:var(--text-dim); 
margin-bottom:12px;">~3-4 MONTHS AFTER v0.1 · EARLY ADOPTERS</div> 673 + <div style="color:var(--text-dim); font-size:12px; line-height:1.8;"> 674 + <strong style="color:var(--text)">Goal:</strong> A product people pay for with real sovereignty guarantees.<br><br> 675 + Shamir key recovery + BYO relay binary (Nix/Docker) + device-relay protocol spec + feature negotiation + firehose proxy (beta) + commit buffer + provisioning API + key share escrow + health monitor + XRPC hardening + L2 oracle suite + L3 canary + compat warnings + Apple notarization + threshold signing + responsible disclosure + DNS automation + tier pricing 676 + </div> 677 + </div> 678 + </div> 679 + </div> 680 + </div> 681 + 682 + </div> 683 + </body> 684 + </html>
docs/archive/provisioning-api-spec-v2.docx

This is a binary file and will not be displayed.

+567
docs/archive/provisioning-api-v03-amendments.md
# Provisioning API v0.3 Amendments

Changes required to update the Provisioning API from v0.2 to v0.3. Each section maps to an action item from cross-spec-analysis.md.

---

## Changelog Entry

```
v0.3 Changes — Mobile-First Reconciliation + Endpoint Consolidation

Reconciled with mobile architecture spec v1.2 (canonical) and
migration spec v0.1. All endpoints tagged with milestone phase.

FIX  Ed25519 → P-256/secp256k1 throughout (ATProto requirement)
FIX  DID ceremony: client sends key material, relay constructs DID doc
FIX  Tier model: Free/Pro/Business + BYO as deployment model
NEW  POST /v1/accounts/mobile — combined mobile account creation
NEW  9 endpoints from mobile spec (relay keys, device mgmt, signing)
NEW  8 endpoints from migration spec (transfer, recovery, Shamir)
NEW  Blob endpoints (uploadBlob, getBlob, listBlobs)
NEW  Milestone tags on all endpoint groups
REF  See unified-milestone-map.md for phase details
```

---

## Item 9: Replace Ed25519 with P-256/secp256k1

ATProto uses P-256 (secp256r1) and secp256k1. All Ed25519 references in the provisioning API are incorrect.

### Affected Locations

**Section 3 — POST /v1/devices request body:**

Current:
```
device_public_key    string    yes    Ed25519 public key, base64url-encoded
```

Change to:
```
device_public_key    string    yes    P-256 (secp256r1) public key, base64url-encoded
```

**Section 4 — POST /v1/dids/:did/rotate request body:**

Current:
```
new_public_key    string    yes    New Ed25519 public key, base64url-encoded
```

Change to:
```
new_public_key    string    yes    New P-256 public key, base64url-encoded
```

**Section 6.1 — Setup wizard step 1:**

Current:
```
Tauri generates an Ed25519 keypair and calls POST /v1/devices.
```

Change to:
```
Tauri generates a P-256 keypair and calls POST /v1/devices.
```

**Section 9.2 — Key Management:**

No explicit curve mention, but add clarification:
```
Device keys use P-256 (secp256r1). The relay signing key uses
P-256 or secp256k1 per ATProto spec. Ed25519 is NOT supported
by the ATProto network.
```

---

## Item 10: Rewrite DID Ceremony (POST /v1/dids)

The current spec has the client building a full DID document and submitting it. Per the mobile spec (canonical), the client sends key material and the relay constructs the DID document.

### Current Request Body

```
did_document     object    yes    W3C DID Document with verification methods
method           string    no     "did:plc" (default) or "did:web"
rotation_keys    array     no     Additional recovery keys for did:plc
```

### New Request Body

```
Field               Type      Required    Description
signing_pub_key     string    yes         P-256 public key for signing, base64url-encoded
rotation_pub_key    string    yes         P-256 public key for DID rotation, base64url-encoded
method              string    no          "did:plc" (default) or "did:web" (requires custom domain)
handle              string    no          Desired handle (if not already set via POST /v1/handles)
recovery_keys       array     no          Additional rotation keys for did:plc recovery
```

### Updated Description

```
Initiate a DID ceremony. The client submits key material (public keys
only — private keys never leave the device). The relay constructs the
DID document, including:
- verification methods from submitted keys
- service endpoint pointing to the relay
- handle (atproto_handle alias)

For did:plc, the relay constructs and signs the genesis operation,
then submits it to plc.directory. For did:web, the relay begins
serving /.well-known/did.json through the user's custom domain.

The relay holds the signing key and uses it for all commit signing.
The rotation key stays on the device (Secure Enclave on iOS,
Keychain on macOS) and is only needed for DID recovery/rotation.
```

### Updated Response

Add field:
```
relay_signing_key_id    string    Identifier of the relay-generated signing key
```

### Updated Error Responses

Replace:
```
422    KEY_MISMATCH    DID document key doesn't match device's registered public key
```

With:
```
422    INVALID_KEY_FORMAT    Public key is malformed or not on a supported curve (P-256/secp256k1)
```

### Design Decision to Add

```
Design Decision: Relay constructs DID document
The client sends raw key material, not a pre-built DID document.
This ensures the relay controls the document structure, service
endpoints, and signing key binding. The client only needs to
provide public keys — no DID document assembly logic required
on the client side. This simplifies mobile clients significantly.
```

---

## Item 11: Add POST /v1/accounts/mobile

New endpoint for mobile account creation. Combines account creation + device binding + DID ceremony in a single call, avoiding the web dashboard → claim code → Tauri handoff flow.

### New Endpoint

```
POST /v1/accounts/mobile                                    [v0.1]

Combined account creation for mobile clients. Creates the account,
binds the device, generates the relay signing key, and initiates
the DID ceremony in one request. Replaces the multi-step web
dashboard flow for iOS users.

Request Body:
Field                Type      Required    Description
email                string    yes         User's email address
password             string    yes         Minimum 12 characters
display_name         string    no          Optional display name
device_public_key    string    yes         P-256 public key from Secure Enclave
device_name          string    no          e.g. "iPhone 15 Pro"
rotation_pub_key     string    yes         P-256 rotation key (stays on device)
handle               string    no          Desired handle (subdomain assigned if omitted)
did_method           string    no          "did:plc" (default) or "did:web"

Response (200 OK):
Field                Type      Description
account_id           string    UUID v7 account identifier
device_id            string    UUID v7 device identifier
device_token         string    Long-lived opaque token
session_token        string    JWT, 24-hour expiry
did                  string    Fully qualified DID string
did_document         object    The constructed DID document
handle               string    Assigned handle
relay_endpoint       object    Relay Endpoint Object (see §3.2)
relay_signing_key    string    Key ID of the relay's signing key
tier                 string    Always "free" on creation

Error Responses:
Status    Code              Description
409       ACCOUNT_EXISTS    Email already registered
422       WEAK_PASSWORD     Password doesn't meet requirements
422       INVALID_KEY       Public key is malformed or unsupported curve
409       HANDLE_TAKEN      Requested handle is already in use
429       RATE_LIMITED      Too many signup attempts

Note: This endpoint performs Shamir share generation as part of
account creation. Share 1 is returned in the response for the
client to store in iCloud Keychain. Share 2 is escrowed at the
relay. Share 3 handling depends on user choice (communicated in
a follow-up call or during onboarding flow).
206 + 207 + Additional Response Fields (Shamir): 208 + shamir_share_1 string Encrypted share for iCloud Keychain storage 209 + shamir_share_3_options array Available storage methods for Share 3 210 + ``` 211 + 212 + --- 213 + 214 + ## Item 12: Integrate 17 Endpoints from Mobile + Migration Specs 215 + 216 + These endpoints are defined in the mobile and migration specs but missing from the provisioning API. They should be added as new sections. 217 + 218 + ### New Section: Relay Key Management 219 + 220 + ``` 221 + 11. Relay Key Management [v0.1] 222 + 223 + The relay holds the ATProto signing key. These endpoints manage 224 + the relay's key lifecycle. 225 + 226 + POST /v1/relay/keys [v0.1] 227 + Generate a new relay signing key. Called during account creation 228 + or key rotation. The relay generates the key internally — the 229 + private key is never exposed. 230 + 231 + Response (200 OK): 232 + key_id string Key identifier 233 + public_key string P-256 public key, base64url-encoded 234 + algorithm string "ES256" (P-256) or "ES256K" (secp256k1) 235 + created_at string ISO 8601 236 + 237 + DELETE /v1/relay/keys/:keyId [v1.0] 238 + Revoke a relay signing key. Triggers DID rotation to update 239 + the signing key in the DID document. 240 + 241 + Response (200 OK): 242 + revoked_at string ISO 8601 243 + rotation_status string "pending" | "complete" 244 + 245 + POST /v1/relay/commits/sign [v0.2] 246 + Sign an unsigned commit constructed by the desktop PDS. 247 + Desktop-enrolled mode only. 248 + 249 + Request Body: 250 + unsigned_commit bytes CAR-encoded unsigned commit 251 + repo_did string DID of the repo 252 + 253 + Response (200 OK): 254 + signed_commit bytes CAR-encoded signed commit 255 + commit_cid string CID of the signed commit 256 + 257 + GET /v1/relay/repo/snapshot [v0.2] 258 + Full repo export as CAR file. Used by desktop during initial 259 + sync after enrollment. 
260 + 261 + Response: streaming CAR file (same format as com.atproto.sync.getRepo) 262 + 263 + GET /v1/relay/mode [v0.2] 264 + Current relay operating mode for this account. 265 + 266 + Response (200 OK): 267 + mode string "mobile-only" | "desktop-enrolled" | "desktop-offline" 268 + primary_device string Device ID of repo host (null in mobile-only) 269 + signing_key_id string Active signing key identifier 270 + ``` 271 + 272 + ### New Section: Device Management (Mobile) 273 + 274 + ``` 275 + 12. Device Management [v0.2] 276 + 277 + Extended device operations for the mobile app. These supplement 278 + the existing device registration endpoints in §3. 279 + 280 + POST /v1/devices/:id/pair [v0.2] 281 + Initiate device pairing via QR code. The phone generates a 282 + pairing session, the desktop scans the QR code containing 283 + the session details. 284 + 285 + Request Body: 286 + pairing_code string Code from QR scan 287 + device_type string "desktop" | "mobile" 288 + 289 + Response (200 OK): 290 + paired_at string ISO 8601 291 + device_id string The paired device's ID 292 + pairing_status string "paired" | "pending_promotion" 293 + 294 + POST /v1/devices/:id/promote [v0.2] 295 + Promote a paired desktop to repo host. Transitions the relay 296 + from mobile-only to desktop-enrolled mode. The relay transfers 297 + the repo to the desktop via Iroh. 298 + 299 + Response (200 OK): 300 + promoted_at string ISO 8601 301 + mode string "desktop-enrolled" 302 + repo_transfer string "in_progress" | "complete" 303 + 304 + GET /v1/devices/:id/status [v0.2] 305 + Device health and connectivity status. 306 + 307 + Response (200 OK): 308 + device_id string Device identifier 309 + status string "online" | "offline" | "degraded" 310 + last_seen string ISO 8601 311 + is_primary boolean Whether this device hosts the repo 312 + mode string Current lifecycle phase 313 + 314 + DELETE /v1/devices/:id [v0.2] 315 + De-enroll a device. 
Already exists in §3 — this note confirms 316 + mobile app can also call it (not just web dashboard). 317 + 318 + Note: Update §3 to allow device_token auth (not just session_token) 319 + for mobile-initiated device removal. 320 + ``` 321 + 322 + ### New Section: Data Transfer 323 + 324 + ``` 325 + 13. Data Transfer [v0.1] 326 + 327 + Planned device swap (e.g., upgrading phones). Uses Iroh for 328 + direct peer-to-peer transfer with a 6-digit verification code. 329 + 330 + POST /v1/transfer/initiate [v0.1] 331 + Generate a transfer session. Returns a 6-digit code for the 332 + new device to enter. 333 + 334 + Response (200 OK): 335 + transfer_id string Transfer session identifier 336 + code string 6-digit verification code 337 + expires_at string ISO 8601 (15 minutes) 338 + iroh_ticket string Iroh connection ticket for direct transfer 339 + 340 + POST /v1/transfer/accept [v0.1] 341 + New device submits the transfer code to join the session. 342 + 343 + Request Body: 344 + code string 6-digit code from old device 345 + device_public_key string P-256 public key of new device 346 + 347 + Response (200 OK): 348 + transfer_id string Transfer session ID 349 + status string "accepted" | "transferring" 350 + 351 + POST /v1/transfer/complete [v0.1] 352 + Finalize the transfer. Old device's token is revoked, new 353 + device receives a fresh device_token. 354 + 355 + Response (200 OK): 356 + new_device_id string New device's identifier 357 + device_token string New device's long-lived token 358 + old_device_revoked boolean Confirmation old token is dead 359 + ``` 360 + 361 + ### New Section: Recovery 362 + 363 + ``` 364 + 14. Recovery [v1.0] 365 + 366 + Unplanned device loss recovery via Shamir share reconstruction. 367 + 368 + POST /v1/recovery/initiate [v1.0] 369 + Begin a recovery ceremony. User must present 2 of 3 Shamir 370 + shares to reconstruct the rotation key. 
371 + 372 + Request Body: 373 + email string Account email for verification 374 + share_1 string First Shamir share (e.g., from iCloud) 375 + share_source_1 string "icloud" | "relay" | "device" | "paper" 376 + 377 + Response (200 OK): 378 + recovery_id string Recovery session identifier 379 + shares_needed integer Number of additional shares required 380 + status string "awaiting_shares" | "ready_to_verify" 381 + 382 + POST /v1/recovery/verify-key [v1.0] 383 + Submit reconstructed key material to prove DID ownership. 384 + 385 + Request Body: 386 + recovery_id string Recovery session ID 387 + share_2 string Second Shamir share 388 + share_source_2 string Source of the second share 389 + 390 + Response (200 OK): 391 + status string "verified" | "failed" 392 + rotation_key string Reconstructed rotation public key (for verification) 393 + 394 + GET /v1/recovery/restore [v1.0] 395 + Stream the repo and blobs from the relay to the new device 396 + after successful key verification. 397 + 398 + Response: streaming CAR file + blob manifest 399 + 400 + PUT /v1/keys/shares/:id [v1.0] 401 + Update the relay-held Shamir share (Share 2). Used after 402 + key rotation to re-split with new shares. 403 + 404 + Request Body: 405 + encrypted_share string New encrypted share data 406 + 407 + Response (200 OK): 408 + updated_at string ISO 8601 409 + 410 + GET /v1/keys/rotation-log [v1.0] 411 + Immutable audit log of all Shamir share rotations and 412 + recovery attempts. 413 + 414 + Response (200 OK): 415 + entries array List of rotation/recovery events with timestamps 416 + ``` 417 + 418 + --- 419 + 420 + ## Item 13: Fix Tier Model 421 + 422 + ### Section 1.4 — Rate Limiting Table 423 + 424 + Current: 425 + ``` 426 + Self-Hosted Unlimited Unlimited Configurable 427 + ``` 428 + 429 + Change to: 430 + ``` 431 + Business 300/min 1200/min 50 concurrent 432 + ``` 433 + 434 + And add a note: 435 + ``` 436 + Note: BYO relay operators configure their own rate limits. 
437 + BYO is a deployment model, not a subscription tier. 438 + See §6.3 for BYO relay configuration. 439 + ``` 440 + 441 + ### Section 2 — GET /v1/accounts/:id/usage 442 + 443 + Current response field: 444 + ``` 445 + tier string Current tier: "free" | "pro" | "self_hosted" 446 + ``` 447 + 448 + Change to: 449 + ``` 450 + tier string Current tier: "free" | "pro" | "business" 451 + ``` 452 + 453 + ### Section 8 — Free Tier Enforcement Table 454 + 455 + Add Business column and remove Self-Hosted references. BYO operators don't hit the managed relay's enforcement — they run their own. 456 + 457 + ### Section 10.2 — Business Metrics 458 + 459 + Current: 460 + ``` 461 + accounts_by_tier Count of accounts per tier (free/pro/self-hosted) 462 + ``` 463 + 464 + Change to: 465 + ``` 466 + accounts_by_tier Count of accounts per tier (free/pro/business) 467 + ``` 468 + 469 + --- 470 + 471 + ## Item 14: Add Milestone Tags to All Endpoints 472 + 473 + Every endpoint group should have a milestone badge. 
Here's the complete mapping: 474 + 475 + ### v0.1 — Mobile-Only PDS 476 + 477 + ``` 478 + POST /v1/accounts Account creation (web) 479 + POST /v1/accounts/mobile Account creation (mobile) [NEW] 480 + POST /v1/accounts/sessions Login 481 + POST /v1/accounts/claim-codes Generate claim code 482 + GET /v1/accounts/:id/usage Usage metrics 483 + POST /v1/devices Device registration 484 + GET /v1/devices/:id/relay Relay endpoint discovery 485 + POST /v1/dids DID ceremony 486 + GET /v1/dids/:did DID resolution 487 + POST /v1/handles Handle creation 488 + GET /v1/handles/:handle/status Handle status 489 + DELETE /v1/handles/:handle Handle release 490 + POST /v1/relay/keys Generate relay signing key [NEW] 491 + POST /v1/transfer/initiate Device transfer [NEW] 492 + POST /v1/transfer/accept Device transfer [NEW] 493 + POST /v1/transfer/complete Device transfer [NEW] 494 + ``` 495 + 496 + ### v0.2 — Desktop Enrollment 497 + 498 + ``` 499 + POST /v1/devices/:id/pair Device pairing [NEW] 500 + POST /v1/devices/:id/promote Desktop promotion [NEW] 501 + GET /v1/devices/:id/status Device health [NEW] 502 + POST /v1/devices/:id/lease Write lease management 503 + POST /v1/relay/commits/sign Commit signing [NEW] 504 + GET /v1/relay/repo/snapshot Repo export [NEW] 505 + GET /v1/relay/mode Operating mode [NEW] 506 + ``` 507 + 508 + ### v1.0 — Production Launch 509 + 510 + ``` 511 + POST /v1/dids/:did/rotate Key rotation 512 + POST /v1/dids/:did/migrate DID migration (exit) 513 + GET /v1/export/repo Full repo export 514 + DELETE /v1/accounts/:id Account deletion 515 + POST /v1/accounts/:id/restore Cancel deletion 516 + DELETE /v1/devices/:id Device revocation 517 + DELETE /v1/relay/keys/:keyId Key revocation [NEW] 518 + POST /v1/recovery/initiate Recovery ceremony [NEW] 519 + POST /v1/recovery/verify-key Recovery verification [NEW] 520 + GET /v1/recovery/restore Recovery restore [NEW] 521 + PUT /v1/keys/shares/:id Share update [NEW] 522 + GET /v1/keys/rotation-log Rotation audit [NEW] 
523 + ``` 524 + 525 + --- 526 + 527 + ## Item 15: Add Blob Endpoints 528 + 529 + These are the provisioning API's view of blob operations. The full blob handling spec covers storage architecture; these are the XRPC-adjacent endpoints the relay serves. 530 + 531 + ``` 532 + 15. Blob Management [v0.1] 533 + 534 + Blob endpoints follow the ATProto spec. See blob-handling-spec.md 535 + for storage architecture and lifecycle details. 536 + 537 + POST /v1/blobs/upload [v0.1] 538 + Alias for com.atproto.repo.uploadBlob. Accepts multipart upload, 539 + returns CID reference. Subject to per-account storage quotas. 540 + 541 + Note: This is the same endpoint as the XRPC uploadBlob — listed 542 + here for completeness. The provisioning API does not add a 543 + separate blob upload path. 544 + 545 + GET /v1/accounts/:id/storage [v0.1] 546 + Blob storage usage for an account. Extends the existing usage 547 + endpoint with blob-specific metrics. 548 + 549 + Response (200 OK): 550 + blob_count integer Total blobs stored 551 + blob_bytes integer Total blob storage consumed 552 + blob_limit integer Tier storage limit for blobs 553 + largest_blob integer Size of largest blob (bytes) 554 + ``` 555 + 556 + --- 557 + 558 + ## Summary: New Endpoint Count 559 + 560 + | Phase | Existing (v0.2) | New in v0.3 | Total | 561 + |-------|-----------------|-------------|-------| 562 + | v0.1 | 11 | 5 | 16 | 563 + | v0.2 | 0 | 7 | 7 | 564 + | v1.0 | 5 | 7 | 12 | 565 + | **Total** | **16** | **19** | **35** | 566 + 567 + The provisioning API grows from 16 untagged endpoints to 35 milestone-tagged endpoints across three phases.
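
The per-phase totals in the summary table can be cross-checked mechanically against the Item 14 mapping. The sketch below is illustrative only: `Milestone`, `ENDPOINTS`, and `count_for` are hypothetical names, and the table is a hand-copied transcription of the mapping rather than anything generated from the real API definition.

```rust
// Illustrative only: `Milestone`, `ENDPOINTS`, and `count_for` are
// hypothetical names, not part of the provisioning API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Milestone {
    V0_1,
    V0_2,
    V1_0,
}

use Milestone::{V0_1, V0_2, V1_0};

/// (milestone, method, path), transcribed from the Item 14 mapping.
const ENDPOINTS: &[(Milestone, &str, &str)] = &[
    (V0_1, "POST", "/v1/accounts"),
    (V0_1, "POST", "/v1/accounts/mobile"),
    (V0_1, "POST", "/v1/accounts/sessions"),
    (V0_1, "POST", "/v1/accounts/claim-codes"),
    (V0_1, "GET", "/v1/accounts/:id/usage"),
    (V0_1, "POST", "/v1/devices"),
    (V0_1, "GET", "/v1/devices/:id/relay"),
    (V0_1, "POST", "/v1/dids"),
    (V0_1, "GET", "/v1/dids/:did"),
    (V0_1, "POST", "/v1/handles"),
    (V0_1, "GET", "/v1/handles/:handle/status"),
    (V0_1, "DELETE", "/v1/handles/:handle"),
    (V0_1, "POST", "/v1/relay/keys"),
    (V0_1, "POST", "/v1/transfer/initiate"),
    (V0_1, "POST", "/v1/transfer/accept"),
    (V0_1, "POST", "/v1/transfer/complete"),
    (V0_2, "POST", "/v1/devices/:id/pair"),
    (V0_2, "POST", "/v1/devices/:id/promote"),
    (V0_2, "GET", "/v1/devices/:id/status"),
    (V0_2, "POST", "/v1/devices/:id/lease"),
    (V0_2, "POST", "/v1/relay/commits/sign"),
    (V0_2, "GET", "/v1/relay/repo/snapshot"),
    (V0_2, "GET", "/v1/relay/mode"),
    (V1_0, "POST", "/v1/dids/:did/rotate"),
    (V1_0, "POST", "/v1/dids/:did/migrate"),
    (V1_0, "GET", "/v1/export/repo"),
    (V1_0, "DELETE", "/v1/accounts/:id"),
    (V1_0, "POST", "/v1/accounts/:id/restore"),
    (V1_0, "DELETE", "/v1/devices/:id"),
    (V1_0, "DELETE", "/v1/relay/keys/:keyId"),
    (V1_0, "POST", "/v1/recovery/initiate"),
    (V1_0, "POST", "/v1/recovery/verify-key"),
    (V1_0, "GET", "/v1/recovery/restore"),
    (V1_0, "PUT", "/v1/keys/shares/:id"),
    (V1_0, "GET", "/v1/keys/rotation-log"),
];

/// Number of endpoints tagged with the given milestone.
fn count_for(phase: Milestone) -> usize {
    ENDPOINTS.iter().filter(|e| e.0 == phase).count()
}

fn main() {
    for phase in [V0_1, V0_2, V1_0] {
        println!("{:?}: {} endpoints", phase, count_for(phase));
    }
    println!("total: {}", ENDPOINTS.len());
}
```

Only the Total column is checked here. The Existing/New split depends on which endpoints are counted as new, which this sketch does not try to decide.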
+398
docs/blob-handling-spec.md
··· 1 + # Blob Handling Spec 2 + 3 + Relay Blob Upload, Storage, Proxy & CDN 4 + 5 + v0.1 Draft — March 2026 6 + 7 + Companion to: Provisioning API Spec, Mobile Architecture Spec, Data Migration Spec 8 + 9 + --- 10 + 11 + ## 1. Overview 12 + 13 + Blobs (images, video, media files) are a core part of ATProto but are handled separately from the repo. They are not stored in CAR files and have their own upload, serving, and sync endpoints. This document specifies how the relay handles blobs across all lifecycle phases. 14 + 15 + ### 1.1 Why This Matters 16 + 17 + Every image a user posts through Bluesky is a blob. Without blob handling, the relay can't serve a functional PDS — users can't upload profile pictures, attach images to posts, or share media. Blob support is on the critical path alongside OAuth. 18 + 19 + ### 1.2 ATProto Blob Model 20 + 21 + Key protocol facts that drive the design: 22 + 23 + - Blobs are uploaded via `com.atproto.repo.uploadBlob` before any record references them. 24 + - After upload, blobs are temporary until a record references them (then permanent). 25 + - Unreferenced blobs are garbage-collected after a grace period (spec recommends ≥1 hour). 26 + - Blobs are served via `com.atproto.sync.getBlob` (server-to-server) and typically mirrored to CDNs for end-user serving. 27 + - Blobs are NOT in CAR files. They sync separately via `getBlob` and `listBlobs`. 28 + - Each blob is identified by its CID (Content Identifier, raw multicodec, base32 `b` prefix). 29 + - The ATProto spec does not mandate global size limits — those are per-Lexicon and per-server. 30 + 31 + --- 32 + 33 + ## 2. Lifecycle Phase Behavior 34 + 35 + ### 2.1 Mobile-Only Phase 36 + 37 + The relay is a full PDS. Blob handling is straightforward: 38 + 39 + 1. Third-party app uploads blob → relay stores it. 40 + 2. App creates a record referencing the blob → blob becomes permanent. 41 + 3. AppView/CDN fetches blob via `getBlob` for serving to users. 42 + 4. 
If the record is deleted and no other records reference the blob → blob is garbage-collected. 43 + 44 + The relay is the authoritative blob store. Standard PDS behavior. 45 + 46 + ### 2.2 Desktop-Enrolled Phase 47 + 48 + Blobs need to exist in two places: the relay (for serving to the network) and the desktop (authoritative copy). The flow changes: 49 + 50 + **Upload path (third-party app uploads via XRPC):** 51 + 52 + 1. Bluesky calls `uploadBlob` on the relay (the public XRPC endpoint). 53 + 2. Relay computes the blob's CID, stores the blob locally, and marks it `temporary`. (The CID is a content hash, so it is final from the start; only the blob's status is temporary.) 54 + 3. When the app creates a record referencing the blob, the relay proxies the record-creation to the desktop (per mobile spec §4.2). 55 + 4. The relay forwards the blob data to the desktop via Iroh alongside the record data. 56 + 5. Desktop stores the blob locally as the authoritative copy. 57 + 6. Relay retains its copy as a cache for serving. 58 + 59 + **Upload path (desktop creates content locally — future):** 60 + 61 + If/when the desktop supports local content creation (e.g., a local client): 62 + 63 + 1. Desktop stores the blob locally. 64 + 2. Desktop pushes the blob to the relay via Iroh (alongside the unsigned commit). 65 + 3. Relay stores and serves the blob. 66 + 67 + **Read path:** 68 + 69 + 1. `getBlob` requests hit the relay. 70 + 2. Relay serves from its local cache. 71 + 3. On a cache miss (blob was evicted from the relay's cache but exists on desktop), relay fetches from desktop via Iroh and re-caches. 72 + 73 + ### 2.3 Desktop Offline (During Desktop-Enrolled) 74 + 75 + - Reads: relay serves blobs from cache. Previously-uploaded blobs remain available. 76 + - Writes: not applicable — write XRPC returns 503 when desktop is offline, so no new blobs can be uploaded. 77 + - Cache miss: if a `getBlob` request arrives for a blob not in the relay's cache while the desktop is offline, relay returns 404. This should be rare if the relay's cache limits are reasonable. 78 + 79 + --- 80 + 81 + ## 3. 
Rust Implementation Stack 82 + 83 + ### 3.1 Existing Reference: rsky-pds 84 + 85 + The `blacksky-algorithms/rsky` project includes a full Rust PDS implementation (`rsky-pds`) that already handles blob upload, storage, and serving with S3-compatible backends. This is our primary reference for blob implementation patterns. 86 + 87 + Repo: https://github.com/blacksky-algorithms/rsky 88 + 89 + ### 3.2 Recommended Crates 90 + 91 + | Crate | Version | Purpose | Downloads/mo | 92 + |-------|---------|---------|-------------| 93 + | **rust-s3** | 0.37.0+ | S3-compatible object storage (R2, MinIO, S3) | ~357K | 94 + | **cid** | 0.11.1+ | Content Identifier generation/parsing (ATProto blob refs) | ~13.7M all-time | 95 + | **opendal** | 0.55.0+ | Alternative: unified storage abstraction (Apache project) | — | 96 + 97 + **rust-s3 vs opendal vs aws-sdk-s3:** 98 + 99 + - **rust-s3** is the pragmatic choice — lightweight, supports async and sync, well-tested with R2 and MinIO. Lower dependency footprint than the official AWS SDK. 100 + - **opendal** (Apache OpenDAL) provides a unified API across storage backends. Heavier abstraction but lets you swap from local filesystem → S3 → R2 → MinIO without code changes. Worth considering if we want backend flexibility from the start. 101 + - **aws-sdk-s3** is the official AWS SDK. Excellent maintenance but heavyweight (~100+ transitive deps) and async-only (Tokio). Overkill if R2 or MinIO is the primary target. 102 + 103 + **Recommendation:** Start with **rust-s3** for v0.1 (lowest friction). Evaluate migrating to **opendal** for v1.0 if multi-backend support becomes important. Use the **cid** crate for all CID operations — it's the standard multiformats implementation used across the IPFS/content-addressing ecosystem. 104 + 105 + ### 3.3 MIME Type Sniffing 106 + 107 + For validating blob content types, use the `infer` crate (https://crates.io/crates/infer) — it detects file type from magic bytes without external dependencies. 
Lightweight and widely used (~5M downloads). 108 + 109 + --- 110 + 111 + ## 4. Storage Architecture 112 + 113 + ### 4.1 Relay Storage 114 + 115 + Blob data lives in S3-compatible object storage. Blob metadata lives in the relay's database (SQLite for single-node, PostgreSQL for production). 116 + 117 + **Blob metadata table:** 118 + 119 + | Column | Type | Description | 120 + |--------|------|-------------| 121 + | cid | TEXT PK | Content identifier (base32, `b` prefix) | 122 + | account_id | TEXT FK | Owning account | 123 + | mime_type | TEXT | MIME type (validated via sniffing) | 124 + | size_bytes | INTEGER | Blob size | 125 + | status | TEXT | `temporary` / `permanent` / `pending_gc` | 126 + | uploaded_at | TEXT | ISO 8601 | 127 + | referenced_at | TEXT | When first referenced by a record (null if temporary) | 128 + | last_accessed_at | TEXT | For cache eviction decisions | 129 + | storage_backend | TEXT | `local` / `s3` — where the blob data lives | 130 + 131 + **Object storage key format:** 132 + 133 + `{bucket}/{account_id}/{cid[0:2]}/{cid[2:4]}/{cid}` 134 + 135 + The two-level prefix hash prevents S3 listing performance issues with large flat namespaces. The CID is the filename — content-addressed storage is naturally deduplicated. 136 + 137 + **Backend configuration (relay.toml):** 138 + 139 + ```toml 140 + [blobs] 141 + backend = "s3" # "local" for dev, "s3" for production 142 + 143 + [blobs.s3] 144 + endpoint = "https://account-id.r2.cloudflarestorage.com" # R2, MinIO, S3 145 + bucket = "pds-blobs" 146 + region = "auto" # R2 uses "auto" 147 + access_key = "..." 148 + secret_key = "..." 149 + ``` 150 + 151 + For local development, blobs fall back to filesystem storage at `{data_dir}/blobs/` using the same key structure. The `storage_backend` column in the metadata table lets the relay serve blobs from either backend during migration. 
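
The key format above can be sketched as a small helper. This is a minimal sketch under stated assumptions: `blob_object_key` and `local_blob_path` are hypothetical names, and the bucket is assumed to be addressed separately by the storage client, so it is left out of the key string.

```rust
use std::path::{Path, PathBuf};

/// Builds `{account_id}/{cid[0:2]}/{cid[2:4]}/{cid}`.
/// Returns None if the CID is not ASCII or is too short to shard.
/// Hypothetical helper; the bucket is assumed to be supplied elsewhere.
fn blob_object_key(account_id: &str, cid: &str) -> Option<String> {
    if !cid.is_ascii() {
        return None;
    }
    let first = cid.get(0..2)?;
    let second = cid.get(2..4)?;
    Some(format!("{}/{}/{}/{}", account_id, first, second, cid))
}

/// Local-filesystem fallback: the same key under `{data_dir}/blobs/`.
fn local_blob_path(data_dir: &Path, key: &str) -> PathBuf {
    data_dir.join("blobs").join(key)
}

fn main() {
    // "bafkreiexample" is a placeholder, not a valid CID.
    let key = blob_object_key("acct-123", "bafkreiexample").expect("CID long enough");
    println!("{}", key);
    println!("{}", local_blob_path(Path::new("/var/lib/relay"), &key).display());
}
```

One design point worth checking before adopting this layout: base32 CIDs for raw SHA-256 blobs share a fixed leading prefix (the `uploadBlob` example in this spec begins `bafkrei`), so the first four characters may be identical for every blob. Sharding on trailing characters of the CID would be one possible alternative.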
152 + 153 + ### 4.2 S3-Compatible Providers 154 + 155 + Tested/supported providers: 156 + 157 + | Provider | Notes | 158 + |----------|-------| 159 + | **Cloudflare R2** | No egress fees. Native CDN integration via Workers. Recommended for production. | 160 + | **MinIO** | Self-hosted S3. Ideal for BYO relay operators. Ships as a single binary. | 161 + | **AWS S3** | Standard. Higher egress costs than R2. | 162 + | **Backblaze B2** | Cheap storage, S3-compatible API. | 163 + 164 + BYO relay operators who don't want to run object storage can use `backend = "local"` — blobs stay on the local filesystem. This is the default for the open-source relay binary. 165 + 166 + ### 4.3 Desktop Storage 167 + 168 + The desktop PDS stores blobs in its local filesystem, indexed in its local SQLite. The desktop is the authoritative copy when enrolled. No S3 dependency on the desktop — blob data stays local. 169 + 170 + ### 4.4 Storage Migration Path 171 + 172 + v0.1 (dev/beta): `backend = "local"` — filesystem only, no S3 dependency. 173 + v1.0 (production): `backend = "s3"` — R2 or MinIO. A migration tool copies existing local blobs to the S3 bucket and updates the `storage_backend` column. 174 + 175 + --- 176 + 177 + ## 4. XRPC Endpoints 178 + 179 + The relay must implement these standard ATProto endpoints: 180 + 181 + ### 4.1 com.atproto.repo.uploadBlob 182 + 183 + **Method:** POST 184 + **Auth:** Required (OAuth bearer token) 185 + **Request:** Raw binary body with `Content-Type` header 186 + **Response:** 187 + ```json 188 + { 189 + "$type": "blob", 190 + "ref": {"$link": "bafkrei..."}, 191 + "mimeType": "image/jpeg", 192 + "size": 54499 193 + } 194 + ``` 195 + 196 + **Relay behavior:** 197 + 1. Validate MIME type (sniff bytes if needed, reject disallowed types). 198 + 2. Check account storage quota. 199 + 3. Store blob with `status: temporary`. 200 + 4. Return blob reference. 201 + 5. 
In desktop-enrolled mode: also forward blob to desktop via Iroh (can be async, before record creation). 202 + 203 + ### 4.2 com.atproto.sync.getBlob 204 + 205 + **Method:** GET 206 + **Params:** `did` (string), `cid` (string) 207 + **Response:** Raw blob data with appropriate `Content-Type` 208 + 209 + **Relay behavior:** 210 + 1. Look up blob in local cache. 211 + 2. If found, serve directly. 212 + 3. If not found and desktop is online, fetch from desktop via Iroh, re-cache, serve. 213 + 4. If not found and desktop is offline, return 404. 214 + 215 + **Security:** Must set Content Security Policy headers. Blobs are untrusted user content — serving them without CSP is a parsing vulnerability risk. 216 + 217 + ### 4.3 com.atproto.sync.listBlobs 218 + 219 + **Method:** GET 220 + **Params:** `did` (string), `since` (string, optional — repo revision) 221 + **Response:** Array of blob CIDs 222 + 223 + Lists all committed (permanent) blobs for an account, optionally since a given revision. Used by AppViews and relays for synchronization. 224 + 225 + --- 226 + 227 + ## 5. Size Limits & Quotas 228 + 229 + ### 5.1 Per-Blob Limits 230 + 231 + ATProto doesn't mandate global limits, but the relay should enforce sensible defaults: 232 + 233 + | Tier | Max blob size | Rationale | 234 + |------|--------------|-----------| 235 + | Free | 5 MB | Covers images, short audio. Matches common PDS limits. | 236 + | Pro | 50 MB | Covers video, large media. | 237 + | Business | 100 MB | Enterprise media needs. | 238 + 239 + These limits apply at upload time. Lexicon-specific limits (e.g., Bluesky's 1 MB for images) are enforced at record creation time. 
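
The per-blob limits in §5.1 could be enforced at upload time roughly as follows. This is a sketch, not the relay's actual code: `Tier`, `BlobTooLarge`, and `check_blob_size` are hypothetical names, and "MB" is assumed to mean 1,000,000 bytes, which the spec does not state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Free,
    Pro,
    Business,
}

// Assumption: decimal megabytes. The spec does not say MB vs MiB.
const MB: u64 = 1_000_000;

/// Maximum size of a single blob for each tier (table in §5.1).
fn max_blob_bytes(tier: Tier) -> u64 {
    match tier {
        Tier::Free => 5 * MB,
        Tier::Pro => 50 * MB,
        Tier::Business => 100 * MB,
    }
}

/// Hypothetical error type; the spec does not name an error code for this case.
#[derive(Debug, PartialEq, Eq)]
struct BlobTooLarge {
    size: u64,
    limit: u64,
}

/// Rejects an upload whose body exceeds the tier's per-blob limit.
fn check_blob_size(tier: Tier, size: u64) -> Result<(), BlobTooLarge> {
    let limit = max_blob_bytes(tier);
    if size > limit {
        Err(BlobTooLarge { size, limit })
    } else {
        Ok(())
    }
}

fn main() {
    // The 54,499-byte JPEG from the uploadBlob example fits the Free tier.
    println!("{:?}", check_blob_size(Tier::Free, 54_499));
    println!("{:?}", check_blob_size(Tier::Free, 6 * MB));
}
```

Lexicon-specific limits are a separate check at record creation and are not modeled here.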
240 + 241 + ### 5.2 Per-Account Storage Quotas 242 + 243 + Blob storage counts toward the account's total storage quota (defined in provisioning API §8): 244 + 245 + | Tier | Total storage (repo + blobs) | 246 + |------|------------------------------| 247 + | Free | 500 MB | 248 + | Pro | 50 GB | 249 + | Business | 500 GB | 250 + 251 + When an account exceeds its quota, `uploadBlob` returns 413 (Payload Too Large) with a `STORAGE_EXCEEDED` error code. 252 + 253 + ### 5.3 MIME Type Restrictions 254 + 255 + The relay should accept a generous allowlist and reject known-dangerous types: 256 + 257 + **Allowed:** `image/*`, `video/*`, `audio/*`, `application/pdf`, `text/plain`, `application/octet-stream` 258 + 259 + **Blocked:** Executable types (`application/x-executable`, `application/x-mach-binary`, `application/javascript`, etc.), archive types that could contain executables (`.zip`, `.tar.gz` unless explicitly needed by a Lexicon). 260 + 261 + The relay should sniff blob bytes to validate the declared MIME type and reject mismatches (e.g., a blob declared as `image/jpeg` that's actually a PE executable). 262 + 263 + --- 264 + 265 + ## 6. Garbage Collection 266 + 267 + ### 6.1 Temporary Blob Cleanup 268 + 269 + Blobs uploaded but never referenced by a record are garbage-collected: 270 + 271 + - **Grace period:** 6 hours (ATProto spec recommends ≥1 hour; 6 hours gives apps plenty of time). 272 + - **Check frequency:** Every 30 minutes, a background job scans for temporary blobs past the grace period. 273 + - **Action:** Delete blob data and metadata row. 274 + 275 + ### 6.2 Dereferenced Blob Cleanup 276 + 277 + When a record is deleted, check if any other records in the same repo reference the blob's CID: 278 + 279 + - If no references remain → mark blob as `pending_gc`. 280 + - Run a second check after 24 hours (in case a new record references it). 281 + - If still unreferenced → delete. 
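
The two cleanup rules in §6.1 and §6.2 can be expressed as one pure decision function that a background job would call per blob. This is a sketch under assumptions: the type and function names are hypothetical, `age` means time since upload for temporary blobs and time since being marked for `pending_gc` blobs, and returning a re-referenced `pending_gc` blob to `permanent` is an assumed behavior the spec only implies.

```rust
use std::time::Duration;

const TEMP_GRACE: Duration = Duration::from_secs(6 * 60 * 60); // 6 hours (§6.1)
const PENDING_RECHECK: Duration = Duration::from_secs(24 * 60 * 60); // 24 hours (§6.2)

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BlobStatus {
    Temporary,
    Permanent,
    PendingGc,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GcAction {
    Keep,
    Delete,
    /// Assumed behavior: a pending_gc blob that gained a new reference
    /// goes back to permanent.
    MarkPermanent,
}

/// Decides what the GC job should do with one blob.
/// `age` is time since upload (Temporary) or since being marked (PendingGc).
fn gc_action(status: BlobStatus, age: Duration, referenced: bool) -> GcAction {
    match status {
        BlobStatus::Permanent => GcAction::Keep,
        BlobStatus::Temporary if age > TEMP_GRACE => GcAction::Delete,
        BlobStatus::Temporary => GcAction::Keep,
        BlobStatus::PendingGc if referenced => GcAction::MarkPermanent,
        BlobStatus::PendingGc if age >= PENDING_RECHECK => GcAction::Delete,
        BlobStatus::PendingGc => GcAction::Keep,
    }
}

fn main() {
    let hour = Duration::from_secs(3600);
    println!("{:?}", gc_action(BlobStatus::Temporary, hour * 7, false));
    println!("{:?}", gc_action(BlobStatus::PendingGc, hour * 25, false));
}
```

Keeping the decision separate from the deletion itself makes the grace periods easy to unit-test without touching storage.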
282 + 283 + ### 6.3 Account Deletion Cleanup 284 + 285 + On account teardown (provisioning API §7), all blobs are deleted: 286 + 287 + - During grace period: blobs are retained (account is read-only). 288 + - After grace period: bulk-delete all blobs for the account. 289 + 290 + ### 6.4 Relay Cache Eviction (Desktop-Enrolled) 291 + 292 + When the desktop is the authoritative blob store, the relay's copy is a cache. Eviction strategy: 293 + 294 + - **LRU eviction** when relay storage exceeds a per-account cache limit. 295 + - Cache limit per tier: Free = 100 MB, Pro = 5 GB, Business = 50 GB. 296 + - Evicted blobs can be re-fetched from the desktop on demand (via `getBlob` → Iroh → desktop). 297 + - Never evict blobs that are less than 7 days old (matches commit buffer retention). 298 + 299 + --- 300 + 301 + ## 7. CDN Integration 302 + 303 + ### 7.1 Why CDN 304 + 305 + The ATProto spec recommends that AppViews mirror blobs to their own CDN rather than hitting `getBlob` directly. But for a desktop PDS that goes offline, having a relay-side CDN cache prevents blob unavailability. 306 + 307 + ### 7.2 Architecture 308 + 309 + For Pro and Business tiers, the relay can optionally front blob serving with a CDN (Cloudflare R2 + Workers, or similar): 310 + 311 + ``` 312 + [AppView] → CDN → [Relay getBlob] → (cache or Iroh → desktop) 313 + ``` 314 + 315 + The CDN caches public blob responses with appropriate cache headers. This reduces load on the relay and ensures blobs remain available even during brief relay restarts. 316 + 317 + ### 7.3 Cache Headers 318 + 319 + `getBlob` responses should include: 320 + - `Cache-Control: public, max-age=31536000, immutable` — blobs are content-addressed, so they never change. 321 + - `Content-Type`: the validated MIME type. 322 + - `Content-Security-Policy: default-src 'none'; sandbox` — prevent blob content from executing. 
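
The header set listed above could be assembled in one place so every `getBlob` response carries it. A minimal, framework-agnostic sketch follows; `get_blob_headers` is a hypothetical helper, and the header values are taken directly from the list.

```rust
/// Builds the response headers for a getBlob response.
/// `mime_type` is the validated MIME type stored in the blob metadata.
/// Hypothetical helper: a real handler would hand these to its HTTP framework.
fn get_blob_headers(mime_type: &str) -> Vec<(&'static str, String)> {
    vec![
        (
            "Cache-Control",
            "public, max-age=31536000, immutable".to_string(),
        ),
        ("Content-Type", mime_type.to_string()),
        (
            "Content-Security-Policy",
            "default-src 'none'; sandbox".to_string(),
        ),
    ]
}

fn main() {
    for (name, value) in get_blob_headers("image/jpeg") {
        println!("{}: {}", name, value);
    }
}
```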
323 + 324 + The `immutable` directive is safe because CIDs are content hashes — if the content changed, the CID would change. 325 + 326 + --- 327 + 328 + ## 8. Data Migration Implications 329 + 330 + ### 8.1 Planned Device Swap 331 + 332 + During a planned swap (migration spec §3), the blob archive is included in the transfer bundle: 333 + 334 + 1. Old device exports blobs alongside the CAR file. 335 + 2. Bundle includes a blob manifest mapping each CID to its MIME type and size. 336 + 3. New device imports blobs and verifies that each blob's computed CID matches the manifest. 337 + 338 + ### 8.2 Unplanned Device Loss 339 + 340 + On the free tier, blobs not crawled by an AppView may be permanently lost (migration spec §4.3). The relay's cache retention helps: 341 + 342 + - **Paid tiers:** Relay holds a full blob mirror. All blobs are recoverable from the relay. 343 + - **Free tier:** Relay holds only recently-accessed blobs (cache eviction). For older blobs, recovery is attempted via `getBlob` against known AppView CDNs. Blobs never crawled are lost. 344 + 345 + ### 8.3 Proactive Crawl 346 + 347 + After each record creation that references a blob (the trigger listed under the v0.1 milestone in §9), the relay should call `requestCrawl` on the configured AppView. This maximizes the chance that blobs are indexed before any loss event. This is already noted in the migration spec (§4.3), but it must be implemented at the relay level. 348 + 349 + --- 350 + 351 + ## 9. 
Implementation Milestones 352 + 353 + ### v0.1 — Basic Blob Support (blocks mobile-only phase) 354 + 355 + - `uploadBlob` endpoint with local filesystem storage 356 + - `getBlob` endpoint for serving 357 + - `listBlobs` endpoint 358 + - CID generation/validation via `cid` crate 359 + - Temporary blob garbage collection (6-hour grace) 360 + - MIME type validation via `infer` crate 361 + - Per-blob size limits 362 + - Account storage quota enforcement 363 + - `requestCrawl` after record creation with blob references 364 + - S3 backend support via `rust-s3` (optional, configurable — local is default) 365 + 366 + ### v1.0 — Production Blobs 367 + 368 + - S3 backend as default for managed relay (R2 recommended) 369 + - Local → S3 migration tool 370 + - Dereferenced blob cleanup 371 + - CDN integration for Pro/Business tiers (R2 + Workers or equivalent) 372 + - Cache eviction for desktop-enrolled accounts 373 + - Blob forwarding to desktop via Iroh on upload 374 + - Desktop → relay blob fetch on cache miss 375 + - Blob manifest in device transfer bundle 376 + - MinIO deployment docs for BYO relay operators 377 + 378 + ### Later 379 + 380 + - Video transcoding (serve multiple resolutions) 381 + - Blob deduplication across accounts (content-addressed storage makes this natural) 382 + - Blob access analytics (which blobs are hot/cold for cache optimization) 383 + 384 + --- 385 + 386 + ## 10. Design Decisions 387 + 388 + | Decision | Rationale | Alternatives Considered | 389 + |----------|-----------|------------------------| 390 + | rust-s3 crate for S3 operations | Lightweight, async/sync flexible, well-tested with R2 and MinIO. 357K downloads/month. Lower deps than aws-sdk-s3. | aws-sdk-s3 (heavyweight, 100+ deps), opendal (heavier abstraction, may adopt later). | 391 + | S3-compatible object storage for blob data | Blobs are large, write-once, and content-addressed — a perfect fit for object storage. R2 has no egress fees. MinIO works for self-hosted. 
| Local filesystem only (doesn't scale, no redundancy), database BLOBs (terrible performance at scale). | 392 + | Local filesystem as default, S3 as production option | BYO relay operators shouldn't need to run MinIO for a small instance. Local works fine for single-user. S3 for managed relay at scale. | S3 required from day one (barrier to self-hosting), local only (no production path). | 393 + | Cloudflare R2 as recommended provider | Zero egress fees (biggest cost for blob serving). Native CDN via Workers. S3-compatible API. | AWS S3 (egress costs add up), Backblaze B2 (less ecosystem integration). | 394 + | 6-hour temp blob grace period | 6x the ATProto minimum. Generous for apps with slow record creation. Low storage cost. | 1 hour (spec minimum — too aggressive), 24 hours (unnecessary). | 395 + | MIME type sniffing via infer crate | Prevents content-type spoofing. No external deps. Critical for security — a mislabeled executable served as an image is dangerous. | Trust client Content-Type (unsafe), reject without sniffing (too strict). | 396 + | CDN with immutable cache headers | Blobs are content-addressed — the CID changes if content changes. Immutable caching is safe and eliminates invalidation complexity. | Short TTL caching (wastes CDN bandwidth), no CDN (higher relay load). | 397 + | Relay caches blobs in desktop-enrolled mode | Ensures blobs are served when desktop is offline. `getBlob` from AppViews needs to work 24/7. | No relay cache (blobs unavailable when desktop sleeps — breaks federation), desktop-only (same problem). | 398 + | Reference rsky-pds for implementation patterns | Production Rust PDS with S3 blob storage already implemented. Don't reinvent. | Build from scratch (slower, more bugs), fork rsky-pds (too coupled). |
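The relay cache eviction rule from §6.4 (LRU eviction above a per-account limit, with blobs younger than 7 days never evicted) could be sketched roughly as follows. The `CachedBlob` struct, its field names, and `select_evictions` are hypothetical and chosen only for illustration; the spec does not define a data model for the cache.

```rust
use std::time::Duration;

/// Blobs younger than this are never evicted (matches the 7-day commit buffer retention).
const MIN_EVICTABLE_AGE: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// Hypothetical cache entry; the field names are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct CachedBlob {
    cid: String,
    size_bytes: u64,
    /// Time since the blob was uploaded.
    age: Duration,
    /// Time since the blob was last served.
    idle: Duration,
}

/// Returns the CIDs to evict, least recently used first, until the account
/// fits within `cache_limit_bytes` or no evictable blobs remain.
fn select_evictions(blobs: &[CachedBlob], cache_limit_bytes: u64) -> Vec<String> {
    let mut total: u64 = blobs.iter().map(|b| b.size_bytes).sum();

    // Only blobs at least 7 days old are eligible for eviction.
    let mut candidates: Vec<&CachedBlob> = blobs
        .iter()
        .filter(|b| b.age >= MIN_EVICTABLE_AGE)
        .collect();

    // Longest-idle first, i.e. least recently used.
    candidates.sort_by(|a, b| b.idle.cmp(&a.idle));

    let mut evicted = Vec::new();
    for blob in candidates {
        if total <= cache_limit_bytes {
            break;
        }
        total -= blob.size_bytes;
        evicted.push(blob.cid.clone());
    }
    evicted
}
```

Note that the 7-day floor means an account can stay above its cache limit for a while if most of its blobs are recent; the sketch simply stops once no eligible blobs remain.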
+151
docs/cross-spec-analysis.md
··· 1 + # Cross-Spec Gap & Inconsistency Analysis 2 + 3 + **Documents analyzed:** 4 + - PDS Architecture v7 (HTML) — "architecture" 5 + - Provisioning API Spec v0.2 (docx) — "provisioning" 6 + - Data Migration & Recovery Spec v0.1 (docx) — "migration" 7 + - Mobile Architecture Spec v1.2 (docx) — "mobile" 8 + 9 + **Canonical source:** The mobile architecture spec (v1.2) represents the most recent thinking and takes precedence where documents conflict. 10 + 11 + --- 12 + 13 + ## Decisions Made 14 + 15 + The following contradictions were identified and resolved during review: 16 + 17 + | # | Issue | Decision | Affects | 18 + |---|-------|----------|---------| 19 + | 1.3 | **Signing key custody** — architecture says device signs; mobile says relay always signs | **Relay always signs.** Desktop constructs Merkle tree, relay signs commits. Desktop-local signing is a future option (mobile spec §10). | architecture (data flow, PDS shell desc, v0.1 scope) | 20 + | 1.1 | **Key types** — provisioning API uses Ed25519; mobile uses P-256; ATProto requires P-256/secp256k1 | **Fix to P-256/secp256k1.** All Ed25519 references in provisioning API are wrong. | provisioning (POST /v1/devices, POST /v1/dids/:did/rotate) | 21 + | 1.2 | **DID ceremony** — provisioning says client builds full DID doc; mobile says client sends keys, relay assembles | **Relay constructs it.** Client sends key material, relay orchestrates did:plc creation. | provisioning (POST /v1/dids request body) | 22 + | 1.4 | **Tier naming** — architecture has Free/Pro/Business; provisioning has Free/Pro/Self-Hosted | **Three tiers + BYO.** Free/Pro/Business are subscription tiers. Self-Hosted (BYO relay) is an orthogonal deployment model, not a tier. 
| architecture, provisioning | 23 + | 2.5 | **Shamir shares** — migration has device/relay/iCloud; mobile has iCloud/relay/BIP-39 phrase | **User chooses Share 3.** Share 1 = iCloud Keychain, Share 2 = relay escrow, Share 3 = user's choice of device-local OR BIP-39 paper/USB. | migration, mobile | 24 + | 2.1 | **Mobile account creation** — provisioning API only supports web-first flow | **Add dedicated mobile endpoint.** New POST /v1/accounts/mobile combines account creation + device binding. Web flow unchanged. | provisioning | 25 + | 2.4 | **No OAuth spec** | **Spec it now.** ATProto OAuth (DPoP, PAR, client metadata) needs its own document. Blocks third-party app integration. | new document | 26 + | 2.8 | **Desktop offline writes** — architecture implies offline compose+sync; mobile says 503 | **Resolved by signing decision.** Relay signs all commits, so writes return 503 when desktop is offline (relay can't construct commits from stale state). Architecture needs updating. | architecture | 27 + 28 + --- 29 + 30 + ## Remaining Gaps (Not Yet Resolved) 31 + 32 + ### 2.2 + 2.3: API endpoint fragmentation 33 + 34 + 15+ endpoints defined in the mobile and migration specs are not in the provisioning API: 35 + 36 + **From mobile spec (9 endpoints):** 37 + - POST /v1/relay/keys — generate relay signing key 38 + - DELETE /v1/relay/keys/:keyId — revoke relay signing key 39 + - GET /v1/relay/repo/snapshot — full repo snapshot (CAR) 40 + - POST /v1/devices/:id/pair — device pairing 41 + - POST /v1/devices/:id/promote — desktop promotion to repo host 42 + - DELETE /v1/devices/:id — de-enroll device 43 + - GET /v1/devices/:id/status — device health/status 44 + - POST /v1/relay/commits/sign — sign unsigned commit 45 + - GET /v1/relay/mode — current operating mode 46 + 47 + **From migration spec (8 endpoints):** 48 + - POST /v1/transfer/initiate — generate transfer session + code 49 + - POST /v1/transfer/accept — new device submits transfer code 50 + - POST /v1/transfer/complete 
— finalize transfer + lease handover 51 + - POST /v1/recovery/initiate — begin recovery ceremony 52 + - POST /v1/recovery/verify-key — prove DID key reconstruction 53 + - GET /v1/recovery/restore — stream repo + blobs from relay 54 + - PUT /v1/keys/shares/:id — update relay-held Shamir share 55 + - GET /v1/keys/rotation-log — audit log of Shamir rotations 56 + 57 + **Action needed:** Consolidate into provisioning API v0.3 or create a unified Relay API Reference. 58 + 59 + ### 2.6: Firehose in mobile-only phase 60 + 61 + Architecture tags Firehose Proxy as v1.0 and says "BGS drops on sleep" for free tier in v0.1. But in the mobile-only phase (mobile spec), the relay IS a full PDS and must emit firehose events from day one — there's no "sleep" because the relay is always on. 62 + 63 + The architecture's firehose proxy concept (relay maintains a persistent BGS WebSocket on behalf of a desktop that sleeps) is a *desktop-enrolled* feature. In mobile-only mode, the relay just emits firehose natively like any hosted PDS. 64 + 65 + **Action needed:** Architecture needs to distinguish between "relay as native PDS firehose emitter" (mobile-only, always available) and "relay as firehose proxy for sleeping desktop" (desktop-enrolled, v1.0 paid feature). 66 + 67 + ### 2.7: No blob handling spec 68 + 69 + No document specifies how blobs (images, media) are uploaded, stored, or served through the relay. The migration spec discusses blob loss on free tier, but the upload/storage/proxy path is unspecified. 70 + 71 + **Action needed:** Add blob handling to the provisioning API or create a separate spec. Covers: upload endpoint, storage limits per tier, proxy behavior in desktop-enrolled mode, CDN caching. 72 + 73 + ### 3.1: Migration spec doesn't reference mobile recovery 74 + 75 + The migration spec covers desktop-to-desktop migration but not phone-to-phone migration (lost iPhone). The mobile spec covers phone recovery in §7.2. 
These share Shamir infrastructure and should cross-reference. 76 + 77 + **Action needed:** Add cross-references between migration spec §4 and mobile spec §7. 78 + 79 + ### 3.2: Architecture doesn't mention mobile at all 80 + 81 + The architecture is entirely desktop-centric. The mobile-only lifecycle phase (relay as full hosted PDS) isn't represented. 82 + 83 + **Action needed:** Add a "Phase 0: Mobile-Only" to the architecture showing the relay as a complete PDS before any desktop is enrolled. 84 + 85 + ### 3.3: No relay internals spec 86 + 87 + All four docs describe the relay from the outside. None covers database schema, process architecture, deployment model, or scaling strategy. 88 + 89 + **Action needed:** Not blocking for now, but will be needed before implementation. 90 + 91 + --- 92 + 93 + ## Milestone Alignment Issues 94 + 95 + ### 4.1: Shamir timing 96 + 97 + Migration spec puts basic Shamir in v0.1. Architecture puts Recovery Share Manager in v1.0. Since mobile onboarding (§3.1 Step 7) generates Shamir shares during account creation, basic Shamir support is required from the mobile v0.1 launch. 98 + 99 + **Action needed:** Architecture should move basic Shamir to v0.1 (or acknowledge it's a relay-side feature available from mobile launch). 100 + 101 + ### 4.2: Provisioning API has no milestone tags 102 + 103 + No endpoints are marked v0.1 vs v1.0. 104 + 105 + **Action needed:** Tag each endpoint group with a milestone. 106 + 107 + ### 4.3: Mobile has 4 phases, architecture has 2 108 + 109 + Mobile: v0.1 (identity wallet) → v0.2 (device mgmt) → v1.0 (recovery) → v2.0+ (signing sovereignty). 110 + Architecture: v0.1 (technical preview) → v1.0 (public launch). 111 + 112 + Mobile v0.1 has no architecture milestone — it's relay-only. 113 + 114 + **Action needed:** Create a unified milestone map across all four documents. 115 + 116 + --- 117 + 118 + ## Action List by Document 119 + 120 + ### Architecture (v8 needed) 121 + 1. 
Update signing model: device constructs, relay signs 122 + 2. Update data flow diagram: show unsigned commit → relay → signed commit 123 + 3. Update PDS shell description: "repo construction engine" not "owns XRPC surface" 124 + 4. Add "Phase 0: Mobile-Only" lifecycle phase 125 + 5. Fix tier model: Free/Pro/Business + BYO relay as deployment option 126 + 6. Distinguish native firehose (mobile-only) from firehose proxy (desktop-enrolled) 127 + 7. Move basic Shamir to v0.1 scope 128 + 8. Add unified milestone map 129 + 130 + ### Provisioning API (v0.3 needed) 131 + 9. Replace all Ed25519 references with P-256/secp256k1 132 + 10. Rewrite POST /v1/dids: accept key material, not full DID document 133 + 11. Add POST /v1/accounts/mobile endpoint 134 + 12. Integrate 17 endpoints from mobile + migration specs 135 + 13. Fix tier model: Free/Pro/Business + Self-Hosted as deployment option 136 + 14. Add milestone tags to all endpoints 137 + 15. Add blob upload/storage endpoints 138 + 139 + ### Migration Spec (v0.2 needed) 140 + 16. Update Shamir share model: Share 1=iCloud, Share 2=relay, Share 3=user's choice 141 + 17. Cross-reference mobile spec §7 for phone recovery 142 + 18. Align milestone timing with architecture 143 + 144 + ### Mobile Spec (minor updates) 145 + 19. Update Shamir share model: Share 3 = user's choice (device-local or BIP-39) 146 + 20. Cross-reference migration spec for desktop-to-desktop flows 147 + 148 + ### New Documents Needed 149 + 21. **ATProto OAuth Spec** — DPoP, PAR, client metadata discovery, token lifecycle 150 + 22. **Blob Handling Spec** — upload, storage, proxy, CDN, tier limits 151 + 23. **Unified Milestone Map** — single source of truth for all phases across all docs
+273
docs/data-migration-spec.md
··· 1 + **Data Migration Spec** 2 + 3 + Device Swap, Recovery & Key Migration 4 + 5 + v0.2 — Shamir Model Update + Mobile Cross-References 6 + 7 + March 2026 8 + 9 + Companion to Provisioning API Spec v0.2 10 + 11 + **Changelog** 12 + 13 + ``` 14 + v0.2 Changes — Shamir Model Update + Mobile Cross-References 15 + 16 + FIX Shamir share model: Share 3 is user's choice (device-local or BIP-39) 17 + NEW Cross-references to mobile spec §7 for phone recovery 18 + FIX Milestone alignment with unified-milestone-map.md 19 + ``` 20 + 21 + **1. Overview** 22 + 23 + This document specifies the data migration system for the desktop PDS application. It covers two primary scenarios: planned device swaps (user voluntarily moves to a new machine) and unplanned device loss (hardware failure, theft, or accidental damage). Both scenarios share core infrastructure but diverge in their recovery ceremony. 24 + 25 + The migration system builds on three foundational components from the existing architecture: the Shamir secret sharing scheme for DID key protection, the relay layer's configurable caching behavior, and the Iroh peer-to-peer transport. 26 + 27 + **1.1 Design Principles** 28 + 29 + - **Zero key exposure:** DID signing keys never transit the network unencrypted, even during migration. All key material is wrapped before leaving the source device or reconstructed only on the destination device. 30 + 31 + - **Sovereignty preserved:** The relay never holds sufficient key material to impersonate the user. Recovery always requires at least two of three Shamir shares, and the relay holds at most one. 32 + 33 + - **Grandma-proof UX:** The planned swap happy path requires entering a 6-digit code. Unplanned recovery requires signing into iCloud on the new device (which most users have already done). 34 + 35 + - **Tier-aware restoration:** Paid users get full repo mirrors for instant recovery. Free users reconstruct from the network, accepting possible blob loss. 
36 + 37 + **1.2 Migration Assets** 38 + 39 + Each asset has a distinct risk profile and recovery strategy: 40 + 41 + ----------------------- ------------------ ------------------------------- ----------------------------------------------------------- 42 + **Asset** **Risk if Lost** **Recovery Source** **Notes** 43 + DID signing key **Catastrophic** Shamir reconstruction Identity is permanently lost without 2-of-3 shares 44 + ATProto repo (CAR) High Relay mirror or network crawl Signed commit history; can be re-fetched if crawled 45 + Blob store Medium Relay mirror or CDN Images/media; may be lost if never crawled by AppView 46 + App config Low Relay account metadata Handle, relay endpoint, preferences; easily reconstructed 47 + iCloud Keychain share Low (redundant) Apple iCloud sync Auto-syncs to new device via Apple ID 48 + ----------------------- ------------------ ------------------------------- ----------------------------------------------------------- 49 + 50 + **2. Shamir Key Recovery Model** 51 + 52 + The DID signing key is split into three shares using Shamir's Secret Sharing (2-of-3 threshold). Any two shares are sufficient to reconstruct the key; no single share reveals any information about the key. 
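To make the 2-of-3 threshold concrete, here is a minimal sketch of byte-wise Shamir sharing over GF(2^8). It is an illustration under stated assumptions, not the project's implementation: the spec does not name a field, library, or share encoding, and the function names here are hypothetical. The per-byte `coefficients` must come from a cryptographically secure random source in any real use; they are passed in by the caller only because the Rust standard library has no CSPRNG.

```rust
/// Multiplies two elements of GF(2^8) using the AES reduction polynomial
/// x^8 + x^4 + x^3 + x + 1 (an assumption; the spec does not name a field).
fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut product = 0u8;
    while b != 0 {
        if b & 1 != 0 {
            product ^= a;
        }
        let carry = a & 0x80 != 0;
        a <<= 1;
        if carry {
            a ^= 0x1b;
        }
        b >>= 1;
    }
    product
}

/// Multiplicative inverse of a non-zero element, computed as a^254.
fn gf_inv(a: u8) -> u8 {
    assert_ne!(a, 0, "zero has no inverse");
    let mut result = 1u8;
    for _ in 0..254 {
        result = gf_mul(result, a);
    }
    result
}

/// Splits `secret` into three shares at x = 1, 2, 3.
/// Each byte uses the line f(x) = secret_byte + coefficient * x.
fn split_2_of_3(secret: &[u8], coefficients: &[u8]) -> [(u8, Vec<u8>); 3] {
    assert_eq!(secret.len(), coefficients.len());
    [1u8, 2, 3].map(|x| {
        let ys: Vec<u8> = secret
            .iter()
            .zip(coefficients)
            .map(|(&s, &c)| s ^ gf_mul(c, x))
            .collect();
        (x, ys)
    })
}

/// Recovers the secret from any two distinct shares by evaluating the
/// Lagrange interpolation of the line at x = 0.
fn reconstruct(a: &(u8, Vec<u8>), b: &(u8, Vec<u8>)) -> Vec<u8> {
    assert_ne!(a.0, b.0, "shares must have distinct x coordinates");
    let denom_inv = gf_inv(a.0 ^ b.0);
    let weight_a = gf_mul(b.0, denom_inv);
    let weight_b = gf_mul(a.0, denom_inv);
    a.1.iter()
        .zip(&b.1)
        .map(|(&ya, &yb)| gf_mul(ya, weight_a) ^ gf_mul(yb, weight_b))
        .collect()
}
```

Because each byte lies on a random line, a single share is consistent with every possible secret byte, which is why one share alone reveals nothing about the key.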
53 + 54 + **2.1 Share Distribution** 55 + 56 + ----------- ------------------------------------------ ----------------------------------------------------- ------------------------------------------------------------------ 57 + **Share** **Holder** **Storage** **Recovery Access** 58 + Share 1 iCloud Keychain Keychain (E2E encrypted by Apple) Available on any device signed into the same iCloud account 59 + Share 2 Relay service Encrypted at rest, server-side HSM-wrapped Released after account authentication (email + password) 60 + Share 3 User's choice (device-local OR BIP-39) Secure Enclave / Keychain (device-local) or paper/USB export Auto-available on configured device, or manual entry from backup 61 + ----------- ------------------------------------------ ----------------------------------------------------- ------------------------------------------------------------------ 62 + 63 + **2.2 Share Holder Rationale** 64 + 65 + The user's choice for share 3 balances convenience and resilience. For device-local storage, the user designates a second device (e.g., iPad) where share 3 is stored in the Secure Enclave/Keychain, making recovery seamless across their ecosystem if they lose their primary device. For BIP-39 backup, power users who want full air-gap sovereignty can export share 3 as a recovery phrase for paper or USB storage. The recovery ceremony code is identical in both cases; only the share retrieval step differs. 66 + 67 + iCloud Keychain (share 1) was chosen as the default anchor for UX simplicity: most macOS users are already signed into iCloud, making unplanned recovery require zero additional user action beyond installing the app on a new device. 68 + 69 + **2.3 Threat Model** 70 + 71 + - **Relay compromise alone:** Attacker obtains share 2 only. Insufficient for key reconstruction. 72 + 73 + - **iCloud compromise alone:** Attacker obtains share 1 only. Insufficient. 74 + 75 + - **Relay + iCloud compromise:** Attacker can reconstruct key. 
Mitigation: relay share is HSM-wrapped and requires account auth; iCloud Keychain is E2E encrypted and requires Apple ID + device passcode. Combined compromise is a sophisticated, targeted attack. 76 + 77 + - **Device theft (unlocked):** Attacker has share 3 if device-local, or nothing if BIP-39 is in a separate location. Mitigation: biometric/password gate on the app's key export flow. 78 + 79 + **3. Planned Device Swap** 80 + 81 + The happy path. The user's old machine is still accessible. This flow uses a direct Iroh peer connection for local transfer, with relay-mediated fallback for remote swaps. 82 + 83 + **3.1 Flow** 84 + 85 + 1. **Initiate transfer.** User opens Settings → Transfer to New Device on the old machine. The app generates a one-time 6-digit transfer code and displays it on screen. Internally, the app bundles: full repo snapshot (CAR file export), blob archive, DID signing key (encrypted with the transfer code as symmetric key via AES-256-GCM), app config (handle, relay endpoint, preferences), and a manifest with checksums. 86 + 87 + 2. **Establish peer connection.** User installs the app on the new machine, selects "Transfer from Existing Device," and enters the 6-digit code. The app uses Iroh's peer discovery to find the old machine on the local network. If both machines are on the same LAN, the transfer is direct (no relay involvement). If remote, the transfer routes through the Iroh relay, encrypted end-to-end with the transfer code as the shared secret. 88 + 89 + 3. **Transfer and verify.** The bundle streams from old → new. The new machine verifies the manifest checksums, decrypts the DID key, and validates it can sign a test commit against the repo's Merkle root. 90 + 91 + 4. **Device lease handover.** The new machine calls POST /v1/devices/:id/lease to acquire the primary device lease from the relay. The old machine's lease is released. The relay begins routing traffic to the new device's Iroh node ID. 92 + 93 + 5. 
**Shamir share rotation.** The new machine generates a fresh Shamir split of the DID key and updates share 1 (iCloud Keychain), share 2 (relay via PUT /v1/keys/shares/:id), and share 3 (device-local Keychain or BIP-39 export). This ensures the old machine's local share is invalidated. 94 + 95 + 6. **Decommission old device.** The old machine's app detects the lease release and prompts: "Transfer complete. Wipe local data?" On confirmation, it securely erases the local repo, blobs, and key material. 96 + 97 + **3.2 Transfer Code Security** 98 + 99 + The 6-digit code provides approximately 20 bits of entropy, which is intentionally low for usability. Security relies on the transfer window being short-lived (default: 10 minutes), the Iroh connection requiring the code for handshake, and rate limiting on connection attempts (3 failures = code invalidated, regenerate required). For power users, a "Show full code" option reveals a 24-character alphanumeric code for higher entropy. 100 + 101 + **4. Unplanned Device Loss** 102 + 103 + The old machine is gone. Recovery depends entirely on the Shamir shares and the relay's cached data. 104 + 105 + **4.1 Recovery Ceremony** 106 + 107 + 1. **Install and select recovery.** User installs the app on a new machine and selects "Recover Existing Identity." 108 + 109 + 2. **Authenticate with relay.** User signs in with their account credentials (email + password from initial provisioning). The relay verifies identity and releases Shamir share 2. 110 + 111 + 3. **Retrieve share 3.** If the user chose device-local storage: the app attempts to retrieve share 3 from the configured backup device (via iCloud Keychain sync or local network if available). If the configured device is inaccessible, the user can enter a BIP-39 backup phrase if one was exported during setup. If the user chose BIP-39 backup: the app prompts for manual entry of the recovery phrase. 112 + 113 + 4. 
**Reconstruct DID key.** Shares 2 + 3 are combined via Shamir reconstruction. The app verifies the key by checking its public component against the DID document retrieved from the PLC directory. 114 + 115 + 5. **Restore repo and blobs.** Restoration behavior depends on the user's tier (see section 4.2). 116 + 117 + 6. **Re-establish relay presence.** Register new device lease, publish DID rotation operation if key was rotated, and resume Iroh tunnel to relay. 118 + 119 + 7. **Rotate Shamir shares.** Same as planned swap step 5: generate fresh split, update all three share holders. This invalidates the lost device's share 3. 120 + 121 + **4.2 Tier-Based Repo Restoration** 122 + 123 + ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 124 + **Tier** **Behavior** **Tradeoffs** 125 + **Paid** Relay holds a full repo mirror (synced continuously). On recovery, the relay streams the complete CAR file + blobs to the new device. Target: \< 5 min for a typical repo. Near-instant, zero data loss. Relay storage cost scales with repo size. User pays for this via subscription. 126 + **Free** Relay holds only the recent activity buffer (configurable, default 7 days). On recovery, the app: (a) imports the buffer from the relay, (b) calls com.atproto.sync.getRepo against the AppView/BGS to fetch the historical repo, (c) attempts to recover blobs from known CDN endpoints. Slower recovery (minutes to hours depending on repo size and network). Blobs that were never crawled by the AppView are permanently lost. Commit history intact if the BGS indexed it. 
127 + ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 128 + 129 + **4.3 The Blob Loss Problem (Free Tier)** 130 + 131 + On the free tier, blobs (images, media) that were uploaded but never crawled by the AppView or any relay are unrecoverable after device loss. This is an inherent tradeoff of not paying for relay-side mirroring. 132 + 133 + Mitigations: 134 + 135 + - **Proactive crawl requests:** After every blob upload, the app calls requestCrawl to the configured AppView, increasing the likelihood the blob is indexed before any loss event. 136 + 137 + - **Blob inventory warnings:** The health monitor tracks which blobs have been confirmed crawled vs. uncrawled. Settings → Data Health shows a "X blobs not yet backed up by the network" count, with an upgrade prompt. 138 + 139 + - **Grace period on tier downgrade:** If a paid user downgrades to free, the relay retains the full mirror for 30 days before pruning to buffer-only. 140 + 141 + **4.4 Phone Recovery** 142 + 143 + Phone-to-phone recovery uses the same Shamir infrastructure as desktop recovery. The mobile architecture spec (§7) details the iOS-specific flow: 144 + 145 + 1. New phone signs into iCloud → Share 1 is available 146 + 2. User authenticates with relay → Share 2 is available 147 + 3. Relay reconstructs rotation key from 2 shares 148 + 4. Relay re-generates signing key, updates DID document 149 + 5. 
New phone stores new rotation key in Secure Enclave 150 + 151 + The key difference from desktop recovery: in phone recovery, the relay already holds the repo (it's the PDS in mobile-only mode), so there's no repo transfer step. Recovery is purely a key reconstruction + DID update operation. 152 + 153 + See: mobile-architecture-spec-v1.3 §7.2 for the complete flow. 154 + 155 + **5. API Surface** 156 + 157 + These endpoints extend the existing provisioning API. All endpoints require bearer token authentication. 158 + 159 + **5.1 Transfer Endpoints** 160 + 161 + ----------------------- ------------ ---------------------------------------------------------------------------------------------------- 162 + **Endpoint** **Method** **Purpose** 163 + /v1/transfer/initiate POST Generate transfer session + code. Returns session ID and encrypted bundle metadata. 164 + /v1/transfer/accept POST New device submits transfer code. Relay brokers Iroh peer introduction if direct connection fails. 165 + /v1/transfer/complete POST Finalize transfer. Triggers lease handover and old device notification. 166 + ----------------------- ------------ ---------------------------------------------------------------------------------------------------- 167 + 168 + **5.2 Recovery Endpoints** 169 + 170 + ------------------------- ----------------- ------------------------------------------------------------------------------------------------------ 171 + **Endpoint** **Method** **Purpose** 172 + /v1/recovery/initiate POST Begin recovery ceremony. Requires account credentials. Returns share 2 (encrypted to session key). 173 + /v1/recovery/verify-key POST Client proves it reconstructed the correct DID key by signing a challenge. Unlocks repo restoration. 174 + /v1/recovery/restore GET (streaming) Stream repo + blobs from relay cache. Paid tier: full mirror. Free tier: buffer only. 
175 + ------------------------- ----------------- ------------------------------------------------------------------------------------------------------ 176 + 177 + **5.3 Key Management Endpoints** 178 + 179 + ----------------------- ------------ --------------------------------------------------------------------------------------------- 180 + **Endpoint** **Method** **Purpose** 181 + /v1/keys/shares/:id PUT Update the relay-held Shamir share after rotation. Requires proof of current key ownership. 182 + /v1/keys/shares/:id DELETE Permanently delete relay-held share (account deletion flow). 183 + /v1/keys/rotation-log GET Audit log of all Shamir rotations and share updates for the account. 184 + ----------------------- ------------ --------------------------------------------------------------------------------------------- 185 + 186 + **6. Sequence Summaries** 187 + 188 + **6.1 Planned Swap Sequence** 189 + 190 + Old Device → generates transfer code → bundles repo + encrypted key 191 + 192 + New Device → enters code → discovers old device via Iroh LAN / relay fallback 193 + 194 + Old Device → streams bundle → New Device 195 + 196 + New Device → verifies checksums + decrypts key → POST /v1/devices/:id/lease 197 + 198 + New Device → generates fresh Shamir split → updates all 3 share holders 199 + 200 + Old Device → detects lease release → prompts wipe → securely erases 201 + 202 + **6.2 Unplanned Loss Sequence** 203 + 204 + New Device → POST /v1/recovery/initiate (credentials) → receives share 2 205 + 206 + New Device → retrieves share 3 from device-local storage (automatic) or paper (manual) 207 + 208 + New Device → Shamir reconstruct → POST /v1/recovery/verify-key (signed challenge) 209 + 210 + New Device → GET /v1/recovery/restore (streaming) → imports repo + blobs 211 + 212 + New Device → POST /v1/devices/:id/lease → DID rotation → Iroh tunnel up 213 + 214 + New Device → fresh Shamir split → updates all 3 share holders 215 + 216 + **7. 
Edge Cases and Risks** 217 + 218 + **7.1 Lost Share 3 (Paper or Device)** 219 + 220 + If the user chose BIP-39 backup and loses the paper, only shares 1 + 2 (iCloud + relay) remain. If the user chose device-local and loses the backup device, share 3 is inaccessible. Mitigation: during onboarding, the app clearly explains both options and recommends exporting a BIP-39 backup even for device-local mode. The app also offers periodic "recovery key health check" reminders. 221 + 222 + **7.2 Relay Downtime During Recovery** 223 + 224 + If the relay is unreachable when the user attempts recovery, share 2 is temporarily inaccessible. Mitigation: the relay is the only infrastructure component the user depends on, and its SLA is part of the service tier. Multi-region relay failover (designed in the provisioning spec) covers this. The recovery ceremony gracefully retries with exponential backoff. 225 + 226 + **7.3 Stale Relay Mirror** 227 + 228 + For paid users, the relay mirror may lag behind the device's latest commits if the device was actively posting when it was lost. The relay sync interval determines the maximum data loss window. Default: 5-minute sync interval, meaning up to 5 minutes of commits could be lost. Configurable per account. 229 + 230 + **7.4 Concurrent Recovery Attempts** 231 + 232 + If an attacker attempts recovery while the legitimate user is also recovering, the relay's recovery/initiate endpoint enforces a single active session per account. Second attempts return 409 Conflict with a "recovery already in progress" message. Sessions expire after 30 minutes. 233 + 234 + **7.5 Transfer Interrupted Mid-Stream** 235 + 236 + If the Iroh connection drops during a planned swap, the transfer session remains valid for 10 minutes. The new device can reconnect and resume from the last acknowledged chunk (the bundle is transferred in content-addressed blocks). After timeout, a new transfer code must be generated. 237 + 238 + **8. 
Implementation Milestones** 239 + 240 + **v0.1 — Basic Migration + Shamir Generation** 241 + 242 + - Planned device swap (LAN transfer via Iroh, 6-digit code) 243 + - Shamir share generation during account creation 244 + - Share 1 → iCloud Keychain storage 245 + - Share 2 → relay escrow 246 + - Share 3 → user's choice (device-local or BIP-39) 247 + - Note: Share GENERATION is v0.1. Share RECOVERY is v1.0. 248 + 249 + **v1.0 — Full Recovery** 250 + 251 + - Unplanned device loss recovery ceremony 252 + - Shamir reconstruction (2-of-3) 253 + - DID key rotation after recovery 254 + - Recovery UI in mobile app 255 + - Relay-side recovery session management 256 + 257 + **Later** 258 + 259 + - Multi-device sync (share key across devices without migration) 260 + 261 + See unified-milestone-map.md for how these milestones align with the architecture, provisioning API, and mobile spec phases. 262 + 263 + **9. Design Decisions Log** 264 + 265 + ------------------------------------------- ----------------------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------- 266 + **Decision** **Rationale** **Alternatives Considered** 267 + iCloud Keychain as share 1 Best UX for target audience (non-technical macOS users). Zero user action on recovery. E2E encrypted by Apple. Paper-only (too fragile for grandma), relay holds 2 shares (breaks sovereignty model). 268 + Share 3 as user's choice Balances convenience (device-local for multi-device users) with resilience (BIP-39 for air-gap backup). Single-option model (less flexible for different user preferences). 269 + 6-digit transfer code for planned swap Balances usability (easy to read aloud) with security (short-lived session, rate-limited attempts). QR code (requires camera), Bluetooth pairing (unreliable), pre-shared key (complex UX). 
270 + Shamir rotation on every migration Ensures the old device's local share cannot be used even if physically recovered by an attacker after the swap. Reuse shares (simpler but leaves old share 1 valid indefinitely). 271 + Configurable relay caching per tier Aligns cost with value. Full mirror is expensive; free users accept the tradeoff. Upgrade path is clear. Full mirror for all (unsustainable at scale), no caching (too risky). 272 + Proactive requestCrawl after blob upload Reduces blob loss risk on free tier without requiring relay storage. Leverages existing ATProto infrastructure. Accept blob loss (bad UX), require paid tier for blob uploads (too restrictive). 273 + ------------------------------------------- ----------------------------------------------------------------------------------------------------------------- -----------------------------------------------------------------------------------------
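The single-active-session rule from Section 7.4 (one recovery session per account, 409 Conflict on a second attempt, 30-minute expiry) can be sketched as below. This is a minimal illustration, not the relay's actual implementation: the `RecoverySessions` class, the in-memory dict, and the 201 success status are assumptions made for the example. Only the 409 response, its message, and the 30-minute lifetime come from the spec.

```python
from typing import Dict, Tuple

SESSION_TTL_SECONDS = 30 * 60  # sessions expire after 30 minutes (Section 7.4)


class RecoverySessions:
    """Hypothetical in-memory tracker for recovery/initiate calls."""

    def __init__(self) -> None:
        # account id -> time (in seconds) at which the active session started
        self._started: Dict[str, float] = {}

    def initiate(self, account_id: str, now: float) -> Tuple[int, str]:
        started = self._started.get(account_id)
        if started is not None and now - started < SESSION_TTL_SECONDS:
            # A session is still live: reject the second attempt.
            return 409, "recovery already in progress"
        # No session, or the previous one expired: start a new one.
        # The 201 status is an assumption; the spec only defines the 409 case.
        self._started[account_id] = now
        return 201, "recovery session created"


sessions = RecoverySessions()
first = sessions.initiate("acct-1", now=0)
second = sessions.initiate("acct-1", now=60)
after_expiry = sessions.initiate("acct-1", now=SESSION_TTL_SECONDS)
```

Passing `now` explicitly keeps the expiry logic easy to test. A real relay would presumably read a clock and persist sessions durably.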
+600
docs/mobile-architecture-spec.md
··· 1 + **Mobile Architecture Specification** 2 + 3 + iOS Identity Wallet with Mobile-First Onboarding 4 + 5 + *v1.3 — Shamir Share 3 User's Choice + Migration Spec Cross-References* 6 + 7 + *Companion to: Provisioning API Spec, Data Migration & Recovery Spec* 8 + 9 + **Table of Contents** 10 + 11 + 1 Architectural Overview 12 + 13 + > 1.1 Design Principles 14 + > 15 + > 1.2 ATProto Signing Constraint 16 + > 17 + > 1.3 The Relay as Permanent Proxy and Signer 18 + > 19 + > 1.4 Device Lifecycle Phases 20 + > 21 + > 1.5 Sovereignty Model 22 + 23 + 2 Identity and Key Management 24 + 25 + > 2.1 Key Architecture 26 + > 27 + > 2.2 DID Document Structure 28 + > 29 + > 2.3 Secure Enclave / Keychain Integration 30 + 31 + 3 Mobile-First Onboarding 32 + 33 + > 3.1 Onboarding Sequence 34 + > 35 + > 3.2 UX Considerations 36 + 37 + 4 Relay Architecture 38 + 39 + > 4.1 Relay Responsibilities 40 + > 41 + > 4.2 XRPC Proxying to Desktop 42 + > 43 + > 4.3 Pluggable Signer Interface 44 + > 45 + > 4.4 Durability Requirements 46 + 47 + 5 Desktop Promotion 48 + 49 + > 5.1 Device Pairing 50 + > 51 + > 5.2 Repo Migration to Desktop 52 + > 53 + > 5.3 Post-Promotion Data Flow 54 + > 55 + > 5.4 Desktop Offline Behavior 56 + 57 + 6 iOS App: Identity Wallet 58 + 59 + > 6.1 What the App Is (and Is Not) 60 + > 61 + > 6.2 Core Capabilities 62 + > 63 + > 6.3 Technology Stack 64 + > 65 + > 6.4 App Store Considerations 66 + 67 + 7 Key Recovery 68 + 69 + > 7.1 Shamir Share Distribution 70 + > 71 + > 7.2 Recovery Scenarios 72 + 73 + 8 API Surface 74 + 75 + 9 Edge Cases and Failure Modes 76 + 77 + 10 Future: Desktop-Local Signing 78 + 79 + 11 Implementation Milestones 80 + 81 + 12 Design Decisions Log 82 + 83 + **1. Architectural Overview** 84 + 85 + **1.1 Design Principles** 86 + 87 + The iOS app is an identity wallet. It is not a social client. Users interact with the ATProto network through third-party applications like Bluesky, which connect to the user's PDS via XRPC to create records. 
The iOS app's job is to provision and manage the infrastructure that makes that possible. 88 + 89 + Three principles govern the design: 90 + 91 + - **Sovereignty by rotation key custody.** The user's root rotation key lives on their device (iOS Keychain or macOS Keychain). This key is the ultimate authority over the identity. It can revoke the signing key, change the PDS endpoint, and migrate the account. No server operator can override it. 92 + 93 + - **Mobile-first onboarding.** A user can create an account, establish their DID, and start using Bluesky using only an iPhone. No desktop required. 94 + 95 + - **Progressive data sovereignty.** The relay holds the signing key in all phases. Data sovereignty increases as the user adds a desktop: the repo moves to their hardware, the relay becomes a proxy, and the user's rotation key can revoke the relay's signing authority at any time. The architecture is designed so that signing sovereignty can follow if ATProto adds multi-key support. 96 + 97 + **1.2 ATProto Signing Constraint** 98 + 99 + A critical constraint in the current ATProto specification: a DID document supports exactly one active atproto signing key. The verificationMethods field in a did:plc operation is a key-value map with a single "atproto" slot. The spec explicitly states that the first valid atproto signing key in the verificationMethod array should be used, and any others ignored. 100 + 101 + This means: 102 + 103 + - Only one entity can sign repo commits at any given time. 104 + 105 + - Switching the signing key from the relay to a desktop (or vice versa) requires a DID document update via the PLC directory. 106 + 107 + - There is no mechanism for "fall back to the relay when the desktop is offline" without performing a DID update each time. 108 + 109 + - Key scoping (e.g., "this key can sign posts but not rotate identity") does not exist beyond the inherent rotation-key vs. signing-key distinction. 
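The consequence of the single "atproto" slot can be illustrated with a toy model. The `PlcState` class and the key names are hypothetical stand-ins, not the real did:plc operation format. The point is only that registering a new signer overwrites the previous one and costs one DID update each time.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlcState:
    """Hypothetical, simplified view of the did:plc fields discussed here."""

    rotation_keys: List[str]  # priority-ordered, highest priority first
    verification_methods: Dict[str, str] = field(default_factory=dict)
    updates: int = 0  # number of DID updates issued so far

    def set_signing_key(self, key: str) -> None:
        # There is exactly one "atproto" slot, so a new signing key
        # replaces the old one rather than sitting alongside it.
        self.verification_methods["atproto"] = key
        self.updates += 1


state = PlcState(rotation_keys=["user_root_key", "relay_rotation_key"])
state.set_signing_key("relay_signing_key")
state.set_signing_key("desktop_signing_key")  # the relay key is no longer registered
```

After the second call only the desktop key remains in the map, and two updates have been issued. This is why a relay fallback would need a further update each time the desktop goes offline.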
110 + 111 + Rotation keys, by contrast, do support multiple entries with a priority ordering. This is the basis of the user's sovereignty: they hold a higher-priority rotation key than the relay, giving them the ability to revoke the relay's signing key and reissue to a different provider at any time. 112 + 113 + This constraint is the single most important architectural fact in this spec. Every design decision flows from it. 114 + 115 + **1.3 The Relay as Permanent Proxy and Signer** 116 + 117 + The relay serves two permanent roles: 118 + 119 + - **XRPC proxy.** The desktop PDS may be behind CGNAT or a home firewall and therefore unreachable from the public internet. The relay's URL is the service endpoint in the DID document. It forwards XRPC requests to the desktop via the Iroh tunnel when a desktop is enrolled. This endpoint never changes. 120 + 121 + - **Commit signer.** The relay holds the single atproto signing key registered in the DID document. When third-party apps (Bluesky, etc.) call XRPC to create records, the relay signs the resulting commits. In desktop-enrolled mode, the relay still signs, but it does so by forwarding the record data to the desktop for repo construction, receiving the unsigned commit back, and signing it. 122 + 123 + This dual role means the relay is always in the critical path. The user's sovereignty comes not from removing the relay, but from holding the rotation key that controls it. The user can swap to a different relay at any time by revoking the current relay's signing key and provisioning a new one --- the same migration pattern used when switching PDS hosts today. 
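The relay's two roles can be sketched as a single write handler. This is an illustration under stated assumptions: `handle_write`, `build_commit_locally`, and the `desktop` callable are hypothetical names, and HMAC-SHA256 stands in for the real secp256k1 or P-256 signature. The 503 response for an unreachable desktop follows Section 5.4.

```python
import hashlib
import hmac
from typing import Callable, Optional, Tuple

# Placeholder key material; a real relay would hold a secp256k1 or P-256 key.
RELAY_KEY = b"demo-relay-signing-key"


def sign(unsigned_commit: bytes) -> str:
    # HMAC is only a stand-in for the relay's real signature scheme.
    return hmac.new(RELAY_KEY, unsigned_commit, hashlib.sha256).hexdigest()


def build_commit_locally(record: bytes) -> bytes:
    # Toy "repo construction": the real step updates a Merkle tree.
    return b"commit:" + hashlib.sha256(record).digest()


def handle_write(
    record: bytes,
    desktop_enrolled: bool,
    desktop: Optional[Callable[[bytes], bytes]],
) -> Tuple[int, Optional[str]]:
    if not desktop_enrolled:
        # No desktop: the relay builds and signs the commit itself.
        unsigned = build_commit_locally(record)
    elif desktop is None:
        # Desktop enrolled but unreachable: the relay cannot build a valid commit.
        return 503, None
    else:
        # Desktop builds the unsigned commit; the relay only signs it.
        unsigned = desktop(record)
    return 200, sign(unsigned)


hosted = handle_write(b"post", desktop_enrolled=False, desktop=None)
proxied = handle_write(b"post", desktop_enrolled=True, desktop=build_commit_locally)
offline = handle_write(b"post", desktop_enrolled=True, desktop=None)
```

In every branch that succeeds, the signature comes from the relay's key. Only the party that constructs the commit changes.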
124 + 125 + **1.4 Device Lifecycle Phases** 126 + 127 + ---------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------- ------------------------------------------------------------ 128 + **Phase** **Relay Behavior** **Repo Location** **Phone Role** 129 + **Mobile-Only** Full PDS: hosts repo, serves XRPC, signs commits, emits firehose. Identical to any hosted PDS. Relay (primary). Phone maintains periodic backup. Identity wallet: holds root rotation key, manages account. 130 + **Desktop Enrolled** XRPC proxy + signer: forwards record-creation requests to desktop for repo construction, signs the resulting commits, serves public reads from cache. Desktop (primary). Relay caches public data. Identity wallet + device manager. 131 + **Desktop Offline** Serves public reads from cache. Write requests return 503 (desktop unreachable). Relay cannot create valid commits alone because the desktop holds the repo state. Desktop (authoritative but unreachable). Relay cache serves reads. Same. May receive push alert that desktop went offline. 132 + ---------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------- ------------------------------------------------------------ 133 + 134 + **1.5 Sovereignty Model** 135 + 136 + The sovereignty story has two layers that mature at different rates: 137 + 138 + - **Identity sovereignty (available from day one).** The user holds the highest-priority rotation key. They can: revoke the relay's signing key, change their PDS endpoint to a different provider, migrate their account, and recover their identity even if the relay disappears. 
This is the same level of identity control as running your own PDS today --- the standard pattern is for the PDS to hold a rotation key, but the user holds a higher-priority one. 139 + 140 + - **Data sovereignty (available when desktop is enrolled).** The user's repo lives on their hardware. The relay is a cache/proxy that cannot unilaterally modify the repo (it signs commits, but the desktop constructs the Merkle tree and validates consistency). If the relay misbehaves, the user's local repo is the authoritative copy. 141 + 142 + A third layer, signing sovereignty (the user's own hardware signs commits), is architecturally prepared for but not available in v1.0 due to ATProto's single-key constraint. See Section 10 for the future design. 143 + 144 + **2. Identity and Key Management** 145 + 146 + **2.1 Key Architecture** 147 + 148 + The system uses three keys with clearly separated roles: 149 + 150 + ----------------------- ----------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- 151 + **Key** **Location** **Role** **ATProto Mechanism** 152 + **Root rotation key** iOS Keychain (Secure Enclave-backed where possible) or macOS Keychain Ultimate identity authority. Can modify the DID document: add/remove keys, change service endpoint, rotate itself. Highest-priority entry in the did:plc rotationKeys array. Signed PLC operations. 153 + **Relay signing key** Relay server (HSM or software key) Signs all repo commits. The only key that third-party apps and the AppView see on commits. The single "atproto" entry in verificationMethods. Also a lower-priority rotation key (so relay can update handles). 
154 + **Phone device key** iOS Keychain Authenticates the phone to the relay's management API (account settings, device operations). Not registered in the DID document. None. This is an application-layer key for the relay's REST API, not an ATProto key. 155 + ----------------------- ----------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- 156 + 157 + When a desktop is enrolled, no new key is added to the DID document. The desktop connects to the relay via Iroh and participates in repo construction (building the Merkle tree, validating records), but the relay still signs all commits. The desktop may hold a local key for authenticating to the relay's management API, analogous to the phone's device key. 158 + 159 + **2.2 DID Document Structure** 160 + 161 + The DID document is the same in all lifecycle phases. No DID updates are required when enrolling or removing a desktop: 162 + 163 + - rotationKeys: \[user\_root\_key (highest priority), relay\_rotation\_key\] 164 + 165 + - verificationMethods: { atproto: relay\_signing\_key } 166 + 167 + - services: { atproto\_pds: relay\_url } 168 + 169 + - alsoKnownAs: \[at://handle\] 170 + 171 + This stability is a significant advantage: desktop enrollment and removal are internal operations between the user's devices and the relay. No PLC directory interaction is needed. The DID document only changes for identity operations: handle changes, relay migration, or key rotation. 172 + 173 + **2.3 Secure Enclave / Keychain Integration** 174 + 175 + The root rotation key is the most security-critical asset. It must be protected against extraction while remaining usable for signing PLC operations. 176 + 177 + ATProto requires secp256k1 or P-256 for rotation keys. 
The iOS Secure Enclave natively supports P-256 but not secp256k1. For v1.0: 178 + 179 + - Use P-256 for the root rotation key. This is natively supported in the Secure Enclave, providing hardware-backed key isolation where the private key never leaves the enclave. 180 + 181 + - The signing key (held by the relay) uses secp256k1 or P-256 per the relay's configuration. This is independent of the user's rotation key curve. 182 + 183 + - Shamir recovery shares are derived from a recovery seed that can reconstruct the rotation key (see Section 7). 184 + 185 + This approach provides genuine hardware-backed protection for the rotation key, which is the most important key the user holds. 186 + 187 + **3. Mobile-First Onboarding** 188 + 189 + **3.1 Onboarding Sequence** 190 + 191 + The user goes from app install to a working ATProto identity in a single session. At the end, they can open Bluesky and log in. 192 + 193 + 1. **Step 1: Account creation.** User downloads the iOS app, creates an account via the provisioning API (POST /v1/accounts). This allocates a relay instance and returns the relay endpoint URL. 194 + 195 + 2. **Step 2: Root key generation.** The app generates a P-256 key pair in the Secure Enclave. The public key is extracted (the private key never leaves the enclave). A phone device key is also generated in the Keychain for API authentication. 196 + 197 + 3. **Step 3: Relay key provisioning.** The relay generates its own signing key pair server-side and returns the public key to the phone. 198 + 199 + 4. **Step 4: DID creation.** The app calls the DID ceremony endpoint (POST /v1/dids) with the user's rotation public key and the relay's signing and rotation public keys. The relay orchestrates did:plc creation with the PLC directory. The service endpoint is set to the relay URL. 200 + 201 + 5. **Step 5: Handle assignment.** User selects a handle (user.yourapp.social for free tier, or custom domain). The relay configures DNS and verifies resolution. 
202 + 203 + 6. **Step 6: Repo initialization.** The relay creates an empty ATProto repo. The relay is now a fully functional PDS. Third-party apps can authenticate and create records. 204 + 205 + 7. **Step 7: Shamir share generation.** The app generates 2-of-3 Shamir shares for the root rotation key's recovery seed: Share 1 in iCloud Keychain, Share 2 escrowed to relay, Share 3 as user's choice at account creation. 206 + 207 + The onboarding flow presents both options with a brief explanation: 208 + 209 + - **Option A: Device-local** (Secure Enclave / Keychain on a second device) — More convenient but requires a second device. 210 + - **Option B: BIP-39 mnemonic phrase** (paper backup or USB storage) — More resilient but requires physical safekeeping. 211 + 212 + Default recommendation: BIP-39 (safer for users with only one device). 213 + 214 + 8. **Step 8: Federation activation.** The relay calls requestCrawl on the configured AppView. The user can now open Bluesky, log in with their handle, and start posting. 215 + 216 + **3.2 UX Considerations** 217 + 218 + Steps 2--4 should present as a single "Creating your identity..." moment (1--3 seconds, bottlenecked by the PLC directory). 219 + 220 + Step 7 is the critical UX challenge. Recommended approach: iCloud Keychain backup is automatic and silent; the choice between device-local and BIP-39 gets a dedicated full-screen with clear explanation and safekeeping guidance; relay escrow is explained in an optional "Learn more" flow. 221 + 222 + Step 8 should conclude with: "Your identity is ready. Open Bluesky and log in with \[handle\]." Deep-link to Bluesky if installed, App Store link if not. 223 + 224 + **4. Relay Architecture** 225 + 226 + **4.1 Relay Responsibilities** 227 + 228 + The relay is the user's PDS as seen by the ATProto network. 
Its responsibilities vary by lifecycle phase but it is always the public-facing endpoint: 229 + 230 + -------------------- ------------------------------------------------------------------------------------------------ ---------------------------------------------------------------------------------------------------------------------------------------------------- 231 + **Responsibility** **Mobile-Only Phase** **Desktop Enrolled Phase** 232 + **XRPC endpoint** Terminates all XRPC directly. Serves reads from local repo. Handles writes by signing commits. Proxies writes to desktop for repo construction, signs the resulting commit. Serves reads from cache (fast) or proxies to desktop (authoritative). 233 + **Commit signing** Signs all commits using the DID-registered signing key. Same. The relay always signs. The desktop constructs the commit (Merkle tree update), the relay signs it. 234 + **Repo storage** Primary repo host. Durable storage with daily backups to object storage. Cache of public data. Desktop is authoritative. Relay pulls updates after each commit. 235 + **Firehose** Emits commit events to subscribers and calls requestCrawl on AppViews. Same. Emits after signing each commit. 236 + **OAuth / auth** Handles OAuth flows for third-party app authentication. Same. OAuth state is managed at the relay, not forwarded to desktop. 237 + -------------------- ------------------------------------------------------------------------------------------------ ---------------------------------------------------------------------------------------------------------------------------------------------------- 238 + 239 + **4.2 XRPC Proxying to Desktop** 240 + 241 + When a desktop is enrolled, the relay's write path changes. A write request (e.g., com.atproto.repo.createRecord from Bluesky) follows this flow: 242 + 243 + 1. **Step 1.** Relay receives the XRPC request and validates the OAuth session. 244 + 245 + 2. 
**Step 2.** Relay forwards the record data to the desktop PDS via the Iroh QUIC tunnel. 246 + 247 + 3. **Step 3.** Desktop constructs the repo commit: creates the record, updates the Merkle tree, computes the new root hash, and builds the unsigned commit object. 248 + 249 + 4. **Step 4.** Desktop sends the unsigned commit back to the relay. 250 + 251 + 5. **Step 5.** Relay signs the commit with the DID-registered signing key. 252 + 253 + 6. **Step 6.** Relay sends the signed commit back to the desktop for storage. 254 + 255 + 7. **Step 7.** Relay emits the commit to the firehose and responds to the XRPC caller. 256 + 257 + Read requests (com.atproto.repo.getRecord, etc.) can be served from the relay's cache for low latency, or proxied to the desktop for authoritative reads. Cached reads have a configurable TTL. 258 + 259 + **4.3 Pluggable Signer Interface** 260 + 261 + The relay's signing logic is implemented behind a clean interface. In v1.0, the only implementation is "relay-local key." The interface is designed so that a "desktop-remote key" implementation can be added later without architectural changes: 262 + 263 + ----------------------- --------------------------------------------- ---------------------------------------------------------------------------------------------------- 264 + **Interface Method** **v1.0: Relay-Local Signer** **Future: Desktop-Remote Signer** 265 + **sign(commitBytes)** Relay signs using its local key. Relay forwards to desktop for signing via Iroh. Requires DID doc update to register desktop's key. 266 + **getPublicKey()** Returns the relay's signing key public key. Returns the desktop's signing key public key. 267 + **isAvailable()** Always true (key is local). True when desktop is reachable. False when offline. 
268 + ----------------------- --------------------------------------------- ---------------------------------------------------------------------------------------------------- 269 + 270 + This interface is the extension point. If ATProto adds multi-key signing support, or if users demand desktop-local signing despite the DID-update latency, the "desktop-remote" implementation can be activated per-account as a configuration change. See Section 10 for the full future design. 271 + 272 + **4.4 Durability Requirements** 273 + 274 + During the mobile-only phase, the relay is the primary repo host: 275 + 276 + - Storage: SQLite with WAL mode on persistent volumes. 277 + 278 + - Backup: daily repo snapshots to object storage (S3/R2). 279 + 280 + - SLA: 99.9% uptime target. Downtime means the identity is unreachable. 281 + 282 + - Free tier: full repo hosting with storage cap from provisioning spec. 283 + 284 + The phone maintains a periodic backup (BGAppRefreshTask, 6-hour interval) as a resilience measure. This is a snapshot for disaster recovery, not a sync mechanism. 285 + 286 + **5. Desktop Promotion** 287 + 288 + **5.1 Device Pairing** 289 + 290 + Pairing is initiated via QR code. Because no DID document update is needed (the relay's signing key doesn't change), this is purely an internal configuration change: 291 + 292 + 1. **Step 1.** Desktop displays a QR code: one-time pairing token, ephemeral X25519 public key, relay endpoint URL. 293 + 294 + 2. **Step 2.** Phone scans QR, derives shared secret via X25519 key agreement. 295 + 296 + 3. **Step 3.** Phone and desktop exchange device management keys over the encrypted channel (relayed through the relay as message broker). 297 + 298 + 4. **Step 4.** Relay registers the desktop as an enrolled device and establishes the Iroh tunnel endpoint. 299 + 300 + 5. **Step 5.** Both devices confirm the pairing is active. 301 + 302 + Critically: no PLC directory operation is required. The DID document is unchanged. 
This means pairing is instant, reversible, and invisible to the ATProto network. 303 + 304 + **5.2 Repo Migration to Desktop** 305 + 306 + After pairing, the desktop needs a copy of the repo: 307 + 308 + 1. **Step 1.** Desktop requests a full repo export from the relay (GET /v1/export/repo). Streamed as a CAR file. 309 + 310 + 2. **Step 2.** Desktop imports the repo, validates the Merkle root against the latest commit. 311 + 312 + 3. **Step 3.** Desktop signals readiness (POST /v1/devices/:id/promote). 313 + 314 + 4. **Step 4.** Relay switches to proxy mode: write requests are now forwarded to the desktop for repo construction before the relay signs them. 315 + 316 + **5.3 Post-Promotion Data Flow** 317 + 318 + After promotion, the write path is: Bluesky → relay (XRPC) → desktop (repo construction via Iroh) → relay (signing) → desktop (storage) → relay (firehose emission + XRPC response). 319 + 320 + The desktop is the authoritative repo. The relay holds a cache. This gives the user data sovereignty: their posts, follows, and all records live on their hardware. The relay cannot unilaterally modify the repo because the desktop validates Merkle tree consistency --- any commit the relay signs must be built from the desktop's repo state. 321 + 322 + **5.4 Desktop Offline Behavior** 323 + 324 + When the desktop is unreachable (lid closed, power off, network down): 325 + 326 + - Read requests: relay serves from its cache. The user's profile, posts, and public data remain available. 327 + 328 + - Write requests: relay returns 503 to XRPC callers. Third-party apps like Bluesky see "PDS temporarily unavailable" and may retry or show an error. The user cannot post until the desktop comes online. 329 + 330 + - Why not fall back to relay-local writes? Because the relay doesn't have the current repo state. The desktop holds the authoritative Merkle tree. If the relay signed a commit built from stale state, the repo would fork. 
This is an integrity constraint, not a policy choice. 331 + 332 + This is the primary UX tradeoff of desktop enrollment. Users who find this unacceptable can delay enrolling a desktop and remain in the mobile-only phase (where the relay handles everything). The identity wallet should clearly communicate this tradeoff during the promotion flow. 333 + 334 + Mitigation: the desktop PDS should be configured as a launchd daemon (macOS) that starts at boot and runs even with the lid closed (if connected to power and network). Users should be educated that their Mac is now infrastructure. 335 + 336 + **6. iOS App: Identity Wallet** 337 + 338 + **6.1 What the App Is (and Is Not)** 339 + 340 + The iOS app is: 341 + 342 + - An identity provisioning tool (create DID, set up handle, generate keys). 343 + 344 + - A key management interface (view rotation key status, initiate recovery). 345 + 346 + - A device manager (pair desktop, trigger promotion, monitor device health). 347 + 348 + - A recovery tool (Shamir share management, root key reconstruction). 349 + 350 + - An account settings interface (display name, avatar, handle changes, account deletion). 351 + 352 + - A relay health monitor (uptime, storage usage, federation status). 353 + 354 + The iOS app is not: 355 + 356 + - A social client. No feeds, timelines, notifications, or post content. 357 + 358 + - A content creation tool. Users do not compose posts or replies here. 359 + 360 + - A Bluesky client. It does not implement app.bsky.\* lexicons. 361 + 362 + **6.2 Core Capabilities** 363 + 364 + - **Onboarding wizard:** account creation, key generation, DID ceremony, handle selection, Shamir setup, "Open Bluesky" CTA. 365 + 366 + - **Identity dashboard:** DID, handle, service endpoint, rotation key status, signing key info (relay-held), last federation crawl. 367 + 368 + - **Device management:** pair desktop (QR scanner), promote desktop, view device status (online/offline/last seen), de-enroll desktop. 
369 + 370 + - **Recovery center:** Shamir share status (iCloud sync verified? relay escrow healthy? device-local backup available?), initiate recovery ceremony, regenerate shares. 371 + 372 + - **Account settings:** display name, avatar (uploaded via relay), handle management, account deletion (exit ceremony). 373 + 374 + - **Relay health:** uptime, storage usage vs. tier cap, current operating mode (hosted PDS vs. proxy). 375 + 376 + - **Sovereignty actions:** revoke relay signing key (nuclear option --- identity goes offline until a new key is provisioned), migrate to a different relay provider. 377 + 378 + **6.3 Technology Stack** 379 + 380 + - **UI:** SwiftUI. Native is required for Secure Enclave / Keychain integration. 381 + 382 + - **Networking:** URLSession for HTTPS to relay API. No Iroh on iOS in v1.0. 383 + 384 + - **Crypto:** CryptoKit for P-256 Secure Enclave key generation and signing (rotation key). No secp256k1 library needed on the phone --- the relay handles signing. 385 + 386 + - **Local storage:** SQLite (via GRDB.swift) for repo backup. Core Data for app state. 387 + 388 + - **Background:** BGAppRefreshTask for periodic repo backup (6-hour interval). 389 + 390 + - **Push:** APNs for relay health alerts (desktop offline, storage cap approaching). 391 + 392 + **6.4 App Store Considerations** 393 + 394 + - Crypto export compliance: uses CryptoKit (system framework). No custom cryptography. Should qualify for CCATS exemption. 395 + 396 + - Account deletion: maps to exit ceremony (DELETE /v1/accounts/:id). 30-day grace period is compliant. 397 + 398 + - Minimum functionality: position as "decentralized identity wallet" with onboarding wizard, device management, recovery tools, and relay dashboard. The onboarding flow alone provides sufficient interactive surface. 399 + 400 + - IAP considerations: if tier upgrades are offered in-app, Apple takes 30%. Consider directing upgrades to web dashboard. 401 + 402 + **7. 
Key Recovery** 403 + 404 + **7.1 Shamir Share Distribution** 405 + 406 + The root rotation key's recovery seed is split into 2-of-3 Shamir shares: 407 + 408 + ------------- ---------------------------------- ---------------------------------------------------------------------------------------------------------- 409 + **Share** **Location** **Availability** 410 + **Share 1** iCloud Keychain Survives phone loss if iCloud account is intact. Available on any Apple device signed into the same iCloud. 411 + **Share 2** Relay escrow Retrieved via account authentication (email + password or OAuth). Available as long as relay is operational. 412 + **Share 3** User's choice (device-local or BIP-39) Device-local: stored on designated backup device. BIP-39 phrase: paper or USB backup. User's responsibility. 413 + ------------- ---------------------------------- ---------------------------------------------------------------------------------------------------------- 414 + 415 + **7.2 Recovery Scenarios** 416 + 417 + - **Lost phone, iCloud intact:** Share 1 + Share 2 = recovery. Install app on new phone, authenticate to iCloud and relay, reconstruct root key in new Secure Enclave. If share 3 is device-local on another device, access it there for redundancy. 418 + 419 + - **Lost phone, iCloud compromised:** Share 2 + Share 3 = recovery. Authenticate to relay, retrieve device-local share from backup device or enter BIP-39 recovery phrase. Same key reconstruction flow. 420 + 421 + - **Lost phone + relay down:** Share 1 + Share 3 = recovery. Reconstruct root key locally, provision new relay, update DID doc with new service endpoint. 422 + 423 + - **Compromised relay signing key:** Phone's root rotation key signs a DID doc update revoking the relay's key. Relay generates a new key, root key authorizes it via another DID update. Brief service interruption. 
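The property these scenarios rely on is that any two of the three shares recover the seed. The sketch below shows a textbook 2-of-3 Shamir split over a prime field. It is illustrative only: the field prime, the function names, and the use of a random 256-bit integer as the "recovery seed" are assumptions, and a production app should use an audited library rather than this code.

```python
import secrets
from itertools import combinations
from typing import List, Tuple

PRIME = 2**521 - 1  # Mersenne prime, comfortably larger than a 256-bit seed

Share = Tuple[int, int]  # (x, f(x))


def split_2_of_3(secret: int) -> List[Share]:
    """Split `secret` with threshold 2: f(x) = secret + slope * x (mod PRIME)."""
    assert 0 <= secret < PRIME
    slope = secrets.randbelow(PRIME - 1) + 1  # random non-zero coefficient
    return [(x, (secret + slope * x) % PRIME) for x in (1, 2, 3)]


def reconstruct(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


seed = secrets.randbits(256)  # stand-in for the rotation key's recovery seed
shares = split_2_of_3(seed)   # share 1: iCloud, share 2: relay, share 3: user's choice
recovered = [reconstruct(list(pair)) for pair in combinations(shares, 2)]
```

The three pairs in `recovered` correspond to the three share combinations above: shares 1 + 2, shares 1 + 3, and shares 2 + 3. A single share on its own reveals nothing about the seed.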
424 + 425 + Important: the relay holds the only signing key, so losing access to the relay interrupts service, but it is not an identity loss. The user-held rotation key can always authorize a new relay with a new signing key. 426 + 427 + **7.3 Planned Device Upgrades** 428 + 429 + For planned device upgrades (e.g., new iPhone), see the Data Migration & Recovery Spec §3, which covers the Iroh-based peer transfer with a 6-digit verification code. The same transfer protocol works for phone-to-phone and desktop-to-desktop swaps. 430 + 431 + Desktop-to-desktop migration is covered by the same section (§3). The phone acts as the authorization device during desktop transfers: the user confirms the transfer from the mobile app. 432 + 433 + **8. API Surface** 434 + 435 + New endpoints extending the provisioning API: 436 + 437 + ------------ ------------------------- --------------------------------------------------------------------------------------------------------------------------- 438 + **Method** **Path** **Description** 439 + POST /v1/relay/keys Generate a new relay signing key. Returns public key. Called during onboarding and after key rotation. 440 + DELETE /v1/relay/keys/:keyId Revoke the relay's signing key. Requires root rotation key signature. Relay stops signing immediately. 441 + GET /v1/relay/repo/snapshot Full repo snapshot (CAR file). For phone backup and desktop repo import. 442 + POST /v1/devices/:id/pair Initiate device pairing. Body: pairing token, ephemeral public key. Returns encrypted channel ID. 443 + POST /v1/devices/:id/promote Promote desktop to repo host. Relay switches to proxy mode for writes. 444 + DELETE /v1/devices/:id De-enroll a device. If desktop, relay reverts to hosted PDS mode. 445 + GET /v1/devices/:id/status Device status: online/offline, last seen, role, lifecycle phase. 446 + POST /v1/relay/commits/sign Sign an unsigned commit. Used by the desktop in proxy mode. 
Desktop sends unsigned commit bytes, relay returns signature. 447 + GET /v1/relay/mode Current relay operating mode (hosted-pds or proxy). Includes desktop connectivity status. 448 + ------------ ------------------------- --------------------------------------------------------------------------------------------------------------------------- 449 + 450 + Modified existing endpoints: 451 + 452 + - **POST /v1/dids:** accepts relay\_signing\_key and relay\_rotation\_key fields. DID ceremony registers all keys atomically. 453 + 454 + - **GET /v1/export/repo:** supports chunked transfer encoding for cellular. Also used during desktop promotion. 455 + 456 + - **POST /v1/keys/shares:** called during mobile onboarding (Step 7), not just desktop setup. 457 + 458 + **9. Edge Cases and Failure Modes** 459 + 460 + -------------------------------------------- ------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------------------------------------------------- 461 + **Scenario** **Behavior** **Recovery** 462 + Relay compromised (signing key leaked) Attacker can sign repo commits. Cannot rotate identity (no root key). Phone's root key revokes signing key via PLC update. Relay generates new key, root key authorizes it. Brief outage. 463 + Desktop offline, user tries to post Relay returns 503 to Bluesky. User sees "PDS unavailable." User opens Mac / wakes from sleep. Desktop reconnects via Iroh. Writes resume. 464 + Desktop offline, user reads own profile Relay serves from cache. Profile and posts remain visible. No action needed. Reads always work from cache. 465 + Relay down during mobile-only Identity unreachable. No reads or writes. Phone has local backup. Wait for restoration, or: root key provisions new relay, updates DID doc, restores repo from phone backup. 
466 + Relay down during desktop-enrolled Identity unreachable (relay is the service endpoint). Desktop has full repo. Same: provision new relay, update DID doc. Desktop's repo is authoritative. No data loss. 467 + Phone lost, root key gone Relay and desktop continue operating. No identity operations possible. Shamir recovery (Section 7). Reconstruct root key on new phone. Normal service is uninterrupted during recovery. 468 + DID update fails during relay key rotation Old signing key still active. New key not yet authorized. Retry the PLC operation. Old key continues working until rotation completes. No service interruption. 469 + Desktop de-enrolled while offline Relay switches back to hosted PDS but has stale repo (last cache). Relay's cache becomes the new source of truth. Some recent commits may be lost if they were only on the desktop. Phone backup can supplement. 470 + iCloud Keychain sync fails for Share 1 Share 1 only on local device. Phone loss leaves Shares 2 + 3. App verifies sync success, warns if failed. Device-local or BIP-39 backup is available as fallback. 471 + User wants to switch relay providers Standard PDS migration: provision new relay, export repo, root key updates DID doc. This is the credible exit story. Fully supported by ATProto's existing migration protocol. 472 + -------------------------------------------- ------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------------------------------------------------- 473 + 474 + **10. Future: Desktop-Local Signing** 475 + 476 + This section documents the architectural path to full signing sovereignty, for when ATProto adds multi-key support or when the DID-update-per-switch tradeoff becomes acceptable. 
477 + 478 + **10.1 If ATProto Adds Multi-Key Signing** 479 + 480 + If the verificationMethods field is expanded to support multiple atproto signing keys (e.g., via an array instead of a single-entry map), the architecture is ready: 481 + 482 + - Register the desktop's signing key alongside the relay's key in the DID doc. 483 + 484 + - Activate the "desktop-remote signer" implementation behind the pluggable signer interface (Section 4.3). 485 + 486 + - When the desktop is online, commits are signed by the desktop's key. When offline, the relay signs with its key. No DID update needed for failover. 487 + 488 + - This is the "sovereignty dial" from v1.1 of this spec, deferred until the protocol supports it. 489 + 490 + **10.2 If Users Demand Desktop Signing Now** 491 + 492 + Even under current ATProto constraints, desktop-local signing is technically possible at the cost of DID updates: 493 + 494 + - Desktop holds the signing key. DID doc's atproto verificationMethod points to the desktop's key. 495 + 496 + - Relay proxies unsigned commits to the desktop for signing (reversing the current flow). 497 + 498 + - When the desktop goes offline, the user has two options: accept write downtime, or trigger a DID update to swap the signing key to the relay (and swap back when the desktop returns). 499 + 500 + - The DID update path adds 1--3 seconds of latency per switch. The PLC directory also rate-limits operations per DID. Frequent switching (daily lid-close/open cycles) would be impractical. 501 + 502 + Recommendation: offer this as an "advanced" option for power users who explicitly want signing sovereignty, with clear warnings about the downtime tradeoff. The pluggable signer interface supports this with no relay code changes --- it's purely a configuration and DID doc update. 
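The pluggable signer interface that 10.1 and 10.2 rely on could look roughly like the sketch below. This is a minimal illustration, not the relay's actual code: the spec does not define the interface, so `CommitSigner`, `SignError`, `RelayLocalSigner`, `DesktopRemoteSigner`, `SignerConfig` and `build_signer` are all hypothetical names, and `placeholder_signature` is a stand-in for real P-256 signing so that the example needs no external crates.

```rust
/// Error returned when a signer cannot produce a signature.
#[derive(Debug, PartialEq)]
pub enum SignError {
    /// The device holding the key is unreachable.
    SignerOffline,
}

/// Hypothetical shape for the pluggable signer interface (Section 4.3).
pub trait CommitSigner {
    /// Label for the wallet's "signing authority" field.
    fn authority(&self) -> &'static str;
    /// Signs the bytes of an unsigned commit.
    fn sign(&self, unsigned_commit: &[u8]) -> Result<Vec<u8>, SignError>;
}

/// NOT cryptography: a wrapping byte sum tagged with a key id stands in
/// for a real ECDSA signature so the sketch stays dependency-free.
fn placeholder_signature(key_tag: u8, data: &[u8]) -> Vec<u8> {
    let sum = data.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    vec![key_tag, sum]
}

/// Current model: the relay holds the signing key and signs locally.
pub struct RelayLocalSigner;

impl CommitSigner for RelayLocalSigner {
    fn authority(&self) -> &'static str {
        "relay"
    }
    fn sign(&self, unsigned_commit: &[u8]) -> Result<Vec<u8>, SignError> {
        Ok(placeholder_signature(1, unsigned_commit))
    }
}

/// Possible future model: commits are forwarded to the desktop for signing.
pub struct DesktopRemoteSigner {
    pub desktop_online: bool,
}

impl CommitSigner for DesktopRemoteSigner {
    fn authority(&self) -> &'static str {
        "desktop"
    }
    fn sign(&self, unsigned_commit: &[u8]) -> Result<Vec<u8>, SignError> {
        if !self.desktop_online {
            // Section 10.2: accept write downtime, or swap keys via a DID update.
            return Err(SignError::SignerOffline);
        }
        Ok(placeholder_signature(2, unsigned_commit))
    }
}

/// Configuration switch; callers only ever see `dyn CommitSigner`.
pub enum SignerConfig {
    RelayLocal,
    DesktopRemote { desktop_online: bool },
}

/// Builds the signer selected by configuration.
pub fn build_signer(cfg: SignerConfig) -> Box<dyn CommitSigner> {
    match cfg {
        SignerConfig::RelayLocal => Box::new(RelayLocalSigner),
        SignerConfig::DesktopRemote { desktop_online } => {
            Box::new(DesktopRemoteSigner { desktop_online })
        }
    }
}
```

Because the commit pipeline would depend only on the trait, switching from relay signing to desktop signing stays a configuration change plus a DID doc update, which is the property the recommendation in 10.2 depends on.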
503 + 504 + **10.3 Abstraction Checklist** 505 + 506 + To ensure the architecture stays flexible, the following must remain true: 507 + 508 + - The relay's signing logic is behind the signer interface (Section 4.3). No commit-signing code exists outside this interface. 509 + 510 + - The desktop's repo construction is signing-agnostic: it builds unsigned commits and expects a signature from an external source (today the relay, potentially itself in the future). 511 + 512 + - The DID ceremony endpoint accepts the signing key as a parameter, not a hardcoded relay key. The key source can be the relay or the desktop. 513 + 514 + - The phone's identity wallet UI has a placeholder for "signing authority" in the identity dashboard, even if it's always "relay" in v1.0. 515 + 516 + **11. Implementation Milestones** 517 + 518 + **11.1 Phase 1: Identity Wallet MVP (iOS v0.1)** 519 + 520 + Goal: user can create an ATProto identity from their iPhone and log into Bluesky. 521 + 522 + - P-256 key generation in Secure Enclave (root rotation key). 523 + 524 + - Account creation via provisioning API. 525 + 526 + - Relay signing key provisioning. 527 + 528 + - DID ceremony (root rotation key + relay keys). 529 + 530 + - Handle selection (yourapp.social subdomain). 531 + 532 + - Shamir share generation (iCloud + relay + user's choice of device-local or BIP-39). 533 + 534 + - Federation activation (requestCrawl). 535 + 536 + - Onboarding wizard with "Open Bluesky" CTA. 537 + 538 + - Basic identity dashboard. 539 + 540 + **11.2 Phase 2: Device Management (iOS v0.2)** 541 + 542 + Goal: user can pair a desktop and manage devices. 543 + 544 + - QR-code device pairing. 545 + 546 + - Desktop promotion (repo migration + proxy mode activation). 547 + 548 + - Device list with status and health. 549 + 550 + - De-enrollment flow. 551 + 552 + - Periodic repo backup via BGAppRefreshTask. 553 + 554 + - Push notifications for relay health alerts. 
555 + 556 + **11.3 Phase 3: Recovery and Polish (iOS v1.0)** 557 + 558 + Goal: production-ready identity wallet. 559 + 560 + - Shamir recovery ceremony (all three share combinations). 561 + 562 + - Custom domain handle setup. 563 + 564 + - Account deletion (exit ceremony). 565 + 566 + - Relay signing key rotation (revoke and reissue). 567 + 568 + - Relay migration (switch to different provider). 569 + 570 + - Share status monitoring. 571 + 572 + - Storage usage dashboard. 573 + 574 + **11.4 Future: Signing Sovereignty (v2.0+)** 575 + 576 + Goal: user's own hardware signs commits (contingent on ATProto evolution). 577 + 578 + - Desktop-remote signer implementation behind pluggable interface. 579 + 580 + - DID doc update automation for signing key swaps. 581 + 582 + - Multi-key support (if ATProto adds it). 583 + 584 + - Sovereignty dial UI in identity wallet. 585 + 586 + **12. Design Decisions Log** 587 + 588 + --------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------- 589 + **Decision** **Rationale** **Alternatives Rejected** 590 + Relay always holds the signing key ATProto supports exactly one active signing key per DID. The relay must be always-available for third-party apps. Switching keys requires DID updates (1--3s latency, rate-limited). Architecture is designed with a pluggable signer interface for future flexibility. Desktop signs (503 on offline). Dual signing keys (not supported by ATProto). Phone as signing oracle (iOS throttles pushes). 591 + Relay is the permanent service endpoint Desktop may be behind CGNAT/firewall. Relay proxies via Iroh. 
No DID doc changes when enrolling/removing desktop. Network-invisible promotion. Direct desktop exposure (CGNAT blocks). DID endpoint swap on promotion (complex, visible). 592 + P-256 for rotation key (not secp256k1) P-256 is natively supported in iOS Secure Enclave, providing hardware-backed key isolation. ATProto accepts both P-256 and secp256k1 for rotation keys. No need to use a software key with weaker protection. secp256k1 software key with Keychain protection (weaker than Secure Enclave). Wait for Secure Enclave secp256k1 support (unlikely). 593 + No DID doc changes for desktop enrollment The relay's signing key and service endpoint don't change. Desktop is an internal infrastructure detail. This makes enrollment instant, reversible, and invisible to the network. Register desktop key in DID doc (unnecessary, adds PLC latency, visible to network). 594 + iOS app is identity wallet, not social client Users interact with ATProto through third-party apps. Building a social client duplicates work and competes with the ecosystem. The wallet serves the infrastructure layer. Full Bluesky client (scope expansion). Companion with feed viewing (half-measure). 595 + 503 on desktop-offline writes (no fallback) The relay cannot construct valid commits from stale repo state. The desktop holds the authoritative Merkle tree. Signing a commit from stale state would fork the repo. This is an integrity constraint. Relay signs from stale cache (repo fork risk). Queue writes until desktop wakes (complex, stale commits). 596 + Pluggable signer interface for future flexibility If ATProto adds multi-key signing, or users demand desktop-local signing despite DID update costs, the relay can swap implementations with a config change. No architectural redesign needed. Hard-code relay signing (inflexible). Build desktop signing now (premature, protocol doesn't support it cleanly). 597 + SwiftUI native (not cross-platform) Secure Enclave P-256 integration requires native CryptoKit. 
Core function is key management. Cross-platform abstraction over crypto is a security risk. React Native, Flutter (abstraction over security primitives). 598 + QR-code pairing (not BLE/NFC) Works at any distance, no hardware requirements, familiar UX (Signal, WhatsApp). BLE is fragile, NFC requires proximity. Bluetooth LE (unreliable). NFC (proximity). Manual key entry (bad UX). 599 + Share 3 as user's choice Balances convenience (device-local for multi-device users) with resilience (BIP-39 for air-gap backup). Accommodates diverse user security preferences. Single-option model (less flexible). Relay-held share (breaks sovereignty model). 600 + --------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------------
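The "503 on desktop-offline writes" decision in the log above, combined with the two relay modes reported by `GET /v1/relay/mode`, reduces to a small routing table. The sketch below is one way to express it, assuming hypothetical type names (`RelayMode`, `Request`, `Route`); it only models the decision and performs no I/O.

```rust
/// Relay operating mode, mirroring the `hosted-pds` / `proxy` values
/// described for `GET /v1/relay/mode`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RelayMode {
    HostedPds,
    Proxy { desktop_online: bool },
}

/// Coarse classification of an incoming XRPC call.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Request {
    Read,
    Write,
}

/// What the relay does with the call.
#[derive(Debug, PartialEq)]
pub enum Route {
    /// Mobile-only phase: the relay is the PDS and handles everything.
    HandleOnRelay,
    /// Desktop-enrolled reads are answered from the relay's cache.
    ServeFromCache,
    /// Desktop builds the unsigned commit; the relay signs the result.
    ForwardToDesktop,
    /// The HTTP status returned to the caller.
    Unavailable(u16),
}

pub fn route(mode: RelayMode, req: Request) -> Route {
    match (mode, req) {
        (RelayMode::HostedPds, _) => Route::HandleOnRelay,
        (RelayMode::Proxy { .. }, Request::Read) => Route::ServeFromCache,
        (RelayMode::Proxy { desktop_online: true }, Request::Write) => Route::ForwardToDesktop,
        // Signing from stale state would fork the repo, so the write is refused.
        (RelayMode::Proxy { desktop_online: false }, Request::Write) => Route::Unavailable(503),
    }
}
```

Keeping the rule in a single exhaustive `match` means the compiler flags any new mode or request kind that has no routing decision yet.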
+284
docs/oauth-integration-spec.md
··· 1 + # ATProto OAuth Integration Spec 2 + 3 + Relay OAuth Provider 4 + 5 + v0.1 Draft — March 2026 6 + 7 + Companion to: Provisioning API Spec, Mobile Architecture Spec 8 + 9 + --- 10 + 11 + ## 1. Overview 12 + 13 + The relay must be a compliant ATProto OAuth 2.1 authorization server so that third-party apps (Bluesky, etc.) can authenticate users and create records via XRPC. This document specifies how the relay integrates existing Rust OAuth libraries rather than building OAuth from scratch. 14 + 15 + ### 1.1 Why OAuth Matters 16 + 17 + Without a compliant OAuth provider, no third-party app can authenticate against the relay. A user who creates an identity through the mobile app or desktop PDS would then be unable to log into Bluesky, which makes the entire product unusable. OAuth is on the critical path for every lifecycle phase. 18 + 19 + ### 1.2 ATProto OAuth Requirements 20 + 21 + The ATProto OAuth spec requires PDS implementations to support: 22 + 23 + - **OAuth 2.1** authorization code flow with PKCE (S256 only) 24 + - **DPoP** (Demonstrating Proof-of-Possession) using ES256, with unique JTI per request and nonce support 25 + - **PAR** (Pushed Authorization Requests) — mandatory for all client types 26 + - **Client ID metadata documents**: the `client_id` is an HTTPS URL pointing to the client's metadata JSON (fields drawn from RFC 7591); there are no pre-registered credentials and no registration endpoint 27 + - **Server metadata** at `/.well-known/oauth-authorization-server` 28 + - **JWKS endpoint** for public key discovery 29 + - Grant types: `authorization_code` and `refresh_token` 30 + - Token endpoint auth: `none` and `private_key_jwt` 31 + - Scopes: `atproto` and `transition:generic` 32 + - CORS support for browser-based apps 33 + - Refresh tokens are single-use (rotation on each use) 34 + - Tokens bound to DPoP key and client_id 35 + 36 + --- 37 + 38 + ## 2. 
Existing Rust Ecosystem 39 + 40 + ### 2.1 Recommended: `atproto-oauth-axum` 41 + 42 + **Crate:** [atproto-oauth-axum](https://crates.io/crates/atproto-oauth-axum) (v0.14.0, Feb 2026) 43 + **Author:** Nick Gerakines 44 + **Status:** Actively maintained, 22 releases since June 2025, ~440 downloads/month 45 + 46 + Provides pre-built Axum handlers for: 47 + - Authorization endpoint 48 + - Token endpoint 49 + - PAR endpoint 50 + - JWKS endpoint 51 + - Server metadata endpoint 52 + - Client metadata resolution 53 + - Authorization callback handling 54 + 55 + This is the most direct integration path if the relay uses Axum (which aligns with the Rust web server ecosystem). 56 + 57 + ### 2.2 Alternative: `atproto-oauth-aip` 58 + 59 + **Crate:** [atproto-oauth-aip](https://crates.io/crates/atproto-oauth-aip) 60 + **Status:** Same author, lower-level workflow library 61 + 62 + Use this if the relay uses a different HTTP framework (e.g., actix-web) or needs more control over the OAuth flow. Provides the OAuth logic without Axum-specific bindings. 63 + 64 + ### 2.3 Reference Implementation: graze-social/aip 65 + 66 + **Repo:** [graze-social/aip](https://github.com/graze-social/aip) (105 stars, v2.2.3, Jan 2026) 67 + **Status:** Production-ready, Docker support, multiple storage backends 68 + 69 + A complete standalone OAuth 2.1 authorization server with native ATProto integration. Useful as: 70 + - Reference for how a production ATProto OAuth server works 71 + - Potential deployment as a separate sidecar service (vs. embedding in the relay) 72 + - Storage backend patterns (SQLite, PostgreSQL) 73 + 74 + --- 75 + 76 + ## 3. Integration Architecture 77 + 78 + ### 3.1 Deployment Model 79 + 80 + Two viable approaches: 81 + 82 + **Option A: Embedded (recommended for v1.0)** 83 + 84 + The relay process embeds `atproto-oauth-axum` handlers directly into its Axum router. OAuth state lives in the same database as relay state. Simplest deployment — one process, one database. 
85 + 86 + ``` 87 + [Third-party app] → HTTPS → [Relay: Axum router] 88 + ├── /oauth/* → atproto-oauth-axum handlers 89 + ├── /xrpc/* → XRPC proxy/handler 90 + └── /v1/* → Provisioning API 91 + ``` 92 + 93 + **Option B: Sidecar** 94 + 95 + Deploy graze-social/aip as a separate service. The relay delegates OAuth to the sidecar and validates tokens on XRPC requests. More complex but isolates OAuth concerns. 96 + 97 + Not recommended for v1.0 — adds operational complexity for a solo developer. 98 + 99 + ### 3.2 Storage 100 + 101 + OAuth state (authorization codes, tokens, sessions, client metadata cache) stored in the relay's SQLite database. Both `atproto-oauth-axum` and graze-social/aip support SQLite backends. 102 + 103 + Tables needed: 104 + - `oauth_authorization_codes` — short-lived, per-authorization-flow 105 + - `oauth_access_tokens` — bound to DPoP key, client_id, account 106 + - `oauth_refresh_tokens` — single-use, rotated on each use 107 + - `oauth_client_metadata_cache` — cached client metadata from discovery URLs 108 + - `oauth_dpop_nonces` — replay prevention 109 + 110 + ### 3.3 Account Binding 111 + 112 + The OAuth provider needs to map ATProto DIDs to relay accounts. During authorization: 113 + 114 + 1. User is redirected to relay's authorization endpoint 115 + 2. Relay resolves the user's DID → account_id 116 + 3. User authenticates (password, or session token if already logged in) 117 + 4. Relay issues tokens bound to the account 118 + 119 + The relay's existing session/authentication system (provisioning API §2) handles step 3. The OAuth library handles everything else. 120 + 121 + --- 122 + 123 + ## 4. Lifecycle Phase Behavior 124 + 125 + ### 4.1 Mobile-Only Phase 126 + 127 + The relay is a full PDS. OAuth works identically to any hosted PDS: 128 + - Authorization, token, and XRPC endpoints all on the relay 129 + - Relay stores repo, signs commits, serves reads 130 + - Third-party apps see a normal PDS 131 + 132 + No special behavior needed. 
This is the standard ATProto OAuth flow. 133 + 134 + ### 4.2 Desktop-Enrolled Phase 135 + 136 + The relay is still the OAuth provider and XRPC endpoint. The difference is internal: 137 + - Write XRPC calls (createRecord, etc.) are proxied to the desktop for repo construction before the relay signs them 138 + - Read XRPC calls can be served from relay cache 139 + - OAuth tokens and sessions are managed entirely at the relay — the desktop is invisible to third-party apps 140 + 141 + No OAuth changes needed for desktop enrollment. This is the key advantage of the relay-as-permanent-endpoint architecture. 142 + 143 + ### 4.3 Desktop Offline (During Desktop-Enrolled Phase) 144 + 145 + - Read XRPC calls: served from relay cache (no change to OAuth) 146 + - Write XRPC calls: relay returns 503 to the XRPC caller 147 + - OAuth tokens remain valid — the 503 is at the XRPC layer, not the auth layer 148 + 149 + Third-party apps see a PDS that accepts reads but rejects writes. This is a known ATProto pattern (PDS maintenance mode). 150 + 151 + --- 152 + 153 + ## 5. 
Endpoints 154 + 155 + The relay must serve these endpoints at its base URL (the DID document's service endpoint): 156 + 157 + | Endpoint | Source | Purpose | 158 + |----------|--------|---------| 159 + | `/.well-known/oauth-authorization-server` | atproto-oauth-axum | Server metadata (issuer, endpoints, supported flows) | 160 + | `/oauth/authorize` | atproto-oauth-axum | Authorization endpoint (user-facing) | 161 + | `/oauth/token` | atproto-oauth-axum | Token endpoint (app-facing) | 162 + | `/oauth/par` | atproto-oauth-axum | Pushed Authorization Request endpoint | 163 + | `/oauth/jwks` | atproto-oauth-axum | Public keys for token verification | 164 + | `/oauth/callback` | atproto-oauth-axum | Authorization callback | 165 + 166 + These are in addition to the relay's existing endpoints: 167 + - `/v1/*` — provisioning API 168 + - `/xrpc/*` — ATProto XRPC 169 + 170 + ### 5.1 Server Metadata 171 + 172 + The `/.well-known/oauth-authorization-server` response must include: 173 + 174 + ```json 175 + { 176 + "issuer": "https://relay.example.com", 177 + "authorization_endpoint": "https://relay.example.com/oauth/authorize", 178 + "token_endpoint": "https://relay.example.com/oauth/token", 179 + "pushed_authorization_request_endpoint": "https://relay.example.com/oauth/par", 180 + "jwks_uri": "https://relay.example.com/oauth/jwks", 181 + "scopes_supported": ["atproto", "transition:generic"], 182 + "response_types_supported": ["code"], 183 + "grant_types_supported": ["authorization_code", "refresh_token"], 184 + "token_endpoint_auth_methods_supported": ["none", "private_key_jwt"], 185 + "code_challenge_methods_supported": ["S256"], 186 + "dpop_signing_alg_values_supported": ["ES256"] 187 + } 188 + ``` 189 + 190 + --- 191 + 192 + ## 6. Authorization UI 193 + 194 + The relay needs a minimal web UI for the OAuth authorization screen. When a third-party app redirects a user to `/oauth/authorize`, the relay must: 195 + 196 + 1. Show the app's name and permissions requested 197 + 2. 
Allow the user to approve or deny 198 + 3. Redirect back to the app with an authorization code 199 + 200 + For v1.0, this can be a minimal server-rendered page. No SPA needed. The provisioning API's session system handles user authentication. 201 + 202 + For BYO relay operators, the authorization UI should be customizable (branding, colors) via relay config. 203 + 204 + --- 205 + 206 + ## 7. Security Considerations 207 + 208 + ### 7.1 Token Storage 209 + 210 + Access tokens and refresh tokens are stored server-side. The relay validates DPoP proofs on every request, so a stolen token is useless without the matching DPoP private key. 211 + 212 + ### 7.2 Client Metadata Caching 213 + 214 + ATProto clients are not pre-registered: the `client_id` is a URL pointing to a client metadata document. The relay must: 215 + - Fetch and cache client metadata on first authorization 216 + - Re-validate periodically (TTL: 24 hours recommended) 217 + - Reject clients with unreachable or invalid metadata 218 + 219 + ### 7.3 Rate Limiting 220 + 221 + OAuth endpoints should be rate-limited separately from XRPC and provisioning API endpoints. Recommended limits: 222 + - Authorization: 10/min per IP 223 + - Token: 30/min per client_id 224 + - PAR: 30/min per client_id 225 + 226 + ### 7.4 BYO Relay Implications 227 + 228 + Self-hosted relay operators run their own OAuth provider. The BYO relay binary (Nix/Docker) must include the OAuth endpoints. The authorization UI defaults should be sensible without configuration. 229 + 230 + --- 231 + 232 + ## 8. 
Implementation Milestones 233 + 234 + ### v0.1 — Basic OAuth (blocks mobile-only phase) 235 + 236 + - Integrate `atproto-oauth-axum` into relay's Axum router 237 + - SQLite-backed token storage 238 + - Minimal authorization UI (server-rendered) 239 + - Server metadata endpoint 240 + - Test with Bluesky app as client 241 + 242 + ### v1.0 — Production OAuth 243 + 244 + - PostgreSQL storage backend option 245 + - Client metadata caching with TTL 246 + - Rate limiting on OAuth endpoints 247 + - Customizable authorization UI for BYO relay operators 248 + - Token revocation endpoint 249 + - Audit logging of authorization grants 250 + 251 + ### Later 252 + 253 + - Scoped tokens (read-only grants for specific collections) 254 + - Token introspection endpoint 255 + - Admin dashboard for managing active OAuth sessions 256 + 257 + --- 258 + 259 + ## 9. Integration Checklist 260 + 261 + Before the relay can accept third-party app logins: 262 + 263 + - [ ] `/.well-known/oauth-authorization-server` returns valid metadata 264 + - [ ] `/oauth/authorize` renders authorization UI and handles consent 265 + - [ ] `/oauth/token` issues DPoP-bound access + refresh tokens 266 + - [ ] `/oauth/par` accepts pushed authorization requests 267 + - [ ] `/oauth/jwks` returns current signing keys 268 + - [ ] PKCE (S256) enforced on all flows 269 + - [ ] DPoP proof validated on every token request 270 + - [ ] Refresh token rotation (single-use) working 271 + - [ ] Bluesky app can complete full OAuth flow 272 + - [ ] Bluesky app can create a post via XRPC after OAuth 273 + - [ ] Token bound to correct account/DID 274 + 275 + --- 276 + 277 + ## 10. Design Decisions 278 + 279 + | Decision | Rationale | Alternatives Considered | 280 + |----------|-----------|------------------------| 281 + | Embed `atproto-oauth-axum` in relay process | Simplest deployment for solo dev. One process, one DB. | Sidecar (graze-social/aip) — more complex ops. 
| 282 + | SQLite for OAuth storage in v1.0 | Matches relay's existing storage. No additional infra. | PostgreSQL from day one — overkill for early users. | 283 + | Minimal server-rendered auth UI | OAuth authorization screen is visited rarely. No SPA needed. | Full React SPA — unnecessary complexity. | 284 + | Use existing crates, don't build OAuth | ATProto OAuth is complex (DPoP, PAR, PKCE, dynamic registration). Building from scratch is months of work. | Build custom — slower, more bugs, no community fixes. |
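The checklist items "Refresh token rotation (single-use)" and "Tokens bound to DPoP key and client_id" can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the behavior or API of `atproto-oauth-axum`: `RefreshStore`, `Grant` and `RefreshError` are hypothetical names, a `HashMap` stands in for the `oauth_refresh_tokens` table, and token strings are passed in by the caller, whereas a real server would generate them from a cryptographically secure random source.

```rust
use std::collections::HashMap;

/// Reasons a refresh attempt is rejected.
#[derive(Debug, PartialEq)]
pub enum RefreshError {
    /// Token was never issued, or was already rotated away.
    UnknownOrUsed,
    /// Token was presented by a different client than it was issued to.
    WrongClient,
    /// DPoP key thumbprint does not match the one the token is bound to.
    WrongDpopKey,
}

/// What a live refresh token is bound to.
struct Grant {
    client_id: String,
    dpop_jkt: String,
}

/// In-memory stand-in for the `oauth_refresh_tokens` table.
#[derive(Default)]
pub struct RefreshStore {
    live: HashMap<String, Grant>,
}

impl RefreshStore {
    /// Records a newly issued refresh token and its bindings.
    pub fn issue(&mut self, token: &str, client_id: &str, dpop_jkt: &str) {
        let grant = Grant {
            client_id: client_id.to_string(),
            dpop_jkt: dpop_jkt.to_string(),
        };
        self.live.insert(token.to_string(), grant);
    }

    /// Exchanges `presented` for `replacement`. On success the old token is
    /// deleted, so presenting it a second time fails.
    pub fn rotate(
        &mut self,
        presented: &str,
        client_id: &str,
        dpop_jkt: &str,
        replacement: &str,
    ) -> Result<(), RefreshError> {
        let grant = self.live.get(presented).ok_or(RefreshError::UnknownOrUsed)?;
        if grant.client_id != client_id {
            return Err(RefreshError::WrongClient);
        }
        if grant.dpop_jkt != dpop_jkt {
            return Err(RefreshError::WrongDpopKey);
        }
        self.live.remove(presented);
        self.issue(replacement, client_id, dpop_jkt);
        Ok(())
    }
}
```

In this model a failed binding check leaves the presented token untouched. Whether a mismatch should instead revoke the whole grant is a policy choice that this spec does not state.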
+481
docs/pds-architecture.md
··· 1 + # Desktop PDS — System Architecture 2 + 3 + **v8 — Mobile-First Reconciliation · Four-Phase Milestones** 4 + 5 + Sovereign AT Protocol PDS on macOS via Tauri + Repo Engine + Iroh 6 + 7 + ## Milestone Legend 8 + 9 + **v0.1** — Mobile-Only PDS · relay is full PDS 10 + 11 + **v0.2** — Desktop Enrollment · relay as proxy+signer 12 + 13 + **v1.0** — Public Launch · product-ready 14 + 15 + **LATER** — Designed, built post-launch 16 + 17 + --- 18 + 19 + ## Changelog 20 + 21 + ### v8 Changes — Mobile-First Reconciliation 22 + 23 + Architecture reconciled with mobile architecture spec v1.2 (canonical). The relay is no longer just a tunnel+proxy — in the mobile-only phase, it IS the PDS. 24 + 25 + - **NEW** Four-phase milestone model (v0.1 / v0.2 / v1.0 / v2.0+) 26 + - **NEW** Phase 0: Mobile-Only lifecycle (relay as full PDS) 27 + - **FIX** Signing model: relay always signs, device constructs unsigned commits 28 + - **FIX** Tier model: Free/Pro/Business + BYO as deployment model 29 + - **FIX** Firehose: native emission (mobile-only) vs proxy (desktop-enrolled) 30 + - **FIX** Shamir: basic share generation moves to v0.1 (required at onboarding) 31 + - **FIX** DID keystore: Shamir split required at account creation, not v1.0 32 + - **REF** See unified-milestone-map.md for phase details 33 + 34 + ### Previous Versions 35 + 36 + v2: Corrected relay model (outbound only). v3: Shamir key recovery + repo snapshots. v4: Conformance strategy. v5: Custom PDS shell + atrium/rsky deps. v6: GeoDNS + BYO relay. v7: Milestone scoping + runtime threats. 37 + 38 + --- 39 + 40 + ## Device Lifecycle Phases 41 + 42 + The product launches mobile-first. The relay is a full PDS before any desktop is involved. 
43 + 44 + **Phase: Mobile-Only (v0.1)** 45 + - Relay behavior: Full PDS — hosts repo, serves XRPC, signs commits, emits firehose 46 + - Repo location: Relay (primary and only copy) 47 + - Phone role: Identity wallet (key management, device admin) 48 + - Desktop: Does not exist yet 49 + 50 + **Phase: Desktop-Enrolled (v0.2)** 51 + - Relay behavior: XRPC proxy + signer — forwards writes to desktop, signs commits, serves reads from cache 52 + - Repo location: Desktop (primary), relay (cache) 53 + - Phone role: Identity wallet + device manager 54 + - Desktop: Runs repo engine, constructs unsigned commits 55 + 56 + **Phase: Desktop-Offline (v0.2+)** 57 + - Relay behavior: Serves reads from cache, 503 on writes 58 + - Repo location: Desktop (authoritative but unreachable) 59 + - Phone role: Same as desktop-enrolled 60 + - Desktop: Sleeping / powered off 61 + 62 + --- 63 + 64 + ## Layer 01 — Device Layer (Desktop, v0.2+) 65 + 66 + *In v0.1 (Mobile-Only), there is no device layer. All operations run on the relay.* 67 + 68 + ### Tauri Shell 69 + 70 + **v0.2** 71 + 72 + 🖥️ Native macOS app. Process lifecycle, auto-updates, system tray. Minimal IPC allowlist — webview cannot access filesystem, shell, or network directly. 73 + 74 + ### Repo Engine 75 + 76 + **v0.2** 77 + 78 + 📦 Purpose-built repo construction engine. SQLite-backed, local-first. Builds MST structures, constructs unsigned commits, manages collection storage. In desktop-enrolled mode, the relay proxies XRPC writes here, then signs the resulting commits. Does not serve XRPC directly to the network — the relay is always the network-facing endpoint. 79 + 80 + ### Dependency Stack 81 + 82 + **v0.1** 83 + 84 + 🧩 **atrium-api** — XRPC types, lexicon defs (auto-generated). **atrium-repo** — MST read/write, CAR export. **rsky-crypto** — P-256/K-256 commit signing. 85 + 86 + ### Iroh Endpoint 87 + 88 + **v0.1** 89 + 90 + 🔗 QUIC-based tunnel to relay. NAT traversal, connection resumption on wake. 
Pushes unsigned repo commits to relay for signing when online. 91 + 92 + ### DID Keystore 93 + 94 + **v0.1** 95 + 96 + 🔐 Signing keys in macOS Keychain (desktop) / Secure Enclave (phone). At account creation, root rotation key is split via 2-of-3 Shamir: Share 1 = iCloud Keychain, Share 2 = relay escrow, Share 3 = user's choice (device-local or BIP-39 paper backup). Basic key management for v0.1. Full recovery UI in v1.0. 97 + 98 + ### Recovery Share Manager 99 + 100 + **v1.0** 101 + 102 + 🛟 Full UI for Shamir share management and recovery ceremony. View share status, rotate shares, initiate recovery from device loss. Note: basic share GENERATION happens at v0.1 (during account creation). This component adds the management and recovery interface. 103 + 104 + ### Compat Warning Banner 105 + 106 + **v1.0** 107 + 108 + ⚠️ Non-blocking in-app warning when spec drift detected. Links to update. Never blocks launch. 109 + 110 + ### XRPC Hardening 111 + 112 + **v1.0** 113 + 114 + 🛡️ Request size limits on all endpoints. Rate limiting at relay. cargo-fuzz targets for CBOR/CAR/MST parsing paths. Adversarial MST key distribution testing per ATProto spec guidance. 115 + 116 + *REMOVED: rsky-pds (fork) — Replaced by repo engine in v5. Now tracking a spec, not a codebase.* 117 + 118 + --- 119 + 120 + ## Layer 02 — Relay Layer (Managed + BYO) 121 + 122 + ### Managed Relay (Your Infrastructure) 123 + 124 + #### Iroh Relay Node 125 + 126 + **v0.1** 127 + 128 + 🚇 Single-region for v0.1. Always-on tunnel endpoint. In mobile-only mode, serves as full PDS — no tunnel needed, relay handles all XRPC directly. In desktop-enrolled mode, acts as tunnel endpoint for device ↔ relay communication. Receives unsigned commits from device, constructs signed commits, proxies XRPC repo reads. 129 + 130 + #### requestCrawl Trigger 131 + 132 + **v0.1** 133 + 134 + 📣 On device reconnect, pings BGS requestCrawl so new content propagates immediately. 
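The requestCrawl trigger amounts to one HTTP POST to the BGS. The sketch below only builds the request and leaves sending it to whatever HTTP client the relay uses. The XRPC method name `com.atproto.sync.requestCrawl` and its `hostname` input field come from the ATProto lexicon; `CrawlRequest`, `build_request_crawl` and the example hostnames are hypothetical, and the hostname check is a deliberately strict assumption so that the JSON body needs no escaping.

```rust
/// A request-crawl call, ready to hand to an HTTP client as a JSON POST.
#[derive(Debug)]
pub struct CrawlRequest {
    pub url: String,
    pub body: String,
}

/// Accepts only letters, digits, dots and hyphens, so the value can be
/// embedded in JSON without escaping.
fn is_plain_hostname(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '.' || c == '-')
}

/// Builds the POST that asks `bgs_host` to crawl the PDS at `pds_hostname`.
/// Returns `None` if either hostname contains unexpected characters.
pub fn build_request_crawl(bgs_host: &str, pds_hostname: &str) -> Option<CrawlRequest> {
    if !is_plain_hostname(bgs_host) || !is_plain_hostname(pds_hostname) {
        return None;
    }
    Some(CrawlRequest {
        url: format!("https://{bgs_host}/xrpc/com.atproto.sync.requestCrawl"),
        body: format!("{{\"hostname\":\"{pds_hostname}\"}}"),
    })
}
```

The relay would call this with its own public hostname (the DID document's service endpoint) each time a device reconnects, and send the body with `Content-Type: application/json`.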
135 + 136 + #### Firehose Emitter 137 + 138 + **v0.1** 139 + 140 + 📡 Native com.atproto.sync.subscribeRepos WebSocket endpoint. Required for federation — every PDS must emit a firehose. In mobile-only mode, the relay is the PDS and emits directly. In desktop-enrolled mode, emits commits as they're signed. 141 + 142 + #### Firehose Proxy 143 + 144 + **v0.2** 145 + 146 + 📡 Maintains persistent BGS WebSocket on behalf of sleeping desktop. Replays commits from buffer when desktop reconnects. Ensures BGS sees continuous uptime even when desktop is offline. Desktop-enrolled feature — not applicable in mobile-only mode. Pro/Business tier on managed relay. 147 + 148 + #### Commit Buffer 149 + 150 + **v0.2** 151 + 152 + 💾 Rolling log of signed repo commits. Feeds the firehose proxy while the desktop is offline. Tiered retention: 7d Free, 30d Pro, 90d Business. 153 + 154 + #### Provisioning API 155 + 156 + **v0.1** 157 + 158 + ⚙️ Account setup, domain linking, relay config. Onboarding flow for new connections. Core provisioning needed from day one. 159 + 160 + #### Key Share Escrow 161 + 162 + **v0.1** 163 + 164 + 🔏 Holds one encrypted Shamir share. Cannot reconstruct alone. Encrypted at rest, access-logged. Relay holds Share 2 from account creation. 165 + 166 + #### Health Monitor 167 + 168 + **v1.0** 169 + 170 + 💓 Device liveness, relay uptime, ATProto spec compat. Includes canary account for silent federation failure detection. 171 + 172 + #### GeoDNS Multi-Region 173 + 174 + **LATER** 175 + 176 + 🌎 2–3 relay nodes, route to nearest healthy. Simple failover with brief firehose gap. Cross-region replication interface designed but not built. 177 + 178 + #### Repo Snapshot 179 + 180 + **LATER** 181 + 182 + 🗄️ Full repo backup on relay. Incremental from commit buffer. Pro+ feature. Enables one-click device migration. 183 + 184 + #### CDN / Public Cache 185 + 186 + **LATER** 187 + 188 + 🌐 Serves public repo content during offline windows. 
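The Commit Buffer's tiered retention (7, 30 and 90 days) described earlier in this layer can be modeled as a rolling log that drops entries older than the tier's window. The sketch below is illustrative only: `Tier`, `BufferedCommit` and `CommitBuffer` are hypothetical names, the 30-day window is assumed to belong to the Pro tier, commits are reduced to a sequence number, and timestamps are plain Unix seconds supplied by the caller.

```rust
use std::collections::VecDeque;

const SECONDS_PER_DAY: u64 = 86_400;

/// Managed-relay tiers; the 30-day window is assumed to be the Pro tier.
#[derive(Debug, Clone, Copy)]
pub enum Tier {
    Free,
    Pro,
    Business,
}

impl Tier {
    /// Retention window in days for the commit buffer.
    pub fn retention_days(self) -> u64 {
        match self {
            Tier::Free => 7,
            Tier::Pro => 30,
            Tier::Business => 90,
        }
    }
}

/// A signed commit reduced to what pruning needs.
pub struct BufferedCommit {
    pub seq: u64,
    pub stored_at_unix: u64,
}

/// Rolling log of signed commits, oldest first.
pub struct CommitBuffer {
    tier: Tier,
    log: VecDeque<BufferedCommit>,
}

impl CommitBuffer {
    pub fn new(tier: Tier) -> Self {
        CommitBuffer { tier, log: VecDeque::new() }
    }

    /// Appends a commit, then drops anything outside the retention window.
    pub fn push(&mut self, seq: u64, now_unix: u64) {
        self.log.push_back(BufferedCommit { seq, stored_at_unix: now_unix });
        self.prune(now_unix);
    }

    /// Removes commits stored before `now - retention`.
    pub fn prune(&mut self, now_unix: u64) {
        let window = self.tier.retention_days() * SECONDS_PER_DAY;
        let cutoff = now_unix.saturating_sub(window);
        while self.log.front().map_or(false, |c| c.stored_at_unix < cutoff) {
            self.log.pop_front();
        }
    }

    pub fn len(&self) -> usize {
        self.log.len()
    }

    /// Sequence number of the oldest retained commit, if any.
    pub fn oldest_seq(&self) -> Option<u64> {
        self.log.front().map(|c| c.seq)
    }
}
```

Pruning from the front is sufficient here because commits are appended in time order. A SQLite-backed buffer would express the same rule as a delete of all rows older than the cutoff.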
189 + 190 + ### BYO Relay (User-Hosted, Free) 191 + 192 + #### Relay Binary 193 + 194 + **v1.0** 195 + 196 + 📦 Open-source. Nix flake (source of truth) → Docker image + NixOS module. Tunnel + commit forwarding + requestCrawl. No firehose proxy, no snapshots. 197 + 198 + #### Device-Relay Protocol Spec 199 + 200 + **v1.0** 201 + 202 + 📜 Documented contract: handshake/auth, commit push, health ping, optional feature negotiation. Includes commit ack for future trust verification. 203 + 204 + #### Feature Negotiation 205 + 206 + **v1.0** 207 + 208 + 🔌 App queries relay capabilities on connect. Gracefully degrades when extended features unavailable. Suggests upgrade for missing features. 209 + 210 + *REMOVED: Inbound Message Queue — Not needed. ATProto records live in author's repo.* 211 + 212 + --- 213 + 214 + ## Data Flow 215 + 216 + ### Desktop-Enrolled Write Path 217 + 218 + 1. App creates record via XRPC (Tauri webview → Rust backend) 219 + 2. Repo Engine constructs MST diff + unsigned commit 220 + 3. Unsigned commit sent to relay via Iroh tunnel 221 + 4. Relay signs commit with P-256 signing key 222 + 5. Relay stores signed commit in buffer 223 + 6. Relay emits to firehose / serves via XRPC 224 + 225 + ### Mobile-Only Write Path (v0.1) 226 + 227 + 1. Third-party app (e.g. Bluesky) calls relay XRPC directly 228 + 2. Relay constructs record, MST diff, signs commit 229 + 3. Relay stores and emits to firehose 230 + 231 + ### Desktop-Offline Read Path 232 + 233 + 1. XRPC read request hits relay 234 + 2. Relay serves from commit buffer / repo cache 235 + 3. 
Writes return 503 (relay cannot construct commits from stale state) 236 + 237 + --- 238 + 239 + ## Data Flow — How Bob's Post Reaches the Network 240 + 241 + *Current scenario showing firehose proxy operation during v0.2+:* 242 + 243 + **Bob's Mac** (Repo Engine constructs unsigned commit) → **Iroh tunnel** → **Relay** (Signs commit with P-256 key) → **Commit Buffer** (Persists signed commit) → **Firehose Proxy** (Stable WebSocket) → **subscribeRepos** → **BGS** (Network firehose) → **indexes** → **AppView** (Bluesky etc.) 244 + 245 + *If Bob's Mac is asleep → relay serves reads from cache, returns 503 on writes. Commit buffer feeds firehose proxy from stored commits on wake.* 246 + 247 + --- 248 + 249 + ## Recovery Flow — Device Migration / Dead SSD (v1.0+) 250 + 251 + **New Mac** (Installs Tauri app) → **authenticates** → **2-of-3 Shares** (iCloud + relay escrow) → **reconstructs** → **Rotation Key** (Shamir recombination) → **did:plc op** → **Key Rotation** (New signing key) → **syncs** → **Repo Snapshot** (Full repo restore) 252 + 253 + **Share sources (any 2 of 3):** 254 + - ① iCloud Keychain 255 + - ② Relay escrow 256 + - ③ Exported recovery file 257 + - Future: ④ Trusted contact (social recovery) — interface designed, not yet shipped 258 + 259 + --- 260 + 261 + ## Layer 03 — Infrastructure (ATProto Network) 262 + 263 + ### Federation 264 + 265 + **v0.1** 266 + 267 + 🌍 PDS participates in ATProto network via relay. DID document points to relay URL as canonical PDS endpoint. 268 + 269 + ### DNS / Domain Automation 270 + 271 + **v1.0** 272 + 273 + 🔤 Handle-as-domain resolution. Automated DNS config for custom domain handles. 274 + 275 + ### DID Resolution 276 + 277 + **v0.1** 278 + 279 + 🪪 did:plc or did:web pointing to relay endpoint. Relay always reachable, DID resolution never fails due to offline device. 
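The "any 2 of 3" share rule in the recovery flow above is the defining property of a 2-of-3 Shamir split. The following toy sketch shows the arithmetic only. The prime field, the integer encoding of the key, and the function names `split_2_of_3` and `combine` are assumptions; the spec does not define them, and a real deployment would use a vetted library and encrypt each share before storing it.

```python
import secrets

# Assumption: a Mersenne prime comfortably larger than a 256-bit key.
PRIME = 2**521 - 1


def split_2_of_3(secret: int) -> list[tuple[int, int]]:
    """Split `secret` into three shares such that any two recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range for the field")
    slope = secrets.randbelow(PRIME)  # random degree-1 coefficient
    # Share x is the point (x, secret + slope * x mod PRIME) on a random line.
    return [(x, (secret + slope * x) % PRIME) for x in (1, 2, 3)]


def combine(share_a: tuple[int, int], share_b: tuple[int, int]) -> int:
    """Recover the secret by interpolating the line at x = 0."""
    (xa, ya), (xb, yb) = share_a, share_b
    if xa == xb:
        raise ValueError("need two distinct shares")
    # f(0) = (ya * xb - yb * xa) / (xb - xa)  (mod PRIME)
    inverse = pow(xb - xa, -1, PRIME)
    return ((ya * xb - yb * xa) * inverse) % PRIME
```

In the document's terms, the three points would map to ① iCloud Keychain, ② relay escrow, and ③ the exported recovery file. A single share is one point on a line with an unknown slope, so on its own it reveals nothing about the secret, which is why the relay's escrowed share "cannot reconstruct alone".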
280 + 281 + --- 282 + 283 + ## Layer 04 — Ops (Security, Conformance & Updates) 284 + 285 + ### Update & Supply Chain Security 286 + 287 + #### 2-of-3 Threshold Signing 288 + 289 + **v1.0** 290 + 291 + 🔑 CI key + offline engineer key + cold storage. Compromised CI alone cannot ship malicious updates. 292 + 293 + #### Transparency Log 294 + 295 + **LATER** 296 + 297 + 📋 Sigstore-backed. Every release publicly logged. 298 + 299 + #### Apple Notarization 300 + 301 + **v1.0** 302 + 303 + 🍎 First verification layer via Tauri build pipeline. 304 + 305 + #### Responsible Disclosure 306 + 307 + **v1.0** 308 + 309 + 📬 security@ + published PGP key from day one. 310 + 311 + #### cargo-audit in CI 312 + 313 + **v0.1** 314 + 315 + 📦 Dependency vulnerability scanning on every build. Pin exact versions in Cargo.lock. Review diffs on dep updates. Verify atrium codegen input against upstream lexicons. 316 + 317 + ### Conformance Testing 318 + 319 + #### L1: Interop Test Vectors 320 + 321 + **v0.1** 322 + 323 + 🧪 Every commit. Official atproto-interop-tests + interop-test-files. Byte-level checks for MST, CAR, CBOR, CID, commit proofs. Strict MST validation. 324 + 325 + #### L2: Oracle Compat Suite 326 + 327 + **v1.0** 328 + 329 + 🔬 Nightly CI. Docker Compose: reference TypeScript PDS vs your Rust PDS. Compare CAR output, firehose events, MST roots. 330 + 331 + #### L3: Production Canary 332 + 333 + **v1.0** 334 + 335 + 🐤 Live account on real Bluesky via your relay. Health monitor verifies posts appear in AppView. Catches silent federation failures. 336 + 337 + ### Runtime Threat Mitigations 338 + 339 + #### XRPC Input Hardening 340 + 341 + **v1.0** 342 + 343 + 🔒 Request size limits per endpoint. Rate limiting at relay layer. cargo-fuzz targets for CBOR/CAR/MST parsing. Adversarial MST key testing per spec DoS guidance. 344 + 345 + #### Tauri IPC Lockdown 346 + 347 + **v0.1** 348 + 349 + 🏗️ Minimal allowlist: create/list/get records + status. 
Webview cannot access filesystem, shell, HTTP, or crypto. All sensitive ops in Rust backend only. 350 + 351 + #### Relay Trust Verification 352 + 353 + **LATER** 354 + 355 + 🤝 Device verifies commits appear in firehose. Protocol designed now (commit ack with seq number), verification logic built later. Protects against censorship by relay. 356 + 357 + --- 358 + 359 + ## Relay Tier Pricing — v1.0 Launch 360 + 361 + ### Free 362 + 363 + **$0/mo** 364 + 365 + - Iroh tunnel (NAT traversal) 366 + - Basic XRPC proxy 367 + - 7-day commit buffer 368 + - Key share escrow (1 share) 369 + - Apple notarized updates 370 + - No firehose proxy — BGS drops on sleep 371 + - See "BYO Relay" section for self-hosted option 372 + 373 + ### Pro 374 + 375 + **$X/mo** 376 + 377 + - Everything in Free 378 + - Stable firehose proxy (always-on WebSocket) 379 + - 30-day commit buffer 380 + - CDN cache for public content 381 + - requestCrawl auto-trigger 382 + - Custom domain handle 383 + - Multi-region GeoDNS (post-launch) 384 + - Full repo snapshot (post-launch) 385 + - One-click device migration 386 + 387 + ### Business 388 + 389 + **$XX/mo** 390 + 391 + - Everything in Pro 392 + - 90-day commit buffer 393 + - Continuous repo snapshot (post-launch) 394 + - Admin dashboard 395 + - Priority support 396 + - Custom relay config 397 + - Audit logs 398 + 399 + ### BYO Relay (Self-Hosted) 400 + 401 + Not a subscription tier — an alternative deployment model. Operators run their own relay binary (distributed via Nix flake or Docker image). 402 + 403 + **Includes:** 404 + - Full relay functionality (identical binary to managed relay) 405 + - SQLite or PostgreSQL backend (operator's choice) 406 + - Local or S3-compatible blob storage 407 + - No subscription fees — operator provides their own infrastructure 408 + - No managed monitoring or support 409 + 410 + Available at v1.0 launch. 
411 + 412 + --- 413 + 414 + ## All Questions Resolved 415 + 416 + ### ✅ Availability — v0.1+ 417 + 418 + Firehose emitter for native federation. Firehose proxy + commit buffer for v0.2+. In mobile-only phase, relay is the PDS. In desktop-enrolled, relay maintains persistent connection for sleeping device. 419 + 420 + ### ✅ Durability — v1.0+ 421 + 422 + 2-of-3 Shamir key recovery. Tiered repo snapshots. iCloud + file export + relay escrow. 423 + 424 + ### ✅ Spec Drift — v0.1+ 425 + 426 + Repo Engine (atrium + rsky-crypto). 3-layer conformance: interop vectors → oracle → canary. 427 + 428 + ### ✅ Relay Redundancy — v1.0+ 429 + 430 + GeoDNS multi-region + BYO relay (Nix/Docker, free). Device-relay protocol spec. Feature negotiation. 431 + 432 + ### ✅ Runtime Threats — v1.0+ 433 + 434 + XRPC fuzzing + size limits. Tauri IPC lockdown. Commit ack protocol (designed). cargo-audit. Relay trust verification (designed, deferred). 435 + 436 + ### ✅ Mobile-First Architecture — v8 437 + 438 + Relay is full PDS in v0.1. Desktop enrolls in v0.2 as repo construction engine. Four-phase milestones reconcile mobile and desktop workflows. 439 + 440 + --- 441 + 442 + ## Milestone Summary — Four Phases 443 + 444 + ### v0.1 — Mobile-Only PDS (~3–4 months) 445 + 446 + **Goal:** User creates ATProto identity from iPhone, logs into Bluesky. 447 + Relay is a full PDS. No desktop involved. 448 + 449 + **Relay:** Axum + SQLite + repo engine + signing + XRPC + firehose emitter 450 + **OAuth:** atproto-oauth-axum integration (blocks Bluesky login) 451 + **Blobs:** upload/serve with local storage 452 + **Identity:** DID creation + Shamir split at onboarding 453 + **Federation:** 25 XRPC endpoints (see unified-milestone-map.md §2.1) 454 + **Testing:** L1 interop tests + cargo-audit 455 + 456 + ### v0.2 — Desktop Enrollment (~2–3 months) 457 + 458 + **Goal:** User pairs desktop Mac, relay becomes proxy+signer. 
459 + 460 + **Device pairing:** via QR code + desktop promotion 461 + **XRPC write proxying:** relay → desktop → relay signs 462 + **Firehose proxy:** for sleeping desktop 463 + **Blob forwarding:** via Iroh 464 + **Desktop offline:** → 503 on writes, reads from cache 465 + 466 + ### v1.0 — Production Launch (~3–4 months) 467 + 468 + **Goal:** Production-ready product with recovery and self-hosting. 469 + 470 + **Shamir recovery ceremony:** + full share management UI 471 + **Tier pricing:** Free/Pro/Business 472 + **BYO relay binary:** Nix/Docker 473 + **S3 blob backend:** + CDN 474 + **PostgreSQL option:** for scale 475 + **L2 oracle suite + L3 canary** 476 + **XRPC hardening:** + rate limiting 477 + 478 + ### v2.0+ — Signing Sovereignty (TBD) 479 + 480 + **Goal:** User's hardware signs commits directly. 481 + **Contingent on:** ATProto protocol evolution (multi-key support).
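The write-path rules from the milestones above (relay constructs and signs in mobile-only mode, relay forwards to the desktop in v0.2, and 503 on writes with reads from cache while the desktop is offline) can be summarized as a small routing table. This is a sketch under stated assumptions: the `Mode` enum, the `route` function, and the handler names are hypothetical, and writes are identified by HTTP method because XRPC procedures use POST and queries use GET.

```python
from enum import Enum


class Mode(Enum):
    MOBILE_ONLY = "mobile_only"          # v0.1: the relay is the PDS
    DESKTOP_ONLINE = "desktop_online"    # v0.2: desktop builds commits, relay signs
    DESKTOP_OFFLINE = "desktop_offline"  # v0.2: desktop asleep or unreachable


def route(method: str, mode: Mode) -> tuple[int, str]:
    """Return (HTTP status, hypothetical handler name) for an XRPC request."""
    is_write = method.upper() == "POST"
    if not is_write:
        # Reads stay answerable; offline they come from the buffer / repo cache.
        handler = "serve_from_cache" if mode is Mode.DESKTOP_OFFLINE else "serve"
        return (200, handler)
    if mode is Mode.MOBILE_ONLY:
        return (200, "construct_and_sign")
    if mode is Mode.DESKTOP_ONLINE:
        return (200, "forward_to_desktop_then_sign")
    # Desktop offline: the relay cannot construct commits from stale state.
    return (503, "unavailable")
```

For example, `route("POST", Mode.DESKTOP_OFFLINE)` yields a 503, while a GET in the same mode is still served from cache.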
+1200
docs/provisioning-api-spec.md
··· 1 + **Provisioning API** 2 + 3 + Design Specification 4 + 5 + Desktop PDS Relay Layer 6 + 7 + Version 0.3 --- Mobile-First Reconciliation 8 + 9 + March 2026 10 + 11 + *v0.3 Changes — Mobile-First Reconciliation + Endpoint Consolidation* 12 + 13 + *Reconciled with mobile architecture spec v1.2 (canonical) and migration spec v0.1.* 14 + *All endpoints tagged with milestone phase.* 15 + 16 + *FIX Ed25519 → P-256/secp256k1 throughout (ATProto requirement)* 17 + *FIX DID ceremony: client sends key material, relay constructs DID doc* 18 + *FIX Tier model: Free/Pro/Business + BYO as deployment model* 19 + *NEW POST /v1/accounts/mobile — combined mobile account creation* 20 + *NEW 9 endpoints from mobile spec (relay keys, device mgmt, signing)* 21 + *NEW 8 endpoints from migration spec (transfer, recovery, Shamir)* 22 + *NEW Blob endpoints (uploadBlob, getBlob, listBlobs)* 23 + *NEW Milestone tags on all endpoint groups* 24 + 25 + **CONFIDENTIAL** 26 + 27 + 1\. Overview 28 + 29 + This document specifies the provisioning API for the Desktop PDS relay layer. The API orchestrates five flows: account creation, device binding, DID ceremony, handle/domain setup, and account exit. Two client types consume the API: the web dashboard (account lifecycle, billing) and the Tauri desktop app (device binding, runtime operations). 30 + 31 + All endpoints are served over HTTPS at the relay's base URL. The API follows REST conventions with JSON request/response bodies. Authentication uses Bearer tokens with three scopes: account, session, and device. 32 + 33 + 1.1 Base URL 34 + 35 + > https://relay.{service-domain}/v1 36 + 37 + 1.2 Identity Model 38 + 39 + The relay supports two DID methods. The choice is made during the setup wizard and cannot be changed after the DID ceremony completes. 
40 + 41 + ------------ ------------- ----------------------------------- --------------------------------------------------------------------------------------------- 42 + **Method** **Default** **Requirement** **Exit Story** 43 + did:plc Yes Available to all tiers User signs a PLC operation to repoint service endpoint. Clean exit, zero ongoing liability. 44 + did:web No Custom domain required (Pro tier) User controls the domain, repoints DNS at new PDS. Zero ongoing liability. 45 + ------------ ------------- ----------------------------------- --------------------------------------------------------------------------------------------- 46 + 47 + > ***Design Decision:** did:web is only available to users who bring their own domain. Subdomain-based did:web is not offered because it creates exit liability --- the relay would be obligated to host the DID document indefinitely after the user leaves.* 48 + 49 + 1.3 Authentication Model 50 + 51 + The API uses three token types, each scoped to a specific client and lifetime: 52 + 53 + ---------------- --------------------- ---------------------------- -------------------------------------------- 54 + **Token** **Issued To** **Lifetime** **Scope** 55 + session\_token Web dashboard 24 hours (renewable) Account management, billing, handle config 56 + device\_token Tauri app Long-lived (until revoked) Relay connection, DID ops, sync 57 + claim\_code Web → Tauri handoff 15 minutes (single-use) Device registration only 58 + ---------------- --------------------- ---------------------------- -------------------------------------------- 59 + 60 + All authenticated requests must include the token in the Authorization header: 61 + 62 + > Authorization: Bearer {token} 63 + > 64 + > ***Design Decision:** All users authenticate through the web dashboard, including self-hosted relay operators. There is no static\_token bypass. 
One auth path means one security model to audit.* 65 + 66 + 1.4 Rate Limiting 67 + 68 + All endpoints are rate-limited per token. Free-tier accounts have stricter limits. When a limit is hit, the API returns 429 Too Many Requests with a Retry-After header. 69 + 70 + ------------- ------------ ----------- --------------- 71 + **Tier** **Writes** **Reads** **Burst** 72 + Free 30/min 120/min 5 concurrent 73 + Pro 120/min 600/min 20 concurrent 74 + Business 300/min 1200/min 50 concurrent 75 + ------------- ------------ ----------- --------------- 76 + 77 + Note: BYO relay operators configure their own rate limits. BYO is a deployment model, not a subscription tier. See §6.3 for BYO relay configuration. 78 + 79 + 1.5 Error Envelope 80 + 81 + All error responses use a consistent JSON envelope: 82 + 83 + > { 84 + > 85 + > \"error\": { 86 + > 87 + > \"code\": \"ACCOUNT\_EXISTS\", 88 + > 89 + > \"message\": \"An account with this email already exists.\", 90 + > 91 + > \"details\": { \... } // optional, endpoint-specific 92 + > 93 + > } 94 + > 95 + > } 96 + 97 + 2\. Account Lifecycle 98 + 99 + Account endpoints are consumed by the web dashboard. They handle signup, authentication, and account management. Accounts start on the free tier with usage caps enforced at the relay level. 100 + 101 + **POST /v1/accounts** [v0.1] 102 + 103 + Create a new account. Returns session credentials and a one-time claim code for device binding. 
104 + 105 + **Request Body** 106 + 107 + --------------- ---------- -------------- ----------------------- 108 + **Field** **Type** **Required** **Description** 109 + email string yes User's email address 110 + password string yes Minimum 12 characters 111 + display\_name string no Optional display name 112 + --------------- ---------- -------------- ----------------------- 113 + 114 + **Response (200 OK)** 115 + 116 + ---------------- ---------- -------------------------------------------- 117 + **Field** **Type** **Description** 118 + account\_id string UUID v7 account identifier 119 + session\_token string JWT, 24-hour expiry 120 + claim\_code string 6-character alphanumeric, 15-minute expiry 121 + tier string Always \"free\" on creation 122 + ---------------- ---------- -------------------------------------------- 123 + 124 + **Error Responses** 125 + 126 + ------------ ----------------- --------------------------------------- 127 + **Status** **Code** **Description** 128 + 409 ACCOUNT\_EXISTS Email already registered 129 + 422 WEAK\_PASSWORD Password doesn't meet requirements 130 + 429 RATE\_LIMITED Too many signup attempts from this IP 131 + ------------ ----------------- --------------------------------------- 132 + 133 + > ***Note:** The claim\_code is displayed in the web dashboard for the user to paste into their Tauri app. It is single-use and expires after 15 minutes. A new one can be generated via POST /v1/accounts/claim-codes.* 134 + 135 + **POST /v1/accounts/mobile** [v0.1] 136 + 137 + Combined account creation for mobile clients. Creates the account, binds the device, generates the relay signing key, and initiates the DID ceremony in one request. Replaces the multi-step web dashboard flow for iOS users. 
138 + 139 + **Request Body** 140 + 141 + ----------------------- ---------- -------------- ------------------------------------------------------- 142 + **Field** **Type** **Required** **Description** 143 + email string yes User's email address 144 + password string yes Minimum 12 characters 145 + display_name string no Optional display name 146 + device_public_key string yes P-256 public key from Secure Enclave 147 + device_name string no e.g. "iPhone 15 Pro" 148 + rotation_pub_key string yes P-256 rotation key (stays on device) 149 + handle string no Desired handle (subdomain assigned if omitted) 150 + did_method string no "did:plc" (default) or "did:web" 151 + ----------------------- ---------- -------------- ------------------------------------------------------- 152 + 153 + **Response (200 OK)** 154 + 155 + ----------------------- ---------- ------------------------------------------------------- 156 + **Field** **Type** **Description** 157 + account_id string UUID v7 account identifier 158 + device_id string UUID v7 device identifier 159 + device_token string Long-lived opaque token 160 + session_token string JWT, 24-hour expiry 161 + did string Fully qualified DID string 162 + did_document object The constructed DID document 163 + handle string Assigned handle 164 + relay_endpoint object Relay Endpoint Object (see §3.2) 165 + relay_signing_key string Key ID of the relay's signing key 166 + tier string Always "free" on creation 167 + shamir_share_1 string Encrypted share for iCloud Keychain storage 168 + shamir_share_3_options array Available storage methods for Share 3 169 + ----------------------- ---------- ------------------------------------------------------- 170 + 171 + **Error Responses** 172 + 173 + ------------ ----------------------- ------------------------------------------------------- 174 + **Status** **Code** **Description** 175 + 409 ACCOUNT_EXISTS Email already registered 176 + 422 WEAK_PASSWORD Password doesn't meet requirements 177 + 422 
INVALID_KEY Public key is malformed or unsupported curve 178 + 409 HANDLE_TAKEN Requested handle is already in use 179 + 429 RATE_LIMITED Too many signup attempts 180 + ------------ ----------------------- ------------------------------------------------------- 181 + 182 + Note: This endpoint performs Shamir share generation as part of account creation. Share 1 is returned in the response for the client to store in iCloud Keychain. Share 2 is escrowed at the relay. Share 3 handling depends on user choice (communicated in a follow-up call or during onboarding flow). 183 + 184 + **POST /v1/accounts/sessions** [v0.1] 185 + 186 + Authenticate and obtain a session token. Supports email/password and refresh token flows. 187 + 188 + **Request Body** 189 + 190 + ---------------- ---------- -------------- ---------------------------------------- 191 + **Field** **Type** **Required** **Description** 192 + email string yes\* Required for password auth 193 + password string yes\* Required for password auth 194 + refresh\_token string yes\* Alternative: renew an existing session 195 + ---------------- ---------- -------------- ---------------------------------------- 196 + 197 + **Response (200 OK)** 198 + 199 + ---------------- ---------- ----------------------------- 200 + **Field** **Type** **Description** 201 + session\_token string JWT, 24-hour expiry 202 + refresh\_token string Opaque token, 30-day expiry 203 + account\_id string UUID v7 204 + ---------------- ---------- ----------------------------- 205 + 206 + **Error Responses** 207 + 208 + ------------ ---------------------- --------------------------- 209 + **Status** **Code** **Description** 210 + 401 INVALID\_CREDENTIALS Email/password mismatch 211 + 401 TOKEN\_EXPIRED Refresh token has expired 212 + 423 ACCOUNT\_LOCKED Too many failed attempts 213 + ------------ ---------------------- --------------------------- 214 + 215 + **POST /v1/accounts/claim-codes** [v0.1] 216 + 217 + Generate a new device claim code. 
Invalidates any previously active claim code for this account. 218 + 219 + **Response (200 OK)** 220 + 221 + ------------- ---------- -------------------------------------------- 222 + **Field** **Type** **Description** 223 + claim\_code string 6-character alphanumeric, 15-minute expiry 224 + expires\_at string ISO 8601 timestamp 225 + ------------- ---------- -------------------------------------------- 226 + 227 + **Error Responses** 228 + 229 + ------------ --------------- ---------------------------------- 230 + **Status** **Code** **Description** 231 + 401 UNAUTHORIZED Invalid or missing session token 232 + 429 RATE\_LIMITED Max 5 claim codes per hour 233 + ------------ --------------- ---------------------------------- 234 + 235 + > ***Note:** Requires session\_token authentication.* 236 + 237 + **GET /v1/accounts/:id/usage** [v0.1] 238 + 239 + Returns current usage metrics for the account. Consumed by both the web dashboard (billing page) and the Tauri app (status bar indicator). 
240 + 241 + **Response (200 OK)** 242 + 243 + ------------------ ---------- ------------------------------------------------------- 244 + **Field** **Type** **Description** 245 + tier string Current tier: \"free\" \| \"pro\" \| \"business\" 246 + period\_start string ISO 8601, start of current billing period 247 + storage\_bytes integer Repo storage consumed 248 + storage\_limit integer Tier limit in bytes 249 + bandwidth\_bytes integer Relay bandwidth this period 250 + bandwidth\_limit integer Tier limit in bytes 251 + requests\_count integer XRPC requests proxied this period 252 + requests\_limit integer Tier limit 253 + ------------------ ---------- ------------------------------------------------------- 254 + 255 + **Error Responses** 256 + 257 + ------------ -------------- -------------------------------- 258 + **Status** **Code** **Description** 259 + 401 UNAUTHORIZED Invalid token 260 + 403 FORBIDDEN Token doesn't own this account 261 + ------------ -------------- -------------------------------- 262 + 263 + > ***Note:** Both session\_token and device\_token can access this endpoint, but only for their own account.* 264 + 265 + 3\. Device Registration 266 + 267 + Device endpoints are consumed by the Tauri app. The claim code handoff binds a specific device to an account, and the device\_token becomes the app's long-lived credential for all relay interactions. 268 + 269 + 3.1 Multi-Device Model 270 + 271 + Pro accounts support up to 5 devices. To maintain a linear commit chain on the ATProto repo (required for federation), the relay enforces a primary-device model with lease-based write ownership. 272 + 273 + - **Primary device:** Holds the write lease. Can commit to the repo, push to the relay, and trigger federation events. 274 + 275 + - **Secondary devices:** Read-only replicas that sync from the relay. They see the full repo but cannot commit. A secondary can request promotion to primary. 
276 + 277 + - **Lease transfer:** Explicit via the web dashboard or Tauri app. The current primary relinquishes the lease, the requesting device acquires it. If the primary is offline for longer than the lease TTL (configurable, default 24 hours), the lease expires and any device can claim it. 278 + 279 + > ***Design Decision:** Primary-device with lease was chosen over last-write-wins or conflict queues. LWW risks lost writes on rebase; conflict queues require users to resolve Merkle tree forks. Primary-device sidesteps the problem entirely and matches how most people use desktop apps.* 280 + 281 + **POST /v1/devices** [v0.1] 282 + 283 + Register a device by redeeming a claim code. The Tauri app generates a keypair locally and sends the public key. The relay binds the device and returns a device token. The first device registered to an account automatically receives the primary write lease. 284 + 285 + **Request Body** 286 + 287 + --------------------- ---------- -------------- -------------------------------------------- 288 + **Field** **Type** **Required** **Description** 289 + claim\_code string yes 6-character code from web dashboard 290 + device\_public\_key string yes P-256 (secp256r1) public key, base64url-encoded 291 + device\_name string no Human-readable label, e.g. 
\"MacBook Pro\" 292 + os string no Operating system identifier 293 + app\_version string no Tauri app version string 294 + --------------------- ---------- -------------- -------------------------------------------- 295 + 296 + **Response (200 OK)** 297 + 298 + ----------------- ---------- ------------------------------------------- 299 + **Field** **Type** **Description** 300 + device\_id string UUID v7 device identifier 301 + device\_token string Long-lived opaque token 302 + account\_id string Bound account UUID 303 + is\_primary boolean Whether this device holds the write lease 304 + relay\_endpoint object See Relay Endpoint Object below 305 + ----------------- ---------- ------------------------------------------- 306 + 307 + **Error Responses** 308 + 309 + ------------ ---------------- --------------------------------------------------- 310 + **Status** **Code** **Description** 311 + 400 INVALID\_CLAIM Claim code is invalid, expired, or already used 312 + 409 DEVICE\_LIMIT Account has reached maximum device count for tier 313 + 422 INVALID\_KEY Public key is malformed or unsupported curve 314 + ------------ ---------------- --------------------------------------------------- 315 + 316 + > ***Note:** The device private key never leaves the Tauri app. The relay stores only the public key for challenge-response verification during reconnection.* 317 + 318 + 3.2 Relay Endpoint Object 319 + 320 + Returned by device registration and the relay info endpoint: 321 + 322 + ---------------- ---------- -------------------------------------------- 323 + **Field** **Type** **Description** 324 + host string Relay hostname 325 + port integer Relay port (typically 443) 326 + iroh\_node\_id string Iroh node identifier for direct connection 327 + region string Relay region code, e.g. 
\"us-east-1\" 328 + protocol string Connection protocol: \"iroh\" \| \"wss\" 329 + ---------------- ---------- -------------------------------------------- 330 + 331 + **GET /v1/devices/:id/relay** [v0.1] 332 + 333 + Retrieve the assigned relay endpoint for a device. Used by the Tauri app on startup to discover where to connect. 334 + 335 + **Response (200 OK)** 336 + 337 + ----------------- ---------- --------------------------------------------------------- 338 + **Field** **Type** **Description** 339 + relay\_endpoint object Relay Endpoint Object (see 3.2) 340 + status string Relay status: \"online\" \| \"degraded\" \| \"offline\" 341 + buffer\_depth integer Messages buffered while device was offline 342 + ----------------- ---------- --------------------------------------------------------- 343 + 344 + **Error Responses** 345 + 346 + ------------ -------------------- ---------------------------------------- 347 + **Status** **Code** **Description** 348 + 401 UNAUTHORIZED Invalid device token 349 + 404 DEVICE\_NOT\_FOUND Device ID doesn't exist or was revoked 350 + ------------ -------------------- ---------------------------------------- 351 + 352 + **POST /v1/devices/:id/lease** [v1.0] 353 + 354 + Request or release the primary write lease for a device. 
355 + 356 + **Request Body** 357 + 358 + ----------- ---------- -------------- ---------------------------- 359 + **Field** **Type** **Required** **Description** 360 + action string yes \"acquire\" or \"release\" 361 + ----------- ---------- -------------- ---------------------------- 362 + 363 + **Response (200 OK)** 364 + 365 + -------------------- ---------- --------------------------------------------------------------- 366 + **Field** **Type** **Description** 367 + is\_primary boolean Whether this device now holds the lease 368 + lease\_expires\_at string ISO 8601, when the lease auto-expires if not renewed 369 + previous\_primary string Device ID of the previous primary (null if lease was expired) 370 + -------------------- ---------- --------------------------------------------------------------- 371 + 372 + **Error Responses** 373 + 374 + ------------ ---------------------- --------------------------------------------------------------------------------- 375 + **Status** **Code** **Description** 376 + 409 LEASE\_HELD Another device holds an active lease. Must wait for expiry or explicit release. 377 + 403 SINGLE\_DEVICE\_TIER Free tier accounts have only one device; lease management is not applicable. 378 + ------------ ---------------------- --------------------------------------------------------------------------------- 379 + 380 + > ***Note:** The Tauri app should silently renew the lease by calling this endpoint periodically (recommended: every 6 hours). If the primary device is offline beyond the lease TTL (default 24h), any other device can acquire the lease.* 381 + 382 + **DELETE /v1/devices/:id** [v1.0] 383 + 384 + Revoke a device. Invalidates its device\_token and disconnects it from the relay. If the revoked device was primary, the lease is released. Buffered messages are held for 72 hours before purging. 
385 + 386 + **Response (200 OK)** 387 + 388 + ------------------- ---------- ----------------------------------------------------------- 389 + **Field** **Type** **Description** 390 + revoked\_at string ISO 8601 timestamp 391 + buffer\_purge\_at string ISO 8601, when buffered messages will be deleted 392 + lease\_released boolean Whether the primary lease was released by this revocation 393 + ------------------- ---------- ----------------------------------------------------------- 394 + 395 + **Error Responses** 396 + 397 + ------------ -------------- ------------------------------- 398 + **Status** **Code** **Description** 399 + 401 UNAUTHORIZED Invalid token 400 + 403 FORBIDDEN Token doesn't own this device 401 + ------------ -------------- ------------------------------- 402 + 403 + > ***Note:** Requires session\_token (web dashboard). Device tokens cannot self-revoke.* 404 + 405 + 4\. DID Ceremony 406 + 407 + The DID ceremony binds a decentralized identifier to the user's device and relay. The client generates its keys locally and submits only public key material (private keys never leave the device); the relay constructs the DID document and registers it. The DID is the user's identity --- the follow graph, repo, and all social connections reference it. There is no such thing as migrating from one DID to another; that would be creating a new account. 408 + 409 + 4.1 PLC Mirror 410 + 411 + The relay operates a PLC directory mirror to improve resolution speed and provide resilience against upstream outages. The mirror starts as read-only (caching and serving existing PLC documents) and may graduate to a read-write authority in a future version. 412 + 413 + ----------- ---------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 414 + **Phase** **Mode** **Behavior** 415 + v1.0 Read-only cache Mirrors the PLC directory. 
Serves cached DID documents for faster resolution. Falls back to upstream plc.directory if cache misses. Provides resilience if upstream is temporarily unavailable. 416 + Future Read-write authority Can accept and validate new PLC operations independently. Participates in the PLC directory network as a peer. 417 + ----------- ---------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 418 + 419 + > ***Note:** New DID operations (creation, rotation) are submitted to plc.directory via the relay as a proxy in v1.0. The mirror is purely for read-side performance and resilience.* 420 + 421 + **POST /v1/dids** [v0.1] 422 + 423 + Initiate a DID ceremony. The client submits key material (public keys only — private keys never leave the device). The relay constructs the DID document, including verification methods from submitted keys, service endpoint pointing to the relay, and handle (atproto_handle alias). 424 + 425 + For did:plc, the relay constructs and signs the genesis operation, then submits it to plc.directory. For did:web, the relay begins serving /.well-known/did.json through the user's custom domain. 426 + 427 + The relay holds the signing key and uses it for all commit signing. The rotation key stays on the device (Secure Enclave on iOS, Keychain on macOS) and is only needed for DID recovery/rotation. 
428 + 429 + **Request Body** 430 + 431 + ----------------------- ---------- -------------- ------------------------------------------------------- 432 + **Field** **Type** **Required** **Description** 433 + signing_pub_key string yes P-256 public key for signing, base64url-encoded 434 + rotation_pub_key string yes P-256 public key for DID rotation, base64url-encoded 435 + method string no "did:plc" (default) or "did:web" (requires custom domain) 436 + handle string no Desired handle (if not already set via POST /v1/handles) 437 + recovery_keys array no Additional rotation keys for did:plc recovery 438 + ----------------------- ---------- -------------- ------------------------------------------------------- 439 + 440 + **Response (200 OK)** 441 + 442 + ----------------------- ---------- ------------------------------------------------------- 443 + **Field** **Type** **Description** 444 + did string Fully qualified DID string 445 + did_document object W3C DID Document 446 + relay_signing_key_id string Identifier of the relay-generated signing key 447 + ----------------------- ---------- ------------------------------------------------------- 448 + 449 + **Error Responses** 450 + 451 + ------------ -------------------------------- ------------------------------------------------------- 452 + **Status** **Code** **Description** 453 + 404 ACCOUNT_NOT_FOUND Account does not exist 454 + 422 INVALID_KEY_FORMAT Public key is malformed or not on a supported curve (P-256/secp256k1) 455 + 422 UNSUPPORTED_DID_METHOD Requested method is not supported or is missing required fields 456 + 429 RATE_LIMITED Too many DID ceremonies initiated 457 + ------------ -------------------------------- ------------------------------------------------------- 458 + 459 + > ***Note:** For did:plc, the relay submits the signed genesis operation to plc.directory and status will be "pending_propagation" until confirmed.
For did:web, the relay begins serving /.well-known/did.json through the user's custom domain.* 460 + 461 + **GET /v1/dids/:did** [v0.1] 462 + 463 + Retrieve the current DID document and resolution status. For did:plc, resolution checks the PLC mirror first, then falls back to upstream. 464 + 465 + **Response (200 OK)** 466 + 467 + ------------------- ---------- ----------------------------------------------------------- 468 + **Field** **Type** **Description** 469 + did string Fully qualified DID string 470 + did\_document object Current DID document 471 + method string \"did:plc\" \| \"did:web\" 472 + status string \"active\" \| \"pending\_propagation\" \| \"deactivated\" 473 + resolution\_url string Public resolution URL 474 + last\_rotated\_at string ISO 8601, last key rotation timestamp 475 + mirror\_age\_ms integer For did:plc: staleness of the PLC mirror cache entry 476 + ------------------- ---------- ----------------------------------------------------------- 477 + 478 + **Error Responses** 479 + 480 + ------------ ----------------- --------------------------------- 481 + **Status** **Code** **Description** 482 + 404 DID\_NOT\_FOUND DID doesn't exist in this relay 483 + ------------ ----------------- --------------------------------- 484 + 485 + **POST /v1/dids/:did/rotate** [v1.0] 486 + 487 + Rotate the signing key for a DID. The Tauri app generates a new keypair, signs the rotation operation with the current key, and submits it. Critical for key compromise recovery. 
488 + 489 + **Request Body** 490 + 491 + ------------------ ---------- -------------- -------------------------------------------------------- 492 + **Field** **Type** **Required** **Description** 493 + new\_public\_key string yes New P-256 (secp256r1) public key, base64url-encoded 494 + rotation\_proof string yes Signed proof from the current key authorizing rotation 495 + ------------------ ---------- -------------- -------------------------------------------------------- 496 + 497 + **Response (200 OK)** 498 + 499 + --------------------- ---------- ---------------------------------------- 500 + **Field** **Type** **Description** 501 + did\_document object Updated DID document with new key 502 + previous\_key\_hash string Hash of the rotated-out key for audit 503 + status string \"active\" \| \"pending\_propagation\" 504 + --------------------- ---------- ---------------------------------------- 505 + 506 + **Error Responses** 507 + 508 + ------------ ------------------------ ------------------------------------- 509 + **Status** **Code** **Description** 510 + 400 INVALID\_PROOF Rotation proof signature is invalid 511 + 409 ROTATION\_IN\_PROGRESS A rotation is already pending 512 + ------------ ------------------------ ------------------------------------- 513 + 514 + > ***Note:** Key rotation also updates the device's registered public key at the relay. For did:plc, the signed rotation operation is submitted to plc.directory. The old key remains valid for challenge-response for a 24-hour grace period to handle in-flight requests.* 515 + 516 + 5\. Handle & Domain Management 517 + 518 + Handle endpoints manage the user's ATProto handle (e.g., alice.yourservice.net or alice.com). Free-tier users get a subdomain on the service domain. Pro users can bring their own domain. The relay automates DNS record provisioning via Cloudflare or Route53. 519 + 520 + > ***Note:** Custom domains serve double duty: they are required for both custom handles AND did:web identity. 
When a Pro user verifies a custom domain, both the handle and the did:web option become available in the setup wizard.* 521 + 522 + **POST /v1/handles** [v0.1] 523 + 524 + Request a handle. For subdomain handles, the relay provisions DNS immediately. For custom domains, the relay returns required DNS records for the user to configure. 525 + 526 + **Request Body** 527 + 528 + ----------- ---------- -------------- -------------------------------------------------------------------------- 529 + **Field** **Type** **Required** **Description** 530 + handle string yes Desired handle (e.g., \"alice\" for subdomain, \"alice.com\" for custom) 531 + type string no \"subdomain\" (default) or \"custom\" 532 + ----------- ---------- -------------- -------------------------------------------------------------------------- 533 + 534 + **Response (200 OK)** 535 + 536 + --------------------- ---------- ------------------------------------------------------------- 537 + **Field** **Type** **Description** 538 + handle string Full handle (e.g., \"alice.relay.example.net\") 539 + type string \"subdomain\" \| \"custom\" 540 + status string \"active\" \| \"pending\_dns\" \| \"pending\_verification\" 541 + dns\_records array Required DNS records (for custom domains) 542 + verification\_token string TXT record value for domain ownership proof 543 + didweb\_eligible boolean Whether this domain unlocks did:web as a DID method option 544 + --------------------- ---------- ------------------------------------------------------------- 545 + 546 + **Error Responses** 547 + 548 + ------------ ------------------ --------------------------------------------------- 549 + **Status** **Code** **Description** 550 + 409 HANDLE\_TAKEN Handle is already in use 551 + 403 TIER\_RESTRICTED Custom domains require Pro tier 552 + 422 INVALID\_HANDLE Handle contains invalid characters or is reserved 553 + ------------ ------------------ --------------------------------------------------- 554 + 555 + 5.1 DNS 
Records Object 556 + 557 + For custom domains, the response includes an array of DNS records the user must create: 558 + 559 + -------------- ---------- ------------------------------------------------ 560 + **Field** **Type** **Description** 561 + record\_type string DNS record type: \"CNAME\" \| \"TXT\" \| \"A\" 562 + name string Record name (e.g., \"\_atproto.alice.com\") 563 + value string Record value 564 + ttl integer Recommended TTL in seconds 565 + -------------- ---------- ------------------------------------------------ 566 + 567 + **GET /v1/handles/:handle/status** [v0.1] 568 + 569 + Poll DNS propagation and verification status for a handle. The Tauri app and web dashboard poll this endpoint after handle creation. 570 + 571 + **Response (200 OK)** 572 + 573 + ----------------- ---------- --------------------------------------------------------------------------- 574 + **Field** **Type** **Description** 575 + handle string The handle being checked 576 + status string \"active\" \| \"pending\_dns\" \| \"pending\_verification\" \| \"failed\" 577 + dns\_checks array Per-record check results with pass/fail and last\_checked timestamps 578 + verified\_at string ISO 8601, null until verified 579 + failure\_reason string Present only if status is \"failed\" 580 + ----------------- ---------- --------------------------------------------------------------------------- 581 + 582 + **Error Responses** 583 + 584 + ------------ -------------------- --------------------------------------- 585 + **Status** **Code** **Description** 586 + 404 HANDLE\_NOT\_FOUND Handle doesn't exist for this account 587 + ------------ -------------------- --------------------------------------- 588 + 589 + > ***Note:** For subdomain handles, status transitions directly to \"active\". For custom domains, expect 5--30 minutes for DNS propagation. The relay checks every 60 seconds.* 590 + 591 + **DELETE /v1/handles/:handle** [v0.1] 592 + 593 + Release a handle. 
For subdomain handles, the DNS record is removed immediately. For custom domains, the relay stops serving resolution but does not modify the user's DNS. 594 + 595 + **Response (200 OK)** 596 + 597 + ----------------- ---------- --------------------------------------------------------------- 598 + **Field** **Type** **Description** 599 + released\_at string ISO 8601 timestamp 600 + cooldown\_until string ISO 8601, handle is reserved for 30 days to prevent squatting 601 + ----------------- ---------- --------------------------------------------------------------- 602 + 603 + **Error Responses** 604 + 605 + ------------ -------------- ------------------------------- 606 + **Status** **Code** **Description** 607 + 401 UNAUTHORIZED Invalid token 608 + 403 FORBIDDEN Token doesn't own this handle 609 + ------------ -------------- ------------------------------- 610 + 611 + 6\. Setup Wizard Flow 612 + 613 + The Tauri app includes a first-run setup wizard that guides the user through the complete provisioning flow in a single sitting. The wizard orchestrates the API calls described in sections 2--5 into a linear, non-technical experience. 614 + 615 + 6.1 Wizard Steps 616 + 617 + 1. **Enter claim code** --- User copies the 6-character code from the web dashboard. Tauri generates a P-256 keypair and calls POST /v1/devices. 618 + 619 + 2. **Choose a handle** --- Wizard offers a subdomain handle immediately. If the account has a verified custom domain (Pro tier), the custom domain option also appears. Calls POST /v1/handles. 620 + 621 + 3. **DID ceremony** --- Wizard pre-selects did:plc. If the user selected a custom domain handle in step 2, a toggle appears offering did:web as an alternative. Wizard prompts for optional recovery key setup (Shamir shares). Calls POST /v1/dids. 622 + 623 + 4. **Wait for propagation** --- Wizard polls GET /v1/dids/:did and GET /v1/handles/:handle/status until both are active. Shows a progress indicator with estimated time. 624 + 625 + 5. 
**Federation handshake** --- Relay calls requestCrawl to the BGS. Wizard confirms the PDS is discoverable on the ATProto network. Displays a \"You're live\" confirmation. 626 + 627 + 6.2 Failure Recovery 628 + 629 + Each wizard step is independently retryable. If the Tauri app loses connection or is closed mid-wizard, it resumes from the last completed step on next launch. The wizard state is persisted locally. 630 + 631 + ---------------------------- ------------------------------------------------------------------------ 632 + **Failure Point** **Recovery** 633 + Claim code expired Generate new code via web dashboard; wizard restarts at step 1 634 + Device registration failed Retry with a new claim code 635 + Handle taken Wizard prompts for an alternative handle 636 + DID propagation timeout Wizard continues polling; user can skip and check later 637 + DNS verification timeout Wizard shows manual DNS instructions; polling continues in background 638 + Relay unreachable Tauri retries with exponential backoff; wizard shows connection status 639 + ---------------------------- ------------------------------------------------------------------------ 640 + 641 + 6.3 BYO Relay Configuration 642 + 643 + Self-hosted users who run their own relay can point the Tauri app at their relay before entering the setup wizard. The wizard writes a config file that the app reads on subsequent launches. 
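As a rough illustration of the BYO relay lookup, the sketch below resolves the per-platform config path listed in 6.3.1 and picks the relay URL from an already-parsed config. The function names, the `platform` strings (borrowed from Python's `sys.platform`), and `DEFAULT_RELAY_URL` are assumptions for the example; the spec does not name the managed-service URL.

```python
from pathlib import PurePosixPath, PureWindowsPath
from typing import Optional

# Hypothetical placeholder; the spec does not state the managed-service URL.
DEFAULT_RELAY_URL = "https://relay.example.net"


def relay_config_path(platform: str, home: str = "", appdata: str = "") -> str:
    """Return the relay.toml location for 'win32', 'darwin', or 'linux'."""
    if platform == "win32":
        return str(PureWindowsPath(appdata) / "pds" / "relay.toml")
    return str(PurePosixPath(home) / ".config" / "pds" / "relay.toml")


def choose_relay_url(config: Optional[dict]) -> str:
    """Prefer [relay].url from a parsed config; otherwise use the default."""
    if config and config.get("relay", {}).get("url"):
        return config["relay"]["url"]
    return DEFAULT_RELAY_URL


if __name__ == "__main__":
    print(relay_config_path("darwin", home="/Users/alice"))
    print(choose_relay_url({"relay": {"url": "https://my-relay.example.com"}}))
```

Parsing the TOML file itself is left out; any TOML reader that yields a nested dictionary would fit the `choose_relay_url` input.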
644 + 645 + 6.3.1 Config File 646 + 647 + --------------- ---------------------------- 648 + **Platform** **Path** 649 + macOS / Linux \~/.config/pds/relay.toml 650 + Windows %APPDATA%\\pds\\relay.toml 651 + --------------- ---------------------------- 652 + 653 + Minimal config shape: 654 + 655 + > \[relay\] 656 + > 657 + > url = \"https://my-relay.example.com\" 658 + > 659 + > iroh\_node\_id = \"abc123\...\" \# optional, discovered via API 660 + > 661 + > \# Auth method is always claim\_code (web dashboard flow) 662 + 663 + If the config file exists when the Tauri app launches for the first time, the wizard uses the configured relay URL instead of the default managed service. All subsequent wizard steps (claim code, device binding, DID ceremony) proceed identically. 664 + 665 + 7\. Exit Ceremony 666 + 667 + The exit ceremony allows a user to leave the relay and take their identity and data with them. The DID is the user's identity --- what moves is the PDS hosting, not the identity itself. The follow graph, followers, and all social connections remain intact because the DID never changes. 668 + 669 + > ***Design Decision:** The exit story is a sovereignty requirement, not an afterthought. If users can't leave cleanly, the product's sovereignty promise is hollow.* 670 + 671 + 7.1 Exit Flow 672 + 673 + 1. **Export repo** --- User downloads a full CAR file of their ATProto repo via the export endpoint. This is the portable data package. 674 + 675 + 2. **Prepare new PDS** --- User imports the CAR file into their new PDS host (outside the scope of this API, but the export format follows the ATProto spec for com.atproto.sync.getRepo). 676 + 677 + 3. **Repoint DID** --- For did:plc: user signs a PLC operation updating the service endpoint to their new PDS. The relay submits this to plc.directory. For did:web: user updates their domain's DNS / .well-known/did.json to point to the new PDS. No relay involvement needed. 678 + 679 + 4.
**Grace period** --- The relay continues serving the repo and forwarding requests for 30 days after the DID repoints. This gives the network time to update resolution caches and ensures no dropped interactions during transition. 680 + 681 + 5. **Account teardown** --- After the grace period (or immediately if the user confirms), the relay purges the repo data, revokes all device tokens, and marks the account as closed. 682 + 683 + **GET /v1/export/repo** [v1.0] 684 + 685 + Export the full ATProto repo as a CAR (Content Addressable aRchive) file. This is a potentially large download --- the relay streams the response. 686 + 687 + **Response (200 OK)** 688 + 689 + --------------------- ---------- -------------------------------------------------- 690 + **Field** **Type** **Description** 691 + Content-Type header application/vnd.ipld.car 692 + Content-Disposition header attachment; filename=\"{did}.car\" 693 + X-Repo-Rev header The repo revision (commit CID) at time of export 694 + --------------------- ---------- -------------------------------------------------- 695 + 696 + **Error Responses** 697 + 698 + ------------ ---------------------- ---------------------------------------------------- 699 + **Status** **Code** **Description** 700 + 401 UNAUTHORIZED Invalid token 701 + 503 EXPORT\_IN\_PROGRESS Another export is already running for this account 702 + ------------ ---------------------- ---------------------------------------------------- 703 + 704 + > ***Note:** Requires device\_token from the primary device, or session\_token. The response is streamed --- clients should handle large payloads. Recommended: pipe to disk rather than buffering in memory.* 705 + 706 + **POST /v1/dids/:did/migrate** [v1.0] 707 + 708 + Construct and submit a signed DID operation that repoints the service endpoint to a new PDS. For did:plc, this submits the operation to plc.directory. For did:web, this is a no-op (the user controls their own DNS).
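Because a mistyped endpoint in a migration can point the DID at the wrong host, a client could run a cheap local syntax check on `new_service_endpoint` before calling this endpoint. The sketch below is one such check under stated assumptions: the https-only and bare-origin rules are guesses based on the spec's example URL, and `check_service_endpoint` is a hypothetical helper. It does not replace any server-side validation.

```python
from urllib.parse import urlparse


def check_service_endpoint(url: str) -> list:
    """Return a list of problems with a candidate PDS URL (empty means it looks fine)."""
    problems = []
    parsed = urlparse(url)
    # Assumption: only https endpoints are acceptable.
    if parsed.scheme != "https":
        problems.append("scheme must be https")
    if not parsed.hostname:
        problems.append("missing host")
    # Assumption: the endpoint is a bare origin such as https://pds.alice.com
    if parsed.path not in ("", "/") or parsed.query or parsed.fragment:
        problems.append("expected a bare origin without path, query, or fragment")
    return problems


if __name__ == "__main__":
    for candidate in ("https://pds.alice.com", "htps://pds.alice.com"):
        print(candidate, check_service_endpoint(candidate))
```

An empty list only means the string is well-formed; whether the host actually speaks ATProto XRPC still has to be confirmed over the network.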
709 + 710 + **Request Body** 711 + 712 + ------------------------ ---------- -------------- ------------------------------------------------------------ 713 + **Field** **Type** **Required** **Description** 714 + new\_service\_endpoint string yes URL of the new PDS (e.g., \"https://pds.alice.com\") 715 + signing\_proof string yes Proof signed by the device's key authorizing the migration 716 + ------------------------ ---------- -------------- ------------------------------------------------------------ 717 + 718 + **Response (200 OK)** 719 + 720 + ------------------------ ---------- ----------------------------------------------------- 721 + **Field** **Type** **Description** 722 + did string The DID that was migrated 723 + new\_service\_endpoint string The endpoint now in the DID document 724 + operation\_id string PLC operation ID (for did:plc) 725 + status string \"pending\_propagation\" \| \"active\" 726 + grace\_period\_ends string ISO 8601, when the relay stops serving the old repo 727 + ------------------------ ---------- ----------------------------------------------------- 728 + 729 + **Error Responses** 730 + 731 + ------------ ------------------------- ------------------------------------------------------------------------- 732 + **Status** **Code** **Description** 733 + 400 INVALID\_PROOF Signing proof is invalid 734 + 400 INVALID\_ENDPOINT New service endpoint is unreachable or malformed 735 + 409 MIGRATION\_IN\_PROGRESS A migration is already pending for this DID 736 + 422 DIDWEB\_SELF\_SERVICE did:web migration is handled via user's own DNS; no relay action needed 737 + ------------ ------------------------- ------------------------------------------------------------------------- 738 + 739 + > ***Note:** The relay validates that the new service endpoint is reachable and responds to basic ATProto XRPC calls before submitting the PLC operation. 
This prevents accidental lockout from typos.* 740 + 741 + **DELETE /v1/accounts/:id** [v1.0] 742 + 743 + Initiate account teardown. By default, enters a 30-day grace period during which the relay continues serving the repo. The user can force immediate deletion by passing force=true. 744 + 745 + **Request Body** 746 + 747 + -------------- ---------- -------------- ---------------------------------------------------------------------------- 748 + **Field** **Type** **Required** **Description** 749 + force boolean no Skip grace period and delete immediately (default: false) 750 + confirmation string yes Must be the string \"DELETE {account\_id}\" to prevent accidental deletion 751 + -------------- ---------- -------------- ---------------------------------------------------------------------------- 752 + 753 + **Response (200 OK)** 754 + 755 + --------------------- ---------- -------------------------------------------------------------- 756 + **Field** **Type** **Description** 757 + status string \"grace\_period\" \| \"deleted\" 758 + grace\_period\_ends string ISO 8601, when data will be purged (null if force=true) 759 + devices\_revoked integer Number of device tokens invalidated 760 + data\_purge\_at string ISO 8601, when repo and account data are permanently deleted 761 + --------------------- ---------- -------------------------------------------------------------- 762 + 763 + **Error Responses** 764 + 765 + ------------ ----------------------- ---------------------------------------------------- 766 + **Status** **Code** **Description** 767 + 401 UNAUTHORIZED Invalid session token 768 + 400 INVALID\_CONFIRMATION Confirmation string doesn't match 769 + 409 ACTIVE\_MIGRATION Cannot delete while a DID migration is in progress 770 + ------------ ----------------------- ---------------------------------------------------- 771 + 772 + > ***Note:** Requires session\_token. 
During the grace period, the account is read-only: the relay serves existing repo data and DID resolution but rejects new commits. The user can cancel the deletion during this period via POST /v1/accounts/:id/restore.* 773 + 774 + **POST /v1/accounts/:id/restore** [v1.0] 775 + 776 + Cancel a pending account deletion during the grace period. Restores full write access and re-activates device tokens. 777 + 778 + **Response (200 OK)** 779 + 780 + ------------------- ---------- ------------------------------------- 781 + **Field** **Type** **Description** 782 + status string \"active\" 783 + devices\_restored integer Number of device tokens reactivated 784 + ------------------- ---------- ------------------------------------- 785 + 786 + **Error Responses** 787 + 788 + ------------ ------------------------ ----------------------------------------------------- 789 + **Status** **Code** **Description** 790 + 401 UNAUTHORIZED Invalid session token 791 + 404 NOT\_IN\_GRACE\_PERIOD Account is not currently in a deletion grace period 792 + 410 ALREADY\_DELETED Grace period has expired; data has been purged 793 + ------------ ------------------------ ----------------------------------------------------- 794 + 795 + 8\. Free Tier Enforcement 796 + 797 + Usage is tracked at the relay level and enforced per-account. When a cap is reached, the relay returns 429 on write operations but continues serving reads (eventually consistent). This ensures the PDS remains visible on the network even when the account is over quota. 
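The enforcement rule described here (writes rejected at the cap, reads still served, a warning header above 80% utilization) can be sketched as a small decision function. This is an illustration only: `enforce_quota` is a hypothetical name, the header value format is an assumption, and `cap` is assumed to be a positive number.

```python
def enforce_quota(is_write: bool, used: int, cap: int):
    """Return (status_code, extra_headers) for a request against one capped resource."""
    headers = {}
    utilization = used / cap  # cap is assumed to be > 0
    if utilization > 0.80:
        # The spec names the header; the percentage value format is an assumption.
        headers["X-Usage-Warning"] = f"{utilization:.0%}"
    # Writes are rejected once the cap is reached; reads keep working.
    status = 429 if (is_write and used >= cap) else 200
    return status, headers


if __name__ == "__main__":
    print(enforce_quota(True, 500, 500))   # write at the cap
    print(enforce_quota(False, 500, 500))  # read at the cap
```

Keeping reads on the 200 path is what lets the account stay visible on the network while over quota.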
798 + 799 + ----------------------- --------------- --------------- ----------------------------------------- 800 + **Resource** **Free Tier** **Pro Tier** **Enforcement** 801 + Repo storage 500 MB 50 GB Reject commits over limit 802 + Relay bandwidth 2 GB/month 100 GB/month Throttle to 128 kbps over limit 803 + XRPC proxied requests 10,000/month 500,000/month 429 on writes, reads continue 804 + Devices per account 1 5 Reject new device registrations 805 + Custom domains 0 3 Reject handle creation with type=custom 806 + ----------------------- --------------- --------------- ----------------------------------------- 807 + 808 + Approaching-limit warnings are surfaced through the usage endpoint (GET /v1/accounts/:id/usage), via a custom X-Usage-Warning response header when utilization exceeds 80%, and through the Iroh channel as push notifications to the Tauri app. 809 + 810 + Critical account events (usage at 90%, device revoked, DID rotation) trigger email notifications to the account's registered email address. This does not require a user-facing event API. 811 + 812 + 9\. Security Considerations 813 + 814 + 9.1 Token Security 815 + 816 + - Session tokens are JWTs signed with RS256. The relay's public key is published at /.well-known/jwks.json. 817 + 818 + - Device tokens are opaque (server-side lookup), not JWTs, to allow instant revocation without token refresh lag. 819 + 820 + - Claim codes are cryptographically random, 6-character alphanumeric (36⁶ ≈ 2.2 billion possibilities), rate-limited to 5 attempts per code. 821 + 822 + 9.2 Key Management 823 + 824 + - Device private keys are generated and stored in the OS keychain (macOS Keychain, Windows Credential Manager) via Tauri's secure storage API. 825 + 826 + - The relay never sees, transmits, or stores private keys. All cryptographic proofs are generated device-side. 827 + 828 + - Key rotation invalidates the previous key after a 24-hour grace period. Rotation events are logged immutably. 
829 + 830 + - Recovery keys (Shamir shares) are configured during the DID ceremony step of the setup wizard. Loss of all signing keys without recovery shares means the DID is permanently orphaned. 831 + 832 + 9.3 Transport 833 + 834 + - All API traffic is TLS 1.3 only. The relay does not support TLS 1.2 fallback. 835 + 836 + - Device-to-relay data sync uses Iroh's encrypted transport (QUIC-based, end-to-end encrypted). 837 + 838 + - CORS is restricted to the service's web dashboard origin. The Tauri app uses direct HTTPS, not browser fetch. 839 + 840 + 10\. Internal Observability 841 + 842 + The relay exposes internal metrics for operating the SaaS product. These endpoints are not public-facing --- they are served on a separate internal port (default: 9090) accessible only from the ops network. 843 + 844 + > ***Design Decision:** User-facing event streams (webhooks, SSE) are deferred. Device-side notifications use the existing Iroh channel. Email handles critical account events. A public event API will be added if self-hosted operators request it.* 845 + 846 + 10.1 Infrastructure Metrics 847 + 848 + Prometheus-compatible metrics endpoint for standard monitoring infrastructure (Grafana, Datadog, PagerDuty). 
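For readers unfamiliar with what "Prometheus-compatible" implies, the sketch below renders one sample in the Prometheus text exposition format. The metric name comes from the table in this section; the `render_metric` helper and the `tier` label name are assumptions, and label-value escaping is deliberately omitted.

```python
def render_metric(name: str, mtype: str, value, labels=None, help_text=None) -> str:
    """Render a single sample in Prometheus text exposition format."""
    lines = []
    if help_text:
        lines.append(f"# HELP {name} {help_text}")
    lines.append(f"# TYPE {name} {mtype}")
    label_str = ""
    if labels:
        # Sorted for stable output; values are not escaped in this sketch.
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    lines.append(f"{name}{label_str} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metric("relay_active_connections", "gauge", 12), end="")
    print(render_metric("relay_bandwidth_bytes_total", "counter", 1024,
                        labels={"tier": "free"}), end="")
```

In practice a metrics client library would produce this output; the sketch only shows the shape a scraper expects.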
849 + 850 + > GET :9090/metrics 851 + 852 + ----------------------------------- ----------- ------------------------------------------------ 853 + **Metric** **Type** **Description** 854 + relay\_active\_connections gauge Currently connected Iroh tunnels 855 + relay\_request\_duration\_seconds histogram XRPC request latency distribution 856 + relay\_error\_total counter Errors by type and status code 857 + relay\_buffer\_depth gauge Messages buffered per device (offline devices) 858 + relay\_bandwidth\_bytes\_total counter Bytes proxied, labeled by account tier 859 + plc\_mirror\_resolution\_ms histogram DID resolution latency from PLC mirror 860 + plc\_mirror\_cache\_hit\_ratio gauge PLC mirror cache hit rate 861 + iroh\_tunnel\_health gauge Per-tunnel health score (0--1) 862 + ----------------------------------- ----------- ------------------------------------------------ 863 + 864 + 10.2 Business Metrics 865 + 866 + Higher-level metrics for product health monitoring. Served as JSON on the internal port. 867 + 868 + > GET :9090/internal/stats 869 + 870 + ------------------------------------- ------------------------------------------------------------- 871 + **Metric** **Description** 872 + accounts\_by\_tier Count of accounts per tier (free/pro/business) 873 + storage\_utilization\_p50\_p95\_p99 Storage consumption distribution across accounts 874 + federation\_health requestCrawl success rate, BGS ingestion lag 875 + did\_resolution\_latency P50/P95 DID resolution time via PLC mirror vs upstream 876 + active\_devices\_24h Devices that connected in the last 24 hours 877 + stale\_devices\_72h Devices that haven't connected in 72+ hours (churn signal) 878 + exit\_ceremonies\_in\_progress Active DID migrations and account deletions in grace period 879 + ------------------------------------- ------------------------------------------------------------- 880 + 881 + > ***Note:** Business metrics power internal dashboards and customer support tooling. 
They are not exposed to users. The /internal/stats endpoint requires a separate ops-scoped token and is never routed through the public load balancer.* 882 + 883 + ## 11. Relay Key Management [v0.1] 884 + 885 + The relay holds the ATProto signing key. These endpoints manage the relay's key lifecycle. 886 + 887 + **POST /v1/relay/keys** [v0.1] 888 + 889 + Generate a new relay signing key. Called during account creation or key rotation. The relay generates the key internally — the private key is never exposed. 890 + 891 + **Response (200 OK)** 892 + 893 + ----------------------- ---------- ------------------------------------------------------- 894 + **Field** **Type** **Description** 895 + key_id string Key identifier 896 + public_key string P-256 public key, base64url-encoded 897 + algorithm string "ES256" (P-256) or "ES256K" (secp256k1) 898 + created_at string ISO 8601 899 + ----------------------- ---------- ------------------------------------------------------- 900 + 901 + **DELETE /v1/relay/keys/:keyId** [v1.0] 902 + 903 + Revoke a relay signing key. Triggers DID rotation to update the signing key in the DID document. 904 + 905 + **Response (200 OK)** 906 + 907 + ----------------------- ---------- ------------------------------------------------------- 908 + **Field** **Type** **Description** 909 + revoked_at string ISO 8601 910 + rotation_status string "pending" | "complete" 911 + ----------------------- ---------- ------------------------------------------------------- 912 + 913 + **POST /v1/relay/commits/sign** [v0.2] 914 + 915 + Sign an unsigned commit constructed by the desktop PDS. Desktop-enrolled mode only. 
916 + 917 + **Request Body** 918 + 919 + ----------------------- ---------- -------------- ------------------------------------------------------- 920 + **Field** **Type** **Required** **Description** 921 + unsigned_commit bytes yes CAR-encoded unsigned commit 922 + repo_did string yes DID of the repo 923 + ----------------------- ---------- -------------- ------------------------------------------------------- 924 + 925 + **Response (200 OK)** 926 + 927 + ----------------------- ---------- ------------------------------------------------------- 928 + **Field** **Type** **Description** 929 + signed_commit bytes CAR-encoded signed commit 930 + commit_cid string CID of the signed commit 931 + ----------------------- ---------- ------------------------------------------------------- 932 + 933 + **GET /v1/relay/repo/snapshot** [v0.2] 934 + 935 + Full repo export as CAR file. Used by desktop during initial sync after enrollment. 936 + 937 + **Response:** streaming CAR file (same format as com.atproto.sync.getRepo) 938 + 939 + **GET /v1/relay/mode** [v0.2] 940 + 941 + Current relay operating mode for this account. 942 + 943 + **Response (200 OK)** 944 + 945 + ----------------------- ---------- ------------------------------------------------------- 946 + **Field** **Type** **Description** 947 + mode string "mobile-only" | "desktop-enrolled" | "desktop-offline" 948 + primary_device string Device ID of repo host (null in mobile-only) 949 + signing_key_id string Active signing key identifier 950 + ----------------------- ---------- ------------------------------------------------------- 951 + 952 + ## 12. Device Management [v0.2] 953 + 954 + Extended device operations for the mobile app. These supplement the existing device registration endpoints in Section 3. 955 + 956 + **POST /v1/devices/:id/pair** [v0.2] 957 + 958 + Initiate device pairing via QR code. The phone generates a pairing session, the desktop scans the QR code containing the session details. 
959 + 960 + **Request Body** 961 + 962 + ----------------------- ---------- -------------- ------------------------------------------------------- 963 + **Field** **Type** **Required** **Description** 964 + pairing_code string yes Code from QR scan 965 + device_type string yes "desktop" | "mobile" 966 + ----------------------- ---------- -------------- ------------------------------------------------------- 967 + 968 + **Response (200 OK)** 969 + 970 + ----------------------- ---------- ------------------------------------------------------- 971 + **Field** **Type** **Description** 972 + paired_at string ISO 8601 973 + device_id string The paired device's ID 974 + pairing_status string "paired" | "pending_promotion" 975 + ----------------------- ---------- ------------------------------------------------------- 976 + 977 + **POST /v1/devices/:id/promote** [v0.2] 978 + 979 + Promote a paired desktop to repo host. Transitions the relay from mobile-only to desktop-enrolled mode. The relay transfers the repo to the desktop via Iroh. 980 + 981 + **Response (200 OK)** 982 + 983 + ----------------------- ---------- ------------------------------------------------------- 984 + **Field** **Type** **Description** 985 + promoted_at string ISO 8601 986 + mode string "desktop-enrolled" 987 + repo_transfer string "in_progress" | "complete" 988 + ----------------------- ---------- ------------------------------------------------------- 989 + 990 + **GET /v1/devices/:id/status** [v0.2] 991 + 992 + Device health and connectivity status. 
993 + 994 + **Response (200 OK)** 995 + 996 + ----------------------- ---------- ------------------------------------------------------- 997 + **Field** **Type** **Description** 998 + device_id string Device identifier 999 + status string "online" | "offline" | "degraded" 1000 + last_seen string ISO 8601 1001 + is_primary boolean Whether this device hosts the repo 1002 + mode string Current lifecycle phase 1003 + ----------------------- ---------- ------------------------------------------------------- 1004 + 1005 + **DELETE /v1/devices/:id** [v0.2] 1006 + 1007 + De-enroll a device. This endpoint already exists in Section 3; this note confirms that the mobile app can also call it (not just the web dashboard). 1008 + 1009 + Note: Update Section 3 to allow device_token auth (not just session_token) for mobile-initiated device removal. 1010 + 1011 + ## 13. Data Transfer [v0.1] 1012 + 1013 + Planned device swap (e.g., upgrading phones). Uses Iroh for direct peer-to-peer transfer with a 6-digit verification code. 1014 + 1015 + **POST /v1/transfer/initiate** [v0.1] 1016 + 1017 + Generate a transfer session. Returns a 6-digit code for the new device to enter. 1018 + 1019 + **Response (200 OK)** 1020 + 1021 + ----------------------- ---------- ------------------------------------------------------- 1022 + **Field** **Type** **Description** 1023 + transfer_id string Transfer session identifier 1024 + code string 6-digit verification code 1025 + expires_at string ISO 8601 (15-minute lifetime) 1026 + iroh_ticket string Iroh connection ticket for direct transfer 1027 + ----------------------- ---------- ------------------------------------------------------- 1028 + 1029 + **POST /v1/transfer/accept** [v0.1] 1030 + 1031 + The new device submits the transfer code to join the session.
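The `code` and `expires_at` fields returned by `/v1/transfer/initiate` could be produced and checked along the following lines. This is a minimal sketch, not the relay's implementation: the helper names `new_transfer_session` and `code_is_valid` are hypothetical, and only the 6-digit length and 15-minute lifetime come from this spec.

```python
import secrets
from datetime import datetime, timedelta, timezone

CODE_DIGITS = 6                    # from the spec: 6-digit verification code
CODE_TTL = timedelta(minutes=15)   # from the spec: session expires after 15 minutes


def new_transfer_session(now=None):
    """Build the code and expiry for a transfer session (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # secrets.randbelow draws from a CSPRNG; zero-pad so "000042" stays 6 digits.
    code = f"{secrets.randbelow(10 ** CODE_DIGITS):0{CODE_DIGITS}d}"
    return {"code": code, "expires_at": (now + CODE_TTL).isoformat()}


def code_is_valid(session, submitted, now=None):
    """Accept only an unexpired, exactly matching code."""
    now = now or datetime.now(timezone.utc)
    expired = now >= datetime.fromisoformat(session["expires_at"])
    # compare_digest avoids leaking the match position through timing.
    return (not expired) and secrets.compare_digest(session["code"], submitted)
```

A 6-digit code has only one million possible values, so a real endpoint would presumably also limit the number of attempts per session; that policy is not specified here.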
1032 + 1033 + **Request Body** 1034 + 1035 + ----------------------- ---------- -------------- ------------------------------------------------------- 1036 + **Field** **Type** **Required** **Description** 1037 + code string yes 6-digit code from old device 1038 + device_public_key string yes P-256 public key of new device 1039 + ----------------------- ---------- -------------- ------------------------------------------------------- 1040 + 1041 + **Response (200 OK)** 1042 + 1043 + ----------------------- ---------- ------------------------------------------------------- 1044 + **Field** **Type** **Description** 1045 + transfer_id string Transfer session ID 1046 + status string "accepted" | "transferring" 1047 + ----------------------- ---------- ------------------------------------------------------- 1048 + 1049 + **POST /v1/transfer/complete** [v0.1] 1050 + 1051 + Finalize the transfer. The old device's token is revoked and the new device receives a fresh device_token. 1052 + 1053 + **Response (200 OK)** 1054 + 1055 + ----------------------- ---------- ------------------------------------------------------- 1056 + **Field** **Type** **Description** 1057 + new_device_id string New device's identifier 1058 + device_token string New device's long-lived token 1059 + old_device_revoked boolean Confirms the old device's token was revoked 1060 + ----------------------- ---------- ------------------------------------------------------- 1061 + 1062 + ## 14. Recovery [v1.0] 1063 + 1064 + Unplanned device loss recovery via Shamir share reconstruction. 1065 + 1066 + **POST /v1/recovery/initiate** [v1.0] 1067 + 1068 + Begin a recovery ceremony. The user must present 2 of 3 Shamir shares to reconstruct the rotation key.
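To illustrate what "2 of 3" means, here is a toy Shamir split over a prime field. It is a sketch under stated assumptions, not the ezpds scheme: the field prime, the integer encoding of the secret, and the function names are all chosen for illustration, and a production system would rely on a vetted library rather than hand-rolled arithmetic.

```python
import secrets

# A Mersenne prime larger than any 256-bit secret; the field choice is an assumption.
PRIME = 2 ** 521 - 1


def split_2_of_3(secret):
    """Split an integer secret into 3 shares; any 2 of them reconstruct it."""
    assert 0 <= secret < PRIME
    slope = secrets.randbelow(PRIME)  # random degree-1 coefficient
    # Share x is the point (x, f(x)) on the line f(x) = secret + slope * x.
    return [(x, (secret + slope * x) % PRIME) for x in (1, 2, 3)]


def reconstruct(share_a, share_b):
    """Recover f(0) from two distinct shares by Lagrange interpolation."""
    (xa, ya), (xb, yb) = share_a, share_b
    inv = pow(xb - xa, -1, PRIME)  # modular inverse (Python 3.8+)
    # f(0) = (ya * xb - yb * xa) / (xb - xa) in the field
    return ((ya * xb - yb * xa) * inv) % PRIME
```

A single share is one point on a line with a random slope, so on its own it reveals nothing about the secret. That property is what makes it reasonable to hold one share in relay escrow.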
1069 + 1070 + **Request Body** 1071 + 1072 + ----------------------- ---------- -------------- ------------------------------------------------------- 1073 + **Field** **Type** **Required** **Description** 1074 + email string yes Account email for verification 1075 + share_1 string yes First Shamir share (e.g., from iCloud) 1076 + share_source_1 string yes "icloud" | "relay" | "device" | "paper" 1077 + ----------------------- ---------- -------------- ------------------------------------------------------- 1078 + 1079 + **Response (200 OK)** 1080 + 1081 + ----------------------- ---------- ------------------------------------------------------- 1082 + **Field** **Type** **Description** 1083 + recovery_id string Recovery session identifier 1084 + shares_needed integer Number of additional shares required 1085 + status string "awaiting_shares" | "ready_to_verify" 1086 + ----------------------- ---------- ------------------------------------------------------- 1087 + 1088 + **POST /v1/recovery/verify-key** [v1.0] 1089 + 1090 + Submit reconstructed key material to prove DID ownership. 
1091 + 1092 + **Request Body** 1093 + 1094 + ----------------------- ---------- -------------- ------------------------------------------------------- 1095 + **Field** **Type** **Required** **Description** 1096 + recovery_id string yes Recovery session ID 1097 + share_2 string yes Second Shamir share 1098 + share_source_2 string yes Source of the second share 1099 + ----------------------- ---------- -------------- ------------------------------------------------------- 1100 + 1101 + **Response (200 OK)** 1102 + 1103 + ----------------------- ---------- ------------------------------------------------------- 1104 + **Field** **Type** **Description** 1105 + status string "verified" | "failed" 1106 + rotation_key string Reconstructed rotation public key (for verification) 1107 + ----------------------- ---------- ------------------------------------------------------- 1108 + 1109 + **GET /v1/recovery/restore** [v1.0] 1110 + 1111 + Stream the repo and blobs from the relay to the new device after successful key verification. 1112 + 1113 + **Response:** streaming CAR file + blob manifest 1114 + 1115 + **PUT /v1/keys/shares/:id** [v1.0] 1116 + 1117 + Update the relay-held Shamir share (Share 2). Used after key rotation to re-split with new shares. 
1118 + 1119 + **Request Body** 1120 + 1121 + ----------------------- ---------- -------------- ------------------------------------------------------- 1122 + **Field** **Type** **Required** **Description** 1123 + encrypted_share string yes New encrypted share data 1124 + ----------------------- ---------- -------------- ------------------------------------------------------- 1125 + 1126 + **Response (200 OK)** 1127 + 1128 + ----------------------- ---------- ------------------------------------------------------- 1129 + **Field** **Type** **Description** 1130 + updated_at string ISO 8601 1131 + ----------------------- ---------- ------------------------------------------------------- 1132 + 1133 + **GET /v1/keys/rotation-log** [v1.0] 1134 + 1135 + Immutable audit log of all Shamir share rotations and recovery attempts. 1136 + 1137 + **Response (200 OK)** 1138 + 1139 + ----------------------- ---------- ------------------------------------------------------- 1140 + **Field** **Type** **Description** 1141 + entries array List of rotation/recovery events with timestamps 1142 + ----------------------- ---------- ------------------------------------------------------- 1143 + 1144 + ## 15. Blob Management [v0.1] 1145 + 1146 + Blob endpoints follow the ATProto spec. See blob-handling-spec.md for storage architecture and lifecycle details. 1147 + 1148 + **POST /v1/blobs/upload** [v0.1] 1149 + 1150 + Alias for com.atproto.repo.uploadBlob. Accepts multipart upload, returns CID reference. Subject to per-account storage quotas. 1151 + 1152 + Note: This is the same endpoint as the XRPC uploadBlob — listed here for completeness. The provisioning API does not add a separate blob upload path. 1153 + 1154 + **GET /v1/accounts/:id/storage** [v0.1] 1155 + 1156 + Blob storage usage for an account. Extends the existing usage endpoint with blob-specific metrics. 
1157 + 1158 + **Response (200 OK)** 1159 + 1160 + ----------------------- ---------- ------------------------------------------------------- 1161 + **Field** **Type** **Description** 1162 + blob_count integer Total blobs stored 1163 + blob_bytes integer Total blob storage consumed 1164 + blob_limit integer Tier storage limit for blobs 1165 + largest_blob integer Size of largest blob (bytes) 1166 + ----------------------- ---------- ------------------------------------------------------- 1167 + 1168 + 1169 + Appendix A: Status Codes Reference 1170 + 1171 + ---------- --------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------- 1172 + **Code** **Constants** **Context** 1173 + 400 INVALID\_CLAIM, INVALID\_DOCUMENT, INVALID\_PROOF, INVALID\_ENDPOINT, INVALID\_CONFIRMATION Malformed request or failed validation 1174 + 401 UNAUTHORIZED, INVALID\_CREDENTIALS, TOKEN\_EXPIRED Authentication failure 1175 + 403 FORBIDDEN, TIER\_RESTRICTED, DIDWEB\_REQUIRES\_DOMAIN, SINGLE\_DEVICE\_TIER Insufficient permissions or tier 1176 + 404 NOT\_FOUND, DEVICE\_NOT\_FOUND, DID\_NOT\_FOUND, HANDLE\_NOT\_FOUND, NOT\_IN\_GRACE\_PERIOD Resource doesn't exist 1177 + 409 ACCOUNT\_EXISTS, DEVICE\_LIMIT, DID\_EXISTS, HANDLE\_TAKEN, ROTATION\_IN\_PROGRESS, LEASE\_HELD, MIGRATION\_IN\_PROGRESS, ACTIVE\_MIGRATION Conflict with existing state 1178 + 410 ALREADY\_DELETED Resource permanently removed 1179 + 422 WEAK\_PASSWORD, INVALID\_KEY, INVALID\_HANDLE, KEY\_MISMATCH, DIDWEB\_SELF\_SERVICE Semantic validation failure 1180 + 423 ACCOUNT\_LOCKED Temporarily locked due to abuse 1181 + 429 RATE\_LIMITED Rate or usage cap exceeded 1182 + 503 EXPORT\_IN\_PROGRESS Temporary unavailability 1183 + ---------- --------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------- 
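A client might carry the Appendix A table as a lookup so that error handling keys off the constant rather than the bare status code. The sketch below transcribes the table as written; the `RETRYABLE` set and the `should_retry` helper are assumptions for illustration and are not defined by this spec.

```python
# Error constants grouped by HTTP status, transcribed from the Appendix A table.
STATUS_CODES = {
    400: ["INVALID_CLAIM", "INVALID_DOCUMENT", "INVALID_PROOF",
          "INVALID_ENDPOINT", "INVALID_CONFIRMATION"],
    401: ["UNAUTHORIZED", "INVALID_CREDENTIALS", "TOKEN_EXPIRED"],
    403: ["FORBIDDEN", "TIER_RESTRICTED", "DIDWEB_REQUIRES_DOMAIN",
          "SINGLE_DEVICE_TIER"],
    404: ["NOT_FOUND", "DEVICE_NOT_FOUND", "DID_NOT_FOUND",
          "HANDLE_NOT_FOUND", "NOT_IN_GRACE_PERIOD"],
    409: ["ACCOUNT_EXISTS", "DEVICE_LIMIT", "DID_EXISTS", "HANDLE_TAKEN",
          "ROTATION_IN_PROGRESS", "LEASE_HELD", "MIGRATION_IN_PROGRESS",
          "ACTIVE_MIGRATION"],
    410: ["ALREADY_DELETED"],
    422: ["WEAK_PASSWORD", "INVALID_KEY", "INVALID_HANDLE", "KEY_MISMATCH",
          "DIDWEB_SELF_SERVICE"],
    423: ["ACCOUNT_LOCKED"],
    429: ["RATE_LIMITED"],
    503: ["EXPORT_IN_PROGRESS"],
}

# Invert to constant -> status for direct lookup.
HTTP_STATUS = {name: code for code, names in STATUS_CODES.items() for name in names}

# Assumption: only rate limiting and temporary unavailability are worth retrying.
RETRYABLE = {429, 503}


def should_retry(error_constant):
    """Return True if the constant maps to a status in the assumed retryable set."""
    return HTTP_STATUS[error_constant] in RETRYABLE
```

For example, `should_retry("RATE_LIMITED")` is true while `should_retry("HANDLE_TAKEN")` is false, since a 409 conflict will not resolve by repeating the same request.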
1184 + 1185 + Appendix B: Design Decisions Log 1186 + 1187 + This appendix collects all design decisions made during the specification process, with rationale for future reference. 1188 + 1189 + ------------------------------------------------ -------------------------------------------------------------------------------------------------------------------------------------------------------------- 1190 + **Decision** **Rationale** 1191 + did:plc is the default DID method Decouples identity from relay domain. Clean exit path: user signs a PLC operation to repoint service endpoint. No ongoing liability for the relay operator. 1192 + did:web requires a user-owned custom domain Eliminates exit liability. If did:web were offered on service subdomains, the relay would be obligated to host DID documents indefinitely after users leave. 1193 + Primary-device write lease for multi-device ATProto repos require a linear commit chain. LWW risks lost writes; conflict queues have poor UX. Primary-device matches actual desktop usage patterns. 1194 + Single auth path (web dashboard for all users) One security model to audit. Self-hosted static\_token bypass can be added later if operators request it. 1195 + No public event API in v1.0 Iroh channel handles device notifications. Email handles critical alerts. A webhook/SSE surface adds complexity without clear demand. 1196 + PLC mirror starts read-only Reduces operational risk. Read-write authority requires consensus participation with the PLC network --- deferred until the product matures. 1197 + Setup wizard handles DID ceremony Users get a single \"setup complete\" moment. Splitting device binding and DID ceremony into separate sessions creates drop-off risk. 1198 + 30-day grace period on account deletion Prevents accidental data loss. The relay continues serving the repo during transition, ensuring zero dropped interactions for the user's followers. 
1199 + Relay constructs DID document Client sends raw key material, not a pre-built DID document. Ensures relay controls document structure, service endpoints, and signing key binding. Client only needs public keys — no DID assembly logic required. Simplifies mobile clients significantly. 1200 + ------------------------------------------------ --------------------------------------------------------------------------------------------------------------------------------------------------------------
+352
docs/unified-milestone-map.md
··· 1 + # Unified Milestone Map 2 + 3 + Single source of truth for all ezpds phases. 4 + 5 + v0.1 Draft — March 2026 6 + 7 + Companion to: All spec documents 8 + 9 + --- 10 + 11 + ## 1. Phase Model 12 + 13 + The architecture defined two milestones (v0.1, v1.0). The mobile spec defined four (v0.1, v0.2, v1.0, v2.0+). The mobile spec is canonical. This document reconciles both into a single timeline. 14 + 15 + ### 1.1 Why Four Phases 16 + 17 + The architecture was written before the mobile-first strategy existed. Its two milestones assumed a desktop-only product. The mobile spec introduced a relay-as-full-PDS phase that precedes any desktop involvement. The four-phase model reflects the actual build order: 18 + 19 + 1. **v0.1** — Relay is a full PDS. User creates identity from phone, logs into Bluesky. 20 + 2. **v0.2** — Desktop enrolls. Relay becomes a proxy+signer. Device management from phone. 21 + 3. **v1.0** — Recovery, polish, production readiness. BYO relay support. 22 + 4. **v2.0+** — Signing sovereignty. Contingent on ATProto protocol evolution. 23 + 24 + ### 1.2 Timeline Estimates 25 + 26 + | Phase | Duration | Cumulative | 27 + |-------|----------|------------| 28 + | v0.1 | 3–4 months | 3–4 months | 29 + | v0.2 | 2–3 months | 5–7 months | 30 + | v1.0 | 3–4 months | 8–11 months | 31 + | v2.0+ | TBD | TBD | 32 + 33 + Solo developer estimates from architecture spec. v0.2 is new — estimated at 2–3 months based on scope (device pairing, desktop promotion, XRPC proxying). 34 + 35 + --- 36 + 37 + ## 2. Phase Details 38 + 39 + ### 2.1 v0.1 — Mobile-Only PDS 40 + 41 + **Goal:** User creates an ATProto identity from their iPhone and logs into Bluesky. 42 + 43 + **Lifecycle phase:** Mobile-Only. Relay is a full PDS — hosts repo, serves XRPC, signs commits, emits firehose. 
44 + 45 + #### Relay 46 + 47 + | Component | Description | Source | 48 + |-----------|-------------|--------| 49 + | Axum HTTP server | Serves all endpoints | architecture | 50 + | SQLite database | Accounts, repos, tokens, OAuth state | architecture | 51 + | Repo engine | CAR file storage, Merkle tree, commit signing | architecture | 52 + | Signing key management | P-256 key generation, Secure Enclave on phone stores root rotation key | mobile §3 | 53 + | XRPC endpoints | `com.atproto.*` read + write | architecture | 54 + | Firehose emitter | Native event stream (not proxy — relay IS the PDS) | cross-spec §2.6 | 55 + | Iroh tunnel | NAT traversal for phone ↔ relay | mobile §5 | 56 + 57 + #### OAuth (blocks Bluesky login) 58 + 59 + | Component | Description | Source | 60 + |-----------|-------------|--------| 61 + | `atproto-oauth-axum` integration | OAuth 2.1 with DPoP, PAR, PKCE | oauth spec §2.1 | 62 + | Server metadata endpoint | `/.well-known/oauth-authorization-server` | oauth spec §5 | 63 + | Authorization UI | Minimal server-rendered consent screen | oauth spec §6 | 64 + | Token storage | SQLite-backed access + refresh tokens | oauth spec §3.2 | 65 + 66 + #### Blobs 67 + 68 + | Component | Description | Source | 69 + |-----------|-------------|--------| 70 + | `uploadBlob` endpoint | CID-addressed upload | blob spec §4 | 71 + | `getBlob` endpoint | Serve by CID | blob spec §5 | 72 + | Local filesystem storage | Default for v0.1 (S3 optional via config) | blob spec §3 | 73 + | Garbage collection | 6-hour grace for unreferenced temp blobs | blob spec §6 | 74 + | Storage quotas | Per-account enforcement | blob spec §7 | 75 + 76 + #### Provisioning API 77 + 78 + | Endpoint | Description | Source | 79 + |----------|-------------|--------| 80 + | POST /v1/accounts/mobile | Combined account creation + device binding | cross-spec §2.1 | 81 + | POST /v1/dids | DID creation (relay constructs did:plc doc from key material) | cross-spec §1.2 | 82 + | POST /v1/sessions 
| Session creation (login) | provisioning §2 | 83 + | POST /v1/relay/keys | Generate relay signing key | mobile §9 | 84 + 85 + #### Identity & Keys 86 + 87 + | Component | Description | Source | 88 + |-----------|-------------|--------| 89 + | DID creation | did:plc via PLC directory (relay proxies) | provisioning, cross-spec §1.2 | 90 + | Key types | P-256 for rotation key, P-256/secp256k1 for signing | cross-spec §1.1 | 91 + | Shamir share generation | 2-of-3 split during onboarding. Share 1 = iCloud Keychain, Share 2 = relay escrow, Share 3 = user's choice | cross-spec §2.5 | 92 + 93 + #### Migration 94 + 95 + | Component | Description | Source | 96 + |-----------|-------------|--------| 97 + | Planned device swap | LAN transfer via Iroh, 6-digit code | migration §3 | 98 + 99 + #### XRPC Federation Surface (minimum viable endpoint set) 100 + 101 + The following XRPC endpoints are the minimum required for the relay to join the ATProto network as a federating PDS. Derived from @threddyrex.org's C# PDS implementation (the first non-reference PDS to successfully federate) and cross-referenced with the ATProto spec. 
102 + 103 + **com.atproto.repo — Repo CRUD + blobs (8 endpoints)** 104 + 105 + | Endpoint | Method | Description | 106 + |----------|--------|-------------| 107 + | `com.atproto.repo.createRecord` | POST | Create a record in a repo collection | 108 + | `com.atproto.repo.putRecord` | POST | Write a record (create or update) | 109 + | `com.atproto.repo.deleteRecord` | POST | Delete a record | 110 + | `com.atproto.repo.applyWrites` | POST | Batch write (create/update/delete) | 111 + | `com.atproto.repo.getRecord` | GET | Fetch a single record by key | 112 + | `com.atproto.repo.listRecords` | GET | List records in a collection | 113 + | `com.atproto.repo.describeRepo` | GET | Repo metadata (DID, handle, collections) | 114 + | `com.atproto.repo.uploadBlob` | POST | Upload a blob, returns CID ref | 115 + 116 + **com.atproto.server — Auth + account lifecycle (6 endpoints)** 117 + 118 + | Endpoint | Method | Description | 119 + |----------|--------|-------------| 120 + | `com.atproto.server.createSession` | POST | Login (returns access + refresh tokens) | 121 + | `com.atproto.server.getSession` | GET | Validate current session | 122 + | `com.atproto.server.refreshSession` | POST | Rotate session tokens | 123 + | `com.atproto.server.describeServer` | GET | Server capabilities + invite policy | 124 + | `com.atproto.server.activateAccount` | POST | Activate a deactivated account | 125 + | `com.atproto.server.deactivateAccount` | POST | Deactivate account (preserves data) | 126 + 127 + **com.atproto.sync — Federation surface (7 endpoints)** 128 + 129 + | Endpoint | Method | Description | 130 + |----------|--------|-------------| 131 + | `com.atproto.sync.getRepo` | GET | Full repo export (CAR file) | 132 + | `com.atproto.sync.getRecord` | GET | Single record as CAR-encoded block | 133 + | `com.atproto.sync.getBlob` | GET | Fetch blob by CID | 134 + | `com.atproto.sync.listBlobs` | GET | List blob CIDs for a repo | 135 + | `com.atproto.sync.listRepos` | GET | List all repos 
hosted by this PDS | 136 + | `com.atproto.sync.getRepoStatus` | GET | Repo sync status (active/deactivated) | 137 + | `com.atproto.sync.subscribeRepos` | WS | Firehose — WebSocket event stream of repo commits | 138 + 139 + **com.atproto.identity (1 endpoint)** 140 + 141 + | Endpoint | Method | Description | 142 + |----------|--------|-------------| 143 + | `com.atproto.identity.resolveHandle` | GET | Resolve handle → DID | 144 + 145 + **app.bsky.* — Appview proxy (not implemented locally)** 146 + 147 + | Endpoint | Method | Description | 148 + |----------|--------|-------------| 149 + | `app.bsky.*` (catch-all) | * | Proxy to Bluesky appview (bsky.network) | 150 + | `app.bsky.actor.getPreferences` | GET | Stored locally — survives appview outages | 151 + | `app.bsky.actor.putPreferences` | POST | Stored locally | 152 + 153 + **chat.bsky.convo (2 endpoints)** 154 + 155 + | Endpoint | Method | Description | 156 + |----------|--------|-------------| 157 + | `chat.bsky.convo.getLog` | GET | Chat conversation log | 158 + | `chat.bsky.convo.listConvos` | GET | List chat conversations | 159 + 160 + **Infrastructure (1 endpoint)** 161 + 162 + | Endpoint | Method | Description | 163 + |----------|--------|-------------| 164 + | `/_health` | GET | Liveness check | 165 + 166 + **Total: 25 XRPC endpoints + health check.** This is the federation acceptance test — if these all work correctly, the relay is a functioning PDS on the network. 167 + 168 + Note: `app.bsky.*` calls are proxied to the appview, not implemented locally. The relay only stores preferences locally. The `chat.bsky.convo` endpoints may also be proxied depending on whether ezpds hosts chat state or defers to a chat service. 
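The endpoint tables above lend themselves to a machine-readable checklist for the acceptance test. The following sketch covers only the 22 `com.atproto.*` endpoints (the proxied `app.bsky.*` and `chat.bsky.convo` rows are left out), and the `checklist` helper and the example host are hypothetical. The one external fact it relies on is that XRPC methods are served under `/xrpc/<NSID>`.

```python
# The com.atproto endpoints from the tables above, grouped by namespace.
FEDERATION_SURFACE = {
    "com.atproto.repo": [
        ("POST", "createRecord"), ("POST", "putRecord"), ("POST", "deleteRecord"),
        ("POST", "applyWrites"), ("GET", "getRecord"), ("GET", "listRecords"),
        ("GET", "describeRepo"), ("POST", "uploadBlob"),
    ],
    "com.atproto.server": [
        ("POST", "createSession"), ("GET", "getSession"), ("POST", "refreshSession"),
        ("GET", "describeServer"), ("POST", "activateAccount"),
        ("POST", "deactivateAccount"),
    ],
    "com.atproto.sync": [
        ("GET", "getRepo"), ("GET", "getRecord"), ("GET", "getBlob"),
        ("GET", "listBlobs"), ("GET", "listRepos"), ("GET", "getRepoStatus"),
        ("WS", "subscribeRepos"),  # WebSocket upgrade on the same path
    ],
    "com.atproto.identity": [("GET", "resolveHandle")],
}


def checklist(base_url):
    """Yield (method, url) for every listed endpoint under base_url."""
    for namespace, methods in FEDERATION_SURFACE.items():
        for method, name in methods:
            yield method, f"{base_url}/xrpc/{namespace}.{name}"
```

A test harness could iterate over `checklist("https://relay.example")` and record which entries respond as expected, treating the `WS` entry separately since it needs a WebSocket client.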
169 + 170 + #### Not in v0.1 171 + 172 + - Desktop enrollment/pairing 173 + - Desktop XRPC proxying 174 + - Firehose proxy (desktop sleep) 175 + - Key rotation 176 + - Unplanned device recovery 177 + - Tier pricing (all users on free tier) 178 + - PostgreSQL backend 179 + - CDN/S3 blob storage (optional, not required) 180 + - BYO relay distribution 181 + 182 + --- 183 + 184 + ### 2.2 v0.2 — Desktop Enrollment 185 + 186 + **Goal:** User pairs a desktop machine and manages devices from their phone. 187 + 188 + **Lifecycle phase:** Desktop-Enrolled. Relay becomes XRPC proxy + signer. Desktop hosts the repo. 189 + 190 + #### New in v0.2 191 + 192 + | Component | Description | Source | 193 + |-----------|-------------|--------| 194 + | Device pairing | QR code scan, desktop promotion | mobile §5 | 195 + | POST /v1/devices/:id/pair | Pairing endpoint | mobile §9 | 196 + | POST /v1/devices/:id/promote | Promote desktop to repo host | mobile §9 | 197 + | GET /v1/devices/:id/status | Device health/status | mobile §9 | 198 + | DELETE /v1/devices/:id | De-enroll device | mobile §9 | 199 + | XRPC write proxying | Relay forwards createRecord etc. 
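How the relay treats an incoming XRPC write depends on the operating mode reported by `GET /v1/relay/mode`. The sketch below encodes only what the specs state: the relay applies writes itself in mobile-only mode, forwards them to the desktop once enrolled, and answers 503 when the desktop is unreachable. The function name and the action labels are hypothetical.

```python
# Mode strings as returned by GET /v1/relay/mode.
MODES = ("mobile-only", "desktop-enrolled", "desktop-offline")


def route_write(mode):
    """Return (action, http_error) for an incoming XRPC write in the given mode."""
    if mode == "mobile-only":
        # The relay is the PDS: it hosts the repo and applies the write itself.
        return ("apply-on-relay", None)
    if mode == "desktop-enrolled":
        # The desktop hosts the repo: forward the write, then sign its unsigned commit.
        return ("forward-to-desktop", None)
    if mode == "desktop-offline":
        # The desktop is unreachable: writes are refused with 503.
        return ("reject", 503)
    raise ValueError(f"unknown relay mode: {mode!r}")
```

Reads are outside this sketch; per the offline-handling row in the v0.2 component table, they are served from the relay's cache while the desktop is unreachable.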
to desktop | mobile §4 | 200 + | POST /v1/relay/commits/sign | Sign unsigned commit from desktop | mobile §9 | 201 + | GET /v1/relay/repo/snapshot | Full repo snapshot (CAR) for desktop sync | mobile §9 | 202 + | GET /v1/relay/mode | Current operating mode (mobile-only vs desktop-enrolled) | mobile §9 | 203 + | Desktop offline handling | 503 on writes when desktop unreachable, reads from cache | mobile §4.3 | 204 + | Firehose proxy | Relay maintains BGS WebSocket on behalf of sleeping desktop | architecture | 205 + | Blob forwarding | Forward uploaded blobs to desktop via Iroh | blob spec §5.2 | 206 + | Blob cache | Relay caches blobs, fetches from desktop on miss | blob spec §5.2 | 207 + 208 + #### Unchanged from v0.1 209 + 210 + - OAuth (no changes needed — relay remains the auth endpoint) 211 + - Provisioning API core endpoints 212 + - Firehose native emission (still works alongside proxy) 213 + 214 + --- 215 + 216 + ### 2.3 v1.0 — Production Launch 217 + 218 + **Goal:** Production-ready identity wallet. Recovery support. BYO relay. 219 + 220 + **Lifecycle phase:** All phases stable and polished. 
221 + 222 + #### New in v1.0 223 + 224 + | Component | Description | Source | 225 + |-----------|-------------|--------| 226 + | Unplanned device recovery | Shamir reconstruction ceremony | migration §4 | 227 + | POST /v1/recovery/initiate | Begin recovery | migration §9 | 228 + | POST /v1/recovery/verify-key | Prove DID key reconstruction | migration §9 | 229 + | GET /v1/recovery/restore | Stream repo + blobs from relay | migration §9 | 230 + | PUT /v1/keys/shares/:id | Update relay-held Shamir share | migration §9 | 231 + | GET /v1/keys/rotation-log | Audit log of Shamir rotations | migration §9 | 232 + | Key rotation | Shamir-based rotation via relay | architecture | 233 + | DELETE /v1/relay/keys/:keyId | Revoke relay signing key | mobile §9 | 234 + | Tier pricing | Free/Pro/Business subscription tiers | architecture, cross-spec §1.4 | 235 + | BYO relay binary | Nix/Docker distribution for self-hosted operators | architecture | 236 + | PostgreSQL option | Alternative to SQLite for larger deployments | oauth spec §8, architecture | 237 + | S3 blob backend | Default for managed relay (R2 recommended) | blob spec §9 | 238 + | CDN integration | R2 + Workers for Pro/Business blob serving | blob spec §9 | 239 + | Local → S3 migration tool | For operators upgrading storage | blob spec §9 | 240 + | OAuth rate limiting | Per-endpoint limits | oauth spec §8 | 241 + | OAuth audit logging | Authorization grant logging | oauth spec §8 | 242 + | Customizable auth UI | Branding for BYO relay operators | oauth spec §6 | 243 + | Token revocation endpoint | Active session management | oauth spec §8 | 244 + | Client metadata caching | TTL-based re-validation (24h) | oauth spec §7.2 | 245 + | Blob manifest in transfer | Include blobs in device transfer bundle | blob spec §9 | 246 + | PLC directory mirror | Read-only cache for DID resolution | provisioning | 247 + | Dereferenced blob cleanup | Remove blobs no longer referenced by any record | blob spec §9 | 248 + | MinIO docs | BYO 
relay blob storage documentation | blob spec §9 | 249 + 250 + --- 251 + 252 + ### 2.4 v2.0+ — Signing Sovereignty 253 + 254 + **Goal:** User's own hardware signs commits. Desktop holds the signing key. 255 + 256 + **Contingency:** Requires ATProto protocol changes (multi-key support or key delegation). 257 + 258 + | Component | Description | Source | 259 + |-----------|-------------|--------| 260 + | Pluggable signer: desktop-remote | Desktop signs commits directly, relay no longer signs | mobile §10 | 261 + | Multi-device sync | Share key across devices without full migration | migration §8 | 262 + | Scoped OAuth tokens | Read-only grants for specific collections | oauth spec §8 | 263 + | Token introspection endpoint | RFC 7662 | oauth spec §8 | 264 + | OAuth admin dashboard | Manage active sessions | oauth spec §8 | 265 + | Video transcoding | Multiple resolutions for video blobs | blob spec §9 | 266 + | Blob deduplication | Cross-account content-addressed dedup | blob spec §9 | 267 + | PLC read-write authority | Participate as PLC directory peer | provisioning | 268 + 269 + --- 270 + 271 + ## 3. 
Cross-Document Phase Mapping 272 + 273 + How each document's milestones map to the unified phases: 274 + 275 + | Unified Phase | Architecture | Mobile | Provisioning | Migration | OAuth | Blobs | 276 + |---------------|-------------|--------|-------------|-----------|-------|-------| 277 + | **v0.1** | v0.1 (technical preview) | iOS v0.1 (identity wallet) | Core endpoints | v0.1 (planned swap) | v0.1 (basic OAuth) | v0.1 (basic blobs) | 278 + | **v0.2** | — (not represented) | iOS v0.2 (device mgmt) | Device endpoints | — | — (no changes) | Desktop blob sync | 279 + | **v1.0** | v1.0 (public launch) | iOS v1.0 (recovery) | Full API + milestones | v1.0 (full recovery) | v1.0 (production) | v1.0 (production) | 280 + | **v2.0+** | — | v2.0+ (signing sovereignty) | PLC authority | Multi-device sync | Later | Later | 281 + 282 + ### 3.1 Architecture Gap 283 + 284 + The architecture document has no v0.2 milestone. It needs a "Phase 1: Desktop Enrollment" between its technical preview and public launch. The architecture's v0.1 scope includes some items that belong in v0.2 (device pairing, desktop promotion). 285 + 286 + ### 3.2 Provisioning API Gap 287 + 288 + The provisioning API has no milestone tags at all. Every endpoint group needs a phase assignment. The 17 endpoints from the mobile and migration specs need to be integrated and tagged. 289 + 290 + --- 291 + 292 + ## 4. 
Dependency Graph 293 + 294 + Critical path items that block subsequent phases: 295 + 296 + ``` 297 + v0.1 Critical Path: 298 + Axum server → SQLite schema → Repo engine → XRPC endpoints 299 + → OAuth (blocks Bluesky login) 300 + → Blob upload/serve 301 + → DID creation → Account creation → Shamir split 302 + → Iroh tunnel (blocks device transfer) 303 + 304 + v0.2 Critical Path: 305 + v0.1 complete → Device pairing protocol → Desktop promotion 306 + → XRPC proxy layer → Commit signing endpoint 307 + → Firehose proxy 308 + → Blob forwarding via Iroh 309 + 310 + v1.0 Critical Path: 311 + v0.2 complete → Recovery ceremony → Shamir reconstruction 312 + → Tier pricing → BYO relay packaging 313 + → S3 migration → CDN setup 314 + → PostgreSQL option 315 + ``` 316 + 317 + --- 318 + 319 + ## 5. Feature ↔ Phase Matrix 320 + 321 + Quick reference: which phase delivers which user-visible capability. 322 + 323 + | User Capability | Phase | 324 + |----------------|-------| 325 + | Create ATProto identity from iPhone | v0.1 | 326 + | Log into Bluesky | v0.1 | 327 + | Post, like, follow via third-party apps | v0.1 | 328 + | Transfer identity to new phone (planned) | v0.1 | 329 + | Pair a desktop Mac | v0.2 | 330 + | Desktop runs full PDS, relay proxies | v0.2 | 331 + | Manage devices from phone | v0.2 | 332 + | Desktop sleeps, relay keeps firehose alive | v0.2 | 333 + | Recover from lost device | v1.0 | 334 + | Self-host your own relay | v1.0 | 335 + | Choose subscription tier | v1.0 | 336 + | CDN-accelerated media serving | v1.0 | 337 + | Desktop signs its own commits | v2.0+ | 338 + 339 + --- 340 + 341 + ## 6. 
Action Items 342 + 343 + This document resolves cross-spec-analysis items: 344 + 345 + - **#7** (architecture: move basic Shamir to v0.1) — Resolved: Shamir split is in v0.1 (§2.1) 346 + - **#8** (architecture: add unified milestone map) — This document 347 + - **#14** (provisioning: add milestone tags) — Phase assignments listed in §2.1–2.4 348 + - **#23** (new document: unified milestone map) — This document 349 + 350 + ### Remaining updates needed in individual documents 351 + 352 + The architecture (items 1–6), provisioning API (items 9–13, 15), migration spec (items 16–18), and mobile spec (items 19–20) still need their own text updated to reference these unified phases. Those are separate action items tracked in cross-spec-analysis.md.