An easy-to-host PDS on the ATProtocol, MacOS. Grandma-approved.

docs: add MM-89 design plan — did:plc genesis op and account promotion

Completed brainstorming session. Design includes:
- crypto crate: pure build_did_plc_genesis_op function (CBOR, ECDSA P-256, RFC 6979, base32 DID derivation)
- relay crate: POST /v1/dids with pending_session auth, pre-store retry resilience, atomic account promotion
- 2 implementation phases

authored by malpercio.dev and committed by

Tangled ec646d5e a4506bfe

+219
docs/design-plans/2026-03-13-MM-89.md
# MM-89: DID Creation (did:plc via PLC Directory Proxy)

## Summary

MM-89 implements the final step of mobile account creation: converting a pending account into a fully active ATProto identity. When a mobile client creates an account, it receives a temporary `pending_session` token but no permanent DID. This ticket completes the ceremony: the relay constructs a signed did:plc genesis operation, submits it to the external PLC Directory to register the DID on the network, then atomically promotes the account in its own database (writing to `accounts`, `did_documents`, and `handles` while deleting the pending rows).

The work is split across two crates following the workspace's established Functional Core / Imperative Shell pattern. The `crypto` crate gains a pure, I/O-free function that builds and signs the genesis operation and derives the resulting DID: nothing but bytes in, struct out. The `relay` crate gains the `POST /v1/dids` HTTP endpoint that orchestrates everything: authenticating the pending session, calling the crypto function, talking to plc.directory, and committing the promotion transaction. A pre-store resilience mechanism ensures that if the promotion transaction fails after the PLC Directory has already accepted the operation, the client can safely retry without re-submitting to the directory.

## Definition of Done

1. `crypto` crate gains a pure function `build_did_plc_genesis_op(rotation_key, signing_key, signing_private_key, handle, service_endpoint)` that constructs and ECDSA-signs a did:plc genesis operation and derives the resulting DID, with no I/O and no HTTP.
2. `relay` crate gains `POST /v1/dids` that: authenticates via pending_session Bearer token → calls the crypto function → POSTs the signed op to plc.directory → atomically promotes the account (INSERT `accounts`, INSERT `did_documents`, INSERT `handles`, DELETE `pending_accounts`, DELETE `pending_sessions`) → returns the DID.
3. New workspace deps: `reqwest` (HTTP client), `ciborium` (CBOR encoding for signing + DID derivation), `data-encoding` (base32-lowercase for DID derivation), and `wiremock` (dev-only mock of plc.directory).
4. DB migration if the existing V002 schema needs adjustment (e.g., `accounts.password_hash` nullability for mobile-created accounts).
5. Tests: unit tests in `crypto` for genesis op construction + DID derivation; integration tests in `relay` with a mocked plc.directory.

**Out of scope:** Handle DNS configuration, repo initialization, federation activation.

## Acceptance Criteria

### MM-89.AC1: crypto crate produces a valid did:plc genesis operation
- **MM-89.AC1.1 Success:** `build_did_plc_genesis_op` with valid inputs returns `PlcGenesisOp` with `did` matching `^did:plc:[a-z2-7]{24}$`
- **MM-89.AC1.2 Success:** `signed_op_json` contains all required fields: `type`, `rotationKeys`, `verificationMethods`, `alsoKnownAs`, `services`, `prev` (null), `sig`
- **MM-89.AC1.3 Success:** `rotation_key` appears as `rotationKeys[0]`; `signing_key` appears as both `rotationKeys[1]` and `verificationMethods.atproto`
- **MM-89.AC1.4 Success:** Calling `build_did_plc_genesis_op` twice with identical inputs returns the same `did` (RFC 6979 determinism)
- **MM-89.AC1.5 Failure:** Invalid `signing_private_key` bytes (wrong length or invalid scalar) return `CryptoError::PlcOperation`

### MM-89.AC2: POST /v1/dids completes the DID ceremony and promotes the account
- **MM-89.AC2.1 Success:** Valid request with a live `pending_session` token returns `200 OK` with `{ "did": "did:plc:...", "status": "active" }`
- **MM-89.AC2.2 Success:** After success, `accounts` row exists with `did` as PK, correct `email`, and `password_hash` NULL
- **MM-89.AC2.3 Success:** After success, `did_documents` row exists for the DID with non-empty `document` JSON
- **MM-89.AC2.4 Success:** After success, `handles` row exists linking the handle to the DID
- **MM-89.AC2.5 Success:** After success, `pending_accounts` and `pending_sessions` rows for the account are deleted
- **MM-89.AC2.6 Success:** When `pending_did` is already set (client retry), handler skips the plc.directory HTTP call and completes DB promotion, returning 200
- **MM-89.AC2.7 Failure:** Missing `Authorization` header returns 401 `UNAUTHORIZED`
- **MM-89.AC2.8 Failure:** Expired `pending_session` token returns 401 `UNAUTHORIZED`
- **MM-89.AC2.9 Failure:** `signingKey` not present in `relay_signing_keys` returns 404 `NOT_FOUND`
- **MM-89.AC2.10 Failure:** Account already fully promoted (`accounts` row already exists) returns 409 `DID_ALREADY_EXISTS`
- **MM-89.AC2.11 Failure:** A non-2xx response from plc.directory returns 502 `PLC_DIRECTORY_ERROR`

### MM-89.AC3: Schema migration and protocol correctness
- **MM-89.AC3.1:** V008 migration applies cleanly on top of V007; `accounts.password_hash` accepts NULL; `pending_accounts.pending_did` column exists
- **MM-89.AC3.2:** `sig` field in `signed_op_json` is a base64url string (no padding) decoding to exactly 64 bytes
- **MM-89.AC3.3:** `alsoKnownAs` in `signed_op_json` contains `at://{handle}` (not bare handle)

## Glossary

- **ATProto (AT Protocol)**: The open federated social protocol developed by Bluesky. Defines how identities (DIDs), repos, handles, and federation work.
- **DID (Decentralized Identifier)**: A W3C-standard URI for a self-sovereign identity, e.g. `did:plc:abc123...`. Permanent and globally resolvable without a central registry.
- **did:plc**: The DID method used by ATProto. A DID is derived from its own signed genesis operation, making the identifier content-addressed. "plc" stands for Placeholder.
- **PLC Directory (plc.directory)**: The external append-only log operated by Bluesky that stores the history of did:plc operations. Registering a new DID requires POSTing a valid signed genesis operation to it.
- **Genesis operation**: The first signed operation in a did:plc log. It defines the initial rotation keys, verification methods, handle (`alsoKnownAs`), and service endpoint. The DID itself is derived from the SHA-256 hash of this CBOR-encoded operation.
- **Rotation key**: A cryptographic key authorized to update or revoke a DID's keys. `rotationKeys[0]` (the user's device key) has highest authority; `rotationKeys[1]` (the relay's key) is lower priority.
- **Verification method**: A public key declared in a DID document under a named capability (e.g. `atproto`). Used to verify signatures on ATProto records.
- **did:key URI (`DidKeyUri`)**: A self-describing DID that encodes a public key directly in the identifier string, e.g. `did:key:zDn...`. No external resolution needed.
- **Pending account / pending session**: Temporary rows created during `POST /v1/accounts/mobile`. The account has no DID yet. The session token from this phase is what `POST /v1/dids` authenticates with.
- **Account promotion**: The atomic database transaction that moves a pending account to a fully active account: inserts into `accounts`, `did_documents`, and `handles`; deletes from `pending_accounts` and `pending_sessions`.
- **Functional Core / Imperative Shell**: An architectural pattern where pure logic (no side effects) lives in an inner "core" layer and all I/O (HTTP, database) lives in an outer "shell" layer. The `crypto` crate is the core; `relay` is the shell.
- **RFC 6979**: A deterministic algorithm for ECDSA signing that derives the per-signature nonce from the private key and message rather than a random source. Guarantees the same inputs always produce the same signature, and therefore the same DID.
- **CBOR (Concise Binary Object Representation)**: A binary data serialization format (RFC 8949, which obsoletes RFC 7049). did:plc requires operations to be CBOR-encoded before hashing or signing. Implemented here via the `ciborium` crate.
- **ECDSA (Elliptic Curve Digital Signature Algorithm)**: The signing algorithm used to authenticate did:plc operations. This codebase uses P-256 (secp256r1 / prime256v1).
- **P-256**: A specific elliptic curve standardized by NIST. ATProto supports both P-256 and secp256k1; this relay uses P-256 exclusively, matching iOS Secure Enclave capabilities.
- **Low-S canonical signature**: A normalization for ECDSA signatures that forces the `s` component to be in the lower half of the curve order. Required by did:plc for portability across implementations.
- **base32-lowercase**: The encoding applied to the SHA-256 digest to form the DID's method-specific identifier, e.g. `did:plc:abcdefghijklmnopqrstuvwx` (first 24 chars).
- **base64url (no padding)**: URL-safe base64 encoding without trailing `=` characters. Used for the `sig` field in the signed operation JSON.
- **`alsoKnownAs`**: A DID document field listing the subject's known aliases. For ATProto this is the account's handle prefixed with `at://`, e.g. `at://alice.example.com`.
- **`verificationMethods`**: DID document field mapping capability names to public keys. ATProto uses `atproto` as the key name for the commit-signing key.
- **Pre-store resilience**: A pattern where the relay writes the derived DID to `pending_accounts.pending_did` before contacting the PLC Directory. If the subsequent DB promotion fails, a client retry can detect the pre-stored DID, skip the already-accepted directory call, and retry only the promotion.
- **`wiremock`**: A Rust library for standing up an in-process HTTP mock server in tests, used here to simulate plc.directory responses without hitting the live service.
- **`ciborium`**: A Rust library for CBOR serialization/deserialization (serde-compatible). Used for encoding genesis operations before signing and DID derivation.
- **`data-encoding`**: A Rust library providing base32 and base64 codecs. Used here for base32-lowercase encoding during DID derivation.

## Architecture

`POST /v1/dids` spans two crates following the Functional Core / Imperative Shell pattern established in this workspace.

**`crates/crypto/src/plc.rs`** (new; pure functional core)

Exposes one public function:

```rust
pub fn build_did_plc_genesis_op(
    rotation_key: &DidKeyUri,        // user's root rotation key (from device)
    signing_key: &DidKeyUri,         // relay's signing key (also rotationKeys[1])
    signing_private_key: &[u8; 32],  // relay's decrypted private key scalar
    handle: &str,                    // e.g. "alice.example.com"
    service_endpoint: &str,          // e.g. "https://relay.example.com"
) -> Result<PlcGenesisOp, CryptoError>

pub struct PlcGenesisOp {
    pub did: String,            // "did:plc:xxxx", ready for use as DB primary key
    pub signed_op_json: String, // signed operation JSON, ready to POST to plc.directory
}
```

Internally: build unsigned op struct → CBOR-encode (ciborium) → ECDSA-SHA256 sign (p256, RFC 6979 deterministic, low-S canonical) → 64-byte r||s → base64url `sig` field → CBOR-encode signed op → SHA-256 (sha2) → base32-lowercase (data-encoding, first 24 chars) → `"did:plc:"` prefix → serialize signed op to JSON string. No I/O.

The relay's signing key appears in both `rotationKeys[1]` and `verificationMethods.atproto` (dual purpose per mobile-architecture-spec §2.1). The user's root rotation key appears in `rotationKeys[0]`, giving the device highest-priority authority over the DID, but the relay signs the genesis operation with its own key. See MM-141 for the future device-signing design.

RFC 6979 deterministic signing means the same inputs always produce the same DID. This is required for the retry resilience mechanism below.
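
The last stage of that pipeline (digest → base32-lowercase → first 24 chars → `did:plc:` prefix) can be sketched with the standard library alone. This is a minimal illustration, not the planned implementation: the plan uses the `data-encoding` crate for base32 and `sha2` for the digest, whereas the hand-rolled `base32_lower` encoder and the `did_from_digest` and `is_valid_did_plc` helpers below are hypothetical names introduced here, and the digest is taken as an input rather than computed.

```rust
/// RFC 4648 base32 alphabet in lowercase; no padding is emitted.
const BASE32_LOWER: &[u8; 32] = b"abcdefghijklmnopqrstuvwxyz234567";

/// Encode bytes as unpadded lowercase base32 (5 bits per output character).
/// Hypothetical stand-in for the `data-encoding` codec named in the plan.
fn base32_lower(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() * 8 + 4) / 5);
    let mut buffer: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of unread bits held in `buffer`
    for &byte in input {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            out.push(BASE32_LOWER[((buffer >> bits) & 0x1f) as usize] as char);
        }
        // Drop bits that were already emitted.
        buffer &= (1u32 << bits) - 1;
    }
    if bits > 0 {
        // Left-align the remaining bits in a final 5-bit group.
        out.push(BASE32_LOWER[((buffer << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Form the DID from the SHA-256 digest of the CBOR-encoded signed operation.
/// The digest is assumed to have been computed elsewhere (e.g. with `sha2`).
fn did_from_digest(digest: &[u8; 32]) -> String {
    let encoded = base32_lower(digest); // 52 characters for 32 bytes
    format!("did:plc:{}", &encoded[..24])
}

/// Check the shape required by MM-89.AC1.1: ^did:plc:[a-z2-7]{24}$
fn is_valid_did_plc(did: &str) -> bool {
    match did.strip_prefix("did:plc:") {
        Some(id) => {
            id.len() == 24 && id.bytes().all(|b| matches!(b, b'a'..=b'z' | b'2'..=b'7'))
        }
        None => false,
    }
}
```

Because every step is a pure function of the digest, the same signed operation always yields the same identifier, which is the property the retry mechanism relies on.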

**`crates/relay/src/routes/create_did.rs`** (new; imperative shell)

Route: `POST /v1/dids`
Auth: `Authorization: Bearer <sessionToken>` from `pending_sessions`
Request body:
```json
{ "signingKey": "did:key:z...", "rotationKey": "did:key:z..." }
```
Response `200 OK`:
```json
{ "did": "did:plc:...", "status": "active" }
```

The `handle` is read from the `pending_accounts` row (stored during `POST /v1/accounts/mobile`), not re-sent by the client. The `serviceEndpoint` is `config.public_url`.

Handler flow:
1. `require_pending_session(headers, db)` → `PendingSessionInfo { account_id, device_id }`
2. `SELECT handle, pending_did FROM pending_accounts WHERE id = account_id`
3. `SELECT private_key_encrypted FROM relay_signing_keys WHERE id = signing_key`
4. `decrypt_private_key(encrypted, master_key)` (existing crypto function)
5. `build_did_plc_genesis_op(rotation_key, signing_key, private_key, handle, public_url)` → `PlcGenesisOp { did, signed_op_json }`
6. If `pending_did IS NULL`: `UPDATE pending_accounts SET pending_did = did` (pre-store for retry resilience)
7. If `pending_did` was already set (client retry after partial failure): skip step 8
8. `POST {plc_directory_url}/{did}` with `signed_op_json` body (`reqwest`)
9. Atomic transaction: `INSERT accounts`, `INSERT did_documents`, `INSERT handles`, `DELETE pending_sessions`, `DELETE pending_accounts`
10. Return `{ did, status: "active" }`

**Pre-store resilience:** If the DB promotion transaction (step 9) fails after plc.directory succeeds (step 8), `pending_did` is already set. On client retry, the handler detects this, skips the plc.directory call (which would 4xx on a duplicate DID anyway), and retries only the promotion transaction. Because signing is RFC 6979 deterministic, the same `signed_op_json` is re-derived from the same inputs.
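
The retry behaviour in steps 6 to 9 can be illustrated with an in-memory simulation. This is only a sketch under stated assumptions: `PendingAccount`, `FakeDirectory`, and `run_ceremony` are hypothetical names, the directory call is assumed to always succeed, and the promotion outcome is injected as a flag so that a partial failure can be reproduced without a database.

```rust
/// Hypothetical in-memory stand-in for a `pending_accounts` row.
struct PendingAccount {
    pending_did: Option<String>,
}

/// Hypothetical stand-in for plc.directory that counts accepted submissions.
struct FakeDirectory {
    submissions: u32,
}

/// Steps 6-9 of the handler flow. `promotion_succeeds` simulates whether the
/// atomic promotion transaction commits.
fn run_ceremony(
    account: &mut PendingAccount,
    directory: &mut FakeDirectory,
    derived_did: &str,
    promotion_succeeds: bool,
) -> Result<String, &'static str> {
    if account.pending_did.is_none() {
        // Step 6: pre-store the derived DID before contacting the directory.
        account.pending_did = Some(derived_did.to_string());
        // Step 8: submit the signed op (first attempt only; assumed to succeed).
        directory.submissions += 1;
    }
    // Step 7 is the implicit else branch: a pre-stored DID means the
    // directory call is skipped and only the promotion is retried.

    // Step 9: atomic promotion, which may fail independently of the directory.
    if !promotion_succeeds {
        return Err("promotion transaction failed");
    }
    Ok(derived_did.to_string())
}
```

Calling `run_ceremony` once with `promotion_succeeds = false` and then again with `true` leaves `submissions` at 1, which mirrors the case MM-89.AC2.6 describes.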

**Auth helper** (`crates/relay/src/routes/auth.rs`): new `require_pending_session(headers, db) -> Result<PendingSessionInfo, ApiError>`. SHA-256 hashes the Bearer token, queries `pending_sessions WHERE token_hash = ? AND expires_at > datetime('now')`. Same inline pattern as `require_admin_token`.

**DID document construction:** The `did_documents.document` JSON column stores a DID Core document constructed locally from known fields (`@context`, `id`, `alsoKnownAs`, `verificationMethod`, `service`). No round-trip GET to plc.directory needed.

**New workspace dependencies:**

| Crate | Version | Used by | Purpose |
|-------|---------|---------|---------|
| `reqwest` | `0.12` (features: `json`) | relay | HTTP client for plc.directory |
| `ciborium` | `0.2` | crypto | CBOR encode for signing + DID derivation |
| `data-encoding` | `2` | crypto | base32-lowercase for DID derivation |
| `wiremock` | `0.6` (dev) | relay | Mock plc.directory in integration tests |

**Config and AppState additions:**
- `Config.plc_directory_url: String`: default `"https://plc.directory"`; overridden in tests
- `AppState.http_client: reqwest::Client`: created once at startup, cheap to clone per request

**Error codes:**

| Status | Code | Condition |
|--------|------|-----------|
| 401 | `UNAUTHORIZED` | Missing, invalid, or expired session token |
| 404 | `NOT_FOUND` | `signingKey` not in `relay_signing_keys` |
| 409 | `DID_ALREADY_EXISTS` | `accounts` row already exists for this account |
| 502 | `PLC_DIRECTORY_ERROR` | plc.directory returned non-2xx |
| 500 | `INTERNAL_ERROR` | DB failure, decrypt failure, key parse error |

## Existing Patterns

This design follows established patterns throughout:

- **Functional Core / Imperative Shell**: `PlcGenesisOp` struct and `build_did_plc_genesis_op` follow the `P256Keypair` / `generate_p256_keypair` pattern in `crates/crypto/src/keys.rs`. Pure function, named-field return struct, `CryptoError` result.
- **Imperative Shell route comment**: `create_did.rs` opens with the `// pattern: Imperative Shell` block documenting inputs, processing steps, and outputs, matching every other route file.
- **Inline auth helper**: `require_pending_session` follows `require_admin_token` in `crates/relay/src/routes/auth.rs`: extract Bearer header, hash, lookup, return typed result or `ApiError`.
- **Atomic provisioning transaction**: the DB promotion transaction follows the `provision_mobile_account` pattern in `create_mobile_account.rs`: `db.begin()` → multiple statements → `tx.commit()`, with `inspect_err` tracing on each step.
- **Signing key decrypt pattern**: decrypting `relay_signing_keys.private_key_encrypted` before passing to crypto follows `create_signing_key.rs`.
- **ApiError / ErrorCode**: all error paths use `ApiError::new(ErrorCode::..., message)` from the `common` crate.

## Implementation Phases

<!-- START_PHASE_1 -->
### Phase 1: `crypto` crate (DID genesis op)

**Goal:** Implement `build_did_plc_genesis_op` as a pure function in the crypto crate, tested in isolation before any relay changes.

**Components:**
- `Cargo.toml` (workspace root): add `ciborium = "0.2"` and `data-encoding = "2"` workspace deps
- `crates/crypto/Cargo.toml`: add `ciborium` and `data-encoding`; add `serde` and `serde_json` (already workspace)
- `crates/crypto/src/error.rs`: add `CryptoError::PlcOperation(String)` variant
- `crates/crypto/src/plc.rs` (new): `UnsignedPlcOp`, `SignedPlcOp`, `PlcGenesisOp` structs; `build_did_plc_genesis_op` function
- `crates/crypto/src/lib.rs`: re-export `build_did_plc_genesis_op`, `PlcGenesisOp`
- `crates/crypto/CLAUDE.md`: update contracts section

**Dependencies:** None (first phase)

**Done when:** `cargo test -p crypto` passes, covering MM-89.AC1.1–AC1.5
<!-- END_PHASE_1 -->

<!-- START_PHASE_2 -->
### Phase 2: `relay` crate (`POST /v1/dids` route)

**Goal:** Implement the full relay-side flow: DB migration, auth helper, route handler, and integration tests with a mocked plc.directory.

**Components:**
- `Cargo.toml` (workspace root): add `reqwest = { version = "0.12", features = ["json"] }`
- `crates/relay/Cargo.toml`: add `reqwest`; add `wiremock = "0.6"` as dev-dep
- `crates/relay/src/db/migrations/V008__did_promotion.sql` (new): rebuild `accounts` (nullable `password_hash`); `ALTER TABLE pending_accounts ADD COLUMN pending_did TEXT`
- `crates/relay/src/db/mod.rs`: add V008 to `MIGRATIONS` array
- `crates/relay/src/main.rs` or `config.rs`: add `plc_directory_url: String` to `Config` (default `"https://plc.directory"`)
- `crates/relay/src/app.rs`: add `http_client: reqwest::Client` to `AppState`; update `test_state()` to include a default client
- `crates/relay/src/routes/auth.rs`: add `require_pending_session` helper and `PendingSessionInfo` struct
- `crates/relay/src/routes/create_did.rs` (new): full route handler with pre-store retry logic
- `crates/relay/src/routes/mod.rs`: add `pub mod create_did`
- `crates/relay/src/app.rs`: register `POST /v1/dids` in the router
- `bruno/create-did.bru` (new): Bruno request for `POST /v1/dids`
- `crates/relay/src/db/CLAUDE.md`: document V008 migration

**Dependencies:** Phase 1 (`crypto::build_did_plc_genesis_op` must be available)

**Done when:** `cargo test -p relay` passes, covering MM-89.AC2.1–AC2.11; `cargo clippy --workspace -- -D warnings` clean; V008 migration applies cleanly to in-memory DB
<!-- END_PHASE_2 -->

## Additional Considerations

**Device-signing (future):** MM-141 tracks the alternative where the iOS Secure Enclave signs the genesis operation instead of the relay. For v0.1, relay-signing matches the Bluesky reference PDS pattern and avoids a two-step protocol. The sovereignty story is unchanged: the user's root rotation key at `rotationKeys[0]` retains highest-priority authority to revoke the relay's key at any time.

**plc.directory availability:** plc.directory is an external service with no SLA guarantees. A 502 response from the route indicates the directory was unreachable or rejected the operation. The client should surface this as a retryable error. The pre-store mechanism ensures retries are safe.

**master key availability:** `config.signing_key_master_key` must be set for this endpoint to work (same requirement as `POST /v1/relay/keys`). Route returns 500 with `INTERNAL_ERROR` if not configured, the same behavior as `create_signing_key.rs`.
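
The status and code pairs from the error table, including the 502 and 500 cases discussed above, could be centralized in one place. The following is a rough sketch, not the workspace's actual `ApiError` / `ErrorCode` types from the `common` crate: `CreateDidFailure` and `status_and_code` are hypothetical names, and only the numeric statuses and code strings come from this plan.

```rust
/// Hypothetical failure categories for `POST /v1/dids`, one per table row.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CreateDidFailure {
    /// Missing, invalid, or expired session token.
    BadSession,
    /// `signingKey` not present in `relay_signing_keys`.
    SigningKeyNotFound,
    /// An `accounts` row already exists for this account.
    AlreadyPromoted,
    /// plc.directory was unreachable or returned non-2xx.
    DirectoryRejected,
    /// DB failure, decrypt failure, key parse error, or unset master key.
    Internal,
}

impl CreateDidFailure {
    /// Map a failure to the HTTP status and error code listed in the plan.
    fn status_and_code(self) -> (u16, &'static str) {
        match self {
            CreateDidFailure::BadSession => (401, "UNAUTHORIZED"),
            CreateDidFailure::SigningKeyNotFound => (404, "NOT_FOUND"),
            CreateDidFailure::AlreadyPromoted => (409, "DID_ALREADY_EXISTS"),
            CreateDidFailure::DirectoryRejected => (502, "PLC_DIRECTORY_ERROR"),
            CreateDidFailure::Internal => (500, "INTERNAL_ERROR"),
        }
    }
}
```

Keeping the mapping in a single exhaustive `match` means a newly added failure category will not compile until it has been assigned a status and code.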