An easy-to-host PDS on the ATProtocol, MacOS. Grandma-approved.

docs: add MM-72 SQLite migration infrastructure design plan

Completed brainstorming session. Design includes:
- Custom forward-only migration runner with schema_migrations tracking
- WAL-mode SqlitePool (max 1 connection) via SqliteConnectOptions
- Flat db/mod.rs with include_str!-embedded SQL migrations
- AppState extended with SqlitePool; open_pool accepts plain URL for future per-user DB reuse
- 3 implementation phases: sqlx dep, db module + V001 schema, AppState integration

authored by malpercio.dev and committed by

Tangled 0bbe8742 896e8062

+208
docs/design-plans/2026-03-10-MM-72.md
# SQLite Migration Infrastructure + Wave 1 Schema Design

## Summary

This ticket introduces SQLite as the relay's persistent storage layer. Currently the relay starts up, loads configuration, and exits without persisting any state. MM-72 adds a `db/` module inside the `relay` crate that opens a WAL-mode SQLite connection pool at startup, runs a forward-only migration system to bring the schema up to date, and makes the pool available to every request handler through `AppState`. After the first start the database contains two tables: `schema_migrations` (created and owned by the runner to track which migrations have been applied) and `server_metadata` (created by the first migration, `V001__init.sql`, as a general-purpose key/value store for future server-level configuration).

The migration runner is hand-written rather than delegated to sqlx's built-in migrator, keeping the behaviour explicit and auditable: on each startup it reads which versions are already recorded in `schema_migrations`, skips those, and applies any remaining migrations in a single transaction. The pool and runner are designed with forward compatibility in mind — `open_pool` accepts a plain URL string so the same function can later open per-user SQLite databases, and `run_migrations` accepts any pool so a separate migration list can be passed for the per-user schema without changing the runner itself.

## Definition of Done

- `crates/relay/src/db/` module added with a generic pool factory (`open_pool(url: &str) -> Result<SqlitePool, DbError>`, WAL mode, max 1 connection) and a custom forward-only migration runner that tracks applied migrations in `schema_migrations`
- `V001__init.sql` creates `server_metadata` (key/value store); the runner itself creates `schema_migrations`; both steps are idempotent on repeated starts
- `main.rs` passes `config.database_url` into the pool factory; resulting pool stored in `AppState`
- `sqlx` (`runtime-tokio` + `sqlite`) added to workspace deps; pkg-config/bundled handled by env var, no Cargo feature flags needed
- `cargo test`, `cargo clippy`, `cargo fmt` all pass; unit tests for migration runner use in-memory SQLite (`":memory:"`)

## Acceptance Criteria

### MM-72.AC1: relay.db created on first start
- **MM-72.AC1.1 Success:** `cargo run --bin relay` (with a valid `relay.toml`) creates `relay.db` in the configured `data_dir`
- **MM-72.AC1.2 Success:** `schema_migrations` table exists in the produced database
- **MM-72.AC1.3 Success:** `server_metadata` table exists in the produced database

### MM-72.AC2: Migrations are idempotent
- **MM-72.AC2.1 Success:** Running the relay a second time does not re-apply V001 — row count in `schema_migrations` remains 1
- **MM-72.AC2.2 Success:** `schema_migrations` records `version = 1` with a non-null `applied_at` timestamp after first run

### MM-72.AC3: Pool available in AppState
- **MM-72.AC3.1 Success:** Handler tests that extract `State<AppState>` compile and pass with the `db: SqlitePool` field present
- **MM-72.AC3.2 Success:** `sqlx::query("SELECT 1").execute(&state.db)` succeeds in tests using an in-memory pool

### MM-72.AC4: WAL mode enabled
- **MM-72.AC4.1 Success:** `PRAGMA journal_mode` queried on the pool returns `wal`

### MM-72.AC5: Unit tests use in-memory SQLite
- **MM-72.AC5.1 Success:** Migration runner unit tests use `":memory:"` — no `relay.db` or temp files created on disk during `cargo test`
- **MM-72.AC5.2 Success:** `cargo test --workspace` passes in a clean environment with no pre-existing `relay.db`

### MM-72.AC6: Toolchain checks pass
- **MM-72.AC6.1 Success:** `cargo clippy --workspace -- -D warnings` passes with no warnings
- **MM-72.AC6.2 Success:** `cargo fmt --all --check` passes

## Glossary

- **WAL mode**: Write-Ahead Logging — a SQLite journal mode that writes changes to a separate log file before applying them to the main database file. Compared to the default rollback journal, WAL allows concurrent readers while a write is in progress, which is important for an async server where multiple requests may access the database simultaneously.
- **SqlitePool**: A connection pool provided by the `sqlx` crate. Pools manage a fixed set of reusable database connections and hand them out to concurrent callers. `SqlitePool` is internally `Arc`-backed, meaning it is cheap to clone and does not need an extra wrapper to be shared across threads.
- **`include_str!`**: A Rust built-in macro that reads a file at compile time and embeds its contents as a `&'static str`. Used here to embed `.sql` migration files into the binary so that they are always present and the compiler enforces that the referenced file exists.
- **`WITHOUT ROWID`**: A SQLite table option that omits the implicit integer primary key (rowid) that SQLite normally adds to every table. Appropriate when the declared primary key is the only access path, avoiding a redundant index.
- **Forward-only migration runner**: A schema migration strategy that only ever applies new migrations and never rolls back applied ones. Each migration is identified by a version number; the runner records applied versions in a tracking table and skips any that are already present.
- **`schema_migrations`**: The bookkeeping table created by the migration runner itself to record which migration versions have been applied and when. It is not defined in any `.sql` file — the runner creates it with `CREATE TABLE IF NOT EXISTS` before executing any migrations.
- **`server_metadata`**: A key/value table created by `V001__init.sql` intended to hold server-level configuration and state that needs to survive restarts (e.g., server DID, signing keys, or feature flags added in later waves).
- **`AppState`**: The shared state struct passed to every Axum request handler via Axum's `State` extractor. It holds values that need to be accessible across requests — currently `Arc<Config>`, and after this ticket, also the `SqlitePool`.
- **Functional Core / Imperative Shell**: An architectural pattern where pure, side-effect-free logic (the functional core) is kept separate from code that performs I/O (the imperative shell). In this codebase, `main.rs` is the imperative shell; `db/mod.rs` also lives there because opening pools and executing SQL are I/O operations.
- **`thiserror`**: A Rust crate that provides a derive macro for implementing `std::error::Error` on custom error enums with less boilerplate. Used here to define `DbError` with human-readable messages.
- **`anyhow::Context`**: An extension trait from the `anyhow` crate that adds the `.with_context(|| "message")` method to `Result`, attaching a human-readable explanation when propagating errors up the call stack.
- **`LIBSQLITE3_SYS_USE_PKG_CONFIG=1`**: An environment variable that tells `libsqlite3-sys` (sqlx's SQLite backend) to link against a system-provided SQLite library (located via `pkg-config`) instead of compiling a bundled copy. Set automatically by the devenv shell; absent in CI, which falls back to the bundled copy.
- **Wave 1 / Wave 3/4**: Informal milestone labels used in this project to sequence feature work. Wave 1 refers to the foundational server-level schema established in this ticket; Waves 3 and 4 refer to future work introducing per-user SQLite databases following the Bluesky reference PDS pattern.
- **Bluesky reference PDS**: The reference implementation of an AT Protocol Personal Data Server published by Bluesky. It is used here as a precedent for the per-user database layout (`{dataDir}/{didHash[0:2]}/{did}/sqlite`).
- **`pkg-config`**: A command-line tool that reports the compiler and linker flags needed to use a system library. Used here by `libsqlite3-sys` when `LIBSQLITE3_SYS_USE_PKG_CONFIG=1` is set so it can find the Nix-provided SQLite headers and shared library.

## Architecture

SQLite access is added as a single module, `crates/relay/src/db/`, inside the `relay` crate. Only `relay` ever touches the database; `repo-engine`, `crypto`, and `common` remain pure functional cores.

```
crates/relay/src/
  main.rs               ← imperative shell: calls open_pool + run_migrations at startup
  app.rs                ← AppState gains db: SqlitePool field
  db/
    mod.rs              ← DbError, open_pool, run_migrations, MIGRATIONS list
    migrations/
      V001__init.sql    ← server_metadata table (schema_migrations is created by the runner)
```

**Public API (`db/mod.rs`):**

```rust
#[derive(Debug, thiserror::Error)]
pub enum DbError {
    #[error("failed to open database pool: {0}")]
    Pool(#[from] sqlx::Error),
    #[error("migration v{version} failed: {source}")]
    Migration { version: u32, source: sqlx::Error },
}

/// Open a WAL-mode SQLite pool (max 1 connection).
/// Accepts any URL string — not coupled to Config — so it can be
/// reused for per-user databases in Wave 3/4.
pub async fn open_pool(url: &str) -> Result<SqlitePool, DbError>;

/// Apply any pending migrations from MIGRATIONS to the given pool.
/// Creates schema_migrations if it does not exist, then records each
/// applied version. All pending migrations run in a single transaction.
pub async fn run_migrations(pool: &SqlitePool) -> Result<(), DbError>;
```

**Migration registration:**

```rust
struct Migration { version: u32, sql: &'static str }

static MIGRATIONS: &[Migration] = &[
    Migration { version: 1, sql: include_str!("migrations/V001__init.sql") },
];
```

SQL files are embedded at compile time via `include_str!`. Adding a future migration requires both a new SQL file and a new entry in MIGRATIONS — the compiler enforces that the file exists.

**Pool configuration:**

Pool is created with `SqlitePoolOptions::new().max_connections(1).connect_with(...)`. WAL mode is set via `SqliteConnectOptions::journal_mode(SqliteJournalMode::Wal)` — not a raw `PRAGMA` — so sqlx tracks the mode across the connection lifecycle. `SqlitePool` is Arc-backed internally; no additional wrapping is needed in `AppState`.

**Startup sequence (`main.rs`):**

```
1. init tracing
2. load + validate Config
3. open_pool(&config.database_url)
4. run_migrations(&pool)
5. AppState { config: Arc::new(config), db: pool }
6. bind TcpListener
7. axum::serve(...)
```

Steps 3–4 run before the listener binds. If either fails, `run()` returns an `anyhow::Error` with context and the process exits non-zero — consistent with the existing config-load failure path.

**Wave 1 schema (`V001__init.sql`):**

`schema_migrations` is created by the runner before any migration SQL executes. `V001__init.sql` therefore only needs to create `server_metadata`:

```sql
CREATE TABLE server_metadata (
    key   TEXT NOT NULL,
    value TEXT NOT NULL,
    PRIMARY KEY (key)
) WITHOUT ROWID;
```

`schema_migrations` (created by the runner):

```sql
CREATE TABLE IF NOT EXISTS schema_migrations (
    version    INTEGER PRIMARY KEY,
    applied_at TEXT NOT NULL
) WITHOUT ROWID;
```

Both tables use `WITHOUT ROWID` — their primary keys are the only lookup path, so a separate implicit rowid column adds no value.

## Existing Patterns

This design follows the Functional Core / Imperative Shell split established in `main.rs`:

- `db/mod.rs` is I/O — opening pools and executing SQL — and lives in the imperative shell alongside `main.rs`
- `run_migrations` and `open_pool` are called from `main.rs` before any handler runs, consistent with how `common::load_config` is called before `AppState` is built

Error handling follows the pattern established in `crates/common/src/config.rs`: a typed `thiserror` enum (`ConfigError`) for domain errors, with `anyhow::Context` used only at the `main.rs` call site. `DbError` mirrors this structure.

`AppState` in `app.rs` already holds `Arc<Config>`. Adding `db: SqlitePool` follows the same principle — cheap to clone, no extra `Arc` wrapping needed since `SqlitePool` is already internally `Arc`-backed.

## Implementation Phases

<!-- START_PHASE_1 -->
### Phase 1: Add sqlx workspace dependency
**Goal:** Introduce `sqlx` into the workspace and verify the build is clean.

**Components:**
- `Cargo.toml` (workspace root) — add `sqlx = { version = "0.8", features = ["runtime-tokio", "sqlite"] }` to `[workspace.dependencies]`
- `crates/relay/Cargo.toml` — opt in with `sqlx = { workspace = true }`

**Dependencies:** Branch must be rebased onto `main` (MM-71) before this phase — `main.rs` must be the async version with `AppState` already defined.

**Done when:** `cargo build --workspace` succeeds with sqlx in the dependency tree; `cargo clippy --workspace -- -D warnings` passes
<!-- END_PHASE_1 -->

<!-- START_PHASE_2 -->
### Phase 2: DB module — pool factory, migration runner, Wave 1 schema
**Goal:** Implement the full `db/` module with pool creation, forward-only migration runner, and `V001__init.sql`.

**Components:**
- `crates/relay/src/db/mod.rs` — `DbError`, `Migration` type, `MIGRATIONS` static, `open_pool`, `run_migrations`, and `#[cfg(test)]` unit tests
- `crates/relay/src/db/migrations/V001__init.sql` — `server_metadata` table

**Dependencies:** Phase 1

**Done when:** `cargo test --workspace` passes, including unit tests covering: migrations apply on first run, migrations are idempotent on second run, `schema_migrations` records the applied version, `server_metadata` table exists and accepts key-value inserts; `cargo clippy` passes
<!-- END_PHASE_2 -->

<!-- START_PHASE_3 -->
### Phase 3: AppState integration and startup wiring
**Goal:** Wire the pool into `AppState` and call `open_pool` + `run_migrations` from `main.rs`; update the `app.rs` test fixture.

**Components:**
- `crates/relay/src/app.rs` — add `pub db: SqlitePool` to `AppState`; update `test_state()` to open an in-memory pool with migrations applied
- `crates/relay/src/main.rs` — call `db::open_pool` and `db::run_migrations` between config load and `AppState` construction, using `.with_context(...)` for error messages

**Dependencies:** Phase 2

**Done when:** `cargo run --bin relay` (with a valid `relay.toml`) creates `relay.db` in `data_dir` on first start; running the binary a second time does not re-apply migrations; `schema_migrations` and `server_metadata` tables exist in the produced database; all existing `app.rs` handler tests still pass; `cargo clippy` and `cargo fmt --all --check` pass
<!-- END_PHASE_3 -->

## Additional Considerations

**`max_connections(1)` and async:** A pool of one connection serializes all database access. This is safe as long as no task holds the only connection (for example, inside an open transaction) while awaiting a second acquisition from the same pool; that would deadlock, because the pool has nothing left to hand out. For v0.1 with simple key-value reads this is not a concern. If contention becomes an issue, increase `max_connections` to a small value (e.g., 5); WAL mode will handle concurrent readers.

**Per-user databases (Wave 3/4):** `open_pool(url: &str)` takes a URL string rather than reading from `Config` directly. When per-user SQLite files are introduced (following the Bluesky reference PDS pattern of `{dataDir}/{didHash[0:2]}/{did}/sqlite`), the same function can open each user's pool. The migration runner accepts any `&SqlitePool`, so a separate `USER_MIGRATIONS` slice can be passed for the user schema without changing the runner.

**`LIBSQLITE3_SYS_USE_PKG_CONFIG=1`:** The devenv shell sets this env var automatically. When set, `libsqlite3-sys` (sqlx's SQLite backend) links against the Nix-provided system SQLite instead of compiling a bundled copy. In CI and Docker environments where this var is absent, `libsqlite3-sys` falls back to its bundled SQLite. No Cargo feature flags are needed to control this — the env var is the only switch.
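
**Pending-migration selection (sketch):** The forward-only rule in this plan (skip versions already recorded in `schema_migrations`, apply the rest in order) can be illustrated in isolation as a pure function. This is only a sketch under stated assumptions: `pending_migrations` is a hypothetical helper that this plan does not name, the version 2 entry is invented purely to exercise the logic, and the SQL is inlined rather than embedded with `include_str!` so the snippet stands alone. The real runner would presumably load the applied set from `schema_migrations` through sqlx and execute the pending SQL inside one transaction.

```rust
use std::collections::HashSet;

/// One registered migration: a version number plus its SQL text.
/// Mirrors the `Migration` struct in this plan; the SQL is inlined here
/// (instead of `include_str!`) so the sketch is self-contained.
#[derive(Debug)]
struct Migration {
    version: u32,
    sql: &'static str,
}

static MIGRATIONS: &[Migration] = &[
    Migration {
        version: 1,
        sql: "CREATE TABLE server_metadata (key TEXT NOT NULL, value TEXT NOT NULL, PRIMARY KEY (key)) WITHOUT ROWID;",
    },
    // Hypothetical second migration, present only to exercise the logic.
    Migration {
        version: 2,
        sql: "INSERT INTO server_metadata (key, value) VALUES ('example', '1');",
    },
];

/// Hypothetical helper: return the migrations whose versions are not in
/// `applied`, sorted by ascending version so they run in order.
fn pending_migrations<'a>(
    all: &'a [Migration],
    applied: &HashSet<u32>,
) -> Vec<&'a Migration> {
    let mut pending: Vec<&Migration> = all
        .iter()
        .filter(|m| !applied.contains(&m.version))
        .collect();
    pending.sort_by_key(|m| m.version);
    pending
}
```

Keeping the selection step free of I/O means it can be unit-tested without any database, while the surrounding runner stays in the imperative shell.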