Minimal SQLite key-value store for OCaml
at main 140 lines 4.9 kB view raw view rendered
1# sqlite 2 3Minimal SQLite key-value store for OCaml. 4 5## Overview 6 7A simple key-value store backed by SQLite with support for: 8- Namespaced tables for organizing data 9- WAL mode for concurrent access 10- Efficient batch operations 11- Eio-compatible synchronous API 12 13## Installation 14 15``` 16opam install sqlite 17``` 18 19## Usage 20 21```ocaml 22(* Open or create a database *) 23let db = Sqlite.create (Eio.Path.(fs / "data.db")) in 24 25(* Basic key-value operations *) 26Sqlite.put db "key1" "value1"; 27let value = Sqlite.get db "key1" in (* Some "value1" *) 28 29(* Namespaced tables *) 30let blocks = Sqlite.Table.create db ~name:"blocks" in 31Sqlite.Table.put blocks "cid1" "data1"; 32 33(* Sync to disk *) 34Sqlite.sync db; 35 36(* Close when done *) 37Sqlite.close db 38``` 39 40## API 41 42### Database 43 44- `Sqlite.create path` - Open or create a SQLite database at path 45- `Sqlite.get db key` - Get value for key, or None 46- `Sqlite.put db key value` - Store value at key 47- `Sqlite.delete db key` - Remove key 48- `Sqlite.mem db key` - Check if key exists 49- `Sqlite.iter db ~f` - Iterate over all entries 50- `Sqlite.fold db ~init ~f` - Fold over all entries 51- `Sqlite.sync db` - Flush to disk (WAL checkpoint) 52- `Sqlite.close db` - Close the database 53 54### Namespaced Tables 55 56- `Sqlite.Table.create db ~name` - Create or open a named table 57- `Sqlite.Table.get`, `put`, `delete`, `mem`, `iter` - Same as database operations 58 59## Related Work 60 61- [sqlite3-ocaml](https://github.com/mmottl/sqlite3-ocaml) - Low-level SQLite3 bindings (used internally) 62- [ezsqlite](https://opam.ocaml.org/packages/ezsqlite/) - Alternative SQLite bindings with extensions 63- [irmin](https://github.com/mirage/irmin) - Git-like distributed database (different use case) 64 65## Future: Pure OCaml Implementation 66 67The current implementation uses C bindings via sqlite3-ocaml. A future pure OCaml 68implementation would enable: 69- Unikernel deployment (MirageOS, Solo5) 70- Browser targets via js_of_ocaml 71- Full control over I/O with bytesrw streaming 72- Better debugging and error handling 73 74### Research: Limbo (Rust) 75 76[Limbo](https://github.com/tursodatabase/limbo) is a Rust implementation of SQLite, 77providing a clean reference for pure-language SQLite implementations. 78 79Key design decisions from Limbo: 80- **Async-first**: Built on Rust async/await (we'd use Eio) 81- **Modular pager**: Separates page cache from storage backend 82- **Incremental parsing**: Streams large records without full buffering 83- **WAL-focused**: Prioritizes WAL mode over legacy rollback journal 84 85### SQLite File Format 86 87The [SQLite file format](https://www.sqlite.org/fileformat2.html) is well-documented: 88 89**Database structure:** 90``` 91┌──────────────────────────────────────┐ 92│ Database Header (100 bytes) │ ← Page 1 (first 100 bytes) 93├──────────────────────────────────────┤ 94│ Schema Table (sqlite_master B-tree) │ ← Page 1 (remaining) 95├──────────────────────────────────────┤ 96│ User Tables & Indexes (B-trees) │ ← Pages 2..N 97├──────────────────────────────────────┤ 98│ Freelist (unused pages) │ 99└──────────────────────────────────────┘ 100``` 101 102**B-tree pages:** 103- Interior pages: keys + child page pointers (routing) 104- Leaf pages: keys + record data (storage) 105- Overflow pages: continuation for large records 106 107**Record format:** 108- Header: serial types for each column (varint-encoded) 109- Body: column values in declared order 110 111### Implementation Approach 112 113**Phase 1: Read-only access** 1141. Parse database header (page size, encoding, version) 1152. Read B-tree pages (interior and leaf) 1163. Traverse B-trees to find records 1174. Decode record format (serial types → OCaml values) 118 119**Phase 2: Write support** 1201. B-tree insertion with page splits 1212. Freelist management 1223. WAL mode implementation 1234. Checkpointing 124 125**Phase 3: Eio integration** 1261. bytesrw-based page I/O 1272. Async file operations with Eio.File 1283. LRU page cache with configurable size 129 130### References 131 132- [SQLite File Format](https://www.sqlite.org/fileformat2.html) - Official specification 133- [Limbo](https://github.com/tursodatabase/limbo) - Rust implementation (primary inspiration) 134- [SQLite Database System](https://www.amazon.com/SQLite-Database-System-Design-Implementation/dp/1453861866) - Sibsankar Haldar's design book 135- [Architecture of SQLite](https://www.sqlite.org/arch.html) - Official architecture docs 136- [SQLite Source Code](https://sqlite.org/src/doc/trunk/README.md) - C reference implementation 137 138## Licence 139 140MIT License. See [LICENSE.md](LICENSE.md) for details.