A fork of mtelvers' day10 project

Add comprehensive administrator's guide

Covers:
- Prerequisites and installation
- Directory structure
- Basic and batch usage
- Production setup (systemd, cron, webhooks, nginx)
- Monitoring (logs, summary.json, disk usage)
- Maintenance (cache, GC, epoch transitions)
- Troubleshooting common issues
- Architecture overview

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

+480
docs/ADMIN_GUIDE.md
# day10 Administrator's Guide

This guide covers how to set up and run day10 as a documentation generation system for OCaml packages, intended as a replacement for ocaml-docs-ci.

## Overview

day10 builds OCaml packages and generates documentation using odoc. Key features:

- **Fresh solving**: Always solves against current opam-repository (no stale cross-references)
- **Graceful degradation**: Failed rebuilds preserve existing docs
- **Layer caching**: Fast rebuilds via overlay filesystem caching
- **Parallel processing**: Fork-based parallelism for batch runs

## Prerequisites

### System Requirements

- Linux (Debian/Ubuntu recommended)
- Root access (for runc containers)
- At least 50GB disk space for cache
- 8GB+ RAM recommended

### Dependencies

```bash
# System packages
sudo apt-get update
sudo apt-get install -y \
  build-essential \
  git \
  curl \
  runc \
  opam

# Initialize opam
opam init -y
eval $(opam env)

# Install OCaml and day10 dependencies
opam switch create 5.2.0
opam install -y dune opam-0install yojson cmdliner dockerfile ppx_deriving_yojson
```

### Clone opam-repository

```bash
git clone https://github.com/ocaml/opam-repository /data/opam-repository
```

## Installation

### Build day10

```bash
git clone https://github.com/mtelvers/ohc day10
cd day10
opam install . --deps-only
dune build
dune install
```

Verify installation:

```bash
day10 --version
day10 --help
```

## Directory Structure

Recommended production layout:

```
/data/
├── opam-repository/          # Clone of ocaml/opam-repository
├── cache/                    # Layer cache (can grow large)
│   ├── debian-12-x86_64/
│   │   ├── base/             # Base image layer
│   │   ├── solutions/        # Cached solver results
│   │   ├── build-*/          # Build layers
│   │   └── doc-*/            # Doc layers
│   └── logs/
│       ├── runs/             # Per-run logs and summaries
│       └── latest            # Symlink to most recent run
├── html/                     # Generated documentation
│   ├── p/                    # Blessed package docs
│   │   └── {pkg}/{ver}/
│   └── u/                    # Universe docs (dependencies)
│       └── {hash}/{pkg}/{ver}/
└── packages.json             # Package list for batch runs
```

## Basic Usage

### Single Package

Build and generate docs for one package:

```bash
day10 health-check \
  --cache-dir /data/cache \
  --opam-repository /data/opam-repository \
  --html-output /data/html \
  base.0.16.0
```

### Multiple Packages

Create a JSON file listing the packages, for example `packages.json`:

```json
{"packages": ["base.0.16.0", "core.0.16.0", "async.0.16.0"]}
```

Run batch mode:

```bash
day10 batch \
  --cache-dir /data/cache \
  --opam-repository /data/opam-repository \
  --html-output /data/html \
  --fork 8 \
  @packages.json
```

### All Packages

Generate a list of all packages in opam-repository:

```bash
day10 list \
  --opam-repository /data/opam-repository \
  --all-versions \
  --json /data/all-packages.json
```

Run on everything (this can take hours to days):

```bash
day10 batch \
  --cache-dir /data/cache \
  --opam-repository /data/opam-repository \
  --html-output /data/html \
  --fork 16 \
  @/data/all-packages.json
```

## Command Reference

### day10 batch

Main command for production use.

```
day10 batch [OPTIONS] PACKAGE

PACKAGE: Single package (e.g., "base.0.16.0") or @filename for JSON list

Required:
  --cache-dir DIR          Layer cache directory
  --opam-repository DIR    Path to opam-repository (can specify multiple)

Recommended:
  --html-output DIR        Where to write documentation
  --fork N                 Parallel workers (default: 1)

Optional:
  --ocaml-version VER      Pin OCaml version (default: solver picks)
  --dry-run                Check what would be built without building
  --log                    Print build logs to stdout
  --json DIR               Write per-package JSON results
  --md DIR                 Write per-package markdown results
```

### day10 health-check

Runs on a single package or a small set (simpler than batch for testing):

```
day10 health-check [OPTIONS] PACKAGE
```

### day10 list

List packages in opam-repository:

```
day10 list --opam-repository DIR [--all-versions] [--json FILE]
```

## Production Setup

### Systemd Service

Create `/etc/systemd/system/day10.service`:

```ini
[Unit]
Description=day10 documentation generator
After=network.target

[Service]
Type=oneshot
User=root
WorkingDirectory=/data
ExecStart=/usr/local/bin/day10 batch \
    --cache-dir /data/cache \
    --opam-repository /data/opam-repository \
    --html-output /data/html \
    --fork 8 \
    @/data/packages.json
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
```

### Cron Job

For periodic rebuilds (e.g., daily at 2 AM):

```bash
# /etc/cron.d/day10
0 2 * * * root flock -n /var/run/day10.lock /usr/local/bin/day10 batch --cache-dir /data/cache --opam-repository /data/opam-repository --html-output /data/html --fork 8 @/data/packages.json >> /var/log/day10-cron.log 2>&1
```

### Webhook Trigger

To rebuild on opam-repository updates, set up a webhook endpoint that:

1. Pulls latest opam-repository
2. Triggers day10 batch run

Example script `/usr/local/bin/day10-trigger.sh`:

```bash
#!/bin/bash
set -e

cd /data/opam-repository
git fetch origin
git reset --hard origin/master

flock -n /var/run/day10.lock \
  day10 batch \
    --cache-dir /data/cache \
    --opam-repository /data/opam-repository \
    --html-output /data/html \
    --fork 8 \
    @/data/packages.json
```

### Serving Documentation

Use nginx to serve the HTML output:

```nginx
server {
    listen 80;
    server_name docs.example.com;
    root /data/html;

    location / {
        autoindex on;
        try_files $uri $uri/ =404;
    }
}
```

## Monitoring

### Run Logs

Each batch run creates a timestamped directory:

```
/data/cache/logs/runs/2026-02-04-120000/
├── summary.json              # Run statistics
├── build/                    # Build logs by package
│   ├── base.0.16.0.log
│   └── core.0.16.0.log
└── docs/                     # Doc generation logs
    ├── base.0.16.0.log
    └── core.0.16.0.log
```

The `latest` symlink always points to the most recent run:

```bash
cat /data/cache/logs/latest/summary.json
```

### summary.json Format

```json
{
  "run_id": "2026-02-04-120000",
  "start_time": "2026-02-04T12:00:00",
  "end_time": "2026-02-04T14:30:00",
  "duration_seconds": 9000,
  "targets_requested": 100,
  "solutions_found": 95,
  "build_success": 90,
  "build_failed": 5,
  "doc_success": 85,
  "doc_failed": 3,
  "doc_skipped": 2,
  "failures": [
    {"package": "broken-pkg.1.0.0", "error": "build exit code 2"},
    {"package": "bad-docs.2.0.0", "error": "doc: odoc error"}
  ]
}
```

### Checking Status

```bash
# Quick status
jq '.build_success, .build_failed, .doc_success, .doc_failed' \
  /data/cache/logs/latest/summary.json

# List failures
jq -r '.failures[] | "\(.package): \(.error)"' \
  /data/cache/logs/latest/summary.json

# Duration
jq '.duration_seconds / 60 | floor | "\(.)m"' \
  /data/cache/logs/latest/summary.json
```

### Disk Usage

Monitor cache growth:

```bash
du -sh /data/cache/debian-12-x86_64/
du -sh /data/html/
```

## Maintenance

### Cache Management

The cache grows over time. After each batch run, garbage collection automatically:

1. **Layer GC**: Deletes build/doc layers not referenced by current solutions
2. **Universe GC**: Deletes universe directories not referenced by any blessed package

GC runs automatically at the end of each batch. Special layers are preserved:
- `base` - Base OS image
- `solutions` - Solver cache
- `doc-driver-*` - Shared odoc driver
- `doc-odoc-*` - Per-OCaml-version odoc

### Manual Cache Cleanup

To force a complete rebuild:

```bash
# Remove all layers (keeps base)
rm -rf /data/cache/debian-12-x86_64/build-*
rm -rf /data/cache/debian-12-x86_64/doc-*

# Remove solution cache (forces re-solving)
rm -rf /data/cache/debian-12-x86_64/solutions/
```

### Updating opam-repository

```bash
cd /data/opam-repository
git fetch origin
git reset --hard origin/master
```

Solutions are cached by opam-repository commit hash, so updating automatically invalidates old solutions.

### Epoch Transitions

For major changes (new odoc version, URL scheme change), you may want a clean rebuild:

1. Create new html directory: `/data/html-new/`
2. Run full batch with `--html-output /data/html-new/`
3. Once complete, swap the directories: `mv /data/html /data/html-old && mv /data/html-new /data/html`
4. Remove old: `rm -rf /data/html-old`

## Troubleshooting

### Build Failures

Check the build log:

```bash
cat /data/cache/logs/latest/build/failing-pkg.1.0.0.log
```

Or check the layer directly:

```bash
cat /data/cache/debian-12-x86_64/build-*/build.log
```

### Doc Generation Failures

```bash
cat /data/cache/logs/latest/docs/failing-pkg.1.0.0.log
```

Common issues:
- Missing `.cmti` files (package doesn't install them)
- odoc bugs with certain code patterns
- Memory exhaustion on large packages

### Stale .new/.old Directories

If a run was interrupted, stale staging directories may exist:

```bash
find /data/html -name "*.new" -o -name "*.old"
```

These are automatically cleaned up at the start of each batch run.

### Permission Issues

day10 uses runc containers which require root. If you see permission errors:

```bash
# Check runc works
sudo runc --version

# Ensure cache directory is accessible
sudo chown -R root:root /data/cache
```

### Memory Issues

For large package sets, you may need to limit parallelism:

```bash
# Reduce fork count
day10 batch --fork 4 ...
```

Or increase system memory/swap.

## Architecture Notes

### How Layers Work

Each package build creates a layer using overlay filesystem:

```
build-{hash}/
├── fs/           # Filesystem overlay (installed files)
├── build.log     # Build output
└── layer.json    # Metadata (package, deps, status)
```

The hash is computed from the package and its dependencies, so unchanged packages reuse existing layers.
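
The guide does not spell out how the layer hash is derived, so the following is only a rough illustration of content-addressed naming, not day10's actual scheme. It assumes `sha256sum` from coreutils is available; the function name `layer_key`, the 12-character truncation, and the dependency versions are all hypothetical.

```bash
#!/bin/bash
# Hypothetical sketch: derive a layer name from a package and its dependencies.
# This is NOT day10's real algorithm; it only shows why identical inputs
# map to the same layer directory and changed inputs map to a new one.
layer_key() {
  local pkg="$1"; shift
  # Sort the dependencies so the key does not depend on argument order.
  local deps
  deps=$(printf '%s\n' "$@" | sort)
  printf '%s\n%s\n' "$pkg" "$deps" | sha256sum | cut -c1-12
}

echo "build-$(layer_key base.0.16.0 dune.3.16.0 sexplib0.0.16.0)"
echo "build-$(layer_key base.0.16.0 sexplib0.0.16.0 dune.3.16.0)"   # same key as above
echo "build-$(layer_key base.0.16.0 dune.3.17.0 sexplib0.0.16.0)"   # different key
```

The first two calls produce the same name because only the set of dependencies matters, while bumping one dependency version yields a new name and therefore a fresh layer.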

### Blessing

In batch mode, day10 computes "blessings", that is, which package version is canonical for each package name. Blessed packages go to `html/p/`, non-blessed go to `html/u/{universe}/`.

### Graceful Degradation

Each doc generation run is staged so that a failure cannot clobber existing output:
1. New docs are written to a staging directory
2. On success: atomically swap staging → final
3. On failure: staging is discarded, old docs remain

This ensures the live site never shows broken docs.

## Getting Help

- Check logs in `/data/cache/logs/latest/`
- Review `summary.json` for failure details
- File issues at: https://github.com/mtelvers/ohc/issues
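
For readers who want a feel for the staging behaviour described under Graceful Degradation, here is a minimal sketch of a stage-then-swap publish step. It is an assumption-laden illustration, not day10's implementation: the function `publish_docs`, the generator callbacks, and the use of `.new`/`.old` suffixes on a single directory are hypothetical, and the two renames are quick but not strictly atomic.

```bash
#!/bin/bash
# Hypothetical sketch of a staging-directory swap; the real logic lives in day10.
# publish_docs FINAL_DIR COMMAND...: runs COMMAND with the staging directory
# appended as its last argument and only replaces FINAL_DIR if COMMAND succeeds.
publish_docs() {
  local final="$1"; shift
  local staging="${final}.new" old="${final}.old"

  rm -rf "$staging" "$old"          # clear leftovers from an interrupted run
  mkdir -p "$staging"

  if "$@" "$staging"; then
    # Success: move the previous docs aside, promote staging, drop the old copy.
    [ -d "$final" ] && mv "$final" "$old"
    mv "$staging" "$final"
    rm -rf "$old"
  else
    # Failure: discard staging; the previous docs are left untouched.
    rm -rf "$staging"
    return 1
  fi
}

# Demo with stand-in generators (hypothetical; a real run would invoke odoc).
demo=$(mktemp -d)
gen_ok()   { echo "v1" > "$1/index.html"; }
gen_fail() { echo "broken" > "$1/index.html"; return 1; }

publish_docs "$demo/pkg" gen_ok            # first publish succeeds
publish_docs "$demo/pkg" gen_fail || true  # failed rebuild keeps the old docs
cat "$demo/pkg/index.html"                 # prints "v1"
```

The failed second run never touches `$demo/pkg`, which is the property the guide relies on to keep the live site intact.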