Testing Plan for day10#

Date: 2026-02-03
Status: Proposed
Author: Brainstorming session

Overview#

This document describes the comprehensive testing strategy for day10, covering correctness, reliability, and compatibility. The strategy uses a tiered approach: fast tests on every commit (~2 minutes) and a full suite nightly/on-demand (~30-60 minutes).

Design Principles#

Real-world testing - Container-based tests that exercise the actual build/docs pipeline
Fast feedback - Most regressions caught in under 2 minutes
Comprehensive coverage - Nightly runs verify edge cases and real packages
Controlled fixtures - Purpose-built mini repositories for specific scenarios
Fault tolerance verification - Explicit testing of failure modes

Test Architecture#

Two-Tier Strategy#

Tier	Purpose	Runtime	Trigger
Fast	Catch most regressions	~2 min	Every commit
Full	Comprehensive coverage	~30-60 min	Nightly, on-demand

Test Types#

tests/
├── unit/                    # Pure OCaml logic tests
│   ├── solver_test.ml
│   ├── atomic_swap_test.ml
│   └── notifier_test.ml
│
├── integration/
│   └── mini_repo/           # Mini opam-repo fixtures
│       ├── simple_build/
│       ├── dep_chain/
│       └── doc_failure/
│
└── full/
    ├── real_snapshots/      # Real opam-repo snapshots
    ├── fault_injection/     # Infrastructure failure tests
    └── large_scale/         # 50+ package tests

Custom Test Harness#

Rather than alcotest or cram tests, we use a custom harness that:

Spins up real containers with day10
Runs against controlled opam-repository fixtures
Verifies outputs match expectations
Provides clear failure diagnostics

(* test_harness.mli *)
type test_result =
  | Pass
  | Fail of string
  | Skip of string

type test = {
  name : string;
  run : unit -> test_result;
}

val run_tests : test list -> unit
(** Runs tests, prints results, exits with appropriate code *)
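A minimal sketch of how this interface could be implemented is shown below. The helpers run_one and summarize are hypothetical names introduced here for illustration; the only behaviour taken from the interface above is that run_tests prints results and exits with a code reflecting failures. Treating an uncaught exception as a failure is an assumption, not a documented decision.

```ocaml
type test_result =
  | Pass
  | Fail of string
  | Skip of string

type test = {
  name : string;
  run : unit -> test_result;
}

(* Run one test, turning an escaped exception into a [Fail]
   (assumption: a crashing test should not abort the whole run). *)
let run_one t =
  match t.run () with
  | result -> result
  | exception exn -> Fail ("uncaught exception: " ^ Printexc.to_string exn)

(* Run every test in order, print one line per test, and return the
   number of (passed, failed, skipped) tests. *)
let summarize tests =
  List.fold_left
    (fun (passed, failed, skipped) t ->
      match run_one t with
      | Pass ->
          Printf.printf "PASS %s\n" t.name;
          (passed + 1, failed, skipped)
      | Fail reason ->
          Printf.printf "FAIL %s: %s\n" t.name reason;
          (passed, failed + 1, skipped)
      | Skip reason ->
          Printf.printf "SKIP %s: %s\n" t.name reason;
          (passed, failed, skipped + 1))
    (0, 0, 0) tests

(* Print a summary and exit with 0 only when nothing failed. *)
let run_tests tests =
  let passed, failed, skipped = summarize tests in
  Printf.printf "%d passed, %d failed, %d skipped\n" passed failed skipped;
  exit (if failed = 0 then 0 else 1)
```

Keeping the counting logic in summarize, separate from the call to exit, makes the harness itself testable from a unit test.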

Mini Opam-Repository Fixtures#

Purpose-built test packages for specific scenarios, stored in tests/fixtures/mini-repos/.

Fixture Structure#

tests/fixtures/mini-repos/
├── simple-success/
│   ├── opam-repository/
│   │   └── packages/
│   │       └── test-pkg/
│   │           └── test-pkg.1.0.0/
│   │               └── opam
│   ├── packages.txt
│   └── expected.json
│
├── dependency-chain/
│   └── ...  (A → B → C)
│
├── doc-failure/
│   └── ...  (build succeeds, docs fail)
│
├── build-failure/
│   └── ...  (build fails)
│
├── partial-universe/
│   └── ...  (some deps fail, others succeed)
│
└── dependency-update/
    ├── opam-repository-v1/   # pkg-a.1.0.0 depends on pkg-b (>= 1.0); only pkg-b.1.0.0 exists
    ├── opam-repository-v2/   # pkg-b.2.0.0 added; pkg-a.1.0.0 now resolves to it
    ├── packages.txt
    └── expected.json

Example Fixture: simple-success#

# packages/test-pkg/test-pkg.1.0.0/opam
opam-version: "2.0"
build: ["dune" "build" "-p" name]
depends: ["ocaml" "dune"]

(* src/lib.ml *)
let hello () = "Hello from test-pkg"

// expected.json
{
  "packages": [
    {
      "name": "test-pkg",
      "version": "1.0.0",
      "build": "success",
      "docs": "success",
      "files": ["index.html", "Test_pkg/index.html"]
    }
  ]
}

Example Fixture: dependency-update#

This fixture tests the core "fresh solving" principle - that day10 picks up dependency changes rather than caching solutions forever.

Structure:

dependency-update/
├── opam-repository-v1/
│   └── packages/
│       ├── pkg-a/pkg-a.1.0.0/opam    # depends: ["pkg-b" {>= "1.0"}]
│       └── pkg-b/pkg-b.1.0.0/opam
├── opam-repository-v2/
│   └── packages/
│       ├── pkg-a/pkg-a.1.0.0/opam    # same as v1
│       └── pkg-b/
│           ├── pkg-b.1.0.0/opam
│           └── pkg-b.2.0.0/opam      # new version added
├── packages.txt                       # pkg-a.1.0.0
└── expected.json

Test sequence:

let test_dependency_update () =
  let output_dir = Temp.create () in
  let pkg_a_index () = read_file (output_dir / "p/pkg-a/1.0.0/index.html") in

  (* Run 1: against the v1 repository *)
  Day10.batch
    ~opam_repository:(fixture_dir / "opam-repository-v1")
    ~packages:["pkg-a.1.0.0"]
    ~output_dir;

  (* pkg-a docs should link to pkg-b.1.0.0. Report a Fail rather than
     asserting, so the harness prints a diagnostic instead of crashing. *)
  if not (String.is_substring (pkg_a_index ()) ~substring:"pkg-b/1.0.0") then
    Fail "pkg-a docs do not reference pkg-b.1.0.0 after the v1 run"
  else begin
    (* Run 2: against the v2 repository (pkg-b.2.0.0 now available),
       reusing the same output directory *)
    Day10.batch
      ~opam_repository:(fixture_dir / "opam-repository-v2")
      ~packages:["pkg-a.1.0.0"]
      ~output_dir;

    (* pkg-a docs should now link to pkg-b.2.0.0 *)
    if String.is_substring (pkg_a_index ()) ~substring:"pkg-b/2.0.0" then Pass
    else Fail "pkg-a docs still reference old pkg-b version"
  end

This is a key differentiator from ocaml-docs-ci, which would cache the original solution and never pick up pkg-b.2.0.0.
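The test snippets in this document rely on a few small helpers that are not defined here: a `/` path operator, read_file, and a substring check (String.is_substring looks like the Base/Core function). The sketch below shows one way they could be written with only the standard library. The definitions are assumptions for illustration, not the project's actual helpers, and is_substring here is a free-standing stand-in rather than a member of String.

```ocaml
(* Path concatenation, written [dir / "file"] in the test snippets.
   Note: this shadows integer division in the scope where it is defined. *)
let ( / ) = Filename.concat

(* Read a whole file into a string (In_channel needs OCaml 4.14 or later). *)
let read_file path = In_channel.with_open_bin path In_channel.input_all

(* [is_substring s ~substring] is true when [substring] occurs anywhere in [s]. *)
let is_substring s ~substring =
  let n = String.length s and m = String.length substring in
  let rec at i = i + m <= n && (String.sub s i m = substring || at (i + 1)) in
  at 0
```

Because the operator shadows integer division, it is probably best confined to a test-support module that is opened locally.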

Scenario Coverage#

Fixture	Tests
`simple-success`	Basic build and doc generation works
`dependency-chain`	A→B→C builds in correct order
`doc-failure`	Build succeeds, docs fail, old docs preserved
`build-failure`	Build fails, dependents skipped
`partial-universe`	Some packages fail, others succeed independently
`atomic-swap`	Verify .new/.old directory handling
`recovery`	Interrupted swap recovery on restart
`dependency-update`	Re-solve picks up new dependency versions

Fault Injection#

Testing infrastructure failure handling via container resource limits.

Fault Types#

type fault =
  | OOM_limit of int          (* bytes - container memory limit *)
  | Disk_limit of int         (* bytes - tmpfs size *)
  | Timeout of int            (* seconds - build timeout *)
  | Build_script_fail         (* inject failing build script *)
  | Odoc_fail                 (* inject failing odoc *)

val with_fault : fault -> (unit -> 'a) -> 'a

Implementation Strategy#

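OOM and disk limits: use the container runtime's own limits. Memory is capped with the cgroup-backed --memory flag; disk space is capped by mounting a size-limited tmpfs at the build directory. Because tmpfs pages are charged to the container's memory cgroup, any memory limit must be larger than the tmpfs size, otherwise a disk-full test can be reported as an OOM: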

let with_resource_limits ~memory_mb ~disk_mb f =
  (* The container runtime already supports a --memory flag *)
  (* A size-limited tmpfs mount stands in for a small disk *)
  (* Note: the fault constructors carry bytes; convert to MB before calling *)
  let container_args = [
    "--memory"; Printf.sprintf "%dM" memory_mb;
    "--mount"; Printf.sprintf "type=tmpfs,destination=/build,tmpfs-size=%dM" disk_mb;
  ] in
  run_container ~extra_args:container_args f
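The fault constructors carry byte counts, while the container flags above are written in megabytes, so some conversion has to sit between them. The following sketch shows one possible mapping from a fault to extra container arguments. The names mib_of_bytes and container_args_of_fault are hypothetical, and rounding up to whole mebibytes is an assumption made so that a limit is never silently rounded down to zero.

```ocaml
type fault =
  | OOM_limit of int
  | Disk_limit of int
  | Timeout of int
  | Build_script_fail
  | Odoc_fail

(* Round a byte count up to whole mebibytes, with a floor of 1. *)
let mib_of_bytes bytes = max 1 ((bytes + (1024 * 1024) - 1) / (1024 * 1024))

(* Extra container arguments for a fault. Timeouts and injected scripts
   are assumed to be handled elsewhere, so they add no arguments here. *)
let container_args_of_fault = function
  | OOM_limit bytes ->
      [ "--memory"; Printf.sprintf "%dM" (mib_of_bytes bytes) ]
  | Disk_limit bytes ->
      [ "--mount";
        Printf.sprintf "type=tmpfs,destination=/build,tmpfs-size=%dM"
          (mib_of_bytes bytes) ]
  | Timeout _ | Build_script_fail | Odoc_fail -> []
```

With this shape, with_fault could pass the resulting list to run_container as extra_args and leave the remaining fault types to the timeout and script-injection paths.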

Build/odoc failures: Inject wrapper scripts:

#!/bin/bash
# Fake odoc that always fails
echo "Injected odoc failure" >&2
exit 1

Fault Test Examples#

let test_oom_handling () =
  with_fault (OOM_limit (50 * 1024 * 1024)) (fun () ->
    let result = Day10.build ~package:"memory-hog.1.0.0" in
    match result with
    | Error (`Resource_exhausted _) -> Pass
    | _ -> Fail "Expected OOM to be detected"
  )

let test_disk_full_handling () =
  with_fault (Disk_limit (10 * 1024 * 1024)) (fun () ->
    let result = Day10.build ~package:"large-output.1.0.0" in
    match result with
    | Error (`Resource_exhausted _) -> Pass
    | _ -> Fail "Expected disk full to be detected"
  )

let test_timeout_handling () =
  with_fault (Timeout 5) (fun () ->
    let result = Day10.build ~package:"slow-build.1.0.0" in
    match result with
    | Error (`Timeout _) -> Pass
    | _ -> Fail "Expected timeout to be detected"
  )
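These tests only pass if the build layer can tell resource exhaustion and timeouts apart from ordinary build failures. One way to do that, sketched below, is to classify the container's exit code and stderr. The function classify_exit, the `Build_failed case, and the payload types are all assumptions for illustration. The sketch also assumes the build is wrapped in GNU timeout (which exits with 124 on expiry) and relies on 137 being 128 + SIGKILL, the code typically seen when the OOM killer ends a container.

```ocaml
(* True when [sub] occurs somewhere in [s]. *)
let contains s ~sub =
  let n = String.length s and m = String.length sub in
  let rec at i = i + m <= n && (String.sub s i m = sub || at (i + 1)) in
  at 0

(* Map a finished container run to a result. Hypothetical sketch:
   the real error type of Day10.build is not specified in this plan. *)
let classify_exit ~timeout_s ~exit_code ~stderr =
  match exit_code with
  | 0 -> Ok ()
  | 124 -> Error (`Timeout timeout_s)
  | 137 -> Error (`Resource_exhausted "killed by SIGKILL, possibly the OOM killer")
  | _ when contains stderr ~sub:"No space left on device" ->
      Error (`Resource_exhausted "disk full")
  | code -> Error (`Build_failed code)
```

Exit code 137 alone cannot distinguish an OOM kill from any other SIGKILL, so a production version would likely also consult the runtime's own OOM indicator.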

Notification Testing#

Abstraction layer for testable Zulip integration.

Notifier Interface#

(* notifier.mli *)
type message = {
  stream : string;
  topic : string;
  content : string;
}

type t = {
  send : message -> unit;
}

val zulip : Zulip.Client.t -> t
(** Production notifier using ocaml-zulip *)

val mock : unit -> t * message list ref
(** Returns a notifier and a ref that collects sent messages, newest first *)

val null : t
(** Silent notifier for tests that don't care about notifications *)

Mock Implementation#

let mock () =
  let messages = ref [] in
  let send msg = messages := msg :: !messages in
  ({ send }, messages)
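For completeness, the null notifier from the interface and a helper for reading mock messages in send order could look like the sketch below. The helper sent_in_order is a hypothetical addition; it exists only because the mock prepends each message, so the ref holds the newest message first.

```ocaml
type message = {
  stream : string;
  topic : string;
  content : string;
}

type t = {
  send : message -> unit;
}

(* Same mock as above: each sent message is pushed onto the front of the list. *)
let mock () =
  let messages = ref [] in
  let send msg = messages := msg :: !messages in
  ({ send }, messages)

(* Silent notifier: drops every message. *)
let null = { send = (fun _ -> ()) }

(* Messages in the order they were sent (the mock stores newest first). *)
let sent_in_order messages = List.rev !messages
```

Tests that assert on the sequence of notifications can then match on sent_in_order messages without having to remember the reversed storage order.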

Test Usage#

let test_failure_notification () =
  let notifier, messages = Notifier.mock () in

  (* Run day10 with a package that will fail *)
  Day10.batch
    ~notifier
    ~packages:["will-fail.1.0.0"]
    ~opam_repository:fixtures_dir;

  (* Verify notification was sent *)
  match !messages with
  | [msg] ->
      (* Return Fail instead of asserting so the harness can report it *)
      if String.is_substring msg.content ~substring:"build failures"
         && String.is_substring msg.content ~substring:"will-fail.1.0.0"
      then Pass
      else Fail ("Notification does not describe the failure: " ^ msg.content)
  | [] -> Fail "Expected failure notification"
  | _ -> Fail "Expected exactly one notification"

Notification Format Verification#

let test_notification_format () =
  let notifier, messages = Notifier.mock () in

  Day10.batch
    ~notifier
    ~packages:["pkg-a.1.0.0"; "pkg-b.1.0.0"; "pkg-c.1.0.0"]
    ~opam_repository:mixed_results_fixture;

  match !messages with
  | [msg] ->
      (* Verify expected format; return Fail instead of asserting *)
      let has s = String.is_substring msg.content ~substring:s in
      if String.is_prefix msg.content ~prefix:"📦 day10 run completed"
         && has "packages built"
         && has "docs generated"
      then Pass
      else Fail ("Unexpected summary format: " ^ msg.content)
  | _ -> Fail "Expected one summary notification"

Output Validation#

Structure verification for generated documentation.

Validation Types#

type expected_file = {
  path : string;        (* relative path from package doc root *)
  required : bool;      (* false = optional *)
}

type expected_output = {
  package : string;
  version : string;
  status : [ `Success | `Build_fail | `Doc_fail | `Doc_skipped ];
  files : expected_file list;  (* only checked if status = `Success *)
}

Verification Function#

let verify_output ~html_dir expected =
  match expected.status with
  | `Success ->
      let base = html_dir / "p" / expected.package / expected.version in
      List.iter (fun file ->
        let path = base / file.path in
        if not (Sys.file_exists path) then
          if file.required then
            failf "Missing required file: %s" path
          else
            Log.warn "Missing optional file: %s" path
      ) expected.files

  | `Build_fail | `Doc_fail | `Doc_skipped ->
      (* Verify old docs still exist if they existed before *)
      ()
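The failure branch above is left as a comment. One possible way to check that old docs survive a failed run is to snapshot the package's doc directory before the run and compare afterwards, as sketched below. The functions list_files, snapshot, and damaged are hypothetical names, and comparing content digests is an assumption about what "preserved" should mean; only the standard library is used.

```ocaml
(* Recursively list regular files under [dir], as paths relative to [dir]. *)
let rec list_files ?(prefix = "") dir =
  Sys.readdir dir |> Array.to_list
  |> List.concat_map (fun entry ->
         let rel = if prefix = "" then entry else Filename.concat prefix entry in
         let full = Filename.concat dir entry in
         if Sys.is_directory full then list_files ~prefix:rel full else [ rel ])

(* Record (relative path, content digest) for every file under [dir].
   A missing directory yields an empty snapshot. *)
let snapshot dir =
  if Sys.file_exists dir && Sys.is_directory dir then
    list_files dir
    |> List.map (fun rel -> (rel, Digest.file (Filename.concat dir rel)))
  else []

(* Paths from [before] that are now missing or have different content. *)
let damaged ~before dir =
  List.filter_map
    (fun (rel, digest) ->
      let full = Filename.concat dir rel in
      if Sys.file_exists full && Digest.equal (Digest.file full) digest then None
      else Some rel)
    before
```

A test would call snapshot before the failing run and expect damaged ~before dir to be empty afterwards; an empty "before" snapshot naturally covers the case where no docs existed yet.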

Standard File Expectations#

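let standard_doc_files ~package ~has_lib ~has_bin:_ =
  (* Top-level module directory: "test-pkg" -> "Test_pkg". Assumes the
     library's main module is named after the package. has_bin is accepted
     but not used yet. *)
  let module_dir =
    String.capitalize_ascii
      (String.map (fun c -> if c = '-' then '_' else c) package)
  in
  let files = [
    { path = "index.html"; required = true };
  ] in
  if has_lib then
    { path = module_dir ^ "/index.html"; required = true } :: files
  else files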

Graceful Degradation Verification#

let test_graceful_degradation () =
  (* Setup: create initial successful docs *)
  let output_dir = Temp.create () in
  Day10.batch
    ~packages:["good-pkg.1.0.0"]
    ~output_dir
    ~opam_repository:success_fixture;

  let original_mtime =
    (Unix.stat (output_dir / "p/good-pkg/1.0.0/index.html")).st_mtime in

  (* Now run with a fixture where this package's docs fail *)
  Day10.batch
    ~packages:["good-pkg.1.0.0"]
    ~output_dir
    ~opam_repository:doc_failure_fixture;

  (* Verify original docs preserved *)
  let new_mtime =
    (Unix.stat (output_dir / "p/good-pkg/1.0.0/index.html")).st_mtime in

  if Float.(original_mtime = new_mtime) then Pass
  else Fail "Docs were modified despite failure"

Real Repository Snapshot Tests#

Compatibility testing against real-world packages.

Snapshot Storage#

tests/fixtures/real-snapshots/
├── README.md              # Documents snapshot selection rationale
├── 2026-01-15/            # Baseline snapshot
│   ├── opam-repository/   # git submodule or tarball
│   ├── packages.txt       # 50 representative packages
│   └── expected.json      # Expected outcomes
└── 2026-02-01/            # Post-update snapshot
    └── ...

Package Selection Criteria#

Each snapshot's packages.txt includes:

# Core packages (must always work)
dune.3.17.0
ocamlfind.1.9.6
cmdliner.1.3.0

# Common dependencies (high fan-in)
fmt.0.9.0
logs.0.7.0
astring.0.8.5

# Complex build scenarios
js_of_ocaml.5.8.2       # ppx + js output
cohttp-eio.6.0.0        # many dependencies
irmin.3.9.0             # large package

# Documentation-heavy
odoc.2.4.3              # self-documenting
lwt.5.7.0               # extensive docs

# Known edge cases
ocaml-variants.5.2.0+ox # compiler variant
conf-pkg-config.3       # conf package
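Since packages.txt mixes package specs with blank lines, section comments, and trailing comments, whatever reads it has to strip those first. The sketch below shows one way to parse the format shown above into name and version pairs. Both function names are hypothetical; the split at the first dot relies on opam package names not containing dots.

```ocaml
(* Parse one line of packages.txt into (name, version).
   Returns None for blank lines, comment-only lines, and entries
   without a version. *)
let parse_package_line line =
  (* Everything after the first '#' is a comment. *)
  let without_comment = List.hd (String.split_on_char '#' line) in
  match String.trim without_comment with
  | "" -> None
  | spec -> (
      (* The name ends at the first '.'; the rest is the version. *)
      match String.index_opt spec '.' with
      | None -> None
      | Some i ->
          let version = String.sub spec (i + 1) (String.length spec - i - 1) in
          Some (String.sub spec 0 i, version))

(* Parse the full contents of a packages.txt file. *)
let parse_package_list text =
  String.split_on_char '\n' text |> List.filter_map parse_package_line
```

For example, the line for js_of_ocaml parses to the name js_of_ocaml with version 5.8.2, and the trailing comment is discarded.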

Snapshot Test Runner#

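let test_real_snapshot ~snapshot_dir =
  (* packages.txt contains blank lines and '#' comments; keep only package specs *)
  let packages =
    read_lines (snapshot_dir / "packages.txt")
    |> List.filter_map (fun line ->
         match String.trim (List.hd (String.split_on_char '#' line)) with
         | "" -> None
         | pkg -> Some pkg)
  in
  let expected = Expected.load (snapshot_dir / "expected.json") in

  let result = Day10.batch
    ~opam_repository:(snapshot_dir / "opam-repository")
    ~packages
    ~output_dir:(Temp.create ())
  in

  (* List.iter2 raises Invalid_argument if the lists differ in length, so
     expected.json must list the same packages in the same order. *)
  List.iter2 (fun pkg exp ->
    match exp.status, Result.find pkg result with
    | `Success, `Success -> ()
    | `Build_fail, `Build_fail _ -> ()
    | `Doc_fail, `Doc_fail _ -> ()
    | `Doc_skipped, `Doc_skipped -> ()  (* assumes the result type has this case *)
    | expected, actual ->
        (* Expected and actual statuses have different types (only the actual
           one carries details), so each needs its own printer; both printer
           names are placeholders. *)
        failf "%s: expected %s but got %s" pkg
          (show_expected_status expected) (show_actual_status actual)
  ) packages expected.packages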

Snapshot Maintenance#

Update snapshots:

Quarterly (routine refresh)
On major OCaml release (5.3, etc.)
On major odoc release
On significant opam-repository restructuring

Each update requires:

Create new snapshot directory
Run full test suite
Update expected.json with verified outcomes
Document changes in README.md

Test Execution Strategy#

Tier 1: Fast Tests (Every Commit)#

# .github/workflows/test.yml
fast-tests:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: opam install . --deps-only
    - run: dune build
    - run: dune runtest              # Unit tests
    - run: ./tests/run-mini-repo-tests.sh
  timeout-minutes: 5

Tier 2: Full Suite (Nightly + On-Demand)#

full-tests:
  runs-on: ubuntu-latest
  if: |
    github.event_name == 'schedule' ||
    github.event_name == 'workflow_dispatch' ||
    contains(github.event.head_commit.message, '[full-tests]')
  steps:
    - uses: actions/checkout@v4
    - run: opam install . --deps-only
    - run: dune build
    - run: ./tests/run-full-suite.sh
  timeout-minutes: 90

Test Runner Scripts#

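#!/bin/bash
# tests/run-mini-repo-tests.sh

set -euo pipefail

FIXTURES_DIR="tests/fixtures/mini-repos"
WORK_DIR=$(mktemp -d)
# Single quotes defer expansion until the trap runs and keep the path quoted
trap 'rm -rf "$WORK_DIR"' EXIT

for fixture in "$FIXTURES_DIR"/*/; do
  fixture=${fixture%/}
  name=$(basename "$fixture")

  # Multi-step fixtures such as dependency-update have no single
  # opam-repository/ directory. They are assumed to be run by the OCaml
  # harness instead, so skip them here.
  if [ ! -d "$fixture/opam-repository" ]; then
    echo "=== Skipping: $name (no opam-repository/) ==="
    continue
  fi

  echo "=== Testing: $name ==="

  ./_build/install/default/bin/day10 batch \
    --opam-repository "$fixture/opam-repository" \
    --output-dir "$WORK_DIR/$name/output" \
    --cache-dir "$WORK_DIR/$name/cache" \
    "$fixture/packages.txt"

  ./tests/verify-output.exe "$fixture/expected.json" "$WORK_DIR/$name/output"
done

echo "All mini-repo tests passed!"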

Triggering Full Tests#

Three methods:

Nightly cron - Automatic at 2 AM UTC
Commit message - Include [full-tests] in commit message
Manual dispatch - GitHub Actions workflow_dispatch button

Summary#

Component	Purpose	Tier
Unit tests	Pure logic (solver, swap, notifications)	Fast
Mini-repo fixtures	Integration with controlled packages	Fast
Notification mocks	Verify Zulip integration	Fast
Output validation	Verify HTML structure	Fast
Fault injection	OOM, disk, timeout handling	Full
Real snapshots	Compatibility with real packages	Full

Key Design Decisions#

Custom test harness over alcotest/cram - better control for container-based testing
Mini repos + real snapshots - fast iteration plus real-world confidence
Tiered execution - 2-minute fast tests, 30-60 minute full suite
Container-based fault injection - realistic resource limit testing
Abstracted notifications - clean testing without mock HTTP servers
Structure-only output validation - verify files exist without content diffing

Implementation Priority#

Test harness infrastructure - Custom runner, fixture loading
Mini-repo fixtures - Start with simple-success, dependency-chain
Output validation - File existence checks
Notification mocks - Abstract notifier interface
Fault injection - Container resource limits
Real snapshots - Create first baseline snapshot