A fork of mtelver's day10 project

Gap Analysis: Replacing ocaml-docs-ci with day10

Date: 2026-02-03

Purpose: Comprehensive comparison of day10 (OHC) and ocaml-docs-ci to identify features, gaps, and requirements for replacing ocaml-docs-ci as the documentation CI system for docs.ocaml.org.


Table of Contents

  1. Executive Summary
  2. Architecture Overview
  3. Feature Comparison Matrix
  4. Detailed Gap Analysis
  5. Ecosystem Integration
  6. Implementation Roadmap
  7. Risk Assessment

Executive Summary

Current State

| Aspect | day10 | ocaml-docs-ci |
|---|---|---|
| Primary Purpose | Health checking OPAM packages (build + docs) | CI pipeline for docs.ocaml.org |
| Architecture | Standalone CLI with fork-based parallelism | OCurrent-based reactive pipeline |
| Container Runtime | runc/OCI with overlay2 layers | OCluster (single machine in practice) |
| Doc Generation | Uses odoc_driver_voodoo | Uses voodoo-do + odoc_driver_voodoo |
| State Management | File-based (layer.json) | SQLite database + OCurrent cache |
| Scalability | Single machine, forked workers | Single machine (OCluster theoretical) |

Key Findings

Important Context: While ocaml-docs-ci has OCluster infrastructure for theoretically distributed execution, in practice it runs on a single machine. This significantly reduces the gap between the two systems.

day10 Strengths:

  • Simpler, more portable architecture (Linux/Windows/FreeBSD)
  • Efficient overlay2-based incremental building
  • Direct container control without orchestration overhead
  • Standalone operation without external services
  • Comparable parallelism model (fork-based vs single-machine OCluster)

ocaml-docs-ci Strengths:

  • Production-proven for docs.ocaml.org
  • Reactive pipeline with automatic rebuilding
  • Rich monitoring and status APIs
  • Epoch-based atomic updates
  • Web UI for status visibility

Migration Complexity: MODERATE

Since both systems effectively run on single machines, the gap is smaller than it might appear from the architecture diagrams. The core documentation toolchain is shared: both rely on voodoo and odoc_driver_voodoo, although ocaml-docs-ci also uses voodoo-do. The main gaps are in orchestration (reactive vs manual), state management, and deployment infrastructure (epochs).


Architecture Overview

day10 Architecture

┌─────────────────────────────────────────────────────────────┐
│                         day10 CLI                           │
├─────────────────────────────────────────────────────────────┤
│  Commands: health-check | ci | batch | list | sync-docs     │
└─────────────────────┬───────────────────────────────────────┘
                      │
         ┌────────────┼────────────┐
         ▼            ▼            ▼
┌─────────────┐ ┌──────────┐ ┌──────────────┐
│   Solver    │ │  Builder │ │  Doc Gen     │
│ opam-0install│ │  runc    │ │odoc_driver   │
└─────────────┘ └──────────┘ └──────────────┘
         │            │            │
         └────────────┼────────────┘
                      ▼
         ┌────────────────────────┐
         │   Overlay2 Layers      │
         │   (cache_dir/)         │
         │  ├── base/fs           │
         │  ├── build-{hash}/     │
         │  ├── doc-{hash}/       │
         │  └── layer.json        │
         └────────────────────────┘

Key Characteristics:

  • Single-machine execution with fork-based parallelism
  • Layer-based caching with overlay2 filesystem
  • Deterministic hash-based layer identification
  • Direct runc container execution
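Deterministic hash-based identification can be illustrated with a short sketch. This is not day10's actual hashing scheme: the function name `layer_id` and the choice of inputs (a package plus the identifiers of the layers it depends on) are assumptions made for illustration, and `sha256sum` is assumed to be available (GNU coreutils).

```sh
# Hypothetical sketch: derive a stable layer ID from a package and the IDs
# of the layers it depends on. Not day10's actual hashing scheme.
layer_id() {
  pkg=$1
  shift
  {
    printf '%s\n' "$pkg"
    # Sort dependencies so the ID does not depend on argument order
    if [ "$#" -gt 0 ]; then
      printf '%s\n' "$@" | sort
    fi
  } | sha256sum | cut -c1-12
}
```

Because the inputs are sorted before hashing, the same package with the same dependencies always maps to the same `build-{hash}` directory name, which is what makes a cached layer safe to reuse.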

ocaml-docs-ci Architecture

┌─────────────────────────────────────────────────────────────┐
│                    ocaml-docs-ci                            │
│                  (OCurrent Pipeline)                        │
├─────────────────────────────────────────────────────────────┤
│ Stages: Track → Solve → Prep → Bless → Compile → Publish   │
└─────────────────────┬───────────────────────────────────────┘
                      │
    ┌─────────────────┼─────────────────┐
    ▼                 ▼                 ▼
┌─────────┐    ┌───────────┐    ┌──────────────┐
│ Solver  │    │  OCluster │    │   Storage    │
│ Service │    │ (Workers) │    │   Server     │
│(Cap'n P)│    │           │    │  (SSH/rsync) │
└─────────┘    └───────────┘    └──────────────┘
                    │
         ┌──────────┴──────────┐
         ▼                     ▼
┌─────────────────┐  ┌─────────────────┐
│   prep/         │  │   html/         │
│  (voodoo-prep)  │  │  (HTML output)  │
└─────────────────┘  └─────────────────┘
                           │
                           ▼
                  ┌─────────────────┐
                  │  docs.ocaml.org │
                  │ (epoch symlinks)│
                  └─────────────────┘

Key Characteristics:

  • OCluster infrastructure (but single-machine in practice)
  • Reactive pipeline (rebuilds on changes)
  • SQLite for state tracking
  • Cap'n Proto for service communication
  • Epoch-based atomic deployments

Note: Despite the distributed architecture in the diagram, ocaml-docs-ci currently runs all workers on a single machine, making it comparable to day10's fork-based approach.


Feature Comparison Matrix

Core Features

| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| Package Building | ✅ Full | ✅ Full | None |
| Documentation Generation | ✅ odoc_driver_voodoo | ✅ voodoo + odoc_driver | None |
| Dependency Solving | ✅ opam-0install | ✅ opam-0install (service) | Minor |
| Multiple OCaml Versions | ✅ Configurable | ✅ Multiple tracked | None |
| Blessing System | ✅ Implemented | ✅ Implemented | None |
| Incremental Building | ✅ overlay2 layers | ✅ prep caching | Different approach |

Orchestration & Scheduling

| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| Parallelism | ✅ Fork-based (--fork N) | ✅ OCluster (single machine) | Similar |
| Distributed Execution | ❌ Single machine | ⚠️ Single machine (theory: multi) | None (in practice) |
| Reactive Rebuilding | ❌ Manual trigger | ✅ OCurrent reactive | MAJOR GAP |
| Job Queuing | ❌ None | ✅ OCluster scheduler | Minor |
| Automatic Change Detection | ❌ Manual | ✅ Git-based tracking | MAJOR GAP |

State Management

| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| Build State Tracking | ✅ layer.json files | ✅ SQLite database | Different |
| Solution Caching | ✅ Per-commit hash | ✅ Per-commit hash | Similar |
| Pipeline History | ❌ None | ✅ Full history in DB | MAJOR GAP |
| Package Status Tracking | ⚠️ Basic (JSON) | ✅ Full (DB + API) | Moderate |
| Epoch Management | ❌ None | ✅ Full (atomic updates) | MAJOR GAP |

External Integrations

| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| opam-repository Tracking | ✅ Local path | ✅ Git clone + tracking | Minor |
| Storage Backend | ✅ Local filesystem | ✅ SSH/rsync server | Moderate |
| Web UI | ❌ None | ✅ OCurrent web | MAJOR GAP |
| API for Querying | ❌ None | ✅ Cap'n Proto API | MAJOR GAP |
| GitHub Integration | ❌ None | ✅ Via opam-repo | Minor |

Output & Publishing

| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| HTML Generation | ✅ Full | ✅ Full | None |
| Search Index | ✅ Via odoc_driver | ✅ Via voodoo-gen | None |
| Atomic Deployment | ❌ None | ✅ Epoch symlinks | MAJOR GAP |
| Valid Package List | ❌ None | ✅ Published list | Moderate |
| Sync to Remote | ✅ sync-docs command | ✅ rsync integration | Similar |

Platform Support

| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| Linux x86_64 | ✅ | ✅ | None |
| Linux arm64 | | | None |
| Windows | ✅ containerd | ❌ Linux only | day10 ahead |
| FreeBSD | ✅ | ❌ | day10 ahead |
| Multi-arch builds | | | None |

Detailed Gap Analysis

1. CRITICAL GAPS (Must Have)

1.1 Reactive Pipeline / Change Detection

ocaml-docs-ci has:

  • OCurrent-based reactive pipeline that automatically rebuilds when inputs change
  • Git-based tracking of opam-repository commits
  • Automatic detection of new/updated packages
  • Dependency-aware rebuilding (if A changes, rebuild dependents)

day10 currently:

  • No automatic change detection
  • Manual triggering required
  • No concept of "pipeline" - just single-shot execution

Implementation Options:

  1. Add OCurrent integration - Wrap day10 in OCurrent pipeline
  2. Implement custom watcher - Poll opam-repo, track changes, trigger builds
  3. External orchestration - Use GitHub Actions/Jenkins to trigger day10

Recommended: Option 1 or 3. Adding full OCurrent would be significant work but provides the richest feature set.
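For comparison, the core of a custom watcher (Option 2) can be small. The sketch below lists the package versions touched between two opam-repository commits, relying on the repository's `packages/<name>/<name>.<version>/` layout. The function name `changed_packages` is hypothetical, and how the result would be fed to day10 is left open.

```sh
# Hypothetical helper: list package versions touched between two
# opam-repository commits (added, modified, or deleted).
# Relies on the layout packages/<name>/<name>.<version>/...
changed_packages() {
  repo=$1
  old=$2
  new=$3
  git -C "$repo" diff --name-only "$old" "$new" -- packages/ |
    cut -d/ -f3 |
    sort -u
}
```

A caller would record the last processed commit, run `changed_packages /opam-repo "$last" "$head"` after each pull, and rebuild only the listed versions. Deleted packages also appear in the output, so a real watcher would need to treat them separately.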


1.2 Distributed Execution (Not a Real Gap)

Reality check: While ocaml-docs-ci has OCluster infrastructure, it runs on a single machine in practice. This means:

  • Both systems effectively use single-machine parallelism
  • day10's fork-based approach (--fork N) is comparable to ocaml-docs-ci's actual operation
  • OCluster adds overhead without providing real distribution benefits in current deployment

Conclusion: This is not a gap for the migration. day10's existing parallelism model is sufficient.

Future consideration: If true distribution becomes needed, day10 could add OCluster support, but this is not required for feature parity with the current production system.


1.3 Epoch-Based Deployment

ocaml-docs-ci has:

  • Epoch system for versioned artifact collections
  • Atomic promotion via symlinks (html-current → html-live)
  • Garbage collection of old epochs
  • Safe rollback capability

day10 currently:

  • No epoch concept
  • Direct file output
  • No atomic update mechanism

Implementation Required:

  • Add epoch directory management
  • Implement symlink-based promotion
  • Add epoch cleanup/GC functionality
  • Support for the html-current → html-live workflow
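The items above can be sketched in a few lines of shell. This is a minimal illustration under assumed names, not ocaml-docs-ci's implementation: it assumes epochs live in `epoch-<id>/html` directories under a common root with an `html-live` symlink beside them, the functions `promote_epoch` and `gc_epochs` are hypothetical, and `mv -T` requires GNU coreutils.

```sh
# Hypothetical layout: $root/epoch-<id>/html plus a $root/html-live symlink.

# Point html-live at the given epoch without a window where the link is missing.
promote_epoch() {
  root=$1
  id=$2
  [ -d "$root/epoch-$id/html" ] || return 1
  # Create the new link under a temporary name, then rename it into place;
  # the rename replaces the old link in a single step.
  ln -sfn "epoch-$id/html" "$root/html-live.tmp"
  mv -T "$root/html-live.tmp" "$root/html-live"
}

# Remove every epoch directory except the one html-live points at.
gc_epochs() {
  root=$1
  live=$(readlink "$root/html-live")
  for dir in "$root"/epoch-*; do
    [ -d "$dir" ] || continue
    [ "${dir##*/}/html" = "$live" ] || rm -rf "$dir"
  done
}
```

Rollback is then just `promote_epoch` with an older epoch ID, which is why garbage collection should only run once the previous epoch is no longer needed as a fallback.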

1.4 Web UI & Monitoring

ocaml-docs-ci has:

  • OCurrent-based web dashboard
  • Real-time pipeline status
  • Job logs viewable in browser
  • Package-level status tracking

day10 currently:

  • No web interface
  • CLI-only interaction
  • No real-time monitoring

Implementation Options:

  1. Use OCurrent web - If integrating with OCurrent
  2. Build custom web UI - Separate web service reading day10 state
  3. Static status pages - Generate HTML status reports

Recommended: Option 1 if using OCurrent, otherwise Option 3 for minimal viable monitoring.


1.5 Remote API

ocaml-docs-ci has:

  • Cap'n Proto RPC API for querying pipeline state
  • Package status queries
  • Pipeline health checks
  • CLI client (ocaml-docs-ci-client)

day10 currently:

  • No remote API
  • No programmatic access to state
  • Cannot query status without reading files

Implementation Options:

  1. Add Cap'n Proto service - Match ocaml-docs-ci interface
  2. REST API - Simpler but different from existing ecosystem
  3. GraphQL - Modern but overkill for this use case

Recommended: Option 1 for compatibility with existing tooling.


2. MODERATE GAPS (Should Have)

2.1 Database-Backed State

ocaml-docs-ci: SQLite database tracking pipeline runs, package statuses, build history

day10: File-based state (layer.json, JSON outputs)

Gap Impact: Harder to query historical data, no pipeline-level tracking

Implementation: Add SQLite or similar for tracking builds over time


2.2 Solver Service Architecture

ocaml-docs-ci: External solver service via Cap'n Proto, can run multiple solvers in parallel

day10: In-process solving, one solve at a time per fork

Gap Impact: Potentially slower for large solve operations

Implementation: Could extract solver to service, but current approach works


2.3 Valid Package List Publishing

ocaml-docs-ci: Publishes list of successfully-built packages for ocaml.org filtering

day10: No concept of valid package list

Implementation: Add post-build step to generate/publish valid package manifest
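Such a post-build step could be as simple as scanning the blessed HTML tree. The sketch below assumes that the presence of a `p/{pkg}/{ver}/` directory means the documentation built successfully, and it emits one `pkg.ver` entry per line. The function name `valid_packages` and the output format are assumptions, not the format ocaml.org currently consumes.

```sh
# Hypothetical post-build step: list every <pkg>.<ver> that has blessed
# HTML output under <html_root>/p/<pkg>/<ver>/.
valid_packages() {
  html_root=$1
  for dir in "$html_root"/p/*/*/; do
    [ -d "$dir" ] || continue       # skip the unexpanded glob when empty
    dir=${dir%/}
    ver=${dir##*/}
    pkg=${dir%/*}
    pkg=${pkg##*/}
    printf '%s.%s\n' "$pkg" "$ver"
  done | sort
}
```

Running `valid_packages /data/html-current > valid-packages.txt` after each batch would give a manifest that can be published alongside the HTML.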


3. MINOR GAPS (Nice to Have)

3.1 Storage Server Integration

ocaml-docs-ci: SSH/rsync to remote storage server, automatic sync

day10: Local filesystem, manual sync-docs command

Gap Impact: Requires additional orchestration for remote deployment


3.2 Multiple opam-repository Sources

ocaml-docs-ci: Tracks specific git repository with commit history

day10: Supports multiple local paths, no git tracking

Gap Impact: Cannot automatically detect new packages


4. DAY10 ADVANTAGES

Features day10 has that ocaml-docs-ci lacks:

| Feature | Benefit |
|---|---|
| Windows Support | Can build Windows packages |
| FreeBSD Support | Can build BSD packages |
| Simpler Deployment | No cluster infrastructure needed |
| Layer-based Caching | More efficient disk usage with overlay2 |
| Standalone Operation | Works without external services (OCluster, solver-service) |
| Direct Container Control | Lower latency, no scheduler overhead |
| Equivalent Parallelism | Fork-based model matches ocaml-docs-ci's actual single-machine operation |
| Simpler Debugging | No distributed system complexity to troubleshoot |

Ecosystem Integration

Voodoo Integration

Both day10 and ocaml-docs-ci use the same documentation toolchain:

                     ┌─────────────────┐
                     │  voodoo-prep    │
                     │ (artifact prep) │
                     └────────┬────────┘
                              │
              ┌───────────────┴───────────────┐
              ▼                               ▼
     ┌─────────────────┐             ┌─────────────────┐
     │   voodoo-do     │             │odoc_driver_voodoo│
     │  (compile/link) │             │   (all-in-one)   │
     └────────┬────────┘             └────────┬────────┘
              │                               │
              └───────────────┬───────────────┘
                              ▼
                     ┌─────────────────┐
                     │   voodoo-gen    │
                     │  (HTML output)  │
                     └─────────────────┘

day10 uses: odoc_driver_voodoo (modern unified approach)

ocaml-docs-ci uses: Both voodoo-do and odoc_driver_voodoo

Integration Status: ✅ Compatible - both can produce compatible output

OCluster Integration (Optional - Not Required for Parity)

Note: Since ocaml-docs-ci runs on a single machine in practice, OCluster integration is not required for feature parity. day10's existing fork-based parallelism provides equivalent functionality.

Current ocaml-docs-ci reality:
┌─────────────────────────────────────────────────────────────┐
│                     OCluster Scheduler                       │
│                    (Single Machine)                         │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
                   ┌───────────┐
                   │  Worker   │  ← All workers on same machine
                   │ (linux-   │
                   │  x86_64)  │
                   └───────────┘

If future scaling is needed, day10 could add OCluster:

  1. Add current_ocluster dependency
  2. Generate OBuilder specs from day10 build commands
  3. Submit jobs via OCluster API
  4. Collect results from worker output

But this is a future enhancement, not a migration requirement.

Solver Service Integration

The solver-service repository provides a standalone solving service:

┌──────────────┐     Cap'n Proto     ┌────────────────┐
│   day10      │ ─────────────────── │ solver-service │
│  (client)    │      solve()        │   (server)     │
└──────────────┘                     └────────────────┘

Current day10: In-process opam-0install

Migration option: Use solver-service for consistency with ecosystem


Implementation Roadmap

Phase 1: Core Infrastructure (Weeks 1-4)

Goal: Establish foundation for docs.ocaml.org integration

| Task | Priority | Effort | Dependencies |
|---|---|---|---|
| 1.1 Add epoch management | P0 | Medium | None |
| 1.2 Implement valid package list | P0 | Low | None |
| 1.3 Add remote storage sync (SSH/rsync) | P0 | Medium | None |
| 1.4 SQLite state tracking | P1 | Medium | None |

Deliverable: day10 can produce epoch-structured output compatible with docs.ocaml.org

Phase 2: Change Detection (Weeks 5-8)

Goal: Automatic rebuilding on opam-repository changes

| Task | Priority | Effort | Dependencies |
|---|---|---|---|
| 2.1 Git-based opam-repo tracking | P0 | Medium | None |
| 2.2 Change detection algorithm | P0 | High | 2.1 |
| 2.3 Dependency-aware rebuild | P1 | High | 2.2 |
| 2.4 Incremental solution updates | P1 | Medium | 2.2 |

Deliverable: day10 can detect and rebuild changed packages automatically
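The dependency-aware rebuild in task 2.3 amounts to computing a reverse-dependency closure: start from the changed packages and keep adding their dependents until nothing new appears. The sketch below is a hedged illustration. The input format (a file of `package dependency` pairs, one per line) and the function name `rebuild_set` are assumptions, since day10's solution files would first need to be flattened into such a list.

```sh
# Hypothetical input: a file with one "package dependency" pair per line.
# Prints the changed packages plus everything that transitively depends on them.
rebuild_set() {
  deps=$1
  shift
  printf '%s\n' "$@" | awk -v deps="$deps" '
    NF { todo[$1] = 1 }                      # seed with the changed packages
    END {
      # Build the reverse map: dependency -> space-separated dependents
      while ((getline line < deps) > 0) {
        split(line, f, " ")
        rdeps[f[2]] = rdeps[f[2]] " " f[1]
      }
      # Add dependents until no new package appears (fixed point)
      do {
        added = 0
        for (p in todo) {
          n = split(rdeps[p], ds, " ")
          for (i = 1; i <= n; i++)
            if (!(ds[i] in todo)) fresh[ds[i]] = 1
        }
        for (p in fresh) { todo[p] = 1; added++ }
        split("", fresh)                     # clear the scratch set
      } while (added)
      for (p in todo) print p
    }' | sort
}
```

For a file containing `b a` and `c b`, calling `rebuild_set deps.txt a` yields `a`, `b` and `c`, because `b` depends on `a` and `c` depends on `b`; packages with no path to a changed package are left out.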

Phase 3: Distributed Execution (Skipped)

Not required: Since ocaml-docs-ci runs on a single machine in practice, day10's existing fork-based parallelism (--fork N) provides equivalent functionality. OCluster integration can be added later if true distribution becomes necessary.

Time saved: 6 weeks

Phase 3 (was 4): Monitoring & API (Weeks 9-12)

Goal: Production observability and integration

| Task | Priority | Effort | Dependencies |
|---|---|---|---|
| 3.1 Cap'n Proto API service | P1 | High | 1.4 |
| 3.2 Status query endpoints | P1 | Medium | 3.1 |
| 3.3 Web dashboard (or static pages) | P2 | Medium | 3.1 |
| 3.4 Health check endpoints | P2 | Low | 3.1 |

Note: API/monitoring is lower priority if day10 runs as a batch job (like ocaml-docs-ci in practice).

Deliverable: day10 provides status visibility (at minimum via static pages/JSON)

Phase 4 (was 5): Migration & Cutover (Weeks 13-16)

Goal: Replace ocaml-docs-ci in production

| Task | Priority | Effort | Dependencies |
|---|---|---|---|
| 4.1 Parallel run comparison | P0 | Medium | All above |
| 4.2 Output compatibility validation | P0 | Medium | 4.1 |
| 4.3 Gradual traffic shift | P0 | Low | 4.2 |
| 4.4 Full cutover | P0 | Low | 4.3 |
| 4.5 ocaml-docs-ci deprecation | P2 | Low | 4.4 |

Deliverable: day10 is the production system for docs.ocaml.org
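The parallel run comparison (4.1) and output validation (4.2) can start from a plain recursive diff of the two HTML trees. The helper below is a sketch under assumptions: `compare_html` is a hypothetical name, and byte-for-byte equality may turn out to be stricter than necessary if the two pipelines embed timestamps or differ in tool versions.

```sh
# Hypothetical check for a parallel run: compare two HTML trees file by file.
# Prints the differing or missing files and returns non-zero if any exist.
compare_html() {
  reference=$1
  candidate=$2
  if diff -rq "$reference" "$candidate"; then
    echo "identical"
  else
    return 1
  fi
}
```

Running it per package (for example on the `p/{pkg}/{ver}/` directories of both systems) keeps the report readable and makes it easy to track which packages still diverge.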

Revised Timeline Summary

| Phase | Original | Revised | Savings |
|---|---|---|---|
| Core Infrastructure | Weeks 1-4 | Weeks 1-4 | - |
| Change Detection | Weeks 5-8 | Weeks 5-8 | - |
| Distributed Execution | Weeks 9-14 | Skipped | 6 weeks |
| Monitoring & API | Weeks 15-18 | Weeks 9-12 | - |
| Migration | Weeks 19-22 | Weeks 13-16 | - |
| Total | 22 weeks | 16 weeks | 6 weeks |

Risk Assessment

High Risk

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Output format incompatibility | Low | High | Comprehensive comparison testing |
| Epoch management bugs | Medium | High | Extensive testing, staged rollout |

Medium Risk

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Performance regression | Medium | Medium | Benchmark early, optimize iteratively |
| Change detection complexity | Medium | Medium | Start with simple polling approach |
| State tracking gaps | Medium | Medium | Design carefully, review with team |

Low Risk

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Voodoo incompatibility | Low | High | Already using same tools |
| Platform regressions | Low | Low | Existing test coverage |
| Parallelism issues | Low | Low | Both systems use single-machine model |

Note: OCluster integration risk removed since it's not required for parity.


Recommendations

Immediate Actions

  1. Validate voodoo compatibility - Confirm day10 and ocaml-docs-ci produce identical HTML output for the same package
  2. Design epoch system - Document epoch structure and promotion workflow
  3. Prototype change detection - Simple git-based tracking of opam-repository changes

Architecture Decision

Recommended Approach: Incremental enhancement of day10

Since both systems run on single machines in practice, day10's architecture is actually well-suited for the task. The migration is simpler than the theoretical architecture comparison suggests.

Key additions needed:

  1. Epoch management - For atomic deployments (similar to ocaml-docs-ci)
  2. Change detection - Git-based tracking of opam-repository
  3. Valid package list - For ocaml.org integration
  4. Status reporting - JSON/static HTML for visibility

Not needed for parity:

  • OCluster integration (single-machine in practice)
  • Full OCurrent reactive pipeline (can use simpler cron/polling)
  • Cap'n Proto API (if batch job model is acceptable)

Simplest Migration Path

Rather than adding OCurrent complexity, consider a simpler operational model:

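# Polling loop; run it under a supervisor (e.g. a systemd service) rather than cron
while true; do
  git -C /opam-repo pull --ff-only
  head=$(git -C /opam-repo rev-parse HEAD)
  # A missing state file (first run) counts as a change
  if [ "$head" != "$(cat /state/last-commit 2>/dev/null)" ]; then
    # changed-packages.json must be produced by a separate change-detection step
    if day10 batch --cache-dir /cache --opam-repository /opam-repo \
        --html-output /data/html-current @changed-packages.json; then
      # Atomic promotion: create the new symlink, then rename it over the old one
      # (mv -T requires GNU coreutils)
      ln -sfn /data/html-current /data/html-live.tmp
      mv -T /data/html-live.tmp /data/html-live
      echo "$head" > /state/last-commit
    fi
  fi
  sleep 3600
done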

This provides:

  • Automatic change detection
  • Incremental rebuilding
  • Atomic deployments
  • No additional infrastructure

Alternative: OCurrent Wrapper

If reactive behavior and web UI are required, wrap day10 in OCurrent:

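(* Hypothetical OCurrent pipeline using day10. track_opam_repo, solve, day10_build,
   day10_docs and publish_epoch are placeholders, as are the Package, Solution and
   Build modules, which stand for ordered types describing each stage's values. *)
let pipeline ~config () =
  let packages = track_opam_repo () in
  (* Current.list_map takes the element type's module as its first argument *)
  let solutions = Current.list_map (module Package) solve packages in
  let builds = Current.list_map (module Solution) (day10_build ~config) solutions in
  let docs = Current.list_map (module Build) (day10_docs ~config) builds in
  publish_epoch docs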

This adds complexity but provides OCurrent's monitoring and caching.


Appendix A: File Structure Comparison

day10 Output Structure

cache_dir/
├── {os_key}/
│   ├── base/fs/
│   ├── build-{hash}/
│   │   ├── fs/
│   │   └── layer.json
│   └── doc-{hash}/
│       ├── fs/
│       │   └── html/
│       │       ├── p/{pkg}/{ver}/
│       │       └── u/{universe}/{pkg}/{ver}/
│       └── layer.json
└── solutions/
    └── {repo-sha}/
        └── {pkg}.json

ocaml-docs-ci Output Structure

/data/
├── prep/
│   └── universes/{u}/{pkg}/{ver}/
├── compile/
│   ├── p/{pkg}/{ver}/
│   └── u/{u}/{pkg}/{ver}/
├── linked/
│   ├── p/{pkg}/{ver}/
│   └── u/{u}/{pkg}/{ver}/
├── html-raw/
│   ├── p/{pkg}/{ver}/
│   └── u/{u}/{pkg}/{ver}/
└── epoch-{hash}/
    └── html/
        └── (symlinks to html-raw)

Appendix B: Glossary

| Term | Definition |
|---|---|
| Epoch | A versioned collection of documentation artifacts, enabling atomic updates |
| Blessed | The canonical/primary documentation version for a package (lives in p/) |
| Universe | A specific set of package dependencies, identified by hash |
| Layer | An overlay2 filesystem layer containing build artifacts |
| OCluster | OCaml's distributed build cluster system |
| OCurrent | Reactive CI/CD pipeline framework for OCaml |
| voodoo | Documentation preparation and generation toolchain |
| odoc_driver_voodoo | Unified driver for odoc compilation/linking/generation |

| Repository | Purpose | URL |
|---|---|---|
| ocaml-docs-ci | Current docs.ocaml.org CI | github.com/ocurrent/ocaml-docs-ci |
| voodoo | Doc preparation tools | github.com/ocaml-doc/voodoo |
| ocluster | Distributed build cluster | github.com/ocurrent/ocluster |
| solver-service | Dependency solving service | github.com/ocurrent/solver-service |
| odoc | Documentation compiler | github.com/ocaml/odoc |