Gap Analysis: Replacing ocaml-docs-ci with day10#
Date: 2026-02-03
Purpose: Comprehensive comparison of day10 (OHC) and ocaml-docs-ci to identify features, gaps, and requirements for replacing ocaml-docs-ci as the documentation CI system for docs.ocaml.org.
Table of Contents#
- Executive Summary
- Architecture Overview
- Feature Comparison Matrix
- Detailed Gap Analysis
- Ecosystem Integration
- Implementation Roadmap
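- Risk Assessment
- Recommendations
- Appendix A: File Structure Comparison
- Appendix B: Glossary
- Appendix C: Related Repositories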
Executive Summary#
Current State#
| Aspect | day10 | ocaml-docs-ci |
|---|---|---|
| Primary Purpose | Health checking OPAM packages (build + docs) | CI pipeline for docs.ocaml.org |
| Architecture | Standalone CLI with fork-based parallelism | OCurrent-based reactive pipeline |
| Container Runtime | runc/OCI with overlay2 layers | OCluster (single machine in practice) |
| Doc Generation | Uses odoc_driver_voodoo | Uses voodoo-do + odoc_driver_voodoo |
| State Management | File-based (layer.json) | SQLite database + OCurrent cache |
| Scalability | Single machine, forked workers | Single machine (OCluster theoretical) |
Key Findings#
Important Context: While ocaml-docs-ci has OCluster infrastructure for theoretically distributed execution, in practice it runs on a single machine. This significantly reduces the gap between the two systems.
day10 Strengths:
- Simpler, more portable architecture (Linux/Windows/FreeBSD)
- Efficient overlay2-based incremental building
- Direct container control without orchestration overhead
- Standalone operation without external services
- Comparable parallelism model (fork-based vs single-machine OCluster)
ocaml-docs-ci Strengths:
- Production-proven for docs.ocaml.org
- Reactive pipeline with automatic rebuilding
- Rich monitoring and status APIs
- Epoch-based atomic updates
- Web UI for status visibility
Migration Complexity: MODERATE#
Since both systems effectively run on single machines, the gap is smaller than it might appear from the architecture diagrams. Both systems generate documentation with the same toolchain (voodoo and odoc_driver_voodoo), so their output should be compatible, although this still needs to be validated. The main gaps are in orchestration (reactive vs manual), state management, and deployment infrastructure (epochs).
Architecture Overview#
day10 Architecture#
┌─────────────────────────────────────────────────────────────┐
│                          day10 CLI                          │
├─────────────────────────────────────────────────────────────┤
│  Commands: health-check | ci | batch | list | sync-docs     │
└─────────────────────┬───────────────────────────────────────┘
                      │
         ┌────────────┼────────────┐
         ▼            ▼            ▼
 ┌─────────────┐ ┌──────────┐ ┌──────────────┐
 │   Solver    │ │ Builder  │ │   Doc Gen    │
 │opam-0install│ │   runc   │ │ odoc_driver  │
 └─────────────┘ └──────────┘ └──────────────┘
         │            │            │
         └────────────┼────────────┘
                      ▼
         ┌────────────────────────┐
         │    Overlay2 Layers     │
         │    (cache_dir/)        │
         │    ├── base/fs         │
         │    ├── build-{hash}/   │
         │    ├── doc-{hash}/     │
         │    └── layer.json      │
         └────────────────────────┘
Key Characteristics:
- Single-machine execution with fork-based parallelism
- Layer-based caching with overlay2 filesystem
- Deterministic hash-based layer identification
- Direct runc container execution
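To make "deterministic hash-based layer identification" concrete, here is a minimal bash sketch. It is not day10's actual scheme, which this document does not describe. The function name `layer_id`, the `pkg:`/`dep:` input format, and the 12-character truncation are all assumptions for illustration. The point is only that sorting the dependency identifiers before hashing makes the result independent of argument order.

```shell
# Hypothetical sketch: derive a stable layer id from a package and the ids of
# the layers it depends on. The real day10 hashing inputs may differ.
# Usage: layer_id PACKAGE [DEP_LAYER_ID...]
layer_id() {
  local pkg="$1"; shift
  {
    printf 'pkg:%s\n' "$pkg"
    # Sort dependency ids so that argument order does not change the hash.
    if [ "$#" -gt 0 ]; then
      printf 'dep:%s\n' "$@" | sort
    fi
  } | sha256sum | cut -c1-12
}
```

With this sketch, `layer_id foo.1.0 aaa bbb` and `layer_id foo.1.0 bbb aaa` print the same id, so a layer directory such as `build-{hash}/` can be reused whenever the same inputs recur.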
ocaml-docs-ci Architecture#
┌─────────────────────────────────────────────────────────────┐
│                        ocaml-docs-ci                        │
│                     (OCurrent Pipeline)                     │
├─────────────────────────────────────────────────────────────┤
│  Stages: Track → Solve → Prep → Bless → Compile → Publish   │
└─────────────────────┬───────────────────────────────────────┘
                      │
    ┌─────────────────┼─────────────────┐
    ▼                 ▼                 ▼
┌─────────┐     ┌───────────┐    ┌──────────────┐
│ Solver  │     │ OCluster  │    │   Storage    │
│ Service │     │ (Workers) │    │    Server    │
│(Cap'n P)│     │           │    │ (SSH/rsync)  │
└─────────┘     └───────────┘    └──────────────┘
                      │
           ┌──────────┴──────────┐
           ▼                     ▼
  ┌─────────────────┐   ┌─────────────────┐
  │     prep/       │   │     html/       │
  │  (voodoo-prep)  │   │  (HTML output)  │
  └─────────────────┘   └─────────────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │ docs.ocaml.org  │
                        │(epoch symlinks) │
                        └─────────────────┘
Key Characteristics:
- OCluster infrastructure (but single-machine in practice)
- Reactive pipeline (rebuilds on changes)
- SQLite for state tracking
- Cap'n Proto for service communication
- Epoch-based atomic deployments
Note: Despite the distributed architecture in the diagram, ocaml-docs-ci currently runs all workers on a single machine, making it comparable to day10's fork-based approach.
Feature Comparison Matrix#
Core Features#
| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| Package Building | ✅ Full | ✅ Full | None |
| Documentation Generation | ✅ odoc_driver_voodoo | ✅ voodoo + odoc_driver | None |
| Dependency Solving | ✅ opam-0install | ✅ opam-0install (service) | Minor |
| Multiple OCaml Versions | ✅ Configurable | ✅ Multiple tracked | None |
| Blessing System | ✅ Implemented | ✅ Implemented | None |
| Incremental Building | ✅ overlay2 layers | ✅ prep caching | Different approach |
Orchestration & Scheduling#
| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| Parallelism | ✅ Fork-based (--fork N) | ✅ OCluster (single machine) | Similar |
| Distributed Execution | ❌ Single machine | ⚠️ Single machine (theory: multi) | None (in practice) |
| Reactive Rebuilding | ❌ Manual trigger | ✅ OCurrent reactive | MAJOR GAP |
| Job Queuing | ❌ None | ✅ OCluster scheduler | Minor |
| Automatic Change Detection | ❌ Manual | ✅ Git-based tracking | MAJOR GAP |
State Management#
| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| Build State Tracking | ✅ layer.json files | ✅ SQLite database | Different |
| Solution Caching | ✅ Per-commit hash | ✅ Per-commit hash | Similar |
| Pipeline History | ❌ None | ✅ Full history in DB | MAJOR GAP |
| Package Status Tracking | ⚠️ Basic (JSON) | ✅ Full (DB + API) | Moderate |
| Epoch Management | ❌ None | ✅ Full (atomic updates) | MAJOR GAP |
External Integrations#
| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| opam-repository Tracking | ✅ Local path | ✅ Git clone + tracking | Minor |
| Storage Backend | ✅ Local filesystem | ✅ SSH/rsync server | Moderate |
| Web UI | ❌ None | ✅ OCurrent web | MAJOR GAP |
| API for Querying | ❌ None | ✅ Cap'n Proto API | MAJOR GAP |
| GitHub Integration | ❌ None | ✅ Via opam-repo | Minor |
Output & Publishing#
| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| HTML Generation | ✅ Full | ✅ Full | None |
| Search Index | ✅ Via odoc_driver | ✅ Via voodoo-gen | None |
| Atomic Deployment | ❌ None | ✅ Epoch symlinks | MAJOR GAP |
| Valid Package List | ❌ None | ✅ Published list | Moderate |
| Sync to Remote | ✅ sync-docs command | ✅ rsync integration | Similar |
Platform Support#
| Feature | day10 | ocaml-docs-ci | Gap Level |
|---|---|---|---|
| Linux x86_64 | ✅ | ✅ | None |
| Linux arm64 | ✅ | ✅ | None |
| Windows | ✅ containerd | ❌ Linux only | day10 ahead |
| FreeBSD | ✅ | ❌ | day10 ahead |
| Multi-arch builds | ✅ | ✅ | None |
Detailed Gap Analysis#
1. CRITICAL GAPS (Must Have)#
1.1 Reactive Pipeline / Change Detection#
ocaml-docs-ci has:
- OCurrent-based reactive pipeline that automatically rebuilds when inputs change
- Git-based tracking of opam-repository commits
- Automatic detection of new/updated packages
- Dependency-aware rebuilding (if A changes, rebuild dependents)
day10 lacks:
- No automatic change detection
- Manual triggering required
- No concept of "pipeline" - just single-shot execution
Implementation Options:
1. Add OCurrent integration - Wrap day10 in an OCurrent pipeline
2. Implement a custom watcher - Poll opam-repo, track changes, trigger builds
3. External orchestration - Use GitHub Actions/Jenkins to trigger day10
Recommended: Option 1 or 3. Adding full OCurrent would be significant work but provides the richest feature set.
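Whichever option is chosen, the core of change detection is mapping a range of opam-repository commits to the package versions they touch. The bash sketch below assumes the standard opam-repository layout (`packages/<name>/<name>.<version>/opam`). The function names are hypothetical and not part of day10.

```shell
# Hypothetical sketch: turn changed file paths into package versions.
# Reads paths such as packages/lwt/lwt.5.7.0/opam on stdin and prints the
# unique <name>.<version> directory names; paths outside packages/ are ignored.
changed_packages_from_paths() {
  awk -F/ '$1 == "packages" && NF >= 4 { print $3 }' | sort -u
}

# Usage: changed_packages REPO_DIR OLD_COMMIT NEW_COMMIT
# Lists package versions whose files differ between the two commits.
changed_packages() {
  git -C "$1" diff --name-only "$2" "$3" -- packages/ | changed_packages_from_paths
}
```

A watcher could call `changed_packages /opam-repo "$last" "$head"` after each pull and pass the result to a day10 run. This only finds directly modified packages; dependents still have to be added separately.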
1.2 Distributed Execution (Not a Real Gap)#
Reality check: While ocaml-docs-ci has OCluster infrastructure, it runs on a single machine in practice. This means:
- Both systems effectively use single-machine parallelism
- day10's fork-based approach (--fork N) is comparable to ocaml-docs-ci's actual operation
- OCluster adds overhead without providing real distribution benefits in current deployment
Conclusion: This is not a gap for the migration. day10's existing parallelism model is sufficient.
Future consideration: If true distribution becomes needed, day10 could add OCluster support, but this is not required for feature parity with the current production system.
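For readers unfamiliar with the model, single-machine parallelism of this kind amounts to a bounded pool of worker processes. The bash sketch below illustrates the idea with `xargs -P`. It is not how day10 implements `--fork N`, and `run_parallel` is a hypothetical helper name.

```shell
# Hypothetical sketch of a bounded worker pool on one machine.
# Usage: run_parallel N CMD [ARGS...]
# Reads one item per line on stdin and runs "CMD ARGS... ITEM" for each,
# keeping at most N processes running at a time.
run_parallel() {
  local jobs="$1"; shift
  xargs -P "$jobs" -n 1 "$@"
}
```

For example, `printf '%s\n' a.1 b.2 c.3 | run_parallel 2 echo building` processes three items with at most two running concurrently. `xargs` exits non-zero if any invocation fails, which a caller can use to flag partial failures.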
1.3 Epoch-Based Deployment#
ocaml-docs-ci has:
- Epoch system for versioned artifact collections
- Atomic promotion via symlinks (html-current → html-live)
- Garbage collection of old epochs
- Safe rollback capability
day10 lacks:
- No epoch concept
- Direct file output
- No atomic update mechanism
Implementation Required:
- Add epoch directory management
- Implement symlink-based promotion
- Add epoch cleanup/GC functionality
- Support for the html-current → html-live workflow
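The promotion and cleanup steps listed above can be sketched in bash as follows. The directory layout (`ROOT/epoch-*/html` plus a `ROOT/html-live` symlink) loosely follows Appendix A but is an assumption, as are the function names. `mv -T` is specific to GNU coreutils.

```shell
# Hypothetical sketch of epoch promotion and garbage collection.
# Usage: promote_epoch ROOT EPOCH
# Points ROOT/html-live at ROOT/EPOCH/html.
promote_epoch() {
  local root="$1" epoch="$2"
  [ -d "$root/$epoch/html" ] || { echo "missing $root/$epoch/html" >&2; return 1; }
  # Create the new symlink under a temporary name, then rename it over the
  # old one. rename(2) is atomic, so readers see the old or the new target.
  ln -sfn "$epoch/html" "$root/html-live.tmp"
  mv -T "$root/html-live.tmp" "$root/html-live"
}

# Usage: gc_epochs ROOT KEEP
# Deletes all but the KEEP most recently modified epoch-* directories and
# never deletes the epoch that html-live currently points into.
gc_epochs() {
  local root="$1" keep="$2" live dir
  live="$(readlink "$root/html-live" 2>/dev/null || true)"
  live="${live%%/*}"
  ls -1dt "$root"/epoch-*/ 2>/dev/null | tail -n +"$((keep + 1))" |
    while read -r dir; do
      dir="${dir%/}"
      [ "$(basename "$dir")" = "$live" ] && continue
      rm -rf -- "$dir"
    done
}
```

Rollback in this model is just `promote_epoch` with an older epoch name, which is why the garbage collector should keep at least one previous epoch.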
1.4 Web UI & Monitoring#
ocaml-docs-ci has:
- OCurrent-based web dashboard
- Real-time pipeline status
- Job logs viewable in browser
- Package-level status tracking
day10 lacks:
- No web interface
- CLI-only interaction
- No real-time monitoring
Implementation Options:
1. Use OCurrent web - If integrating with OCurrent
2. Build custom web UI - Separate web service reading day10 state
3. Static status pages - Generate HTML status reports
Recommended: Option 1 if using OCurrent, otherwise Option 3 for minimal viable monitoring.
1.5 Remote API#
ocaml-docs-ci has:
- Cap'n Proto RPC API for querying pipeline state
- Package status queries
- Pipeline health checks
- CLI client (ocaml-docs-ci-client)
day10 lacks:
- No remote API
- No programmatic access to state
- Cannot query status without reading files
Implementation Options:
1. Add Cap'n Proto service - Match ocaml-docs-ci interface
2. REST API - Simpler but different from existing ecosystem
3. GraphQL - Modern but overkill for this use case
Recommended: Option 1 for compatibility with existing tooling.
2. MODERATE GAPS (Should Have)#
2.1 Database-Backed State#
ocaml-docs-ci: SQLite database tracking pipeline runs, package statuses, build history
day10: File-based state (layer.json, JSON outputs)
Gap Impact: Harder to query historical data, no pipeline-level tracking
Implementation: Add SQLite or similar for tracking builds over time
2.2 Solver Service Architecture#
ocaml-docs-ci: External solver service via Cap'n Proto, can run multiple solvers in parallel
day10: In-process solving, one solve at a time per fork
Gap Impact: Potentially slower for large solve operations
Implementation: Could extract solver to service, but current approach works
2.3 Valid Package List Publishing#
ocaml-docs-ci: Publishes list of successfully-built packages for ocaml.org filtering
day10: No concept of valid package list
Implementation: Add post-build step to generate/publish valid package manifest
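One possible shape for that post-build step is sketched below in bash. It walks the blessed tree (`p/{pkg}/{ver}/`, as in Appendix A) and prints one line per package version. Two things here are assumptions: that an `index.html` directly under the version directory marks a successful build, and the plain `name version` output format. The manifest format that ocaml.org actually consumes is not specified in this document.

```shell
# Hypothetical sketch: list package versions that have generated docs.
# Usage: valid_packages HTML_ROOT
# Prints "NAME VERSION" for each HTML_ROOT/p/NAME/VERSION/index.html.
valid_packages() {
  local root="$1" f rel
  for f in "$root"/p/*/*/index.html; do
    [ -e "$f" ] || continue        # the glob matched nothing
    rel="${f#"$root"/p/}"          # NAME/VERSION/index.html
    rel="${rel%/index.html}"       # NAME/VERSION
    printf '%s %s\n' "${rel%%/*}" "${rel#*/}"
  done | sort
}
```

Writing the result to a temporary file and renaming it into place would keep consumers from ever reading a half-written list.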
3. MINOR GAPS (Nice to Have)#
3.1 Storage Server Integration#
ocaml-docs-ci: SSH/rsync to remote storage server, automatic sync
day10: Local filesystem, manual sync-docs command
Gap Impact: Requires additional orchestration for remote deployment
3.2 Multiple opam-repository Sources#
ocaml-docs-ci: Tracks specific git repository with commit history
day10: Supports multiple local paths, no git tracking
Gap Impact: Cannot automatically detect new packages
4. DAY10 ADVANTAGES#
Features day10 has that ocaml-docs-ci lacks:
| Feature | Benefit |
|---|---|
| Windows Support | Can build Windows packages |
| FreeBSD Support | Can build BSD packages |
| Simpler Deployment | No cluster infrastructure needed |
| Layer-based Caching | More efficient disk usage with overlay2 |
| Standalone Operation | Works without external services (OCluster, solver-service) |
| Direct Container Control | Lower latency, no scheduler overhead |
| Equivalent Parallelism | Fork-based model matches ocaml-docs-ci's actual single-machine operation |
| Simpler Debugging | No distributed system complexity to troubleshoot |
Ecosystem Integration#
Voodoo Integration#
Both day10 and ocaml-docs-ci use the same documentation toolchain:
                ┌─────────────────┐
                │   voodoo-prep   │
                │ (artifact prep) │
                └────────┬────────┘
                         │
         ┌───────────────┴───────────────┐
         ▼                               ▼
┌─────────────────┐            ┌──────────────────┐
│    voodoo-do    │            │odoc_driver_voodoo│
│ (compile/link)  │            │   (all-in-one)   │
└────────┬────────┘            └─────────┬────────┘
         │                               │
         └───────────────┬───────────────┘
                         ▼
                ┌─────────────────┐
                │   voodoo-gen    │
                │  (HTML output)  │
                └─────────────────┘
day10 uses: odoc_driver_voodoo (modern unified approach)
ocaml-docs-ci uses: Both voodoo-do and odoc_driver_voodoo
Integration Status: ✅ Compatible - both can produce compatible output
OCluster Integration (Optional - Not Required for Parity)#
Note: Since ocaml-docs-ci runs on a single machine in practice, OCluster integration is not required for feature parity. day10's existing fork-based parallelism provides equivalent functionality.
Current ocaml-docs-ci reality:
┌─────────────────────────────────────────────────────────────┐
│                     OCluster Scheduler                      │
│                      (Single Machine)                       │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
                    ┌───────────┐
                    │  Worker   │ ← All workers on same machine
                    │  (linux-  │
                    │  x86_64)  │
                    └───────────┘
If future scaling is needed, day10 could add OCluster:
- Add current_ocluster dependency
- Generate OBuilder specs from day10 build commands
- Submit jobs via OCluster API
- Collect results from worker output
But this is a future enhancement, not a migration requirement.
Solver Service Integration#
The solver-service repository provides a standalone solving service:
┌──────────────┐     Cap'n Proto     ┌────────────────┐
│    day10     │ ─────────────────── │ solver-service │
│   (client)   │       solve()       │    (server)    │
└──────────────┘                     └────────────────┘
Current day10: In-process opam-0install
Migration option: Use solver-service for consistency with ecosystem
Implementation Roadmap#
Phase 1: Core Infrastructure (Weeks 1-4)#
Goal: Establish foundation for docs.ocaml.org integration
| Task | Priority | Effort | Dependencies |
|---|---|---|---|
| 1.1 Add epoch management | P0 | Medium | None |
| 1.2 Implement valid package list | P0 | Low | None |
| 1.3 Add remote storage sync (SSH/rsync) | P0 | Medium | None |
| 1.4 SQLite state tracking | P1 | Medium | None |
Deliverable: day10 can produce epoch-structured output compatible with docs.ocaml.org
Phase 2: Change Detection (Weeks 5-8)#
Goal: Automatic rebuilding on opam-repository changes
| Task | Priority | Effort | Dependencies |
|---|---|---|---|
| 2.1 Git-based opam-repo tracking | P0 | Medium | None |
| 2.2 Change detection algorithm | P0 | High | 2.1 |
| 2.3 Dependency-aware rebuild | P1 | High | 2.2 |
| 2.4 Incremental solution updates | P1 | Medium | 2.2 |
Deliverable: day10 can detect and rebuild changed packages automatically
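Task 2.3 (dependency-aware rebuild) boils down to a reverse-dependency closure: starting from the changed packages, collect everything that depends on them, directly or transitively. The bash and awk sketch below shows the traversal. The edge-file format (one `package dependency` pair per line) and the function name are assumptions. In practice the edges would come from day10's stored solutions.

```shell
# Hypothetical sketch: compute the set of packages to rebuild.
# Usage: rebuild_set EDGES_FILE CHANGED...
# EDGES_FILE holds one "package dependency" pair per line.
# Prints the changed packages plus all of their transitive dependents.
rebuild_set() {
  local edges="$1"; shift
  printf '%s\n' "$@" | awk '
    # First input (the edges file): record reverse dependencies.
    FILENAME == ARGV[1] { rdeps[$2] = rdeps[$2] " " $1; next }
    # Second input (stdin): seed the work queue with the changed packages.
    NF > 0 && !($1 in seen) { seen[$1] = 1; queue[++n] = $1 }
    END {
      # Breadth-first walk; the queue grows while it is being scanned.
      for (i = 1; i <= n; i++) {
        m = split(rdeps[queue[i]], r, " ")
        for (j = 1; j <= m; j++)
          if (!(r[j] in seen)) { seen[r[j]] = 1; queue[++n] = r[j] }
      }
      for (i = 1; i <= n; i++) print queue[i]
    }' "$edges" - | sort
}
```

Given edges `b a`, `c b` and `e a`, a change to `a` yields `a`, `b`, `c` and `e`, while a change to `c` yields only `c`.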
Phase 3: Distributed Execution (Skipped)#
Not required: Since ocaml-docs-ci runs on a single machine in practice, day10's existing fork-based parallelism (--fork N) provides equivalent functionality. OCluster integration can be added later if true distribution becomes necessary.
Time saved: 6 weeks
Phase 3 (was 4): Monitoring & API (Weeks 9-12)#
Goal: Production observability and integration
| Task | Priority | Effort | Dependencies |
|---|---|---|---|
| 3.1 Cap'n Proto API service | P1 | High | 1.4 |
| 3.2 Status query endpoints | P1 | Medium | 3.1 |
| 3.3 Web dashboard (or static pages) | P2 | Medium | 3.1 |
| 3.4 Health check endpoints | P2 | Low | 3.1 |
Note: API/monitoring is lower priority if day10 runs as a batch job (like ocaml-docs-ci in practice).
Deliverable: day10 provides status visibility (at minimum via static pages/JSON)
Phase 4 (was 5): Migration & Cutover (Weeks 13-16)#
Goal: Replace ocaml-docs-ci in production
| Task | Priority | Effort | Dependencies |
|---|---|---|---|
| 4.1 Parallel run comparison | P0 | Medium | All above |
| 4.2 Output compatibility validation | P0 | Medium | 4.1 |
| 4.3 Gradual traffic shift | P0 | Low | 4.2 |
| 4.4 Full cutover | P0 | Low | 4.3 |
| 4.5 ocaml-docs-ci deprecation | P2 | Low | 4.4 |
Deliverable: day10 is the production system for docs.ocaml.org
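For tasks 4.1 and 4.2, a per-package comparison of the two HTML trees gives a quick first signal. The bash sketch below assumes both systems write blessed docs under `p/{pkg}/{ver}/` (as in Appendix A) and compares files byte for byte with `diff -rq`. The function name and the SAME/DIFF/MISSING labels are hypothetical. Byte equality is a strict test, so embedded timestamps or differing asset hashes would show up as DIFF even when the pages are equivalent.

```shell
# Hypothetical sketch: compare two documentation trees package by package.
# Usage: compare_packages REFERENCE_ROOT CANDIDATE_ROOT
# Prints one line per package version found under REFERENCE_ROOT/p:
#   SAME p/NAME/VER, DIFF p/NAME/VER or MISSING p/NAME/VER
compare_packages() {
  local ref="$1" cand="$2" d rel
  for d in "$ref"/p/*/*/; do
    [ -d "$d" ] || continue        # the glob matched nothing
    rel="${d#"$ref"/}"
    rel="${rel%/}"
    if [ ! -d "$cand/$rel" ]; then
      echo "MISSING $rel"
    elif diff -rq "$d" "$cand/$rel" >/dev/null; then
      echo "SAME $rel"
    else
      echo "DIFF $rel"
    fi
  done
}
```

Piping the output through `cut -d' ' -f1 | sort | uniq -c` gives a summary count per status for a parallel-run report.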
Revised Timeline Summary#
| Phase | Original | Revised | Savings |
|---|---|---|---|
| Core Infrastructure | Weeks 1-4 | Weeks 1-4 | - |
| Change Detection | Weeks 5-8 | Weeks 5-8 | - |
| Distributed Execution | Weeks 9-14 | Skipped | 6 weeks |
| Monitoring & API | Weeks 15-18 | Weeks 9-12 | - |
| Migration | Weeks 19-22 | Weeks 13-16 | - |
| Total | 22 weeks | 16 weeks | 6 weeks |
Risk Assessment#
High Risk#
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Output format incompatibility | Low | High | Comprehensive comparison testing |
| Epoch management bugs | Medium | High | Extensive testing, staged rollout |
Medium Risk#
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Performance regression | Medium | Medium | Benchmark early, optimize iteratively |
| Change detection complexity | Medium | Medium | Start with simple polling approach |
| State tracking gaps | Medium | Medium | Design carefully, review with team |
Low Risk#
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Voodoo incompatibility | Low | High | Already using same tools |
| Platform regressions | Low | Low | Existing test coverage |
| Parallelism issues | Low | Low | Both systems use single-machine model |
Note: OCluster integration risk removed since it's not required for parity.
Recommendations#
Immediate Actions#
- Validate voodoo compatibility - Confirm day10 and ocaml-docs-ci produce identical HTML output for the same package
- Design epoch system - Document epoch structure and promotion workflow
- Prototype change detection - Simple git-based tracking of opam-repository changes
Architecture Decision#
Recommended Approach: Incremental enhancement of day10
Since both systems run on single machines in practice, day10's architecture is actually well-suited for the task. The migration is simpler than the theoretical architecture comparison suggests.
Key additions needed:
- Epoch management - For atomic deployments (similar to ocaml-docs-ci)
- Change detection - Git-based tracking of opam-repository
- Valid package list - For ocaml.org integration
- Status reporting - JSON/static HTML for visibility
Not needed for parity:
- OCluster integration (single-machine in practice)
- Full OCurrent reactive pipeline (can use simpler cron/polling)
- Cap'n Proto API (if batch job model is acceptable)
Simplest Migration Path#
Rather than adding OCurrent complexity, consider a simpler operational model:
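# Polling loop; alternatively, run the loop body once per cron job or systemd timer tick
while true; do
  git -C /opam-repo pull --quiet
  head=$(git -C /opam-repo rev-parse HEAD)
  if [ "$head" != "$(cat /state/last-commit 2>/dev/null)" ]; then
    # changed-packages.json must be produced by a separate change-detection step
    day10 batch --cache-dir /cache --opam-repository /opam-repo \
      --html-output /data/html-current @changed-packages.json
    # Atomic promotion: create the new symlink, then rename it over the old one (GNU mv)
    ln -sfn /data/html-current /data/html-live.tmp
    mv -T /data/html-live.tmp /data/html-live
    # Note: html-live points at the directory day10 writes into; a per-run
    # output directory would be needed for readers never to see partial output
    echo "$head" > /state/last-commit
  fi
  sleep 3600
done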
This provides:
- Automatic change detection
- Incremental rebuilding
- Atomic deployments
- No additional infrastructure
Alternative: OCurrent Wrapper#
If reactive behavior and web UI are required, wrap day10 in OCurrent:
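(* Hypothetical OCurrent pipeline using day10. track_opam_repo, solve,
   day10_build, day10_docs, publish_epoch and the Package, Solution and
   Build modules are placeholders that do not exist yet. *)
let pipeline ~config () =
  let packages = track_opam_repo () in
  (* Current.list_map takes an ordered module for the element type first *)
  let solutions = Current.list_map (module Package) solve packages in
  let builds = Current.list_map (module Solution) (day10_build ~config) solutions in
  let docs = Current.list_map (module Build) (day10_docs ~config) builds in
  publish_epoch docs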
This adds complexity but provides OCurrent's monitoring and caching.
Appendix A: File Structure Comparison#
day10 Output Structure#
cache_dir/
├── {os_key}/
│   ├── base/fs/
│   ├── build-{hash}/
│   │   ├── fs/
│   │   └── layer.json
│   └── doc-{hash}/
│       ├── fs/
│       │   └── html/
│       │       ├── p/{pkg}/{ver}/
│       │       └── u/{universe}/{pkg}/{ver}/
│       └── layer.json
└── solutions/
    └── {repo-sha}/
        └── {pkg}.json
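The `solutions/{repo-sha}/{pkg}.json` layout above is what makes per-commit solution caching possible: a solve is only needed when no file exists for that commit and package. The bash sketch below shows the lookup pattern. The function names and the idea of passing the solver as a command are assumptions, not day10's interface.

```shell
# Hypothetical sketch of per-commit solution caching.
# Usage: solution_path CACHE_DIR REPO_SHA PKG
solution_path() {
  printf '%s/solutions/%s/%s.json\n' "$1" "$2" "$3"
}

# Usage: cached_or_solve CACHE_DIR REPO_SHA PKG SOLVER_CMD [ARGS...]
# Prints the cached solution if present. Otherwise runs "SOLVER_CMD ARGS... PKG",
# stores its stdout under the per-commit path, and prints it.
cached_or_solve() {
  local cache="$1" sha="$2" pkg="$3" path tmp
  shift 3
  path="$(solution_path "$cache" "$sha" "$pkg")"
  if [ ! -f "$path" ]; then
    mkdir -p "$(dirname "$path")"
    tmp="$path.tmp.$$"
    # Write to a temporary file first so a failed solve leaves no cache entry.
    "$@" "$pkg" > "$tmp" || { rm -f "$tmp"; return 1; }
    mv "$tmp" "$path"
  fi
  cat "$path"
}
```

Because the commit SHA is part of the path, a new opam-repository commit naturally invalidates the cache without any explicit expiry logic.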
ocaml-docs-ci Output Structure#
/data/
├── prep/
│   └── universes/{u}/{pkg}/{ver}/
├── compile/
│   ├── p/{pkg}/{ver}/
│   └── u/{u}/{pkg}/{ver}/
├── linked/
│   ├── p/{pkg}/{ver}/
│   └── u/{u}/{pkg}/{ver}/
├── html-raw/
│   ├── p/{pkg}/{ver}/
│   └── u/{u}/{pkg}/{ver}/
└── epoch-{hash}/
    └── html/
        └── (symlinks to html-raw)
Appendix B: Glossary#
| Term | Definition |
|---|---|
| Epoch | A versioned collection of documentation artifacts, enabling atomic updates |
| Blessed | The canonical/primary documentation version for a package (lives in p/) |
| Universe | A specific set of package dependencies, identified by hash |
| Layer | An overlay2 filesystem layer containing build artifacts |
| OCluster | OCaml's distributed build cluster system |
| OCurrent | Reactive CI/CD pipeline framework for OCaml |
| voodoo | Documentation preparation and generation toolchain |
| odoc_driver_voodoo | Unified driver for odoc compilation/linking/generation |
Appendix C: Related Repositories#
| Repository | Purpose | URL |
|---|---|---|
| ocaml-docs-ci | Current docs.ocaml.org CI | github.com/ocurrent/ocaml-docs-ci |
| voodoo | Doc preparation tools | github.com/ocaml-doc/voodoo |
| ocluster | Distributed build cluster | github.com/ocurrent/ocluster |
| solver-service | Dependency solving service | github.com/ocurrent/solver-service |
| odoc | Documentation compiler | github.com/ocaml/odoc |