My aggregated monorepo of OCaml code, automaintained

Site redesign, new content, blog gen, E2E tests, and build improvements

New blog posts (monopam-madness, open-source-and-ai, weeknotes-2026-10),
notebook showcase with card layout and screenshots, Atom feed generator,
foundations notebook fixes, ONNX test improvements, widget interaction
tests, deploy script updates for oxcaml switch, and .gitignore for
build artifacts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

+4570 -1450
+8
.gitignore
  _build
+ _site/
  *.bc.js
+ .playwright-mcp/
+ test-results/
+ test/e2e/test-results/
+ test/e2e/playwright-report/
+ test/e2e/node_modules/
+ \#*\#
+ .#*
+1213 -870
atom.xml
··· 1 1 <?xml version="1.0" encoding="UTF-8"?> 2 - <feed xmlns="http://www.w3.org/2005/Atom"><link href="https://jon.recoil.org/blog/" rel="alternate"/><link href="https://jon.recoil.org/atom.xml" rel="self"/><id>https://jon.recoil.org/atom.xml</id><title type="text">Jon's blog</title><updated>2026-03-06T15:40:56-00:00</updated><entry><summary type="text">Highlights:</summary><published>2026-02-09T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;weeknotes-for-week-6&quot;&gt;&lt;a href=&quot;#weeknotes-for-week-6&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Weeknotes for week 6&lt;/h1&gt; 3 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2026-02-09&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 4 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;odoc.xref2,odoc.loader,odoc.model&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 5 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;packages&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;packages&lt;/span&gt; &lt;p&gt;odoc&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 6 - &lt;p&gt;Highlights:&lt;/p&gt; 7 - &lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://jon.ludl.am/experiments/day10-jtw/standalone/index.html&quot;&gt;day10 / javascript toplevels integration&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://jon.ludl.am/experiments/scrollycoder/&quot;&gt;Scrollycode experiments&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt; 8 - &lt;h2 id=&quot;oxmono&quot;&gt;&lt;a href=&quot;#oxmono&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Oxmono&lt;/h2&gt; 9 - &lt;p&gt;I spent some time on Anil's oxmono repo getting odoc to work correctly. It turned out that the bug I was working on last week was critically important for this - and that the bugfix was incomplete. 
One of the issues was to do with identifiers needing to be unique. For example, consider the following code:&lt;/p&gt; 10 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S = sig 2 + <feed xmlns="http://www.w3.org/2005/Atom"> 3 + <id>https://jon.recoil.org/atom.xml</id> 4 + <title>Jon's blog</title> 5 + <updated>2026-03-10T00:31:40Z</updated> 6 + <author> 7 + <name>Jon Ludlam</name> 8 + <uri>https://jon.recoil.org/</uri> 9 + </author> 10 + <link rel="self" href="https://jon.recoil.org/atom.xml"/> 11 + <link rel="alternate" href="https://jon.recoil.org/blog/"/> 12 + <entry> 13 + <id>https://jon.recoil.org/blog/2026/03/weeknotes-2026-10.html</id> 14 + <title>Weeknotes 2026 week 10</title> 15 + <published>2026-03-09T00:00:00Z</published> 16 + <updated>2026-03-09T00:00:00Z</updated> 17 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/03/weeknotes-2026-10.html"/> 18 + <summary>Here are my weeknotes for the last week, while I'm still writing up some more focused posts on some specific topics - like the experience of putting everything in a monorepo to create this site, and m...</summary> 19 + <content type="html"><![CDATA[<h1 id="weeknotes-2026-week-10"><a href="#weeknotes-2026-week-10" class="anchor"></a>Weeknotes 2026 week 10</h1> 20 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-03-09</p></li></ul> 21 + <p>Here are my weeknotes for the last week, while I'm still writing up some more focused posts on some specific topics - like the experience of putting everything in a monorepo to create this site, and more notes on Claude and Agentic coding in general, and its impact on the world of software. But for now, here's what I've been up to.</p> 22 + <h2 id="what-did-i-do?"><a href="#what-did-i-do?" class="anchor"></a>What did I do?</h2> 23 + <ul><li><p>New site design. The old site was a bit of a mess and was simply reusing odoc's default default styling. 
I've also rearranged the content a bit to make it more navigable and cohesive.</p><div><a href="old.png" class="img-link"><img src="old.png" alt="old.png"/></a></div><div><a href="new.png" class="img-link"><img src="new.png" alt="new.png"/></a></div></li><li><p>TESSERA in the browser is a <a href="https://tee.cl.cam.ac.uk/">hot</a> <a href="https://anil.recoil.org/notes/2026w10">topic</a> right now, so I've applied the work I've been doing with x-ocaml, js_top_worker and odoc plugins to make a <a href="/notebooks/interactive_map.html">TESSERA notebook</a> that's based on the <a href="https://github.com/ucam-eo/tessera-interactive-map">example notebook</a>.</p><div><a href="tessera.png" class="img-link"><img src="tessera.png" alt="tessera.png"/></a></div></li><li>I was interested in whether we'll be able to do inference in reasonable time using these notebooks. <a href="https://onnx.ai/">ONNX</a> has a web version of its runtime, so I got Claude to make some bindings, and checked it was working by doing a sentiment analysis notebook. This is working nicely, so the next step is to do something a bit more useful.</li><li>The docs CI was again causing problems. This time it had decided that it had never built anything, and therefore needed to rebuild the entire world. However, despite being set up as a custom dedicated runner, all its jobs were queued waiting to start. It turned out that the runner paused itself when the docker partition reached 70%. This was a little surprising on two counts - firstly we don't actually use docker for running the jobs, we use obuilder, which doesn't share space with docker. Secondly, with that in mind, how did it get to 70%? It turned out to be the job logs - including 250 gigs of older logs from a previous instance. Simply blowing those away caused everything to restart and so it's now live again.</li><li>I met up with <a href="">Andrés C. Zúñiga-González</a> to have a chat about how he's using interactive maps and notebooks.
He pointed me at his <a href="https://ancazugo.github.io/blog.html">blog</a>, some of which is using <a href="https://quarto.org/">quarto</a>, which he rates very highly. An <a href="https://ancazugo.github.io/posts/2025-11-16-tessera_example.html">example of quarto output</a>.</li><li>Our group seminar this week was <a href="https://tombearpark.com/">Tom Bearpark</a> who talked about his proposed 'Carbon at Risk' measure in order to compare diverse ways of removing carbon from the atmosphere to help with the carbon removal market.</li></ul> 24 + <h2 id="what's-next?"><a href="#what's-next?" class="anchor"></a>What's next?</h2> 25 + <ul><li>More writing before more coding, I think.</li></ul>]]></content> 26 + </entry> 27 + <entry> 28 + <id>https://jon.recoil.org/blog/2026/03/weeknotes-2026-09.html</id> 29 + <title>Weeknotes 2026 week 9</title> 30 + <published>2026-03-02T00:00:00Z</published> 31 + <updated>2026-03-02T00:00:00Z</updated> 32 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/03/weeknotes-2026-09.html"/> 33 + <summary>Let's make this really terse!</summary> 34 + <content type="html"><![CDATA[<h1 id="weeknotes-2026-week-9"><a href="#weeknotes-2026-week-9" class="anchor"></a>Weeknotes 2026 week 9</h1> 35 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-03-02</p></li></ul> 36 + <ul class="at-tags"><li class="notanotebook"><span class="at-tag">notanotebook</span> </li></ul> 37 + <p>Let's make this really terse!</p> 38 + <h2 id="what-did-i-do?"><a href="#what-did-i-do?" class="anchor"></a>What did I do?</h2> 39 + <ul><li>Got docs working with github actions on Anil's oxmono monorepo. Results are <a href="https://jonludlam.github.io/oxmono/">here</a>.
This includes experimental support for oxcaml modes/layouts.</li><li><p>Got markdown mode output into Sherlodoc's db so you can query it - great for agents!</p><div><a href="search.png" class="img-link"><img src="search.png" alt="search.png"/></a></div></li><li><p>Widgets in the JS OCaml toplevels - using FRP for the interactions. The neat thing about using FRP via Daniel Bunzli's <a href="">note</a> library is that the interactions are all purely functional, no refs or mutables in sight. You provide a little wrapper script that's run in the frontend, and the interactions are sent back and forth with the worker running the code, where they're translated into Events and Signals. My proof-of-concept of this is a widget that works with the <a href="https://leafletjs.com/">leaflet.js</a> library:</p><div><video src="mapdemo.mov" controls="controls" aria-label="mapdemo.mov"></video></div><p>Demo coming soon!</p></li><li>Consolidating all of the Odoc toplevel bits and pieces into the one monorepo. Again, demo of this coming soon!</li></ul> 40 + <h2 id="what-am-i-going-to-do?"><a href="#what-am-i-going-to-do?"
class="anchor"></a>What am I going to do?</h2> 41 + <ul><li>New website!</li><li>Odoc plugins showcase</li><li>Writing writing writing writing</li></ul>]]></content> 42 + </entry> 43 + <entry> 44 + <id>https://jon.recoil.org/blog/2026/02/weeknotes-2026-08.html</id> 45 + <title>Weeknotes weeks 7-8</title> 46 + <published>2026-02-24T00:00:00Z</published> 47 + <updated>2026-02-24T00:00:00Z</updated> 48 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/02/weeknotes-2026-08.html"/> 49 + <summary>A combination one again as I took some time off due to school half term.</summary> 50 + <content type="html"><![CDATA[<h1 id="weeknotes-weeks-7-8"><a href="#weeknotes-weeks-7-8" class="anchor"></a>Weeknotes weeks 7-8</h1> 51 + <ul class="at-tags"><li class="notanotebook"><span class="at-tag">notanotebook</span> </li></ul> 52 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-02-24</p></li></ul> 53 + <p>A combination one again as I took some time off due to school half term.</p> 54 + <h2 id="finished-off-my-exam-questions"><a href="#finished-off-my-exam-questions" class="anchor"></a>Finished off my exam questions</h2> 55 + <p>This was a lot of fun! Obviously I can't talk about it, but while it was stressful and worrying and anxiety inducing and scary, it was also engaging and interesting and thought-provoking. Having some ideas come together to make a nice coherent whole was very cool.</p> 56 + <h2 id="testing-llms-on-past-paper-questions"><a href="#testing-llms-on-past-paper-questions" class="anchor"></a>Testing LLMs on past paper questions</h2> 57 + <p>Similar to our work on the ticks that <a href="https://www.youtube.com/watch?v=Ub8k1BcSRLQ">Sadiq, I and others did last year</a>, I wanted to try to see how well LLMs could answer tripos questions. 
Partly I wanted to do this so I could check that my own questions were of the right sort of level, and partly it was just a displacement activity while I wasn't making progress on the actual exam questions! I've not done a useful analysis of the results yet, but seemed in line with our experience with the ticks, though the pass rate was lower for the same models (qwen).</p> 58 + <h2 id="claude-from-a-sunbed"><a href="#claude-from-a-sunbed" class="anchor"></a>Claude from a sunbed</h2> 59 + <p>I went away for a vitamin-D boosting bit of sun. Before I went, I got Claude to spin me up a little Telegram bridge so that I could tell it what to do, while it's still running in safeties-off mode on my sacrificial VM. This was kind of fun - I got to just indulge thoughts as they came to me, and off it would go and do stuff. It was a bit limited in how it talked back to me, which wasn't by design but turned out to be nice for this sort of workflow. The downside is that I've now got a load of stuff to sift through - much of which is a 'good start', but none of it is likely to be usable without a good deal more effort. 
Here's a short-list of things I had it do:</p> 60 + <ul><li>Resurrected Fay Carson's work on the <a href="https://github.com/ocaml/odoc/pull/1295">Menhir parser for odoc</a>, pushed <a href="https://github.com/jonludlam/odoc/tree/menhir-parser-rebased">here</a></li><li>Added some instrumentation to Odoc to do some performance experiments</li><li>Ran some simple experiments to measure the impact of various pre-existing performance knobs/switches</li><li>Resurrected an old patch of mine to <a href="https://github.com/jonludlam/odoc/tree/parameterised-paths">unify the two path representations</a> in odoc to measure its effect on performance.</li><li>Tested aggressive reuse of records if their fields don't change during compile/link</li><li>Mixed up the <a href="https://tangled.org/jon.recoil.org/odoc-scrollycode-extension">scrollycode backend</a> and the x-ocaml backend and stuck a playground on at each step</li><li>Unified the oxcaml/ocaml branches of <a href="https://tangled.org/jon.recoil.org/js_top_worker">js_top_worker</a> and x-ocaml via cppo</li><li>Added oxcaml mode/layout annotations to odoc</li></ul> 61 + <h2 id="oxcaml"><a href="#oxcaml" class="anchor"></a>OxCaml</h2> 62 + <p>I investigated the oxcaml docs build, which I had got working last week. Anil reported that it wasn't working for him, so I looked at the build I had and it definitely <i>was</i> working. However, I was building on our machine Monteverde, which is a bit of a beast, so I checked the memory usage and it was enormous! I tried the build again on my 64 gig VM and it OOM'd. I'd noticed before that the <code>cmti</code> files for base, in particular <code>base__Container.cmti</code>, were absolutely massive, and so had just assumed that the problem was that. Luke had also mentioned that some of the output from the template machinery was hidden. However, I had Claude look into this and it couldn't see any doc stop comments.
So I asked it to look a little closer and figure out what was using all the memory. It took an unexpectedly large number of prods from me to finally figure out what was going on - it was to do with how odoc processes <code>includes</code> - specifically an <code>include sig ... end</code>. Essentially an include of that type ends up doubling the storage required for the signature. As the ppx_template extension does quite a lot of this, and in particular nests them, this ends up going exponential and this turned out to be the cause of most of the memory usage. With a fair bit more prodding by me, Claude and I eventually got to a solution, which I'll be upstreaming soon - the fix applies to OCaml as well as OxCaml, but it's this particularly pathological usage of includes that ppx_template uses where it'll make the most difference.</p> 63 + <h2 id="odoc,-plugins,-js-and-more"><a href="#odoc,-plugins,-js-and-more" class="anchor"></a>Odoc, plugins, JS and more</h2> 64 + <p>Teaser... I have a blog post coming soon with more on this.
It's been a lot of fun, and should provide a decent inspiration for a roadmap for Odoc and online notebooks!</p>]]></content> 65 + </entry> 66 + <entry> 67 + <id>https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html</id> 68 + <title>Weeknotes for week 6</title> 69 + <published>2026-02-09T00:00:00Z</published> 70 + <updated>2026-02-09T00:00:00Z</updated> 71 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html"/> 72 + <summary>Highlights:</summary> 73 + <content type="html"><![CDATA[<h1 id="weeknotes-for-week-6"><a href="#weeknotes-for-week-6" class="anchor"></a>Weeknotes for week 6</h1> 74 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-02-09</p></li></ul> 75 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>odoc.xref2,odoc.loader,odoc.model</p></li></ul> 76 + <ul class="at-tags"><li class="packages"><span class="at-tag">packages</span> <p>odoc</p></li></ul> 77 + <p>Highlights:</p> 78 + <ul><li><a href="https://jon.ludl.am/experiments/day10-jtw/standalone/index.html">day10 / javascript toplevels integration</a></li><li><a href="https://jon.ludl.am/experiments/scrollycoder/">Scrollycode experiments</a></li></ul> 79 + <h2 id="oxmono"><a href="#oxmono" class="anchor"></a>Oxmono</h2> 80 + <p>I spent some time on Anil's oxmono repo getting odoc to work correctly. It turned out that the bug I was working on last week was critically important for this - and that the bugfix was incomplete. One of the issues was to do with identifiers needing to be unique. 
For example, consider the following code:</p> 81 + <div><pre class="language-ocaml"><code>module type S = sig 11 82 type t 12 83 13 84 include sig 14 85 type t 15 86 16 - val f : t -&amp;gt; t 87 + val f : t -&gt; t 17 88 end with type t := t 18 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 19 - &lt;p&gt;The problem here is that both definitions of `type t` have the same identifier, which causes problems when we move to and from the 'Component' types. The solution was to introduce a 'dummy' parent for the type defined within the include. This works because we never actually render the body of the include into HTML - we render the &lt;i&gt;expansion&lt;/i&gt;, which &lt;i&gt;doesn't&lt;/i&gt; have &lt;code&gt;type t&lt;/code&gt; in it, as it has been substituted out.&lt;/p&gt; 20 - &lt;p&gt;The fix I made last week fixed the &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_loader&quot;&gt;loader&lt;/span&gt;, which reads in the &lt;code&gt;cmt&lt;/code&gt;/&lt;code&gt;cmti&lt;/code&gt; files produced by the compiler. There's one more place where we create these in the code - when we translate from the &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_xref2.Component&quot;&gt;Component&lt;/span&gt; types back into &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang&quot;&gt;Lang&lt;/span&gt; types. I was a little curious about whether it was possible to make this happen, so I thought I'd ask Claude to see if it could come up with a scenario where we'd end up in this situation. 
This was a complete failure, which was a real disappointment to me, as doing this sort of thing is a quite tedious and annoying part of working on odoc.&lt;/p&gt; 21 - &lt;p&gt;Meanwhile, I was running odoc on Anil's &lt;a href=&quot;https://github.com/avsm/oxmono&quot;&gt;oxmono&lt;/a&gt; repo, which was using &lt;a href=&quot;https://github.com/art-w&quot;&gt;art-w&lt;/a&gt;'s &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1399&quot;&gt;PR to upstream oxcaml support&lt;/a&gt;. It was failing with an exception that was very familiar, so I pulled in the fix I'd been working on, and that enabled it to get much further. However, it did subsequently fail with another slightly different exception. I had my suspicions at this point that it might be due to the other place, but I thought this again was a good opportunity to test Claude's debugging skills. However, this again was a complete failure. I spend quite a long time prodding it - at least 4 separate sessions - and it really didn't get anywhere close to a solution, despite knowing precisely that the commit we'd made that had fixed the first problem. Two of the four times it ended up telling me that the oxcaml compiler was broken and suggesting that we create an issue!&lt;/p&gt; 22 - &lt;p&gt;I'm only very mildly disappointed in this - it's all quite subtle, and something I still end up scratching my head over sometimes, but it would have been wonderful to be able to offload this sort of work!&lt;/p&gt; 23 - &lt;p&gt;In any case, the docs now all build on &lt;a href=&quot;https://github.com/jonludlam/oxmono/commit/2a53f6857d5b8849a73f5bb3e5244b9ac0f36708&quot;&gt;my fork of oxmono&lt;/a&gt;.&lt;/p&gt; 24 - &lt;h2 id=&quot;docs-ci&quot;&gt;&lt;a href=&quot;#docs-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Docs CI&lt;/h2&gt; 25 - &lt;p&gt;The fix I deployed last week for ocaml-docs-ci was taking forever to complete, so I ended up spending some time investigating this. 
The problem was happening during the 'prep' phase, which is the first part of the pipeline where we simply build the package to be documented. This is supposed to work by building a graph of all inter-package dependencies across all of the solved packages, so we maximise sharing of built artefacts. Each 'prep' job builds precisely one package by coping in the dependencies from previous prep jobs, then running &lt;a href=&quot;https://github.com/jonludlam/opamh&quot;&gt;opamh&lt;/a&gt; to fix up the metadata so that opam believes it has installed everything itself, then running opam to build the one package required. It was this last step that was going wrong, where it would decide that there had been upstream changes to the compiler itself, and rebuild &lt;i&gt;everything&lt;/i&gt;, so rather than a prep job taking a few seconds, it would take a few minutes.&lt;/p&gt; 26 - &lt;p&gt;I was totally unable to repro this locally - everything build very quickly and just how it should have done. After much head-scratching I finally realised that the problem was somewhere in the caching. I think what's going on is that we dynamically build an opam repository to make the `opam install` command faster, and that repo contains only the packages that are required to build whatever it is we're building. Those opam files are cached by the docs CI server and passed to the build script as a base64-encoded gzipped tarball inline in the obuilder file (!). This should all be totally consistent as we're also caching all the builds - except for the compiler itself, which comes from the base docker image. This, of course, is the problem. The ocaml compiler opam files had been updated, and then when we reconstructed the opam repo with our cached opam files, opam noticed they had changed (gone &lt;i&gt;backwards&lt;/i&gt; in time!) and decided it needed to rebuild the compiler, and therefore &lt;i&gt;everything&lt;/i&gt; else. 
Clearing out the opam-files cache and restarting the builds fixed this entirely, and the full rebuild job completed after about 2 days. I flipped the switch on Saturday night and the docs are now fully up to date again. Phew!&lt;/p&gt; 27 - &lt;h2 id=&quot;day10-work&quot;&gt;&lt;a href=&quot;#day10-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;day10 work&lt;/h2&gt; 28 - &lt;p&gt;This was a fun week of large-scale building! I integrated day10 and odoc_driver and js_top_worker and x-ocaml and have now successfully got a docs-ci-like system that's able to build docs and toplevels that can coexist in the one HTML tree. I've not got a full integrated demo yet, but you can see the test cases for this &lt;a href=&quot;https://jon.ludl.am/experiments/day10-jtw/standalone/index.html&quot;&gt;here&lt;/a&gt;. Be sure to take a look at the 'network' tab in the browser dev tools to see what it's doing!&lt;/p&gt; 29 - &lt;h2 id=&quot;scrollycode-experiments&quot;&gt;&lt;a href=&quot;#scrollycode-experiments&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Scrollycode experiments&lt;/h2&gt; 30 - &lt;p&gt;I've long been a fan of &lt;a href=&quot;https://pomb.us/&quot;&gt;Rodrigo Pombo's&lt;/a&gt; work on &amp;quot;building tools for better code reading comprehension&amp;quot;, ever since first seeing his post &amp;quot;&lt;a href=&quot;https://pomb.us/build-your-own-react/&quot;&gt;Build your own React&lt;/a&gt;&amp;quot;. Claude is &lt;i&gt;fantastically good&lt;/i&gt; at doing this sort of thing, so I asked it to go and build me some simple OCaml-focused versions. We came up with 5 variations in the end - and they're all pretty neat! &lt;a href=&quot;https://jon.ludl.am/experiments/scrollycoder/&quot;&gt;take a look!&lt;/a&gt;. 
The best part of this was that it took me less than half-an-hour to get Claude to do all this.&lt;/p&gt; 31 - &lt;h2 id=&quot;dune-pr&quot;&gt;&lt;a href=&quot;#dune-pr&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Dune PR&lt;/h2&gt; 32 - &lt;p&gt;I attended the bi-weekly dune dev meeting to talk about the first part of the dune PR - the bit that Paul Elliot did almost a year ago.&lt;/p&gt; 33 - &lt;h2 id=&quot;coming-week&quot;&gt;&lt;a href=&quot;#coming-week&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Coming week&lt;/h2&gt; 34 - &lt;p&gt;So the clock is ticking on writing the exam questions for FoCS, so I'll need to be spending time this week on that.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html</id><title type="text">Weeknotes for week 6</title><updated>2026-02-09T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I've been battling the seasonal illnesses this week, so I've combined two weeknotes into one. 
Fortunately the 'flu doesn't hold Claude back!</summary><published>2026-01-30T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2026/01/weeknotes-2026-04-05.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;weeknotes-for-weeks-4-5&quot;&gt;&lt;a href=&quot;#weeknotes-for-weeks-4-5&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Weeknotes for weeks 4-5&lt;/h1&gt; 35 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2026-01-30&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 36 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;odoc.extension_api&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 37 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;packages&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;packages&lt;/span&gt; &lt;p&gt;odoc-admonition-extension odoc-rfc-extension odoc-msc-extension odoc-mermaid-extension odoc-dot-extension&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 38 - &lt;p&gt;I've been battling the seasonal illnesses this week, so I've combined two weeknotes into one. Fortunately the 'flu doesn't hold Claude back!&lt;/p&gt; 39 - &lt;p&gt;Probably the most interesting part of this is the &lt;a href=&quot;#retrospective&quot; title=&quot;retrospective&quot;&gt;Retrospective&lt;/a&gt;, so make sure to read that bit.&lt;/p&gt; 40 - &lt;h2 id=&quot;the-last-two-weeks&quot;&gt;&lt;a href=&quot;#the-last-two-weeks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The Last Two Weeks&lt;/h2&gt; 41 - &lt;p&gt;As is becoming more and more apparent, the &lt;em&gt;breadth&lt;/em&gt; of what I'm working on is ever expanding, powered by agentic AI. It's become so much more (cognitively) cheaper to have an idea and set an agent off investigating it that I've been finding that I'm working in parallel on far more things in a single week than I would have even six months ago. 
Here are some of the bigger headings though.&lt;/p&gt; 42 - &lt;h3 id=&quot;monorepo-excitement&quot;&gt;&lt;a href=&quot;#monorepo-excitement&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Monorepo excitement&lt;/h3&gt; 43 - &lt;p&gt;We're currently experimenting with a new tool - &lt;a href=&quot;https://tangled.org/anil.recoil.org/monopam&quot;&gt;monopam&lt;/a&gt; to help develop across multiple OCaml libraries by using git subtrees to create a monorepo with all of the packages in. We then extract patches to the individual repos to push upstream. I've been moving my development workflow from in-vscode-claude with careful permissions checking to running claude with `--dangerously-skip-permissions` in a container with the monorepo checked out. This has been a bit of a bumpy ride, with the tool evolving daily, but I'm very much seeing the benefits of letting Claude just get on with things, given a strict enough early design and testing strategy, and using Anil's method of creating the interfaces first.&lt;/p&gt; 44 - &lt;h4 id=&quot;odoc&quot;&gt;&lt;a href=&quot;#odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc&lt;/h4&gt; 45 - &lt;p&gt;I also did quite a bit related to odoc these 2 weeks, split over improving functionality and bugfixing.&lt;/p&gt; 46 - &lt;h5 id=&quot;plugins&quot;&gt;&lt;a href=&quot;#plugins&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Plugins&lt;/h5&gt; 47 - &lt;p&gt;Getting Claude to run with all of the monorepo libraries implicitly requires that they're well documented, as looking at the source to figure out how to use them exhausts the context window pretty rapidly. 
Odoc's main focus has been on getting the expansions and referencing correct, and while we've made progress on the actual content markup, introducing &lt;a href=&quot;https://ocaml.github.io/odoc/odoc/odoc_for_authors.html#media&quot;&gt;media tags&lt;/a&gt; for example, there's still a good distance to go.&lt;/p&gt; 48 - &lt;p&gt;Using the plugins mechanism I &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2026/01/weeknotes-2026-03&quot;&gt;wrote about last week&lt;/span&gt;, I've made a plugin interface for odoc and implemented a few plugins. Initially I was just going to support 'custom tags' but it occurred to me that rendering code blocks could also be done in this way. So I've made a few. Two custom tag plugins:&lt;/p&gt; 49 - &lt;ul&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-admonition-extension/index&quot;&gt;odoc-admonition-extension&lt;/span&gt; - styled callout blocks for notes, warnings, tips. Note that we are intending to make this more first-class - there's a &lt;a href=&quot;https://hackmd.io/ETSOAmetTI-E3vrDk3Bfrw&quot;&gt;design out there&lt;/a&gt;. 
This was just a convenient way to test the feature!&lt;/li&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-rfc-extension/index&quot;&gt;odoc-rfc-extension&lt;/span&gt; - links to IETF RFC documents&lt;/li&gt;&lt;/ul&gt; 50 - &lt;p&gt;and 3 code block plugins:&lt;/p&gt; 51 - &lt;ul&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-msc-extension/index&quot;&gt;odoc-msc-extension&lt;/span&gt; - Message Sequence Charts&lt;/li&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-mermaid-extension/index&quot;&gt;odoc-mermaid-extension&lt;/span&gt; - Mermaid diagrams (flowcharts, sequence diagrams, etc.)&lt;/li&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-dot-extension/index&quot;&gt;odoc-dot-extension&lt;/span&gt; - Graphviz/DOT diagrams&lt;/li&gt;&lt;/ul&gt; 52 - &lt;p&gt;The module signatures relevant to the plugins are documented in &lt;code&gt;/odoc.extension_api/Odoc_extension_api&lt;/code&gt; and the plugins each have to implement an interface described in &lt;code&gt;Odoc_extension_api.Code_Block_Extension&lt;/code&gt; or &lt;code&gt;Odoc_extension_api.Extension&lt;/code&gt; for custom tags.&lt;/p&gt; 53 - &lt;h5 id=&quot;bugfixing&quot;&gt;&lt;a href=&quot;#bugfixing&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Bugfixing&lt;/h5&gt; 54 - &lt;p&gt;&lt;a href=&quot;https://github.com/lukemaurer&quot;&gt;Luke Maurer&lt;/a&gt; at Jane Street pointed out that they're still suffering from yet another repro of &lt;a href=&quot;https://github.com/ocaml/odoc/issues/930&quot;&gt;issue 930&lt;/a&gt; at Jane Street. 
I'd worked on this &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/09/odoc-bugs&quot;&gt;back in September&lt;/span&gt; but turns out I hadn't actually made a PR, so I tidied up the branch and &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1400&quot;&gt;made a PR&lt;/a&gt;.&lt;/p&gt; 55 - &lt;h3 id=&quot;docs-ci&quot;&gt;&lt;a href=&quot;#docs-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Docs CI&lt;/h3&gt; 56 - &lt;p&gt;Docs CI has been fixed and is even now rebuilding all of the docs for ocaml.org. I've added in the &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci/commit/c6231fa383820b4c700aaa1e72107536b1872112&quot;&gt;handling of `post &amp;amp; with-doc`&lt;/a&gt; in place of x-extra-doc-deps, so we should be able to use either mechanism now. The idea is to deprecate x-extra-doc-deps soon though. Somehow despite an explicit button to press to update the epoch symlinks, it got updated anyway and broke most of the docs on ocaml.org. Fortunately &lt;a href=&quot;https://discuss.ocaml.org/t/is-caqti-doc-missing/17741/5&quot;&gt;someone noticed&lt;/a&gt; and posted on discuss and so I switched it back.&lt;/p&gt; 57 - &lt;p&gt;Unfortunately, it seemed to be taking a long time to build the docs - at time of writing it's now Friday, and the CI jobs have been running since Tuesday. In that time, it's only managed to build about 6500 packages, a long way short of the 16,000 or so that I expect a full build will produce. Looking through the logs, it seems that some change to opam is causing it to sometime rebuild the entire opam universe when it should only be building 1 package. For example, in a job that should be building just `tezos-protocol-004-Pt24m4xi`, it installs all of the prebuilt dependencies, then runs `opamh` to try to convince opam that everything is all set up to just run the build step for the package we want. 
Unfortunately the logs show the following:&lt;/p&gt; 58 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;The following actions will be performed: 89 + end</code></pre></div> 90 + <p>The problem here is that both definitions of `type t` have the same identifier, which causes problems when we move to and from the 'Component' types. The solution was to introduce a 'dummy' parent for the type defined within the include. This works because we never actually render the body of the include into HTML - we render the <i>expansion</i>, which <i>doesn't</i> have <code>type t</code> in it, as it has been substituted out.</p> 91 + <p>The fix I made last week fixed the <span class="xref-unresolved" title="Odoc_loader">loader</span>, which reads in the <code>cmt</code>/<code>cmti</code> files produced by the compiler. There's one more place where we create these in the code - when we translate from the <span class="xref-unresolved" title="Odoc_xref2.Component">Component</span> types back into <span class="xref-unresolved" title="Odoc_model.Lang">Lang</span> types. I was a little curious about whether it was possible to make this happen, so I thought I'd ask Claude to see if it could come up with a scenario where we'd end up in this situation. This was a complete failure, which was a real disappointment to me, as doing this sort of thing is a quite tedious and annoying part of working on odoc.</p> 92 + <p>Meanwhile, I was running odoc on Anil's <a href="https://github.com/avsm/oxmono">oxmono</a> repo, which was using <a href="https://github.com/art-w">art-w</a>'s <a href="https://github.com/ocaml/odoc/pull/1399">PR to upstream oxcaml support</a>. It was failing with an exception that was very familiar, so I pulled in the fix I'd been working on, and that enabled it to get much further. However, it did subsequently fail with another slightly different exception. 
I had my suspicions at this point that it might be due to the other place, but I thought this again was a good opportunity to test Claude's debugging skills. However, this again was a complete failure. I spent quite a long time prodding it - at least 4 separate sessions - and it really didn't get anywhere close to a solution, despite knowing precisely which commit had fixed the first problem. Two of the four times it ended up telling me that the oxcaml compiler was broken and suggesting that we create an issue!</p> 93 + <p>I'm only very mildly disappointed in this - it's all quite subtle, and something I still end up scratching my head over sometimes, but it would have been wonderful to be able to offload this sort of work!</p> 94 + <p>In any case, the docs now all build on <a href="https://github.com/jonludlam/oxmono/commit/2a53f6857d5b8849a73f5bb3e5244b9ac0f36708">my fork of oxmono</a>.</p> 95 + <h2 id="docs-ci"><a href="#docs-ci" class="anchor"></a>Docs CI</h2> 96 + <p>The fix I deployed last week for ocaml-docs-ci was taking forever to complete, so I ended up spending some time investigating this. The problem was happening during the 'prep' phase, which is the first part of the pipeline where we simply build the package to be documented. This is supposed to work by building a graph of all inter-package dependencies across all of the solved packages, so we maximise sharing of built artefacts. Each 'prep' job builds precisely one package by copying in the dependencies from previous prep jobs, then running <a href="https://github.com/jonludlam/opamh">opamh</a> to fix up the metadata so that opam believes it has installed everything itself, then running opam to build the one package required. 
It was this last step that was going wrong, where it would decide that there had been upstream changes to the compiler itself, and rebuild <i>everything</i>, so rather than a prep job taking a few seconds, it would take a few minutes.</p> 97 + <p>I was totally unable to repro this locally - everything built very quickly and just as it should have done. After much head-scratching I finally realised that the problem was somewhere in the caching. I think what's going on is that we dynamically build an opam repository to make the `opam install` command faster, and that repo contains only the packages that are required to build whatever it is we're building. Those opam files are cached by the docs CI server and passed to the build script as a base64-encoded gzipped tarball inline in the obuilder file (!). This should all be totally consistent as we're also caching all the builds - except for the compiler itself, which comes from the base docker image. This, of course, is the problem. The ocaml compiler opam files had been updated, and then when we reconstructed the opam repo with our cached opam files, opam noticed they had changed (gone <i>backwards</i> in time!) and decided it needed to rebuild the compiler, and therefore <i>everything</i> else. Clearing out the opam-files cache and restarting the builds fixed this entirely, and the full rebuild job completed after about 2 days. I flipped the switch on Saturday night and the docs are now fully up to date again. Phew!</p> 98 + <h2 id="day10-work"><a href="#day10-work" class="anchor"></a>day10 work</h2> 99 + <p>This was a fun week of large-scale building! I integrated day10 and odoc_driver and js_top_worker and x-ocaml and have now successfully got a docs-ci-like system that's able to build docs and toplevels that can coexist in the one HTML tree. I've not got a full integrated demo yet, but you can see the test cases for this <a href="https://jon.ludl.am/experiments/day10-jtw/standalone/index.html">here</a>. 
Be sure to take a look at the 'network' tab in the browser dev tools to see what it's doing!</p> 100 + <h2 id="scrollycode-experiments"><a href="#scrollycode-experiments" class="anchor"></a>Scrollycode experiments</h2> 101 + <p>I've long been a fan of <a href="https://pomb.us/">Rodrigo Pombo's</a> work on &quot;building tools for better code reading comprehension&quot;, ever since first seeing his post &quot;<a href="https://pomb.us/build-your-own-react/">Build your own React</a>&quot;. Claude is <i>fantastically good</i> at doing this sort of thing, so I asked it to go and build me some simple OCaml-focused versions. We came up with 5 variations in the end - and they're all pretty neat! <a href="https://jon.ludl.am/experiments/scrollycoder/">take a look!</a>. The best part of this was that it took me less than half-an-hour to get Claude to do all this.</p> 102 + <h2 id="dune-pr"><a href="#dune-pr" class="anchor"></a>Dune PR</h2> 103 + <p>I attended the bi-weekly dune dev meeting to talk about the first part of the dune PR - the bit that Paul Elliot did almost a year ago.</p> 104 + <h2 id="coming-week"><a href="#coming-week" class="anchor"></a>Coming week</h2> 105 + <p>So the clock is ticking on writing the exam questions for FoCS, so I'll need to be spending time this week on that.</p>]]></content> 106 + </entry> 107 + <entry> 108 + <id>https://jon.recoil.org/blog/2026/01/weeknotes-2026-04-05.html</id> 109 + <title>Weeknotes for weeks 4-5</title> 110 + <published>2026-01-30T00:00:00Z</published> 111 + <updated>2026-01-30T00:00:00Z</updated> 112 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/01/weeknotes-2026-04-05.html"/> 113 + <summary>I've been battling the seasonal illnesses this week, so I've combined two weeknotes into one. 
Fortunately the 'flu doesn't hold Claude back!</summary> 114 + <content type="html"><![CDATA[<h1 id="weeknotes-for-weeks-4-5"><a href="#weeknotes-for-weeks-4-5" class="anchor"></a>Weeknotes for weeks 4-5</h1> 115 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-01-30</p></li></ul> 116 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>odoc.extension_api</p></li></ul> 117 + <ul class="at-tags"><li class="packages"><span class="at-tag">packages</span> <p>odoc-admonition-extension odoc-rfc-extension odoc-msc-extension odoc-mermaid-extension odoc-dot-extension</p></li></ul> 118 + <p>I've been battling the seasonal illnesses this week, so I've combined two weeknotes into one. Fortunately the 'flu doesn't hold Claude back!</p> 119 + <p>Probably the most interesting part of this is the <a href="#retrospective" title="retrospective">Retrospective</a>, so make sure to read that bit.</p> 120 + <h2 id="the-last-two-weeks"><a href="#the-last-two-weeks" class="anchor"></a>The Last Two Weeks</h2> 121 + <p>As is becoming more and more apparent, the <em>breadth</em> of what I'm working on is ever expanding, powered by agentic AI. It's become so much (cognitively) cheaper to have an idea and set an agent off investigating it that I've been finding that I'm working in parallel on far more things in a single week than I would have even six months ago. Here are some of the bigger headings though.</p> 122 + <h3 id="monorepo-excitement"><a href="#monorepo-excitement" class="anchor"></a>Monorepo excitement</h3> 123 + <p>We're currently experimenting with a new tool - <a href="https://tangled.org/anil.recoil.org/monopam">monopam</a> to help develop across multiple OCaml libraries by using git subtrees to create a monorepo with all of the packages in. We then extract patches to the individual repos to push upstream. 
I've been moving my development workflow from in-vscode-claude with careful permissions checking to running claude with `--dangerously-skip-permissions` in a container with the monorepo checked out. This has been a bit of a bumpy ride, with the tool evolving daily, but I'm very much seeing the benefits of letting Claude just get on with things, given a strict enough early design and testing strategy, and using Anil's method of creating the interfaces first.</p> 124 + <h4 id="odoc"><a href="#odoc" class="anchor"></a>Odoc</h4> 125 + <p>I also did quite a bit related to odoc these 2 weeks, split over improving functionality and bugfixing.</p> 126 + <h5 id="plugins"><a href="#plugins" class="anchor"></a>Plugins</h5> 127 + <p>Getting Claude to run with all of the monorepo libraries implicitly requires that they're well documented, as looking at the source to figure out how to use them exhausts the context window pretty rapidly. Odoc's main focus has been on getting the expansions and referencing correct, and while we've made progress on the actual content markup, introducing <a href="https://ocaml.github.io/odoc/odoc/odoc_for_authors.html#media">media tags</a> for example, there's still a good distance to go.</p> 128 + <p>Using the plugins mechanism I <a href="weeknotes-2026-03.html" title="weeknotes-2026-03">wrote about last week</a>, I've made a plugin interface for odoc and implemented a few plugins. Initially I was just going to support 'custom tags' but it occurred to me that rendering code blocks could also be done in this way. So I've made a few. Two custom tag plugins:</p> 129 + <ul><li><span class="xref-unresolved" title="/odoc-admonition-extension/index">odoc-admonition-extension</span> - styled callout blocks for notes, warnings, tips. Note that we are intending to make this more first-class - there's a <a href="https://hackmd.io/ETSOAmetTI-E3vrDk3Bfrw">design out there</a>. 
This was just a convenient way to test the feature!</li><li><span class="xref-unresolved" title="/odoc-rfc-extension/index">odoc-rfc-extension</span> - links to IETF RFC documents</li></ul> 130 + <p>and 3 code block plugins:</p> 131 + <ul><li><span class="xref-unresolved" title="/odoc-msc-extension/index">odoc-msc-extension</span> - Message Sequence Charts</li><li><span class="xref-unresolved" title="/odoc-mermaid-extension/index">odoc-mermaid-extension</span> - Mermaid diagrams (flowcharts, sequence diagrams, etc.)</li><li><span class="xref-unresolved" title="/odoc-dot-extension/index">odoc-dot-extension</span> - Graphviz/DOT diagrams</li></ul> 132 + <p>The module signatures relevant to the plugins are documented in <code>/odoc.extension_api/Odoc_extension_api</code> and the plugins each have to implement an interface described in <code>Odoc_extension_api.Code_Block_Extension</code> or <code>Odoc_extension_api.Extension</code> for custom tags.</p> 133 + <h5 id="bugfixing"><a href="#bugfixing" class="anchor"></a>Bugfixing</h5> 134 + <p><a href="https://github.com/lukemaurer">Luke Maurer</a> at Jane Street pointed out that they're still suffering from yet another repro of <a href="https://github.com/ocaml/odoc/issues/930">issue 930</a> at Jane Street. I'd worked on this <a href="../../2025/09/odoc-bugs.html" title="odoc-bugs">back in September</a> but turns out I hadn't actually made a PR, so I tidied up the branch and <a href="https://github.com/ocaml/odoc/pull/1400">made a PR</a>.</p> 135 + <h3 id="docs-ci"><a href="#docs-ci" class="anchor"></a>Docs CI</h3> 136 + <p>Docs CI has been fixed and is even now rebuilding all of the docs for ocaml.org. I've added in the <a href="https://github.com/ocurrent/ocaml-docs-ci/commit/c6231fa383820b4c700aaa1e72107536b1872112">handling of `post &amp; with-doc`</a> in place of x-extra-doc-deps, so we should be able to use either mechanism now. The idea is to deprecate x-extra-doc-deps soon though. 
Somehow despite an explicit button to press to update the epoch symlinks, it got updated anyway and broke most of the docs on ocaml.org. Fortunately <a href="https://discuss.ocaml.org/t/is-caqti-doc-missing/17741/5">someone noticed</a> and posted on discuss and so I switched it back.</p> 137 + <p>Unfortunately, it seemed to be taking a long time to build the docs - at time of writing it's now Friday, and the CI jobs have been running since Tuesday. In that time, it's only managed to build about 6500 packages, a long way short of the 16,000 or so that I expect a full build will produce. Looking through the logs, it seems that some change to opam is causing it to sometime rebuild the entire opam universe when it should only be building 1 package. For example, in a job that should be building just `tezos-protocol-004-Pt24m4xi`, it installs all of the prebuilt dependencies, then runs `opamh` to try to convince opam that everything is all set up to just run the build step for the package we want. Unfortunately the logs show the following:</p> 138 + <div><pre class="language-ocaml"><code>The following actions will be performed: 59 139 === recompile 178 packages 60 140 - recompile aches 1.1.0 [uses ocaml] 61 141 - recompile aches-lwt 1.1.0 [uses ocaml] ··· 63 143 - recompile mtime 2.1.0 [uses ocaml] 64 144 - recompile ocaml 4.14.2 [upstream or system changes] 65 145 - recompile ocaml-compiler-libs v0.12.4 [uses ocaml] 66 - ...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 67 - &lt;p&gt;where it seems opam has decided that something has changed enough for it to want to recompile the `ocaml` package, and therefore &lt;i&gt;everything&lt;/i&gt; in the entire opam switch! 
So this job took 12 minutes instead of 21 seconds, which was the time required to finally build the `tezos-protocol` package.&lt;/p&gt; 68 - &lt;h3 id=&quot;day10-and-docs&quot;&gt;&lt;a href=&quot;#day10-and-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Day10 and docs&lt;/h3&gt; 69 - &lt;p&gt;In closely related news, &lt;a href=&quot;https://tunbury.org/&quot;&gt;mtelver's&lt;/a&gt; day10 project looked precisely the right shape for building docs - in fact it shares its architecture and some components with the docs CI. So I asked Claude to take a look and see what it would take, and discovered that it doesn't take very much! We have a Really Big Machine here at the CL that was temporarily underused; and by Really Big I mean 768 cores and 3TB of RAM. So, how long could building all of the docs for all of the packages possibly take? Well, it takes 5 hours 40 mins. And I was only using roughly a third of the machine. Nice!&lt;/p&gt; 70 - &lt;p&gt;So should I push on with fixing ocaml-docs-ci and figure out why it's rebuilding everything all the time? Or should I forge ahead with day10 and turn it into a proper CI system as opposed to a slightly flakey bespoke thing I have to handhold through a build? This is next week's problem.&lt;/p&gt; 71 - &lt;h3 id=&quot;js-toplevels&quot;&gt;&lt;a href=&quot;#js-toplevels&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;JS toplevels&lt;/h3&gt; 72 - &lt;p&gt;Something I keep coming back to is javascript toplevels. I'd really like to be able to host JS toplevels on ocaml.org for each different version of each different package. This is something I've worked on on-and-off for a long time now, and several fixes to help have been merged to various projects along the way. The tricky thing is to not put a massive load onto ocaml.org with this, so we need to be efficient. 
That means firstly having a single toplevel js file with all of the logic in but none of the libraries, and then dynamically loading libraries as we need them. Also we can save some bandwidth by not immediately sending all of the cmi files, as these can be faulted in as necessary too. So once again I've got Claude on the task, and things are honestly looking pretty hopeful now. I've got 2 demos:&lt;/p&gt; 73 - &lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://jon.ludl.am/experiments/findlibish/&quot;&gt;Dynamic library loading&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://jon.ludl.am/experiments/multi-universe-demo/&quot;&gt;Multi-version support&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt; 74 - &lt;p&gt;In both cases, make sure you take a look at the network tab to see it dynamically loading only what it needs.&lt;/p&gt; 75 - &lt;h2 id=&quot;retrospective&quot;&gt;&lt;a href=&quot;#retrospective&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Retrospective&lt;/h2&gt; 76 - &lt;h3 id=&quot;autonomous-claude&quot;&gt;&lt;a href=&quot;#autonomous-claude&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Autonomous Claude&lt;/h3&gt; 77 - &lt;p&gt;The power of sending Claude off to do some work can be immense. However, it does mean investing time up front telling it precisely what problem you're trying to solve, what approach to take, finer details on how you want it done, and how you can tell if it's working when it finishes. A 'failure mode' I've been experiencing is when I end up in a long, drawn out real time interaction, especially if that's happening with 2 projects simultaneously - and by 'failure' I really mean just 'slow'. Ideally what would be going on is for all of my agents to be getting on with whatever task they've been allocated without bothering me for more details. 
For Claude to have to ask me a question has much more latency involved than it just getting on with things, especially if I don't notice it immediately.&lt;/p&gt; 78 - &lt;h3 id=&quot;when-to-stop&quot;&gt;&lt;a href=&quot;#when-to-stop&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;When to Stop&lt;/h3&gt; 79 - &lt;p&gt;The 'finishing criteria' are important - many times this week I've had Claude tell me it's finished something, having verified that it's passing all the tests, only for me to take a look to find that it's very obviously broken. As quite a few things recently have involved the web, I've put Playwright into all of my devcontainers, and told Claude to use it to verify things are working. This has been working pretty well, so I'll be adding it to my prompts. It's not too dissimilar to what we used to call 'pre-flight checks' back in the Citrix days.&lt;/p&gt; 80 - &lt;h3 id=&quot;containers-vs-accounts&quot;&gt;&lt;a href=&quot;#containers-vs-accounts&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Containers vs accounts&lt;/h3&gt; 81 - &lt;p&gt;I've been running everything with `--dangerously-skip-permissions` in containers, and while the outcome is amazing, the containers bit has been a bit of a headache. Next week I'll be trialling the idea of just giving the agents their own account (non-admin!) on my servers, their own github account, tangled account and so on, and just treating them more like I would if I had a real colleague. It's always slightly alarming to see my own name on the output of the bots, assigning me (or sometimes someone else (!!)) copyright over code I've never seen. 
This is, of course, a whole other pandora's box that I really don't want to open right now - but I think the point is that I'll feel a lot more comfortable if the commits are all by `Jon's Agent &amp;lt;jon+claude@recoil.org&amp;gt;` rather than by me!&lt;/p&gt; 82 - &lt;h3 id=&quot;deciding-next-steps&quot;&gt;&lt;a href=&quot;#deciding-next-steps&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Deciding next steps&lt;/h3&gt; 83 - &lt;p&gt;The question of whether I should fix up ocaml-docs-ci or improve the day10 solution requires a bit of thought. In fact, it requires a bit of a gap analysis between the two. This isn't something I've asked Claude to do before, so I'll try that and see how it turns out. I'll be asking it to be &amp;quot;scientific&amp;quot; in its approach, coming up with hypotheses and verifying them - for which I think I'll need to give it a platform on which it can perform experiments. This is a bit trickier with ocaml-docs-ci than day10 as day10 runs entirely on any given linux computer, whereas ocaml-docs-ci needs ocurrent workers and a routable ssh server. I'll report on the outcome of this next week!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2026/01/weeknotes-2026-04-05.html</id><title type="text">Weeknotes for weeks 4-5</title><updated>2026-01-30T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">First week back of 2026! 
Let's write some terse weeknotes.</summary><published>2026-01-19T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2026/01/weeknotes-2026-03.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;weeknotes-for-week-3&quot;&gt;&lt;a href=&quot;#weeknotes-for-week-3&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Weeknotes for week 3&lt;/h1&gt; 84 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2026-01-19&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 85 - &lt;p&gt;First week back of 2026! Let's write some terse weeknotes.&lt;/p&gt; 86 - &lt;h2 id=&quot;projects&quot;&gt;&lt;a href=&quot;#projects&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Projects&lt;/h2&gt; 87 - &lt;h3 id=&quot;dune-odoc-rules&quot;&gt;&lt;a href=&quot;#dune-odoc-rules&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Dune odoc rules&lt;/h3&gt; 88 - &lt;p&gt;Last thing I did last year was to push the new rules for odoc 3. This week, Anil handed me an excellent opportunity to test the rules on the monorepo containing his &lt;a href=&quot;https://anil.recoil.org/notes/aoah-2025&quot;&gt;AOAH&lt;/a&gt; projects. Claude tends to actually write ocamldoc-formatted comments, so this is really useful to test the rules. I've &lt;a href=&quot;https://github.com/jonludlam/dune/tree/odoc-v3-rules-3.21&quot;&gt;rebased the commits&lt;/a&gt; on the just-released Dune 3.21 and we've been trying them out. There were a few things to fix:&lt;/p&gt; 89 - &lt;ul&gt;&lt;li&gt;More careful &lt;a href=&quot;https://github.com/jonludlam/dune/commit/25158eabf0c3cac2826e16ce590b4bd4d7c09818&quot;&gt;dependency tracking&lt;/a&gt; during the compile phase - this particularly affected the &lt;code&gt;@doc&lt;/code&gt; target, which was pulling in unnecessary dependencies. 
Most of these dependencies were compiling just fine, but one - Angstrom - is slightly odd in that the opam install of Angstrom installs a META file that references libraries that aren't in the dependencies of its opam package. This is a backward-compatibility hack that was implemented when the Angstrom package was split into several in order to manage the dependencies better.&lt;/li&gt;&lt;li&gt;A similar issue happens with eio, where the documentation of the package depends upon &lt;code&gt;bigstring&lt;/code&gt;, which isn't in eio's dependencies. This is entirely intentional - the extra doc dependency is stated in the opam file with a &lt;code&gt;x-extra-doc-deps&lt;/code&gt; field. However, &lt;code&gt;opam install&lt;/code&gt; totally ignores this field (quite reasonably), and so a simple install gives you an opam repo whose docs can't be built. Once again, this broke &lt;code&gt;dune build @doc&lt;/code&gt; unnecessarily, but the fix was &lt;a href=&quot;https://github.com/jonludlam/dune/commit/2afe046cf4290d7a83b5f2c5646e3391ca94b630&quot;&gt;relatively simple&lt;/a&gt;. The &lt;i&gt;real&lt;/i&gt; fix here is to not use &lt;code&gt;x-extra-doc-deps&lt;/code&gt;, but switch to using a &lt;i&gt;real&lt;/i&gt; dependency, but marked with &lt;code&gt;with-doc&lt;/code&gt; and &lt;code&gt;post&lt;/code&gt; if it would otherwise introduce a circular dependency. That way, an &lt;code&gt;opam install --with-doc&lt;/code&gt; &lt;i&gt;would&lt;/i&gt; install the extra dependency.&lt;/li&gt;&lt;li&gt;Over the Christmas break, &lt;a href=&quot;https://discuss.ocaml.org/u/tbrk&quot;&gt;tbrk&lt;/a&gt; posted &lt;a href=&quot;https://discuss.ocaml.org/t/odoc-index-for-multiple-packages-inter-package-links-and-local-global-sidebar/17652&quot;&gt;on discuss&lt;/a&gt; a question about building docs, for which my dune branch was a partial answer. One feature he was requesting though was the ability to use a custom top-level index. 
It's a useful feature that's implemented in &lt;code&gt;odoc_driver&lt;/code&gt; so I've &lt;a href=&quot;https://github.com/jonludlam/dune/commit/efecdee1b36b7e47906e7c64b7496a1fc7954a2d&quot;&gt;added it&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;More sensible &lt;a href=&quot;https://github.com/jonludlam/dune/commit/039eb3d2a3e9d28f8b195905f43839daf5ce8c21&quot;&gt;default link scope&lt;/a&gt;. By default, documentation references in the &lt;code&gt;mli&lt;/code&gt; files of a library can link to any other library in the package. However, by default it wasn't possible to link to the dependencies of another library, unless it happened to be a dependency of your own library. Similarly, the package-wide mld files could only reference the modules in the package's libraries, not to the dependencies. This seems overly cautious, as we can be sure that if we've managed to build the libraries then their dependencies are installed, and if there are any module name conflicts, we can resolve them via the &lt;code&gt;/&amp;lt;lib&amp;gt;/Module&lt;/code&gt; syntax.&lt;/li&gt;&lt;li&gt;Lastly, implementations of virtual libraries &lt;a href=&quot;https://github.com/jonludlam/dune/commit/12f9ecbd4888444c2d359049a914ffb4827912f9&quot;&gt;need to be skipped&lt;/a&gt; as they've all got the same docs (as they share mli files), and the rules as they were causing Dune to crash with a &amp;quot;Conflicting implementations&amp;quot; error.&lt;/li&gt;&lt;/ul&gt; 90 - &lt;p&gt;I've also rebased the PR onto latest &lt;code&gt;main&lt;/code&gt;, but I've not yet put these patches there, which I'll need to do for the PR to be mergable. 
For now, the 3.21 branch is successfully building the docs for the monorepo.&lt;/p&gt; 91 - &lt;h3 id=&quot;ocaml-docs-ci&quot;&gt;&lt;a href=&quot;#ocaml-docs-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OCaml Docs CI&lt;/h3&gt; 92 - &lt;p&gt;&lt;a href=&quot;https://github.com/jmid&quot;&gt;Jan Midtgaard&lt;/a&gt; noticed over xmas that the Docs CI &lt;a href=&quot;https://github.com/ocaml/ocaml.org/issues/3437&quot;&gt;was broken&lt;/a&gt; and submitted &lt;a href=&quot;https://github.com/jonludlam/opamh/pull/1&quot;&gt;a fix&lt;/a&gt;. I've therefore been poking &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt; to get the fix incorporated and into production. I almost immediately hit the issue that &lt;code&gt;odoc_driver&lt;/code&gt; now breaks for the exact same reason. I couldn't quite understand how &lt;code&gt;opam-format&lt;/code&gt; &lt;a href=&quot;https://github.com/ocaml/opam-repository/pull/28978&quot;&gt;had been merged&lt;/a&gt; to &lt;code&gt;opam-repository&lt;/code&gt; without someone noticing that it had broken &lt;code&gt;odoc_driver&lt;/code&gt;, but it turned out that it &lt;i&gt;had&lt;/i&gt; been noticed, but on a &lt;a href=&quot;https://github.com/ocaml/opam-repository/pull/28877&quot;&gt;beta release&lt;/a&gt;. The fix to docs ci was to install &lt;code&gt;odoc_driver&lt;/code&gt; from opam rather than &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci/blob/81ca17c7b7a2f47ca571b1d6bc866720cebef136/src/lib/config.ml#L226&quot;&gt;pinning directly&lt;/a&gt; to a github hash, especially if that hash happens to be the hash of the released version!&lt;/p&gt; 93 - &lt;p&gt;While I'm working on docs CI, I thought it's probably also a good idea to move over to the &lt;code&gt;with-doc &amp;amp; post&lt;/code&gt; suggestion from above, so we're ready for when packages start to use that. 
This is now being tested, and hopefully we'll have the CI back up and running early next week.&lt;/p&gt; 94 - &lt;h3 id=&quot;better-styling-for-odoc&quot;&gt;&lt;a href=&quot;#better-styling-for-odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Better styling for odoc&lt;/h3&gt; 95 - &lt;p&gt;I've done very little to the styling of odoc since I took maintainership way back in 2019 or so. It's a bit dated, and there are some annoying usability issues, so I thought it's a good opportunity to vibe-code a nice new frontend for it. Rather than hack directly on the HTML generator of odoc, this seemed to be a good opportunity to test the JSON output from the new Dune rules, so I asked Claude to make me a static site generator that read in the JSON files and spat out some nicely styled HTML. This worked like a charm, and the results are &lt;a href=&quot;https://jon.ludl.am/experiments/vibe-coded-odoc-frontend/&quot;&gt;here&lt;/a&gt;. Next steps are to see what it would take to get the native odoc output looking more like that.&lt;/p&gt; 96 - &lt;h3 id=&quot;custom-tags-in-odoc&quot;&gt;&lt;a href=&quot;#custom-tags-in-odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Custom tags in odoc&lt;/h3&gt; 97 - &lt;p&gt;One of the themes of Anil's &lt;a href=&quot;&quot;&gt;AOAH&lt;/a&gt; coding spree was that many libraries were implementations of RFCs. In many places in the docs, there are links to relevant sections of the RFCs. It'd be nice in future to be able to validate that we've covered all of the parts of the RFCs, so making the links a little more parsable seemed like a good idea. 
In fact, it seemed that this might be a perfect use for custom tags - a feature that was present in ocamldoc that odoc has yet to implement.&lt;/p&gt; 98 - &lt;p&gt;&lt;a href=&quot;https://github.com/art-w&quot;&gt;Arthur Wendling&lt;/a&gt; then pointed me at dune's &lt;a href=&quot;https://dune.readthedocs.io/en/stable/reference/dune/plugin.html&quot;&gt;plugin system&lt;/a&gt;, which seemed just the ticket as a way to implement this. It's really nice, taking all of the hard work out of creating OCaml plugins, so I've now got &lt;a href=&quot;https://github.com/jonludlam/odoc/tree/extension-plugins&quot;&gt;an extension-plugins branch&lt;/a&gt; that implements this. It allows you to add support to odoc for tags like &lt;code&gt;@rfc&lt;/code&gt; which generate custom HTML, markdown or any other backend, can include links in their bodies, and can add custom headers to the web page, and custom files to be output by &lt;code&gt;odoc support-files&lt;/code&gt;. It looks like this should &amp;quot;just work&amp;quot; and no further changes to the dune rules are needed - though I need to actually test this out.&lt;/p&gt; 99 - &lt;h3 id=&quot;day10-and-docs&quot;&gt;&lt;a href=&quot;#day10-and-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Day10 and docs&lt;/h3&gt; 100 - &lt;p&gt;I've &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/09/build-ids-for-day10&quot;&gt;written about&lt;/span&gt; &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark's&lt;/a&gt; day10 project before. It's a tool to very rapidly build odoc packages mainly in order to test that they build correctly. An obvious extension would be to use this to then build the docs for those packages, as the way we do this requires the packages to be built first. This would be a replacement for the Docs CI that I talked about above, though there's considerable work to do before it's fully-featured enough to be a viable alternative. 
It seemed like a good time to experiment with this though, so I set up one of Anil's &lt;a href=&quot;https://anil.recoil.org/notes/ocaml-claude-dev&quot;&gt;devcontainers&lt;/a&gt;, gave Claude some instructions on what to do, took the safety belt off, and let him hack away! Previously most of my interactions with Claude had been via the vscode plugin, so using the terminal interface was a bit of a different experience. I'm fairly certain though that I'm going to switch everything over to working this way, as letting Claude just get on with things without having to OK every step is a far more efficient way to work - especially when you're not that concerned with the actual code being produced. This has been mostly a good experience, though Claude does sometimes go off in rather odd directions. At one point there was a network error with a dependency while trying to build odoc_driver, so it decided that it should have a fallback mechanism that executed odoc directly. I told it &lt;i&gt;NEVER&lt;/i&gt; to replace functionality in odoc_driver, so it rolled this back, but a few hours later in then did exactly the same thing again.&lt;/p&gt; 101 - &lt;h3 id=&quot;misc-other-stuff&quot;&gt;&lt;a href=&quot;#misc-other-stuff&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Misc other stuff&lt;/h3&gt; 102 - &lt;p&gt;A few other things too - &lt;a href=&quot;https://github.com/jonludlam/odoc/commit/59037341cd53d8734a5874f7af2b728b5be70035&quot;&gt;improving the &lt;code&gt;--warn-error&lt;/code&gt; logic in odoc&lt;/a&gt;, and one of its &lt;a href=&quot;https://github.com/jonludlam/odoc/commit/9d18feff5eda543652c6749062750de6e5bb4d6e&quot;&gt;error messages&lt;/a&gt;, improving the build of this website so I can iterate on it more quickly, fixing up some of my self-hosted services like my tangled knot, and other bits and bobs.&lt;/p&gt; 103 - &lt;h2 id=&quot;reflections&quot;&gt;&lt;a href=&quot;#reflections&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Reflections&lt;/h2&gt; 
104 - &lt;p&gt;I think the most important thing this week has been the slightly eye-opening benefits of using Claude outside of the context of VSCode. I suspect I'll be doing much more of my work this way in future. There's also a good chance I'll have to upgrade my subscription from the $100-per-month to the $200 one...&lt;/p&gt; 105 - &lt;h2 id=&quot;next-week&quot;&gt;&lt;a href=&quot;#next-week&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Next week&lt;/h2&gt; 106 - &lt;ul&gt;&lt;li&gt;Start of term tutorial meetings&lt;/li&gt;&lt;li&gt;Sherldoc in monopam-myspace&lt;/li&gt;&lt;li&gt;Get ocaml-docs-ci deployed and working&lt;/li&gt;&lt;li&gt;Update the Dune PR&lt;/li&gt;&lt;li&gt;Integrate the custom-tags and website generator into monopam-myspace&lt;/li&gt;&lt;li&gt;Unleash Claude on my js-top-worker repo&lt;/li&gt;&lt;/ul&gt;</content><id>https://jon.recoil.org/blog/2026/01/weeknotes-2026-03.html</id><title type="text">Weeknotes for week 3</title><updated>2026-01-19T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Back in March of this year we released , a major new version of the OCaml documentation generator. It had a whole load of , many of which came with new demands on the build system driving it. 
We decid...</summary><published>2025-12-18T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/12/claude-and-dune.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;claude-and-dune&quot;&gt;&lt;a href=&quot;#claude-and-dune&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Claude and Dune&lt;/h1&gt; 107 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-12-18&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 108 - &lt;p&gt;Back in March of this year we released &lt;a href=&quot;https://ocaml.github.io/odoc/odoc/index.html&quot;&gt;odoc 3.0.0&lt;/a&gt;, a major new version of the OCaml documentation generator. It had a whole load of &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043&quot;&gt;new features&lt;/a&gt;, many of which came with new demands on the build system driving it. We decided when working on it to build a new driver for odoc so that we could adjust it as we were building the new features, and this driver is now used to &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/odoc-3-live-on-ocaml-org&quot;&gt;build the documentation&lt;/span&gt; that appears on &lt;a href=&quot;https://ocaml.org/p/base/latest/doc/index.html&quot;&gt;ocaml.org&lt;/a&gt;. However, it was always the plan to integrate the new features into &lt;a href=&quot;https://dune.build&quot;&gt;Dune&lt;/a&gt; so that everyone could just run &lt;code&gt;dune build @doc&lt;/code&gt; and be able to use all of the new odoc 3 features.&lt;/p&gt; 109 - &lt;p&gt;So over the last few weeks I have been wrestling with getting Claude to update the odoc rules in Dune to support &lt;i&gt;some&lt;/i&gt; of the new features of odoc v3. What began as a background experiment during a lecture series has turned into a multi-week effort to turn mostly-working code into a clean, reviewable patch. 
AI-developed software is clearly going to be a big part of our future, and Anil is showing us all the way with his &lt;a href=&quot;https://anil.recoil.org/notes/aoah-2025-1&quot;&gt;Advent of Agentic Humps&lt;/a&gt; by building &lt;i&gt;new&lt;/i&gt; software, but upstreaming AI-generated changes to an existing, well established code base &lt;a href=&quot;https://github.com/ocaml/ocaml/pull/14369&quot;&gt;hasn't got off to a good start&lt;/a&gt; in the OCaml community, so I wanted to be extra careful to get this right.&lt;/p&gt; 110 - &lt;h3 id=&quot;claude-as-a-protyping-tool&quot;&gt;&lt;a href=&quot;#claude-as-a-protyping-tool&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Claude as a protyping tool&lt;/h3&gt; 111 - &lt;p&gt;The initial progress was pretty amazing, despite my initial worries that the dune code-base would be &lt;a href=&quot;https://github.com/ocaml/dune/pull/12529&quot;&gt;too large and subtle&lt;/a&gt; for an LLM to be able to make workable changes. In order to get going, first I had it look at several bits of example code:&lt;/p&gt; 112 - &lt;p&gt;1. &lt;a href=&quot;https://github.com/ocaml/dune/blob/3.20.2/src/dune_rules/odoc.ml&quot;&gt;dune_rules/odoc.ml&lt;/a&gt; - this is the current home of the odoc rules in dune. It's local-only, meaning it only builds the docs for the current package in isolation, so no resolution of links to stdlib, other packages or libraries.&lt;/p&gt; 113 - &lt;p&gt;2. &lt;a href=&quot;https://github.com/ocaml/dune/blob/3.20.2/src/dune_rules/odoc_new.ml&quot;&gt;dune_rules/odoc_new.ml&lt;/a&gt; - these are the rules for odoc v2, which allow you to build the docs for your package plus all of the dependencies. I wrote this mostly myself some time ago. It does a pretty poor job of caching, error reporting, and has none of the odoc v3 features like assets, source rendering, hierarchical docs, better errors and so on.&lt;/p&gt; 114 - &lt;p&gt;3. 
&lt;a href=&quot;https://github.com/ocaml/odoc/tree/d8460cdaa2b91a03434a9a045d673703b7fabfb2/src/driver&quot;&gt;odoc_driver&lt;/a&gt; - this is the driver we wrote when building odoc v3. It's fully featured, but not at all incremental, and actually external to the dune codebase. It's the reference implementation that's used to build the docs that appear on &lt;a href=&quot;https://ocaml.org/p/base/latest/doc/index.html&quot;&gt;ocaml.org&lt;/a&gt;.&lt;/p&gt; 115 - &lt;p&gt;Armed with these three code-bases, I asked Claude to synthesise a new incremental version of the odoc rules for dune that has some of the features of &lt;code&gt;odoc_driver&lt;/code&gt;.&lt;/p&gt; 116 - &lt;h3 id=&quot;the-working-prototype&quot;&gt;&lt;a href=&quot;#the-working-prototype&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The working prototype&lt;/h3&gt; 117 - &lt;p&gt;Claude quickly produced a prototype that actually compiled and generated documentation. At that stage I was not interested in the quality of the generated source; I only needed to know whether Claude could navigate Dune's codebase and produce something that &lt;b&gt;works&lt;/b&gt;. I let the prototype evolve incrementally, adding in new features one at a time, for example, fixing the error reporting so that you only get warned about documentation errors that you can actually fix.&lt;/p&gt; 118 - &lt;p&gt;When the lectures finished, it turned out I had something that was pretty useful to me, and had a good chance to be useful to others too. So I opened up my editor and had a look through what had been produced, at this point hoping that a little bit of polishing should be enough - after all, it &lt;i&gt;was&lt;/i&gt; working!&lt;/p&gt; 119 - &lt;p&gt;It was dreadful.&lt;/p&gt; 120 - &lt;p&gt;There were long, rambling functions, code duplication, bad comments, it was unstructured, with repeated-but-slightly-different chunks all over the place. 
It wasn't just bad on one length scale - it was bad from the large-scale organisation of the code down to small scale baffling weirdnesses on one line. The more I looked, the more bonkers it appeared. But it did &lt;i&gt;work&lt;/i&gt;! So I thought I'd get Claude to clean up its own messes.&lt;/p&gt; 121 - &lt;h3 id=&quot;the-clean-up&quot;&gt;&lt;a href=&quot;#the-clean-up&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The clean-up&lt;/h3&gt; 122 - &lt;p&gt;I resolved that I would continue to let Claude do &lt;i&gt;all&lt;/i&gt; of the editing, and not do &lt;i&gt;any&lt;/i&gt; myself, and so thus began the more frustrating part of this adventure! I ended up giving a mix of very specific instructions: &amp;quot;move this code here&amp;quot;, &amp;quot;factorize out this functionality&amp;quot;, &amp;quot;rename this function&amp;quot;, and sometimes more general ones: &amp;quot;Remove any comments that don't add anything of value&amp;quot;, or &amp;quot;Think of a better way to do this&amp;quot;. The constant was that I needed to be looking over each change that it did, because while most of them were pretty good, there were still a few, even with the very explicit instructions, where it messed up. From the very broad, where at one point it told me &amp;quot;I'll remove this code to create odoc files for external dependencies, as they're installed by opam&amp;quot;, which isn't true, down to the very small - for example, it produced the following:&lt;/p&gt; 123 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;let lib_names = deps.Odoc_config.libraries in 146 + ...</code></pre></div> 147 + <p>where it seems opam has decided that something has changed enough for it to want to recompile the `ocaml` package, and therefore <i>everything</i> in the entire opam switch! 
So this job took 12 minutes instead of 21 seconds, which was the time required to finally build the `tezos-protocol` package.</p> 148 + <h3 id="day10-and-docs"><a href="#day10-and-docs" class="anchor"></a>Day10 and docs</h3> 149 + <p>In closely related news, <a href="https://tunbury.org/">mtelver's</a> day10 project looked precisely the right shape for building docs - in fact it shares its architecture and some components with the docs CI. So I asked Claude to take a look and see what it would take, and discovered that it doesn't take very much! We have a Really Big Machine here at the CL that was temporarily underused; and by Really Big I mean 768 cores and 3TB of RAM. So, how long could building all of the docs for all of the packages possibly take? Well, it takes 5 hours 40 mins. And I was only using roughly a third of the machine. Nice!</p> 150 + <p>So should I push on with fixing ocaml-docs-ci and figure out why it's rebuilding everything all the time? Or should I forge ahead with day10 and turn it into a proper CI system as opposed to a slightly flakey bespoke thing I have to handhold through a build? This is next week's problem.</p> 151 + <h3 id="js-toplevels"><a href="#js-toplevels" class="anchor"></a>JS toplevels</h3> 152 + <p>Something I keep coming back to is javascript toplevels. I'd really like to be able to be able to host JS toplevels on ocaml.org for each different version of each different package. This is something I've worked on on-and-off for a long time now, and several fixes to help have been merged to various projects along the way. The tricky thing is to not put a massive load onto ocaml.org with this, so we need to be efficient. That means firstly having a single toplevel js file with all of the logic in but none of the libraries, and then dynamically loading libraries as we need them. Also we can save some bandwidth by not immediately sending all of the cmi files, as these can be faulted in as necessary too. 
So once again I've got Claude on the task, and things are honestly looking pretty hopeful now. I've got 2 demos:</p> 153 + <ul><li><a href="https://jon.ludl.am/experiments/findlibish/">Dynamic library loading</a></li><li><a href="https://jon.ludl.am/experiments/multi-universe-demo/">Multi-version support</a></li></ul> 154 + <p>In both cases, make sure you take a look at the network tab to see it dynamically loading only what it needs.</p> 155 + <h2 id="retrospective"><a href="#retrospective" class="anchor"></a>Retrospective</h2> 156 + <h3 id="autonomous-claude"><a href="#autonomous-claude" class="anchor"></a>Autonomous Claude</h3> 157 + <p>The power of sending Claude off to do some work can be immense. However, it does mean investing time up front telling it precisely what problem you're trying to solve, what approach to take, finer details on how you want it done, and how you can tell if it's working when it finishes. A 'failure mode' I've been experiencing is when I end up in a long, drawn out real time interaction, especially if that's happening with 2 projects simultaneously - and by 'failure' I really mean just 'slow'. Ideally what would be going on is for all of my agents to be getting on with whatever task they've been allocated without bothering me for more details. For Claude to have to ask me a question has much more latency involved than it just getting on with things, especially if I don't notice it immediately.</p> 158 + <h3 id="when-to-stop"><a href="#when-to-stop" class="anchor"></a>When to Stop</h3> 159 + <p>The 'finishing criteria' are important - many times this week I've had Claude tell me it's finished something, having verified that it's passing all the tests, only for me to take a look to find that it's very obviously broken. As quite a few things recently have involved the web, I've put Playwright into all of my devcontainers, and told Claude to use it to verify things are working. 
This has been working pretty well, so I'll be adding it to my prompts. It's not too dissimilar to what we used to call 'pre-flight checks' back in the Citrix days.</p> 160 + <h3 id="containers-vs-accounts"><a href="#containers-vs-accounts" class="anchor"></a>Containers vs accounts</h3> 161 + <p>I've been running everything with `--dangerously-ignore-permissions` in containers, and while the outcome is amazing, the containers bit has been a bit of a headache. Next week I'll be trialling the idea of just giving the agents their own account (non-admin!) on my servers, their own github account, tangled account and so on, and just treating them more like I would if I had a real colleague. It's always slightly alarming to see my own name on the output of the bots, assigning me (or sometimes someone else (!!)) copyright over code I've never seen. This is, of course, a whole other pandora's box that I really don't want to open right now - but I think the point is that I'll feel a lot more comfortable if the commits are all by `Jon's Agent &lt;jon+claude@recoil.org&gt;` rather than by me!</p> 162 + <h3 id="deciding-next-steps"><a href="#deciding-next-steps" class="anchor"></a>Deciding next steps</h3> 163 + <p>The question of whether I should fix up ocaml-docs-ci or improve the day10 solution requires a bit of thought. In fact, it requires a bit of a gap analysis between the two. This isn't something I've asked Claude to do before, so I'll try that and see how it turns out. I'll be asking it to be &quot;scientific&quot; in its approach, coming up with hypotheses and verifying them - for which I think I'll need to give it a platform on which it can perform experiments. This is a bit trickier with ocaml-docs-ci than day10 as day10 runs entirely on any given linux computer, whereas ocaml-docs-ci needs ocurrent workers and a routable ssh server. 
I'll report on the outcome of this next week!</p>]]></content> 164 + </entry> 165 + <entry> 166 + <id>https://jon.recoil.org/blog/2026/01/weeknotes-2026-03.html</id> 167 + <title>Weeknotes for week 3</title> 168 + <published>2026-01-19T00:00:00Z</published> 169 + <updated>2026-01-19T00:00:00Z</updated> 170 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/01/weeknotes-2026-03.html"/> 171 + <summary>First week back of 2026! Let's write some terse weeknotes.</summary> 172 + <content type="html"><![CDATA[<h1 id="weeknotes-for-week-3"><a href="#weeknotes-for-week-3" class="anchor"></a>Weeknotes for week 3</h1> 173 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-01-19</p></li></ul> 174 + <p>First week back of 2026! Let's write some terse weeknotes.</p> 175 + <h2 id="projects"><a href="#projects" class="anchor"></a>Projects</h2> 176 + <h3 id="dune-odoc-rules"><a href="#dune-odoc-rules" class="anchor"></a>Dune odoc rules</h3> 177 + <p>Last thing I did last year was to push the new rules for odoc 3. This week, Anil handed me an excellent opportunity to test the rules on the monorepo containing his <a href="https://anil.recoil.org/notes/aoah-2025">AOAH</a> projects. Claude tends to actually write ocamldoc-formatted comments, so this is really useful to test the rules. I've <a href="https://github.com/jonludlam/dune/tree/odoc-v3-rules-3.21">rebased the commits</a> on the just-released Dune 3.21 and we've been trying them out. There were a few things to fix:</p> 178 + <ul><li>More careful <a href="https://github.com/jonludlam/dune/commit/25158eabf0c3cac2826e16ce590b4bd4d7c09818">dependency tracking</a> during the compile phase - this particularly affected the <code>@doc</code> target, which was pulling in unnecessary dependencies. 
Most of these dependencies were compiling just fine, but one - Angstrom - is slightly odd in that the opam install of Angstrom installs a META file that references libraries that aren't in the dependencies of its opam package. This is a backward-compatibility hack that was implemented when the Angstrom package was split into several in order to manage the dependencies better.</li><li>A similar issue happens with eio, where the documentation of the package depends upon <code>bigstring</code>, which isn't in eio's dependencies. This is entirely intentional - the extra doc dependency is stated in the opam file with a <code>x-extra-doc-deps</code> field. However, <code>opam install</code> totally ignores this field (quite reasonably), and so a simple install gives you an opam repo whose docs can't be built. Once again, this broke <code>dune build @doc</code> unnecessarily, but the fix was <a href="https://github.com/jonludlam/dune/commit/2afe046cf4290d7a83b5f2c5646e3391ca94b630">relatively simple</a>. The <i>real</i> fix here is to not use <code>x-extra-doc-deps</code>, but switch to using a <i>real</i> dependency, but marked with <code>with-doc</code> and <code>post</code> if it would otherwise introduce a circular dependency. That way, an <code>opam install --with-doc</code> <i>would</i> install the extra dependency.</li><li>Over the Christmas break, <a href="https://discuss.ocaml.org/u/tbrk">tbrk</a> posted <a href="https://discuss.ocaml.org/t/odoc-index-for-multiple-packages-inter-package-links-and-local-global-sidebar/17652">on discuss</a> a question about building docs, for which my dune branch was a partial answer. One feature he was requesting though was the ability to use a custom top-level index.
It's a useful feature that's implemented in <code>odoc_driver</code> so I've <a href="https://github.com/jonludlam/dune/commit/efecdee1b36b7e47906e7c64b7496a1fc7954a2d">added it</a>.</li><li>More sensible <a href="https://github.com/jonludlam/dune/commit/039eb3d2a3e9d28f8b195905f43839daf5ce8c21">default link scope</a>. By default, documentation references in the <code>mli</code> files of a library can link to any other library in the package. However, by default it wasn't possible to link to the dependencies of another library, unless it happened to be a dependency of your own library. Similarly, the package-wide mld files could only reference the modules in the package's libraries, not to the dependencies. This seems overly cautious, as we can be sure that if we've managed to build the libraries then their dependencies are installed, and if there are any module name conflicts, we can resolve them via the <code>/&lt;lib&gt;/Module</code> syntax.</li><li>Lastly, implementations of virtual libraries <a href="https://github.com/jonludlam/dune/commit/12f9ecbd4888444c2d359049a914ffb4827912f9">need to be skipped</a> as they've all got the same docs (as they share mli files), and the rules as they were causing Dune to crash with a &quot;Conflicting implementations&quot; error.</li></ul> 179 + <p>I've also rebased the PR onto latest <code>main</code>, but I've not yet put these patches there, which I'll need to do for the PR to be mergable. For now, the 3.21 branch is successfully building the docs for the monorepo.</p> 180 + <h3 id="ocaml-docs-ci"><a href="#ocaml-docs-ci" class="anchor"></a>OCaml Docs CI</h3> 181 + <p><a href="https://github.com/jmid">Jan Midtgaard</a> noticed over xmas that the Docs CI <a href="https://github.com/ocaml/ocaml.org/issues/3437">was broken</a> and submitted <a href="https://github.com/jonludlam/opamh/pull/1">a fix</a>. 
I've therefore been poking <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a> to get the fix incorporated and into production. I almost immediately hit the issue that <code>odoc_driver</code> now breaks for the exact same reason. I couldn't quite understand how <code>opam-format</code> <a href="https://github.com/ocaml/opam-repository/pull/28978">had been merged</a> to <code>opam-repository</code> without someone noticing that it had broken <code>odoc_driver</code>, but it turned out that it <i>had</i> been noticed, but on a <a href="https://github.com/ocaml/opam-repository/pull/28877">beta release</a>. The fix to docs ci was to install <code>odoc_driver</code> from opam rather than <a href="https://github.com/ocurrent/ocaml-docs-ci/blob/81ca17c7b7a2f47ca571b1d6bc866720cebef136/src/lib/config.ml#L226">pinning directly</a> to a github hash, especially if that hash happens to be the hash of the released version!</p> 182 + <p>While I'm working on docs CI, I thought it's probably also a good idea to move over to the <code>with-doc &amp; post</code> suggestion from above, so we're ready for when packages start to use that. This is now being tested, and hopefully we'll have the CI back up and running early next week.</p> 183 + <h3 id="better-styling-for-odoc"><a href="#better-styling-for-odoc" class="anchor"></a>Better styling for odoc</h3> 184 + <p>I've done very little to the styling of odoc since I took maintainership way back in 2019 or so. It's a bit dated, and there are some annoying usability issues, so I thought it's a good opportunity to vibe-code a nice new frontend for it. Rather than hack directly on the HTML generator of odoc, this seemed to be a good opportunity to test the JSON output from the new Dune rules, so I asked Claude to make me a static site generator that read in the JSON files and spat out some nicely styled HTML. 
This worked like a charm, and the results are <a href="https://jon.ludl.am/experiments/vibe-coded-odoc-frontend/">here</a>. Next steps are to see what it would take to get the native odoc output looking more like that.</p> 185 + <h3 id="custom-tags-in-odoc"><a href="#custom-tags-in-odoc" class="anchor"></a>Custom tags in odoc</h3> 186 + <p>One of the themes of Anil's <a href="">AOAH</a> coding spree was that many libraries were implementations of RFCs. In many places in the docs, there are links to relevant sections of the RFCs. It'd be nice in future to be able to validate that we've covered all of the parts of the RFCs, so making the links a little more parsable seemed like a good idea. In fact, it seemed that this might be a perfect use for custom tags - a feature that was present in ocamldoc that odoc has yet to implement.</p> 187 + <p><a href="https://github.com/art-w">Arthur Wendling</a> then pointed me at dune's <a href="https://dune.readthedocs.io/en/stable/reference/dune/plugin.html">plugin system</a>, which seemed just the ticket as a way to implement this. It's really nice, taking all of the hard work out of creating OCaml plugins, so I've now got <a href="https://github.com/jonludlam/odoc/tree/extension-plugins">an extension-plugins branch</a> that implements this. It allows you to add support to odoc for tags like <code>@rfc</code> which generate custom HTML, markdown or any other backend, can include links in their bodies, and can add custom headers to the web page, and custom files to be output by <code>odoc support-files</code>. It looks like this should &quot;just work&quot; and no further changes to the dune rules are needed - though I need to actually test this out.</p> 188 + <h3 id="day10-and-docs"><a href="#day10-and-docs" class="anchor"></a>Day10 and docs</h3> 189 + <p>I've <a href="../../2025/09/build-ids-for-day10.html" title="build-ids-for-day10">written about</a> <a href="https://tunbury.org/">Mark's</a> day10 project before. 
It's a tool to very rapidly build odoc packages mainly in order to test that they build correctly. An obvious extension would be to use this to then build the docs for those packages, as the way we do this requires the packages to be built first. This would be a replacement for the Docs CI that I talked about above, though there's considerable work to do before it's fully-featured enough to be a viable alternative. It seemed like a good time to experiment with this though, so I set up one of Anil's <a href="https://anil.recoil.org/notes/ocaml-claude-dev">devcontainers</a>, gave Claude some instructions on what to do, took the safety belt off, and let him hack away! Previously most of my interactions with Claude had been via the vscode plugin, so using the terminal interface was a bit of a different experience. I'm fairly certain though that I'm going to switch everything over to working this way, as letting Claude just get on with things without having to OK every step is a far more efficient way to work - especially when you're not that concerned with the actual code being produced. This has been mostly a good experience, though Claude does sometimes go off in rather odd directions. At one point there was a network error with a dependency while trying to build odoc_driver, so it decided that it should have a fallback mechanism that executed odoc directly. 
I told it <i>NEVER</i> to replace functionality in odoc_driver, so it rolled this back, but a few hours later it then did exactly the same thing again.</p> 190 + <h3 id="misc-other-stuff"><a href="#misc-other-stuff" class="anchor"></a>Misc other stuff</h3> 191 + <p>A few other things too - <a href="https://github.com/jonludlam/odoc/commit/59037341cd53d8734a5874f7af2b728b5be70035">improving the <code>--warn-error</code> logic in odoc</a>, and one of its <a href="https://github.com/jonludlam/odoc/commit/9d18feff5eda543652c6749062750de6e5bb4d6e">error messages</a>, improving the build of this website so I can iterate on it more quickly, fixing up some of my self-hosted services like my tangled knot, and other bits and bobs.</p> 192 + <h2 id="reflections"><a href="#reflections" class="anchor"></a>Reflections</h2> 193 + <p>I think the most important thing this week has been the slightly eye-opening benefits of using Claude outside of the context of VSCode. I suspect I'll be doing much more of my work this way in future. There's also a good chance I'll have to upgrade my subscription from the $100-per-month to the $200 one...</p> 194 + <h2 id="next-week"><a href="#next-week" class="anchor"></a>Next week</h2> 195 + <ul><li>Start of term tutorial meetings</li><li>Sherldoc in monopam-myspace</li><li>Get ocaml-docs-ci deployed and working</li><li>Update the Dune PR</li><li>Integrate the custom-tags and website generator into monopam-myspace</li><li>Unleash Claude on my js-top-worker repo</li></ul>]]></content> 196 + </entry> 197 + <entry> 198 + <id>https://jon.recoil.org/blog/2025/12/claude-and-dune.html</id> 199 + <title>Claude and Dune</title> 200 + <published>2025-12-18T00:00:00Z</published> 201 + <updated>2025-12-18T00:00:00Z</updated> 202 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/12/claude-and-dune.html"/> 203 + <summary>Back in March of this year we released , a major new version of the OCaml documentation generator.
It had a whole load of , many of which came with new demands on the build system driving it. We decid...</summary> 204 + <content type="html"><![CDATA[<h1 id="claude-and-dune"><a href="#claude-and-dune" class="anchor"></a>Claude and Dune</h1> 205 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-12-18</p></li></ul> 206 + <p>Back in March of this year we released <a href="https://ocaml.github.io/odoc/odoc/index.html">odoc 3.0.0</a>, a major new version of the OCaml documentation generator. It had a whole load of <a href="https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043">new features</a>, many of which came with new demands on the build system driving it. We decided when working on it to build a new driver for odoc so that we could adjust it as we were building the new features, and this driver is now used to <a href="../07/odoc-3-live-on-ocaml-org.html" title="odoc-3-live-on-ocaml-org">build the documentation</a> that appears on <a href="https://ocaml.org/p/base/latest/doc/index.html">ocaml.org</a>. However, it was always the plan to integrate the new features into <a href="https://dune.build">Dune</a> so that everyone could just run <code>dune build @doc</code> and be able to use all of the new odoc 3 features.</p> 207 + <p>So over the last few weeks I have been wrestling with getting Claude to update the odoc rules in Dune to support <i>some</i> of the new features of odoc v3. What began as a background experiment during a lecture series has turned into a multi-week effort to turn mostly-working code into a clean, reviewable patch. 
AI-developed software is clearly going to be a big part of our future, and Anil is showing us all the way with his <a href="https://anil.recoil.org/notes/aoah-2025-1">Advent of Agentic Humps</a> by building <i>new</i> software, but upstreaming AI-generated changes to an existing, well established code base <a href="https://github.com/ocaml/ocaml/pull/14369">hasn't got off to a good start</a> in the OCaml community, so I wanted to be extra careful to get this right.</p> 208 + <h3 id="claude-as-a-protyping-tool"><a href="#claude-as-a-protyping-tool" class="anchor"></a>Claude as a protyping tool</h3> 209 + <p>The initial progress was pretty amazing, despite my initial worries that the dune code-base would be <a href="https://github.com/ocaml/dune/pull/12529">too large and subtle</a> for an LLM to be able to make workable changes. In order to get going, first I had it look at several bits of example code:</p> 210 + <p>1. <a href="https://github.com/ocaml/dune/blob/3.20.2/src/dune_rules/odoc.ml">dune_rules/odoc.ml</a> - this is the current home of the odoc rules in dune. It's local-only, meaning it only builds the docs for the current package in isolation, so no resolution of links to stdlib, other packages or libraries.</p> 211 + <p>2. <a href="https://github.com/ocaml/dune/blob/3.20.2/src/dune_rules/odoc_new.ml">dune_rules/odoc_new.ml</a> - these are the rules for odoc v2, which allow you to build the docs for your package plus all of the dependencies. I wrote this mostly myself some time ago. It does a pretty poor job of caching, error reporting, and has none of the odoc v3 features like assets, source rendering, hierarchical docs, better errors and so on.</p> 212 + <p>3. <a href="https://github.com/ocaml/odoc/tree/d8460cdaa2b91a03434a9a045d673703b7fabfb2/src/driver">odoc_driver</a> - this is the driver we wrote when building odoc v3. It's fully featured, but not at all incremental, and actually external to the dune codebase. 
It's the reference implementation that's used to build the docs that appear on <a href="https://ocaml.org/p/base/latest/doc/index.html">ocaml.org</a>.</p> 213 + <p>Armed with these three code-bases, I asked Claude to synthesise a new incremental version of the odoc rules for dune that has some of the features of <code>odoc_driver</code>.</p> 214 + <h3 id="the-working-prototype"><a href="#the-working-prototype" class="anchor"></a>The working prototype</h3> 215 + <p>Claude quickly produced a prototype that actually compiled and generated documentation. At that stage I was not interested in the quality of the generated source; I only needed to know whether Claude could navigate Dune's codebase and produce something that <b>works</b>. I let the prototype evolve incrementally, adding in new features one at a time, for example, fixing the error reporting so that you only get warned about documentation errors that you can actually fix.</p> 216 + <p>When the lectures finished, it turned out I had something that was pretty useful to me, and had a good chance to be useful to others too. So I opened up my editor and had a look through what had been produced, at this point hoping that a little bit of polishing should be enough - after all, it <i>was</i> working!</p> 217 + <p>It was dreadful.</p> 218 + <p>There were long, rambling functions, code duplication, bad comments, it was unstructured, with repeated-but-slightly-different chunks all over the place. It wasn't just bad on one length scale - it was bad from the large-scale organisation of the code down to small scale baffling weirdnesses on one line. The more I looked, the more bonkers it appeared. But it did <i>work</i>! 
So I thought I'd get Claude to clean up its own messes.</p> 219 + <h3 id="the-clean-up"><a href="#the-clean-up" class="anchor"></a>The clean-up</h3> 220 + <p>I resolved that I would continue to let Claude do <i>all</i> of the editing, and not do <i>any</i> myself, and thus began the more frustrating part of this adventure! I ended up giving a mix of very specific instructions: &quot;move this code here&quot;, &quot;factorize out this functionality&quot;, &quot;rename this function&quot;, and sometimes more general ones: &quot;Remove any comments that don't add anything of value&quot;, or &quot;Think of a better way to do this&quot;. The constant was that I needed to be looking over each change that it made, because while most of them were pretty good, there were still a few, even with the very explicit instructions, where it messed up. From the very broad, where at one point it told me &quot;I'll remove this code to create odoc files for external dependencies, as they're installed by opam&quot;, which isn't true, down to the very small - for example, it produced the following:</p> 221 + <div><pre class="language-ocaml"><code>let lib_names = deps.Odoc_config.libraries in 124 222 if List.is_empty lib_names 125 223 then Memo.return [] 126 - else Memo.List.filter_map lib_names ~f:(fun lib_name -&amp;gt; Lib.DB.find lib_db lib_name)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 127 - &lt;p&gt;where it has come up with a totally redundant check for the empty list.&lt;/p&gt; 128 - &lt;p&gt;It was at this point that it became frustrating, because although it's almost magical that Claude can do what it does in the time it does, the fact of having to keep a constant eye on it, together with the tens-of-seconds to minutes delay whenever it did something, meant I ended up either twiddling my thumbs for long periods of time, or getting started on some other task and forgetting to come back to Claude, sometimes for hours!&lt;/p&gt; 129 - &lt;h3 
id=&quot;ocaml-is-not-the-problem&quot;&gt;&lt;a href=&quot;#ocaml-is-not-the-problem&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OCaml is &lt;b&gt;not&lt;/b&gt; the problem&lt;/h3&gt; 130 - &lt;p&gt;One thing that particularly impressed, and also quite surprised me, was its knowledge of OCaml. In particular, I had at one point two different types representing the 'target' - either a library or a package - and a 'kind' - either a module or a page. Now pages can only be associated with package targets, and modules can only be associated with libraries, but these two values were distinct, so there was a fair bit of code pattern matching invalid combinations and either throwing exceptions or picking some random value, depending on the whims of Claude's context. I bravely suggested it think of a better way to represent this, maybe using GADTs, and it did indeed come up with a pretty nice refactoring of the types:&lt;/p&gt; 131 - &lt;p&gt;Before:&lt;/p&gt; 132 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type target = 224 + else Memo.List.filter_map lib_names ~f:(fun lib_name -&gt; Lib.DB.find lib_db lib_name)</code></pre></div> 225 + <p>where it has come up with a totally redundant check for the empty list.</p> 226 + <p>It was at this point that it became frustrating, because although it's almost magical that Claude can do what it does in the time it does, the fact of having to keep a constant eye on it, together with the tens-of-seconds to minutes delay whenever it did something, meant I ended up either twiddling my thumbs for long periods of time, or getting started on some other task and forgetting to come back to Claude, sometimes for hours!</p> 227 + <h3 id="ocaml-is-not-the-problem"><a href="#ocaml-is-not-the-problem" class="anchor"></a>OCaml is <b>not</b> the problem</h3> 228 + <p>One thing that particularly impressed, and also quite surprised me, was its knowledge of OCaml. 
In particular, I had at one point two different types representing the 'target' - either a library or a package - and a 'kind' - either a module or a page. Now pages can only be associated with package targets, and modules can only be associated with libraries, but these two values were distinct, so there was a fair bit of code pattern matching invalid combinations and either throwing exceptions or picking some random value, depending on the whims of Claude's context. I bravely suggested it think of a better way to represent this, maybe using GADTs, and it did indeed come up with a pretty nice refactoring of the types:</p> 229 + <p>Before:</p> 230 + <div><pre class="language-ocaml"><code>type target = 133 231 | Lib of Package.Name.t * Lib.t 134 232 | Pkg of Package.Name.t 135 233 ··· 142 240 | Page of 143 241 { name : string 144 242 ; pkg_libs : Lib.t list 145 - }&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 146 - &lt;p&gt;After:&lt;/p&gt; 147 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;(* Artifact data types *) 243 + }</code></pre></div> 244 + <p>After:</p> 245 + <div><pre class="language-ocaml"><code>(* Artifact data types *) 148 246 type page = { name : string; pkg_libs : Lib.t list } 149 247 150 248 type mod_ = ··· 154 252 } 155 253 156 254 type _ target = 157 - | Lib : Package.Name.t * Lib.t -&amp;gt; mod_ target 158 - | Pkg : Package.Name.t -&amp;gt; page target 255 + | Lib : Package.Name.t * Lib.t -&gt; mod_ target 256 + | Pkg : Package.Name.t -&gt; page target 159 257 160 258 type artifact_kind = 161 - | Module : mod_ * mod_ target -&amp;gt; artifact_kind 162 - | Page : page * page target -&amp;gt; artifact_kind&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 163 - &lt;p&gt;This refactoring immediately removed a whole swathe of invalid combinations, making the code both safer and clearer. 
It's quite clear that Claude had no trouble understanding how GADTs work in OCaml, quite happily also using some existentials to pack them into lists and so on.&lt;/p&gt; 164 - &lt;h3 id=&quot;odd-behaviours&quot;&gt;&lt;a href=&quot;#odd-behaviours&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odd behaviours&lt;/h3&gt; 165 - &lt;p&gt;Sometimes Claude just went a little bit bananas. One annoyance that &lt;i&gt;repeatedly&lt;/i&gt; occurred was that it would forget how to build and test the dune executable, despite clear instructions in &lt;code&gt;Claude.md&lt;/code&gt;. Most of the time when it went wrong it would build dune, execute &lt;code&gt;dune clean&lt;/code&gt;, then try to run the dune binary that it had just removed with the &lt;code&gt;clean&lt;/code&gt;. Sometimes it would decide to use the bootstrap binary instead, which isn't rebuilt on every change, sometimes it would run the switch-installed dune binary, and on one occasion it tried to run &lt;code&gt;./configure &amp;amp;&amp;amp; make&lt;/code&gt;!&lt;/p&gt; 166 - &lt;p&gt;It would usually figure out eventually what the right thing to do was, but when you're waiting for it to complete so you can check what it's done these sorts of delays got a bit frustrating.&lt;/p&gt; 167 - &lt;h3 id=&quot;reflections&quot;&gt;&lt;a href=&quot;#reflections&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Reflections&lt;/h3&gt; 168 - &lt;p&gt;At one point, I ran out of Claude credits (despite paying $100 a month or so), at about 6:20pm one evening, and it told me that I needed to wait until 7pm to carry on. I'd just got to the point when I needed to write a short bit of code rather than refactoring what was already there, and I realised that while it would take me maybe 10 mins, it would take Claude maybe 10 seconds. Now, it could just be that it was the end of a long day and I was running out of steam, but I was content to switch focus elsewhere for a bit to wait for my credits to reset before carrying on! 
The point being that for the small implementation that I was after, it would be possible for me to get Claude to do it, and to eyeball the result to make sure it was OK in less time than I would have been able to do it myself. But I absolutely wouldn't have trusted Claude to do it in an upstreamable way &lt;b&gt;without&lt;/b&gt; looking at the result.&lt;/p&gt; 169 - &lt;p&gt;Overall, It's clear that Claude will be an incredibly useful tool for working with software. It's unbelievably good at jumping into a new code-base and figuring things out quickly, but less good at producing high-quality code that can be directly submitted upstream (yet?) - at least, not that &lt;b&gt;I&lt;/b&gt; would be comfortable submitting anyway. However, I think it's still a bit of an open question as to what the quality bar &lt;em&gt;should&lt;/em&gt; be. If it builds correctly, passes the tests, looks &lt;i&gt;broadly&lt;/i&gt; sensible and isn't on the critical path for performance, how much should we care about the line-to-line quality? &lt;b&gt;I&lt;/b&gt; certainly care, but am I being old fashioned?&lt;/p&gt; 170 - &lt;p&gt;I've submitted a &lt;a href=&quot;https://github.com/ocaml/dune/pull/12995&quot;&gt;PR with these changes&lt;/a&gt; for review, and we'll see what happens there. I ended up squashing all of the commits into one, as the intermediate steps are very likely not useful. However, for historical interest, the branch on which I did most of the work is &lt;a href=&quot;https://github.com/ocaml/dune/compare/main...jonludlam:dune:odoc3-global-sidebar&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/12/claude-and-dune.html</id><title type="text">Claude and Dune</title><updated>2025-12-18T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">SVGs are pretty cool - vector graphics in a simple XML format. 
They are supported on just about every device and platform, are crisp on every display, and can have embedded scripts in to make them int...</summary><published>2025-12-09T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;an-svg-is-all-you-need&quot;&gt;&lt;a href=&quot;#an-svg-is-all-you-need&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;An SVG is all you need&lt;/h1&gt; 171 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-12-09&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 172 - &lt;p&gt;SVGs are pretty cool - vector graphics in a simple XML format. They are supported on just about every device and platform, are crisp on every display, and can have embedded scripts in to make them interactive. They're &lt;a href=&quot;https://www.youtube.com/watch?v=4laPOtTRteI&quot;&gt;way more capable&lt;/a&gt; than many people realise, and I think we can capitalise on some of that unrealised potential.&lt;/p&gt; 173 - &lt;p&gt;Anil's recent post &lt;a href=&quot;https://anil.recoil.org/notes/principles-for-collective-knowledge&quot;&gt;Four Ps for Building Massive Collective Knowledge Systems&lt;/a&gt; got me thinking about the permanence of the experimentation that underlies our scientific papers. In my idealistic vision of how scientific publishing should work, each paper would be accompanied by a fully interactive environment where the reader could explore the data, rerun the experiments, tweak the parameters, and see how the results changed. Obviously we can't do this in the general case - some experiments are just too expensive or time-consuming to rerun on demand. 
But for many papers, especially in computer science, this is entirely feasible.&lt;/p&gt; 174 - &lt;p&gt;That line of thought reminded me of a project I tackled about 20 years ago as a post-doc in the Department of Plant Sciences here in Cambridge. I was writing a paper on &lt;a href=&quot;https://royalsocietypublishing.org/rsif/article/9/70/949/173/Applications-of-percolation-theory-to-fungal&quot;&gt;synergy in fungal networks&lt;/a&gt; and built a tiny SVG visualisation tool that let readers wander through the raw data captured from a real fungal network growing in a petri dish. I dug it up recently and was surprised (and delighted) to see that it still works perfectly in modern browsers - even though the original “cover page” suggested Firefox 1.5 or the Adobe SVG plug-in (!). Give it a spin; click the 'forward', 'back' and other buttons below the petri dish!&lt;/p&gt; 175 - &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;./fungus.svg&lt;/span&gt;&lt;/div&gt; 176 - &lt;p&gt;And that, dear reader, is literally all you need. A completely self-contained SVG file can either fetch data from a versioned repository or embed the data directly, as the example does. It can process that data, generate visualisations, and render knobs and sliders for interactive exploration. No server-side magic required - everything runs client-side in the browser, served by a plain static web server, and is very easy to share.&lt;/p&gt; 177 - &lt;p&gt;How does it fit in with Anil's four Ps?&lt;/p&gt; 178 - &lt;ul&gt;&lt;li&gt;Permanence: SVGs can be assigned DOIs just like papers, blog posts, or datasets. The fact that the above SVG still works after two decades is a testament to the durability of the format.&lt;/li&gt;&lt;/ul&gt; 179 - &lt;ul&gt;&lt;li&gt;Provenance: Because SVG is plain text, it plays nicely with version control systems such as Git. 
When an SVG pulls in external data, the same provenance-tracking strategies Anil describes for datasets apply here as well.&lt;/li&gt;&lt;/ul&gt; 180 - &lt;ul&gt;&lt;li&gt;Permission: Once again, with the separation between the processing in the SVG and the data that it works on, the same permissioning models apply as for data in general.&lt;/li&gt;&lt;/ul&gt; 181 - &lt;ul&gt;&lt;li&gt;Placement: SVGs are &lt;i&gt;inherently&lt;/i&gt; spatial; it's very easy, for example, to make beautiful &lt;a href=&quot;https://stephanwagner.me/coding/blog/create-world-map-charts-with-svgmap#svgMapDemoGDP&quot;&gt;world maps&lt;/a&gt; with SVG.&lt;/li&gt;&lt;/ul&gt; 182 - &lt;p&gt;The SVG above is only a visualisation tool for data; it doesn't really do any processing, but it certainly &lt;i&gt;could&lt;/i&gt;. The biggest change that's happened over the 20 years since I wrote this is the &lt;i&gt;massive&lt;/i&gt; increase in the computational power available in the browser. It would be entirely feasible to implement the entire data analysis pipeline for that paper in an SVG today, probably without even spinning up the fans on my laptop!&lt;/p&gt; 183 - &lt;p&gt;So this is yet another tool in our ongoing effort to be able to effortlessly share and remix our work - added to the pile of Jupyter notebooks, &lt;a href=&quot;https://digitalflapjack.com/blog/marimo/&quot;&gt;Marimo notebooks&lt;/a&gt;, the &lt;a href=&quot;https://slipshow.readthedocs.io/en/stable/&quot;&gt;slipshow&lt;/a&gt;/&lt;a href=&quot;https://github.com/art-w/x-ocaml/&quot;&gt;x-ocaml&lt;/a&gt; &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/11/foundations-of-computer-science&quot;&gt;combination&lt;/span&gt;, &lt;a href=&quot;https://patrick.sirref.org/weekly-2025-w45/index.xml&quot;&gt;Patrick's take&lt;/a&gt; on Jon Sterling's &lt;a href=&quot;https://sr.ht/~jonsterling/forester/&quot;&gt;Forester&lt;/a&gt;, my own &lt;span class=&quot;xref-unresolved&quot; 
title=&quot;/jon-site/notebooks/index&quot;&gt;notebooks&lt;/span&gt;, and many others - and this is a subset of what we're using just in our own group!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html</id><title type="text">An SVG is all you need</title><updated>2025-12-09T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I recently completed lecturing the course to our newly arrived first-year computer scientists here at . This is the first time I've lectured this course, taking over from while he's on sabbatical. A...</summary><published>2025-11-14T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/11/foundations-of-computer-science.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;foundations-of-computer-science&quot;&gt;&lt;a href=&quot;#foundations-of-computer-science&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Foundations of Computer Science&lt;/h1&gt; 184 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-11-14&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 185 - &lt;p&gt;I recently completed lecturing the course &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/&quot;&gt;&amp;quot;Foundations of Computer Science&amp;quot;&lt;/a&gt; to our newly arrived first-year computer scientists here at &lt;a href=&quot;https://www.cam.ac.uk&quot;&gt;Cambridge&lt;/a&gt;. This is the first time I've lectured this course, taking over from &lt;a href=&quot;https://anil.recoil.org/&quot;&gt;Anil&lt;/a&gt; while he's on sabbatical. Although I was very nervous indeed about it, I ended up really enjoying the experience - and I hope the students did too! 
This post is a little brain dump of my thoughts on how it went and how we might improve it for next year.&lt;/p&gt; 186 - &lt;h2 id=&quot;course-overview&quot;&gt;&lt;a href=&quot;#course-overview&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Course Overview&lt;/h2&gt; 187 - &lt;p&gt;The course is 12 lectures long and has been lectured in a similar way since I myself was an undergraduate here, way back in 1996. There have been a few changes, not least of which is that back then it was in Standard ML rather than OCaml, but the core material has remained largely the same: lists, recursive functions, trees, higher-order functions, search and finally mutability. There are no prerequisites for the course, although all students have at least a maths A-level (or equivalent), and almost all of them have done some programming before, though the experience varies widely. Very few have done any functional programming, and even fewer have written any OCaml before.&lt;/p&gt; 188 - &lt;p&gt;The notes for the course are distributed both in hard copy and also as an &lt;a href=&quot;https://github.com/ocamllabs/focs-notebooks/blob/main/1A%20Foundations%20of%20Computer%20Science.ipynb&quot;&gt;interactive Jupyter Notebook&lt;/a&gt;, which we host on our &lt;a href=&quot;https://hub.cl.cam.ac.uk&quot;&gt;JupyterHub server&lt;/a&gt; that I maintain. The idea is that the students can read through the notes and then play around with the code examples directly in the notebook. I don't encourage them or give them time to do much &lt;i&gt;during&lt;/i&gt; the lectures - not that I think this is a terrible idea, but it's a struggle to fit all the material in otherwise! The notes are pretty closely coupled to the lectures, organised into 11 chapters that correspond to the first 11 lectures, with exercises at the end of each chapter that are intended to be covered in the supervisions. 
We also have some assessed exercises - &amp;quot;Ticks&amp;quot; - that the students complete in their own time using the JupyterHub server using &lt;a href=&quot;https://github.com/jupyter/nbgrader&quot;&gt;nbgrader&lt;/a&gt;. They are automatically assessed in a very transparent way; each &amp;quot;tick&amp;quot; is a Jupyter notebook with editable answer cells and read-only test cells. Overall we're aiming for the students not to &lt;i&gt;have&lt;/i&gt; to install OCaml locally at all, though I hope many of them will choose to do so anyway.&lt;/p&gt; 189 - &lt;p&gt;While I didn't want them playing around with the notebook during the lectures, I do, however, try to get them to interact by getting them to answer questions. It's pretty intimidating to stick your head above the parapet like this, so as an incentive I rewarded those that answered (rightly or wrongly) with some of the excellent stickers that Tarides has printed over the years. Everybody loves stickers!&lt;/p&gt; 190 - &lt;p&gt;The questions I asked varied quite a lot in their difficulty, and many were in the first few minutes of each lecture, where I had a short 'warm-up' where we recapped the contents of the previous lecture. These warm-ups were strongly suggested by Anil, and as well as reminding everyone of where we left off, they also gave me a bit of feedback on the things that the students found challenging.&lt;/p&gt; 191 - &lt;p&gt;One entertaining aspect is that during the first lecture I do actually encourage them to at least log on to the JupyterHub server, mostly to get them used to the idea of trying it. The entertaining part is that our server isn't particularly big and beefy, and so with 130 students all trying to log on at once, it invariably caves in under the load. 
At this point in the lecture I ssh to the server and run btop/htop and we watch it die in real time!&lt;/p&gt; 192 - &lt;h2 id=&quot;what-changed-this-year&quot;&gt;&lt;a href=&quot;#what-changed-this-year&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What changed this year&lt;/h2&gt; 193 - &lt;p&gt;During the lectures themselves, rather than use Keynote or PowerPoint for the slides, I decided to try using &lt;a href=&quot;https://slipshow.readthedocs.io/en/stable/&quot;&gt;Slipshow&lt;/a&gt;, augmented with &lt;a href=&quot;https://github.com/art-w/x-ocaml&quot;&gt;x-ocaml&lt;/a&gt; to embed executable OCaml code snippets. I'm very happy with how this worked out. I was able to prepare both working and broken snippets, modify them live during the lecture, and things like type-on-hover was very useful. In a few lectures where we were discussing big-O notation, I was able to run code on different input sizes and really demonstrate the big difference in run-time of certain algorithms. After the lectures, I posted the slides onto the course website so that students can refer back to them, and they can also try out the live code snippets directly in the slides.&lt;/p&gt; 194 - &lt;p&gt;Both Slipshow and x-ocaml are still quite young projects, so it was inevitable that there were a few rough edges, and in fact the interaction of the two revealed the biggest problem: that when you use the 'speaker-view' mode of Slipshow, where you have a separate window with notes and the current slide, the x-ocaml widgets are effectively independent in the two windows, so updating in one doesn't update in the other. &lt;a href=&quot;https://choum.net/panglesd/&quot;&gt;Paul-Elliot&lt;/a&gt;, the author of Slipshow, had already got a potential fix for this in the works when I spoke to him about it, so hopefully next time I use this I'll be able to have speaker notes on screen, instead of hand-written index cards! 
The x-ocaml project is a lot smaller than Slipshow, so I was able to use Claude to help me add functionality I needed, such as being able to programmatically highlight sections of the code.&lt;/p&gt; 195 - &lt;p&gt;Another new thing I tried this year was to go over 'tracing' of execution to help the students understand how programs run. We've always taught reduction steps in the course, which works well as it's only the last lecture where we introduce mutability, but it can quickly become unwieldy, and it can be challenging to do this all by hand. Tracing a function tells the runtime to log when function calls and returns happen, so you just need to call the function on your desired input, and you get a fully automatic trace of the execution. As it's only function calls and returns, it doesn't tell the full story, but alongside the handwritten reduction, it can help reassure students that they're on the right track. I ended up writing up a trace of a particularly complicated lazy-list evaluation using Slipshow and x-ocaml, which I posted &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/interleave_explanation.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt; 196 - &lt;h2 id=&quot;thoughts-for-next-year&quot;&gt;&lt;a href=&quot;#thoughts-for-next-year&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Thoughts for next year&lt;/h2&gt; 197 - &lt;p&gt;Overall I'm very happy with how the course went this year, though in some ways it did feel a little bit like the course finished just when it had started to get to the good stuff! There's a Tripos review process going on at the moment, so maybe we'll get to expand this course a bit in future years.&lt;/p&gt; 198 - &lt;p&gt;While the Slipshow+x-ocaml combination worked well, the fact that we ended up with two separate systems for executing OCaml wasn't ideal. 
I think it'd be a really nice project to investigate just how far we can push x-ocaml / Slipshow / some other web technology to have a true &amp;quot;serverless&amp;quot; experience so we can ditch the JupyterHub server entirely. By caching the x-ocaml 'execution' web worker in the browser, we could have a system that works fully offline, removing an annoyingly failure-prone single point of failure. Of course, we'd still need some way to do the assessed exercises, but that's a small point in a much larger problem: we really can't continue to ignore how LLMs are impacting the way that students are approaching these exercises in both positive and negative ways. To answer this properly, we need to think hard about what the purpose of these exercises is and look around to see what our &lt;a href=&quot;https://eecs.iisc.ac.in/people/prof-viraj-kumar/&quot;&gt;colleagues&lt;/a&gt; are doing &lt;a href=&quot;https://dl.acm.org/doi/10.1145/3724363.3729100&quot;&gt;in this space&lt;/a&gt;.&lt;/p&gt; 199 - &lt;p&gt;The slide decks themselves are fully open and available on the &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/&quot;&gt;course website&lt;/a&gt;:&lt;/p&gt; 200 - &lt;ol&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture1/lecture1.html&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture2/lecture2.html&quot;&gt;Recursion and Complexity&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture3/lecture3.html&quot;&gt;Lists and Polymorphism&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture4/lecture4.html&quot;&gt;More Lists and Making Change&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture5/lecture5.html&quot;&gt;Sorting&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a 
href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture6/lecture6.html&quot;&gt;Datatypes and Trees&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture7/lecture7.html&quot;&gt;Dictionaries and Functional Arrays&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture8/lecture8.html&quot;&gt;Currying&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture9/lecture9.html&quot;&gt;Sequences, or Lazy Lists&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture10/lecture10.html&quot;&gt;Search&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture11/lecture11.html&quot;&gt;Procedural Programming&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture12/lecture12.html&quot;&gt;Recap and Real World Use!&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><id>https://jon.recoil.org/blog/2025/11/foundations-of-computer-science.html</id><title type="text">Foundations of Computer Science</title><updated>2025-11-14T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Some results from the . This time I've run day10 on 144 or so commits from opam-repository to see how well the cache performs. 
The results are quite interesting.</summary><published>2025-09-23T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/caching-opam-solutions2.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;caching-opam-solutions---part-2&quot;&gt;&lt;a href=&quot;#caching-opam-solutions---part-2&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Caching opam solutions - part 2&lt;/h1&gt; 201 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-23&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 202 - &lt;p&gt;Some results from the &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/09/caching-opam-solutions&quot;&gt;previous post&lt;/span&gt;. This time I've run day10 on 144 or so commits from opam-repository to see how well the cache performs. The results are quite interesting.&lt;/p&gt; 203 - &lt;p&gt;First let's talk about the &amp;quot;examination map&amp;quot;. This is a map from package name to a list of other packages whose solutions should be recalculated if the package in question is altered. It's built by first looking at the packages that the solver asks about during the solution for a package, and then taking &lt;em&gt;all&lt;/em&gt; of the solutions, and 'inverting' the map, so for example, if both packages 'a' and 'b' ask about package 'c' during their solutions, then altering 'c' means that the solutions for both 'a' and 'b' need to be recalculated. The examination map entry for 'c' would then be &lt;code&gt;'a'; 'b'&lt;/code&gt;. 
We can plot the histogram of the sizes of each entry in the examination map:&lt;/p&gt; 204 - &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;Package Examiner Distribution Histogram&lt;/span&gt;&lt;/div&gt; 205 - &lt;p&gt;Some interesting features from these data:&lt;/p&gt; 206 - &lt;ul&gt;&lt;li&gt;The most common number of observers is 1, meaning that the package is not involved in the solution of any other package. There are approximately 2000 such packages.&lt;/li&gt;&lt;li&gt;Most packages (~80%) have fewer than 100 observers. This means that if we alter one of these packages, we only need to recalculate the solutions for fewer than 100 other packages.&lt;/li&gt;&lt;li&gt;A &lt;em&gt;very&lt;/em&gt; small number of packages are observed in all 4,400 solutions. This is actually a bit artificial, as the solver adds the ocaml-compiler package as an input to all solves to ensure we get the correct compiler version. There's another way to do this which would avoid this particular problem.&lt;/li&gt;&lt;li&gt;A small number of packages have a very large number of observers, around 3800. This mostly corresponds to &lt;code&gt;dune&lt;/code&gt; and its dependencies and associated packages. There are around 350 such packages, and any change to these means we need to recalculate most of the solutions.&lt;/li&gt;&lt;/ul&gt; 207 - &lt;p&gt;This last point doesn't mean that we actually &lt;em&gt;recompile&lt;/em&gt; 3,800 packages, just that we need to recalculate the solution, which might then lead to a cache hit of the layer and no actual compilation. However, recalculating the solutions of all of the packages takes (on my computer) around 10,000 seconds in total, or roughly 5 minutes of wall-clock time as I've got 32 threads.&lt;/p&gt; 208 - &lt;p&gt;However, if the package that's changed &lt;i&gt;isn't&lt;/i&gt; one of those 350 packages, then the number of solutions that need to be recalculated is dramatically reduced. 
I ran the logic over the last few weeks of commits to opam-repository, from commit &lt;code&gt;109398e2fd61803126becd398df0f1eabc9f3ca2&lt;/code&gt; of the 10th September up until commit &lt;code&gt;3f21ebe342ce440d9c9142ffe1185d8e5a326085&lt;/code&gt; from the 22nd. In this time there were 144 commits (counting only those from &lt;code&gt;git log --first-parent&lt;/code&gt;). Of these, only 4 resulted in a full resolve - the first commit, since obviously we have no cache at that point, the &lt;a href=&quot;https://github.com/ocaml/opam-repository/commit/40283204789e7116e1c99466de902cd565d121cf&quot;&gt;release of OCaml 5.4.0 beta2&lt;/a&gt; by &lt;a href=&quot;https://perso.quaesituri.org/florian.angeletti/&quot;&gt;Florian Angeletti&lt;/a&gt;, a fix of &lt;a href=&quot;https://github.com/ocaml/opam-repository/commit/6ef6813522b6ea29933f6451236a1639bdbaec61&quot;&gt;ocaml-base-compiler for MSVC&lt;/a&gt; by &lt;a href=&quot;https://www.dra27.uk/blog/&quot;&gt;David&lt;/a&gt; and a fix for &lt;a href=&quot;https://github.com/ocaml/opam-repository/commit/d141887ab0b4fc0836ad0787f1f806585a260bc8&quot;&gt;BER-OCaml&lt;/a&gt; by &lt;a href=&quot;https://www.cl.cam.ac.uk/~jdy22/&quot;&gt;Jeremy Yallop&lt;/a&gt;. 
Then 25 commits resulted in recalculating solutions for 3800 packages as they hit dune-adjacent packages, 5 commits resulted in recalculating between 100 and 300 packages and the remaining 110 commits resulted in recalculating fewer than 100 packages, the majority of which resulted in recalculating fewer than 5 packages.&lt;/p&gt; 209 - &lt;p&gt;Overall, at a rough estimate, this means that over this period, using this caching strategy gave us a 5x speedup in the solver!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/09/caching-opam-solutions2.html</id><title type="text">Caching opam solutions - part 2</title><updated>2025-09-23T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">This post is a brief write-up of a couple of bugs in odoc that I've been working on over the past 2 weeks. I was convinced at the start of this that I was actually fixing one bug, but although they bo...</summary><published>2025-09-22T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/odoc-bugs.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;odoc-bugs&quot;&gt;&lt;a href=&quot;#odoc-bugs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc bugs&lt;/h1&gt; 210 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-22&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 211 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;odoc.model&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 212 - &lt;p&gt;This post is a brief write-up of a couple of bugs in odoc that I've been working on over the past 2 weeks. I was convinced at the start of this that I was actually fixing one bug, but although they both had the same backtrace and similar immediate causes, they're actually quite different. 
They both involve &lt;em&gt;expansion&lt;/em&gt;, which is the process that odoc uses to work out the contents of a module from its expression - what allows you to see the contents of a module such as &lt;code&gt;module M = Map.Make(String)&lt;/code&gt;.&lt;/p&gt; 213 - &lt;h3 id=&quot;bug-930:-inline-destructive-substitutions&quot;&gt;&lt;a href=&quot;#bug-930:-inline-destructive-substitutions&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Bug 930: inline destructive substitutions&lt;/h3&gt; 214 - &lt;p&gt;Bug #930 in odoc is about a substitution problem:&lt;/p&gt; 215 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S1 = sig 259 + | Module : mod_ * mod_ target -&gt; artifact_kind 260 + | Page : page * page target -&gt; artifact_kind</code></pre></div> 261 + <p>This refactoring immediately removed a whole swathe of invalid combinations, making the code both safer and clearer. It's quite clear that Claude had no trouble understanding how GADTs work in OCaml, quite happily also using some existentials to pack them into lists and so on.</p> 262 + <h3 id="odd-behaviours"><a href="#odd-behaviours" class="anchor"></a>Odd behaviours</h3> 263 + <p>Sometimes Claude just went a little bit bananas. One annoyance that <i>repeatedly</i> occurred was that it would forget how to build and test the dune executable, despite clear instructions in <code>Claude.md</code>. Most of the time when it went wrong it would build dune, execute <code>dune clean</code>, then try to run the dune binary that it had just removed with the <code>clean</code>. 
Sometimes it would decide to use the bootstrap binary instead, which isn't rebuilt on every change, sometimes it would run the switch-installed dune binary, and on one occasion it tried to run <code>./configure &amp;&amp; make</code>!</p> 264 + <p>It would usually figure out eventually what the right thing to do was, but when you're waiting for it to complete so you can check what it's done these sorts of delays got a bit frustrating.</p> 265 + <h3 id="reflections"><a href="#reflections" class="anchor"></a>Reflections</h3> 266 + <p>At one point, I ran out of Claude credits (despite paying $100 a month or so), at about 6:20pm one evening, and it told me that I needed to wait until 7pm to carry on. I'd just got to the point when I needed to write a short bit of code rather than refactoring what was already there, and I realised that while it would take me maybe 10 mins, it would take Claude maybe 10 seconds. Now, it could just be that it was the end of a long day and I was running out of steam, but I was content to switch focus elsewhere for a bit to wait for my credits to reset before carrying on! The point being that for the small implementation that I was after, it would be possible for me to get Claude to do it, and to eyeball the result to make sure it was OK in less time than I would have been able to do it myself. But I absolutely wouldn't have trusted Claude to do it in an upstreamable way <b>without</b> looking at the result.</p> 267 + <p>Overall, it's clear that Claude will be an incredibly useful tool for working with software. It's unbelievably good at jumping into a new code-base and figuring things out quickly, but less good at producing high-quality code that can be directly submitted upstream (yet?) - at least, not that <b>I</b> would be comfortable submitting anyway. However, I think it's still a bit of an open question as to what the quality bar <em>should</em> be. 
If it builds correctly, passes the tests, looks <i>broadly</i> sensible and isn't on the critical path for performance, how much should we care about the line-to-line quality? <b>I</b> certainly care, but am I being old fashioned?</p> 268 + <p>I've submitted a <a href="https://github.com/ocaml/dune/pull/12995">PR with these changes</a> for review, and we'll see what happens there. I ended up squashing all of the commits into one, as the intermediate steps are very likely not useful. However, for historical interest, the branch on which I did most of the work is <a href="https://github.com/ocaml/dune/compare/main...jonludlam:dune:odoc3-global-sidebar">here</a>.</p>]]></content> 269 + </entry> 270 + <entry> 271 + <id>https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html</id> 272 + <title>An SVG is all you need</title> 273 + <published>2025-12-09T00:00:00Z</published> 274 + <updated>2025-12-09T00:00:00Z</updated> 275 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html"/> 276 + <summary>SVGs are pretty cool - vector graphics in a simple XML format. They are supported on just about every device and platform, are crisp on every display, and can have embedded scripts in to make them int...</summary> 277 + <content type="html"><![CDATA[<h1 id="an-svg-is-all-you-need"><a href="#an-svg-is-all-you-need" class="anchor"></a>An SVG is all you need</h1> 278 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-12-09</p></li></ul> 279 + <p>SVGs are pretty cool - vector graphics in a simple XML format. They are supported on just about every device and platform, are crisp on every display, and can have embedded scripts in to make them interactive. 
They're <a href="https://www.youtube.com/watch?v=4laPOtTRteI">way more capable</a> than many people realise, and I think we can capitalise on some of that unrealised potential.</p> 280 + <p>Anil's recent post <a href="https://anil.recoil.org/notes/principles-for-collective-knowledge">Four Ps for Building Massive Collective Knowledge Systems</a> got me thinking about the permanence of the experimentation that underlies our scientific papers. In my idealistic vision of how scientific publishing should work, each paper would be accompanied by a fully interactive environment where the reader could explore the data, rerun the experiments, tweak the parameters, and see how the results changed. Obviously we can't do this in the general case - some experiments are just too expensive or time-consuming to rerun on demand. But for many papers, especially in computer science, this is entirely feasible.</p> 281 + <p>That line of thought reminded me of a project I tackled about 20 years ago as a post-doc in the Department of Plant Sciences here in Cambridge. I was writing a paper on <a href="https://royalsocietypublishing.org/rsif/article/9/70/949/173/Applications-of-percolation-theory-to-fungal">synergy in fungal networks</a> and built a tiny SVG visualisation tool that let readers wander through the raw data captured from a real fungal network growing in a petri dish. I dug it up recently and was surprised (and delighted) to see that it still works perfectly in modern browsers - even though the original “cover page” suggested Firefox 1.5 or the Adobe SVG plug-in (!). Give it a spin; click the 'forward', 'back' and other buttons below the petri dish!</p> 282 + <div><a href="fungus.svg" class="img-link"><img src="fungus.svg" alt="fungus.svg"/></a></div> 283 + <p>And that, dear reader, is literally all you need. A completely self-contained SVG file can either fetch data from a versioned repository or embed the data directly, as the example does. 
It can process that data, generate visualisations, and render knobs and sliders for interactive exploration. No server-side magic required - everything runs client-side in the browser, served by a plain static web server, and very easy to share.</p> 284 + <p>How does it fit in with Anil's four Ps?</p> 285 + <ul><li>Permanence: SVGs can be assigned DOIs just like papers, blog posts, or datasets. The fact that the above SVG still works after two decades is a testament to the durability of the format.</li></ul> 286 + <ul><li>Provenance: Because SVG is plain text, it plays nicely with version control systems such as Git. When an SVG pulls in external data, the same provenance-tracking strategies Anil describes for datasets apply here as well.</li></ul> 287 + <ul><li>Permission: Once again, with the separation between the processing in the SVG and the data that it works on, the same permissioning models apply as for data in general.</li></ul> 288 + <ul><li>Placement: SVGs are <i>inherently</i> spatial; it's very easy, for example, to make beautiful <a href="https://stephanwagner.me/coding/blog/create-world-map-charts-with-svgmap#svgMapDemoGDP">world maps</a> with SVG.</li></ul> 289 + <p>The SVG above is only a visualisation tool for data; it doesn't really do any processing, but it certainly <i>could</i>. The biggest change that's happened over the 20 years since I wrote this is the <i>massive</i> increase in the computation power available in the browser. 
It would be entirely feasible to implement the entire data analysis pipeline for that paper in an SVG today, probably without even spinning up the fans on my laptop!</p> 290 + <p>So this is yet another tool in our ongoing effort to be able to effortlessly share and remix our work - added to the pile of Jupyter notebooks, <a href="https://digitalflapjack.com/blog/marimo/">Marimo notebooks</a>, the <a href="https://slipshow.readthedocs.io/en/stable/">slipshow</a>/<a href="https://github.com/art-w/x-ocaml/">x-ocaml</a> <a href="../11/foundations-of-computer-science.html" title="foundations-of-computer-science">combination</a>, <a href="https://patrick.sirref.org/weekly-2025-w45/index.xml">Patrick's take</a> on Jon Sterling's <a href="https://sr.ht/~jonsterling/forester/">Forester</a>, my own <a href="../../../notebooks/index.html" title="index">notebooks</a>, and many others - and this is a subset of what we're using just in our own group!</p>]]></content> 291 + </entry> 292 + <entry> 293 + <id>https://jon.recoil.org/blog/2025/11/foundations-of-computer-science.html</id> 294 + <title>Foundations of Computer Science</title> 295 + <published>2025-11-14T00:00:00Z</published> 296 + <updated>2025-11-14T00:00:00Z</updated> 297 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/11/foundations-of-computer-science.html"/> 298 + <summary>I recently completed lecturing the course to our newly arrived first-year computer scientists here at . This is the first time I've lectured this course, taking over from while he's on sabbatical. 
A...</summary> 299 + <content type="html"><![CDATA[<h1 id="foundations-of-computer-science"><a href="#foundations-of-computer-science" class="anchor"></a>Foundations of Computer Science</h1> 300 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-11-14</p></li></ul> 301 + <p>I recently completed lecturing the course <a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/">&quot;Foundations of Computer Science&quot;</a> to our newly arrived first-year computer scientists here at <a href="https://www.cam.ac.uk">Cambridge</a>. This is the first time I've lectured this course, taking over from <a href="https://anil.recoil.org/">Anil</a> while he's on sabbatical. Although I was very nervous indeed about it, I ended up really enjoying the experience - and I hope the students did too! This post is a little brain dump of my thoughts on how it went and how we might improve it for next year.</p> 302 + <h2 id="course-overview"><a href="#course-overview" class="anchor"></a>Course Overview</h2> 303 + <p>The course is 12 lectures long and has been lectured in a similar way since I myself was an undergraduate here, way back in 1996. There have been a few changes, not least of which is that back then it was in Standard ML rather than OCaml, but the core material has remained largely the same: lists, recursive functions, trees, higher-order functions, search and finally mutability. There are no prerequisites for the course, although all students have at least a maths A-level (or equivalent), and almost all of them have done some programming before, though the experience varies widely. 
Very few have done any functional programming, and even fewer have written any OCaml before.</p> 304 + <p>The notes for the course are distributed both in hard copy and also as an <a href="https://github.com/ocamllabs/focs-notebooks/blob/main/1A%20Foundations%20of%20Computer%20Science.ipynb">interactive Jupyter Notebook</a>, which we host on our <a href="https://hub.cl.cam.ac.uk">JupyterHub server</a> that I maintain. The idea is that the students can read through the notes and then play around with the code examples directly in the notebook. I don't encourage them or give them time to do much <i>during</i> the lectures - not that I think this is a terrible idea, but it's a struggle to fit all the material in otherwise! The notes are pretty closely coupled to the lectures, organised into 11 chapters that correspond to the first 11 lectures, with exercises at the end of each chapter that are intended to be covered in the supervisions. We also have some assessed exercises - &quot;Ticks&quot; - that the students complete in their own time using the JupyterHub server using <a href="https://github.com/jupyter/nbgrader">nbgrader</a>. They are automatically assessed in a very transparent way; each &quot;tick&quot; is a Jupyter notebook with editable answer cells and read-only test cells. Overall we're aiming for the students not to <i>have</i> to install OCaml locally at all, though I hope many of them will choose to do so anyway.</p> 305 + <p>While I didn't want them playing around with the notebook during the lectures, I do, however, try to get them to interact by getting them to answer questions. It's pretty intimidating to stick your head above the parapet like this, so as an incentive I rewarded those that answered (rightly or wrongly) with some of the excellent stickers that Tarides has printed over the years. 
Everybody loves stickers!</p> 306 + <p>The questions I asked varied quite a lot in their difficulty, and many were in the first few minutes of each lecture, where I had a short 'warm-up' where we recapped the contents of the previous lecture. These warm-ups were strongly suggested by Anil, and as well as reminding everyone of where we left off, they also gave me a bit of feedback on the things that the students found challenging.</p> 307 + <p>One entertaining aspect is that during the first lecture I do actually encourage them to at least log on to the JupyterHub server, mostly to get them used to the idea of trying it. The entertaining part is that our server isn't particularly big and beefy, and so with 130 students all trying to log on at once, it invariably caves in under the load. At this point in the lecture I ssh to the server and run btop/htop and we watch it die in real time!</p> 308 + <h2 id="what-changed-this-year"><a href="#what-changed-this-year" class="anchor"></a>What changed this year</h2> 309 + <p>During the lectures themselves, rather than use Keynote or PowerPoint for the slides, I decided to try using <a href="https://slipshow.readthedocs.io/en/stable/">Slipshow</a>, augmented with <a href="https://github.com/art-w/x-ocaml">x-ocaml</a> to embed executable OCaml code snippets. I'm very happy with how this worked out. I was able to prepare both working and broken snippets, modify them live during the lecture, and features like type-on-hover were very useful. In a few lectures where we were discussing big-O notation, I was able to run code on different input sizes and really demonstrate the big difference in run-time of certain algorithms. 
After the lectures, I posted the slides onto the course website so that students can refer back to them, and they can also try out the live code snippets directly in the slides.</p> 310 + <p>Both Slipshow and x-ocaml are still quite young projects, so it was inevitable that there were a few rough edges, and in fact the interaction of the two revealed the biggest problem: that when you use the 'speaker-view' mode of Slipshow, where you have a separate window with notes and the current slide, the x-ocaml widgets are effectively independent in the two windows, so updating in one doesn't update in the other. <a href="https://choum.net/panglesd/">Paul-Elliot</a>, the author of Slipshow, had already got a potential fix for this in the works when I spoke to him about it, so hopefully next time I use this I'll be able to have speaker notes on screen, instead of hand-written index cards! The x-ocaml project is a lot smaller than Slipshow, so I was able to use Claude to help me add functionality I needed, such as being able to programmatically highlight sections of the code.</p> 311 + <p>Another new thing I tried this year was to go over 'tracing' of execution to help the students understand how programs run. We've always taught reduction steps in the course, which works well as it's only the last lecture where we introduce mutability, but it can quickly become unwieldy, and it can be challenging to do this all by hand. Tracing a function tells the runtime to log when function calls and returns happen, so you just need to call the function on your desired input, and you get a fully automatic trace of the execution. As it's only function calls and returns, it doesn't tell the full story, but alongside the handwritten reduction, it can help reassure students that they're on the right track. 
I ended up writing up a trace of a particularly complicated lazy-list evaluation using Slipshow and x-ocaml, which I posted <a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/interleave_explanation.html">here</a>.</p> 312 + <h2 id="thoughts-for-next-year"><a href="#thoughts-for-next-year" class="anchor"></a>Thoughts for next year</h2> 313 + <p>Overall I'm very happy with how the course went this year, though in some ways it did feel a little bit like the course finished just when it had started to get to the good stuff! There's a Tripos review process going on at the moment, so maybe we'll get to expand this course a bit in future years.</p> 314 + <p>While the Slipshow+x-ocaml combination worked well, the fact that we ended up with two separate systems for executing OCaml wasn't ideal. I think it'd be a really nice project to investigate just how far we can push x-ocaml / Slipshow / some other web technology to have a true &quot;serverless&quot; experience so we can ditch the JupyterHub server entirely. By caching the x-ocaml 'execution' web worker in the browser, we could have a system that works fully offline, removing an annoyingly failure-prone single point of failure. Of course, we'd still need some way to do the assessed exercises, but that's a small point in a much larger problem: we really can't continue to ignore how LLMs are impacting the way that students are approaching these exercises in both positive and negative ways. 
To answer this properly, we need to think hard about what the purpose of these exercises is and look around to see what our <a href="https://eecs.iisc.ac.in/people/prof-viraj-kumar/">colleagues</a> are doing <a href="https://dl.acm.org/doi/10.1145/3724363.3729100">in this space</a>.</p> 315 + <p>The slide decks themselves are fully open and available on the <a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/">course website</a>:</p> 316 + <ol><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture1/lecture1.html">Introduction</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture2/lecture2.html">Recursion and Complexity</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture3/lecture3.html">Lists and Polymorphism</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture4/lecture4.html">More Lists and Making Change</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture5/lecture5.html">Sorting</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture6/lecture6.html">Datatypes and Trees</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture7/lecture7.html">Dictionaries and Functional Arrays</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture8/lecture8.html">Currying</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture9/lecture9.html">Sequences, or Lazy Lists</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture10/lecture10.html">Search</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture11/lecture11.html">Procedural Programming</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture12/lecture12.html">Recap and Real World Use!</a></li></ol>]]></content> 317 + </entry> 318 + <entry> 319 + <id>https://jon.recoil.org/blog/2025/09/caching-opam-solutions2.html</id> 320 + <title>Caching opam solutions - part 2</title> 
321 + <published>2025-09-23T00:00:00Z</published> 322 + <updated>2025-09-23T00:00:00Z</updated> 323 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/caching-opam-solutions2.html"/> 324 + <summary>Some results from the . This time I've run day10 on 144 or so commits from opam-repository to see how well the cache performs. The results are quite interesting.</summary> 325 + <content type="html"><![CDATA[<h1 id="caching-opam-solutions---part-2"><a href="#caching-opam-solutions---part-2" class="anchor"></a>Caching opam solutions - part 2</h1> 326 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-23</p></li></ul> 327 + <p>Some results from the <a href="caching-opam-solutions.html" title="caching-opam-solutions">previous post</a>. This time I've run day10 on 144 or so commits from opam-repository to see how well the cache performs. The results are quite interesting.</p> 328 + <p>First let's talk about the &quot;examination map&quot;. This is a map from package name to a list of other packages whose solutions should be recalculated if the package in question is altered. It's built by first looking at the packages that the solver asks about during the solution for a package, and then taking <em>all</em> of the solutions, and 'inverting' the map, so for example, if both packages 'a' and 'b' ask about package 'c' during their solutions, then altering 'c' means that the solutions for both 'a' and 'b' need to be recalculated. The examination map entry for 'c' would then be <code>'a'; 'b'</code>. 
We can plot the histogram of the sizes of each entry in the examination map:</p> 329 + <div><a href="examination_map_histogram.svg" class="img-link"><img src="examination_map_histogram.svg" alt="Package Examiner Distribution Histogram"/></a></div> 330 + <p>Some interesting features from these data:</p> 331 + <ul><li>The most common number of observers is 1, meaning that the package is not involved in the solution of any other package. There are approximately 2000 such packages.</li><li>Most (~80%) of packages have fewer than 100 observers. This means that if we alter one of these packages, we only need to recalculate the solutions for fewer than 100 other packages.</li><li>A <em>very</em> small number of packages are observed in all 4,400 solutions. This is actually a bit artificial, as the solver adds the ocaml-compiler package as an input to all solves to ensure we get the correct compiler version. There's another way to do this which would avoid this particular problem.</li><li>A small number of packages have a very large number of observers, around 3800. This mostly corresponds with <code>dune</code> and its dependencies and associated packages. There are around 350 such packages, and any change to these means we need to recalculate most of the solutions.</li></ul> 332 + <p>This last point doesn't mean that we actually <em>recompile</em> 3,800 packages, just that we need to recalculate the solution, which might then lead to a cache hit of the layer and no actual compilation. However, recalculating the solutions of all of the packages takes (on my computer) around 10,000 seconds, or roughly 5 minutes of wall-clock time as I've got 32 threads.</p> 333 + <p>However, if the package that changes <i>isn't</i> one of those 350 packages, then the number of solutions that need to be recalculated is dramatically reduced. 
I ran the logic over the last few weeks of commits to opam-repository, from commit <code>109398e2fd61803126becd398df0f1eabc9f3ca2</code> of the 10th September up until commit <code>3f21ebe342ce440d9c9142ffe1185d8e5a326085</code> from the 22nd. In this time there were 144 commits (counting only those from <code>git log --first-parent</code>). Of these, only 4 resulted in a full resolve - the first commit, since obviously we have no cache at that point, the <a href="https://github.com/ocaml/opam-repository/commit/40283204789e7116e1c99466de902cd565d121cf">release of OCaml 5.4.0 beta2</a> by <a href="https://perso.quaesituri.org/florian.angeletti/">Florian Angeletti</a>, a fix of <a href="https://github.com/ocaml/opam-repository/commit/6ef6813522b6ea29933f6451236a1639bdbaec61">ocaml-base-compiler for MSVC</a> by <a href="https://www.dra27.uk/blog/">David</a> and a fix for <a href="https://github.com/ocaml/opam-repository/commit/d141887ab0b4fc0836ad0787f1f806585a260bc8">BER-OCaml</a> by <a href="https://www.cl.cam.ac.uk/~jdy22/">Jeremy Yallop</a>. Then 25 commits resulted in recalculating solutions for 3800 packages as they hit dune-adjacent packages, 5 commits resulted in recalculating between 100 and 300 packages and the remaining 110 commits resulted in recalculating fewer than 100 packages, the majority of which resulted in recalculating fewer than 5 packages.</p> 334 + <p>Overall, at a rough estimate, this means that over this period, using this caching strategy gave us a 5x speedup in the solver!</p>]]></content> 335 + </entry> 336 + <entry> 337 + <id>https://jon.recoil.org/blog/2025/09/odoc-bugs.html</id> 338 + <title>Odoc bugs</title> 339 + <published>2025-09-22T00:00:00Z</published> 340 + <updated>2025-09-22T00:00:00Z</updated> 341 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/odoc-bugs.html"/> 342 + <summary>This post is a brief write-up of a couple of bugs in odoc that I've been working on over the past 2 weeks. 
I was convinced at the start of this that I was actually fixing one bug, but although they bo...</summary> 343 + <content type="html"><![CDATA[<h1 id="odoc-bugs"><a href="#odoc-bugs" class="anchor"></a>Odoc bugs</h1> 344 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-22</p></li></ul> 345 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>odoc.model</p></li></ul> 346 + <p>This post is a brief write-up of a couple of bugs in odoc that I've been working on over the past 2 weeks. I was convinced at the start of this that I was actually fixing one bug, but although they both had the same backtrace and similar immediate causes, they're actually quite different. They both involve <em>expansion</em>, which is the process that odoc uses to work out the contents of a module from its expression - what allows you to see the contents of a module such as <code>module M = Map.Make(String)</code>.</p> 347 + <h3 id="bug-930:-inline-destructive-substitutions"><a href="#bug-930:-inline-destructive-substitutions" class="anchor"></a>Bug 930: inline destructive substitutions</h3> 348 + <p>Bug #930 in odoc is about a substitution problem:</p> 349 + <div><pre class="language-ocaml"><code>module type S1 = sig 216 350 type t0 217 351 type 'a t := unit 218 352 ··· 229 363 type t1 230 364 231 365 include S2 with type t := t1 232 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 233 - &lt;p&gt;which when processed by odoc 2.4 throws an exception:&lt;/p&gt; 234 - &lt;pre&gt;odoc: internal error, uncaught exception: 235 - Invalid_argument(&amp;quot;List.fold_left2&amp;quot;) 236 - Raised at Stdlib.invalid_arg in file &amp;quot;stdlib.ml&amp;quot;, line 33, characters 20-45 237 - Called from Odoc_xref2__Subst.type_expr in file &amp;quot;subst.ml&amp;quot;, line 598, characters 21-59 238 - Called from Odoc_xref2__Subst.value in file &amp;quot;subst.ml&amp;quot; (inlined), line 842, characters 19-38 239 - Called 
from Odoc_xref2__Subst.apply_sig_map.inner.(fun) in file &amp;quot;subst.ml&amp;quot;, line 1089, characters 19-52 240 - Called from Odoc_xref2__Component.Delayed.get in file &amp;quot;component.ml&amp;quot; (inlined), line 55, characters 16-22 241 - Called from Odoc_xref2__Lang_of.signature_items.inner in file &amp;quot;lang_of.ml&amp;quot;, line 438, characters 16-39 242 - Called from Odoc_xref2__Lang_of.signature in file &amp;quot;lang_of.ml&amp;quot; (inlined), line 466, characters 12-43 243 - Called from Odoc_xref2__Lang_of.include_ in file &amp;quot;lang_of.ml&amp;quot;, line 641, characters 18-69&lt;/pre&gt; 244 - &lt;p&gt;The key thing here is the definition of &lt;code&gt;'a t&lt;/code&gt; in &lt;code&gt;S1&lt;/code&gt; - a destructive substitution. If you type this code into an OCaml toplevel, you will see that the signature of &lt;code&gt;S1&lt;/code&gt; is:&lt;/p&gt; 245 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S1 = sig 366 + end</code></pre></div> 367 + <p>which when processed by odoc 2.4 throws an exception:</p> 368 + <pre>odoc: internal error, uncaught exception: 369 + Invalid_argument(&quot;List.fold_left2&quot;) 370 + Raised at Stdlib.invalid_arg in file &quot;stdlib.ml&quot;, line 33, characters 20-45 371 + Called from Odoc_xref2__Subst.type_expr in file &quot;subst.ml&quot;, line 598, characters 21-59 372 + Called from Odoc_xref2__Subst.value in file &quot;subst.ml&quot; (inlined), line 842, characters 19-38 373 + Called from Odoc_xref2__Subst.apply_sig_map.inner.(fun) in file &quot;subst.ml&quot;, line 1089, characters 19-52 374 + Called from Odoc_xref2__Component.Delayed.get in file &quot;component.ml&quot; (inlined), line 55, characters 16-22 375 + Called from Odoc_xref2__Lang_of.signature_items.inner in file &quot;lang_of.ml&quot;, line 438, characters 16-39 376 + Called from Odoc_xref2__Lang_of.signature in file &quot;lang_of.ml&quot; (inlined), line 466, characters 12-43 377 + Called from 
Odoc_xref2__Lang_of.include_ in file &quot;lang_of.ml&quot;, line 641, characters 18-69</pre> 378 + <p>The key thing here is the definition of <code>'a t</code> in <code>S1</code> - a destructive substitution. If you type this code into an OCaml toplevel, you will see that the signature of <code>S1</code> is:</p> 379 + <div><pre class="language-ocaml"><code>module type S1 = sig 246 380 type t0 247 381 type 'a t := unit 248 382 249 383 val x : t0 t 250 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 251 - &lt;p&gt;where the substitution has clearly taken place. In contrast, odoc takes the position that the use of these inline destructive substitutions is to make the code easier to understand, and so it tries to keep them in the signature rather than simply apply them and present the resulting signature. So when rendering &lt;code&gt;S1&lt;/code&gt; we end up with:&lt;/p&gt; 384 + end</code></pre></div> 385 + <p>where the substitution has clearly taken place. In contrast, odoc takes the position that the use of these inline destructive substitutions is to make the code easier to understand, and so it tries to keep them in the signature rather than simply apply them and present the resulting signature. 
So when rendering <code>S1</code> we end up with:</p> 252 386 253 - &lt;div class=&quot;inset&quot; style=&quot;border: 1px solid var(--pre-border-color); padding: 10px; border-radius: 5px&quot;&gt; 254 - &lt;a id=&quot;module-type-S1&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;h2&gt;Module type &lt;code&gt;&lt;span&gt;S1&lt;/span&gt;&lt;/code&gt;&lt;/h2&gt; 255 - &lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type anchored&quot; id=&quot;type-t0&quot;&gt;&lt;a href=&quot;#type-t0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; t0&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type subst anchored&quot; id=&quot;type-t&quot;&gt;&lt;a href=&quot;#type-t&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; &lt;span&gt;'a t&lt;/span&gt;&lt;/span&gt;&lt;span&gt; := unit&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec value anchored&quot; id=&quot;val-x&quot;&gt;&lt;a href=&quot;#val-x&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;val&lt;/span&gt; x : &lt;span&gt;&lt;a href=&quot;#type-t0&quot;&gt;t0&lt;/a&gt; &lt;a href=&quot;#type-t&quot;&gt;t&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt; 256 - &lt;/div&gt; 387 + <div class="inset" style="border: 1px solid var(--pre-border-color); padding: 10px; border-radius: 5px"> 388 + <a id="module-type-S1" class="anchor"></a><h2>Module type <code><span>S1</span></code></h2> 389 + <div class="odoc-spec"><div class="spec type anchored" id="type-t0"><a href="#type-t0" class="anchor"></a><code><span><span class="keyword">type</span> t0</span></code></div></div><div class="odoc-spec"><div class="spec type subst anchored" id="type-t"><a href="#type-t" 
class="anchor"></a><code><span><span class="keyword">type</span> <span>'a t</span></span><span> := unit</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-x"><a href="#val-x" class="anchor"></a><code><span><span class="keyword">val</span> x : <span><a href="#type-t0">t0</a> <a href="#type-t">t</a></span></span></code></div></div> 390 + </div> 257 391 258 - &lt;p&gt;The reported problem is a failure with a stack trace while processing &lt;code&gt;S3&lt;/code&gt;, but upon looking closely the real problem has happened when expanding &lt;code&gt;S2&lt;/code&gt;. What happens is that we have a type &lt;code&gt;t&lt;/code&gt; defined in &lt;code&gt;S2&lt;/code&gt; and a type &lt;code&gt;t&lt;/code&gt; that will later be substituted away that comes from the inclusion of &lt;code&gt;S1&lt;/code&gt;. The rendered signature of &lt;code&gt;S2&lt;/code&gt; is:&lt;/p&gt; 392 + <p>The reported problem is a failure with a stack trace while processing <code>S3</code>, but upon looking closely the real problem has happened when expanding <code>S2</code>. What happens is that we have a type <code>t</code> defined in <code>S2</code> and a type <code>t</code> that will later be substituted away that comes from the inclusion of <code>S1</code>. 
The rendered signature of <code>S2</code> is:</p> 259 393 260 - &lt;div class=&quot;inset&quot; style=&quot;border: 1px solid var(--pre-border-color); padding: 10px; padding-right:30px; border-radius: 5px&quot;&gt; 261 - &lt;a id=&quot;module-type-S2&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;h2&gt;Module type &lt;code&gt;&lt;span&gt;S2&lt;/span&gt;&lt;/code&gt;&lt;/h2&gt; 262 - &lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type anchored&quot; id=&quot;type-s2-t&quot;&gt;&lt;a href=&quot;#type-s2-t&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; t&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-include&quot;&gt;&lt;details open=&quot;open&quot;&gt;&lt;summary class=&quot;spec include&quot;&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;include&lt;/span&gt; &lt;a href=&quot;#module-type-S1&quot;&gt;S1&lt;/a&gt; &lt;span class=&quot;keyword&quot;&gt;with&lt;/span&gt; &lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; &lt;a href=&quot;#type-t0&quot;&gt;t0&lt;/a&gt; := &lt;a href=&quot;#type-s2-t&quot;&gt;t&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/summary&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type subst anchored&quot; id=&quot;type-s2-t&quot;&gt;&lt;a href=&quot;#type-s2-t&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; &lt;span&gt;'a t&lt;/span&gt;&lt;/span&gt;&lt;span&gt; := unit&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec value anchored&quot; id=&quot;val-x&quot;&gt;&lt;a href=&quot;#val-x&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;val&lt;/span&gt; x : &lt;span&gt;&lt;a href=&quot;#type-s2-t&quot;&gt;t&lt;/a&gt; &lt;a 
href=&quot;#type-s2-t&quot;&gt;t&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;/details&gt;&lt;/div&gt; 263 - &lt;/div&gt; 394 + <div class="inset" style="border: 1px solid var(--pre-border-color); padding: 10px; padding-right:30px; border-radius: 5px"> 395 + <a id="module-type-S2" class="anchor"></a><h2>Module type <code><span>S2</span></code></h2> 396 + <div class="odoc-spec"><div class="spec type anchored" id="type-s2-t"><a href="#type-s2-t" class="anchor"></a><code><span><span class="keyword">type</span> t</span></code></div></div><div class="odoc-include"><details open="open"><summary class="spec include"><code><span><span class="keyword">include</span> <a href="#module-type-S1">S1</a> <span class="keyword">with</span> <span><span class="keyword">type</span> <a href="#type-t0">t0</a> := <a href="#type-s2-t">t</a></span></span></code></summary><div class="odoc-spec"><div class="spec type subst anchored" id="type-s2-t"><a href="#type-s2-t" class="anchor"></a><code><span><span class="keyword">type</span> <span>'a t</span></span><span> := unit</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-x"><a href="#val-x" class="anchor"></a><code><span><span class="keyword">val</span> x : <span><a href="#type-s2-t">t</a> <a href="#type-s2-t">t</a></span></span></code></div></div></details></div> 397 + </div> 264 398 265 - &lt;p&gt;where the type of &lt;code&gt;x&lt;/code&gt; is now &lt;code&gt;t t&lt;/code&gt;, which is clearly incorrect. The problem is that odoc assumes that type names are unique within a signature (modulo shadowing, which isn't quite what's going on here), but in this signature there are two definitions of &lt;code&gt;type t&lt;/code&gt;, one of which is parameterised and one is not. 
At this point nothing fatal has happened, but when we try to process &lt;code&gt;S3&lt;/code&gt; the substitution code gets very confused by these different arities and &lt;code&gt;List.fold_left2&lt;/code&gt; throws the above exception.&lt;/p&gt; 266 - &lt;p&gt;The fix I'm trialling for this is that when we're including a signature that contains an inline destructive substitution, we will perform that substitution when the expansion of the include is done. This means that the rendered signature of &lt;code&gt;S1&lt;/code&gt; will be just the same as before, but the rendered signature of &lt;code&gt;S2&lt;/code&gt; will now be:&lt;/p&gt; 399 + <p>where the type of <code>x</code> is now <code>t t</code>, which is clearly incorrect. The problem is that odoc assumes that type names are unique within a signature (modulo shadowing, which isn't quite what's going on here), but in this signature there are two definitions of <code>type t</code>, one of which is parameterised and one is not. At this point nothing fatal has happened, but when we try to process <code>S3</code> the substitution code gets very confused by these different arities and <code>List.fold_left2</code> throws the above exception.</p> 400 + <p>The fix I'm trialling for this is that when we're including a signature that contains an inline destructive substitution, we will perform that substitution when the expansion of the include is done. 
This means that the rendered signature of <code>S1</code> will be just the same as before, but the rendered signature of <code>S2</code> will now be:</p> 267 401 268 - &lt;div class=&quot;inset&quot; style=&quot;border: 1px solid var(--pre-border-color); padding: 10px; padding-right:30px; border-radius: 5px&quot;&gt; 269 - &lt;a id=&quot;module-type-newS2&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;h2&gt;Module type &lt;code&gt;&lt;span&gt;S2&lt;/span&gt;&lt;/code&gt;&lt;/h2&gt; 270 - &lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type anchored&quot; id=&quot;type-s2new-t&quot;&gt;&lt;a href=&quot;#type-s2new-t&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; t&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-include&quot;&gt;&lt;details open=&quot;open&quot;&gt;&lt;summary class=&quot;spec include&quot;&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;include&lt;/span&gt; &lt;a href=&quot;#module-type-S1&quot;&gt;S1&lt;/a&gt; &lt;span class=&quot;keyword&quot;&gt;with&lt;/span&gt; &lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; &lt;a href=&quot;#type-t0&quot;&gt;t0&lt;/a&gt; := &lt;a href=&quot;#type-s2new-t&quot;&gt;t&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/summary&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec value anchored&quot; id=&quot;val-x&quot;&gt;&lt;a href=&quot;#val-x&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;val&lt;/span&gt; x : unit&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;/details&gt;&lt;/div&gt; 271 - &lt;/div&gt; 402 + <div class="inset" style="border: 1px solid var(--pre-border-color); padding: 10px; padding-right:30px; border-radius: 5px"> 403 + <a id="module-type-newS2" class="anchor"></a><h2>Module type <code><span>S2</span></code></h2> 404 + <div class="odoc-spec"><div class="spec 
type anchored" id="type-s2new-t"><a href="#type-s2new-t" class="anchor"></a><code><span><span class="keyword">type</span> t</span></code></div></div><div class="odoc-include"><details open="open"><summary class="spec include"><code><span><span class="keyword">include</span> <a href="#module-type-S1">S1</a> <span class="keyword">with</span> <span><span class="keyword">type</span> <a href="#type-t0">t0</a> := <a href="#type-s2new-t">t</a></span></span></code></summary><div class="odoc-spec"><div class="spec value anchored" id="val-x"><a href="#val-x" class="anchor"></a><code><span><span class="keyword">val</span> x : unit</span></code></div></div></details></div> 405 + </div> 272 406 273 - &lt;p&gt;where the type of &lt;code&gt;x&lt;/code&gt; is now simply &lt;code&gt;unit&lt;/code&gt;, which is what OCaml itself thinks, happily! I think this strikes the balance between keeping the substitutions visible for clarity where they are originally defined, but when including them elsewhere we simply see the resulting signature.&lt;/p&gt; 274 - &lt;h3 id=&quot;bug-#1385:-exception-raised-during-compilation&quot;&gt;&lt;a href=&quot;#bug-#1385:-exception-raised-during-compilation&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Bug #1385: Exception raised during compilation&lt;/h3&gt; 275 - &lt;p&gt;The second bug has the identical backtrace, indicating a problem with arities. However, the repro case for this one does not involve any inline destructive substitution, though it does involve destructive substitution at the module expression level:&lt;/p&gt; 276 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type Creators_base = sig 407 + <p>where the type of <code>x</code> is now simply <code>unit</code>, which is what OCaml itself thinks, happily! 
I think this strikes the balance between keeping the substitutions visible for clarity where they are originally defined, but when including them elsewhere we simply see the resulting signature.</p> 408 + <h3 id="bug-#1385:-exception-raised-during-compilation"><a href="#bug-#1385:-exception-raised-during-compilation" class="anchor"></a>Bug #1385: Exception raised during compilation</h3> 409 + <p>The second bug has the identical backtrace, indicating a problem with arities. However, the repro case for this one does not involve any inline destructive substitution, though it does involve destructive substitution at the module expression level:</p> 410 + <div><pre class="language-ocaml"><code>module type Creators_base = sig 277 411 type ('a, _, _) t 278 412 type (_, _, _) concat 279 413 280 414 include sig 281 415 type ('a, 'b, 'c) t 282 416 283 - val concat : (('a, 'p1, 'p2) t, 'p1, 'p2) concat -&amp;gt; ('a, 'p1, 'p2) t 417 + val concat : (('a, 'p1, 'p2) t, 'p1, 'p2) concat -&gt; ('a, 'p1, 'p2) t 284 418 end 285 419 with type ('a, 'b, 'c) t := ('a, 'b, 'c) t 286 420 end ··· 289 423 type t 290 424 291 425 include Creators_base with type ('a, _, _) t := t and type ('a, _, _) concat := t 292 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 293 - &lt;p&gt;There's quite a lot of type parameters flying around here, so the first step was to try to simplify this as much as possible while still getting the exception. I got it down to:&lt;/p&gt; 294 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type Creators_base = sig 426 + end</code></pre></div> 427 + <p>There's quite a lot of type parameters flying around here, so the first step was to try to simplify this as much as possible while still getting the exception. 
I got it down to:</p> 428 + <div><pre class="language-ocaml"><code>module type Creators_base = sig 295 429 type 'a t 296 430 type _ concat 297 431 298 432 include sig 299 433 type 'a t 300 434 301 - val concat : 'a concat -&amp;gt; 'a t 435 + val concat : 'a concat -&gt; 'a t 302 436 end 303 437 with type 'a t := 'a t 304 438 end ··· 307 441 type t 308 442 309 443 include Creators_base with type _ t := t with type _ concat := t 310 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 311 - &lt;p&gt;which still throws the same exception. So, what's going on here? Fundamentally, it's a similar issue to the first bug, just caused in a different way, in that once again we'll end up with a signature that has two definitions of &lt;code&gt;type t&lt;/code&gt; with different arities. In this case, the problem occurs during the expansion of &lt;code&gt;S0_with_creators_base&lt;/code&gt;.&lt;/p&gt; 312 - &lt;p&gt;This is the intermediate expansion of &lt;code&gt;Creators_base&lt;/code&gt; that odoc calculates:&lt;/p&gt; 313 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S0_with_creators_base = sig 444 + end</code></pre></div> 445 + <p>which still throws the same exception. So, what's going on here? Fundamentally, it's a similar issue to the first bug, just caused in a different way, in that once again we'll end up with a signature that has two definitions of <code>type t</code> with different arities. 
In this case, the problem occurs during the expansion of <code>S0_with_creators_base</code>.</p> 446 + <p>This is the intermediate expansion of <code>Creators_base</code> that odoc calculates:</p> 447 + <div><pre class="language-ocaml"><code>module type S0_with_creators_base = sig 314 448 type t 315 449 316 450 include Creators_base with type _ t := t with type _ concat := t (* ··· 319 453 320 454 include sig 321 455 type 'a t 322 - val concat : t -&amp;gt; 'a t 456 + val concat : t -&gt; 'a t 323 457 end 324 458 *) 325 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 326 - &lt;p&gt;What's happened here is during the calculation of the body of the include, odoc has taken the signature of &lt;code&gt;Creators_base&lt;/code&gt; and has its two type definitions both replaced with &lt;code&gt;type t&lt;/code&gt; (with no parameters). However, since the &lt;code&gt;type t&lt;/code&gt; in the body of the include is defined in that signature, that one wasn't replaced. So we end up with the type of &lt;code&gt;concat&lt;/code&gt; being &lt;code&gt;t -&amp;gt; 'a t&lt;/code&gt;, which looks very odd! At this point though, odoc knows very well that they're different types. However, when odoc converts this signature back into the datatype that represents the expansions, it loses that information and we end up with the two types mixed up. We then go on to process this signature, the mixup of the arities causes the failure.&lt;/p&gt; 327 - &lt;p&gt;There are several independent fixes that we can make here. Firstly we can make sure that we don't mix up the types. This we can do because we can distinguish between items that are declared within the signature of the include's declaration and those that come from the outer context. We don't have to do this for the expansion of the include as OCaml's type system means that there can't be two types of the same name in the resulting signature. 
We never actually render any signature that occurs within the body of an include, so this doesn't actually make any difference to the output.&lt;/p&gt; 328 - &lt;p&gt;The second fix is to make sure that we only calculate the expansion of the include once. Currently the bug happens because we try to re-calculate the expansion of the &lt;code&gt;include sig ... end&lt;/code&gt; expression, even though we calculated it during the processing of &lt;code&gt;S0_with_creators_base&lt;/code&gt;. What we should do instead is apply the substitutions to the expansion of that calculated include, which would end up with the same result. This isn't a perfect solution though, as there are occasions when we have to recalculate the signature anyway.&lt;/p&gt; 329 - &lt;p&gt;The third fix is - and this takes a little care to parse - to ensure that we never actually try to process the items within a signature within a &amp;quot;with&amp;quot; expression within a module-type expression. Before diving into the 'why' of this, let's first explain how Odoc represents module-type expressions.&lt;/p&gt; 330 - &lt;p&gt;Internally, we have &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang.ModuleType.expr&quot;&gt;a datatype&lt;/span&gt; that represents module expressions, which looks like this:&lt;/p&gt; 331 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type expr = 459 + end</code></pre></div> 460 + <p>What's happened here is during the calculation of the body of the include, odoc has taken the signature of <code>Creators_base</code> and has its two type definitions both replaced with <code>type t</code> (with no parameters). However, since the <code>type t</code> in the body of the include is defined in that signature, that one wasn't replaced. So we end up with the type of <code>concat</code> being <code>t -&gt; 'a t</code>, which looks very odd! At this point though, odoc knows very well that they're different types. 
However, when odoc converts this signature back into the datatype that represents the expansions, it loses that information and we end up with the two types mixed up. When we then go on to process this signature, the mix-up of the arities causes the failure.</p> 461 + <p>There are several independent fixes that we can make here. Firstly we can make sure that we don't mix up the types. This we can do because we can distinguish between items that are declared within the signature of the include's declaration and those that come from the outer context. We don't have to do this for the expansion of the include as OCaml's type system means that there can't be two types of the same name in the resulting signature. We never actually render any signature that occurs within the body of an include, so this doesn't actually make any difference to the output.</p> 462 + <p>The second fix is to make sure that we only calculate the expansion of the include once. Currently the bug happens because we try to re-calculate the expansion of the <code>include sig ... end</code> expression, even though we calculated it during the processing of <code>S0_with_creators_base</code>. What we should do instead is apply the substitutions to the expansion of that calculated include, which would end up with the same result. This isn't a perfect solution though, as there are occasions when we have to recalculate the signature anyway.</p> 463 + <p>The third fix is - and this takes a little care to parse - to ensure that we never actually try to process the items within a signature within a &quot;with&quot; expression within a module-type expression. 
Before diving into the 'why' of this, let's first explain how Odoc represents module-type expressions.</p> 464 + <p>Internally, we have <span class="xref-unresolved" title="Odoc_model.Lang.ModuleType.expr">a datatype</span> that represents module expressions, which looks like this:</p> 465 + <div><pre class="language-ocaml"><code>type expr = 332 466 | Path of path_t 333 467 | Signature of Signature.t 334 468 | Functor of FunctorParameter.t * expr 335 469 | With of with_t 336 - | TypeOf of typeof_t&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 337 - &lt;p&gt;Now, each of the arguments to these constructors might contain an expansion of the expression that Odoc will calculate. For example, the definition of &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang.ModuleType.path_t&quot;&gt;path_t&lt;/span&gt; is:&lt;/p&gt; 338 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type path_t = { 470 + | TypeOf of typeof_t</code></pre></div> 471 + <p>Now, each of the arguments to these constructors might contain an expansion of the expression that Odoc will calculate. For example, the definition of <span class="xref-unresolved" title="Odoc_model.Lang.ModuleType.path_t">path_t</span> is:</p> 472 + <div><pre class="language-ocaml"><code>type path_t = { 339 473 p_expansion : simple_expansion option; 340 474 p_path : Paths.Path.ModuleType.t; 341 - }&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 342 - &lt;p&gt;and this expansion is initially &lt;code&gt;None&lt;/code&gt; and then filled in by Odoc in order to render the expansion in the HTML. 
In the case of a &lt;code&gt;With&lt;/code&gt; expression, the &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang.ModuleType.with_t&quot;&gt;with_t&lt;/span&gt; type is:&lt;/p&gt; 343 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type with_t = { 475 + }</code></pre></div> 476 + <p>and this expansion is initially <code>None</code> and then filled in by Odoc in order to render the expansion in the HTML. In the case of a <code>With</code> expression, the <span class="xref-unresolved" title="Odoc_model.Lang.ModuleType.with_t">with_t</span> type is:</p> 477 + <div><pre class="language-ocaml"><code>type with_t = { 344 478 w_substitutions : substitution list; 345 479 w_expansion : simple_expansion option; 346 480 w_expr : U.expr; 347 - }&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 348 - &lt;p&gt;here you can see that the &lt;code&gt;With&lt;/code&gt; expression contains another module expression, as a &lt;code&gt;with&lt;/code&gt; expression operates on another module type. Early during Odoc's development, this simply was another `ModuleType.expr`, but we had a couple of bugs where we ended up calculating expansions for these inner expressions, which was all very wasteful as we only ever rendered the &amp;quot;outer&amp;quot; expansion. 
So we changed this to be a &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang.ModuleType.U.expr&quot;&gt;U.expr&lt;/span&gt;, which is an &amp;quot;unexpanded&amp;quot; module type expression, and is very similar to the main expression above, but without the expansions and also with the functor case, as we can't have functors inside a &amp;quot;with&amp;quot; expression.&lt;/p&gt; 349 - &lt;p&gt;These &amp;quot;unexpanded&amp;quot; expressions still contain signatures though, so aren't &lt;em&gt;completely&lt;/em&gt; unexpanded, and it's &lt;em&gt;these&lt;/em&gt; signatures that we should avoid processing.&lt;/p&gt; 350 - &lt;p&gt;So, what I expected to be just one bug when I started looking at this turned out to be two related issues, and a total of four different fixes!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/09/odoc-bugs.html</id><title type="text">Odoc bugs</title><updated>2025-09-22T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">The system works by watching opam-repository for changes, and then when it notices a new package it performs an opam solve and builds the package, a prerequisite for building the documentation. 
In or...</summary><published>2025-09-09T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/caching-opam-solutions.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;caching-opam-solutions&quot;&gt;&lt;a href=&quot;#caching-opam-solutions&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Caching opam solutions&lt;/h1&gt; 351 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-09&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 352 - &lt;p&gt;The &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt; system works by watching opam-repository for changes, and then when it notices a new package it performs an opam solve and builds the package, a prerequisite for building the documentation. In order to give the docs some stability, as the docs may well &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/semantic-versioning-is-hard&quot;&gt;depend upon your dependencies&lt;/span&gt;, we currently cache the solve results so that a package will always be built with the same set of dependencies, even if a new version of one of those dependencies has been released.&lt;/p&gt; 353 - &lt;p&gt;The downside to this is that as time goes on, the number of distinct universes that we build increases, and docs get more and more out of date. So it's not necessarily the best thing to do, though it does mean we minimise the amount of time spent solving.&lt;/p&gt; 354 - &lt;p&gt;The alternative approach is that on every commit to opam-repository we could resolve for all packages and use the latest, greatest solution to build the docs. Using this approach we would maximise the sharing of builds and keep the total amount of required storage steadier. 
Of course, this would mean solving for every package on every commit to opam-repository, even if we didn't end up rebuilding all of them due to the way that the cache works.&lt;/p&gt; 355 - &lt;p&gt;One possibility that might be worth investigating is to cache the solutions - but then Leon Bambrick &lt;a href=&quot;https://twitter.com/secretGeek/status/7269997868&quot;&gt;advises us&lt;/a&gt;:&lt;/p&gt; 356 - &lt;div&gt;&lt;pre class=&quot;language-quote&quot;&gt;&lt;code&gt;There are 2 hard problems in computer science: cache invalidation, 357 - naming things, and off-by-1 errors.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 358 - &lt;p&gt;and indeed it's not obvious what the best approach to cache invalidation is here. A sledgehammer approach would be to hook into the solver and note what questions it asks of opam-repository and record the responses. If any of these change, then it's safe to say that we need to recalculate. I had a quick look at this and checked what packages were involved in the solution of &lt;code&gt;ocaml&lt;/code&gt; as this would represent a minimum set of packages that would affect virtually all packages. The list was big, but not &lt;i&gt;too&lt;/i&gt; big:&lt;/p&gt; 359 - &lt;pre&gt;winpthreads, system-msvc, system-mingw, ocaml-variants, ocaml-system, 481 + }</code></pre></div> 482 + <p>here you can see that the <code>With</code> expression contains another module expression, as a <code>with</code> expression operates on another module type. Early during Odoc's development, this simply was another `ModuleType.expr`, but we had a couple of bugs where we ended up calculating expansions for these inner expressions, which was all very wasteful as we only ever rendered the &quot;outer&quot; expansion. 
So we changed this to be a <span class="xref-unresolved" title="Odoc_model.Lang.ModuleType.U.expr">U.expr</span>, which is an &quot;unexpanded&quot; module type expression, and is very similar to the main expression above, but without the expansions and also without the functor case, as we can't have functors inside a &quot;with&quot; expression.</p> 483 + <p>These &quot;unexpanded&quot; expressions still contain signatures though, so aren't <em>completely</em> unexpanded, and it's <em>these</em> signatures that we should avoid processing.</p> 484 + <p>So, what I expected to be just one bug when I started looking at this turned out to be two related issues, and a total of four different fixes!</p>]]></content> 485 + </entry> 486 + <entry> 487 + <id>https://jon.recoil.org/blog/2025/09/caching-opam-solutions.html</id> 488 + <title>Caching opam solutions</title> 489 + <published>2025-09-09T00:00:00Z</published> 490 + <updated>2025-09-09T00:00:00Z</updated> 491 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/caching-opam-solutions.html"/> 492 + <summary>The system works by watching opam-repository for changes, and then when it notices a new package it performs an opam solve and builds the package, a prerequisite for building the documentation. In or...</summary> 493 + <content type="html"><![CDATA[<h1 id="caching-opam-solutions"><a href="#caching-opam-solutions" class="anchor"></a>Caching opam solutions</h1> 494 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-09</p></li></ul> 495 + <p>The <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a> system works by watching opam-repository for changes, and then when it notices a new package it performs an opam solve and builds the package, a prerequisite for building the documentation. 
In order to give the docs some stability, as the docs may well <a href="../04/semantic-versioning-is-hard.html" title="semantic-versioning-is-hard">depend upon your dependencies</a>, we currently cache the solve results so that a package will always be built with the same set of dependencies, even if a new version of one of those dependencies has been released.</p> 496 + <p>The downside to this is that as time goes on, the number of distinct universes that we build increases, and docs get more and more out of date. So it's not necessarily the best thing to do, though it does mean we minimise the amount of time spent solving.</p> 497 + <p>The alternative approach is that on every commit to opam-repository we could re-solve for all packages and use the latest, greatest solution to build the docs. Using this approach we would maximise the sharing of builds and keep the total amount of required storage steadier. Of course, this would mean solving for every package on every commit to opam-repository, even if we didn't end up rebuilding all of them due to the way that the cache works.</p> 498 + <p>One possibility that might be worth investigating is to cache the solutions - but then Leon Bambrick <a href="https://twitter.com/secretGeek/status/7269997868">advises us</a>:</p> 499 + <div><pre class="language-quote"><code>There are 2 hard problems in computer science: cache invalidation, 500 + naming things, and off-by-1 errors.</code></pre></div> 501 + <p>and indeed it's not obvious what the best approach to cache invalidation is here. A sledgehammer approach would be to hook into the solver and note what questions it asks of opam-repository and record the responses. If any of these change, then it's safe to say that we need to recalculate. I had a quick look at this and checked what packages were involved in the solution of <code>ocaml</code> as this would represent a minimum set of packages that would affect virtually all packages. 
The list was big, but not <i>too</i> big:</p> 502 + <pre>winpthreads, system-msvc, system-mingw, ocaml-variants, ocaml-system, 360 503 ocaml-options-vanilla, ocaml-option-tsan, ocaml-option-static, 361 504 ocaml-option-spacetime, ocaml-option-no-flat-float-array, 362 505 ocaml-option-no-compression, ocaml-option-nnpchecker, ··· 367 510 ocaml-config, ocaml-compiler, ocaml-beta, ocaml-base-compiler, ocaml, 368 511 dkml-base-compiler, conf-unwind, conf-pkg-config, base-unix, 369 512 base-threads, base-ocamlbuild, base-nnp, base-metaocaml-ocamlfind, 370 - base-implicits, base-effects, base-domains, base-bigarray&lt;/pre&gt; 371 - &lt;p&gt;I tried the same thing whilst using the oxcaml opam-repository, and this time, the list became much &lt;i&gt;much&lt;/i&gt; larger:&lt;/p&gt; 372 - &lt;pre&gt;zed, zarith-xen, zarith-freestanding, zarith, yojson, xenstore, xdg, 513 + base-implicits, base-effects, base-domains, base-bigarray</pre> 514 + <p>I tried the same thing whilst using the oxcaml opam-repository, and this time, the list became much <i>much</i> larger:</p> 515 + <pre>zed, zarith-xen, zarith-freestanding, zarith, yojson, xenstore, xdg, 373 516 x509, webbrowser, wasm_of_ocaml-compiler, variantslib, uutf, uuseg, 374 517 uunf, uucp, uTop, uri-sexp, uri, uopt, univ_map, uchar, tyxml, 375 518 typerex, typerep, trie, topkg, tls-lwt, tls, timezone, time_now, ··· 446 589 base-num, base-nnp, base-effects, base-domains, base-bytes, 447 590 base-bigarray, base, backoff, atdgen-runtime, atdgen, atd, async_unix, 448 591 async_rpc_kernel, async_log, async_kernel, async_extra, async, 449 - astring, asn1-combinators, arp, angstrom, alcotest&lt;/pre&gt; 450 - &lt;p&gt;This enormous list is because the opam file for oxcaml - &lt;code&gt;ocaml-variants.5.2.0+ox&lt;/code&gt; - lists a bunch of conflicts to ensure that various incompatible packages are never selected:&lt;/p&gt; 451 - &lt;pre&gt;conflicts: [ 452 - &amp;quot;base&amp;quot; {&amp;lt; &amp;quot;v0.18~&amp;quot;} 453 - 
&amp;quot;alcotest&amp;quot; {!= &amp;quot;1.9.0+ox&amp;quot;} 454 - &amp;quot;backoff&amp;quot; {!= &amp;quot;0.1.1+ox&amp;quot;} 455 - &amp;quot;dot-merlin-reader&amp;quot; {!= &amp;quot;5.2.1-502+ox&amp;quot;} 456 - &amp;quot;gen_js_api&amp;quot; {!= &amp;quot;1.1.2+ox&amp;quot;} 457 - &amp;quot;js_of_ocaml&amp;quot; {!= &amp;quot;6.0.1+ox&amp;quot;} 458 - &amp;quot;js_of_ocaml-compiler&amp;quot; {!= &amp;quot;6.0.1+ox&amp;quot;} 459 - &amp;quot;js_of_ocaml-ppx&amp;quot; {!= &amp;quot;6.0.1+ox&amp;quot;} 460 - &amp;quot;js_of_ocaml-toplevel&amp;quot; {!= &amp;quot;6.0.1+ox&amp;quot;} 461 - &amp;quot;jsonrpc&amp;quot; {!= &amp;quot;1.19.0+ox&amp;quot;} 462 - &amp;quot;lsp&amp;quot; {!= &amp;quot;1.19.0+ox&amp;quot;} 463 - &amp;quot;lwt_ppx&amp;quot; {!= &amp;quot;5.9.1+ox&amp;quot;} 464 - &amp;quot;mdx&amp;quot; {!= &amp;quot;2.5.0+ox&amp;quot;} 465 - &amp;quot;merlin&amp;quot; {!= &amp;quot;5.2.1-502+ox&amp;quot;} 466 - &amp;quot;merlin-lib&amp;quot; {!= &amp;quot;5.2.1-502+ox&amp;quot;} 467 - &amp;quot;ocaml-compiler-libs&amp;quot; {!= &amp;quot;v0.17.0+ox&amp;quot;} 468 - &amp;quot;ocaml-index&amp;quot; {!= &amp;quot;1.1+ox&amp;quot;} 469 - &amp;quot;ocaml-lsp-server&amp;quot; {!= &amp;quot;1.19.0+ox&amp;quot;} 470 - &amp;quot;ocamlbuild&amp;quot; {!= &amp;quot;0.15.0+ox&amp;quot;} 471 - &amp;quot;ocamlformat&amp;quot; {!= &amp;quot;0.26.2+ox&amp;quot;} 472 - &amp;quot;ocamlformat-lib&amp;quot; {!= &amp;quot;0.26.2+ox&amp;quot;} 473 - &amp;quot;ojs&amp;quot; {!= &amp;quot;1.1.2+ox&amp;quot;} 474 - &amp;quot;ppxlib&amp;quot; {!= &amp;quot;0.33.0+ox&amp;quot;} 475 - &amp;quot;ppxlib_ast&amp;quot; {!= &amp;quot;0.33.0+ox&amp;quot;} 476 - &amp;quot;sedlex&amp;quot; {!= &amp;quot;3.3+ox&amp;quot;} 477 - &amp;quot;topkg&amp;quot; {!= &amp;quot;1.0.8+ox&amp;quot;} 478 - &amp;quot;uTop&amp;quot; {!= &amp;quot;2.15.0+ox&amp;quot;} 479 - &amp;quot;uutf&amp;quot; {!= &amp;quot;1.0.3+ox&amp;quot;} 480 - &amp;quot;wasm_of_ocaml-compiler&amp;quot; {!= 
&amp;quot;6.0.1+ox&amp;quot;} 481 - &amp;quot;zarith&amp;quot; {!= &amp;quot;1.12+ox&amp;quot;} 482 - ]&lt;/pre&gt; 483 - &lt;p&gt;and it seems that the solver is looking not just at these packages, but also at all of their dependencies too. So this is a much larger set of packages that we need to track changes for, probably making the caching an awful lot less effective. It's not clear to me that this is the best way for the solver to handle conflicts, but I don't know enough about how it works yet to say for sure.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/09/caching-opam-solutions.html</id><title type="text">Caching opam solutions</title><updated>2025-09-09T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">, and I have been working on a system to build opam packages similar to the way that the docs-ci system does - effectively building a per-package binary cache to do very fast builds of the entire opa...</summary><published>2025-09-08T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/build-ids-for-day10.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;build-ids-for-day10&quot;&gt;&lt;a href=&quot;#build-ids-for-day10&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Build IDs for Day10&lt;/h1&gt; 484 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-08&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 485 - &lt;p&gt;&lt;a href=&quot;https://tunbury.org&quot;&gt;mtelvers&lt;/a&gt;, &lt;a href=&quot;https://www.dra27.uk/blog/&quot;&gt;dra27&lt;/a&gt; and I have been working on a system to build opam packages similar to the way that the docs-ci system does - effectively building a per-package binary cache to do very fast builds of the entire opam repository. 
It supports building even mutually-incompatible packages by dynamically creating the build environment for each package, and thus allows us to generate something akin to &lt;a href=&quot;&quot;&gt;opam health check&lt;/a&gt; but much faster.&lt;/p&gt; 486 - &lt;p&gt;Currently the cache of a package is a key-value store where the key is a hash of the package name and version and all of its dependencies and their name and version, alongside some information about the OS. This is great when this info can uniquely identify the output, but this isn't always the case. In particular, the oxcaml opam-repository has several packages where the version number is the upstream version number with `-ox` appended, as they have patches to make them compatible with oxcaml. If these patches change without bumping the suffix the currently caching mechanism would lead to trouble. When we discussed this David pointed out the idea of the &lt;a href=&quot;https://github.com/ocaml/opam/blob/c36dd1ce40a715ef27122184715bbf3e9aa7f0c9/src/state/opamPackageVar.ml#L178-L211&quot;&gt;build-id&lt;/a&gt; in opam, which would perfectly satisfy our needs. Unfortunately this code is quite deep within the opam codebase and at the point we need it we don't have an installed opam switch, so we need to pull the code out and insert it into our project.&lt;/p&gt; 487 - &lt;p&gt;One of the first challenges was that day10 currently includes the OS details in the hash so that we can test across different distros. This is at odds with the opam build-id which doesn't include that, so in order to try to get as close as possible to the opam hash I split the cache into 2 layers - a per-OS cache directory containing hashes based on pure opam metadata. The idea is that these should be identical to the build-ids of opam. 
With that fixed, the new cache layout looks like:&lt;/p&gt; 488 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;debian-12-x86_64/123...abc/{build.log,config,...}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 489 - &lt;p&gt;where the &lt;code&gt;123...abc&lt;/code&gt; should be the same as the build-id you would get with all the packages contained installed.&lt;/p&gt; 490 - &lt;p&gt;Now my actual use case for this is to track the state of the oxcaml world day by day, so for this I need to track both the opam-repository for OCaml and also the opam repository for OxCaml. The project currently uses a Makefile for coordinating the builds, but I thought it was time we moved on to a dedicated batch execution process. So I asked Claude to knock me up one of those, using odoc_driver for inspiration. It's very basic right now, simply iterating through the latest versions of every package, but I have got it to check on cache hits and misses, so I should be able to run it tomorrow to see how quickly we can test PRs to oxcaml/opam-repository&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/09/build-ids-for-day10.html</id><title type="text">Build IDs for Day10</title><updated>2025-09-08T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">For a few years now we've been running , a Jupyterhub instance, for the first year course &quot;Foundations of Computer Science&quot;. 
It serves as a hosting site for the lecture notes, which come in the form o...</summary><published>2025-09-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/giving-hub-cl-an-upgrade.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;giving-hub.cl-an-upgrade&quot;&gt;&lt;a href=&quot;#giving-hub.cl-an-upgrade&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Giving hub.cl an upgrade&lt;/h1&gt; 491 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 492 - &lt;p&gt;For a few years now we've been running &lt;code&gt;hub.cl.cam.ac.uk&lt;/code&gt;, a Jupyterhub instance, for the first year course &amp;quot;Foundations of Computer Science&amp;quot;. It serves as a hosting site for the lecture notes, which come in the form of Jupyter notebooks, and as a playground where students can try OCaml, and it also is used to run the assessed exercises that are a mandatory part of the course.&lt;/p&gt; 493 - &lt;p&gt;Since I spent some time setting it up back in 2018 or so, its aggregated some cruft over the years, and has also fallen somewhat behind the bleeding edge of the Jupyter software stack. So I thought this year, as I'm actually lecturing the course, I'd give it a bit of loving care and attention.&lt;/p&gt; 494 - &lt;p&gt;We were still on Jupyterhub 1.5.3 whereas the current release is 5.3.0 - so there was quite a bit of work to do. I brief play with putting things on the latest version seemed to break quite a lot of things, so I thought it might be better to go back to the drawing board and start the config again from scratch. So with some help from Claude, I've now managed to hugely simplify the whole config of Jupyterhub, and even given it a makeover to try to match the style of www.cst.cam.ac.uk as well. 
The improvements include:&lt;/p&gt; 495 - &lt;ul&gt;&lt;li&gt;Using caddy as a reverse proxy for TLS termination, meaning I don't have to manually renew the letsencrypt cert every 3 months&lt;/li&gt;&lt;li&gt;Unifying the configuration of the two container images used for students and instructors&lt;/li&gt;&lt;li&gt;Upgrading to much newer jupyterhub, notebook and nbgrader images&lt;/li&gt;&lt;li&gt;Simplifying the configuration required to make it work on a new server - persistent user directories are now docker volumes rather than bindmounts on the local filesystem&lt;/li&gt;&lt;li&gt;Updating the authentication method to use Raven via OAuth2 rather than the unmaintained &lt;a href=&quot;https://github.com/pyCav/jupyterhub-raven-auth&quot;&gt;jupyterhub-raven-auth&lt;/a&gt; which I'd had to maintain &lt;a href=&quot;https://github.com/jonludlam/jupyterhub-raven-auth/commit/36eaf16b410e7ac3cfc532269e0ae5f1de34f231&quot;&gt;a patch&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Rebasing &lt;a href=&quot;https://github.com/jonludlam/nbgrader/commit/c83a6cbb7b530ce87b0b157accddcdc832bcba38&quot;&gt;my patch&lt;/a&gt; to nbgrader to verify all of the output of the cells when grading answers&lt;/li&gt;&lt;/ul&gt; 496 - &lt;p&gt;As ever, this took longer than I'd anticipated, but I'm mostly there now. 
There are a few more steps to try:&lt;/p&gt; 497 - &lt;ul&gt;&lt;li&gt;trial the &lt;a href=&quot;https://github.com/akabe/ocaml-jupyter/pull/210&quot;&gt;new patch&lt;/a&gt; for using ocaml-jupyter with OCaml 5.x&lt;/li&gt;&lt;li&gt;see how to upgrade to notebook v7, as I've stuck with v6 in order to keep the extensions we're using going.&lt;/li&gt;&lt;/ul&gt;</content><id>https://jon.recoil.org/blog/2025/09/giving-hub-cl-an-upgrade.html</id><title type="text">Giving hub.cl an upgrade</title><updated>2025-09-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Here's a quick post on how to get the OCaml Language Server (ocaml-lsp-server) working with an MCP server.</summary><published>2025-08-27T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/08/ocaml-lsp-mcp.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;using-ocaml-lsp-server-via-an-mcp-server&quot;&gt;&lt;a href=&quot;#using-ocaml-lsp-server-via-an-mcp-server&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Using ocaml-lsp-server via an MCP server&lt;/h1&gt; 498 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-08-27&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 499 - &lt;p&gt;Here's a quick post on how to get the OCaml Language Server (ocaml-lsp-server) working with an MCP server.&lt;/p&gt; 500 - &lt;p&gt;We're going to use &lt;a href=&quot;https://github.com/isaacphi&quot;&gt;issacphi&lt;/a&gt;'s adapter for LSP servers, which is written in go. 
So install go, and then:&lt;/p&gt; 501 - &lt;div&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code&gt;go install github.com/isaacphi/mcp-language-server@latest&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 502 - &lt;p&gt;Once that's done, make sure you've got `ocaml-lsp-server` installed in your switch:&lt;/p&gt; 503 - &lt;div&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code&gt;opam install ocaml-lsp-server&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 504 - &lt;p&gt;Then add the MCP config for claude where you want to run it:&lt;/p&gt; 505 - &lt;div&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code&gt;claude mcp add ocamllsp -s local -t stdio -- /Users/jon/go/bin/mcp-language-server -workspace . -lsp ocamllsp&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 506 - &lt;p&gt;It'd be nice to get this working `globally` - that is, with `-s user` - but I haven't been able to get that to work yet.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/08/ocaml-lsp-mcp.html</id><title type="text">Using ocaml-lsp-server via an MCP server</title><updated>2025-08-27T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">LLMs are proving themselves superbly capable of a variety of coding tasks, having been trained against the enormous amount of code, tutorials and manuals available online. 
However, with smaller langua...</summary><published>2025-08-20T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/08/ocaml-mcp-server.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;an-ocaml-mcp-server&quot;&gt;&lt;a href=&quot;#an-ocaml-mcp-server&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;An OCaml MCP server&lt;/h1&gt; 507 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-08-20&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 508 - &lt;p&gt;LLMs are proving themselves superbly capable of a variety of coding tasks, having been trained against the enormous amount of code, tutorials and manuals available online. However, with smaller languages like OCaml there simply isn't enough training material out there, particularly when it comes to new language features like &lt;a href=&quot;https://ocaml.org/manual/5.3/effects.html&quot;&gt;effects&lt;/a&gt; or new packages that haven't had time to be widely used. With my colleagues &lt;a href=&quot;https://anil.recoil.org/&quot;&gt;Anil&lt;/a&gt;, &lt;a href=&quot;https://ryan.freumh.org/&quot;&gt;Ryan&lt;/a&gt; and &lt;a href=&quot;https://toao.com/&quot;&gt;Sadiq&lt;/a&gt; we've been exploring ways to &lt;a href=&quot;https://anil.recoil.org/notes/cresting-the-ocaml-ai-hump&quot;&gt;improve this situation&lt;/a&gt;. One way we can mitigate these challenges is to provide a Model Context Protocol (&lt;a href=&quot;https://modelcontextprotocol.io&quot;&gt;MCP&lt;/a&gt;) server that's capable of providing up-to-date info on the current state of the OCaml world.&lt;/p&gt; 509 - &lt;p&gt;The &lt;a href=&quot;https://docs.anthropic.com/en/docs/mcp&quot;&gt;MCP specification&lt;/a&gt; was released by Anthropic at the end of last year. 
Since then it has become an astonishingly popular mechanism for extending the capabilities of LLMs, allowing them to become incredibly powerful agents capable of much more than simply chatting. There are now a huge variety of MCP servers, from one that provides &lt;a href=&quot;https://github.com/r-huijts/firstcycling-mcp&quot;&gt;professional cycling data&lt;/a&gt; to one that can &lt;a href=&quot;https://github.com/GongRzhe/Gmail-MCP-Server&quot;&gt;do your email&lt;/a&gt;. The &lt;a href=&quot;https://github.com/punkpeye/awesome-mcp-servers&quot;&gt;awesome mcp server list&lt;/a&gt; already lists hundreds, and these are just the &lt;em&gt;awesome&lt;/em&gt; ones!&lt;/p&gt; 510 - &lt;p&gt;I've been working with &lt;a href=&quot;https://toao.com/&quot;&gt;Sadiq&lt;/a&gt; to make an &lt;a href=&quot;https://github.com/sadiqj/odoc-llm/&quot;&gt;MCP server for OCaml&lt;/a&gt;, with an initial focus on building it such that it can be hosted for everyone rather than something that is run locally. Our plan is to start with a service that can help with choosing OCaml libraries, by taking advantage of the work done by &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci/&quot;&gt;ocaml-docs-ci&lt;/a&gt; which is the tool used to generate the documentation for all packages in &lt;a href=&quot;https://github.com/ocaml/opam-repository&quot;&gt;opam-repository&lt;/a&gt; and is served by &lt;a href=&quot;https://ocaml.org/&quot;&gt;ocaml.org&lt;/a&gt;. As well as producing HTML docs, we can also extract a number of other formats from the pipeline, including a newly created &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1341&quot;&gt;markdown backend&lt;/a&gt;. 
Using this, we can get markdown-formatted documentation for the every version of every package in the OCaml ecosystem.&lt;/p&gt; 511 - &lt;h2 id=&quot;semantic-searching&quot;&gt;&lt;a href=&quot;#semantic-searching&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Semantic searching&lt;/h2&gt; 512 - &lt;p&gt;The first thing we focused on was being able to do a &lt;em&gt;semantic search&lt;/em&gt; over the whole OCaml ecosystem. To do this, we're using &lt;a href=&quot;https://huggingface.co/spaces/hesamation/primer-llm-embedding&quot;&gt;LLM embeddings&lt;/a&gt;, for which we need some natural-language description to seach through.&lt;/p&gt; 513 - &lt;p&gt;The documentation produced by &lt;code&gt;ocaml-docs-ci&lt;/code&gt; is generated per library module using &lt;a href=&quot;https://github.com/ocaml/odoc&quot;&gt;odoc&lt;/a&gt;, relying on the package author to provide documentation comments for each element in the signature. However, even if the package authors &lt;em&gt;hasn't&lt;/em&gt; provided any documentation, we can still see the types, values, modules and so on that the library exposes, and this is often enough to get a good idea of what the module does. We then take these documentation pages, which are formatted in markdown, and summarise them via an LLM at the module level. This is done hierarchically, so we start with the 'deepest' modules, and then insert their summaries into the text of their parent module, then summarise those and so on. We found it useful to include the names and &lt;a href=&quot;https://ocaml.github.io/odoc/odoc/odoc_for_authors.html#preamble&quot;&gt;preambles&lt;/a&gt; of the ancestor modules when doing the summarisation to give additional context to the LLM. 
For example, here is the prompt generated for a submodule of the &lt;a href=&quot;https://erratique.ch/software/astring&quot;&gt;astring&lt;/a&gt; library:&lt;/p&gt; 514 - &lt;div&gt;&lt;pre class=&quot;language-markdown&quot;&gt;&lt;code&gt;Module: Astring.String.Ascii 592 + astring, asn1-combinators, arp, angstrom, alcotest</pre> 593 + <p>This enormous list is because the opam file for oxcaml - <code>ocaml-variants.5.2.0+ox</code> - lists a bunch of conflicts to ensure that various incompatible packages are never selected:</p> 594 + <pre>conflicts: [ 595 + &quot;base&quot; {&lt; &quot;v0.18~&quot;} 596 + &quot;alcotest&quot; {!= &quot;1.9.0+ox&quot;} 597 + &quot;backoff&quot; {!= &quot;0.1.1+ox&quot;} 598 + &quot;dot-merlin-reader&quot; {!= &quot;5.2.1-502+ox&quot;} 599 + &quot;gen_js_api&quot; {!= &quot;1.1.2+ox&quot;} 600 + &quot;js_of_ocaml&quot; {!= &quot;6.0.1+ox&quot;} 601 + &quot;js_of_ocaml-compiler&quot; {!= &quot;6.0.1+ox&quot;} 602 + &quot;js_of_ocaml-ppx&quot; {!= &quot;6.0.1+ox&quot;} 603 + &quot;js_of_ocaml-toplevel&quot; {!= &quot;6.0.1+ox&quot;} 604 + &quot;jsonrpc&quot; {!= &quot;1.19.0+ox&quot;} 605 + &quot;lsp&quot; {!= &quot;1.19.0+ox&quot;} 606 + &quot;lwt_ppx&quot; {!= &quot;5.9.1+ox&quot;} 607 + &quot;mdx&quot; {!= &quot;2.5.0+ox&quot;} 608 + &quot;merlin&quot; {!= &quot;5.2.1-502+ox&quot;} 609 + &quot;merlin-lib&quot; {!= &quot;5.2.1-502+ox&quot;} 610 + &quot;ocaml-compiler-libs&quot; {!= &quot;v0.17.0+ox&quot;} 611 + &quot;ocaml-index&quot; {!= &quot;1.1+ox&quot;} 612 + &quot;ocaml-lsp-server&quot; {!= &quot;1.19.0+ox&quot;} 613 + &quot;ocamlbuild&quot; {!= &quot;0.15.0+ox&quot;} 614 + &quot;ocamlformat&quot; {!= &quot;0.26.2+ox&quot;} 615 + &quot;ocamlformat-lib&quot; {!= &quot;0.26.2+ox&quot;} 616 + &quot;ojs&quot; {!= &quot;1.1.2+ox&quot;} 617 + &quot;ppxlib&quot; {!= &quot;0.33.0+ox&quot;} 618 + &quot;ppxlib_ast&quot; {!= &quot;0.33.0+ox&quot;} 619 + &quot;sedlex&quot; {!= &quot;3.3+ox&quot;} 620 + &quot;topkg&quot; {!= 
&quot;1.0.8+ox&quot;} 621 + &quot;uTop&quot; {!= &quot;2.15.0+ox&quot;} 622 + &quot;uutf&quot; {!= &quot;1.0.3+ox&quot;} 623 + &quot;wasm_of_ocaml-compiler&quot; {!= &quot;6.0.1+ox&quot;} 624 + &quot;zarith&quot; {!= &quot;1.12+ox&quot;} 625 + ]</pre> 626 + <p>and it seems that the solver is looking not just at these packages, but also at all of their dependencies too. So this is a much larger set of packages that we need to track changes for, probably making the caching an awful lot less effective. It's not clear to me that this is the best way for the solver to handle conflicts, but I don't know enough about how it works yet to say for sure.</p>]]></content> 627 + </entry> 628 + <entry> 629 + <id>https://jon.recoil.org/blog/2025/09/build-ids-for-day10.html</id> 630 + <title>Build IDs for Day10</title> 631 + <published>2025-09-08T00:00:00Z</published> 632 + <updated>2025-09-08T00:00:00Z</updated> 633 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/build-ids-for-day10.html"/> 634 + <summary>, and I have been working on a system to build opam packages similar to the way that the docs-ci system does - effectively building a per-package binary cache to do very fast builds of the entire opa...</summary> 635 + <content type="html"><![CDATA[<h1 id="build-ids-for-day10"><a href="#build-ids-for-day10" class="anchor"></a>Build IDs for Day10</h1> 636 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-08</p></li></ul> 637 + <p><a href="https://tunbury.org">mtelvers</a>, <a href="https://www.dra27.uk/blog/">dra27</a> and I have been working on a system to build opam packages similar to the way that the docs-ci system does - effectively building a per-package binary cache to do very fast builds of the entire opam repository. 
It supports building even mutually-incompatible packages by dynamically creating the build environment for each package, and thus allows us to generate something akin to <a href="">opam health check</a> but much faster.</p> 638 + <p>Currently the cache of a package is a key-value store where the key is a hash of the package name and version and all of its dependencies and their name and version, alongside some information about the OS. This is great when this info can uniquely identify the output, but this isn't always the case. In particular, the oxcaml opam-repository has several packages where the version number is the upstream version number with `-ox` appended, as they have patches to make them compatible with oxcaml. If these patches change without bumping the suffix the currently caching mechanism would lead to trouble. When we discussed this David pointed out the idea of the <a href="https://github.com/ocaml/opam/blob/c36dd1ce40a715ef27122184715bbf3e9aa7f0c9/src/state/opamPackageVar.ml#L178-L211">build-id</a> in opam, which would perfectly satisfy our needs. Unfortunately this code is quite deep within the opam codebase and at the point we need it we don't have an installed opam switch, so we need to pull the code out and insert it into our project.</p> 639 + <p>One of the first challenges was that day10 currently includes the OS details in the hash so that we can test across different distros. This is at odds with the opam build-id which doesn't include that, so in order to try to get as close as possible to the opam hash I split the cache into 2 layers - a per-OS cache directory containing hashes based on pure opam metadata. The idea is that these should be identical to the build-ids of opam. 
With that fixed, the new cache layout looks like:</p> 640 + <div><pre class="language-ocaml"><code>debian-12-x86_64/123...abc/{build.log,config,...}</code></pre></div> 641 + <p>where the <code>123...abc</code> should be the same as the build-id you would get with all the packages contained installed.</p> 642 + <p>Now my actual use case for this is to track the state of the oxcaml world day by day, so for this I need to track both the opam-repository for OCaml and also the opam repository for OxCaml. The project currently uses a Makefile for coordinating the builds, but I thought it was time we moved on to a dedicated batch execution process. So I asked Claude to knock me up one of those, using odoc_driver for inspiration. It's very basic right now, simply iterating through the latest versions of every package, but I have got it to check on cache hits and misses, so I should be able to run it tomorrow to see how quickly we can test PRs to oxcaml/opam-repository</p>]]></content> 643 + </entry> 644 + <entry> 645 + <id>https://jon.recoil.org/blog/2025/09/giving-hub-cl-an-upgrade.html</id> 646 + <title>Giving hub.cl an upgrade</title> 647 + <published>2025-09-07T00:00:00Z</published> 648 + <updated>2025-09-07T00:00:00Z</updated> 649 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/giving-hub-cl-an-upgrade.html"/> 650 + <summary>For a few years now we've been running , a Jupyterhub instance, for the first year course &quot;Foundations of Computer Science&quot;. 
It serves as a hosting site for the lecture notes, which come in the form o...</summary> 651 + <content type="html"><![CDATA[<h1 id="giving-hub.cl-an-upgrade"><a href="#giving-hub.cl-an-upgrade" class="anchor"></a>Giving hub.cl an upgrade</h1> 652 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-07</p></li></ul> 653 + <p>For a few years now we've been running <code>hub.cl.cam.ac.uk</code>, a Jupyterhub instance, for the first year course &quot;Foundations of Computer Science&quot;. It serves as a hosting site for the lecture notes, which come in the form of Jupyter notebooks, and as a playground where students can try OCaml, and it also is used to run the assessed exercises that are a mandatory part of the course.</p> 654 + <p>Since I spent some time setting it up back in 2018 or so, its aggregated some cruft over the years, and has also fallen somewhat behind the bleeding edge of the Jupyter software stack. So I thought this year, as I'm actually lecturing the course, I'd give it a bit of loving care and attention.</p> 655 + <p>We were still on Jupyterhub 1.5.3 whereas the current release is 5.3.0 - so there was quite a bit of work to do. I brief play with putting things on the latest version seemed to break quite a lot of things, so I thought it might be better to go back to the drawing board and start the config again from scratch. So with some help from Claude, I've now managed to hugely simplify the whole config of Jupyterhub, and even given it a makeover to try to match the style of www.cst.cam.ac.uk as well. 
The improvements include:</p> 656 + <ul><li>Using caddy as a reverse proxy for TLS termination, meaning I don't have to manually renew the letsencrypt cert every 3 months</li><li>Unifying the configuration of the two container images used for students and instructors</li><li>Upgrading to much newer jupyterhub, notebook and nbgrader images</li><li>Simplifying the configuration required to make it work on a new server - persistent user directories are now docker volumes rather than bindmounts on the local filesystem</li><li>Updating the authentication method to use Raven via OAuth2 rather than the unmaintained <a href="https://github.com/pyCav/jupyterhub-raven-auth">jupyterhub-raven-auth</a> which I'd had to maintain <a href="https://github.com/jonludlam/jupyterhub-raven-auth/commit/36eaf16b410e7ac3cfc532269e0ae5f1de34f231">a patch</a>.</li><li>Rebasing <a href="https://github.com/jonludlam/nbgrader/commit/c83a6cbb7b530ce87b0b157accddcdc832bcba38">my patch</a> to nbgrader to verify all of the output of the cells when grading answers</li></ul> 657 + <p>As ever, this took longer than I'd anticipated, but I'm mostly there now. 
There are a few more steps to try:</p> 658 + <ul><li>trial the <a href="https://github.com/akabe/ocaml-jupyter/pull/210">new patch</a> for using ocaml-jupyter with OCaml 5.x</li><li>see how to upgrade to notebook v7, as I've stuck with v6 in order to keep the extensions we're using going.</li></ul>]]></content> 659 + </entry> 660 + <entry> 661 + <id>https://jon.recoil.org/blog/2025/08/ocaml-lsp-mcp.html</id> 662 + <title>Using ocaml-lsp-server via an MCP server</title> 663 + <published>2025-08-27T00:00:00Z</published> 664 + <updated>2025-08-27T00:00:00Z</updated> 665 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/08/ocaml-lsp-mcp.html"/> 666 + <summary>Here's a quick post on how to get the OCaml Language Server (ocaml-lsp-server) working with an MCP server.</summary> 667 + <content type="html"><![CDATA[<h1 id="using-ocaml-lsp-server-via-an-mcp-server"><a href="#using-ocaml-lsp-server-via-an-mcp-server" class="anchor"></a>Using ocaml-lsp-server via an MCP server</h1> 668 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-08-27</p></li></ul> 669 + <p>Here's a quick post on how to get the OCaml Language Server (ocaml-lsp-server) working with an MCP server.</p> 670 + <p>We're going to use <a href="https://github.com/isaacphi">issacphi</a>'s adapter for LSP servers, which is written in go. So install go, and then:</p> 671 + <div><pre class="language-bash"><code>go install github.com/isaacphi/mcp-language-server@latest</code></pre></div> 672 + <p>Once that's done, make sure you've got `ocaml-lsp-server` installed in your switch:</p> 673 + <div><pre class="language-bash"><code>opam install ocaml-lsp-server</code></pre></div> 674 + <p>Then add the MCP config for claude where you want to run it:</p> 675 + <div><pre class="language-bash"><code>claude mcp add ocamllsp -s local -t stdio -- /Users/jon/go/bin/mcp-language-server -workspace . 
-lsp ocamllsp</code></pre></div> 676 + <p>It'd be nice to get this working `globally` - that is, with `-s user` - but I haven't been able to get that to work yet.</p>]]></content> 677 + </entry> 678 + <entry> 679 + <id>https://jon.recoil.org/blog/2025/08/ocaml-mcp-server.html</id> 680 + <title>An OCaml MCP server</title> 681 + <published>2025-08-20T00:00:00Z</published> 682 + <updated>2025-08-20T00:00:00Z</updated> 683 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/08/ocaml-mcp-server.html"/> 684 + <summary>LLMs are proving themselves superbly capable of a variety of coding tasks, having been trained against the enormous amount of code, tutorials and manuals available online. However, with smaller langua...</summary> 685 + <content type="html"><![CDATA[<h1 id="an-ocaml-mcp-server"><a href="#an-ocaml-mcp-server" class="anchor"></a>An OCaml MCP server</h1> 686 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-08-20</p></li></ul> 687 + <p>LLMs are proving themselves superbly capable of a variety of coding tasks, having been trained against the enormous amount of code, tutorials and manuals available online. However, with smaller languages like OCaml there simply isn't enough training material out there, particularly when it comes to new language features like <a href="https://ocaml.org/manual/5.3/effects.html">effects</a> or new packages that haven't had time to be widely used. With my colleagues <a href="https://anil.recoil.org/">Anil</a>, <a href="https://ryan.freumh.org/">Ryan</a> and <a href="https://toao.com/">Sadiq</a> we've been exploring ways to <a href="https://anil.recoil.org/notes/cresting-the-ocaml-ai-hump">improve this situation</a>. 
One way we can mitigate these challenges is to provide a Model Context Protocol (<a href="https://modelcontextprotocol.io">MCP</a>) server that's capable of providing up-to-date info on the current state of the OCaml world.</p> 688 + <p>The <a href="https://docs.anthropic.com/en/docs/mcp">MCP specification</a> was released by Anthropic at the end of last year. Since then it has become an astonishingly popular mechanism for extending the capabilities of LLMs, allowing them to become incredibly powerful agents capable of much more than simply chatting. There are now a huge variety of MCP servers, from one that provides <a href="https://github.com/r-huijts/firstcycling-mcp">professional cycling data</a> to one that can <a href="https://github.com/GongRzhe/Gmail-MCP-Server">do your email</a>. The <a href="https://github.com/punkpeye/awesome-mcp-servers">awesome mcp server list</a> already lists hundreds, and these are just the <em>awesome</em> ones!</p> 689 + <p>I've been working with <a href="https://toao.com/">Sadiq</a> to make an <a href="https://github.com/sadiqj/odoc-llm/">MCP server for OCaml</a>, with an initial focus on building it such that it can be hosted for everyone rather than something that is run locally. Our plan is to start with a service that can help with choosing OCaml libraries, by taking advantage of the work done by <a href="https://github.com/ocurrent/ocaml-docs-ci/">ocaml-docs-ci</a> which is the tool used to generate the documentation for all packages in <a href="https://github.com/ocaml/opam-repository">opam-repository</a> and is served by <a href="https://ocaml.org/">ocaml.org</a>. As well as producing HTML docs, we can also extract a number of other formats from the pipeline, including a newly created <a href="https://github.com/ocaml/odoc/pull/1341">markdown backend</a>. 
Using this, we can get markdown-formatted documentation for every version of every package in the OCaml ecosystem.</p> 690 + <h2 id="semantic-searching"><a href="#semantic-searching" class="anchor"></a>Semantic searching</h2> 691 + <p>The first thing we focused on was being able to do a <em>semantic search</em> over the whole OCaml ecosystem. To do this, we're using <a href="https://huggingface.co/spaces/hesamation/primer-llm-embedding">LLM embeddings</a>, for which we need some natural-language description to search through.</p> 692 + <p>The documentation produced by <code>ocaml-docs-ci</code> is generated per library module using <a href="https://github.com/ocaml/odoc">odoc</a>, relying on the package author to provide documentation comments for each element in the signature. However, even if the package author <em>hasn't</em> provided any documentation, we can still see the types, values, modules and so on that the library exposes, and this is often enough to get a good idea of what the module does. We then take these documentation pages, which are formatted in markdown, and summarise them via an LLM at the module level. This is done hierarchically, so we start with the 'deepest' modules, and then insert their summaries into the text of their parent module, then summarise those and so on. We found it useful to include the names and <a href="https://ocaml.github.io/odoc/odoc/odoc_for_authors.html#preamble">preambles</a> of the ancestor modules when doing the summarisation to give additional context to the LLM. For example, here is the prompt generated for a submodule of the <a href="https://erratique.ch/software/astring">astring</a> library:</p> 693 + <div><pre class="language-markdown"><code>Module: Astring.String.Ascii 515 694 516 695 Ancestor Module Context: 517 696 - Astring: Alternative `Char` and `String` modules. Open the module to ··· 533 712 References. 
534 713 535 714 ## Predicates 536 - - val is_valid : string -&amp;gt; bool (* `is_valid s` is `true` iff only for 715 + - val is_valid : string -&gt; bool (* `is_valid s` is `true` iff only for 537 716 all indices `i` of `s`, `s.[i]` is an US-ASCII character, i.e. a 538 717 byte in the range [`0x00`;`0x7F`]. *) 539 718 ··· 543 722 functions can be safely used on UTF-8 encoded strings; they will of 544 723 course only deal with US-ASCII casings. 545 724 546 - - val uppercase : string -&amp;gt; string (* `uppercase s` is `s` with 725 + - val uppercase : string -&gt; string (* `uppercase s` is `s` with 547 726 US-ASCII characters `'a'` to `'z'` mapped to `'A'` to `'Z'`. *) 548 - - val lowercase : string -&amp;gt; string (* `lowercase s` is `s` with 727 + - val lowercase : string -&gt; string (* `lowercase s` is `s` with 549 728 US-ASCII characters `'A'` to `'Z'` mapped to `'a'` to `'z'`. *) 550 - - val capitalize : string -&amp;gt; string (* `capitalize s` is like 729 + - val capitalize : string -&gt; string (* `capitalize s` is like 551 730 `uppercase` but performs the map only on `s.[0]`. *) 552 - - val uncapitalize : string -&amp;gt; string (* `uncapitalize s` is like 731 + - val uncapitalize : string -&gt; string (* `uncapitalize s` is like 553 732 `lowercase` but performs the map only on `s.[0]`. *) 554 733 555 734 ## Escaping to printable US-ASCII 556 - - val escape : string -&amp;gt; string (* `escape s` is `s` with: *) 557 - - val unescape : string -&amp;gt; string option (* `unescape s` unescapes 735 + - val escape : string -&gt; string (* `escape s` is `s` with: *) 736 + - val unescape : string -&gt; string option (* `unescape s` unescapes 558 737 what `escape` did. The letters of hex escapes can be upper, lower or 559 738 mixed case, and any two letter hex escape is decoded to its 560 739 corresponding byte. Any other escape not defined by `escape` or 561 740 truncated escape makes the function return `None`. 
*) 562 - - val escape_string : string -&amp;gt; string (* `escape_string s` is like 741 + - val escape_string : string -&gt; string (* `escape_string s` is like 563 742 `escape` except it escapes `s` according to OCaml's lexical 564 743 conventions for strings with: *) 565 - - val unescape_string : string -&amp;gt; string option (* `unescape_string` 744 + - val unescape_string : string -&gt; string option (* `unescape_string` 566 745 is to `escape_string` what `unescape` is to `escape` and also 567 - additionally unescapes the sequence `&amp;quot;\\'&amp;quot;` (`0x5C,0x27`) to `&amp;quot;'&amp;quot;` 568 - (`0x27`). *)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 569 - &lt;p&gt;where clearly the package author has provided excellent documentation comments. This is then passed to an LLM which generated the following description:&lt;/p&gt; 570 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;This module provides functions to check if a string contains only 746 + additionally unescapes the sequence `&quot;\\'&quot;` (`0x5C,0x27`) to `&quot;'&quot;` 747 + (`0x27`). *)</code></pre></div> 748 + <p>where clearly the package author has provided excellent documentation comments. This is then passed to an LLM which generated the following description:</p> 749 + <div><pre class="language-ocaml"><code>This module provides functions to check if a string contains only 571 750 US-ASCII characters, convert case for ASCII letters, and escape or 572 751 unescape strings using ASCII conventions. 
It operates on standard 573 752 OCaml strings, treating them as sequences of bytes, and ensures ··· 575 754 cases include sanitizing input for ASCII-only protocols, preparing 576 755 strings for environments requiring strict ASCII formatting, and 577 756 handling escaped string representations in configuration or 578 - serialization contexts.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 579 - &lt;p&gt;Once we have these natural language descriptions, we can generate embeddings for them to allow for semantic search amongst all modules in opam.&lt;/p&gt; 580 - &lt;p&gt;In addition to the module descriptions, we also generate similar natural-language descriptions of the &lt;em&gt;package&lt;/em&gt; as a whole, by taking the README from the package and summarising it similarly. Where there is no README, we summarise the summaries of the modules of the libraries, so we're always able to generate some text description of the entire package.&lt;/p&gt; 581 - &lt;p&gt;To help with the ranking, we're also using a measure of popularity for both modules and packages. For packages, we're using the number of reverse dependencies in opam as a proxy for popularity, and for modules, we're using the &amp;quot;occurrences&amp;quot; generated as part of the docs build. These [occurrences] are a count of how often modules are used in other modules, and are calculated by looking at the compiled [cmt] files and resolving references to external modules using odoc's internal logic and counting them.&lt;/p&gt; 582 - &lt;p&gt;Once we have both the module and package summaries, we generate an embedding of the descriptions to allow for a semantic search to be performed efficiently. 
We're using this in two ways - to search for packages for broad queries of functionality, which just uses the package summaries, and for more specific queries to search for modules within packages.&lt;/p&gt; 583 - &lt;p&gt;For the module search, if the packages to search in haven't been specified, we search for both modules and packages and then combine the results. This is particularly helpful when the search is for generic functionality that might be found in more specific packages. For example, a module-only search for the term &amp;quot;time and date manipulation functions&amp;quot; returns the strongest match with a &lt;a href=&quot;https://ocaml.org/p/caqti/2.2.4/doc/caqti.platform/Caqti_platform/Conv/index.html&quot;&gt;module from caqti&lt;/a&gt;, which, as caqti is a library for talking to relational databases, might not be what the user is looking for.&lt;/p&gt; 584 - &lt;p&gt;We then put these search tools into an MCP server, along with a little more functionality. The server currently provides these five functions: &lt;/p&gt; 585 - &lt;ol&gt;&lt;/ol&gt; 586 - &lt;p&gt;The first 2 use the LLM-generated summaries as described above, and the last is using &lt;a href=&quot;https://github.com/art-w/&quot;&gt;Arthur's&lt;/a&gt; &lt;a href=&quot;https://github.com/art-w/sherlodoc&quot;&gt;sherlodoc tool&lt;/a&gt; which can do various searches, including type-based search, across the output of the &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt;.&lt;/p&gt; 587 - &lt;h2 id=&quot;example-searches&quot;&gt;&lt;a href=&quot;#example-searches&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Example searches&lt;/h2&gt; 588 - &lt;p&gt;The following are the results from some example package searches: &lt;/p&gt; 589 - &lt;ul&gt;&lt;li&gt;&amp;quot;HTTP client&amp;quot;&lt;/li&gt;&lt;/ul&gt; 590 - &lt;div&gt;&lt;pre class=&quot;language-nolang&quot;&gt;&lt;code&gt;#1 - http (v6.1.1) 757 + serialization contexts.</code></pre></div> 
758 + <p>Once we have these natural language descriptions, we can generate embeddings for them to allow for semantic search amongst all modules in opam.</p> 759 + <p>In addition to the module descriptions, we also generate similar natural-language descriptions of the <em>package</em> as a whole, by taking the README from the package and summarising it similarly. Where there is no README, we summarise the summaries of the modules of the libraries, so we're always able to generate some text description of the entire package.</p> 760 + <p>To help with the ranking, we're also using a measure of popularity for both modules and packages. For packages, we're using the number of reverse dependencies in opam as a proxy for popularity, and for modules, we're using the &quot;occurrences&quot; generated as part of the docs build. These [occurrences] are a count of how often modules are used in other modules, and are calculated by looking at the compiled [cmt] files and resolving references to external modules using odoc's internal logic and counting them.</p> 761 + <p>Once we have both the module and package summaries, we generate an embedding of the descriptions to allow for a semantic search to be performed efficiently. We're using this in two ways - to search for packages for broad queries of functionality, which just uses the package summaries, and for more specific queries to search for modules within packages.</p> 762 + <p>For the module search, if the packages to search in haven't been specified, we search for both modules and packages and then combine the results. This is particularly helpful when the search is for generic functionality that might be found in more specific packages. 
For example, a module-only search for the term &quot;time and date manipulation functions&quot; returns the strongest match with a <a href="https://ocaml.org/p/caqti/2.2.4/doc/caqti.platform/Caqti_platform/Conv/index.html">module from caqti</a>, which, as caqti is a library for talking to relational databases, might not be what the user is looking for.</p> 763 + <p>We then put these search tools into an MCP server, along with a little more functionality. The server currently provides these five functions: </p> 764 + <ol></ol> 765 + <p>The first 2 use the LLM-generated summaries as described above, and the last is using <a href="https://github.com/art-w/">Arthur's</a> <a href="https://github.com/art-w/sherlodoc">sherlodoc tool</a> which can do various searches, including type-based search, across the output of the <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a>.</p> 766 + <h2 id="example-searches"><a href="#example-searches" class="anchor"></a>Example searches</h2> 767 + <p>The following are the results from some example package searches: </p> 768 + <ul><li>&quot;HTTP client&quot;</li></ul> 769 + <div><pre class="language-nolang"><code>#1 - http (v6.1.1) 591 770 Similarity: 0.7593 592 771 Reverse Dependencies: 407 593 772 Combined Score: 0.6588 ··· 670 849 requests and non-blocking I/O. Practical use cases include web 671 850 scraping, API client development, and integrating HTTP-based services 672 851 into OCaml applications. 
673 - &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 674 - &lt;ul&gt;&lt;li&gt;&amp;quot;Cryptographic hash&amp;quot;&lt;/li&gt;&lt;/ul&gt; 675 - &lt;div&gt;&lt;pre class=&quot;language-nolang&quot;&gt;&lt;code&gt;#1 - digestif (v1.3.0) 852 + </code></pre></div> 853 + <ul><li>&quot;Cryptographic hash&quot;</li></ul> 854 + <div><pre class="language-nolang"><code>#1 - digestif (v1.3.0) 676 855 Similarity: 0.8165 677 856 Reverse Dependencies: 621 678 857 Combined Score: 0.7041 ··· 738 917 help mitigate brute-force attacks and ensure keys are derived in a 739 918 reproducible, secure manner. Use cases include password-based 740 919 encryption, secure token generation, and key material expansion in 741 - cryptographic protocols.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 742 - &lt;p&gt;and a module-level search for &amp;quot;time and date manipulation functions&amp;quot;&lt;/p&gt; 743 - &lt;div&gt;&lt;pre class=&quot;language-nolang&quot;&gt;&lt;code&gt;#1 - timmy-jsoo: Timmy_jsoo 920 + cryptographic protocols.</code></pre></div> 921 + <p>and a module-level search for &quot;time and date manipulation functions&quot;</p> 922 + <div><pre class="language-nolang"><code>#1 - timmy-jsoo: Timmy_jsoo 744 923 Similarity: 0.5460 745 924 Original Similarity: 0.7800 746 925 Popularity Score: 0.0000 ··· 804 983 timezone. It works with time and date types from the Timmy library, 805 984 specifically `Timmy.Time.t` and `Timmy.Date.t`. 
Use this module to 806 985 obtain precise time and date information for logging, scheduling, or 807 - time-based computations.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 808 - &lt;p&gt;and for &amp;quot;Balanced Tree&amp;quot;:&lt;/p&gt; 809 - &lt;div&gt;&lt;pre class=&quot;language-nolang&quot;&gt;&lt;code&gt;#1 - grenier: Mbt 986 + time-based computations.</code></pre></div> 987 + <p>and for &quot;Balanced Tree&quot;:</p> 988 + <div><pre class="language-nolang"><code>#1 - grenier: Mbt 810 989 Similarity: 0.5274 811 990 Original Similarity: 0.7534 812 991 Popularity Score: 0.0495 ··· 864 1043 automatically balanced and annotated with measurable values from 865 1044 module M. The module enables efficient rank queries and joining of 866 1045 trees, with applications in managing dynamic sequences where fast 867 - access and concatenation are critical.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 868 - &lt;h2 id=&quot;limitations-and-future-work&quot;&gt;&lt;a href=&quot;#limitations-and-future-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Limitations and future work&lt;/h2&gt; 869 - &lt;p&gt;We're aware that there are currently a number of limitations with what's been done so far, and there's a lot of exciting things that could quite easily be added!&lt;/p&gt; 870 - &lt;p&gt;We haven't done much prompt optimisation either for the tools themselves, nor their descriptions in the MCP server. We also haven't done much optimisation of the information retrieval - and it's clear from some of the results shown above that there are improvements to be made in the ranking algorithms. 
Some obvious next steps would be to do some &lt;a href=&quot;https://arxiv.org/html/2406.12433v2&quot;&gt;re-ranking&lt;/a&gt; or some form of hybrid search.&lt;/p&gt; 871 - &lt;p&gt;A particular challenge is that since this is based entirely off of the &lt;code&gt;ocaml-docs-ci&lt;/code&gt; build, it won't necessarily reflect the actual API your local build, as for OCaml, this &lt;a href=&quot;https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html&quot;&gt;can't be done&lt;/a&gt;. Thibaut Mattio is working on a &lt;a href=&quot;https://github.com/tmattio/ocaml-mcp&quot;&gt;local MCP server&lt;/a&gt; that would be perfectly positioned to do some of what we're doing, although we'd need to have a good local docs build implemented in dune for this to work well.&lt;/p&gt; 872 - &lt;p&gt;Also, there's plenty more data that we've collected during the docs builds! We can show the implementations of functions, we can expose code samples, select different versions of packages and much more. While we've concentrated on the search aspects, there's still a lot of low-hanging fruit that can be worked on.&lt;/p&gt; 873 - &lt;p&gt;If you're interested in helping us out on this, the project lives &lt;a href=&quot;https://github.com/sadiqj/odoc-llm&quot;&gt;on github&lt;/a&gt; - come along and join us!&lt;/p&gt; 874 - &lt;h2 id=&quot;using-the-server&quot;&gt;&lt;a href=&quot;#using-the-server&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Using the server&lt;/h2&gt; 875 - &lt;p&gt;If you'd like to try it, we've got a demo server running right now. It's hosted on dill.caelum.ci.dev here at the Computer Laboratory in the University of Cambridge. 
To enable it with Claude, try this:&lt;/p&gt; 876 - &lt;div&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code&gt;claude mcp add -t sse ocaml http://dill.caelum.ci.dev:8000/sse&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 877 - &lt;p&gt;Obviously this is pre-alpha quality software, and we might take it down with no notice, and it might not work as expected, and all of the other usual caveats. Let us know if it works, or doesn't, or if you've got some suggestions for improvements!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/08/ocaml-mcp-server.html</id><title type="text">An OCaml MCP server</title><updated>2025-08-20T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">More work this week on the OCaml MCP server. Sadiq and I met before I went away on holiday and discussed the next steps to 'park' the work on the MCP server. The final steps are:</summary><published>2025-08-19T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/08/week33.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;week-33&quot;&gt;&lt;a href=&quot;#week-33&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Week 33&lt;/h1&gt; 878 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-08-19&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 879 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;cohttp,yojson,jsonm&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 880 - &lt;p&gt;More work this week on the OCaml MCP server. Sadiq and I met before I went away on holiday and discussed the next steps to 'park' the work on the MCP server. 
The final steps are:&lt;/p&gt; 881 - &lt;ul&gt;&lt;li&gt;Write a README&lt;/li&gt;&lt;li&gt;Write and run a small script to fix a problem with module-type names&lt;/li&gt;&lt;li&gt;Write up and publish a blog post&lt;/li&gt;&lt;/ul&gt; 882 - &lt;p&gt;Not much, right? As always though, writing things up lead to a whole load more work.&lt;/p&gt; 883 - &lt;p&gt;The first problem occurred when writing up how it parsed the input docs. It turned out that when converting the repo so that it took markdown formatted files (using a &lt;a href=&quot;https://github.com/jonludlam/odoc/tree/odoc-llm-markdown&quot;&gt;slightly tweaked&lt;/a&gt; version of &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1341&quot;&gt;davesnx's PR&lt;/a&gt;), Claude had decided that the way to do this was to first convert the markdown into HTML, and then use the HTML parser it had already built. Whilst tidying this up, Claude was remarkably keen to just use regexps to parse the markdown rather than using a pre-existing markdown library, so it took a little persuasion to get it into a state I was happy with.&lt;/p&gt; 884 - &lt;p&gt;The second issue was that the script that form the bulk of the repo had been written at different times, and therefore Claude didn't really take into account any of the decisions it had made in one script when building the next. So most of the command-line arguments were slightly different, which made writing up a mini 'howto' in the README quite a jarring experience.&lt;/p&gt; 885 - &lt;p&gt;Thirdly, and most importantly, we had decided that we needed a few example searches to show how the system worked. We'd already had a &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/week28&quot;&gt;useful experience&lt;/span&gt; with this when Anil had tried to search for a 'time and date parsing and formatting' library, so it shouldn't really have been a surprise that trying a few more examples showed some more interesting behaviour. 
Specifically, the searches I wanted to do were for an &amp;quot;HTTP client&amp;quot;, &amp;quot;JSON parser&amp;quot;, &amp;quot;Cryptographic Hash&amp;quot; and Anil's time-and-date query, and in actually trying these searches and critically examining the results, I had to go back and figure out why they weren't giving me the results I had expected.&lt;/p&gt; 886 - &lt;p&gt;The first of these searches I had anticipated would be quite interesting, as this is a query that should show the OCaml ecosystem &lt;a href=&quot;https://discuss.ocaml.org/t/simple-modern-http-client-library/11239&quot;&gt;missing an obvious HTTP client&lt;/a&gt;. However, even with this in mind one of the top results was one of Cohttp's module types, &lt;code&gt;Cohttp.Generic.Client.S&lt;/code&gt;. This, of course, isn't much use if you're looking for an HTTP client, as module-types aren't going to give you an implementation to actually use. So I decided that we'd exclude module-types from the results. This turned out to be slightly more tricky than I anticipated as we'd lost the distinction between modules and module types further back in the pipeline, so Claude had to do some plumbing to ensure we had this information at the point we were doing the search.&lt;/p&gt; 887 - &lt;p&gt;The cryptographic hash search gave some plausible looking results, so I moved on to the JSON search. I was expecting to see &lt;code&gt;Yojson&lt;/code&gt; somewhere near the top of the list as that's a very popular library. I was also expecting to see &lt;code&gt;Jsonm&lt;/code&gt; somewhere near the top - or at least I'd like to be able to find it by searching for a 'streaming parser' as that's one of its key strengths. However, searching for &amp;quot;JSON parser&amp;quot; yielded some less than brilliant answers. 
The top 5 results were for modules in the packages &lt;code&gt;yojson-five&lt;/code&gt;, &lt;code&gt;decoders-yojson&lt;/code&gt;, &lt;code&gt;decoders-jsonaf&lt;/code&gt;, &lt;code&gt;ocplib-json-typed-browser&lt;/code&gt; and &lt;code&gt;ppx_protocol_conv_jsonm&lt;/code&gt;. While all of these are clearly in the same realm as I was after, having &lt;code&gt;jsonm&lt;/code&gt; show up literally 99th in the list, and &lt;code&gt;yojson&lt;/code&gt; itself not in the top 100 wasn't a great result.&lt;/p&gt; 888 - &lt;p&gt;Some investigation showed that yojson had a particularly bad showing because the description of the module &lt;code&gt;Yojson.Basic&lt;/code&gt; was the empty string! This turned out to be because of some bad error-handling logic in the summariser script, which ended up turning some errors into a blank description. Since running the summariser costs actual money, I didn't want to just rerun the whole thing, so I asked Claude for a script to find these problems and rerun them. The problem is not totally trivial as the summaries of child modules are used when generating the summary for parents, so when one is regenerated we should regenerate the summaries of all ancestors too. Given my recent experiences with Claude I'd like to look this over quite carefully before letting it loose on my data, so I've run it on yojson, which seemed to do the right thing, but not yet on the rest of the packages.&lt;/p&gt; 889 - &lt;p&gt;Having fixed this, I still found that &lt;code&gt;jsonm&lt;/code&gt; was making a very poor showing. This turned out to be because the description it gives itself is a &amp;quot;Non-blocking streaming JSON codec for OCaml&amp;quot; which had a fairly low similarity with &amp;quot;JSON parser&amp;quot;. I was using a fairly small embedding model for the queries - Qwen/Qwen3-Embedding-0.6B, so I thought I might address this by using a larger one, and opted for Qwen/Qwen3-Embedding-8B. 
The machine I had been using for the MCP server has no GPU and had taken a while to do the embeddings using the 0.6B model, so I switched to generating them on my M4 macbook. This went &lt;i&gt;much&lt;/i&gt; faster, though since I have about 70Mb of module summaries it still took quite a while. This improved the situation somewhat, but it was still not high in the list.&lt;/p&gt; 890 - &lt;p&gt;So I took a step back and had a think about the problem some more. Searching for a JSON parser is really quite a high-level search, and when evaluating the results I realised I was really thinking in terms of packages rather than modules. So I thought we could split the search in two - a package search and a module search. The package search would be used for the broad queries where you're interested in pulling in whole chunks of functionality, and the module search is for more low-level queries. In fact, the 'time and dating formatting' query is somewhere in between, so I might need to have some more example queries for the module search functions. In addition, the module search could be restricted to the set of packages you're using, which might make it even more useful.&lt;/p&gt; 891 - &lt;p&gt;Part of the split meant that I needed a different source of 'popularity' for the packages than the occurrences data that came out of docs ci, as that was per-module and I needed something per-package. The obvious thing is to look at reverse dependencies in opam. I have this kind-of working, but it's currently not particularly smart, so this will need a little more attention. 
For example, it currently thinks that &lt;a href=&quot;https://melange.re/v5.0.0/&quot;&gt;melange&lt;/a&gt; has over 3000 reverse dependencies.&lt;/p&gt; 892 - &lt;p&gt;With these changes in place, a package search for 'JSON parser' now returns &lt;code&gt;yojson&lt;/code&gt; as number one, followed by &lt;code&gt;ppx_deriving_yojson&lt;/code&gt;, &lt;code&gt;ezjsonm&lt;/code&gt;, &lt;code&gt;ocplib-json-typed&lt;/code&gt; and &lt;code&gt;jsonaf&lt;/code&gt;. Unfortunately &lt;code&gt;jsonm&lt;/code&gt; is still languishing in 27th place, so there's still some tweaking to do.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/08/week33.html</id><title type="text">Week 33</title><updated>2025-08-19T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Astonishingly, it's already been since starting back at the university, which I find incredibly hard to believe. I'm utterly convinced that it was only a couple of weeks ago that I walked back into t...</summary><published>2025-07-18T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/07/retrospective.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;4-months-in,-a-retrospective&quot;&gt;&lt;a href=&quot;#4-months-in,-a-retrospective&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;4 months in, a retrospective&lt;/h1&gt; 893 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-07-18&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 894 - &lt;p&gt;Astonishingly, it's already been &lt;i&gt;four whole months&lt;/i&gt; since starting back at the university, which I find incredibly hard to believe. I'm utterly convinced that it was only a couple of weeks ago that I walked back into the Computer Laboratory as an SRA for the first time since 2021, but here we are, at the end of term already. 
Time to do a bit of a retrospective and forward-looking plan for the next 3-4 months!&lt;/p&gt; 895 - &lt;h2 id=&quot;what's-happened?&quot;&gt;&lt;a href=&quot;#what's-happened?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What's happened?&lt;/h2&gt; 896 - &lt;p&gt;On wednesday this week, I had a chance to sit down with Anil, supposedly to talk about the upcoming lecturing of 1A Foundations of Computer Science, but we ended up talking about what I've been doing for the past few months, and where it fits into the broader picture of the group as a whole. It was a really useful conversation, and I thought it would be good to outline it here while it's fresh in my mind.&lt;/p&gt; 897 - &lt;p&gt;So then, to start, what have I been doing? What have I achieved? What have I learnt? It's been a bit of a daunting experience, landing in a team that are already working one hundred miles an hour on things well out of my comfort zone. I've been going to group meetings and having lots of interesting conversations, but I've found it difficult to make the next steps happen. One area where I've had some success is in working with Sadiq on LLMs - in particular, getting local LLMs to solve programming exercises that we both &lt;a href=&quot;https://toao.com/blog/ocaml-local-code-models&quot;&gt;wrote&lt;/a&gt; &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/05/ticks-solved-by-ai&quot;&gt;up&lt;/span&gt;. I've also been working with him on taking the output from the docs CI and &lt;a href=&quot;https://github.com/sadiqj/odoc-llm&quot;&gt;summarising it with LLMs&lt;/a&gt; in order to create an MCP server that would help tools like &lt;a href=&quot;https://anthropic.com/&quot;&gt;Claude Code&lt;/a&gt; to choose OCaml packages to solve users' problems.&lt;/p&gt; 898 - &lt;p&gt;It's been somewhat easier, partly due to inertia, to carry on with projects that had been in flight at the time I started. 
Things like getting the Odoc 3 generated docs onto ocaml.org, which is finally complete only &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/odoc-3-live-on-ocaml-org&quot;&gt;as of this week!&lt;/span&gt;. This has taken a whole lot of time, but I'm really pleased with the end results. There's still an awful lot of improvements that I'd like to see made, which, after drawing breath for a couple of weeks, I'll be writing down.&lt;/p&gt; 899 - &lt;p&gt;An itch I'd been wanting to scratch for a long time has been to look at client-side ocaml notebooks. I decided to make this an integral &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/this-site&quot;&gt;feature of this blog&lt;/span&gt;, and I've learnt an awful lot doing it. An important feature of this that I've been keeping in mind is the idea that we could use the ocaml-docs-ci tool to build the libraries, which would allow us to host a toplevel for every single package in opam-repository - allowing at best &lt;a href=&quot;https://discuss.ocaml.org/t/an-example-for-every-ocaml-package/16953/10&quot;&gt;interactive examples&lt;/a&gt;, and at bare minimum merlin for live type-checking and autocompletion. The important principles to keep in mind for this are that:&lt;/p&gt; 900 - &lt;ul&gt;&lt;li&gt;We have one 'toplevel' javascript file, and libraries and cmis are dynamically loaded&lt;/li&gt;&lt;li&gt;The interface between the frontend and the worker must not rely on a matched pair, e.g. 
an OCaml-5.3-compiled frontend might be talking to an OCaml-4.08-compiled worker thread - or even an oxcaml one!&lt;/li&gt;&lt;/ul&gt; 901 - &lt;p&gt;I have this all working on my blog, where I have both an oxcaml worker and a standard ocaml worker and they both dynamically load in libraries and cmis as specified on the page.&lt;/p&gt; 902 - &lt;p&gt;I've also supervised a 1A course for the first time - &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2425/IntroProb/&quot;&gt;Introduction to Probability&lt;/a&gt;, and I've done some marking for the 1A Foundations of Computer Science.&lt;/p&gt; 903 - &lt;p&gt;Something that I'd been expecting to do a lot on was work with oxcaml, but as the release happened later than anticipated and coincided with the marking and supervising, I've not done quite as much of this as I had thought I would. In addition, I had anticipated working on Odoc to start implementing the new features of oxcaml, but to avoid duplicating effort I've been waiting for the patches that have already been written at Jane Street to at least get odoc to compile, which have taken longer than I had hoped to get to me.&lt;/p&gt; 904 - &lt;h2 id=&quot;what's-next?&quot;&gt;&lt;a href=&quot;#what's-next?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What's next?&lt;/h2&gt; 905 - &lt;p&gt;With that in mind, Anil and I then talked about the bigger picture, as those of you who know Anil will be entirely unsurprised to hear! In particular, how will we be weaving the various threads of these activities - the teaching of OCaml, the large-scale (for OCaml) CI work, the LLMs and Oxcaml work - together to form a coherent whole? How do I find a balance between them and ensure that we find &lt;a href=&quot;https://arxiv.org/abs/1106.0848&quot;&gt;synergies&lt;/a&gt; as opposed to pulling in different directions? 
How do we make sure what we're doing helps us navigate the upending of the nature of development that agentic coding is bringing?&lt;/p&gt; 906 - &lt;h3 id=&quot;efficient-and-reusable-ci&quot;&gt;&lt;a href=&quot;#efficient-and-reusable-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Efficient and reusable CI&lt;/h3&gt; 907 - &lt;p&gt;A clear and obvious area where we'll be able to see real progress is to extract from docs CI the logic that I've been using to do efficient builds of packages. As I previously &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/odoc-3-live-on-ocaml-org&quot;&gt;wrote about&lt;/span&gt;, the new CI system is far more efficient than some of the other ocurrent-based pipelines, and it would save a huge amount of compute time if we were to take this tech and apply it elsewhere.&lt;/p&gt; 908 - &lt;p&gt;So, how might we take what we've got and produce something useful to everyone? We need to take a hammer to the fracture points of the docs CI service and split it into individually useful parts. Here are some next steps as I see them now. Let's take the solver out of docs CI, and have a service whose sole job is to create a repository of up-to-date solutions for all versions of all packages in opam-repository. These are the data that allow us to build the tree of package builds.&lt;/p&gt; 909 - &lt;p&gt;Next, turn these solutions into one giant build. Perhaps a script? Maybe a giant buildkit dockerfile? This is very similar to Mark Elvers' &lt;a href=&quot;https://github.com/mtelvers/ohc&quot;&gt;day10&lt;/a&gt; project. We can get this running on a big machine and just see how fast we can build everything. The key thing here is that it should be &lt;em&gt;trivial&lt;/em&gt; to run this on a Linux box. A Raspberry Pi or a 768-core behemoth with 3TiB of RAM. Just how fast &lt;em&gt;can&lt;/em&gt; we get it going? 
It's already building in a couple of days using &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/odoc-3-live-on-ocaml-org&quot;&gt;sage&lt;/span&gt;, but that's using ocurrent/obuilder, which isn't quite the right tool for the job, and on a relatively puny machine. Can we do it in an hour? 10 minutes? Certainly the incremental builds ought to be done in seconds. What's the limit?&lt;/p&gt; 910 - &lt;p&gt;These tools can then be used as the foundation for other CI systems. For opam-repo-ci, where we should be able to do the builds for a new package incredibly quickly. For opam-health-check, where we currently build foundational packages like dune and findlib &lt;i&gt;thousands of times&lt;/i&gt; per run.&lt;/p&gt; 911 - &lt;p&gt;Once we've got the packages built, docs CI is simply a pass over the top of the built artifacts. ocaml-docs-ci already demonstrates this - it only takes a few hours to rebuild all the docs when a new version of odoc is released, but in a way that only benefits docs! All the CI systems should be able to use this.&lt;/p&gt; 912 - &lt;p&gt;We should also then be able to run js_of_ocaml on the libraries to build the infrastructure needed for the per-package toplevels for ocaml.org that I mentioned above. Each of these steps should be separate stages in a pipeline - one where each step produces artifacts for the next to consume.&lt;/p&gt; 913 - &lt;p&gt;When we mix in some of the projects that other people in the team are working on, like David's work on &lt;a href=&quot;https://www.dra27.uk/blog/&quot;&gt;relocatable OCaml&lt;/a&gt;, we've got something that might be able to form a basis for a binary cache for Dune Package Management, particularly when we involve Ryan's &lt;a href=&quot;https://ryan.freumh.org/papers.html#2025-arxiv-hyperres&quot;&gt;Hyperres&lt;/a&gt; paper so we might check that dependencies from outside of the OCaml universe are correct. 
Can we use &lt;a href=&quot;https://github.com/quantifyearth/shark&quot;&gt;Patrick and Michael's shark&lt;/a&gt; to do the build steps? Can we use these images to serve up toplevels for ocaml.org that are &lt;em&gt;real toplevels&lt;/em&gt; rather than javascript toplevels? Can we use these build environments to help with reinforcement learning to train LLMs on OCaml code? There are a lot of interesting directions to take this work.&lt;/p&gt; 914 - &lt;h3 id=&quot;other-projects&quot;&gt;&lt;a href=&quot;#other-projects&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Other projects&lt;/h3&gt; 915 - &lt;p&gt;There are, of course, other responsibilities that I have. Some of these I'll be able to fit in with the theme above, and some - well - maybe I'll have to figure out how to delegate them, a skill that I am not particularly good at, but one that I feel I should learn!&lt;/p&gt; 916 - &lt;h4 id=&quot;teaching&quot;&gt;&lt;a href=&quot;#teaching&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Teaching&lt;/h4&gt; 917 - &lt;p&gt;A looming, terrifying, but tremendously exciting opportunity is the teaching of 1A Foundations of Computer Science. This is amongst the first courses we teach our incoming undergraduates, currently lectured by &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/&quot;&gt;Anil&lt;/a&gt;. As he's on sabbatical this year, he has asked me to step up and lecture it. This is definitely not one for delegation!&lt;/p&gt; 918 - &lt;p&gt;The immediate question, partly raised by my work with Sadiq, is: what do we do about LLMs? How should we adjust our teaching to take into account the existence of these tools? We had a very interesting chat earlier in the term with Professor &lt;a href=&quot;https://eecs.iisc.ac.in/people/prof-viraj-kumar/&quot;&gt;Viraj Kumar&lt;/a&gt; from &lt;a href=&quot;https://eecs.iisc.ac.in/&quot;&gt;IISc&lt;/a&gt;, who was visiting Cambridge earlier this year. 
He's been &lt;a href=&quot;https://dl.acm.org/doi/10.1145/3724363.3729100&quot;&gt;working on this question&lt;/a&gt; for a while now, and I hope to have some more conversations with him over the summer.&lt;/p&gt; 919 - &lt;h4 id=&quot;odoc-paper&quot;&gt;&lt;a href=&quot;#odoc-paper&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc paper&lt;/h4&gt; 920 - &lt;p&gt;An area where I've really made a shockingly small amount of progress is to write up all the work that's gone into Odoc over the past 6 (!!!) years.&lt;/p&gt; 921 - &lt;h4 id=&quot;odoc-notebooks&quot;&gt;&lt;a href=&quot;#odoc-notebooks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc notebooks&lt;/h4&gt; 922 - &lt;p&gt;This needs to be tidied up and a v0.1 released. In particular, the work on js_top_worker might well be shared with Arthur's &lt;a href=&quot;https://github.com/art-w/x-ocaml&quot;&gt;x-ocaml&lt;/a&gt; for a unified toplevel experience.&lt;/p&gt; 923 - &lt;h4 id=&quot;ai-work&quot;&gt;&lt;a href=&quot;#ai-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;AI work&lt;/h4&gt; 924 - &lt;p&gt;I'd like to carry on the work I've started with Sadiq on the interaction of LLMs with OCaml. Getting the package search to work sensibly for an MCP server is first on the list. Doing some reinforcement learning to improve performance specifically on OCaml is also very interesting, but not something I've managed to carve out the time for yet. Something along the lines of &lt;a href=&quot;https://arxiv.org/abs/2504.21798&quot;&gt;swesmith&lt;/a&gt; but adapted for OCaml.&lt;/p&gt; 925 - &lt;h4 id=&quot;oxcaml-odoc&quot;&gt;&lt;a href=&quot;#oxcaml-odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Oxcaml Odoc&lt;/h4&gt; 926 - &lt;p&gt;Odoc needs to have some work done on it to support the new work that's gone into oxcaml, for example documenting the modes. 
This is something I do expect to be working on soon.&lt;/p&gt; 927 - &lt;h4 id=&quot;dune-and-odoc&quot;&gt;&lt;a href=&quot;#dune-and-odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Dune and odoc&lt;/h4&gt; 928 - &lt;p&gt;Work needs to be done on the dune rules for odoc, which currently only support the feature-set in odoc 2.x. Paul-Elliot has &lt;a href=&quot;https://github.com/ocaml/dune/pull/11716&quot;&gt;done some work on this&lt;/a&gt;, but much more needs to be done.&lt;/p&gt; 929 - &lt;h4 id=&quot;further-general-odoc-work&quot;&gt;&lt;a href=&quot;#further-general-odoc-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Further general odoc work&lt;/h4&gt; 930 - &lt;ul&gt;&lt;li&gt;Better source rendering&lt;/li&gt;&lt;li&gt;Syntax for linking to source&lt;/li&gt;&lt;li&gt;Custom tags (used in odoc_notebook)&lt;/li&gt;&lt;li&gt;Web-native rendering, for embedding odoc in a website&lt;/li&gt;&lt;li&gt;Unifying paths and cpaths (https://github.com/jonludlam/odoc/tree/parameterised-paths)&lt;/li&gt;&lt;/ul&gt; 931 - &lt;h2 id=&quot;what-to-actually-do?&quot;&gt;&lt;a href=&quot;#what-to-actually-do?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What to &lt;i&gt;actually&lt;/i&gt; do?&lt;/h2&gt; 932 - &lt;p&gt;There are a lot of things in the above list. I'm not sure yet how I manage to figure out what I actually end up doing, and how that helps me to help Tarides, to fit in as a useful member of the EEG group, and to make sure I'm doing what's right for my own future. 
I feel the core project of the CI work will help everyone, but slotting the other work into the bigger picture will require some careful thought.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/07/retrospective.html</id><title type="text">4 months in, a retrospective</title><updated>2025-07-18T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Week 28</summary><published>2025-07-14T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/07/week28.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;week-28&quot;&gt;&lt;a href=&quot;#week-28&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Week 28&lt;/h1&gt; 933 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-07-14&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 934 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;caqti.platform,mariadb&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 935 - &lt;h2 id=&quot;ocaml-mcp-server&quot;&gt;&lt;a href=&quot;#ocaml-mcp-server&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OCaml MCP server&lt;/h2&gt; 936 - &lt;p&gt;Last week I got the summarisation to the point where it felt useful to run it across all the modules in opam. With this completed we then got to try out the MCP server to see how useful it would be in practice.&lt;/p&gt; 937 - &lt;p&gt;One of the first queries &lt;a href=&quot;https://anil.recoil.org/&quot;&gt;Anil&lt;/a&gt; tried was to ask it which libraries would be most useful for &amp;quot;date time parsing and formatting&amp;quot;. 
We were surprised to see that the first two libraries it returned were &lt;code&gt;caqti&lt;/code&gt; and &lt;code&gt;mariadb&lt;/code&gt;, specifically mentioning the modules &lt;code&gt;Caqti_platform.Conv&lt;/code&gt; and &lt;code&gt;Mariadb.S.Time&lt;/code&gt;. While these do indeed provide the required functionality, they're probably not the right libraries to provide this. It's going to be tricky to decide this in the MCP server, so we should probably be leaving it up to the LLM to decide amongst them on the client. However, for very general queries we might end up with a large number of matching libraries, so we'll need to have a limit on the number of packages returned, which implies some form of ranking.&lt;/p&gt; 938 - &lt;p&gt;One way we can do this is by using the occurrences code in odoc. The idea is that we examine module implementation files (i.e. ml rather than mli files), and count the number of times the code uses values, types and other identifiers from other libraries. We can then aggregate these counts over all packages in opam repository and use them as an effective marker of popularity, which allows us to rank the results by popularity and only return the top N results.&lt;/p&gt; 939 - &lt;p&gt;We're not currently using the occurrences for anything, so I wasn't especially surprised to find that it's not working as intended. 
There were a number of issues:&lt;/p&gt; 940 - &lt;ul&gt;&lt;li&gt;The occurrences output file was being written at a path not within the package dir, so it wasn't being persisted.&lt;/li&gt;&lt;li&gt;The CLI interface for generating occurrences works by providing a directory containing the odocl files, but we were only providing the top-level directory and it wasn't recursively searching.&lt;/li&gt;&lt;li&gt;Once the occurrences were captured, the aggregation step used the full identifier of the value being aggregated, meaning that, for example, &lt;code&gt;List.length&lt;/code&gt; in OCaml 5.3 was counted separately from &lt;code&gt;List.length&lt;/code&gt; in OCaml 4.14.&lt;/li&gt;&lt;/ul&gt; 941 - &lt;p&gt;All of these issues are with code in the odoc repository, which, as it happens, also needs a release soon to ensure that it works with the imminent launch of OCaml 5.4. During the week, before I discovered the problems above, I had attempted to make a release of Odoc 3.1, but there was a license kerfuffle that, when combined with the issues in the occurrences code, gave me enough cause to pull the release.&lt;/p&gt; 942 - &lt;p&gt;Before I try to make the release again, this time I'll be running the release candidate with docs-ci, and checking that the occurrences make sense. I set this running on Friday afternoon, and it had completed by Friday evening, so it's actually pretty quick to rerun odoc on the 15,000 or so packages required for ocaml.org.&lt;/p&gt; 943 - &lt;h2 id=&quot;trouble-with-this-blog&quot;&gt;&lt;a href=&quot;#trouble-with-this-blog&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Trouble with this blog&lt;/h2&gt; 944 - &lt;p&gt;In other news, in trying to post my blog at the beginning of the week, I was stymied a little by the changes in oxcaml. I had been using a custom opam-repository forked from the official oxcaml one, because I needed a patched js_of_ocaml in order to fix the toplevel code. 
I had hoped this would mean that I could update it on my schedule, rather than being at the mercy of upstream changes. Unfortunately though, the download URL for ocaml-flambda wasn't pointing at an immutable commit, so when I tried it I got a checksum error. So I ended up trying to rebase the changes onto the latest oxcaml opam-repository, which didn't go well at all. The version numbers had all changed, which in opam means that files are in different directories, so git got thoroughly confused. On top of that, because the js_of_ocaml repository has multiple packages in it, whereas opam repository has a directory per-package, we end up having multiple copies of the patches. So in the end I've just committed all the patches to a git repo on github, and pinned it in the Dockerfile that builds this site.&lt;/p&gt; 945 - &lt;p&gt;What would be handy is a way to apply the patches in a package in opam repository to and from a git repository, similar to quilt/guilt. We don't quite have all of the pieces to do this, as although we have a download URL and often a dev-repo, I don't believe we currently have a way to get the base commit of that repository.&lt;/p&gt; 946 - &lt;h2 id=&quot;oxcaml-continues&quot;&gt;&lt;a href=&quot;#oxcaml-continues&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Oxcaml continues&lt;/h2&gt; 947 - &lt;p&gt;We had a meeting on Thursday with Jane Street on the next steps for oxcaml. There are a number of areas in which JS are keen for us to help out with.&lt;/p&gt; 948 - &lt;ul&gt;&lt;li&gt;Playgrounds - both javascript and docker-based. The playground on the oxcaml website right now uses github codespaces, which works nicely but currently takes an absolute age to start up. We can almost certainly improve this by building docker images and pushing them to the docker hub, rather than building oxcaml from scratch when starting the codespace. 
There's also interest in the javascript playgrounds, which can serve a slightly different purpose than the docker-based one, more limited in how it can be used, but without requiring someone to spin up a full docker container.&lt;/li&gt;&lt;li&gt;Documentation - Odoc has had some patches to run on oxcaml, but there's no support for documenting many of the new features yet, including modes. We've got to do some experiments here to see what the best way is to show the new type-system features in the generated docs. There were some suggestions of using javascript to show/hide the modes, for example.&lt;/li&gt;&lt;li&gt;Improvements in Merlin - again this is an area ripe for investigation. In particular, how do we best expose the new features of the type system for users? What's needed here is user feedback from people who are actually using oxcaml to build real projects.&lt;/li&gt;&lt;li&gt;Better error messages - OCaml has been getting improved error messages with each release, but there's still room for improvement, and the new features of the type system in particular have many different failure modes. 
Again, we need user feedback to understand the pain points and improve the error messages accordingly.&lt;/li&gt;&lt;/ul&gt; 949 - &lt;h2 id=&quot;next-week&quot;&gt;&lt;a href=&quot;#next-week&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Next week&lt;/h2&gt; 950 - &lt;p&gt;Next week, the plan is to:&lt;/p&gt; 951 - &lt;ul&gt;&lt;li&gt;Check the occurrences from docs-ci, and integrate into the MCP server&lt;/li&gt;&lt;li&gt;Talk to &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark&lt;/a&gt; about building the docker image for the oxcaml playground&lt;/li&gt;&lt;li&gt;Tidy up the &lt;code&gt;Js_top_worker&lt;/code&gt; code so it can be used in the javascript oxcaml playground&lt;/li&gt;&lt;li&gt;Release Odoc 3.1&lt;/li&gt;&lt;/ul&gt;</content><id>https://jon.recoil.org/blog/2025/07/week28.html</id><title type="text">Week 28</title><updated>2025-07-14T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">As of today, Odoc 3 is now live on OCaml.org! This is a major update to odoc, and has brought a whole host of new features and improvements to the documentation pages.</summary><published>2025-07-14T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/07/odoc-3-live-on-ocaml-org.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;odoc-3-is-live-on-ocaml.org!&quot;&gt;&lt;a href=&quot;#odoc-3-is-live-on-ocaml.org!&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc 3 is live on OCaml.org!&lt;/h1&gt; 952 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-07-14&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 953 - &lt;p&gt;As of today, Odoc 3 is now live on OCaml.org! 
This is a major update to odoc, and has brought a whole host of new features and improvements to the documentation pages.&lt;/p&gt; 954 - &lt;p&gt;Some of the highlights include:&lt;/p&gt; 955 - &lt;ul&gt;&lt;li&gt;Source code rendering&lt;/li&gt;&lt;li&gt;Hierarchical manual pages&lt;/li&gt;&lt;li&gt;Image, video and audio support&lt;/li&gt;&lt;li&gt;Separation of API docs by library&lt;/li&gt;&lt;/ul&gt; 956 - &lt;p&gt;A huge amount of work went into the &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-0-released/16339&quot;&gt;Odoc 3.0 release&lt;/a&gt;, and I'd like to thank my colleagues at Tarides, in particular &lt;a href=&quot;https://github.com/panglesd&quot;&gt;Paul-Elliot&lt;/a&gt; and &lt;a href=&quot;https://github.com/julow/&quot;&gt;Jules&lt;/a&gt; for the work they put into this.&lt;/p&gt; 957 - &lt;p&gt;But the odoc release happened months ago, so why is it only going live now? So, the doc tool itself is only one small part of getting the docs onto ocaml.org. Odoc works on the &lt;a href=&quot;https://discuss.ocaml.org/t/cmt-cmti-question/5308&quot;&gt;cmt and cmti&lt;/a&gt; files that are produced during the build process, and so part of the process of building docs is to build the packages, so we have to, at minimum, attempt to build all 17,000 or so distinct versions of the packages in opam-repository. 
The &lt;a href=&quot;https://github.com/ocurrent&quot;&gt;ocurrent&lt;/a&gt; tool &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt;, which I've previously &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/05/docs-progress&quot;&gt;written&lt;/span&gt; &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/ocaml-docs-ci-and-odoc-3&quot;&gt;about&lt;/span&gt;, is responsible for these builds and in this new release has demonstrated a new approach to this task, where we attempt to do the build in as efficient a way as possible by effectively building binary packages once for each required package in a specific 'universe' of dependencies. For example, many packages require e.g. &lt;a href=&quot;https://erratique.ch/software/cmdliner&quot;&gt;cmdliner.1.3.0&lt;/a&gt; to build, and some require a specific version of OCaml too. So we'll build cmdliner.1.3.0 once against each version of OCaml required -- but &lt;i&gt;only once&lt;/i&gt;, which is in contrast to how some of the other tools in the ocurrent suite work, e.g. &lt;a href=&quot;https://github.com/ocurrent/opam-repo-ci&quot;&gt;opam-repo-ci&lt;/a&gt;. Once the packages are built, we then run the new tool &lt;a href=&quot;https://ocaml.github.io/odoc/odoc-driver/index.html&quot;&gt;odoc_driver&lt;/a&gt; to actually build the HTML docs. In addition to this, a new feature of Odoc 3 is to be able to link to packages that are your direct dependencies - so for example, the docs of odoc contain links to the docs of odoc_driver, even though odoc_driver depends upon odoc. 
This, whilst sounding easy enough, required some radical changes in the docs ci, which I promise I will write about later!&lt;/p&gt; 958 - &lt;p&gt;The builds and the generation of the docs are all done on a single blade server, called &lt;a href=&quot;https://sage.caelum.ci.dev&quot;&gt;sage&lt;/a&gt;, with 40 threads, two 8TiB spinning drives and a 1.8TiB SSD cache, and it produces about 1 TiB of data over the course of a couple of days. The changes required to this part of the process since odoc 2.x were made primarily by myself and &lt;a href=&quot;https://tunbury.org&quot;&gt;Mark Elvers&lt;/a&gt;.&lt;/p&gt; 959 - &lt;p&gt;Once the docs are built, how do they get onto ocaml.org? Odoc itself knows nothing about the layout and styling of ocaml.org, so the HTML it produces isn't suitable to be just rendered when a user requests particular docs. What happens is that odoc produces, as well as a self-contained HTML file, a json file with the body of the page, the sidebars, the breadcrumbs and so on as structured data, one per HTML page, which are then served from sage over HTTP. When a user requests a particular docs page, the ocaml.org server will request that json file from sage, then render this with the ocaml.org styling, then send it back to the user.&lt;/p&gt; 960 - &lt;p&gt;As odoc 3 moved a fair bit of logic from ocaml.org into odoc itself, there were quite a few changes that needed to be made to the ocaml.org server to integrate this into the site. 
This work was mostly done by &lt;a href=&quot;https://github.com/panglesd&quot;&gt;Paul-Elliot&lt;/a&gt; and myself, with a lot of help from the &lt;a href=&quot;https://github.com/ocaml/ocaml.org?tab=readme-ov-file#maintainers&quot;&gt;ocaml.org team&lt;/a&gt;, in particular &lt;a href=&quot;&quot;&gt;Sabine Schmaltz&lt;/a&gt; and &lt;a href=&quot;https://github.com/cuihtlauac&quot;&gt;Cuihtlauac Alvarado&lt;/a&gt;.&lt;/p&gt; 961 - &lt;p&gt;So, quite a lot of integration and infrastructure work was required to get the new docs site up and running, and I'm very happy to see this particular task concluded!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/07/odoc-3-live-on-ocaml-org.html</id><title type="text">Odoc 3 is live on OCaml.org!</title><updated>2025-07-14T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">It's been a busy few weeks. There's been exam marking for the 1A Foundations of Computer Science course, an Odoc release to plan, and some really interesting new work on using LLMs to summarise OCaml ...</summary><published>2025-07-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/07/week27.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;weeks-24-27&quot;&gt;&lt;a href=&quot;#weeks-24-27&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Weeks 24-27&lt;/h1&gt; 962 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-07-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 963 - &lt;p&gt;It's been a busy few weeks. There's been exam marking for the 1A Foundations of Computer Science course, an Odoc release to plan, and some really interesting new work on using LLMs to summarise OCaml documentation. 
This post is about an aspect of that last one that I found particularly interesting.&lt;/p&gt; 964 - &lt;h2 id=&quot;odoc-llm&quot;&gt;&lt;a href=&quot;#odoc-llm&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;odoc-llm&lt;/h2&gt; 965 - &lt;p&gt;Sadiq and I have been &lt;a href=&quot;https://toao.com/blog/ocaml-local-code-models&quot;&gt;looking at using LLMs&lt;/a&gt; to summarise the documentation produced by Odoc. The idea is to get a concise summary of the purpose of each module, so that we can quickly identify which modules are relevant to a particular task.&lt;/p&gt; 966 - &lt;p&gt;For testing this, we need to see how it works on different types of libraries. The first axis I wanted to test on goes between 'well documented' and 'poorly documented', and so I need at least two libraries on opposite ends of the spectrum.&lt;/p&gt; 967 - &lt;p&gt;For the 'well documented' case, I chose &lt;code&gt;cmdliner&lt;/code&gt;. It's a library that I almost always have to look at the docs for when I'm using it, as, despite using it many many times, the interface doesn't seem to stick in my head.&lt;/p&gt; 968 - &lt;p&gt;For the 'poorly documented' case, I chose &lt;code&gt;odoc&lt;/code&gt; itself, somewhat ironically. In defence of the odoc authors (me included!), the libraries it provides are simply there for code organisation and aren't meant to be consumed outside of the tool itself!&lt;/p&gt; 969 - &lt;h3 id=&quot;the-approach&quot;&gt;&lt;a href=&quot;#the-approach&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The approach&lt;/h3&gt; 970 - &lt;p&gt;The output from Odoc is a set of HTML files, one per module/module type/class/etc., containing the documentation for that item. Our first take on this was to parse the HTML files and extract the text content, which we then fed to an LLM to summarise. 
However, this was pretty awkward, so we decided to try a PR that &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1341&quot;&gt;davesnx recently made to Odoc&lt;/a&gt; to output markdown instead of HTML.&lt;/p&gt; 971 - &lt;p&gt;We look for leaf modules that don't contain any submodules, and start by summarising those, then move onto the parent modules, splicing in the summaries of the children, and so on, up to the top-level modules. We then move on to summarising the whole library, which usually is just a single namespace module, but occasionally is a group of top-level modules.&lt;/p&gt; 972 - &lt;p&gt;One of the early prompts for the module &lt;code&gt;Cmdliner.Term.Syntax&lt;/code&gt; looked roughly as follows:&lt;/p&gt; 973 - &lt;pre&gt;You are an expert OCaml developer. Write a 2-3 sentence description focusing on: 1046 + access and concatenation are critical.</code></pre></div> 1047 + <h2 id="limitations-and-future-work"><a href="#limitations-and-future-work" class="anchor"></a>Limitations and future work</h2> 1048 + <p>We're aware that there are currently a number of limitations with what's been done so far, and there are a lot of exciting things that could quite easily be added!</p> 1049 + <p>We haven't done much prompt optimisation, either for the tools themselves or for their descriptions in the MCP server. We also haven't done much optimisation of the information retrieval - and it's clear from some of the results shown above that there are improvements to be made in the ranking algorithms. Some obvious next steps would be to do some <a href="https://arxiv.org/html/2406.12433v2">re-ranking</a> or some form of hybrid search.</p> 1050 + <p>A particular challenge is that since this is based entirely off of the <code>ocaml-docs-ci</code> build, it won't necessarily reflect the actual API of your local build, as for OCaml, this <a href="https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html">can't be done</a>. 
Thibaut Mattio is working on a <a href="https://github.com/tmattio/ocaml-mcp">local MCP server</a> that would be perfectly positioned to do some of what we're doing, although we'd need to have a good local docs build implemented in dune for this to work well.</p> 1051 + <p>Also, there's plenty more data that we've collected during the docs builds! We can show the implementations of functions, we can expose code samples, select different versions of packages and much more. While we've concentrated on the search aspects, there's still a lot of low-hanging fruit that can be worked on.</p> 1052 + <p>If you're interested in helping us out on this, the project lives <a href="https://github.com/sadiqj/odoc-llm">on github</a> - come along and join us!</p> 1053 + <h2 id="using-the-server"><a href="#using-the-server" class="anchor"></a>Using the server</h2> 1054 + <p>If you'd like to try it, we've got a demo server running right now. It's hosted on dill.caelum.ci.dev here at the Computer Laboratory in the University of Cambridge. To enable it with Claude, try this:</p> 1055 + <div><pre class="language-bash"><code>claude mcp add -t sse ocaml http://dill.caelum.ci.dev:8000/sse</code></pre></div> 1056 + <p>Obviously this is pre-alpha quality software, and we might take it down with no notice, and it might not work as expected, and all of the other usual caveats. Let us know if it works, or doesn't, or if you've got some suggestions for improvements!</p>]]></content> 1057 + </entry> 1058 + <entry> 1059 + <id>https://jon.recoil.org/blog/2025/08/week33.html</id> 1060 + <title>Week 33</title> 1061 + <published>2025-08-19T00:00:00Z</published> 1062 + <updated>2025-08-19T00:00:00Z</updated> 1063 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/08/week33.html"/> 1064 + <summary>More work this week on the OCaml MCP server. Sadiq and I met before I went away on holiday and discussed the next steps to 'park' the work on the MCP server. 
The final steps are:</summary> 1065 + <content type="html"><![CDATA[<h1 id="week-33"><a href="#week-33" class="anchor"></a>Week 33</h1> 1066 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-08-19</p></li></ul> 1067 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>cohttp,yojson,jsonm</p></li></ul> 1068 + <p>More work this week on the OCaml MCP server. Sadiq and I met before I went away on holiday and discussed the next steps to 'park' the work on the MCP server. The final steps are:</p> 1069 + <ul><li>Write a README</li><li>Write and run a small script to fix a problem with module-type names</li><li>Write up and publish a blog post</li></ul> 1070 + <p>Not much, right? As always though, writing things up led to a whole load more work.</p> 1071 + <p>The first problem occurred when writing up how it parsed the input docs. It turned out that when converting the repo so that it took markdown formatted files (using a <a href="https://github.com/jonludlam/odoc/tree/odoc-llm-markdown">slightly tweaked</a> version of <a href="https://github.com/ocaml/odoc/pull/1341">davesnx's PR</a>), Claude had decided that the way to do this was to first convert the markdown into HTML, and then use the HTML parser it had already built. Whilst tidying this up, Claude was remarkably keen to just use regexps to parse the markdown rather than using a pre-existing markdown library, so it took a little persuasion to get it into a state I was happy with.</p> 1072 + <p>The second issue was that the scripts that form the bulk of the repo had been written at different times, and therefore Claude didn't really take into account any of the decisions it had made in one script when building the next. 
So most of the command-line arguments were slightly different, which made writing up a mini 'howto' in the README quite a jarring experience.</p> 1073 + <p>Thirdly, and most importantly, we had decided that we needed a few example searches to show how the system worked. We'd already had a <a href="../07/week28.html" title="week28">useful experience</a> with this when Anil had tried to search for a 'time and date parsing and formatting' library, so it shouldn't really have been a surprise that trying a few more examples showed some more interesting behaviour. Specifically, the searches I wanted to do were for an &quot;HTTP client&quot;, &quot;JSON parser&quot;, &quot;Cryptographic Hash&quot; and Anil's time-and-date query, and in actually trying these searches and critically examining the results, I had to go back and figure out why they weren't giving me the results I had expected.</p> 1074 + <p>The first of these searches I had anticipated would be quite interesting, as this is a query that should show the OCaml ecosystem <a href="https://discuss.ocaml.org/t/simple-modern-http-client-library/11239">missing an obvious HTTP client</a>. However, even with this in mind one of the top results was one of Cohttp's module types, <code>Cohttp.Generic.Client.S</code>. This, of course, isn't much use if you're looking for an HTTP client, as module-types aren't going to give you an implementation to actually use. So I decided that we'd exclude module-types from the results. This turned out to be slightly more tricky than I anticipated as we'd lost the distinction between modules and module types further back in the pipeline, so Claude had to do some plumbing to ensure we had this information at the point we were doing the search.</p> 1075 + <p>The cryptographic hash search gave some plausible looking results, so I moved on to the JSON search. I was expecting to see <code>Yojson</code> somewhere near the top of the list as that's a very popular library. 
I was also expecting to see <code>Jsonm</code> somewhere near the top - or at least I'd like to be able to find it by searching for a 'streaming parser' as that's one of its key strengths. However, searching for &quot;JSON parser&quot; yielded some less than brilliant answers. The top 5 results were for modules in the packages <code>yojson-five</code>, <code>decoders-yojson</code>, <code>decoders-jsonaf</code>, <code>ocplib-json-typed-browser</code> and <code>ppx_protocol_conv_jsonm</code>. While all of these are clearly in the same realm as I was after, having <code>jsonm</code> show up literally 99th in the list, and <code>yojson</code> itself not in the top 100 wasn't a great result.</p> 1076 + <p>Some investigation showed that yojson had a particularly bad showing because the description of the module <code>Yojson.Basic</code> was the empty string! This turned out to be because of some bad error-handling logic in the summariser script, which ended up turning some errors into a blank description. Since running the summariser costs actual money, I didn't want to just rerun the whole thing, so I asked Claude for a script to find these problems and rerun them. The problem is not totally trivial as the summaries of child modules are used when generating the summary for parents, so when one is regenerated we should regenerate the summaries of all ancestors too. Given my recent experiences with Claude I'd like to look this over quite carefully before letting it loose on my data, so I've run it on yojson, which seemed to do the right thing, but not yet on the rest of the packages.</p> 1077 + <p>Having fixed this, I still found that <code>jsonm</code> was making a very poor showing. This turned out to be because the description it gives itself is a &quot;Non-blocking streaming JSON codec for OCaml&quot; which had a fairly low similarity with &quot;JSON parser&quot;. 
I was using a fairly small embedding model for the queries - Qwen/Qwen3-Embedding-0.6B, so I thought I might address this by using a larger one, and opted for Qwen/Qwen3-Embedding-8B. The machine I had been using for the MCP server has no GPU and had taken a while to do the embeddings using the 0.6B model, so I switched to generating them on my M4 MacBook. This went <i>much</i> faster, though since I have about 70Mb of module summaries it still took quite a while. This improved the situation somewhat, but it was still not high in the list.</p> 1078 + <p>So I took a step back and had a think about the problem some more. Searching for a JSON parser is really quite a high-level search, and when evaluating the results I realised I was really thinking in terms of packages rather than modules. So I thought we could split the search in two - a package search and a module search. The package search would be used for the broad queries where you're interested in pulling in whole chunks of functionality, and the module search is for more low-level queries. In fact, the 'time and date formatting' query is somewhere in between, so I might need to have some more example queries for the module search functions. In addition, the module search could be restricted to the set of packages you're using, which might make it even more useful.</p> 1079 + <p>Part of the split meant that I needed a different source of 'popularity' for the packages than the occurrences data that came out of docs ci, as that was per-module and I needed something per-package. The obvious thing is to look at reverse dependencies in opam. I have this kind-of working, but it's currently not particularly smart, so this will need a little more attention. 
For example, it currently thinks that <a href="https://melange.re/v5.0.0/">melange</a> has over 3000 reverse dependencies.</p> 1080 + <p>With these changes in place, a package search for 'JSON parser' now returns <code>yojson</code> as number one, followed by <code>ppx_deriving_yojson</code>, <code>ezjsonm</code>, <code>ocplib-json-typed</code> and <code>jsonaf</code>. Unfortunately <code>jsonm</code> is still languishing in 27th place, so there's still some tweaking to do.</p>]]></content> 1081 + </entry> 1082 + <entry> 1083 + <id>https://jon.recoil.org/blog/2025/07/retrospective.html</id> 1084 + <title>4 months in, a retrospective</title> 1085 + <published>2025-07-18T00:00:00Z</published> 1086 + <updated>2025-07-18T00:00:00Z</updated> 1087 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/07/retrospective.html"/> 1088 + <summary>Astonishingly, it's already been four whole months since starting back at the university, which I find incredibly hard to believe. I'm utterly convinced that it was only a couple of weeks ago that I walked back into t...</summary> 1089 + <content type="html"><![CDATA[<h1 id="4-months-in,-a-retrospective"><a href="#4-months-in,-a-retrospective" class="anchor"></a>4 months in, a retrospective</h1> 1090 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-07-18</p></li></ul> 1091 + <p>Astonishingly, it's already been <i>four whole months</i> since starting back at the university, which I find incredibly hard to believe. I'm utterly convinced that it was only a couple of weeks ago that I walked back into the Computer Laboratory as an SRA for the first time since 2021, but here we are, at the end of term already. Time to do a bit of a retrospective and forward-looking plan for the next 3-4 months!</p> 1092 + <h2 id="what's-happened?"><a href="#what's-happened?" 
class="anchor"></a>What's happened?</h2> 1093 + <p>On Wednesday this week, I had a chance to sit down with Anil, supposedly to talk about the upcoming lecturing of 1A Foundations of Computer Science, but we ended up talking about what I've been doing for the past few months, and where it fits into the broader picture of the group as a whole. It was a really useful conversation, and I thought it would be good to outline it here while it's fresh in my mind.</p> 1094 + <p>So then, to start, what have I been doing? What have I achieved? What have I learnt? It's been a bit of a daunting experience, landing in a team that are already working one hundred miles an hour on things well out of my comfort zone. I've been going to group meetings and having lots of interesting conversations, but I've found it difficult to make the next steps happen. One area where I've had some success is in working with Sadiq on LLMs - in particular, getting local LLMs to solve programming exercises that we both <a href="https://toao.com/blog/ocaml-local-code-models">wrote</a> <a href="../05/ticks-solved-by-ai.html" title="ticks-solved-by-ai">up</a>. I've also been working with him on taking the output from the docs CI and <a href="https://github.com/sadiqj/odoc-llm">summarising it with LLMs</a> in order to create an MCP server that would help tools like <a href="https://anthropic.com/">Claude Code</a> to choose OCaml packages to solve users' problems.</p> 1095 + <p>It's been somewhat easier, partly due to inertia, to carry on with projects that had been in flight at the time I started. Things like getting the Odoc 3 generated docs onto ocaml.org, which is finally complete only <a href="odoc-3-live-on-ocaml-org.html" title="odoc-3-live-on-ocaml-org">as of this week!</a>. This has taken a whole lot of time, but I'm really pleased with the end results. 
There are still an awful lot of improvements that I'd like to see made, which, after drawing breath for a couple of weeks, I'll be writing down.</p> 1096 + <p>An itch I'd been wanting to scratch for a long time has been to look at client-side ocaml notebooks. I decided to make this an integral <a href="../04/this-site.html" title="this-site">feature of this blog</a>, and I've learnt an awful lot doing it. An important feature of this that I've been keeping in mind is the idea that we could use the ocaml-docs-ci tool to build the libraries, which would allow us to host a toplevel for every single package in opam-repository - allowing at best <a href="https://discuss.ocaml.org/t/an-example-for-every-ocaml-package/16953/10">interactive examples</a>, and at bare minimum merlin for live type-checking and autocompletion. The important principles to keep in mind for this are that:</p> 1097 + <ul><li>We have one 'toplevel' javascript file, and libraries and cmis are dynamically loaded</li><li>The interface between the frontend and the worker must not rely on a matched pair, e.g. an OCaml-5.3-compiled frontend might be talking to an OCaml-4.08-compiled worker thread - or even an oxcaml one!</li></ul> 1098 + <p>I have this all working on my blog, where I have both an oxcaml worker and a standard ocaml worker and they both dynamically load in libraries and cmis as specified on the page.</p> 1099 + <p>I've also supervised a 1A course for the first time - <a href="https://www.cl.cam.ac.uk/teaching/2425/IntroProb/">Introduction to Probability</a>, and I've done some marking for the 1A Foundations of Computer Science.</p> 1100 + <p>Something that I'd been expecting to do a lot on was work with oxcaml, but as the release happened later than anticipated and coincided with the marking and supervising, I've not done quite as much of this as I had thought I would. 
In addition, I had anticipated working on Odoc to start implementing the new features of oxcaml, but to avoid duplicating effort I've been waiting for the patches that have already been written at Jane Street to at least get odoc to compile, which have taken longer than I had hoped to get to me.</p> 1101 + <h2 id="what's-next?"><a href="#what's-next?" class="anchor"></a>What's next?</h2> 1102 + <p>With that in mind, Anil and I then talked about the bigger picture, as those of you who know Anil will be entirely unsurprised to hear! In particular, how will we be weaving the various threads of these activities - the teaching of OCaml, the large-scale (for OCaml) CI work, the LLMs and Oxcaml work together to form a coherent whole? How do I find a balance between them and ensure that we find <a href="https://arxiv.org/abs/1106.0848">synergies</a> as opposed to pulling in different directions? How do we make sure what we're doing helps us navigate the upending of the nature of development that agentic coding is bringing?</p> 1103 + <h3 id="efficient-and-reusable-ci"><a href="#efficient-and-reusable-ci" class="anchor"></a>Efficient and reusable CI</h3> 1104 + <p>A clear and obvious area where we'll be able to see real progress is to extract from docs CI the logic that I've been using to do efficient builds of packages. As I previously <a href="odoc-3-live-on-ocaml-org.html" title="odoc-3-live-on-ocaml-org">wrote about</a>, the new CI system is far more efficient than some of the other ocurrent-based pipelines, and it would save a huge amount of compute time if we were to take this tech and apply it elsewhere.</p> 1105 + <p>So, how might we take what we've got and produce something useful to everyone? We need to take a hammer to the fracture points of the docs CI service and split it into individually useful parts. Here are some next steps as I see them now. 
Let's take the solver out of docs CI, and have a service whose sole job is to create a repository of up-to-date solutions for all versions of all packages in opam-repository. These are the data that allow us to build the tree of package builds.</p> 1106 + <p>Next, turn these solutions into one giant build. Perhaps a script? Maybe a giant buildkit dockerfile? This is very similar to Mark Elvers' <a href="https://github.com/mtelvers/ohc">day10</a> project. We can get this running on a big machine and just see how fast we can build everything. The key thing here is that it should be <em>trivial</em> to run this on a linux box. A raspberry pi or a 768-core behemoth with 3TiB of ram. Just how fast <em>can</em> we get it going? It's already building in a couple of days using <a href="odoc-3-live-on-ocaml-org.html" title="odoc-3-live-on-ocaml-org">sage</a>, but that's using ocurrent/obuilder, which isn't quite the right tool for the job, and on a relatively puny machine. Can we do it in an hour? 10 minutes? Certainly the incremental builds ought to be done in seconds. What's the limit?</p> 1107 + <p>These tools can then be used as the foundation for other CI systems. For opam-repo-ci, where we should be able to do the builds for a new package incredibly quickly. For opam-health-check, where we currently build foundational packages like dune and findlib <i>thousands of times</i> per run.</p> 1108 + <p>Once we've got the packages built, docs CI is simply a pass over the top of the built artifacts. ocaml-docs-ci already demonstrates this - it only takes a few hours to rebuild all the docs when a new version of odoc is released, but in a way that only benefits docs! All the CI systems should be able to use this.</p> 1109 + <p>We should also then be able to run js_of_ocaml on the libraries to build the infrastructure needed for the per-package toplevels for ocaml.org that I mentioned above. 
Each of these steps should be separate stages in a pipeline - one where each step produces artifacts for the next to consume.</p> 1110 + <p>When we mix in some of the projects that other people in the team are working on, like David's work on <a href="https://www.dra27.uk/blog/">relocatable OCaml</a>, we've got something that might be able to form a basis for a binary cache for Dune Package Management, particularly when we involve Ryan's <a href="https://ryan.freumh.org/papers.html#2025-arxiv-hyperres">Hyperres</a> paper so we might check that dependencies from outside of the OCaml universe are correct. Can we use <a href="https://github.com/quantifyearth/shark">Patrick and Michael's shark</a> to do the build steps? Can we use these images to serve up toplevels for ocaml.org that are <em>real toplevels</em> rather than javascript toplevels? Can we use these build environments to help with reinforcement learning to train LLMs on OCaml code? There are a lot of interesting directions to take this work.</p> 1111 + <h3 id="other-projects"><a href="#other-projects" class="anchor"></a>Other projects</h3> 1112 + <p>There are, of course, other responsibilities that I have. Some of these I'll be able to fit in with the theme above, and some - well - maybe I'll have to figure out how to delegate them, a skill that I am not particularly good at, but one that I feel I should learn!</p> 1113 + <h4 id="teaching"><a href="#teaching" class="anchor"></a>Teaching</h4> 1114 + <p>A looming, terrifying, but tremendously exciting opportunity is teaching of 1A Foundations of Computer Science. This is amongst the first courses we teach our incoming undergraduates, currently lectured by <a href="https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/">Anil</a>. As he's on sabbatical this year, he has asked me to step up and lecture it. This is definitely not one for delegation!</p> 1115 + <p>The immediate question, partly raised by my work with Sadiq, is: what do we do about LLMs? 
How should we adjust our teaching to take into account the existence of these tools? We had a very interesting chat earlier in the term with Professor <a href="https://eecs.iisc.ac.in/people/prof-viraj-kumar/">Viraj Kumar</a> from <a href="https://eecs.iisc.ac.in/">IISc</a> who was visiting Cambridge earlier this year. He's been <a href="https://dl.acm.org/doi/10.1145/3724363.3729100">working on this question</a> for a while now, and I hope to have some more conversations with him over the summer.</p> 1116 + <h4 id="odoc-paper"><a href="#odoc-paper" class="anchor"></a>Odoc paper</h4> 1117 + <p>An area where I've really made a shockingly small amount of progress is to write up all the work that's gone into Odoc over the past 6 (!!!) years.</p> 1118 + <h4 id="odoc-notebooks"><a href="#odoc-notebooks" class="anchor"></a>Odoc notebooks</h4> 1119 + <p>This needs to be tidied up and a v0.1 released. In particular, the work on js_top_worker might well be shared with Arthur's <a href="https://github.com/art-w/x-ocaml">x-ocaml</a> for a unified toplevel experience.</p> 1120 + <h4 id="ai-work"><a href="#ai-work" class="anchor"></a>AI work</h4> 1121 + <p>I'd like to carry on the work I've started with Sadiq on the interaction of LLMs with OCaml. Getting the package search to work sensibly for an MCP server is first on the list, but also doing some reinforcement learning to improve specifically the performance on OCaml is very interesting, but not something I've managed to carve out the time for yet. Something along the lines of <a href="https://arxiv.org/abs/2504.21798">swesmith</a> but adapted for OCaml.</p> 1122 + <h4 id="oxcaml-odoc"><a href="#oxcaml-odoc" class="anchor"></a>Oxcaml Odoc</h4> 1123 + <p>Odoc needs to have some work done on it to support the new work that's gone into oxcaml, for example documenting of the modes. 
This is something I do expect to be working on soon.</p> 1124 + <h4 id="dune-and-odoc"><a href="#dune-and-odoc" class="anchor"></a>Dune and odoc</h4> 1125 + <p>Work needs to be done on the dune rules for odoc, which currently only support the feature-set in odoc 2.x. Paul-Elliot has <a href="https://github.com/ocaml/dune/pull/11716">done some work on this</a>, but much more needs to be done.</p> 1126 + <h4 id="further-general-odoc-work"><a href="#further-general-odoc-work" class="anchor"></a>Further general odoc work</h4> 1127 + <ul><li>Better source rendering</li><li>Syntax for linking to source</li><li>Custom tags (used in odoc_notebook)</li><li>Web-native rendering, for embedding odoc in a website</li><li>Unifying paths and cpaths (https://github.com/jonludlam/odoc/tree/parameterised-paths)</li></ul> 1128 + <h2 id="what-to-actually-do?"><a href="#what-to-actually-do?" class="anchor"></a>What to <i>actually</i> do?</h2> 1129 + <p>There are a lot of things in the above list. I'm not sure yet how I manage to figure out what I actually end up doing, and how that helps me to help Tarides, to fit in as a useful member of the EEG group, and to make sure I'm doing what's right for my own future. 
I feel the core project of the CI work will help everyone, but slotting the other work into the bigger picture will require some careful thought.</p>]]></content> 1130 + </entry> 1131 + <entry> 1132 + <id>https://jon.recoil.org/blog/2025/07/week28.html</id> 1133 + <title>Week 28</title> 1134 + <published>2025-07-14T00:00:00Z</published> 1135 + <updated>2025-07-14T00:00:00Z</updated> 1136 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/07/week28.html"/> 1137 + <summary>Week 28</summary> 1138 + <content type="html"><![CDATA[<h1 id="week-28"><a href="#week-28" class="anchor"></a>Week 28</h1> 1139 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-07-14</p></li></ul> 1140 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>caqti.platform,mariadb</p></li></ul> 1141 + <h2 id="ocaml-mcp-server"><a href="#ocaml-mcp-server" class="anchor"></a>OCaml MCP server</h2> 1142 + <p>Last week I got the summarisation to the point where it felt useful to run it across all the modules in opam. With this completed we then got to try out the MCP server to see how useful it would be in practice.</p> 1143 + <p>One of the first queries <a href="https://anil.recoil.org/">Anil</a> tried was to ask it which libraries would be most useful for &quot;date time parsing and formatting&quot;. We were surprised to see that the first two libraries it returned were <code>caqti</code> and <code>mariadb</code>, specifically mentioning the module <code>Caqti_platform.Conv</code> and <code>Mariadb.S.Time</code>. While these do indeed provide the required functionality, they're probably not the right libraries to provide this. It's going to be tricky to decide this in the MCP server, so we should probably be leaving it up to the LLM to decide amongst them on the client. 
However, for very general queries we might end up with a large number of matching libraries, so we'll need to have a limit on the number of packages returned, which implies some form of ranking.</p> 1144 + <p>One way we can do this is by using the occurrences code in odoc. The idea is that we examine module implementation files (ie, ml rather than mli files), and count the number of times the code uses values, types and other identifiers from other libraries. We can then aggregate these counts over all packages in opam repository and use it as an effective marker of popularity, which allows us to rank the results by popularity and only return the top N results.</p> 1145 + <p>We're not currently using the occurrences for anything, so I wasn't especially surprised to find that it's not working as intended. There were a number of issues:</p> 1146 + <ul><li>The occurrences output file was being written at a path not within the package dir, so it wasn't being persisted.</li><li>The CLI interface for generating occurrences works by providing a directory containing the odocl files, but we were only providing the top-level directory and it wasn't recursively searching.</li><li>Once the occurrences were captured, the aggregation step used the full identifier of the value being aggregated, meaning that, for example, <code>List.length</code> in OCaml 5.3 was counted separately from <code>List.length</code> in OCaml 4.14.</li></ul> 1147 + <p>All of these issues are with code in the odoc repository, which, as it happens, also needs a release soon to ensure that it works with the imminent launch of OCaml 5.4. 
During the week, before I discovered the problems above, I had attempted to make a release of Odoc 3.1, but there was a license kerfuffle that, when combined with the issues in the occurrences code, gave me enough cause to pull the release.</p> 1148 + <p>Before I try to make the release again, this time I'll be running the release candidate with docs-ci, and checking that the occurrences make sense. I set this running on Friday afternoon, and it had completed by Friday evening, so it's actually pretty quick to rerun odoc on the 15,000 or so packages required for ocaml.org.</p> 1149 + <h2 id="trouble-with-this-blog"><a href="#trouble-with-this-blog" class="anchor"></a>Trouble with this blog</h2> 1150 + <p>In other news, in trying to post my blog at the beginning of the week, I was stymied a little by the changes in oxcaml. I had been using a custom opam-repository forked from the official oxcaml one, because I needed a patched js_of_ocaml in order to fix the toplevel code. I had hoped this would mean that I could update it on my schedule, rather than being at the mercy of upstream changes. Unfortunately though, the download URL for ocaml-flambda wasn't pointing at an immutable commit, so when I tried it I got a checksum error. So I ended up trying to rebase the changes onto the latest oxcaml opam-repository, which didn't go well at all. The version numbers had all changed, which in opam means that files are in different directories, so git got thoroughly confused. On top of that, because the js_of_ocaml repository has multiple packages in it, whereas opam repository has a directory per-package, we end up having multiple copies of the patches. So in the end I've just committed all the patches to a git repo on github, and pinned it in the Dockerfile that builds this site.</p> 1151 + <p>What would be handy is a way to apply the patches in a package in opam repository to and from a git repository, similar to quilt/guilt. 
We don't quite have all of the pieces to do this, as although we have a download URL and often a dev-repo, I don't believe we currently have a way to get the base commit of that repository.</p> 1152 + <h2 id="oxcaml-continues"><a href="#oxcaml-continues" class="anchor"></a>Oxcaml continues</h2> 1153 + <p>We had a meeting on Thursday with Jane Street on the next steps for oxcaml. There are a number of areas in which JS are keen for us to help out.</p> 1154 + <ul><li>Playgrounds - both javascript and docker-based. The playground on the oxcaml website right now uses github codespaces, which works nicely but currently takes an absolute age to start up. We can almost certainly improve this by building docker images and pushing them to the docker hub, rather than building oxcaml from scratch when starting the codespace. There's also interest in the javascript playgrounds, which can serve a slightly different purpose than the docker-based one, more limited in how it can be used, but without requiring someone to spin up a full docker container.</li><li>Documentation - Odoc has had some patches to run on oxcaml, but there's no support for documenting many of the new features yet, including modes. We've got to do some experiments here to see what the best way is to show the new type-system features in the generated docs. There were some suggestions of using javascript to show/hide the modes, for example.</li><li>Improvements in Merlin - again this is an area ripe for investigation. In particular, how do we best expose the new features of the type system for users? What's needed here is user feedback from people who are actually using oxcaml to build real projects.</li><li>Better error messages - OCaml has been getting improved error messages with each release, but there's still room for improvement, and the new features of the type system in particular have many different failure modes. 
Again, we need user feedback to understand the pain points and improve the error messages accordingly.</li></ul> 1155 + <h2 id="next-week"><a href="#next-week" class="anchor"></a>Next week</h2> 1156 + <p>Next week, the plan is to:</p> 1157 + <ul><li>Check the occurrences from docs-ci, and integrate into the MCP server</li><li>Talk to <a href="https://tunbury.org/">Mark</a> about building the docker image for the oxcaml playground</li><li>Tidy up the <code>Js_top_worker</code> code so it can be used in the javascript oxcaml playground</li><li>Release Odoc 3.1</li></ul>]]></content> 1158 + </entry> 1159 + <entry> 1160 + <id>https://jon.recoil.org/blog/2025/07/odoc-3-live-on-ocaml-org.html</id> 1161 + <title>Odoc 3 is live on OCaml.org!</title> 1162 + <published>2025-07-14T00:00:00Z</published> 1163 + <updated>2025-07-14T00:00:00Z</updated> 1164 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/07/odoc-3-live-on-ocaml-org.html"/> 1165 + <summary>As of today, Odoc 3 is now live on OCaml.org! This is a major update to odoc, and has brought a whole host of new features and improvements to the documentation pages.</summary> 1166 + <content type="html"><![CDATA[<h1 id="odoc-3-is-live-on-ocaml.org!"><a href="#odoc-3-is-live-on-ocaml.org!" class="anchor"></a>Odoc 3 is live on OCaml.org!</h1> 1167 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-07-14</p></li></ul> 1168 + <p>As of today, Odoc 3 is now live on OCaml.org! 
This is a major update to odoc, and has brought a whole host of new features and improvements to the documentation pages.</p> 1169 + <p>Some of the highlights include:</p> 1170 + <ul><li>Source code rendering</li><li>Hierarchical manual pages</li><li>Image, video and audio support</li><li>Separation of API docs by library</li></ul> 1171 + <p>A huge amount of work went into the <a href="https://discuss.ocaml.org/t/ann-odoc-3-0-released/16339">Odoc 3.0 release</a>, and I'd like to thank my colleagues at Tarides, in particular <a href="https://github.com/panglesd">Paul-Elliot</a> and <a href="https://github.com/julow/">Jules</a> for the work they put into this.</p> 1172 + <p>But the odoc release happened months ago, so why is it only going live now? So, the doc tool itself is only one small part of getting the docs onto ocaml.org. Odoc works on the <a href="https://discuss.ocaml.org/t/cmt-cmti-question/5308">cmt and cmti</a> files that are produced during the build process, and so part of the process of building docs is to build the packages, so we have to, at minimum, attempt to build all 17,000 or so distinct versions of the packages in opam-repository. The <a href="https://github.com/ocurrent">ocurrent</a> tool <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a>, which I've previously <a href="../05/docs-progress.html" title="docs-progress">written</a> <a href="../04/ocaml-docs-ci-and-odoc-3.html" title="ocaml-docs-ci-and-odoc-3">about</a>, is responsible for these builds and in this new release has demonstrated a new approach to this task, where we attempt to do the build in as efficient a way as possible by effectively building binary packages once for each required package in a specific 'universe' of dependencies. For example, many packages require e.g. <a href="https://erratique.ch/software/cmdliner">cmdliner.1.3.0</a> to build, and some require a specific version of OCaml too. 
So we'll build cmdliner.1.3.0 once against each version of OCaml required -- but <i>only once</i>, which is in contrast to how some of the other tools in the ocurrent suite work, e.g. <a href="https://github.com/ocurrent/opam-repo-ci">opam-repo-ci</a>. Once the packages are built, we then run the new tool <a href="https://ocaml.github.io/odoc/odoc-driver/index.html">odoc_driver</a> to actually build the HTML docs. In addition to this, a new feature of Odoc 3 is to be able to link to packages that aren't your direct dependencies - so for example, the docs of odoc contain links to the docs of odoc_driver, even though odoc_driver depends upon odoc. This, whilst sounding easy enough, required some radical changes in the docs ci, which I promise I will write about later!</p> 1173 + <p>The builds and the generation of the docs are all done on a single blade server called <a href="https://sage.caelum.ci.dev">sage</a>, with 40 threads, two 8TiB spinning drives and a 1.8TiB SSD cache, and it produces about 1 TiB of data over the course of a couple of days. The changes required to this part of the process since odoc 2.x were made primarily by myself and <a href="https://tunbury.org">Mark Elvers</a>.</p> 1174 + <p>Once the docs are built, how do they get onto ocaml.org? Odoc itself knows nothing about the layout and styling of ocaml.org, so the HTML it produces isn't suitable to be served as-is when a user requests particular docs. What happens is that odoc produces, as well as a self-contained HTML file, a json file with the body of the page, the sidebars, the breadcrumbs and so on as structured data, one per HTML page, which are then served from sage over HTTP. 
When a user requests a particular docs page, the ocaml.org server will request that json file from sage, then render this with the ocaml.org styling, then send it back to the user.</p> 1175 + <p>As odoc 3 moved a fair bit of logic from ocaml.org into odoc itself, there were quite a few changes that needed to be made to the ocaml.org server to integrate this into the site. This work was mostly done by <a href="https://github.com/panglesd">Paul-Elliot</a> and myself, with a lot of help from the <a href="https://github.com/ocaml/ocaml.org?tab=readme-ov-file#maintainers">ocaml.org team</a>, in particular <a href="">Sabine Schmaltz</a> and <a href="https://github.com/cuihtlauac">Cuihtlauac Alvarado</a>.</p> 1176 + <p>So, quite a lot of integration and infrastructure work was required to get the new docs site up and running, and I'm very happy to see this particular task concluded!</p>]]></content> 1177 + </entry> 1178 + <entry> 1179 + <id>https://jon.recoil.org/blog/2025/07/week27.html</id> 1180 + <title>Weeks 24-27</title> 1181 + <published>2025-07-07T00:00:00Z</published> 1182 + <updated>2025-07-07T00:00:00Z</updated> 1183 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/07/week27.html"/> 1184 + <summary>It's been a busy few weeks. There's been exam marking for the 1A Foundations of Computer Science course, an Odoc release to plan, and some really interesting new work on using LLMs to summarise OCaml ...</summary> 1185 + <content type="html"><![CDATA[<h1 id="weeks-24-27"><a href="#weeks-24-27" class="anchor"></a>Weeks 24-27</h1> 1186 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-07-07</p></li></ul> 1187 + <p>It's been a busy few weeks. There's been exam marking for the 1A Foundations of Computer Science course, an Odoc release to plan, and some really interesting new work on using LLMs to summarise OCaml documentation. 
This post is about an aspect of that last one that I found particularly interesting.</p> 1188 + <h2 id="odoc-llm"><a href="#odoc-llm" class="anchor"></a>odoc-llm</h2> 1189 + <p>Sadiq and I have been <a href="https://toao.com/blog/ocaml-local-code-models">looking at using LLMs</a> to summarise the documentation produced by Odoc. The idea is to get a concise summary of the purpose of each module, so that we can quickly identify which modules are relevant to a particular task.</p> 1190 + <p>For testing this, we need to see how it works on different types of libraries. The first axis I wanted to test on goes between 'well documented' and 'poorly documented', and so I need at least two libraries on opposite ends of the spectrum.</p> 1191 + <p>For the 'well documented' case, I chose <code>cmdliner</code>. It's a library that I almost always have to look at the docs for when I'm using it, as, despite using it many many times, the interface doesn't seem to stick in my head.</p> 1192 + <p>For the 'poorly documented' case, I chose <code>odoc</code> itself, somewhat ironically. In defence of the odoc authors (me included!), the libraries it provides are simply there for code organisation and aren't meant to be consumed outside of the tool itself!</p> 1193 + <h3 id="the-approach"><a href="#the-approach" class="anchor"></a>The approach</h3> 1194 + <p>The output from Odoc is a set of HTML files, one per module/module type/class/etc., containing the documentation for that item. Our first take on this was to parse the HTML files and extract the text content, which we then fed to an LLM to summarise. 
However, this was pretty awkward, so we decided to try a PR that <a href="https://github.com/ocaml/odoc/pull/1341">davesnx recently made to Odoc</a> to output markdown instead of HTML.</p> 1195 + <p>We look for leaf modules that don't contain any submodules, and start by summarising those, then move onto the parent modules, splicing in the summaries of the children, and so on, up to the top-level modules. We then move on to summarising the whole library, which usually is just a single namespace module, but occasionally is a group of top-level modules.</p> 1196 + <p>One of the early prompts for the module <code>Cmdliner.Term.Syntax</code> looked roughly as follows:</p> 1197 + <pre>You are an expert OCaml developer. Write a 2-3 sentence description focusing on: 974 1198 - The specific operations and functions this module provides 975 1199 - What data types or structures it works with 976 - - Concrete use cases (avoid generic terms like &amp;quot;utility functions&amp;quot; or &amp;quot;common operations&amp;quot;) 1200 + - Concrete use cases (avoid generic terms like &quot;utility functions&quot; or &quot;common operations&quot;) 977 1201 978 1202 Do NOT: 979 1203 - Repeat the module name in the description 980 - - Mention &amp;quot;functional programming patterns&amp;quot; or &amp;quot;code clarity&amp;quot; 981 - - Use filler phrases like &amp;quot;provides functionality for&amp;quot; or &amp;quot;collection of functions&amp;quot; 1204 + - Mention &quot;functional programming patterns&quot; or &quot;code clarity&quot; 1205 + - Use filler phrases like &quot;provides functionality for&quot; or &quot;collection of functions&quot; 982 1206 - Describe how it works with other modules 983 1207 984 1208 Module: Syntax 985 1209 Module Documentation: let operators. 986 1210 ( let+ ) is map. 987 1211 ( and* ) is product. 988 - - val (let+) : 'a t -&amp;gt; ('a -&amp;gt; 'b) -&amp;gt; 'b t (* ( let+ ) is map. 
*) 989 - - val (and+) : 'a t -&amp;gt; 990 - 'b t -&amp;gt; 991 - ('a * 'b) t (* ( and* ) is product. *)&lt;/pre&gt; 992 - &lt;p&gt;and the output using a small model (qwen3-30b-a3b) was:&lt;/p&gt; 993 - &lt;pre&gt;&amp;quot;The module provides (let+) for applying functions to values within a context and (and+) for combining two contexts into a product. It operates on applicative 1212 + - val (let+) : 'a t -&gt; ('a -&gt; 'b) -&gt; 'b t (* ( let+ ) is map. *) 1213 + - val (and+) : 'a t -&gt; 1214 + 'b t -&gt; 1215 + ('a * 'b) t (* ( and* ) is product. *)</pre> 1216 + <p>and the output using a small model (qwen3-30b-a3b) was:</p> 1217 + <pre>&quot;The module provides (let+) for applying functions to values within a context and (and+) for combining two contexts into a product. It operates on applicative 994 1218 structures like option, list, or custom types that support these operations. For example, it enables sequential transformation of values in a context or 995 - pairing elements from two separate contexts.&amp;quot;&lt;/pre&gt; 996 - &lt;p&gt;There are quite a few issues with the input here. Firstly, we've only given the module name, not the full path. Secondly, there's nothing to let the model know what &lt;code&gt;t&lt;/code&gt; might be, so it has decided it's a completely generic type. It also has no idea about the context in which this module was found, so it has no idea that it's part of a command-line processing library. By fixing these issues, we end up with the prompt:&lt;/p&gt; 997 - &lt;pre&gt;You are an expert OCaml developer. Write a 2-3 sentence description focusing on: 1219 + pairing elements from two separate contexts.&quot;</pre> 1220 + <p>There are quite a few issues with the input here. Firstly, we've only given the module name, not the full path. Secondly, there's nothing to let the model know what <code>t</code> might be, so it has decided it's a completely generic type. 
It also has no idea about the context in which this module was found, so it has no idea that it's part of a command-line processing library. By fixing these issues, we end up with the prompt:</p> 1221 + <pre>You are an expert OCaml developer. Write a 2-3 sentence description focusing on: 998 1222 - The specific operations and functions this module provides 999 1223 - What data types or structures it works with 1000 - - Concrete use cases (avoid generic terms like &amp;quot;utility functions&amp;quot; or &amp;quot;common operations&amp;quot;) 1224 + - Concrete use cases (avoid generic terms like &quot;utility functions&quot; or &quot;common operations&quot;) 1001 1225 1002 1226 Do NOT: 1003 1227 - Repeat the module name in the description 1004 - - Mention &amp;quot;functional programming patterns&amp;quot; or &amp;quot;code clarity&amp;quot; 1005 - - Use filler phrases like &amp;quot;provides functionality for&amp;quot; or &amp;quot;collection of functions&amp;quot; 1228 + - Mention &quot;functional programming patterns&quot; or &quot;code clarity&quot; 1229 + - Use filler phrases like &quot;provides functionality for&quot; or &quot;collection of functions&quot; 1006 1230 - Describe how it works with other modules 1007 1231 1008 1232 Module: Cmdliner.Term.Syntax ··· 1015 1239 A term is evaluated by a program to produce a result, which can be turned into an exit status. A term made of terms referring to command line arguments implicitly defines a command line syntax. 1016 1240 1017 1241 Module Documentation: let operators. 1018 - - val (let+) : 'a Cmdliner.Term.t -&amp;gt; ('a -&amp;gt; 'b) -&amp;gt; 'b Cmdliner.Term.t (* ( let+ ) is map. *) 1019 - - val (and+) : 'a Cmdliner.Term.t -&amp;gt; 1020 - 'b Cmdliner.Term.t -&amp;gt; 1021 - ('a * 'b) Cmdliner.Term.t (* ( and* ) is product. 
*)&lt;/pre&gt; 1022 - &lt;p&gt;The output of this improved prompt is much better:&lt;/p&gt; 1023 - &lt;pre&gt;The module provides operators to map and combine terms, which represent command line argument parsers and their results. `let+` transforms a parsed argument 1242 + - val (let+) : 'a Cmdliner.Term.t -&gt; ('a -&gt; 'b) -&gt; 'b Cmdliner.Term.t (* ( let+ ) is map. *) 1243 + - val (and+) : 'a Cmdliner.Term.t -&gt; 1244 + 'b Cmdliner.Term.t -&gt; 1245 + ('a * 'b) Cmdliner.Term.t (* ( and* ) is product. *)</pre> 1246 + <p>The output of this improved prompt is much better:</p> 1247 + <pre>The module provides operators to map and combine terms, which represent command line argument parsers and their results. `let+` transforms a parsed argument 1024 1248 into a new value, while `and+` merges two independent arguments into a tuple. These enable building structured command line interfaces, such as parsing a 1025 - filename and a flag simultaneously, then combining them into a configuration record.&lt;/pre&gt; 1026 - &lt;p&gt;It still occasionally incorrectly decides that this module provides monadic combinators rather than applicative, but this is where we get better results from using a larger model.&lt;/p&gt; 1027 - &lt;p&gt;There are quite a few other issues that we've fixed - for example, treating module types differently than modules, and a bug where infix operators were being omitted from the documentation. In one case, it uncovered a bug in the markdown generator where includes weren't getting expanded, which I got &lt;a href=&quot;https://github.com/jonludlam/odoc/commit/926cca100c307818e57281c3d40e98f1975f0f95&quot;&gt;Claude to fix&lt;/a&gt;. 
My &lt;i&gt;modus operandi&lt;/i&gt; has essentially been to look at the output for the test packages, find a summary that looks bonkers, and then look back at the prompt to find that, indeed, the input was missing some crucial information.&lt;/p&gt; 1028 - &lt;p&gt;One thing I'd quite like to try is to re-open a &lt;a href=&quot;https://github.com/ocaml/odoc/pull/655&quot;&gt;PR that Drup made&lt;/a&gt; as an April Fool's joke back in 2001, which ended up outputting OCaml formatted code. This is actually pretty close to what we might want to give to the LLM - though we'd probably format the comments as markdown, and we'd be replacing types with fully-qualified types as above. Funny how things turn out!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/07/week27.html</id><title type="text">Weeks 24-27</title><updated>2025-07-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Some brief notes on last week.</summary><published>2025-06-09T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/06/week23.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;week-23&quot;&gt;&lt;a href=&quot;#week-23&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Week 23&lt;/h1&gt; 1029 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;opam-format,fpath,rresult,bos&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1030 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;merlinonly&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;merlinonly&lt;/span&gt; &lt;/li&gt;&lt;/ul&gt; 1031 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-06-09&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1032 - &lt;p&gt;Some brief notes on last week.&lt;/p&gt; 1033 - &lt;h2 id=&quot;docs-ci-and-sherlodoc&quot;&gt;&lt;a 
href=&quot;#docs-ci-and-sherlodoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Docs CI and Sherlodoc&lt;/h2&gt; 1034 - &lt;p&gt;Anil has been working on an &lt;a href=&quot;https://tangled.sh/@anil.recoil.org/odoc-mcp&quot;&gt;MCP server&lt;/a&gt; that searches through the output of the docs CI to find relevant packages and API information for opam packages. For expediency, this works by scraping the HTML output, but a potentially better solution would be to integrate properly with &lt;a href=&quot;https://doc.sherlocode.com&quot;&gt;Sherlodoc&lt;/a&gt;, &lt;a href=&quot;https://github.com/art-w/&quot;&gt;Arthur's&lt;/a&gt; code search engine.&lt;/p&gt; 1035 - &lt;h3 id=&quot;building-the-index&quot;&gt;&lt;a href=&quot;#building-the-index&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Building the index&lt;/h3&gt; 1036 - &lt;p&gt;To make this work with the new docs CI, first we need to build a sherlodoc search database. This involves grabbing all of the &lt;code&gt;.odocl&lt;/code&gt; files that odoc produces for the latest version of each library, copying them locally and running &lt;code&gt;sherlodoc index&lt;/code&gt; on the output. 
Getting &lt;em&gt;all&lt;/em&gt; of the odocl files is simple, but filtering so we only have the latest version is slightly more complex, as we need to use &lt;code&gt;opam&lt;/code&gt;'s library to make sure we're looking at the latest versions.&lt;/p&gt; 1037 - &lt;p&gt;The simple way to get the odocl files ends up unpacking them into the filesystem in a directory hierarchy that matches the URL on ocaml.org, so we see files like:&lt;/p&gt; 1038 - &lt;pre&gt;p/odoc/3.0.0/doc/odoc.document/odoc_document.odocl&lt;/pre&gt; 1039 - &lt;p&gt;So finding the version number is a matter of listing the directories, for which I took &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci/blob/4dfe7e6265610da4e0ce2a386cfbf0b8eac3d9bd/src/lib/track.ml#L58-L76&quot;&gt;some code&lt;/a&gt; from docs CI:&lt;/p&gt; 1040 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type p = { 1249 + filename and a flag simultaneously, then combining them into a configuration record.</pre> 1250 + <p>It still occasionally incorrectly decides that this module provides monadic combinators rather than applicative, but this is where we get better results from using a larger model.</p> 1251 + <p>There are quite a few other issues that we've fixed - for example, treating module types differently than modules, and a bug where infix operators were being omitted from the documentation. In one case, it uncovered a bug in the markdown generator where includes weren't getting expanded, which I got <a href="https://github.com/jonludlam/odoc/commit/926cca100c307818e57281c3d40e98f1975f0f95">Claude to fix</a>. 
My <i>modus operandi</i> has essentially been to look at the output for the test packages, find a summary that looks bonkers, and then look back at the prompt to find that, indeed, the input was missing some crucial information.</p> 1252 + <p>One thing I'd quite like to try is to re-open a <a href="https://github.com/ocaml/odoc/pull/655">PR that Drup made</a> as an April Fool's joke back in 2021, which ended up outputting OCaml formatted code. This is actually pretty close to what we might want to give to the LLM - though we'd probably format the comments as markdown, and we'd be replacing types with fully-qualified types as above. Funny how things turn out!</p>]]></content> 1253 + </entry> 1254 + <entry> 1255 + <id>https://jon.recoil.org/blog/2025/06/week23.html</id> 1256 + <title>Week 23</title> 1257 + <published>2025-06-09T00:00:00Z</published> 1258 + <updated>2025-06-09T00:00:00Z</updated> 1259 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/06/week23.html"/> 1260 + <summary>Some brief notes on last week.</summary> 1261 + <content type="html"><![CDATA[<h1 id="week-23"><a href="#week-23" class="anchor"></a>Week 23</h1> 1262 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>opam-format,fpath,rresult,bos</p></li></ul> 1263 + <ul class="at-tags"><li class="merlinonly"><span class="at-tag">merlinonly</span> </li></ul> 1264 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-06-09</p></li></ul> 1265 + <p>Some brief notes on last week.</p> 1266 + <h2 id="docs-ci-and-sherlodoc"><a href="#docs-ci-and-sherlodoc" class="anchor"></a>Docs CI and Sherlodoc</h2> 1267 + <p>Anil has been working on an <a href="https://tangled.sh/@anil.recoil.org/odoc-mcp">MCP server</a> that searches through the output of the docs CI to find relevant packages and API information for opam packages. 
For expediency, this works by scraping the HTML output, but a potentially better solution would be to integrate properly with <a href="https://doc.sherlocode.com">Sherlodoc</a>, <a href="https://github.com/art-w/">Arthur's</a> code search engine.</p> 1268 + <h3 id="building-the-index"><a href="#building-the-index" class="anchor"></a>Building the index</h3> 1269 + <p>To make this work with the new docs CI, first we need to build a sherlodoc search database. This involves grabbing all of the <code>.odocl</code> files that odoc produces for the latest version of each library, copying them locally and running <code>sherlodoc index</code> on the output. Getting <em>all</em> of the odocl files is simple, but filtering so we only have the latest version is slightly more complex, as we need to use <code>opam</code>'s library to make sure we're looking at the latest versions.</p> 1270 + <p>The simple way to get the odocl files ends up unpacking them into the filesystem in a directory hierarchy that matches the URL on ocaml.org, so we see files like:</p> 1271 + <pre>p/odoc/3.0.0/doc/odoc.document/odoc_document.odocl</pre> 1272 + <p>So finding the version number is a matter of listing the directories, for which I took <a href="https://github.com/ocurrent/ocaml-docs-ci/blob/4dfe7e6265610da4e0ce2a386cfbf0b8eac3d9bd/src/lib/track.ml#L58-L76">some code</a> from docs CI:</p> 1273 + <div><pre class="language-ocaml"><code>type p = { 1041 1274 opam : OpamPackage.t; 1042 1275 path : Fpath.t; 1043 1276 } 1044 1277 1045 1278 let rec take n l = 1046 1279 match n, l with 1047 - | n, x::xs when n &amp;gt; 0 -&amp;gt; 1280 + | n, x::xs when n &gt; 0 -&gt; 1048 1281 x :: take (n - 1) xs 1049 - | _, _ -&amp;gt; [] 1282 + | _, _ -&gt; [] 1050 1283 1051 1284 let get_versions ~limit path = 1052 1285 let open Rresult in 1053 1286 let package = Fpath.basename path in 1054 1287 let mk_pkg v = 1055 - Printf.sprintf &amp;quot;%s.%s&amp;quot; package v 1288 + Printf.sprintf &quot;%s.%s&quot; package v 
1056 1289 in 1057 1290 Bos.OS.Dir.contents path 1058 - &amp;gt;&amp;gt;| (fun versions -&amp;gt; 1291 + &gt;&gt;| (fun versions -&gt; 1059 1292 versions 1060 - |&amp;gt; List.map (fun path -&amp;gt; 1061 - { opam = Fpath.basename path |&amp;gt; mk_pkg |&amp;gt; OpamPackage.of_string; 1293 + |&gt; List.map (fun path -&gt; 1294 + { opam = Fpath.basename path |&gt; mk_pkg |&gt; OpamPackage.of_string; 1062 1295 path = path }) 1063 1296 ) 1064 - |&amp;gt; Result.get_ok 1065 - |&amp;gt; (fun v -&amp;gt; 1297 + |&gt; Result.get_ok 1298 + |&gt; (fun v -&gt; 1066 1299 v 1067 - |&amp;gt; List.sort (fun a b -&amp;gt; 1300 + |&gt; List.sort (fun a b -&gt; 1068 1301 -OpamPackage.compare a.opam b.opam) 1069 - |&amp;gt; take limit)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1070 - &lt;p&gt;This gives us a sorted list of the versions for the package, and we can pick the first one to get the latest version. With the output from this we can then run &lt;code&gt;sherlodoc index&lt;/code&gt; and we get a nice big (1.7 gig!) index file.&lt;/p&gt; 1071 - &lt;h3 id=&quot;serving-the-index&quot;&gt;&lt;a href=&quot;#serving-the-index&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Serving the index&lt;/h3&gt; 1072 - &lt;p&gt;The next step is to serve this index file so that the MCP server can access it. The file format is a marshalled OCaml value, so we need to have an OCaml program to read it in and perform the search, and it'll have to be a server since the whole index needs to be unmarshalled into memory before any search can be performed, and it would be dumb to do this for every query.&lt;/p&gt; 1073 - &lt;p&gt;Sherlodoc got partially integrated into odoc's code base before the 3.0 release with the exception of the server, which was left out to avoid pulling a load of new dependencies to odoc. Unfortunately, we didn't expose the sherlodoc libraries publicly, so we'll need to do that in order to make anything useful with sherlodoc. 
In addition, odoc embeds the version of odoc used into the odocl files, and since right now the docs CI is building with a version of odoc that &lt;em&gt;doesn't&lt;/em&gt; expose the libraries, we might have to hack around that in order to use those odocl files. Obviously the longer term solution is just to make a new release of odoc with this change and update the docs CI to use that.&lt;/p&gt; 1074 - &lt;h2 id=&quot;package-to-library-map&quot;&gt;&lt;a href=&quot;#package-to-library-map&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Package to Library map&lt;/h2&gt; 1075 - &lt;p&gt;A related quest was to assemble a map of opam package to ocamlfind library names. It's a quirk of history that the library names that an opam package provides are not necessarily related to the name of the package. That means that tools like &lt;code&gt;dune&lt;/code&gt; have a hard time linting projects to check that the libraries they're using are mentioned in the opam files. Dune, of course, has resolved this be ensuring that it's an error to build a package using dune where the library names don't start with the package name, but as dune is just one of many OCaml build systems, the problem remains.&lt;/p&gt; 1076 - &lt;p&gt;Since docs CI has built every version of every package, and because the Odoc 3 package layout includes the library names in the paths to the documentation, we should be able to produce a fairly definitive list of the libraries that each package provides, which tools like dune can then use for this sort of lint check. 
We can just tweak the code above slightly to get the library names and output a big JSON file with the mapping - or perhaps with the exceptions to dune's rule.&lt;/p&gt; 1077 - &lt;p&gt;I thought this would be a neat first project to try Claude Code on - a 'starter for ten' - as it were, so I signed up to use Claude code and gave it a shot.&lt;/p&gt; 1078 - &lt;p&gt;It handily produced a working program that did exactly what I wanted, including creating a test directory that it used to verify the code worked. One fascinating thing I noted as it scrolled past was that it tried to use &lt;code&gt;yojson&lt;/code&gt; to write the output, but failed to get it to work and reverted back to &lt;code&gt;printf&lt;/code&gt; output. I suspect this will be due to it finding it troublesome to figure out the various steps that need to be taken to use a new library in a dune project, so this is something to have a play with later.&lt;/p&gt; 1079 - &lt;p&gt;After a couple of iterations with different heuristics to disambiguate between library names and other directories, I got a working program producing a JSON file with only the exceptions to dune's rule. I took a look through and almost immediately found &lt;code&gt;camlidl&lt;/code&gt; suggesting it produces a library called &lt;code&gt;com&lt;/code&gt;. This didn't look right at all so I installed it and found that the library is actually named &lt;code&gt;camlidl&lt;/code&gt;. The &lt;code&gt;cma&lt;/code&gt; file, though, is named &lt;code&gt;com.cma&lt;/code&gt;, so it looks like &lt;code&gt;odoc_driver&lt;/code&gt; has a bug. Interestingly, running &lt;code&gt;odoc_driver&lt;/code&gt; locally gets the library name correct, so it's only an issue when running it in the docs CI. 
&lt;a href=&quot;https://github.com/ocaml/odoc/issues/1351&quot;&gt;Issue filed&lt;/a&gt;.&lt;/p&gt; 1080 - &lt;h2 id=&quot;further-claude-code-experiments&quot;&gt;&lt;a href=&quot;#further-claude-code-experiments&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Further claude code experiments&lt;/h2&gt; 1081 - &lt;p&gt;To see how well Claude Code could handle more complex tasks, I thought I'd give it a whirl on something more like its home territory, and somewhere where I was less familiar. I decided to ask it to write some code to make a nicer editor experience for the notebooks project. Since the &lt;a href=&quot;https://github.com/jonludlam/jsoo-code-mirror&quot;&gt;bindings to codemirror&lt;/a&gt; I'm using are very minimal, any new features I want to use end up with needing to write a bunch of bindings first, and only then seeing if the feature works as I'd like. So instead I thought I'd get claude to write the editor code for me in javascript, and then I could make sure it works as I want and only then convert it to OCaml. This worked pretty nicely, and I've now got a neat &lt;a href=&quot;https://jon.ludl.am/experiments/notebook-editor/notebook-editable.html&quot;&gt;demonstration editor&lt;/a&gt; that I can use to guide the OCaml implementation.&lt;/p&gt; 1082 - &lt;h2 id=&quot;more-notebook-work&quot;&gt;&lt;a href=&quot;#more-notebook-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;More notebook work&lt;/h2&gt; 1083 - &lt;p&gt;The oxcaml project will be launching this week hopefully. I've been looking at Luke's &lt;a href=&quot;https://github.com/ocaml-flambda/flambda-backend/pull/3886&quot;&gt;Parallelism tutorial&lt;/a&gt; and have been thinking about how this will work with the online notebooks. The parallel library works via effects, and the oxcaml branch of js_of_ocaml doesn't support effects yet, and it might be a while before it does. 
However, the blog post is mainly talking about the intricacies of the type system work that's been done to ensure the parallel library is safe, and as such perhaps we can get a lot out of doing this online with just Merlin.&lt;/p&gt; 1084 - &lt;p&gt;Some early experimentation showed that the parallel library breaks the worker on load, so we need to do something a bit more sophisticated than just 'not call exec', so I did some work to have a mode of worker that doesn't load the cmas, just the cmis for Merlin. This is almost there.&lt;/p&gt; 1085 - &lt;h2 id=&quot;odoc-work&quot;&gt;&lt;a href=&quot;#odoc-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc work&lt;/h2&gt; 1086 - &lt;p&gt;Ocaml 5.4 is just around the corner, and there's some odoc work to be done before the release. One of the main new features that will impact odoc is the new &lt;a href=&quot;https://tyconmismatch.com/papers/ml2024_labeled_tuples.pdf&quot;&gt;labelled tuples&lt;/a&gt; feature. Fortunately &lt;a href=&quot;https://github.com/lukemaurer&quot;&gt;Luke Maurer&lt;/a&gt; has already done a lot of work to plumb this into odoc, so this will save us a lot of work - thanks, Luke! There's likely to be a few other bits and pieces to do, but hopefully not too much.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/06/week23.html</id><title type="text">Week 23</title><updated>2025-06-09T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">The docs build is progress well, and we've hit 20,000 packages (20,038 to be precise). 
So at this point I thought it'd be useful to take a look through the various failures to see if there are any in...</summary><published>2025-05-29T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/docs-progress.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;progress-in-ocaml-docs&quot;&gt;&lt;a href=&quot;#progress-in-ocaml-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Progress in OCaml docs&lt;/h1&gt; 1087 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-29&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1088 - &lt;p&gt;The docs build is progress well, and we've &lt;i&gt;just about&lt;/i&gt; hit 20,000 packages (20,038 to be precise). So at this point I thought it'd be useful to take a look through the various failures to see if there are any insights to be gained.&lt;/p&gt; 1089 - &lt;p&gt;Odoc requires a built package in order to generate the docs, there are two steps that have to be done before we can begin building the docs. Step one is to figure out the exact set of packages to build - ie, doing an opam solve, and step two is to actually build the packages. These two steps are, to some extent, out of docs-ci's control, and rely on the state of opam repository. While there are efforts to keep this in as good a state as possible, it's still the case that these steps fail much more often than the actual docs build itself. Let's take a look at some of the failures we see in each of these steps.&lt;/p&gt; 1090 - &lt;h2 id=&quot;step-1:-opam-solve&quot;&gt;&lt;a href=&quot;#step-1:-opam-solve&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Step 1: opam solve&lt;/h2&gt; 1091 - &lt;p&gt;There are 2,074 solver failures. A good chunk of these are due to the way docs ci works itself, that it starts with a specific version of OCaml. 
In order to do this, the solution must have a specific version of OCaml in it, and this is not always the case, for example, all of the &lt;code&gt;conf-*&lt;/code&gt; packages fail in this way. This particular class of &amp;quot;failures&amp;quot; is not at all important, as mostly they don't contain useful documentation, but even if they do, if they're actually being used then they will be built as part of another solution. For example, while &lt;code&gt;conf-faad&lt;/code&gt; fails with this error, the solution of the &lt;code&gt;faad&lt;/code&gt; package pulls it in anyway, so we can build any docs that it includes. Roughly two thirds (685) of the reported failures are due to this, and by checking the resulting HTML docs we can see that we do get docs for 278 of these, so they must be pulled in by other solutions.&lt;/p&gt; 1092 - &lt;p&gt;The remaining failures are &amp;quot;real&amp;quot; in the sense that we never currently get docs for these packages. In turn, these can be subcategorised. One class of failures happen with platform-specific packages, for example &lt;code&gt;camlkit&lt;/code&gt; which provides bindings to Cocoa frameworks, and is only available on macOS, or &lt;code&gt;eio_windows&lt;/code&gt; which clearly requires Windows. The current docs-ci setup only builds on Linux, and extending this to other platforms will require a little more work, and is not currently scheduled. These are &amp;quot;fixable&amp;quot; failures.&lt;/p&gt; 1093 - &lt;p&gt;The third class of failures are those that will &amp;quot;just never work&amp;quot;. For example, some early versions of &lt;code&gt;domainslib&lt;/code&gt; were released before the OCaml 5.0 APIs were finalised, and so they can only work with alpha versions of OCaml 5.0. We won't be documenting these.&lt;/p&gt; 1094 - &lt;p&gt;Finally there are some more 'unexplained' failures, such as &lt;code&gt;docteur.0.0.2&lt;/code&gt;. 
This one's particularly interesting as the solve actually succeeds when using the stand-alone tool opam-0install, whereas it's failing in docs-ci, which uses opam-0install as a library! I'm currently suspicious about the 'deprecated' flag, as the failure log contains:&lt;/p&gt; 1095 - &lt;pre&gt;- git-cohttp-unix -&amp;gt; (problem) 1302 + |&gt; take limit)</code></pre></div> 1303 + <p>This gives us a sorted list of the versions for the package, and we can pick the first one to get the latest version. With the output from this we can then run <code>sherlodoc index</code> and we get a nice big (1.7 gig!) index file.</p> 1304 + <h3 id="serving-the-index"><a href="#serving-the-index" class="anchor"></a>Serving the index</h3> 1305 + <p>The next step is to serve this index file so that the MCP server can access it. The file format is a marshalled OCaml value, so we need to have an OCaml program to read it in and perform the search, and it'll have to be a server since the whole index needs to be unmarshalled into memory before any search can be performed, and it would be dumb to do this for every query.</p> 1306 + <p>Sherlodoc got partially integrated into odoc's code base before the 3.0 release with the exception of the server, which was left out to avoid pulling a load of new dependencies to odoc. Unfortunately, we didn't expose the sherlodoc libraries publicly, so we'll need to do that in order to make anything useful with sherlodoc. In addition, odoc embeds the version of odoc used into the odocl files, and since right now the docs CI is building with a version of odoc that <em>doesn't</em> expose the libraries, we might have to hack around that in order to use those odocl files. 
Obviously the longer term solution is just to make a new release of odoc with this change and update the docs CI to use that.</p> 1307 + <h2 id="package-to-library-map"><a href="#package-to-library-map" class="anchor"></a>Package to Library map</h2> 1308 + <p>A related quest was to assemble a map of opam package to ocamlfind library names. It's a quirk of history that the library names that an opam package provides are not necessarily related to the name of the package. That means that tools like <code>dune</code> have a hard time linting projects to check that the libraries they're using are mentioned in the opam files. Dune, of course, has resolved this by ensuring that it's an error to build a package using dune where the library names don't start with the package name, but as dune is just one of many OCaml build systems, the problem remains.</p> 1309 + <p>Since docs CI has built every version of every package, and because the Odoc 3 package layout includes the library names in the paths to the documentation, we should be able to produce a fairly definitive list of the libraries that each package provides, which tools like dune can then use for this sort of lint check. We can just tweak the code above slightly to get the library names and output a big JSON file with the mapping - or perhaps with the exceptions to dune's rule.</p> 1310 + <p>I thought this would be a neat first project to try Claude Code on, a 'starter for ten' as it were, so I signed up to use Claude Code and gave it a shot.</p> 1311 + <p>It handily produced a working program that did exactly what I wanted, including creating a test directory that it used to verify the code worked. One fascinating thing I noted as it scrolled past was that it tried to use <code>yojson</code> to write the output, but failed to get it to work and reverted back to <code>printf</code> output. 
I suspect this will be due to it finding it troublesome to figure out the various steps that need to be taken to use a new library in a dune project, so this is something to have a play with later.</p> 1312 + <p>After a couple of iterations with different heuristics to disambiguate between library names and other directories, I got a working program producing a JSON file with only the exceptions to dune's rule. I took a look through and almost immediately found <code>camlidl</code> suggesting it produces a library called <code>com</code>. This didn't look right at all so I installed it and found that the library is actually named <code>camlidl</code>. The <code>cma</code> file, though, is named <code>com.cma</code>, so it looks like <code>odoc_driver</code> has a bug. Interestingly, running <code>odoc_driver</code> locally gets the library name correct, so it's only an issue when running it in the docs CI. <a href="https://github.com/ocaml/odoc/issues/1351">Issue filed</a>.</p> 1313 + <h2 id="further-claude-code-experiments"><a href="#further-claude-code-experiments" class="anchor"></a>Further claude code experiments</h2> 1314 + <p>To see how well Claude Code could handle more complex tasks, I thought I'd give it a whirl on something more like its home territory, and somewhere where I was less familiar. I decided to ask it to write some code to make a nicer editor experience for the notebooks project. Since the <a href="https://github.com/jonludlam/jsoo-code-mirror">bindings to codemirror</a> I'm using are very minimal, any new features I want to use end up with needing to write a bunch of bindings first, and only then seeing if the feature works as I'd like. So instead I thought I'd get claude to write the editor code for me in javascript, and then I could make sure it works as I want and only then convert it to OCaml. 
This worked pretty nicely, and I've now got a neat <a href="https://jon.ludl.am/experiments/notebook-editor/notebook-editable.html">demonstration editor</a> that I can use to guide the OCaml implementation.</p> 1315 + <h2 id="more-notebook-work"><a href="#more-notebook-work" class="anchor"></a>More notebook work</h2> 1316 + <p>The oxcaml project will be launching this week hopefully. I've been looking at Luke's <a href="https://github.com/ocaml-flambda/flambda-backend/pull/3886">Parallelism tutorial</a> and have been thinking about how this will work with the online notebooks. The parallel library works via effects, and the oxcaml branch of js_of_ocaml doesn't support effects yet, and it might be a while before it does. However, the blog post is mainly talking about the intricacies of the type system work that's been done to ensure the parallel library is safe, and as such perhaps we can get a lot out of doing this online with just Merlin.</p> 1317 + <p>Some early experimentation showed that the parallel library breaks the worker on load, so we need to do something a bit more sophisticated than just 'not call exec', so I did some work to have a mode of worker that doesn't load the cmas, just the cmis for Merlin. This is almost there.</p> 1318 + <h2 id="odoc-work"><a href="#odoc-work" class="anchor"></a>Odoc work</h2> 1319 + <p>Ocaml 5.4 is just around the corner, and there's some odoc work to be done before the release. One of the main new features that will impact odoc is the new <a href="https://tyconmismatch.com/papers/ml2024_labeled_tuples.pdf">labelled tuples</a> feature. Fortunately <a href="https://github.com/lukemaurer">Luke Maurer</a> has already done a lot of work to plumb this into odoc, so this will save us a lot of work - thanks, Luke! 
There's likely to be a few other bits and pieces to do, but hopefully not too much.</p>]]></content> 1320 + </entry> 1321 + <entry> 1322 + <id>https://jon.recoil.org/blog/2025/05/docs-progress.html</id> 1323 + <title>Progress in OCaml docs</title> 1324 + <published>2025-05-29T00:00:00Z</published> 1325 + <updated>2025-05-29T00:00:00Z</updated> 1326 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/docs-progress.html"/> 1327 + <summary>The docs build is progressing well, and we've hit 20,000 packages (20,038 to be precise). So at this point I thought it'd be useful to take a look through the various failures to see if there are any in...</summary> 1328 + <content type="html"><![CDATA[<h1 id="progress-in-ocaml-docs"><a href="#progress-in-ocaml-docs" class="anchor"></a>Progress in OCaml docs</h1> 1329 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-29</p></li></ul> 1330 + <p>The docs build is progressing well, and we've <i>just about</i> hit 20,000 packages (20,038 to be precise). So at this point I thought it'd be useful to take a look through the various failures to see if there are any insights to be gained.</p> 1331 + <p>Odoc requires a built package in order to generate the docs, so there are two steps that have to be done before we can begin building the docs. Step one is to figure out the exact set of packages to build - ie, doing an opam solve, and step two is to actually build the packages. These two steps are, to some extent, out of docs-ci's control, and rely on the state of the opam repository. While there are efforts to keep this in as good a state as possible, it's still the case that these steps fail much more often than the actual docs build itself. Let's take a look at some of the failures we see in each of these steps.</p> 1332 + <h2 id="step-1:-opam-solve"><a href="#step-1:-opam-solve" class="anchor"></a>Step 1: opam solve</h2> 1333 + <p>There are 2,074 solver failures. 
A good chunk of these are due to the way docs-ci itself works: it starts with a specific version of OCaml. In order to do this, the solution must have a specific version of OCaml in it, and this is not always the case; for example, all of the <code>conf-*</code> packages fail in this way. This particular class of &quot;failures&quot; is not at all important, as mostly they don't contain useful documentation, but even if they do, if they're actually being used then they will be built as part of another solution. For example, while <code>conf-faad</code> fails with this error, the solution of the <code>faad</code> package pulls it in anyway, so we can build any docs that it includes. Roughly two thirds (685) of the reported failures are due to this, and by checking the resulting HTML docs we can see that we do get docs for 278 of these, so they must be pulled in by other solutions.</p> 1334 + <p>The remaining failures are &quot;real&quot; in the sense that we never currently get docs for these packages. In turn, these can be subcategorised. One class of failures happens with platform-specific packages, for example <code>camlkit</code> which provides bindings to Cocoa frameworks, and is only available on macOS, or <code>eio_windows</code> which clearly requires Windows. The current docs-ci setup only builds on Linux, and extending this to other platforms will require a little more work, and is not currently scheduled. These are &quot;fixable&quot; failures.</p> 1335 + <p>The third class of failures are those that will &quot;just never work&quot;. For example, some early versions of <code>domainslib</code> were released before the OCaml 5.0 APIs were finalised, and so they can only work with alpha versions of OCaml 5.0. We won't be documenting these.</p> 1336 + <p>Finally there are some more 'unexplained' failures, such as <code>docteur.0.0.2</code>. 
This one's particularly interesting as the solve actually succeeds when using the stand-alone tool opam-0install, whereas it's failing in docs-ci, which uses opam-0install as a library! I'm currently suspicious about the 'deprecated' flag, as the failure log contains:</p> 1337 + <pre>- git-cohttp-unix -&gt; (problem) 1096 1338 No usable implementations: 1097 1339 git-cohttp-unix.3.6.0: Availability condition not satisfied 1098 1340 git-cohttp-unix.3.5.0: Availability condition not satisfied 1099 1341 git-cohttp-unix.3.4.0: Availability condition not satisfied 1100 1342 git-cohttp-unix.3.3.3: Availability condition not satisfied 1101 1343 git-cohttp-unix.3.3.2: Availability condition not satisfied 1102 - ...&lt;/pre&gt; 1103 - &lt;p&gt;and that flag is the only thing I can immediately see that stands out in &lt;code&gt;git-cohttp-unix&lt;/code&gt;. In contrast, the solution given by opam-0install contains &lt;code&gt;git-cohttp-unix.3.6.0&lt;/code&gt; as a dependency. I suspect fixing this will cause quite a few more packages to succeed.&lt;/p&gt; 1104 - &lt;h2 id=&quot;step-2:-building-packages&quot;&gt;&lt;a href=&quot;#step-2:-building-packages&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Step 2: building packages&lt;/h2&gt; 1105 - &lt;p&gt;The next step, once we've got the solutions, is to build the packages. This is using the new method I &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/ocaml-docs-ci-and-odoc-3&quot;&gt;previously wrote about&lt;/span&gt;. There are about 1,000 packages that fail to build, and once again we can take a look and categorise some of these failures. There are a wider variety of failures here, and it's quite useful to cross-check with &lt;em&gt;opam health check&lt;/em&gt; to see if it's known to be broken. Unfortunately OHC only builds the latest versions of everything, so we can't check in some cases. 
The interesting issues are where we're failing to build something that seems to work in OHC.&lt;/p&gt; 1106 - &lt;h3 id=&quot;llvm.18&quot;&gt;&lt;a href=&quot;#llvm.18&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;llvm.18&lt;/h3&gt; 1107 - &lt;p&gt;This is an interesting type of error, where the build fails because of a missing external dependency. The &lt;code&gt;llvm&lt;/code&gt; package depends upon &lt;code&gt;conf-llvm-static.18&lt;/code&gt;, which should be able to install the depext. Looking at the package, it does indeed have a depext for Debian:&lt;/p&gt; 1108 - &lt;pre&gt;depexts: [ 1109 - [&amp;quot;llvm@18&amp;quot; &amp;quot;zstd&amp;quot;] {os-distribution = &amp;quot;homebrew&amp;quot; &amp;amp; os = &amp;quot;macos&amp;quot;} 1110 - [&amp;quot;llvm-18&amp;quot;] {os-distribution = &amp;quot;macports&amp;quot; &amp;amp; os = &amp;quot;macos&amp;quot;} 1111 - [&amp;quot;llvm-18-dev&amp;quot; &amp;quot;zlib1g-dev&amp;quot; &amp;quot;libzstd-dev&amp;quot;] {os-family = &amp;quot;debian&amp;quot;} 1112 - [&amp;quot;llvm18-dev&amp;quot;] {os-distribution = &amp;quot;alpine&amp;quot;} 1113 - [&amp;quot;llvm18&amp;quot;] {os-family = &amp;quot;arch&amp;quot;} 1114 - [&amp;quot;llvm18-devel&amp;quot;] {os-family = &amp;quot;suse&amp;quot; | os-family = &amp;quot;opensuse&amp;quot;} 1115 - [&amp;quot;llvm18-devel&amp;quot;] {os-distribution = &amp;quot;fedora&amp;quot; &amp;amp; os-version &amp;gt;= &amp;quot;41&amp;quot;} 1116 - [&amp;quot;llvm-devel&amp;quot;] {os-distribution = &amp;quot;fedora&amp;quot; &amp;amp; os-version = &amp;quot;40&amp;quot;} 1117 - [&amp;quot;llvm18-devel&amp;quot; &amp;quot;epel-release&amp;quot;] {os-distribution = &amp;quot;centos&amp;quot;} 1118 - [&amp;quot;devel/llvm18&amp;quot;] {os = &amp;quot;freebsd&amp;quot;} 1119 - ]&lt;/pre&gt; 1120 - &lt;p&gt;However, in Debian 12, they've already updated to &lt;code&gt;llvm-19&lt;/code&gt;, so the depext is not available.&lt;/p&gt; 1121 - &lt;h3 
id=&quot;camlimages.5.0.5&quot;&gt;&lt;a href=&quot;#camlimages.5.0.5&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;camlimages.5.0.5&lt;/h3&gt; 1122 - &lt;p&gt;This one fails due to a linking error. Oddly enough it does seem to work in OHC.&lt;/p&gt; 1123 - &lt;pre&gt;(cd _build/default &amp;amp;&amp;amp; /home/opam/.opam/4.14/bin/ocamlmklib.opt -g -o freetype/camlimages_freetype_stubs freetype/ftintf.o -ldopt -lfreetype) 1344 + ...</pre> 1345 + <p>and that flag is the only thing I can immediately see that stands out in <code>git-cohttp-unix</code>. In contrast, the solution given by opam-0install contains <code>git-cohttp-unix.3.6.0</code> as a dependency. I suspect fixing this will cause quite a few more packages to succeed.</p> 1346 + <h2 id="step-2:-building-packages"><a href="#step-2:-building-packages" class="anchor"></a>Step 2: building packages</h2> 1347 + <p>The next step, once we've got the solutions, is to build the packages. This is using the new method I <a href="../04/ocaml-docs-ci-and-odoc-3.html" title="ocaml-docs-ci-and-odoc-3">previously wrote about</a>. There are about 1,000 packages that fail to build, and once again we can take a look and categorise some of these failures. There are a wider variety of failures here, and it's quite useful to cross-check with <em>opam health check</em> to see if it's known to be broken. Unfortunately OHC only builds the latest versions of everything, so we can't check in some cases. The interesting issues are where we're failing to build something that seems to work in OHC.</p> 1348 + <h3 id="llvm.18"><a href="#llvm.18" class="anchor"></a>llvm.18</h3> 1349 + <p>This is an interesting type of error, where the build fails because of a missing external dependency. The <code>llvm</code> package depends upon <code>conf-llvm-static.18</code>, which should be able to install the depext. 
Looking at the package, it does indeed have a depext for Debian:</p> 1350 + <pre>depexts: [ 1351 + [&quot;llvm@18&quot; &quot;zstd&quot;] {os-distribution = &quot;homebrew&quot; &amp; os = &quot;macos&quot;} 1352 + [&quot;llvm-18&quot;] {os-distribution = &quot;macports&quot; &amp; os = &quot;macos&quot;} 1353 + [&quot;llvm-18-dev&quot; &quot;zlib1g-dev&quot; &quot;libzstd-dev&quot;] {os-family = &quot;debian&quot;} 1354 + [&quot;llvm18-dev&quot;] {os-distribution = &quot;alpine&quot;} 1355 + [&quot;llvm18&quot;] {os-family = &quot;arch&quot;} 1356 + [&quot;llvm18-devel&quot;] {os-family = &quot;suse&quot; | os-family = &quot;opensuse&quot;} 1357 + [&quot;llvm18-devel&quot;] {os-distribution = &quot;fedora&quot; &amp; os-version &gt;= &quot;41&quot;} 1358 + [&quot;llvm-devel&quot;] {os-distribution = &quot;fedora&quot; &amp; os-version = &quot;40&quot;} 1359 + [&quot;llvm18-devel&quot; &quot;epel-release&quot;] {os-distribution = &quot;centos&quot;} 1360 + [&quot;devel/llvm18&quot;] {os = &quot;freebsd&quot;} 1361 + ]</pre> 1362 + <p>However, in Debian 12, they've already updated to <code>llvm-19</code>, so the depext is not available.</p> 1363 + <h3 id="camlimages.5.0.5"><a href="#camlimages.5.0.5" class="anchor"></a>camlimages.5.0.5</h3> 1364 + <p>This one fails due to a linking error. 
Oddly enough it does seem to work in OHC.</p> 1365 + <pre>(cd _build/default &amp;&amp; /home/opam/.opam/4.14/bin/ocamlmklib.opt -g -o freetype/camlimages_freetype_stubs freetype/ftintf.o -ldopt -lfreetype) 1124 1366 # /usr/bin/ld: freetype/ftintf.o: warning: relocation against `Caml_state' in read-only section `.text' 1125 1367 # /usr/bin/ld: freetype/ftintf.o: relocation R_X86_64_PC32 against undefined symbol `Caml_state' can not be used when making a shared object; recompile with -fPIC 1126 - # /usr/bin/ld: final link failed: bad value&lt;/pre&gt; 1127 - &lt;h3 id=&quot;ahrocksdb.0.2.2&quot;&gt;&lt;a href=&quot;#ahrocksdb.0.2.2&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;ahrocksdb.0.2.2&lt;/h3&gt; 1128 - &lt;p&gt;This one fails in OHC too, but it looks like it's a build failure with more recent gccs, fixed upstream: https://github.com/ahrefs/ocaml-ahrocksdb/commit/e52316b3d30fededac023141bf8b47da79cabfed&lt;/p&gt; 1129 - &lt;pre&gt;# run: gcc -O2 -fno-strict-aliasing -fwrapv -fPIC -pthread -I/usr/include/rocksdb -I /home/opam/.opam/5.3/lib/ocaml -o /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.exe /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.c -lm -lpthread -lrocksdb 1130 - # -&amp;gt; process exited with code 1 1131 - # -&amp;gt; stdout: 1132 - # -&amp;gt; stderr: 1368 + # /usr/bin/ld: final link failed: bad value</pre> 1369 + <h3 id="ahrocksdb.0.2.2"><a href="#ahrocksdb.0.2.2" class="anchor"></a>ahrocksdb.0.2.2</h3> 1370 + <p>This one fails in OHC too, but it looks like it's a build failure with more recent gccs, fixed upstream: https://github.com/ahrefs/ocaml-ahrocksdb/commit/e52316b3d30fededac023141bf8b47da79cabfed</p> 1371 + <pre># run: gcc -O2 -fno-strict-aliasing -fwrapv -fPIC -pthread -I/usr/include/rocksdb -I /home/opam/.opam/5.3/lib/ocaml -o /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.exe /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.c -lm -lpthread -lrocksdb 1372 + # -&gt; process 
exited with code 1 1373 + # -&gt; stdout: 1374 + # -&gt; stderr: 1133 1375 # | In file included from /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.c:4: 1134 1376 # | /usr/include/rocksdb/version.h:7:10: fatal error: string: No such file or directory 1135 - # | 7 | #include &amp;lt;string&amp;gt; 1377 + # | 7 | #include &lt;string&gt; 1136 1378 # | | ^~~~~~~~ 1137 1379 # | compilation terminated. 1138 - # Error: discover error&lt;/pre&gt; 1139 - &lt;h3 id=&quot;alt-ergo.2.2.0&quot;&gt;&lt;a href=&quot;#alt-ergo.2.2.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;alt-ergo.2.2.0&lt;/h3&gt; 1140 - &lt;p&gt;Looks like it's trying to write outside the sandbox. The failure only occurs on alt-ergo 1.3.0 - 2.2.0.&lt;/p&gt; 1141 - &lt;pre&gt;# mkdir -p /home/opam/.opam/4.14/man/man1 1380 + # Error: discover error</pre> 1381 + <h3 id="alt-ergo.2.2.0"><a href="#alt-ergo.2.2.0" class="anchor"></a>alt-ergo.2.2.0</h3> 1382 + <p>Looks like it's trying to write outside the sandbox. The failure only occurs on alt-ergo 1.3.0 - 2.2.0.</p> 1383 + <pre># mkdir -p /home/opam/.opam/4.14/man/man1 1142 1384 # cp -f doc/alt-ergo.1 /home/opam/.opam/4.14/man/man1 1143 1385 # mkdir -p /usr/local/lib/alt-ergo/preludes 1144 1386 # mkdir: cannot create directory '/usr/local/lib/alt-ergo': Permission denied 1145 - # make: *** [Makefile.users:243: install-preludes] Error 1&lt;/pre&gt; 1146 - &lt;h3 id=&quot;ctypes-foreign.0.18.0&quot;&gt;&lt;a href=&quot;#ctypes-foreign.0.18.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;ctypes-foreign.0.18.0&lt;/h3&gt; 1147 - &lt;p&gt;This one is a much more interesting failure. 
The logs show:&lt;/p&gt; 1148 - &lt;pre&gt;[ERROR] No solution for ctypes-foreign.0.18.0: * Missing dependency: 1149 - - ctypes-foreign -&amp;gt; ctypes 1150 - unknown package&lt;/pre&gt; 1151 - &lt;p&gt;which is happening because of the optimisation I &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/ocaml-docs-ci-and-odoc-3&quot;&gt;mentioned before&lt;/span&gt; where we build a new &lt;code&gt;opam-repository&lt;/code&gt; with only the packages we're going to need. In this case, we've somehow missed out the &lt;code&gt;ctypes&lt;/code&gt; package. Looking at the opam file for &lt;code&gt;ctypes-foreign&lt;/code&gt;, it has a &lt;code&gt;post&lt;/code&gt; dependency on &lt;code&gt;ctypes&lt;/code&gt;. The &lt;code&gt;post&lt;/code&gt; keyword indicates that &lt;code&gt;ctypes&lt;/code&gt; should be installed with &lt;code&gt;ctypes-foreign&lt;/code&gt;, but that having it as a &amp;quot;normal&amp;quot; dependency would introduce a dependency cycle. Since we require a DAG of dependencies, we explicitly remove any &lt;code&gt;post&lt;/code&gt; dependencies from the set of packages to build, but it seems that &lt;code&gt;opam&lt;/code&gt; would like to know about it anyway!&lt;/p&gt; 1152 - &lt;h3 id=&quot;others&quot;&gt;&lt;a href=&quot;#others&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;others&lt;/h3&gt; 1153 - &lt;p&gt;There are many more. An automatic cross-check with OHC would be really useful, mainly to distinguish between the packages that are broken due to &lt;code&gt;ocaml-docs-ci&lt;/code&gt; issues (like &lt;code&gt;ctypes-foreign&lt;/code&gt;) and those that are broken for other reasons (like &lt;code&gt;ahrocksdb&lt;/code&gt;).&lt;/p&gt; 1154 - &lt;h2 id=&quot;step-3:-building-docs&quot;&gt;&lt;a href=&quot;#step-3:-building-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Step 3: building docs&lt;/h2&gt; 1155 - &lt;p&gt;Finally, we have the actual docs build. 
This is where we run &lt;code&gt;odoc&lt;/code&gt; and &lt;code&gt;odoc_driver&lt;/code&gt; to produce the HTML docs. All the errors here are ones that we should be able to fix!&lt;/p&gt; 1156 - &lt;p&gt;Firstly, there are the internal errors:&lt;/p&gt; 1157 - &lt;pre&gt;Uncaught exception: Failure(&amp;quot;\&amp;quot;rm\&amp;quot; \&amp;quot;-rf\&amp;quot; \&amp;quot;/var/cache/obuilder/merged/582e973685d380d4c91eadc2611eee02c82c5fe4f8bd732e0080fb22bc4404cd\&amp;quot; \&amp;quot;/var/cache/obuilder/work/582e973685d380d4c91eadc2611eee02c82c5fe4f8bd732e0080fb22bc4404cd\&amp;quot; failed with exit status 1&amp;quot;) 1158 - 2025-05-22 09:30.18: Job failed: Failed: Internal error&lt;/pre&gt; 1159 - &lt;p&gt;These are some &lt;code&gt;obuilder&lt;/code&gt; error that needs fixing. Currently we're just rerunning the job to fix these.&lt;/p&gt; 1160 - &lt;h3 id=&quot;odoc.2.0.0&quot;&gt;&lt;a href=&quot;#odoc.2.0.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;odoc.2.0.0&lt;/h3&gt; 1161 - &lt;p&gt;Oops, we can't build our own docs! At least it's an old version :-)&lt;/p&gt; 1162 - &lt;pre&gt;odoc: internal error, uncaught exception: 1163 - File &amp;quot;src/html/link.ml&amp;quot;, line 101, characters 16-22: Assertion failed 1164 - Raised at Odoc_html__Link.href in file &amp;quot;src/html/link.ml&amp;quot;, line 101, characters 16-57 1165 - Called from Odoc_html__Generator.internallink in file &amp;quot;src/html/generator.ml&amp;quot;, line 108, characters 19-49 1166 - ...&lt;/pre&gt; 1167 - &lt;p&gt;The failure points &lt;a href=&quot;https://github.com/ocaml/odoc/blob/42190737339d9be4510eeeb0e3c47e84badf4d73/src/html/link.ml#L101&quot;&gt;here&lt;/a&gt;, an assertion about the common ancestor of two paths. 
&lt;a href=&quot;https://github.com/ocaml/odoc/issues/1345&quot;&gt;Issue filed&lt;/a&gt;.&lt;/p&gt; 1168 - &lt;h3 id=&quot;ocaml-base-compiler.4.07.0&quot;&gt;&lt;a href=&quot;#ocaml-base-compiler.4.07.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;ocaml-base-compiler.4.07.0&lt;/h3&gt; 1169 - &lt;p&gt;This one happens because of our &amp;quot;optimisation&amp;quot; to use a base image with OCaml pre-installed. What we &lt;i&gt;actually&lt;/i&gt; do is find the major/minor version of OCaml and use the corresponding docker image - so in this case we'll use ocaml/opam:debian-12-ocaml-4.07. Now this image actually contains OCaml 4.07.1, and the format of &lt;code&gt;cmt&lt;/code&gt; and &lt;code&gt;cmti&lt;/code&gt; files changed between these releases, so we get a failure.&lt;/p&gt; 1170 - &lt;p&gt;We'll fix this by getting rid of the optimisation and building from an empty switch.&lt;/p&gt; 1171 - &lt;h3 id=&quot;lascar.0.7.0&quot;&gt;&lt;a href=&quot;#lascar.0.7.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;lascar.0.7.0&lt;/h3&gt; 1172 - &lt;p&gt;This one is quite interesting. It's another assertion failure in odoc:&lt;/p&gt; 1173 - &lt;pre&gt;odoc: internal error, uncaught exception: 1174 - File &amp;quot;src/xref2/cpath.ml&amp;quot;, line 364, characters 37-43: Assertion failed 1175 - Raised at Odoc_xref2__Cpath.unresolve_resolved_parent_path in file &amp;quot;src/xref2/cpath.ml&amp;quot;, line 364, characters 37-49 1176 - Called from Odoc_xref2__Cpath.unresolve_module_path in file &amp;quot;src/xref2/cpath.ml&amp;quot;, line 349, characters 28-60 1177 - Called from Odoc_xref2__Tools.fragmap.map_module_decl in file &amp;quot;src/xref2/tools.ml&amp;quot;, line 1792, characters 48-80&lt;/pre&gt; 1178 - &lt;p&gt;It's happening when we 'unresolve' a previously resolved path. We end up having to do this when something about the path has changed, in this case while we're handling a &lt;code&gt;S with module Foo = Bar&lt;/code&gt; or similar. 
Issue &lt;a href=&quot;https://github.com/ocaml/odoc/issues/1346&quot;&gt;filed&lt;/a&gt;.&lt;/p&gt; 1179 - &lt;h3 id=&quot;camlp5&quot;&gt;&lt;a href=&quot;#camlp5&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;camlp5&lt;/h3&gt; 1180 - &lt;p&gt;This one actually occurs in &lt;code&gt;odoc_driver&lt;/code&gt; rather than in &lt;code&gt;odoc&lt;/code&gt; itself.&lt;/p&gt; 1181 - &lt;pre&gt;odoc_driver_voodoo: [DEBUG] Found cmi_only_lib in dir: /home/opam/.opam/4.08/lib/camlp5 1387 + # make: *** [Makefile.users:243: install-preludes] Error 1</pre> 1388 + <h3 id="ctypes-foreign.0.18.0"><a href="#ctypes-foreign.0.18.0" class="anchor"></a>ctypes-foreign.0.18.0</h3> 1389 + <p>This one is a much more interesting failure. The logs show:</p> 1390 + <pre>[ERROR] No solution for ctypes-foreign.0.18.0: * Missing dependency: 1391 + - ctypes-foreign -&gt; ctypes 1392 + unknown package</pre> 1393 + <p>which is happening because of the optimisation I <a href="../04/ocaml-docs-ci-and-odoc-3.html" title="ocaml-docs-ci-and-odoc-3">mentioned before</a> where we build a new <code>opam-repository</code> with only the packages we're going to need. In this case, we've somehow missed out the <code>ctypes</code> package. Looking at the opam file for <code>ctypes-foreign</code>, it has a <code>post</code> dependency on <code>ctypes</code>. The <code>post</code> keyword indicates that <code>ctypes</code> should be installed with <code>ctypes-foreign</code>, but that having it as a &quot;normal&quot; dependency would introduce a dependency cycle. Since we require a DAG of dependencies, we explicitly remove any <code>post</code> dependencies from the set of packages to build, but it seems that <code>opam</code> would like to know about it anyway!</p> 1394 + <h3 id="others"><a href="#others" class="anchor"></a>others</h3> 1395 + <p>There are many more. 
An automatic cross-check with OHC would be really useful, mainly to distinguish between the packages that are broken due to <code>ocaml-docs-ci</code> issues (like <code>ctypes-foreign</code>) and those that are broken for other reasons (like <code>ahrocksdb</code>).</p> 1396 + <h2 id="step-3:-building-docs"><a href="#step-3:-building-docs" class="anchor"></a>Step 3: building docs</h2> 1397 + <p>Finally, we have the actual docs build. This is where we run <code>odoc</code> and <code>odoc_driver</code> to produce the HTML docs. All the errors here are ones that we should be able to fix!</p> 1398 + <p>Firstly, there are the internal errors:</p> 1399 + <pre>Uncaught exception: Failure(&quot;\&quot;rm\&quot; \&quot;-rf\&quot; \&quot;/var/cache/obuilder/merged/582e973685d380d4c91eadc2611eee02c82c5fe4f8bd732e0080fb22bc4404cd\&quot; \&quot;/var/cache/obuilder/work/582e973685d380d4c91eadc2611eee02c82c5fe4f8bd732e0080fb22bc4404cd\&quot; failed with exit status 1&quot;) 1400 + 2025-05-22 09:30.18: Job failed: Failed: Internal error</pre> 1401 + <p>These come from an <code>obuilder</code> error that needs fixing. Currently we're just rerunning the job to work around them.</p> 1402 + <h3 id="odoc.2.0.0"><a href="#odoc.2.0.0" class="anchor"></a>odoc.2.0.0</h3> 1403 + <p>Oops, we can't build our own docs! At least it's an old version :-)</p> 1404 + <pre>odoc: internal error, uncaught exception: 1405 + File &quot;src/html/link.ml&quot;, line 101, characters 16-22: Assertion failed 1406 + Raised at Odoc_html__Link.href in file &quot;src/html/link.ml&quot;, line 101, characters 16-57 1407 + Called from Odoc_html__Generator.internallink in file &quot;src/html/generator.ml&quot;, line 108, characters 19-49 1408 + ...</pre> 1409 + <p>The failure points <a href="https://github.com/ocaml/odoc/blob/42190737339d9be4510eeeb0e3c47e84badf4d73/src/html/link.ml#L101">here</a>, an assertion about the common ancestor of two paths. 
<a href="https://github.com/ocaml/odoc/issues/1345">Issue filed</a>.</p> 1410 + <h3 id="ocaml-base-compiler.4.07.0"><a href="#ocaml-base-compiler.4.07.0" class="anchor"></a>ocaml-base-compiler.4.07.0</h3> 1411 + <p>This one happens because of our &quot;optimisation&quot; to use a base image with OCaml pre-installed. What we <i>actually</i> do is find the major/minor version of OCaml and use the corresponding docker image - so in this case we'll use ocaml/opam:debian-12-ocaml-4.07. Now this image actually contains OCaml 4.07.1, and the format of <code>cmt</code> and <code>cmti</code> files changed between these releases, so we get a failure.</p> 1412 + <p>We'll fix this by getting rid of the optimisation and building from an empty switch.</p> 1413 + <h3 id="lascar.0.7.0"><a href="#lascar.0.7.0" class="anchor"></a>lascar.0.7.0</h3> 1414 + <p>This one is quite interesting. It's another assertion failure in odoc:</p> 1415 + <pre>odoc: internal error, uncaught exception: 1416 + File &quot;src/xref2/cpath.ml&quot;, line 364, characters 37-43: Assertion failed 1417 + Raised at Odoc_xref2__Cpath.unresolve_resolved_parent_path in file &quot;src/xref2/cpath.ml&quot;, line 364, characters 37-49 1418 + Called from Odoc_xref2__Cpath.unresolve_module_path in file &quot;src/xref2/cpath.ml&quot;, line 349, characters 28-60 1419 + Called from Odoc_xref2__Tools.fragmap.map_module_decl in file &quot;src/xref2/tools.ml&quot;, line 1792, characters 48-80</pre> 1420 + <p>It's happening when we 'unresolve' a previously resolved path. We end up having to do this when something about the path has changed, in this case while we're handling a <code>S with module Foo = Bar</code> or similar. 
Issue <a href="https://github.com/ocaml/odoc/issues/1346">filed</a>.</p> 1421 + <h3 id="camlp5"><a href="#camlp5" class="anchor"></a>camlp5</h3> 1422 + <p>This one actually occurs in <code>odoc_driver</code> rather than in <code>odoc</code> itself.</p> 1423 + <pre>odoc_driver_voodoo: [DEBUG] Found cmi_only_lib in dir: /home/opam/.opam/4.08/lib/camlp5 1182 1424 odoc_driver_voodoo: internal error, uncaught exception: 1183 - Invalid_argument(&amp;quot;\&amp;quot;/home/opam/.opam/4.08/lib/camlp5\&amp;quot;: invalid segment&amp;quot;) 1184 - &lt;/pre&gt; 1185 - &lt;p&gt;Here we're trying to add a segment to a path, but rather than a single path segment we've got an entire fully qualified path. Issue &lt;a href=&quot;https://github.com/ocaml/odoc/issues/1347&quot;&gt;filed&lt;/a&gt;.&lt;/p&gt; 1186 - &lt;h2 id=&quot;conclusion&quot;&gt;&lt;a href=&quot;#conclusion&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Conclusion&lt;/h2&gt; 1187 - &lt;p&gt;It's pretty good that we've only got 4 types of error happening at the doc-generation phase. However, as a whole, any error that occurs earlier in the pipeline ends up with a missing documentation tab on the website, and we need to do a bit more so that the actual problem can be tracked down and fixed. This is obviously a more general problem than just the docs, and one that &lt;a href=&quot;https://check.ci.ocaml.org&quot;&gt;opam health check&lt;/a&gt; seeks to highlight. However, the current incarnation of OHC is significantly less efficient than docs-ci, so generalising the approach we've taken with &lt;a href=&quot;https://github.com/jonludlam/opamh&quot;&gt;opamh&lt;/a&gt; should really help with making this more responsive.&lt;/p&gt; 1188 - &lt;p&gt;In addition, a number of the issues seen here could be addressed with a tool my colleague &lt;a href=&quot;https://ryan.freumh.org/&quot;&gt;Ryan&lt;/a&gt; is working on: &lt;a href=&quot;https://ryan.freumh.org/enki.html&quot;&gt;Enki&lt;/a&gt;. 
This tool would allow us to run a solve that actually determines not only the set of packages we wish to install, but the platform to install onto - e.g. for &lt;code&gt;eio_windows&lt;/code&gt; the solution would be to install on Windows, and for &lt;code&gt;llvm.18-static&lt;/code&gt; the solution might be Fedora 40.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/docs-progress.html</id><title type="text">Progress in OCaml docs</title><updated>2025-05-29T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I've been working on a whole lot of thing recently in many different areas, making what's felt like only a bit of progress in each. Consequently I've not felt like I had anything substantial to say, s...</summary><published>2025-05-20T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/lots-of-things.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;lots-of-things-have-been-happening&quot;&gt;&lt;a href=&quot;#lots-of-things-have-been-happening&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Lots of things have been happening&lt;/h1&gt; 1189 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-20&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1190 - &lt;p&gt;I've been working on a whole lot of thing recently in many different areas, making what's felt like only a bit of progress in each. 
Consequently I've not felt like I had anything substantial to say, so I haven't written up anything for a while.&lt;/p&gt; 1191 - &lt;p&gt;Time for a little summary of things then!&lt;/p&gt; 1192 - &lt;h2 id=&quot;ocaml-docs-ci&quot;&gt;&lt;a href=&quot;#ocaml-docs-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Ocaml-docs-ci&lt;/h2&gt; 1193 - &lt;p&gt;I've been working with &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark Elvers&lt;/a&gt; on getting the docs CI running using Odoc 3.0. There are quite a few changes involved, both in how we're &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/ocaml-docs-ci-and-odoc-3&quot;&gt;building the packages&lt;/span&gt; and in how we're running odoc - it's building using &lt;code&gt;odoc_driver&lt;/code&gt; rather than &lt;code&gt;voodoo&lt;/code&gt; now, and while it's looking promising now, we hit a few hurdles along the way.&lt;/p&gt; 1194 - &lt;p&gt;We set the CI going last weekend but discovered that it was having some issues building packages using OCaml 5.3.0. The way the builds work is that we first do a &amp;quot;solve&amp;quot; step for each version of every package so we've got exact versions of all of the packages required to build them. We then look through that solution to figure out the version of OCaml required, and then (to avoid a little bit of work) we start from one of the &lt;a href=&quot;https://hub.docker.com/r/ocaml/opam&quot;&gt;opam docker images&lt;/a&gt; for that version of OCaml.&lt;/p&gt; 1195 - &lt;p&gt;When installing a package using opam it does a few operations that scale with the size of the opam repository, which ends up adding around tens of seconds to the build time. When we're building 50,000 packages, this adds up to quite a lot of time, so we short-cut this process with the simple expedient of creating an opam-repository that only contains the packages we need for the build. 
However, since we've already got a few packages installed in the image, we need to make sure our repository contains these packages too, otherwise opam gets thoroughly confused. My mistake was that we were missing out the `ocaml-compiler` package, which is new in OCaml 5.3.0, which led to the builds failing. Adding this in and kicking off the build again it's now got a lot further - at time of writing it has built 14,000 packages, there are 6,000 still building, and 1000 that have failed. If it continues in a similar fashion, this will compare quite favourably with the docs CI that's currently powering ocaml.org, where it has successfully built 17,000 packages, and 4,500 have failed.&lt;/p&gt; 1196 - &lt;p&gt;Mark has been working on a different approach to the build process, which is to come up with a new binary that doesn't do any of the &lt;code&gt;O(n)&lt;/code&gt; operations and just builds the package! This is definitely a promising direction, and I'm hoping to take a look at &lt;a href=&quot;https://github.com/mtelvers/ohc&quot;&gt;his prototype&lt;/a&gt; soon.&lt;/p&gt; 1197 - &lt;p&gt;Meanwhile, &lt;a href=&quot;https://choum.net&quot;&gt;panglesd&lt;/a&gt; is working on integrating this into the ocaml.org site, and is making good progress. He spotted last week that we were overwriting the `status.json` file that comes out of `odoc_driver` which we will use to power the redirections on ocaml.org. One of the changes of odoc 3.0 is that we carefully put modules into a directory structure that represents the library in which they are found. It's long been a pain that OCaml libraries (what Ocamlfind unhelpfully calls 'packages') are not always the same name as the opam package in which they're found. For example, the package &lt;code&gt;ocamlfind&lt;/code&gt; contains the library &lt;code&gt;findlib&lt;/code&gt;. So to help the user figure out where to find the module, we're putting it into the URL of the docs, and therefore into the breadcrumbs. 
The downside is that the modules are now in a different place on the website to where they were before, so the &lt;code&gt;status.json&lt;/code&gt; file is there to help with the redirections.&lt;/p&gt; 1198 - &lt;h2 id=&quot;notebooks&quot;&gt;&lt;a href=&quot;#notebooks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Notebooks&lt;/h2&gt; 1199 - &lt;p&gt;I've been working on Merlin integration with the notebooks, which has been a fun little project. The bits that needed improving most were that merlin didn't work with toplevel-style code, and that each cell was a separate typing context, so while you could define a function in one cell and execute it in another, Merlin would tell you the function was undefined.&lt;/p&gt; 1200 - &lt;p&gt;For the toplevel-style code, what I've ended up doing is to essentially strip out all of the toplevel bits and pieces, and replace them with whitespace. So where I have a cell that looks like:&lt;/p&gt; 1201 - &lt;pre&gt;# let x = 1 + 2;; 1425 + Invalid_argument(&quot;\&quot;/home/opam/.opam/4.08/lib/camlp5\&quot;: invalid segment&quot;) 1426 + </pre> 1427 + <p>Here we're trying to add a segment to a path, but rather than a single path segment we've got an entire fully qualified path. Issue <a href="https://github.com/ocaml/odoc/issues/1347">filed</a>.</p> 1428 + <h2 id="conclusion"><a href="#conclusion" class="anchor"></a>Conclusion</h2> 1429 + <p>It's pretty good that we've only got 4 types of error happening at the doc-generation phase. However, as a whole, any error that occurs earlier in the pipeline ends up with a missing documentation tab on the website, and we need to do a bit more so that the actual problem can be tracked down and fixed. This is obviously a more general problem than just the docs, and one that <a href="https://check.ci.ocaml.org">opam health check</a> seeks to highlight. 
However, the current incarnation of OHC is significantly less efficient than docs-ci, so generalising the approach we've taken with <a href="https://github.com/jonludlam/opamh">opamh</a> should really help with making this more responsive.</p> 1430 + <p>In addition, a number of the issues seen here could be addressed with a tool my colleague <a href="https://ryan.freumh.org/">Ryan</a> is working on: <a href="https://ryan.freumh.org/enki.html">Enki</a>. This tool would allow us to run a solve that actually determines not only the set of packages we wish to install, but the platform to install onto - e.g. for <code>eio_windows</code> the solution would be to install on Windows, and for <code>llvm.18-static</code> the solution might be Fedora 40.</p>]]></content> 1431 + </entry> 1432 + <entry> 1433 + <id>https://jon.recoil.org/blog/2025/05/lots-of-things.html</id> 1434 + <title>Lots of things have been happening</title> 1435 + <published>2025-05-20T00:00:00Z</published> 1436 + <updated>2025-05-20T00:00:00Z</updated> 1437 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/lots-of-things.html"/> 1438 + <summary>I've been working on a whole lot of thing recently in many different areas, making what's felt like only a bit of progress in each. Consequently I've not felt like I had anything substantial to say, s...</summary> 1439 + <content type="html"><![CDATA[<h1 id="lots-of-things-have-been-happening"><a href="#lots-of-things-have-been-happening" class="anchor"></a>Lots of things have been happening</h1> 1440 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-20</p></li></ul> 1441 + <p>I've been working on a whole lot of thing recently in many different areas, making what's felt like only a bit of progress in each. 
Consequently I've not felt like I had anything substantial to say, so I haven't written up anything for a while.</p> 1442 + <p>Time for a little summary of things then!</p> 1443 + <h2 id="ocaml-docs-ci"><a href="#ocaml-docs-ci" class="anchor"></a>Ocaml-docs-ci</h2> 1444 + <p>I've been working with <a href="https://tunbury.org/">Mark Elvers</a> on getting the docs CI running using Odoc 3.0. There are quite a few changes involved, both in how we're <a href="../04/ocaml-docs-ci-and-odoc-3.html" title="ocaml-docs-ci-and-odoc-3">building the packages</a> and in how we're running odoc - it's building using <code>odoc_driver</code> rather than <code>voodoo</code> now, and while it's looking promising now, we hit a few hurdles along the way.</p> 1445 + <p>We set the CI going last weekend but discovered that it was having some issues building packages using OCaml 5.3.0. The way the builds work is that we first do a &quot;solve&quot; step for each version of every package so we've got exact versions of all of the packages required to build them. We then look through that solution to figure out the version of OCaml required, and then (to avoid a little bit of work) we start from one of the <a href="https://hub.docker.com/r/ocaml/opam">opam docker images</a> for that version of OCaml.</p> 1446 + <p>When installing a package using opam it does a few operations that scale with the size of the opam repository, which ends up adding around tens of seconds to the build time. When we're building 50,000 packages, this adds up to quite a lot of time, so we short-cut this process with the simple expedient of creating an opam-repository that only contains the packages we need for the build. However, since we've already got a few packages installed in the image, we need to make sure our repository contains these packages too, otherwise opam gets thoroughly confused. 
My mistake was that we were missing out the `ocaml-compiler` package, which is new in OCaml 5.3.0, which led to the builds failing. Adding this in and kicking off the build again it's now got a lot further - at time of writing it has built 14,000 packages, there are 6,000 still building, and 1000 that have failed. If it continues in a similar fashion, this will compare quite favourably with the docs CI that's currently powering ocaml.org, where it has successfully built 17,000 packages, and 4,500 have failed.</p> 1447 + <p>Mark has been working on a different approach to the build process, which is to come up with a new binary that doesn't do any of the <code>O(n)</code> operations and just builds the package! This is definitely a promising direction, and I'm hoping to take a look at <a href="https://github.com/mtelvers/ohc">his prototype</a> soon.</p> 1448 + <p>Meanwhile, <a href="https://choum.net">panglesd</a> is working on integrating this into the ocaml.org site, and is making good progress. He spotted last week that we were overwriting the `status.json` file that comes out of `odoc_driver` which we will use to power the redirections on ocaml.org. One of the changes of odoc 3.0 is that we carefully put modules into a directory structure that represents the library in which they are found. It's long been a pain that OCaml libraries (what Ocamlfind unhelpfully calls 'packages') are not always the same name as the opam package in which they're found. For example, the package <code>ocamlfind</code> contains the library <code>findlib</code>. So to help the user figure out where to find the module, we're putting it into the URL of the docs, and therefore into the breadcrumbs. 
The downside is that the modules are now in a different place on the website to where they were before, so the <code>status.json</code> file is there to help with the redirections.</p> 1449 + <h2 id="notebooks"><a href="#notebooks" class="anchor"></a>Notebooks</h2> 1450 + <p>I've been working on Merlin integration with the notebooks, which has been a fun little project. The bits that needed improving most were that merlin didn't work with toplevel-style code, and that each cell was a separate typing context, so while you could define a function in one cell and execute it in another, Merlin would tell you the function was undefined.</p> 1451 + <p>For the toplevel-style code, what I've ended up doing is to essentially strip out all of the toplevel bits and pieces, and replace them with whitespace. So where I have a cell that looks like:</p> 1452 + <pre># let x = 1 + 2;; 1202 1453 - val x : int = 3 1203 1454 # let y = x + 1;; 1204 - - val y : int = 4&lt;/pre&gt; 1205 - &lt;p&gt;I tell Merlin that the contents are:&lt;/p&gt; 1206 - &lt;pre&gt; let x = 1 + 2;; 1455 + - val y : int = 4</pre> 1456 + <p>I tell Merlin that the contents are:</p> 1457 + <pre> let x = 1 + 2;; 1207 1458 1208 1459 let y = x + 1;; 1209 - &lt;/pre&gt; 1210 - &lt;p&gt;where I'm careful to maintain the position of the original code. This bit is working quite nicely, but only when the code is syntactically correct, as I'm using the standard toplevel parser to figure out where the expression ends. I think I'm going to end up needing to write a custom parser for this, so something that will end on a &lt;code&gt;;;&lt;/code&gt; but ignore them in string constants, comments and so on.&lt;/p&gt; 1211 - &lt;p&gt;The approach I've taken for the second problem is to treat each cell as a separate module. 
I then write out a &lt;code&gt;cmi&lt;/code&gt; file into the virtual filesystem as &lt;code&gt;cell__id_0.cmi&lt;/code&gt; and &lt;code&gt;open&lt;/code&gt; all the previous modules in 'line 0' of every cell. I then remap all of the reported locations by removing 'line 0'.&lt;/p&gt; 1212 - &lt;p&gt;There are a number of issues with the current approaches: 1. The stripping of the toplevel bits is a little fragile, and currently only works when the toplevel is syntactically correct. This is fairly fixable. 2. When the contents of the cells change we need to flush any caches merlin and the compiler have. 3. An &lt;code&gt;open&lt;/code&gt; statement in one cell does _not_ cause the module to be available in the next cell. 4. A lot of cells leads to a lot of opens!&lt;/p&gt; 1213 - &lt;p&gt;I suspect that the 'cells as modules' approach might end up being a bit of a dead-end, so I'll have a chat with &lt;a href=&quot;https://github.com/voodoos&quot;&gt;Ulysse&lt;/a&gt; to figure out the next experiment.&lt;/p&gt; 1214 - &lt;h2 id=&quot;oxcaml&quot;&gt;&lt;a href=&quot;#oxcaml&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Oxcaml&lt;/h2&gt; 1215 - &lt;p&gt;I've been working on trying out oxcaml too, which has been a bit challenging. Firstly, although Jane Street provide a version of &lt;code&gt;js_of_ocaml&lt;/code&gt;, the toplevel didn't work. Fortunately, my amazing colleagues &lt;a href=&quot;https://patrick.sirref.org/&quot;&gt;Patrick O'Ferris&lt;/a&gt; and &lt;a href=&quot;https://github.com/art-w&quot;&gt;Arthur Wendling&lt;/a&gt; spent a good chunk of time fixing this and provided an &lt;a href=&quot;https://github.com/patricoferris/opam-repository-js#with-extensions&quot;&gt;opam repository&lt;/a&gt; with the relevant changes, without which I would not have been able to make any progress. Thanks, guys! 
So my goal of making my notebooks work with it looked doable, but I almost immediately hit more dependency issues that make it problematic to port the whole site over - including odoc and various PPXes that I use.&lt;/p&gt; 1216 - &lt;p&gt;I've therefore decided that I would bring forward a feature that I'd had in mind for a while - that we could have different &amp;quot;backends&amp;quot; for the notebooks. So I'd still build the frontend using &amp;quot;normal&amp;quot; OCaml, but the web-worker serving as the toplevel would be an oxcaml one.&lt;/p&gt; 1217 - &lt;p&gt;Of course, it didn't work first time! After a bit of head-scratching, it turned out that the interface between the worker and the main thread, although I'd &lt;i&gt;almost&lt;/i&gt; got it ocaml-agnostic, wasn't quite right. The way it works is that it uses the jsonrpc protocol to communicate, and while it had marshalled the requests into a string, it hadn't turned that final OCaml string into a Javascript string, so it was sending the js_of_ocaml representation of a string as an object, rather than a simple string. When the frontend and workers were built with different compilers, this ended up just failing with an obscure error, which took a good deal of time to track down. Once that was fixed, it was just a case of making sure I could have 2 independent 'switches' on my site - one for oxcaml and one for standard OCaml.&lt;/p&gt; 1218 - &lt;p&gt;The upshot of all this is that I now have a semi-working version of the notebooks using oxcaml. 
As an initial demonstration I ported one of my colleague &lt;a href=&quot;https://github.com/cuihtlauac&quot;&gt;Cuihtlauac&lt;/a&gt;'s oxcaml tutorial docs to the notebook format, and it &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/notebooks/oxcaml/local&quot;&gt;works quite nicely&lt;/span&gt;.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/lots-of-things.html</id><title type="text">Lots of things have been happening</title><updated>2025-05-20T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">My colleague and I have been working on a little project to see how well small AI models can solve the OCaml exercises we give to our first-year students at the University of Cambridge. Sadiq has don...</summary><published>2025-05-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/ticks-solved-by-ai.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;solving-first-year-ocaml-exercises-with-ai&quot;&gt;&lt;a href=&quot;#solving-first-year-ocaml-exercises-with-ai&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Solving First-year OCaml exercises with AI&lt;/h1&gt; 1219 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1220 - &lt;p&gt;My colleague &lt;a href=&quot;https://toao.com&quot;&gt;Sadiq Jaffer&lt;/a&gt; and I have been working on a little project to see how well small AI models can solve the OCaml exercises we give to our first-year students at the University of Cambridge. Sadiq has done an excellent &lt;a href=&quot;https://toao.com/blog/ocaml-local-code-models&quot;&gt;write up&lt;/a&gt; of our initial results, which you should all go and read! 
The tl;dr though, as Sadiq writes, is that even some of the smaller models would score top marks on these exercises!&lt;/p&gt; 1221 - &lt;p&gt;One interesting aspect we discovered quite quickly is that we had to make the testing feedback a little more generous than just &amp;quot;exception raised&amp;quot;! The problems are presented as a Jupyter notebook using &lt;a href=&quot;https://github.com/akabe&quot;&gt;akabe's&lt;/a&gt; excellent OCaml kernel, with &lt;a href=&quot;https://nbgrader.readthedocs.io/en/stable/&quot;&gt;nbgrader&lt;/a&gt; to do the assessment. Our students can see the tests that are run, and if they fail they're able to copy the test cell out and play with their code to figure out exactly what went wrong. The AI models, however, have a far less interactive experience, and get just 3 chances to write code that passes the tests. We found that the performance of the models increased hugely when we adjusted the test cells such that they clearly indicated which test failed, the results that were expected, and the results the code actually produced.&lt;/p&gt; 1222 - &lt;p&gt;Of course, we &lt;a href=&quot;https://anil.recoil.org/notes/claude-copilot-sandbox&quot;&gt;already knew&lt;/a&gt; that AI models can code OCaml very well, and we (along with the rest of the teaching world) are still ruminating on the implications of this from a pedagogical perspective. Our plan, though, is to try and make the 'problem' worse by training these models on more OCaml code, and see just how well we can get them to perform! 
It's pretty amazing, and a little startling to know that a model that'll run pretty comfortably on my laptop can solve these problems so well even without extra training, though given how hot it gets, I'd rather not have the laptop on my actual lap while it's doing so!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/ticks-solved-by-ai.html</id><title type="text">Solving First-year OCaml exercises with AI</title><updated>2025-05-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I joined the OxCaml weekly meeting representing Tarides for the first time this week, as Jane Street gear up to an official release of their OxCaml compiler.</summary><published>2025-05-02T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/oxcaml-gets-closer.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;oxcaml-is-getting-closer...&quot;&gt;&lt;a href=&quot;#oxcaml-is-getting-closer...&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OxCaml is getting closer...&lt;/h1&gt; 1223 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-02&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1224 - &lt;p&gt;I joined the OxCaml weekly meeting representing Tarides for the first time this week, as Jane Street gear up to an official release of their OxCaml compiler.&lt;/p&gt; 1225 - &lt;p&gt;It seems that mainly what needs to be done before the release can be made is to ensure there is some reasonable documentation for the new features, and that a reasonable number of packages are working, so people are furiously writing and bugfixing to try and get this ready.&lt;/p&gt; 1226 - &lt;p&gt;As well as this though, there are some challenges of a more organisational level that will need to be addressed to ensure the success of the project. 
Jane Street have long had a public branch of their compiler, but while they've had patches internally to ensure the tooling and other libraries work, these patches haven't previously been made public in a usable way. In order for OxCaml to be useful, it will clearly need these patches not only to be available, but also to be maintained and to easily allow contributions from the community -- in short, they need to be properly Open Source!&lt;/p&gt; 1227 - &lt;p&gt;Personally, I'm looking forward to seeing their branch of &lt;a href=&quot;https://ocaml.github.io/odoc/&quot;&gt;odoc&lt;/a&gt; and having a look to see how the modes will fit into the documentation. I'm also keen to see whether the &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/this-site&quot;&gt;notebook features&lt;/span&gt; I've been working on can be ported over to run on OxCaml!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/oxcaml-gets-closer.html</id><title type="text">OxCaml is getting closer...</title><updated>2025-05-02T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text"> Today was the &quot;AI for Climate &amp; Nature Community Day&quot; at the . 
A whole bunch of the EEG were either presenting or contributing in some way so I thought I'd come along to see what's going on.</summary><published>2025-05-01T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;ai-for-climate-&amp;amp;-nature-community-day&quot;&gt;&lt;a href=&quot;#ai-for-climate-&amp;amp;-nature-community-day&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;AI for Climate &amp;amp; Nature Community Day&lt;/h1&gt; 1228 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-01&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1229 - &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;Melissa Leach&lt;/span&gt;&lt;/div&gt; 1230 - &lt;p&gt;&lt;i&gt;Melissa Leach introducing the day&lt;/i&gt; Today was the &amp;quot;AI for Climate &amp;amp; Nature Community Day&amp;quot; at the &lt;a href=&quot;https://map.cam.ac.uk/?maplon=0.12032&amp;amp;maplat=52.20354&amp;amp;mapzoom=18&amp;amp;maplayers=Building+Labels%2CExternal+Sites%2CColleges%2CUniversity+Sites%2CBuildings%2CTransport&amp;amp;mapfeature=mfid257%2CBuildings&quot;&gt;David Attenborough Building&lt;/a&gt;. 
A whole bunch of the EEG were either presenting or contributing in some way so I thought I'd come along to see what's going on.&lt;/p&gt; 1231 - &lt;h2 id=&quot;keynote-and-main-talks&quot;&gt;&lt;a href=&quot;#keynote-and-main-talks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Keynote and main talks&lt;/h2&gt; 1232 - &lt;p&gt;Following the intro talks from Professors &lt;a href=&quot;https://www.cambridgeconservation.org/about/people/prof-melissa-leach/&quot;&gt;Melissa Leach&lt;/a&gt; and &lt;a href=&quot;https://www.zoo.cam.ac.uk/directory/bill-sutherland&quot;&gt;Bill Sutherland&lt;/a&gt;, the day started with the keynote talk from &lt;a href=&quot;https://www.biology.ox.ac.uk/people/amy-hinsley&quot;&gt;Amy Hinsley&lt;/a&gt;, who, using the specific case of animal trafficking, talked about the need to make AI in conservation equitable, explainable and useful.&lt;/p&gt; 1233 - &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;Amy Hinsley&lt;/span&gt;&lt;/div&gt; 1234 - &lt;p&gt;&lt;i&gt;Amy Hinsley delivering the keynote talk&lt;/i&gt;&lt;/p&gt; 1235 - &lt;p&gt;We then moved into the first session with &lt;a href=&quot;https://www.geog.cam.ac.uk/people/lines/&quot;&gt;Emily Lines&lt;/a&gt; from the Geography Department who talked about the challenges of processing sensor data in the context of forests. Her group has a variety of data collected from forests across Europe, collected using many different methods, from drones taking pictures of the canopies to ground-based laser scanners producing 3d point clouds. 
The challenge is then not only to identify individual trees, which is pretty tricky, but also to then distinguish between the leaves of the trees and the wood.&lt;/p&gt; 1236 - &lt;p&gt;After Emily came &lt;a href=&quot;https://ai.cam.ac.uk/people/robert-rouse.html&quot;&gt;Robert Rouse&lt;/a&gt; from the &lt;a href=&quot;https://iccs.cam.ac.uk&quot;&gt;ICCS&lt;/a&gt;, who's using a small neural net and genetic algorithms to extend a study from the RSPB on figuring out an optimal way to do some land use adjustments to cut carbon and improve outcomes for birds, whilst not significantly impacting the ability to produce food.&lt;/p&gt; 1237 - &lt;p&gt;We then had &lt;a href=&quot;https://www.zoo.cam.ac.uk/directory/dr-sam-reynolds&quot;&gt;Sam Reynolds&lt;/a&gt; and &lt;a href=&quot;https://toao.com&quot;&gt;Sadiq Jaffer&lt;/a&gt; who talked about their project; using AI to sift through millions of papers looking for those relevant to a specified conservation topic. They're able to directly compare their results with results obtained by manually doing this process, a project that's been going on over the last 20 or so years summing to something like 75 man years of effort. 
In the end they only missed a few papers that the manual process had found, but actually found many relevant papers that had been missed - and all in only a few days of compute.&lt;/p&gt; 1238 - &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;Sam Reynolds and Sadiq Jaffer&lt;/span&gt;&lt;/div&gt; 1239 - &lt;p&gt;&lt;i&gt;Sam Reynolds and Sadiq Jaffer sorting millions of papers&lt;/i&gt;&lt;/p&gt; 1240 - &lt;h2 id=&quot;lightning-talks&quot;&gt;&lt;a href=&quot;#lightning-talks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Lightning talks&lt;/h2&gt; 1241 - &lt;p&gt;We then had a number of 'lightning talks', with each presenter having only three minutes to talk about their work.&lt;/p&gt; 1242 - &lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://www.maths.cam.ac.uk/person/ss3299&quot;&gt;Sebastian Schemm&lt;/a&gt; presented his work on creating a foundational model for the climate&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.eng.cam.ac.uk/profiles/ac685&quot;&gt;Alice Cicirello&lt;/a&gt; talked about the prospects of applying machine learning to &lt;a href=&quot;https://en.wikipedia.org/wiki/Marine_cloud_brightening&quot;&gt;Marine Cloud Brightening&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.maths.cam.ac.uk/person/sdat2&quot;&gt;Simon Thomas&lt;/a&gt; has been looking at analysing the heights of tropical cyclone storm surges&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://github.com/niccolozanotti&quot;&gt;Niccolò Zanotti&lt;/a&gt; gave us an introduction to &lt;a href=&quot;https://github.com/cambridge-ICCS/FTorch&quot;&gt;FTorch&lt;/a&gt;, a library to integrate the worlds of PyTorch and Fortran&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.nceo.ac.uk/contact-us/people/dr-simon-driscoll/&quot;&gt;Simon Driscoll&lt;/a&gt; then talked about melt ponds on arctic sea ice, a poorly understood but important component of the climate in the Arctic.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.zoo.cam.ac.uk/directory/emilio-luz-ricca&quot;&gt;Emilio 
Luz-Ricca&lt;/a&gt; talked about his project to apply machine learning to predict hunting pressure&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://orlando-code.github.io&quot;&gt;Orlando Timmerman&lt;/a&gt; gave us some insights into how he's been using machine learning to predict the future of coral reefs, and how we might use this to help with their conservation.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.zoo.cam.ac.uk/directory/ruari-marshall-hawkes&quot;&gt;Ruari Marshall-Hawkes&lt;/a&gt; showed us how to listen very carefully to figure out population numbers,&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/harriet-branson-a93a8313b/&quot;&gt;Hattie Branson&lt;/a&gt; from &lt;a href=&quot;https://www.fauna-flora.org&quot;&gt;Fauna &amp;amp; Flora&lt;/a&gt; talked about habitat detection in South Sudan,&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/martakoch/&quot;&gt;Marta Koch&lt;/a&gt; showed us an analysis of how well ChatGPT, Claude and the like would perform at setting the agendas for SDPs,&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/zhengpeng-feng-2410a132a/&quot;&gt;Frank Feng&lt;/a&gt; finished the session with a talk on the &lt;a href=&quot;https://www.cst.cam.ac.uk/seminars/list/227335&quot;&gt;Barlow Twins Earth Foundation Model&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt; 1460 + </pre> 1461 + <p>where I'm careful to maintain the position of the original code. This bit is working quite nicely, but only when the code is syntactically correct, as I'm using the standard toplevel parser to figure out where the expression ends. I think I'm going to end up needing to write a custom parser for this, so something that will end on a <code>;;</code> but ignore them in string constants, comments and so on.</p> 1462 + <p>The approach I've taken for the second problem is to treat each cell as a separate module. 
I then write out a <code>cmi</code> file into the virtual filesystem as <code>cell__id_0.cmi</code> and <code>open</code> all the previous modules in 'line 0' of every cell. I then remap all of the reported locations by removing 'line 0'.</p> 1463 + <p>There are a number of issues with the current approaches: 1. The stripping of the toplevel bits is a little fragile, and currently only works when the toplevel is syntactically correct. This is fairly fixable. 2. When the contents of the cells change we need to flush any caches merlin and the compiler have. 3. An <code>open</code> statement in one cell does _not_ cause the module to be available in the next cell. 4. A lot of cells leads to a lot of opens!</p> 1464 + <p>I suspect that the 'cells as modules' approach might end up being a bit of a dead-end, so I'll have a chat with <a href="https://github.com/voodoos">Ulysse</a> to figure out the next experiment.</p> 1465 + <h2 id="oxcaml"><a href="#oxcaml" class="anchor"></a>Oxcaml</h2> 1466 + <p>I've been working on trying out oxcaml too, which has been a bit challenging. Firstly, although Jane Street provide a version of <code>js_of_ocaml</code>, the toplevel didn't work. Fortunately, my amazing colleagues <a href="https://patrick.sirref.org/">Patrick O'Ferris</a> and <a href="https://github.com/art-w">Arthur Wendling</a> spent a good chunk of time fixing this and provided an <a href="https://github.com/patricoferris/opam-repository-js#with-extensions">opam repository</a> with the relevant changes, without which I would not have been able to make any progress. Thanks, guys! So my goal of making my notebooks work with it looked doable, but I almost immediately hit more dependency issues that make it problematic to port the whole site over - including odoc and various PPXes that I use.</p> 1467 + <p>I've therefore decided that I would bring forward a feature that I'd had in mind for a while - that we could have different &quot;backends&quot; for the notebooks. 
So I'd still build the frontend using &quot;normal&quot; OCaml, but the web-worker serving as the toplevel would be an oxcaml one.</p> 1468 + <p>Of course, it didn't work first time! After a bit of head-scratching, it turned out that the interface between the worker and the main thread, although I'd <i>almost</i> got it ocaml-agnostic, wasn't quite right. The way it works is that it uses the jsonrpc protocol to communicate, and while it had marshalled the requests into a string, it hadn't turned that final OCaml string into a Javascript string, so it was sending the js_of_ocaml representation of a string as an object, rather than a simple string. When the frontend and workers were built with different compilers, this ended up just failing with an obscure error, which took a good deal of time to track down. Once that was fixed, it was just a case of making sure I could have 2 independent 'switches' on my site - one for oxcaml and one for standard OCaml.</p> 1469 + <p>The upshot of all this is that I now have a semi-working version of the notebooks using oxcaml. As an initial demonstration I ported one of my colleague <a href="https://github.com/cuihtlauac">Cuihtlauac</a>'s oxcaml tutorial docs to the notebook format, and it <a href="../../../notebooks/oxcaml/local.html" title="local">works quite nicely</a>.</p>]]></content> 1470 + </entry> 1471 + <entry> 1472 + <id>https://jon.recoil.org/blog/2025/05/ticks-solved-by-ai.html</id> 1473 + <title>Solving First-year OCaml exercises with AI</title> 1474 + <published>2025-05-07T00:00:00Z</published> 1475 + <updated>2025-05-07T00:00:00Z</updated> 1476 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/ticks-solved-by-ai.html"/> 1477 + <summary>My colleague and I have been working on a little project to see how well small AI models can solve the OCaml exercises we give to our first-year students at the University of Cambridge. 
Sadiq has don...</summary> 1478 + <content type="html"><![CDATA[<h1 id="solving-first-year-ocaml-exercises-with-ai"><a href="#solving-first-year-ocaml-exercises-with-ai" class="anchor"></a>Solving First-year OCaml exercises with AI</h1> 1479 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-07</p></li></ul> 1480 + <p>My colleague <a href="https://toao.com">Sadiq Jaffer</a> and I have been working on a little project to see how well small AI models can solve the OCaml exercises we give to our first-year students at the University of Cambridge. Sadiq has done an excellent <a href="https://toao.com/blog/ocaml-local-code-models">write up</a> of our initial results, which you should all go and read! The tl;dr though, as Sadiq writes, is that even some of the smaller models would score top marks on these exercises!</p> 1481 + <p>One interesting aspect we discovered quite quickly is that we had to make the testing feedback a little more generous than just &quot;exception raised&quot;! The problems are presented as a Jupyter notebook using <a href="https://github.com/akabe">akabe's</a> excellent OCaml kernel, with <a href="https://nbgrader.readthedocs.io/en/stable/">nbgrader</a> to do the assessment. Our students can see the tests that are run, and if they fail they're able to copy the test cell out and play with their code to figure out exactly what went wrong. The AI models, however, have a far less interactive experience, and get just 3 chances to write code that passes the tests. 
We found that the performance of the models increased hugely when we adjusted the test cells such that they clearly indicated which test failed, the results that were expected, and the results the code actually produced.</p> 1482 + <p>Of course, we <a href="https://anil.recoil.org/notes/claude-copilot-sandbox">already knew</a> that AI models can code OCaml very well, and we (along with the rest of the teaching world) are still ruminating on the implications of this from a pedagogical perspective. Our plan, though, is to try and make the 'problem' worse by training these models on more OCaml code, and see just how well we can get them to perform! It's pretty amazing, and a little startling to know that a model that'll run pretty comfortably on my laptop can solve these problems so well even without extra training, though given how hot it gets, I'd rather not have the laptop on my actual lap while it's doing so!</p>]]></content> 1483 + </entry> 1484 + <entry> 1485 + <id>https://jon.recoil.org/blog/2025/05/oxcaml-gets-closer.html</id> 1486 + <title>OxCaml is getting closer...</title> 1487 + <published>2025-05-02T00:00:00Z</published> 1488 + <updated>2025-05-02T00:00:00Z</updated> 1489 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/oxcaml-gets-closer.html"/> 1490 + <summary>I joined the OxCaml weekly meeting representing Tarides for the first time this week, as Jane Street gear up to an official release of their OxCaml compiler.</summary> 1491 + <content type="html"><![CDATA[<h1 id="oxcaml-is-getting-closer..."><a href="#oxcaml-is-getting-closer..." 
class="anchor"></a>OxCaml is getting closer...</h1> 1492 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-02</p></li></ul> 1493 + <p>I joined the OxCaml weekly meeting representing Tarides for the first time this week, as Jane Street gear up to an official release of their OxCaml compiler.</p> 1494 + <p>It seems that mainly what needs to be done before the release can be made is to ensure there is some reasonable documentation for the new features, and that a reasonable number of packages are working, so people are furiously writing and bugfixing to try and get this ready.</p> 1495 + <p>As well as this though, there are some challenges of a more organisational level that will need to be addressed to ensure the success of the project. Jane Street have long had a public branch of their compiler, but while they've had patches internally to ensure the tooling and other libraries work, these patches haven't previously been made public in a usable way. In order for OxCaml to be useful, it will clearly need these patches not only to be available, but also to be maintained and to easily allow contributions from the community -- in short, they need to be properly Open Source!</p> 1496 + <p>Personally, I'm looking forward to seeing their branch of <a href="https://ocaml.github.io/odoc/">odoc</a> and having a look to see how the modes will fit into the documentation. 
I'm also keen to see whether the <a href="../04/this-site.html" title="this-site">notebook features</a> I've been working on can be ported over to run on OxCaml!</p>]]></content> 1497 + </entry> 1498 + <entry> 1499 + <id>https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html</id> 1500 + <title>AI for Climate &amp; Nature Community Day</title> 1501 + <published>2025-05-01T00:00:00Z</published> 1502 + <updated>2025-05-01T00:00:00Z</updated> 1503 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html"/> 1504 + <summary> Today was the &quot;AI for Climate &amp; Nature Community Day&quot; at the . A whole bunch of the EEG were either presenting or contributing in some way so I thought I'd come along to see what's going on.</summary> 1505 + <content type="html"><![CDATA[<h1 id="ai-for-climate-&amp;-nature-community-day"><a href="#ai-for-climate-&amp;-nature-community-day" class="anchor"></a>AI for Climate &amp; Nature Community Day</h1> 1506 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-01</p></li></ul> 1507 + <div><a href="melissa.jpg" class="img-link"><img src="melissa.jpg" alt="Melissa Leach"/></a></div> 1508 + <p><i>Melissa Leach introducing the day</i> Today was the &quot;AI for Climate &amp; Nature Community Day&quot; at the <a href="https://map.cam.ac.uk/?maplon=0.12032&amp;maplat=52.20354&amp;mapzoom=18&amp;maplayers=Building+Labels%2CExternal+Sites%2CColleges%2CUniversity+Sites%2CBuildings%2CTransport&amp;mapfeature=mfid257%2CBuildings">David Attenborough Building</a>. 
A whole bunch of the EEG were either presenting or contributing in some way so I thought I'd come along to see what's going on.</p> 1509 + <h2 id="keynote-and-main-talks"><a href="#keynote-and-main-talks" class="anchor"></a>Keynote and main talks</h2> 1510 + <p>Following the intro talks from Professors <a href="https://www.cambridgeconservation.org/about/people/prof-melissa-leach/">Melissa Leach</a> and <a href="https://www.zoo.cam.ac.uk/directory/bill-sutherland">Bill Sutherland</a>, the day started with the keynote talk from <a href="https://www.biology.ox.ac.uk/people/amy-hinsley">Amy Hinsley</a>, who, using the specific case of animal trafficking, talked about the need to make AI in conservation equitable, explainable and useful.</p> 1511 + <div><a href="amy.jpg" class="img-link"><img src="amy.jpg" alt="Amy Hinsley"/></a></div> 1512 + <p><i>Amy Hinsley delivering the keynote talk</i></p> 1513 + <p>We then moved into the first session with <a href="https://www.geog.cam.ac.uk/people/lines/">Emily Lines</a> from the Geography Department who talked about the challenges of processing sensor data in the context of forests. Her group has a variety of data from forests across Europe, gathered using many different methods, from drones taking pictures of the canopies to ground-based laser scanners producing 3D point clouds. 
The challenge is then not only to identify individual trees, which is pretty tricky, but also to then distinguish between the leaves of the trees and the wood.</p> 1514 + <p>After Emily came <a href="https://ai.cam.ac.uk/people/robert-rouse.html">Robert Rouse</a> from the <a href="https://iccs.cam.ac.uk">ICCS</a>, who's using a small neural net and genetic algorithms to extend a study from the RSPB on figuring out an optimal way to do some land use adjustments to cut carbon and improve outcomes for birds, whilst not significantly impacting the ability to produce food.</p> 1515 + <p>We then had <a href="https://www.zoo.cam.ac.uk/directory/dr-sam-reynolds">Sam Reynolds</a> and <a href="https://toao.com">Sadiq Jaffer</a> who talked about their project; using AI to sift through millions of papers looking for those relevant to a specified conservation topic. They're able to directly compare their results with results obtained by manually doing this process, a project that's been going on over the last 20 or so years summing to something like 75 man years of effort. 
In the end they only missed a few papers that the manual process had found, but actually found many relevant papers that had been missed - and all in only a few days of compute.</p> 1516 + <div><a href="sadiq.jpg" class="img-link"><img src="sadiq.jpg" alt="Sam Reynolds and Sadiq Jaffer"/></a></div> 1517 + <p><i>Sam Reynolds and Sadiq Jaffer sorting millions of papers</i></p> 1518 + <h2 id="lightning-talks"><a href="#lightning-talks" class="anchor"></a>Lightning talks</h2> 1519 + <p>We then had a number of 'lightning talks', with each presenter having only three minutes to talk about their work.</p> 1520 + <ul><li><a href="https://www.maths.cam.ac.uk/person/ss3299">Sebastian Schemm</a> presented his work on creating a foundational model for the climate</li><li><a href="https://www.eng.cam.ac.uk/profiles/ac685">Alice Cicirello</a> talked about the prospects of applying machine learning to <a href="https://en.wikipedia.org/wiki/Marine_cloud_brightening">Marine Cloud Brightening</a></li><li><a href="https://www.maths.cam.ac.uk/person/sdat2">Simon Thomas</a> has been looking at analysing the heights of tropical cyclone storm surges</li><li><a href="https://github.com/niccolozanotti">Niccolò Zanotti</a> gave us an introduction to <a href="https://github.com/cambridge-ICCS/FTorch">FTorch</a>, a library to integrate the worlds of PyTorch and Fortran</li><li><a href="https://www.nceo.ac.uk/contact-us/people/dr-simon-driscoll/">Simon Driscoll</a> then talked about melt ponds on arctic sea ice, a poorly understood but important component of the climate in the Arctic.</li><li><a href="https://www.zoo.cam.ac.uk/directory/emilio-luz-ricca">Emilio Luz-Ricca</a> talked about his project to apply machine learning to predict hunting pressure</li><li><a href="https://orlando-code.github.io">Orlando Timmerman</a> gave us some insights into how he's been using machine learning to predict the future of coral reefs, and how we might use this to help with their conservation.</li><li><a 
href="https://www.zoo.cam.ac.uk/directory/ruari-marshall-hawkes">Ruari Marshall-Hawkes</a> showed us how to listen very carefully to figure out population numbers,</li><li><a href="https://www.linkedin.com/in/harriet-branson-a93a8313b/">Hattie Branson</a> from <a href="https://www.fauna-flora.org">Fauna &amp; Flora</a> talked about habitat detection in South Sudan,</li><li><a href="https://www.linkedin.com/in/martakoch/">Marta Koch</a> showed us an analysis of how well ChatGPT, Claude and the like would perform at setting the agendas for SDPs,</li><li><a href="https://www.linkedin.com/in/zhengpeng-feng-2410a132a/">Frank Feng</a> finished the session with a talk on the <a href="https://www.cst.cam.ac.uk/seminars/list/227335">Barlow Twins Earth Foundation Model</a>.</li></ul> 1243 1521 1244 - &lt;div style=&quot;display: grid; grid-template-columns: 1fr 1fr; gap: 20px;&quot;&gt; 1245 - &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1246 - &lt;img src=&quot;sebastian.jpg&quot; alt=&quot;Sebastian Schemm&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1247 - &lt;figcaption&gt;Sebastian Schemm&lt;/figcaption&gt; 1248 - &lt;/figure&gt; 1249 - &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1250 - &lt;img src=&quot;alice.jpg&quot; alt=&quot;Alice Cicirello&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1251 - &lt;figcaption&gt;Alice Cicirello&lt;/figcaption&gt; 1252 - &lt;/figure&gt; 1253 - &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1254 - &lt;img src=&quot;simon.jpg&quot; alt=&quot;Simon Thomas&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1255 - &lt;figcaption&gt;Simon Thomas&lt;/figcaption&gt; 1256 - &lt;/figure&gt; 1257 - &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1258 - &lt;img src=&quot;simond.jpg&quot; alt=&quot;Simon Driscoll&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1259 - &lt;figcaption&gt;Simon Driscoll&lt;/figcaption&gt; 1260 - &lt;/figure&gt; 1261 - &lt;figure 
style=&quot;margin:0; width: 100%;&quot;&gt; 1262 - &lt;img src=&quot;emilio.jpg&quot; alt=&quot;Emilio Luz-Ricca&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1263 - &lt;figcaption&gt;Emilio Luz-Ricca&lt;/figcaption&gt; 1264 - &lt;/figure&gt; 1265 - &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1266 - &lt;img src=&quot;orlando.jpg&quot; alt=&quot;Orlando Timmerman&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1267 - &lt;figcaption&gt;Orlando Timmerman&lt;/figcaption&gt; 1268 - &lt;/figure&gt; 1269 - &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1270 - &lt;img src=&quot;ruari.jpg&quot; alt=&quot;Ruari Marshall-Hawkes&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1271 - &lt;figcaption&gt;Ruari Marshall-Hawkes&lt;/figcaption&gt; 1272 - &lt;/figure&gt; 1273 - &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1274 - &lt;img src=&quot;hattie.jpg&quot; alt=&quot;Hattie Branson&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1275 - &lt;figcaption&gt;Hattie Branson&lt;/figcaption&gt; 1276 - &lt;/figure&gt; 1277 - &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1278 - &lt;img src=&quot;frank.jpg&quot; alt=&quot;Frank Feng&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1279 - &lt;figcaption&gt;Frank Feng&lt;/figcaption&gt; 1280 - &lt;/figure&gt; 1522 + <div style="display: grid; grid-template-columns: 1fr 1fr; gap: 20px;"> 1523 + <figure style="margin:0; width: 100%;"> 1524 + <img src="sebastian.jpg" alt="Sebastian Schemm" style="max-width: 100%; height: auto;"> 1525 + <figcaption>Sebastian Schemm</figcaption> 1526 + </figure> 1527 + <figure style="margin:0; width: 100%;"> 1528 + <img src="alice.jpg" alt="Alice Cicirello" style="max-width: 100%; height: auto;"> 1529 + <figcaption>Alice Cicirello</figcaption> 1530 + </figure> 1531 + <figure style="margin:0; width: 100%;"> 1532 + <img src="simon.jpg" alt="Simon Thomas" style="max-width: 100%; height: auto;"> 1533 + <figcaption>Simon 
Thomas</figcaption> 1534 + </figure> 1535 + <figure style="margin:0; width: 100%;"> 1536 + <img src="simond.jpg" alt="Simon Driscoll" style="max-width: 100%; height: auto;"> 1537 + <figcaption>Simon Driscoll</figcaption> 1538 + </figure> 1539 + <figure style="margin:0; width: 100%;"> 1540 + <img src="emilio.jpg" alt="Emilio Luz-Ricca" style="max-width: 100%; height: auto;"> 1541 + <figcaption>Emilio Luz-Ricca</figcaption> 1542 + </figure> 1543 + <figure style="margin:0; width: 100%;"> 1544 + <img src="orlando.jpg" alt="Orlando Timmerman" style="max-width: 100%; height: auto;"> 1545 + <figcaption>Orlando Timmerman</figcaption> 1546 + </figure> 1547 + <figure style="margin:0; width: 100%;"> 1548 + <img src="ruari.jpg" alt="Ruari Marshall-Hawkes" style="max-width: 100%; height: auto;"> 1549 + <figcaption>Ruari Marshall-Hawkes</figcaption> 1550 + </figure> 1551 + <figure style="margin:0; width: 100%;"> 1552 + <img src="hattie.jpg" alt="Hattie Branson" style="max-width: 100%; height: auto;"> 1553 + <figcaption>Hattie Branson</figcaption> 1554 + </figure> 1555 + <figure style="margin:0; width: 100%;"> 1556 + <img src="frank.jpg" alt="Frank Feng" style="max-width: 100%; height: auto;"> 1557 + <figcaption>Frank Feng</figcaption> 1558 + </figure> 1281 1559 1282 - &lt;/div&gt; 1560 + </div> 1283 1561 1284 - &lt;h2 id=&quot;discussions&quot;&gt;&lt;a href=&quot;#discussions&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Discussions&lt;/h2&gt; 1285 - &lt;p&gt;We then split up into three discussion groups; one on the future of this work, one on how to continue building this community of researchers, and the last on applying AI to real-world problems. As a newcomer to the field I was interested in the direction it's heading in, so I joined in &lt;a href=&quot;https://dorchard.github.io&quot;&gt;Dominic Orchard&lt;/a&gt;'s led session on the future of AI.&lt;/p&gt; 1286 - &lt;p&gt;We had a fascinating discussion on both the immediate things we can do and longer term worries. 
We were imagining a world where AI becomes 'just a tool' that we don't need to be experts in to apply it, but right now we're in a much more tightly coupled collaborative world where we need experts in AI to complement the experts in the application field to make progress. This comes with challenges - applying for funding for multidisciplinary work is not the norm, so we spent some time discussing this too.&lt;/p&gt; 1287 - &lt;p&gt;One of our group spoke about statistics now being 'just a tool', but it's been one that we've worked with for a long time now and we know where the sharp corners are. We have protocols for applying statistical tools and we have diagnostic plots to tell us whether the results are trustworthy, but not only do we not have these for AI models, but it's not yet clear whether such a thing will be even possible.&lt;/p&gt; 1288 - &lt;p&gt;Overall it was a fascinating day, and I'm very much looking forward to following the work of these outstanding researchers, and maybe even contributing to their work in some way in the future.&lt;/p&gt; 1289 - &lt;p&gt;&lt;i&gt;Thanks to &lt;a href=&quot;https://anil.recoil.org&quot;&gt;Anil Madhavapeddy&lt;/a&gt; for the photos of the day.&lt;/i&gt;&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html</id><title type="text">AI for Climate &amp; Nature Community Day</title><updated>2025-05-01T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">The release of Odoc 3 means that we need to update the project so that the documentation that appears on is using the latest, greatest Odoc. 
With this major release of Odoc, it's also time to give t...</summary><published>2025-04-29T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;ocaml-docs-ci-and-odoc-3&quot;&gt;&lt;a href=&quot;#ocaml-docs-ci-and-odoc-3&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OCaml-Docs-CI and Odoc 3&lt;/h1&gt; 1290 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-29&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1291 - &lt;p&gt;The release of Odoc 3 means that we need to update the &lt;em&gt;docs-ci&lt;/em&gt; project so that the documentation that appears on &lt;em&gt;ocaml.org&lt;/em&gt; is using the latest, greatest Odoc. With this major release of Odoc, it's also time to give the CI pipeline a bit of an overhaul too, and fix some of the irritations that it causes.&lt;/p&gt; 1292 - &lt;h2 id=&quot;the-challenge-of-documenting-ocaml&quot;&gt;&lt;a href=&quot;#the-challenge-of-documenting-ocaml&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The challenge of documenting OCaml&lt;/h2&gt; 1293 - &lt;p&gt;As I wrote about &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/semantic-versioning-is-hard&quot;&gt;recently&lt;/span&gt;, the APIs of OCaml libraries are dependent not only on the version of its package, but possibly also on the versions of any of its dependencies. Due to this fact, to produce the docs for ocaml.org means that sometimes we need to build the docs for a particular version of a particular package multiple times with different versions of its dependencies.&lt;/p&gt; 1294 - &lt;p&gt;It's clearly impractical to try to build every possible combination, so what we do is to run the opam solver once for each version of each package. This gives us a set of packages at particular versions. 
We then take that, and for each package in the set, we pluck out &lt;i&gt;its&lt;/i&gt; dependencies from the set, producing a &amp;quot;universe&amp;quot; of dependencies for every package in the set. Let's look at a very simple example; the package &lt;code&gt;cry&lt;/code&gt; from the &lt;a href=&quot;https://www.liquidsoap.info&quot;&gt;LiquidSoap&lt;/a&gt; project.&lt;/p&gt; 1295 - &lt;p&gt;The oldest version of &lt;code&gt;cry&lt;/code&gt; from before the &lt;a href=&quot;https://discuss.ocaml.org/t/opam-repository-archival-phase-1-unavailable-packages/15797/6&quot;&gt;Great Purge&lt;/a&gt; was 0.2.2, which when solved produced the following dependencies:&lt;/p&gt; 1296 - &lt;pre&gt;cry.0.2.2 1562 + <h2 id="discussions"><a href="#discussions" class="anchor"></a>Discussions</h2> 1563 + <p>We then split up into three discussion groups; one on the future of this work, one on how to continue building this community of researchers, and the last on applying AI to real-world problems. As a newcomer to the field I was interested in the direction it's heading in, so I joined in <a href="https://dorchard.github.io">Dominic Orchard</a>'s led session on the future of AI.</p> 1564 + <p>We had a fascinating discussion on both the immediate things we can do and longer term worries. We were imagining a world where AI becomes 'just a tool' that we don't need to be experts in to apply it, but right now we're in a much more tightly coupled collaborative world where we need experts in AI to complement the experts in the application field to make progress. This comes with challenges - applying for funding for multidisciplinary work is not the norm, so we spent some time discussing this too.</p> 1565 + <p>One of our group spoke about statistics now being 'just a tool', but it's been one that we've worked with for a long time now and we know where the sharp corners are. 
We have protocols for applying statistical tools and we have diagnostic plots to tell us whether the results are trustworthy, but not only do we not have these for AI models, but it's not yet clear whether such a thing will be even possible.</p> 1566 + <p>Overall it was a fascinating day, and I'm very much looking forward to following the work of these outstanding researchers, and maybe even contributing to their work in some way in the future.</p> 1567 + <p><i>Thanks to <a href="https://anil.recoil.org">Anil Madhavapeddy</a> for the photos of the day.</i></p>]]></content> 1568 + </entry> 1569 + <entry> 1570 + <id>https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html</id> 1571 + <title>OCaml-Docs-CI and Odoc 3</title> 1572 + <published>2025-04-29T00:00:00Z</published> 1573 + <updated>2025-04-29T00:00:00Z</updated> 1574 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html"/> 1575 + <summary>The release of Odoc 3 means that we need to update the project so that the documentation that appears on is using the latest, greatest Odoc. With this major release of Odoc, it's also time to give t...</summary> 1576 + <content type="html"><![CDATA[<h1 id="ocaml-docs-ci-and-odoc-3"><a href="#ocaml-docs-ci-and-odoc-3" class="anchor"></a>OCaml-Docs-CI and Odoc 3</h1> 1577 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-29</p></li></ul> 1578 + <p>The release of Odoc 3 means that we need to update the <em>docs-ci</em> project so that the documentation that appears on <em>ocaml.org</em> is using the latest, greatest Odoc. 
With this major release of Odoc, it's also time to give the CI pipeline a bit of an overhaul too, and fix some of the irritations that it causes.</p> 1579 + <h2 id="the-challenge-of-documenting-ocaml"><a href="#the-challenge-of-documenting-ocaml" class="anchor"></a>The challenge of documenting OCaml</h2> 1580 + <p>As I wrote about <a href="semantic-versioning-is-hard.html" title="semantic-versioning-is-hard">recently</a>, the APIs of OCaml libraries are dependent not only on the version of its package, but possibly also on the versions of any of its dependencies. Due to this fact, to produce the docs for ocaml.org means that sometimes we need to build the docs for a particular version of a particular package multiple times with different versions of its dependencies.</p> 1581 + <p>It's clearly impractical to try to build every possible combination, so what we do is to run the opam solver once for each version of each package. This gives us a set of packages at particular versions. We then take that, and for each package in the set, we pluck out <i>its</i> dependencies from the set, producing a &quot;universe&quot; of dependencies for every package in the set. 
Let's look at a very simple example; the package <code>cry</code> from the <a href="https://www.liquidsoap.info">LiquidSoap</a> project.</p> 1582 + <p>The oldest version of <code>cry</code> from before the <a href="https://discuss.ocaml.org/t/opam-repository-archival-phase-1-unavailable-packages/15797/6">Great Purge</a> was 0.2.2, which when solved produced the following dependencies:</p> 1583 + <pre>cry.0.2.2 1297 1584 ocaml.4.05.0 1298 1585 ocaml-base-compiler.4.05.0 1299 1586 ocaml-config.1 1300 - ocamlfind.1.9.6&lt;/pre&gt; 1301 - &lt;p&gt;and the oldest version of &lt;code&gt;cry&lt;/code&gt; after the purge is 0.6.0 which produces the following solution:&lt;/p&gt; 1302 - &lt;pre&gt;cry.0.6.0 1587 + ocamlfind.1.9.6</pre> 1588 + <p>and the oldest version of <code>cry</code> after the purge is 0.6.0 which produces the following solution:</p> 1589 + <pre>cry.0.6.0 1303 1590 ocaml.5.2.1 1304 1591 ocaml-base-compiler.5.2.1 1305 1592 ocaml-config.3 1306 - ocamlfind.1.9.6&lt;/pre&gt; 1307 - &lt;p&gt;so we we can see from these two solutions that we'll need to build &lt;code&gt;ocamlfind.1.9.6&lt;/code&gt; twice, once with &lt;code&gt;ocaml.4.05.0&lt;/code&gt; and once with &lt;code&gt;ocaml.5.2.1&lt;/code&gt;.&lt;/p&gt; 1308 - &lt;p&gt;Once we've got, for every version of every package, a set of dependency universes, we choose one of these to be the one presented to the user under the &lt;code&gt;ocaml.org/p/&lt;/code&gt; hierarchy. For all of the other universes, we build the package againt them, and put the docs under the &lt;code&gt;ocaml.org/u/&lt;/code&gt; hierarchy.&lt;/p&gt; 1309 - &lt;h2 id=&quot;performing-the-builds&quot;&gt;&lt;a href=&quot;#performing-the-builds&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Performing the builds&lt;/h2&gt; 1310 - &lt;p&gt;Once we've got a complete set of solutions and builds to do, the current CI pipeline batches the builds up to try and build as many packages as possible in as few builds as possible. 
While this works well enough, it does mean that we build a lot packages more than once - dune, for example, is built thousands of times during this process, producing exactly the same binaries each time.&lt;/p&gt; 1311 - &lt;p&gt;In the new pipeline, I wrote a &lt;a href=&quot;https://github.com/jonludlam/opamh&quot;&gt;little tool&lt;/a&gt; that allows opam packages to be archived and restored, which happens to work nicely because we're always building the packages in the same container in the same location. This means there are no worries about relocatability, although that is something that is &lt;a href=&quot;https://www.dra27.uk/blog/platform/2025/04/22/branching-out.html&quot;&gt;nearly here!&lt;/a&gt;&lt;/p&gt; 1312 - &lt;p&gt;The downside to this is that our storage requirements are quite a bit larger, as we're storing the entire package rather than just the bits that odoc needs. However, we were always going to use more storage than before simply because the new &lt;code&gt;odoc&lt;/code&gt; and &lt;code&gt;odoc_driver&lt;/code&gt; pair are more capable, and the new features like &lt;a href=&quot;https://github.com/ocaml/odoc/pull/909&quot;&gt;source code rendering&lt;/a&gt; and &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1121/files#diff-10c8829023814c0bcc3316f95f643623404c000b13c68849ef3d61097a6e03a6R1-R415&quot;&gt;classify&lt;/a&gt; require more files from the original package.&lt;/p&gt; 1313 - &lt;p&gt;The upshot is that I'll be working with &lt;a href=&quot;https://www.tunbury.org/&quot;&gt;Mark Elvers&lt;/a&gt; to move the docs CI from its current machine to a shiny new &lt;a href=&quot;https://www.tunbury.org/blade-reallocation/&quot;&gt;blade server&lt;/a&gt;.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html</id><title type="text">OCaml-Docs-CI and Odoc 3</title><updated>2025-04-29T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary 
type="text">Odoc 3 was and although we did write a list of the new features, I don't think we've made it clear enough why anyone should care.</summary><published>2025-04-25T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/odoc-3.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;odoc-3:-so-what?&quot;&gt;&lt;a href=&quot;#odoc-3:-so-what?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc 3: So what?&lt;/h1&gt; 1314 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-25&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1315 - &lt;p&gt;Odoc 3 was &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-0-released/16339&quot;&gt;released last month&lt;/a&gt; and although we did write a list of the new features, I don't think we've made it clear enough why anyone should care.&lt;/p&gt; 1316 - &lt;p&gt;It's &lt;b&gt;manuals&lt;/b&gt;, the theme of Odoc 3 is &lt;b&gt;manuals&lt;/b&gt;. It's got a load of features to make it much better for writing &lt;code&gt;mld&lt;/code&gt; pages (files written using odoc's markup) to document your packages and their relationship to the surrounding ecosystem. Previous versions of Odoc were very library-centric, in that while we did have mld-file support, most of the effort went into making sure that we were generating correct per-module pages, which show the shape of your API even if you've not put in any doc comments at all. 
We've still got that, obviously, but we've added many features to make write &lt;code&gt;mld&lt;/code&gt; pages far more useful, and we're really hoping that these will draw people in to make documenting packages a much more enjoyable experience.&lt;/p&gt; 1317 - &lt;h2 id=&quot;odoc's-special-skill:-links!&quot;&gt;&lt;a href=&quot;#odoc's-special-skill:-links!&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc's special skill: links!&lt;/h2&gt; 1318 - &lt;p&gt;But why you might want to use Odoc at all for your package's manuals, rather than, say, markdown, asciidoc, rst or any other similar language? The biggest thing that Odoc brings, and has always brought, is &lt;b&gt;reliable linking&lt;/b&gt;. Just write &lt;code&gt;{!Module.func}&lt;/code&gt; and Odoc will check that the target exists and ensure that the link goes to the correct place, no matter how complex the definition of &lt;code&gt;Module&lt;/code&gt; is or what the layout of the docs. We can link to almost all elements of an OCaml library, from modules and types through to fields of records, exceptions and extensions, and we have facilities for disambiguating, so if you happen to have both a module &lt;code&gt;S&lt;/code&gt; and a module type &lt;code&gt;S&lt;/code&gt; you can easily link to whichever you please.&lt;/p&gt; 1319 - &lt;p&gt;In Odoc 2 though, these links were pretty limited - the only ones possible were only those to docs and API elements (modules, types, values, etc) in your own package, or to API elements in any libraries that your package depends on. When writing API docs, which tend to be at the level of types and functions, this wasn't a huge problem, but when considering manuals this turned out to be a really limiting constraint. 
For example, in Odoc's own docs, we really want to have a link to &lt;code&gt;odoc-driver&lt;/code&gt;, but since &lt;code&gt;odoc-driver&lt;/code&gt; is a separate package and depends upon &lt;code&gt;odoc&lt;/code&gt;, the only way to do that in Odoc 2.x would be to use an HTML link. With Odoc 3, this constraint is gone, so you can &lt;b&gt;link to any other package or library&lt;/b&gt;. The link to &lt;code&gt;odoc-driver&lt;/code&gt; would look like &lt;code&gt;{!/odoc-driver/page-index}&lt;/code&gt;, as can be seen in &lt;a href=&quot;https://github.com/ocaml/odoc/blob/master/doc/driver.mld#L10&quot;&gt;odoc's source&lt;/a&gt;. The only requirement is that you must be able to simultaneously install all of the packages you'd like to link to, so you can't easily link to, for example, different versions of the same package.&lt;/p&gt; 1320 - &lt;p&gt;This will be particularly useful for any projects that's grouped into multiple packages. For example, the &lt;a href=&quot;https://mirage.io&quot;&gt;Mirage project&lt;/a&gt;. The main package there -- &lt;code&gt;mirage&lt;/code&gt; -- is actually right at the bottom of the dependency hierarchy, but it's the perfect place to have docs that link to all of the other Mirage packages. 
On a smaller scale, the &lt;a href=&quot;https://github.com/ocaml-multicore/picos&quot;&gt;Picos project&lt;/a&gt; consists of multiple packages all from a single git repository, and this would allow the docs pages from the &lt;code&gt;picos&lt;/code&gt; package to link to any of the other packages.&lt;/p&gt; 1321 - &lt;p&gt;Of course there are also a lot of other new features in this release, which are called out in the &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043&quot;&gt;annoucement post on discuss&lt;/a&gt;, some of which I may post about in the future.&lt;/p&gt; 1322 - &lt;h2 id=&quot;can-i-use-it-now?&quot;&gt;&lt;a href=&quot;#can-i-use-it-now?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Can I use it now?&lt;/h2&gt; 1323 - &lt;p&gt;Of course! These new features can be used right now, so long as you're happy to self-host the docs. All that's needed is to create a switch containing all the packages you're interested in together, and use &lt;code&gt;odoc_driver&lt;/code&gt; to generate the HTML and push them to your web server. At time of writing though, ocaml.org is still using Odoc 2.4, so any packages that are published to opam that choose to use these new features will be missing the new features. Furthermore, it's actually quite a challenge to do this, since we'll have to extend the package-universe solutions to include all relevant packages, for which we need extra fields in the opam metadata.&lt;/p&gt; 1324 - &lt;h2 id=&quot;what's-next?&quot;&gt;&lt;a href=&quot;#what's-next?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What's next?&lt;/h2&gt; 1325 - &lt;p&gt;We're actively working on getting Odoc 3 into the pipeline generating the docs found in https://ocaml.org/p/. This will bring with it some of the developments that landed in Odoc 2, but didn't make it onto ocaml.org - for example, the rendering of source pages. 
Not only are there challenges related to the package-universe solutions as mentioned above, but the storage requirements are considerably larger, so I'll be working with &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark Elvers&lt;/a&gt; to complete this project.&lt;/p&gt; 1326 - &lt;p&gt;We've also got work to do to update the build rules in dune to take advantage of these features. While &lt;code&gt;odoc_driver&lt;/code&gt; works very well as part of the process of deploying a docs site, it's quite impractical as a tool to help while you're actually writing the docs. For that, we'll need to make sure &lt;code&gt;dune&lt;/code&gt; understands how to use these new features. Fortunately we've had some experience with those rules in the past, and part of the work that's gone into Odoc 3 was to ensure that incremental build rules should be far more straightforward to write than for Odoc 2. In addition, some of the logic that previously only existed in &lt;a href=&quot;https://github.com/ocaml-doc/voodoo&quot;&gt;Voodoo&lt;/a&gt; - the old driver that builds docs for ocaml.org - has been integrated into &lt;code&gt;odoc&lt;/code&gt; itself, meaning one again that getting dune to produce correct docs for non-dune packages (e.g. the standard library!) should again be simpler.&lt;/p&gt; 1327 - &lt;p&gt;After we've done these, there are plans afoot to make more improvements to the manual writing experience. &lt;a href=&quot;https://choum.net/&quot;&gt;@panglesd&lt;/a&gt; has been investigating how to add admonitions to the language, I've been thinking about custom tag support, we're looking at the &lt;a href=&quot;https://discuss.ocaml.org/t/ann-oxidizing-ocaml-an-update/15237&quot;&gt;modes&lt;/a&gt; work coming from Jane Street to see how to support that. 
There's plenty more to do, so if you'd like to lend a hand, reach out and join in!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/odoc-3.html</id><title type="text">Odoc 3: So what?</title><updated>2025-04-25T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text"> is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. So, is it possible to make our OCaml libraries stick to this scheme? There are some projec...</summary><published>2025-04-20T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;semantic-versioning-in-ocaml-is-hard&quot;&gt;&lt;a href=&quot;#semantic-versioning-in-ocaml-is-hard&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Semantic Versioning in OCaml is Hard&lt;/h1&gt; 1328 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-20&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1329 - &lt;p&gt;&lt;a href=&quot;https://semver.org&quot;&gt;Semantic versioning&lt;/a&gt; is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. So, is it possible to make our OCaml libraries stick to this scheme? There are some projects that are trying to do this, including a recent &lt;a href=&quot;https://www.outreachy.org&quot;&gt;Outreachy&lt;/a&gt; project by &lt;a href=&quot;https://github.com/azzsal/&quot;&gt;Abdulaziz Alkurd&lt;/a&gt; mentored by &lt;a href=&quot;https://choum.net&quot;&gt;panglesd&lt;/a&gt; and &lt;a href=&quot;https://github.com/nathanreb&quot;&gt;Nathan Reb&lt;/a&gt;. 
While this is a great start, there are some subtleties of the OCaml module system that make it a good deal more complex than in other languages.&lt;/p&gt; 1330 - &lt;h2 id=&quot;opam-format.2.3.0-≠-opam-format.2.3.0?&quot;&gt;&lt;a href=&quot;#opam-format.2.3.0-≠-opam-format.2.3.0?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;opam-format.2.3.0 ≠ opam-format.2.3.0?&lt;/h2&gt; 1331 - &lt;p&gt;Let's take the case that hit me this morning. I've been working on &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt; in order to bring the exciting new &lt;a href=&quot;https://ocaml.github.io/odoc&quot;&gt;odoc 3&lt;/a&gt; features to &lt;a href=&quot;https://ocaml.org/&quot;&gt;ocaml.org&lt;/a&gt; for everyone to enjoy. I have it checked out and building locally, but to deploy it to the infrastructure managed by &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark Elvers&lt;/a&gt; it needs to be packaged up into a Docker image. So I issued the usual &lt;code&gt;docker build .&lt;/code&gt; and after it churned through the setup stages and got on to building the project, it hit an error:&lt;/p&gt; 1332 - &lt;pre&gt;File &amp;quot;src/solver/solver.ml&amp;quot;, line 58, characters 75-98: 1333 - let deps = List.map (fun pkg -&amp;gt; OpamPackage.Map.find pkg simple_deps) (OpamPackage.Set.to_list pkgs) in 1593 + ocamlfind.1.9.6</pre> 1594 + <p>so we can see from these two solutions that we'll need to build <code>ocamlfind.1.9.6</code> twice, once with <code>ocaml.4.05.0</code> and once with <code>ocaml.5.2.1</code>.</p> 1595 + <p>Once we've got, for every version of every package, a set of dependency universes, we choose one of these to be the one presented to the user under the <code>ocaml.org/p/</code> hierarchy. 
For all of the other universes, we build the package against them, and put the docs under the <code>ocaml.org/u/</code> hierarchy.</p> 1596 + <h2 id="performing-the-builds"><a href="#performing-the-builds" class="anchor"></a>Performing the builds</h2> 1597 + <p>Once we've got a complete set of solutions and builds to do, the current CI pipeline batches the builds up to try and build as many packages as possible in as few builds as possible. While this works well enough, it does mean that we build a lot of packages more than once - dune, for example, is built thousands of times during this process, producing exactly the same binaries each time.</p> 1598 + <p>In the new pipeline, I wrote a <a href="https://github.com/jonludlam/opamh">little tool</a> that allows opam packages to be archived and restored, which happens to work nicely because we're always building the packages in the same container in the same location. This means there are no worries about relocatability, although that is something that is <a href="https://www.dra27.uk/blog/platform/2025/04/22/branching-out.html">nearly here!</a></p> 1599 + <p>The downside to this is that our storage requirements are quite a bit larger, as we're storing the entire package rather than just the bits that odoc needs. 
However, we were always going to use more storage than before simply because the new <code>odoc</code> and <code>odoc_driver</code> pair are more capable, and the new features like <a href="https://github.com/ocaml/odoc/pull/909">source code rendering</a> and <a href="https://github.com/ocaml/odoc/pull/1121/files#diff-10c8829023814c0bcc3316f95f643623404c000b13c68849ef3d61097a6e03a6R1-R415">classify</a> require more files from the original package.</p> 1600 + <p>The upshot is that I'll be working with <a href="https://www.tunbury.org/">Mark Elvers</a> to move the docs CI from its current machine to a shiny new <a href="https://www.tunbury.org/blade-reallocation/">blade server</a>.</p>]]></content> 1601 + </entry> 1602 + <entry> 1603 + <id>https://jon.recoil.org/blog/2025/04/odoc-3.html</id> 1604 + <title>Odoc 3: So what?</title> 1605 + <published>2025-04-25T00:00:00Z</published> 1606 + <updated>2025-04-25T00:00:00Z</updated> 1607 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/odoc-3.html"/> 1608 + <summary>Odoc 3 was released last month and although we did write a list of the new features, I don't think we've made it clear enough why anyone should care.</summary> 1609 + <content type="html"><![CDATA[<h1 id="odoc-3:-so-what?"><a href="#odoc-3:-so-what?" class="anchor"></a>Odoc 3: So what?</h1> 1610 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-25</p></li></ul> 1611 + <p>Odoc 3 was <a href="https://discuss.ocaml.org/t/ann-odoc-3-0-released/16339">released last month</a> and although we did write a list of the new features, I don't think we've made it clear enough why anyone should care.</p> 1612 + <p>It's <b>manuals</b>, the theme of Odoc 3 is <b>manuals</b>. It's got a load of features to make it much better for writing <code>mld</code> pages (files written using odoc's markup) to document your packages and their relationship to the surrounding ecosystem. 
Previous versions of Odoc were very library-centric, in that while we did have mld-file support, most of the effort went into making sure that we were generating correct per-module pages, which show the shape of your API even if you've not put in any doc comments at all. We've still got that, obviously, but we've added many features to make writing <code>mld</code> pages far more useful, and we're really hoping that these will draw people in and make documenting packages a much more enjoyable experience.</p> 1613 + <h2 id="odoc's-special-skill:-links!"><a href="#odoc's-special-skill:-links!" class="anchor"></a>Odoc's special skill: links!</h2> 1614 + <p>But why might you want to use Odoc at all for your package's manuals, rather than, say, markdown, asciidoc, rst or any other similar language? The biggest thing that Odoc brings, and has always brought, is <b>reliable linking</b>. Just write <code>{!Module.func}</code> and Odoc will check that the target exists and ensure that the link goes to the correct place, no matter how complex the definition of <code>Module</code> is or what the layout of the docs is. We can link to almost all elements of an OCaml library, from modules and types through to fields of records, exceptions and extensions, and we have facilities for disambiguating, so if you happen to have both a module <code>S</code> and a module type <code>S</code> you can easily link to whichever you please.</p> 1615 + <p>In Odoc 2 though, these links were pretty limited - the only ones possible were those to docs and API elements (modules, types, values, etc) in your own package, or to API elements in any libraries that your package depends on. When writing API docs, which tend to be at the level of types and functions, this wasn't a huge problem, but when considering manuals this turned out to be a really limiting constraint. 
For example, in Odoc's own docs, we really want to have a link to <code>odoc-driver</code>, but since <code>odoc-driver</code> is a separate package and depends upon <code>odoc</code>, the only way to do that in Odoc 2.x would be to use an HTML link. With Odoc 3, this constraint is gone, so you can <b>link to any other package or library</b>. The link to <code>odoc-driver</code> would look like <code>{!/odoc-driver/page-index}</code>, as can be seen in <a href="https://github.com/ocaml/odoc/blob/master/doc/driver.mld#L10">odoc's source</a>. The only requirement is that you must be able to simultaneously install all of the packages you'd like to link to, so you can't easily link to, for example, different versions of the same package.</p> 1616 + <p>This will be particularly useful for any project that's grouped into multiple packages, for example the <a href="https://mirage.io">Mirage project</a>. The main package there -- <code>mirage</code> -- is actually right at the bottom of the dependency hierarchy, but it's the perfect place to have docs that link to all of the other Mirage packages. On a smaller scale, the <a href="https://github.com/ocaml-multicore/picos">Picos project</a> consists of multiple packages all from a single git repository, and this would allow the docs pages from the <code>picos</code> package to link to any of the other packages.</p> 1617 + <p>Of course there are also a lot of other new features in this release, which are called out in the <a href="https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043">announcement post on discuss</a>, some of which I may post about in the future.</p> 1618 + <h2 id="can-i-use-it-now?"><a href="#can-i-use-it-now?" class="anchor"></a>Can I use it now?</h2> 1619 + <p>Of course! These new features can be used right now, so long as you're happy to self-host the docs. 
All that's needed is to create a switch containing all the packages you're interested in, then use <code>odoc_driver</code> to generate the HTML and push it to your web server. At time of writing though, ocaml.org is still using Odoc 2.4, so the docs there for any opam packages that use these new features will be missing them. Furthermore, supporting them on ocaml.org is actually quite a challenge, since we'll have to extend the package-universe solutions to include all relevant packages, for which we need extra fields in the opam metadata.</p> 1620 + <h2 id="what's-next?"><a href="#what's-next?" class="anchor"></a>What's next?</h2> 1621 + <p>We're actively working on getting Odoc 3 into the pipeline generating the docs found in https://ocaml.org/p/. This will bring with it some of the developments that landed in Odoc 2, but didn't make it onto ocaml.org - for example, the rendering of source pages. Not only are there challenges related to the package-universe solutions as mentioned above, but the storage requirements are considerably larger, so I'll be working with <a href="https://tunbury.org/">Mark Elvers</a> to complete this project.</p> 1622 + <p>We've also got work to do to update the build rules in dune to take advantage of these features. While <code>odoc_driver</code> works very well as part of the process of deploying a docs site, it's quite impractical as a tool to help while you're actually writing the docs. For that, we'll need to make sure <code>dune</code> understands how to use these new features. Fortunately we've had some experience with those rules in the past, and part of the work that's gone into Odoc 3 was to ensure that incremental build rules should be far more straightforward to write than for Odoc 2. 
In addition, some of the logic that previously only existed in <a href="https://github.com/ocaml-doc/voodoo">Voodoo</a> - the old driver that builds docs for ocaml.org - has been integrated into <code>odoc</code> itself, meaning that getting dune to produce correct docs for non-dune packages (e.g. the standard library!) should once again be simpler.</p> 1623 + <p>After we've done these, there are plans afoot to make more improvements to the manual writing experience. <a href="https://choum.net/">@panglesd</a> has been investigating how to add admonitions to the language, I've been thinking about custom tag support, and we're looking at the <a href="https://discuss.ocaml.org/t/ann-oxidizing-ocaml-an-update/15237">modes</a> work coming from Jane Street to see how to support that. There's plenty more to do, so if you'd like to lend a hand, reach out and join in!</p>]]></content> 1624 + </entry> 1625 + <entry> 1626 + <id>https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html</id> 1627 + <title>Semantic Versioning in OCaml is Hard</title> 1628 + <published>2025-04-20T00:00:00Z</published> 1629 + <updated>2025-04-20T00:00:00Z</updated> 1630 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html"/> 1631 + <summary>Semantic versioning is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. So, is it possible to make our OCaml libraries stick to this scheme? There are some projec...</summary> 1632 + <content type="html"><![CDATA[<h1 id="semantic-versioning-in-ocaml-is-hard"><a href="#semantic-versioning-in-ocaml-is-hard" class="anchor"></a>Semantic Versioning in OCaml is Hard</h1> 1633 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-20</p></li></ul> 1634 + <p><a href="https://semver.org">Semantic versioning</a> is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. 
So, is it possible to make our OCaml libraries stick to this scheme? There are some projects that are trying to do this, including a recent <a href="https://www.outreachy.org">Outreachy</a> project by <a href="https://github.com/azzsal/">Abdulaziz Alkurd</a> mentored by <a href="https://choum.net">panglesd</a> and <a href="https://github.com/nathanreb">Nathan Reb</a>. While this is a great start, there are some subtleties of the OCaml module system that make it a good deal more complex than in other languages.</p> 1635 + <h2 id="opam-format.2.3.0-≠-opam-format.2.3.0?"><a href="#opam-format.2.3.0-≠-opam-format.2.3.0?" class="anchor"></a>opam-format.2.3.0 ≠ opam-format.2.3.0?</h2> 1636 + <p>Let's take the case that hit me this morning. I've been working on <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a> in order to bring the exciting new <a href="https://ocaml.github.io/odoc">odoc 3</a> features to <a href="https://ocaml.org/">ocaml.org</a> for everyone to enjoy. I have it checked out and building locally, but to deploy it to the infrastructure managed by <a href="https://tunbury.org/">Mark Elvers</a> it needs to be packaged up into a Docker image. 
So I issued the usual <code>docker build .</code> and after it churned through the setup stages and got on to building the project, it hit an error:</p> 1637 + <pre>File &quot;src/solver/solver.ml&quot;, line 58, characters 75-98: 1638 + let deps = List.map (fun pkg -&gt; OpamPackage.Map.find pkg simple_deps) (OpamPackage.Set.to_list pkgs) in 1334 1639 Error: Unbound value OpamPackage.Set.to_list 1335 - Hint: Did you mean of_list?&lt;/pre&gt; 1336 - &lt;p&gt;Now &lt;code&gt;OpamPackage&lt;/code&gt; is a module in the &lt;code&gt;opam-format&lt;/code&gt; library, which is easily discovered using the excellent &lt;a href=&quot;https://doc.sherlocode.com/?q=OpamPackage&quot;&gt;Sherlodoc&lt;/a&gt; tool, so I checked what version I had locally, and what version I had in the Docker container, and it turned out I was using exactly the same version -- 2.3.0 -- both locally and in the container. So what's going on?&lt;/p&gt; 1337 - &lt;p&gt;The problem is that the Dockerfile I was using was using OCaml version 4.14, whereas locally I was using OCaml 5.3. &amp;quot;But how on earth can this cause the API of &lt;code&gt;opam-format&lt;/code&gt; to change?&amp;quot; I hear you wail! Well, this is actually one of the simpler outcomes of the way the OCaml module system works. Let's look at &lt;a href=&quot;https://github.com/ocaml/opam/blob/2.3.0/src/format/opamPackage.mli&quot;&gt;the code&lt;/a&gt;.&lt;/p&gt; 1338 - &lt;p&gt;The first thing we note is the absence of any definition of &lt;code&gt;Set&lt;/code&gt; or &lt;code&gt;Map&lt;/code&gt; here&lt;/p&gt; 1339 - &lt;ul&gt;&lt;li&gt;where do they come from? 
It turns out they come from &lt;a href=&quot;https://github.com/ocaml/opam/blob/2.3.0/src/format/opamPackage.mli#L49&quot;&gt;this line here&lt;/a&gt;:&lt;/li&gt;&lt;/ul&gt; 1340 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;include OpamStd.ABSTRACT with type t := t&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1341 - &lt;p&gt;So let's take a look over in &lt;code&gt;opamStd.mli&lt;/code&gt; to see what that signature looks like:&lt;/p&gt; 1342 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;(** A signature for handling abstract keys and collections thereof *) 1640 + Hint: Did you mean of_list?</pre> 1641 + <p>Now <code>OpamPackage</code> is a module in the <code>opam-format</code> library, which is easily discovered using the excellent <a href="https://doc.sherlocode.com/?q=OpamPackage">Sherlodoc</a> tool, so I checked what version I had locally, and what version I had in the Docker container, and it turned out I was using exactly the same version -- 2.3.0 -- both locally and in the container. So what's going on?</p> 1642 + <p>The problem is that the Dockerfile I was using was using OCaml version 4.14, whereas locally I was using OCaml 5.3. &quot;But how on earth can this cause the API of <code>opam-format</code> to change?&quot; I hear you wail! Well, this is actually one of the simpler outcomes of the way the OCaml module system works. Let's look at <a href="https://github.com/ocaml/opam/blob/2.3.0/src/format/opamPackage.mli">the code</a>.</p> 1643 + <p>The first thing we note is the absence of any definition of <code>Set</code> or <code>Map</code> here</p> 1644 + <ul><li>where do they come from? 
It turns out they come from <a href="https://github.com/ocaml/opam/blob/2.3.0/src/format/opamPackage.mli#L49">this line here</a>:</li></ul> 1645 + <div><pre class="language-ocaml"><code>include OpamStd.ABSTRACT with type t := t</code></pre></div> 1646 + <p>So let's take a look over in <code>opamStd.mli</code> to see what that signature looks like:</p> 1647 + <div><pre class="language-ocaml"><code>(** A signature for handling abstract keys and collections thereof *) 1343 1648 module type ABSTRACT = sig 1344 1649 1345 1650 type t 1346 1651 1347 - val compare: t -&amp;gt; t -&amp;gt; int 1348 - val equal: t -&amp;gt; t -&amp;gt; bool 1349 - val of_string: string -&amp;gt; t 1350 - val to_string: t -&amp;gt; string 1652 + val compare: t -&gt; t -&gt; int 1653 + val equal: t -&gt; t -&gt; bool 1654 + val of_string: string -&gt; t 1655 + val to_string: t -&gt; string 1351 1656 val to_json: t OpamJson.encoder 1352 1657 val of_json: t OpamJson.decoder 1353 1658 1354 1659 module Set: SET with type elt = t 1355 1660 module Map: MAP with type key = t 1356 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1357 - &lt;p&gt;OK, so we've found the definitions of &lt;code&gt;Set&lt;/code&gt; and &lt;code&gt;Map&lt;/code&gt; - they refer to signatures &lt;code&gt;SET&lt;/code&gt; and &lt;code&gt;MAP&lt;/code&gt; which are defined just above in &lt;a href=&quot;https://github.com/ocaml/opam/blob/2.3.0/src/core/opamStd.mli#L17-L98&quot;&gt;opamStd.mli&lt;/a&gt;. Let's just look at &lt;code&gt;Set&lt;/code&gt; since that's where the problem was:&lt;/p&gt; 1358 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type SET = sig 1661 + end</code></pre></div> 1662 + <p>OK, so we've found the definitions of <code>Set</code> and <code>Map</code> - they refer to signatures <code>SET</code> and <code>MAP</code> which are defined just above in <a href="https://github.com/ocaml/opam/blob/2.3.0/src/core/opamStd.mli#L17-L98">opamStd.mli</a>. 
Let's just look at <code>Set</code> since that's where the problem was:</p> 1663 + <div><pre class="language-ocaml"><code>module type SET = sig 1359 1664 1360 1665 include Set.S 1361 1666 1362 - val map: (elt -&amp;gt; elt) -&amp;gt; t -&amp;gt; t 1667 + val map: (elt -&gt; elt) -&gt; t -&gt; t 1363 1668 1364 - val is_singleton: t -&amp;gt; bool 1669 + val is_singleton: t -&gt; bool 1365 1670 1366 1671 (** Returns one element, assuming the set is a singleton. Raises [Not_found] 1367 1672 on an empty set, [Failure] on a non-singleton. *) 1368 - val choose_one : t -&amp;gt; elt 1673 + val choose_one : t -&gt; elt 1369 1674 1370 - val choose_opt: t -&amp;gt; elt option 1675 + val choose_opt: t -&gt; elt option 1371 1676 1372 - val of_list: elt list -&amp;gt; t 1373 - val to_list_map: (elt -&amp;gt; 'b) -&amp;gt; t -&amp;gt; 'b list 1374 - val to_string: t -&amp;gt; string 1677 + val of_list: elt list -&gt; t 1678 + val to_list_map: (elt -&gt; 'b) -&gt; t -&gt; 'b list 1679 + val to_string: t -&gt; string 1375 1680 val to_json: t OpamJson.encoder 1376 1681 val of_json: t OpamJson.decoder 1377 - val find: (elt -&amp;gt; bool) -&amp;gt; t -&amp;gt; elt 1378 - val find_opt: (elt -&amp;gt; bool) -&amp;gt; t -&amp;gt; elt option 1682 + val find: (elt -&gt; bool) -&gt; t -&gt; elt 1683 + val find_opt: (elt -&gt; bool) -&gt; t -&gt; elt option 1379 1684 1380 1685 (** Raises Failure in case the element is already present *) 1381 - val safe_add: elt -&amp;gt; t -&amp;gt; t 1686 + val safe_add: elt -&gt; t -&gt; t 1382 1687 1383 1688 (** Accumulates the resulting sets of a function of elements until a fixpoint 1384 1689 is reached *) 1385 - val fixpoint: (elt -&amp;gt; t) -&amp;gt; t -&amp;gt; t 1690 + val fixpoint: (elt -&gt; t) -&gt; t -&gt; t 1386 1691 1387 1692 (** [map_reduce f op t] applies [f] to every element of [t] and combines the 1388 1693 results using associative operator [op]. 
Raises [Invalid_argument] on an 1389 1694 empty set, or returns [default] if it is defined. *) 1390 - val map_reduce: ?default:'a -&amp;gt; (elt -&amp;gt; 'a) -&amp;gt; ('a -&amp;gt; 'a -&amp;gt; 'a) -&amp;gt; t -&amp;gt; 'a 1695 + val map_reduce: ?default:'a -&gt; (elt -&gt; 'a) -&gt; ('a -&gt; 'a -&gt; 'a) -&gt; t -&gt; 'a 1391 1696 1392 1697 module Op : sig 1393 - val (++): t -&amp;gt; t -&amp;gt; t (** Infix set union *) 1698 + val (++): t -&gt; t -&gt; t (** Infix set union *) 1394 1699 1395 - val (--): t -&amp;gt; t -&amp;gt; t (** Infix set difference *) 1700 + val (--): t -&gt; t -&gt; t (** Infix set difference *) 1396 1701 1397 - val (%%): t -&amp;gt; t -&amp;gt; t (** Infix set intersection *) 1702 + val (%%): t -&gt; t -&gt; t (** Infix set intersection *) 1398 1703 end 1399 1704 1400 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1401 - &lt;p&gt;Sure enough, there's no &lt;code&gt;to_list&lt;/code&gt; function defined in there. Once again though, there's an &lt;code&gt;include Set.S&lt;/code&gt; in there. It turns out that that refers to the &lt;code&gt;Set&lt;/code&gt; module in the OCaml standard library. We can again &lt;a href=&quot;https://github.com/ocaml/ocaml/blob/5.3.0/stdlib/set.mli&quot;&gt;look at the source&lt;/a&gt;:&lt;/p&gt; 1402 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;val to_list : t -&amp;gt; elt list 1705 + end</code></pre></div> 1706 + <p>Sure enough, there's no <code>to_list</code> function defined in there. Once again though, there's an <code>include Set.S</code> in there. It turns out that that refers to the <code>Set</code> module in the OCaml standard library. We can again <a href="https://github.com/ocaml/ocaml/blob/5.3.0/stdlib/set.mli">look at the source</a>:</p> 1707 + <div><pre class="language-ocaml"><code>val to_list : t -&gt; elt list 1403 1708 (** [to_list s] is {!elements}[ s]. 1404 - @since 5.1 *)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1405 - &lt;p&gt;And there it is. 
The &lt;code&gt;to_list&lt;/code&gt; function has only been in the &lt;code&gt;Set&lt;/code&gt; module since version 5.1.&lt;/p&gt; 1406 - &lt;h2 id=&quot;using-ocaml.org-docs&quot;&gt;&lt;a href=&quot;#using-ocaml.org-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Using ocaml.org docs&lt;/h2&gt; 1407 - &lt;p&gt;It was pretty difficult to figure that out from the source, but happily there's a better way. We can browse the docs on https://ocaml.org/ - We can look at the docs for the &lt;a href=&quot;https://ocaml.org/p/opam-format/2.3.0/doc/OpamPackage/Set/index.html&quot;&gt;OpamPackage.Set module&lt;/a&gt; which, as of today, does not contain any &lt;code&gt;to_list&lt;/code&gt; function. The &lt;code&gt;include Set.S&lt;/code&gt; is there with the expansion showing all of the types and values coming from it, so we can click on the &lt;code&gt;Set.S&lt;/code&gt; link on the include line which takes us to the documentation for the stdlib from OCaml 4.11.2. Changing the version from the dropdown at the top to something more recent takes us to a page containing the &lt;code&gt;to_list&lt;/code&gt; function with the helpful &lt;code&gt;since 5.1&lt;/code&gt; annotation.&lt;/p&gt; 1408 - &lt;p&gt;This is, in fact, a relatively simple example of the sorts of issues that can occur that make semantic versioning a headache. In this example, it was a change due to a difference in the compiler version used, but there's nothing particularly special about that - a package may expose signatures derived from any of its dependencies! So is there anything we can do about this? Obviously, yes!&lt;/p&gt; 1409 - &lt;h2 id=&quot;towards-a-solution&quot;&gt;&lt;a href=&quot;#towards-a-solution&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Towards a solution&lt;/h2&gt; 1410 - &lt;p&gt;Step 1 of any approach to solving this is to be able to identify which bits of a libraries API come from which packages, and therefore which versions of those packages. 
It turns out there may well be a nice way to piggy-back on a recent feature from Odoc, which was originally intended to help with suppressing spurious warnings.&lt;/p&gt; 1411 - &lt;p&gt;The problem we were tackling was that if your library ends up exporting a module whose signature is defined in someone else's package, then any warnings that come from it are unfixable. To fix this we added a tag to each signature of a module that indicates which package it originally came from. Odoc is then very careful to keep track of this as it performs its signature manipulations, resulting in an accurate way to know which signature elements came from which package. This fixed the problem of the spurious warnings nicely.&lt;/p&gt; 1412 - &lt;p&gt;Quite separately, we've got the docs CI that is attempting to build docs for every version of every package. Obviously given the above, in order to exhaustively show all the possible APIs of every library, we should build all possible combinations of every version of every package. Clearly we can't possibly do this, so the docs CI focuses on the goal of building at least one solution for every version of every package.&lt;/p&gt; 1413 - &lt;p&gt;Now if you combine these two ideas, we can use the builds of the packages with the tracking of the package of the originating signatures to be able to precisely track the differences in API between different versions of a package. 
This would allow us to build a database of those changes, and with this in hand we can look at what APIs are used in any other package and be able to suggest upper and lower bounds on the versions of its dependencies.&lt;/p&gt; 1414 - &lt;p&gt;Now wouldn't that be cool?&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html</id><title type="text">Semantic Versioning in OCaml is Hard</title><updated>2025-04-20T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">It's tremendously exciting to be back in the , as the last time I worked here was just before the pandemic. I'm now a member of the whose goal is &quot;to have a measurable impact on tools and techniques ...</summary><published>2025-04-08T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/meeting-the-team.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;meeting-the-team&quot;&gt;&lt;a href=&quot;#meeting-the-team&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Meeting the Team&lt;/h1&gt; 1415 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-08&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1416 - &lt;p&gt;It's tremendously exciting to be back in the &lt;a href=&quot;https://www.cst.cam.ac.uk/&quot;&gt;Computer Laboratory&lt;/a&gt;, as the last time I worked here was just before the pandemic. 
I'm now a member of the &lt;a href=&quot;https://www.cst.cam.ac.uk/research/eeg&quot;&gt;Energy and Environment Group&lt;/a&gt; whose goal is &amp;quot;to have a measurable impact on tools and techniques for de-risking the future&amp;quot;.&lt;/p&gt; 1417 - &lt;h2 id=&quot;what's-going-on?&quot;&gt;&lt;a href=&quot;#what's-going-on?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What's going on?&lt;/h2&gt; 1418 - &lt;p&gt;With such a broad goal, it's hard to know where to start and how I'll fit in, so my first few weeks have been spent getting to know the other members of the group and what they're up to. It's an incredibly inspiring group of individuals who are all doing amazing work, and it's really humbling and daunting to be a part of it.&lt;/p&gt; 1419 - &lt;p&gt;There's some really interesting work going on in our group on LLMs, principally led by the fantastic &lt;a href=&quot;https://toao.com/&quot;&gt;Sadiq Jaffer&lt;/a&gt;. We had a chat a few weeks ago and have started to explore some ideas around seeing how well LLMs can program in OCaml already before we start to do some RL training on them. Having not done any LLM stuff before, it's a steep learning curve for me, but we're already seeing some interesting results. We should have some more to say about this in the coming weeks.&lt;/p&gt; 1420 - &lt;p&gt;Last week I met with &lt;a href=&quot;https://digitalflapjack.com/&quot;&gt;Michael Dales&lt;/a&gt;, and he talked about the project &lt;a href=&quot;https://github.com/quantifyearth/shark&quot;&gt;shark&lt;/a&gt; that he and &lt;a href=&quot;patrick.sirref.org&quot;&gt;Patrick Ferris&lt;/a&gt; have been working on. It's kind of a mix between a shell and a jupyter-style notebook, with a strong focus on reproducibility. The traditional pain of notebooks is, of course, the execution model, whereby cells might be executed in any order you like. 
This means that the state you find the notebook in might not be even reachable again, let alone consistently reproducible. Shark is trying to address this by using file-system snapshots and clever analysis of the inputs and outputs of each cell to both ensure reproducibility, but also to allow a fast editing cycle, rerunning of only the bits that need to be rerun, even in the presence of slow data processing steps. It's a fascinating project, and I can't wait to see it in action when Michael gives us a demo!&lt;/p&gt; 1421 - &lt;p&gt;I also met up with &lt;a href=&quot;https://ryan.freumh.org&quot;&gt;Ryan Gibb&lt;/a&gt; with &lt;a href=&quot;https://www.dra27.uk/blog/&quot;&gt;David Allsopp&lt;/a&gt; and we had a good chat about his project &lt;a href=&quot;https://github.com/RyanGibb/babel&quot;&gt;Babel&lt;/a&gt;, which is using the &lt;a href=&quot;https://nex3.medium.com/pubgrub-2fb6470504f&quot;&gt;PubGrub&lt;/a&gt; algorithm to do package resolution for multiple package domains at once. 
We've got a number of avenues to explore here, from building a PubGrub implementation in OxCaml, to using Babel to construct Docker images for opam packages entirely from scratch, without using a base image.&lt;/p&gt; 1422 - &lt;p&gt;With my other hat on as a member of the CTO office at &lt;a href=&quot;https://tarides.com/&quot;&gt;Tarides&lt;/a&gt;, I'm very much looking forward to using OCaml and OxCaml to solve some real-world problems that are in an entirely different domain than I've been used to over the last few years.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/meeting-the-team.html</id><title type="text">Meeting the Team</title><updated>2025-04-08T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I've spent a of time over the past few years working on Odoc, the OCaml documentation generator, so when it came time to (re)start my own website and blog, I found it hard to resist thinking about ho...</summary><published>2025-04-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/this-site.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;this-site&quot;&gt;&lt;a href=&quot;#this-site&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;This site&lt;/h1&gt; 1423 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;mime_printer&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1424 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1425 - &lt;p&gt;I've spent a &lt;em&gt;lot&lt;/em&gt; of time over the past few years working on Odoc, the OCaml documentation generator, so when it came time to (re)start my own website and blog, I found it hard to resist thinking about how I might use odoc as part of it. 
We've spent a lot of time recently trying to make odoc more able to generate structured documentation sites, so I've gone all in and am trialling using it as a tool to generate my entire site. This is a bit of an experiment, and I don't know how well it will work out, but let's see how it goes.&lt;/p&gt; 1426 - &lt;p&gt;Additionally, I've recently been working on a project currently called &lt;code&gt;odoc_notebook&lt;/code&gt;, which is a set of tools to allow odoc &lt;code&gt;mld&lt;/code&gt; files to be used as a sort of Jupyter-style notebook. The idea is that you can write both text and code in the same file, and then run the code in the notebook interactively. Since I've only got a webserver, all the execution of code has to be done client side, so I'm making extensive use of the phenomenal &lt;a href=&quot;https://github.com/ocsigen/js_of_ocaml&quot;&gt;Js_of_ocaml&lt;/a&gt; project to get an OCaml engine running in the browser.&lt;/p&gt; 1427 - &lt;p&gt;My focus has initially been on getting 'toplevel-style' code execution working. As an example, let's write a little demo.&lt;/p&gt; 1428 - &lt;h2 id=&quot;demo&quot;&gt;&lt;a href=&quot;#demo&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Demo&lt;/h2&gt; 1429 - &lt;p&gt;Let's start with a little demo:&lt;/p&gt; 1430 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;let x = 1 + 2&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1431 - &lt;p&gt;It's intended to look like an OCaml toplevel session, so each new expression starts with a &lt;code&gt;#&lt;/code&gt; and is terminated with a double semicolon. The response from the toplevel is then below that indented with 2 spaces. Right now, there's not much in the way of error checking so you can make it all very confused by deleting the hash, removing the &lt;code&gt;;;&lt;/code&gt; and so on. Avoiding this, however, you can edit the numbers here and hit 'run' (maybe twice!) 
to see the results being updated.&lt;/p&gt; 1432 - &lt;p&gt;There is also a little integration to allow the code to produce output more interesting than just text. The following cell creates an SVG image and 'pushes' it to &lt;code&gt;Mime_printer&lt;/code&gt;, which receives the mime value and renders it in the browser below the code block.&lt;/p&gt; 1433 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;let svg = [ 1434 - {|&amp;lt;svg height=&amp;quot;210&amp;quot; width=&amp;quot;500&amp;quot; xmlns=&amp;quot;http://www.w3.org/2000/svg&amp;quot;&amp;gt;|}; 1435 - {|&amp;lt;polygon points=&amp;quot;100,10 40,198 190,78 10,78 160,198&amp;quot; |}; 1436 - {|style=&amp;quot;fill:lime;stroke:purple;stroke-width:5;&amp;quot;/&amp;gt;&amp;lt;/svg&amp;gt;|}];; 1709 + @since 5.1 *)</code></pre></div> 1710 + <p>And there it is. The <code>to_list</code> function has only been in the <code>Set</code> module since version 5.1.</p> 1711 + <h2 id="using-ocaml.org-docs"><a href="#using-ocaml.org-docs" class="anchor"></a>Using ocaml.org docs</h2> 1712 + <p>It was pretty difficult to figure that out from the source, but happily there's a better way. We can browse the docs on https://ocaml.org/ - We can look at the docs for the <a href="https://ocaml.org/p/opam-format/2.3.0/doc/OpamPackage/Set/index.html">OpamPackage.Set module</a> which, as of today, does not contain any <code>to_list</code> function. The <code>include Set.S</code> is there with the expansion showing all of the types and values coming from it, so we can click on the <code>Set.S</code> link on the include line which takes us to the documentation for the stdlib from OCaml 4.11.2. 
Changing the version from the dropdown at the top to something more recent takes us to a page containing the <code>to_list</code> function with the helpful <code>since 5.1</code> annotation.</p> 1713 + <p>This is, in fact, a relatively simple example of the sorts of issues that can occur that make semantic versioning a headache. In this example, it was a change due to a difference in the compiler version used, but there's nothing particularly special about that - a package may expose signatures derived from any of its dependencies! So is there anything we can do about this? Obviously, yes!</p> 1714 + <h2 id="towards-a-solution"><a href="#towards-a-solution" class="anchor"></a>Towards a solution</h2> 1715 + <p>Step 1 of any approach to solving this is to be able to identify which bits of a library's API come from which packages, and therefore which versions of those packages. It turns out there may well be a nice way to piggy-back on a recent feature from Odoc, which was originally intended to help with suppressing spurious warnings.</p> 1716 + <p>The problem we were tackling was that if your library ends up exporting a module whose signature is defined in someone else's package, then any warnings that come from it are unfixable. To fix this we added a tag to each signature of a module that indicates which package it originally came from. Odoc is then very careful to keep track of this as it performs its signature manipulations, resulting in an accurate way to know which signature elements came from which package. This fixed the problem of the spurious warnings nicely.</p> 1717 + <p>Quite separately, we've got the docs CI that is attempting to build docs for every version of every package. Obviously given the above, in order to exhaustively show all the possible APIs of every library, we should build all possible combinations of every version of every package. 
Clearly we can't possibly do this, so the docs CI focuses on the goal of building at least one solution for every version of every package.</p> 1718 + <p>Now if you combine these two ideas, we can use the builds of the packages with the tracking of the package of the originating signatures to be able to precisely track the differences in API between different versions of a package. This would allow us to build a database of those changes, and with this in hand we can look at what APIs are used in any other package and be able to suggest upper and lower bounds on the versions of its dependencies.</p> 1719 + <p>Now wouldn't that be cool?</p>]]></content> 1720 + </entry> 1721 + <entry> 1722 + <id>https://jon.recoil.org/blog/2025/04/meeting-the-team.html</id> 1723 + <title>Meeting the Team</title> 1724 + <published>2025-04-08T00:00:00Z</published> 1725 + <updated>2025-04-08T00:00:00Z</updated> 1726 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/meeting-the-team.html"/> 1727 + <summary>It's tremendously exciting to be back in the , as the last time I worked here was just before the pandemic. I'm now a member of the whose goal is &quot;to have a measurable impact on tools and techniques ...</summary> 1728 + <content type="html"><![CDATA[<h1 id="meeting-the-team"><a href="#meeting-the-team" class="anchor"></a>Meeting the Team</h1> 1729 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-08</p></li></ul> 1730 + <p>It's tremendously exciting to be back in the <a href="https://www.cst.cam.ac.uk/">Computer Laboratory</a>, as the last time I worked here was just before the pandemic. I'm now a member of the <a href="https://www.cst.cam.ac.uk/research/eeg">Energy and Environment Group</a> whose goal is &quot;to have a measurable impact on tools and techniques for de-risking the future&quot;.</p> 1731 + <h2 id="what's-going-on?"><a href="#what's-going-on?" 
class="anchor"></a>What's going on?</h2> 1732 + <p>With such a broad goal, it's hard to know where to start and how I'll fit in, so my first few weeks have been spent getting to know the other members of the group and what they're up to. It's an incredibly inspiring group of individuals who are all doing amazing work, and it's really humbling and daunting to be a part of it.</p> 1733 + <p>There's some really interesting work going on in our group on LLMs, principally led by the fantastic <a href="https://toao.com/">Sadiq Jaffer</a>. We had a chat a few weeks ago and have started to explore some ideas around seeing how well LLMs can program in OCaml already before we start to do some RL training on them. Having not done any LLM stuff before, it's a steep learning curve for me, but we're already seeing some interesting results. We should have some more to say about this in the coming weeks.</p> 1734 + <p>Last week I met with <a href="https://digitalflapjack.com/">Michael Dales</a>, and he talked about the project <a href="https://github.com/quantifyearth/shark">shark</a> that he and <a href="patrick.sirref.org">Patrick Ferris</a> have been working on. It's kind of a mix between a shell and a jupyter-style notebook, with a strong focus on reproducibility. The traditional pain of notebooks is, of course, the execution model, whereby cells might be executed in any order you like. This means that the state you find the notebook in might not be even reachable again, let alone consistently reproducible. Shark is trying to address this by using file-system snapshots and clever analysis of the inputs and outputs of each cell to both ensure reproducibility, but also to allow a fast editing cycle, rerunning of only the bits that need to be rerun, even in the presence of slow data processing steps. 
It's a fascinating project, and I can't wait to see it in action when Michael gives us a demo!</p> 1735 + <p>I also met up with <a href="https://ryan.freumh.org">Ryan Gibb</a> with <a href="https://www.dra27.uk/blog/">David Allsopp</a> and we had a good chat about his project <a href="https://github.com/RyanGibb/babel">Babel</a>, which is using the <a href="https://nex3.medium.com/pubgrub-2fb6470504f">PubGrub</a> algorithm to do package resolution for multiple package domains at once. We've got a number of avenues to explore here, from building a PubGrub implementation in OxCaml, to using Babel to construct Docker images for opam packages entirely from scratch, without using a base image.</p> 1736 + <p>With my other hat on as a member of the CTO office at <a href="https://tarides.com/">Tarides</a>, I'm very much looking forward to using OCaml and OxCaml to solve some real-world problems that are in an entirely different domain than I've been used to over the last few years.</p>]]></content> 1737 + </entry> 1738 + <entry> 1739 + <id>https://jon.recoil.org/blog/2025/04/this-site.html</id> 1740 + <title>This site</title> 1741 + <published>2025-04-07T00:00:00Z</published> 1742 + <updated>2025-04-07T00:00:00Z</updated> 1743 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/this-site.html"/> 1744 + <summary>I've spent a of time over the past few years working on Odoc, the OCaml documentation generator, so when it came time to (re)start my own website and blog, I found it hard to resist thinking about ho...</summary> 1745 + <content type="html"><![CDATA[<h1 id="this-site"><a href="#this-site" class="anchor"></a>This site</h1> 1746 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>mime_printer</p></li></ul> 1747 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-07</p></li></ul> 1748 + <p>I've spent a <em>lot</em> of time over the past few years working on Odoc, the 
OCaml documentation generator, so when it came time to (re)start my own website and blog, I found it hard to resist thinking about how I might use odoc as part of it. We've spent a lot of time recently trying to make odoc more able to generate structured documentation sites, so I've gone all in and am trialling using it as a tool to generate my entire site. This is a bit of an experiment, and I don't know how well it will work out, but let's see how it goes.</p> 1749 + <p>Additionally, I've recently been working on a project currently called <code>odoc_notebook</code>, which is a set of tools to allow odoc <code>mld</code> files to be used as a sort of Jupyter-style notebook. The idea is that you can write both text and code in the same file, and then run the code in the notebook interactively. Since I've only got a webserver, all the execution of code has to be done client side, so I'm making extensive use of the phenomenal <a href="https://github.com/ocsigen/js_of_ocaml">Js_of_ocaml</a> project to get an OCaml engine running in the browser.</p> 1750 + <p>My focus has initially been on getting 'toplevel-style' code execution working. As an example, let's write a little demo.</p> 1751 + <h2 id="demo"><a href="#demo" class="anchor"></a>Demo</h2> 1752 + <p>Let's start with a little demo:</p> 1753 + <div><pre class="language-ocaml"><code>let x = 1 + 2</code></pre></div> 1754 + <p>It's intended to look like an OCaml toplevel session, so each new expression starts with a <code>#</code> and is terminated with a double semicolon. The response from the toplevel is then below that indented with 2 spaces. Right now, there's not much in the way of error checking so you can make it all very confused by deleting the hash, removing the <code>;;</code> and so on. Avoiding this, however, you can edit the numbers here and hit 'run' (maybe twice!) 
to see the results being updated.</p> 1755 + <p>There is also a little integration to allow the code to produce output more interesting than just text. The following cell creates an SVG image and 'pushes' it to <code>Mime_printer</code>, which receives the mime value and renders it in the browser below the code block.</p> 1756 + <div><pre class="language-ocaml"><code>let svg = [ 1757 + {|&lt;svg height=&quot;210&quot; width=&quot;500&quot; xmlns=&quot;http://www.w3.org/2000/svg&quot;&gt;|}; 1758 + {|&lt;polygon points=&quot;100,10 40,198 190,78 10,78 160,198&quot; |}; 1759 + {|style=&quot;fill:lime;stroke:purple;stroke-width:5;&quot;/&gt;&lt;/svg&gt;|}];; 1437 1760 1438 - Mime_printer.push &amp;quot;image/svg&amp;quot; (String.concat &amp;quot;\n&amp;quot; svg)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1439 - &lt;h2 id=&quot;things-to-come&quot;&gt;&lt;a href=&quot;#things-to-come&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Things to come&lt;/h2&gt; 1440 - &lt;h3 id=&quot;merlin-support&quot;&gt;&lt;a href=&quot;#merlin-support&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Merlin support&lt;/h3&gt; 1441 - &lt;p&gt;There are a bunch of things I want to add to this, for example, Merlin support. In fact, &lt;a href=&quot;https://github.com/voodoos/merlin-js&quot;&gt;merlin-js&lt;/a&gt; already exists and works, thanks to the fantastic work of &lt;a href=&quot;https://github.com/voodoos&quot;&gt;Ulysse&lt;/a&gt;, but the problem is that it's not really designed for toplevel work, and it doesn't work when the code is broken up into chunks like I do here. So either I need to concatenate all the cells together before I give it to Merlin, or I need to make each cell it's own little module and 'open' every previous cell's module.&lt;/p&gt; 1442 - &lt;p&gt;Within a single cell, it does already work. You can see that Merlin is correctly underlining the error in the following cell. 
You should also be able to hover over the variables and see their types.&lt;/p&gt; 1443 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type t = { foo : int; bar : string };; 1761 + Mime_printer.push &quot;image/svg&quot; (String.concat &quot;\n&quot; svg)</code></pre></div> 1762 + <h2 id="things-to-come"><a href="#things-to-come" class="anchor"></a>Things to come</h2> 1763 + <h3 id="merlin-support"><a href="#merlin-support" class="anchor"></a>Merlin support</h3> 1764 + <p>There are a bunch of things I want to add to this, for example, Merlin support. In fact, <a href="https://github.com/voodoos/merlin-js">merlin-js</a> already exists and works, thanks to the fantastic work of <a href="https://github.com/voodoos">Ulysse</a>, but the problem is that it's not really designed for toplevel work, and it doesn't work when the code is broken up into chunks like I do here. So either I need to concatenate all the cells together before I give it to Merlin, or I need to make each cell its own little module and 'open' every previous cell's module.</p> 1765 + <p>Within a single cell, it does already work. You can see that Merlin is correctly underlining the error in the following cell. You should also be able to hover over the variables and see their types.</p> 1766 + <div><pre class="language-ocaml"><code>type t = { foo : int; bar : string };; 1444 1767 1445 - let x = { foo = 1; bar = &amp;quot;hello&amp;quot; };; 1768 + let x = { foo = 1; bar = &quot;hello&quot; };; 1446 1769 1447 - let this_line_has_an_error = { foo = 1; bar = None };;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1448 - &lt;p&gt;But across cells, I've broken Merlin, though the code executes correctly. You can see the problem in the following cell, which re-pushes the SVG image using the variable &lt;code&gt;svg&lt;/code&gt; defined in the cell above. 
Merlin highlights the use of the variable &lt;code&gt;svg&lt;/code&gt;, because it's not aware of the variable, but the code gets executed correctly and the image is rendered below the cell.&lt;/p&gt; 1449 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;Mime_printer.push &amp;quot;image/svg&amp;quot; (String.concat &amp;quot;\n&amp;quot; svg)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1450 - &lt;p&gt;Edit 2025-05-20: I have now got merlin working across cells, though I'm not convinced the current solution is the right long-term solution. S&lt;/p&gt; 1451 - &lt;h3 id=&quot;dynamic-libraries&quot;&gt;&lt;a href=&quot;#dynamic-libraries&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Dynamic libraries&lt;/h3&gt; 1452 - &lt;p&gt;Currently the use of libraries is quite limited - they are defined more-or-less statically. I've had dynamic libraries working in the past, but I need to re-implement them. The plan is to have the &lt;code&gt;cma&lt;/code&gt; files converted to &lt;code&gt;js&lt;/code&gt; files and then load them on-demand when the notebook specifies them. The tricky thing here is that we need to be able to use them both in the browser and in bytecode executables so that the 'test-promote' workflow still works. 
This will probably require specifying the libraries by name, and having to re-implement the work that &lt;a href=&quot;https://projects.camlcity.org/projects/findlib.html&quot;&gt;findlib&lt;/a&gt; does to find the libraries and load them and their dependencies in the right order, though this time entirely over HTTP.&lt;/p&gt; 1453 - &lt;h3 id=&quot;other-things&quot;&gt;&lt;a href=&quot;#other-things&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Other things&lt;/h3&gt; 1454 - &lt;p&gt;There are loads of other things I'm interested in doing, including:&lt;/p&gt; 1455 - &lt;ul&gt;&lt;li&gt;Investigating how to do 'exercises' to allow readers to try things out in a guided way&lt;/li&gt;&lt;li&gt;'Test cells' to see if implementations are correct&lt;/li&gt;&lt;li&gt;Persistence of the notebook state - both using local and remote storage&lt;/li&gt;&lt;li&gt;Integration of docs&lt;/li&gt;&lt;li&gt;Exploration of the execution model - how to run the code in the right order and ensure reproducibility&lt;/li&gt;&lt;li&gt;Use of remote execution engines rather than just in the browser&lt;/li&gt;&lt;li&gt;Other languages?&lt;/li&gt;&lt;/ul&gt; 1456 - &lt;p&gt;Right now though, my focus is on the functionality required for this blog, with a secondary goal of looking at how we might use this sort of technology on the docs site on ocaml.org. Wouldn't it be cool to be able to drop into a live OCaml toplevel for any library in opam?&lt;/p&gt; 1457 - &lt;h2 id=&quot;example-notebooks&quot;&gt;&lt;a href=&quot;#example-notebooks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Example notebooks&lt;/h2&gt; 1458 - &lt;p&gt;As a more extended example of odoc notebooks, I have converted to this format the course that I help teach at the University of Cambridge; &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/&quot;&gt;Foundations of Computer Science&lt;/a&gt;. 
&lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/notebooks/foundations/index&quot;&gt;Try them out for yourself!&lt;/span&gt;.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/this-site.html</id><title type="text">This site</title><updated>2025-04-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">There are that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an that landed in May 2024.</summary><published>2025-03-08T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/03/module-type-of.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;the-road-to-odoc-3:-module-type-of&quot;&gt;&lt;a href=&quot;#the-road-to-odoc-3:-module-type-of&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The Road to Odoc 3: Module Type Of&lt;/h1&gt; 1459 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-03-08&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1460 - &lt;p&gt;There are &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043&quot;&gt;many new and improved features&lt;/a&gt; that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1081&quot;&gt;overhaul of &amp;quot;module type of&amp;quot;&lt;/a&gt; that landed in May 2024.&lt;/p&gt; 1461 - &lt;h2 id=&quot;module-type-of&quot;&gt;&lt;a href=&quot;#module-type-of&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Module Type Of&lt;/h2&gt; 1462 - &lt;p&gt;module type of is a language feature of OCaml allowing one to recover the signature of an existing module. 
For example, if I had a module &lt;code&gt;X&lt;/code&gt;:&lt;/p&gt; 1463 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module X = struct 1770 + let this_line_has_an_error = { foo = 1; bar = None };;</code></pre></div> 1771 + <p>But across cells, I've broken Merlin, though the code executes correctly. You can see the problem in the following cell, which re-pushes the SVG image using the variable <code>svg</code> defined in the cell above. Merlin highlights the use of the variable <code>svg</code>, because it's not aware of the variable, but the code gets executed correctly and the image is rendered below the cell.</p> 1772 + <div><pre class="language-ocaml"><code>Mime_printer.push &quot;image/svg&quot; (String.concat &quot;\n&quot; svg)</code></pre></div> 1773 + <p>Edit 2025-05-20: I have now got Merlin working across cells, though I'm not convinced the current solution is the right long-term solution.</p> 1774 + <h3 id="dynamic-libraries"><a href="#dynamic-libraries" class="anchor"></a>Dynamic libraries</h3> 1775 + <p>Currently the use of libraries is quite limited - they are defined more-or-less statically. I've had dynamic libraries working in the past, but I need to re-implement them. The plan is to have the <code>cma</code> files converted to <code>js</code> files and then load them on-demand when the notebook specifies them. The tricky thing here is that we need to be able to use them both in the browser and in bytecode executables so that the 'test-promote' workflow still works. 
This will probably require specifying the libraries by name, and having to re-implement the work that <a href="https://projects.camlcity.org/projects/findlib.html">findlib</a> does to find the libraries and load them and their dependencies in the right order, though this time entirely over HTTP.</p> 1776 + <h3 id="other-things"><a href="#other-things" class="anchor"></a>Other things</h3> 1777 + <p>There are loads of other things I'm interested in doing, including:</p> 1778 + <ul><li>Investigating how to do 'exercises' to allow readers to try things out in a guided way</li><li>'Test cells' to see if implementations are correct</li><li>Persistence of the notebook state - both using local and remote storage</li><li>Integration of docs</li><li>Exploration of the execution model - how to run the code in the right order and ensure reproducibility</li><li>Use of remote execution engines rather than just in the browser</li><li>Other languages?</li></ul> 1779 + <p>Right now though, my focus is on the functionality required for this blog, with a secondary goal of looking at how we might use this sort of technology on the docs site on ocaml.org. Wouldn't it be cool to be able to drop into a live OCaml toplevel for any library in opam?</p> 1780 + <h2 id="example-notebooks"><a href="#example-notebooks" class="anchor"></a>Example notebooks</h2> 1781 + <p>As a more extended example of odoc notebooks, I have converted to this format the course that I help teach at the University of Cambridge; <a href="https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/">Foundations of Computer Science</a>. 
<a href="../../../notebooks/foundations/index.html" title="index">Try them out for yourself!</a>.</p>]]></content> 1782 + </entry> 1783 + <entry> 1784 + <id>https://jon.recoil.org/blog/2025/03/module-type-of.html</id> 1785 + <title>The Road to Odoc 3: Module Type Of</title> 1786 + <published>2025-03-08T00:00:00Z</published> 1787 + <updated>2025-03-08T00:00:00Z</updated> 1788 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/03/module-type-of.html"/> 1789 + <summary>There are that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an that landed in May 2024.</summary> 1790 + <content type="html"><![CDATA[<h1 id="the-road-to-odoc-3:-module-type-of"><a href="#the-road-to-odoc-3:-module-type-of" class="anchor"></a>The Road to Odoc 3: Module Type Of</h1> 1791 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-03-08</p></li></ul> 1792 + <p>There are <a href="https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043">many new and improved features</a> that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an <a href="https://github.com/ocaml/odoc/pull/1081">overhaul of &quot;module type of&quot;</a> that landed in May 2024.</p> 1793 + <h2 id="module-type-of"><a href="#module-type-of" class="anchor"></a>Module Type Of</h2> 1794 + <p>module type of is a language feature of OCaml allowing one to recover the signature of an existing module. 
For example, if I had a module <code>X</code>:</p> 1795 + <div><pre class="language-ocaml"><code>module X = struct 1464 1796 type t = Foo | Bar 1465 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1466 - &lt;p&gt;then I can get back the signature of &lt;code&gt;X&lt;/code&gt; using &lt;code&gt;module type of&lt;/code&gt;:&lt;/p&gt; 1467 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type Xsig = module type of X&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1468 - &lt;p&gt;which can be very useful if you’re trying to &lt;a href=&quot;https://discuss.ocaml.org/t/extend-existing-module/1389&quot;&gt;extend existing modules&lt;/a&gt; amongst other things.&lt;/p&gt; 1469 - &lt;p&gt;OCaml and Odoc treat module type of in somewhat different ways. OCaml internally expands the expression immediately it sees it, and effectively replaces it with the signature - ie, in the above example Xsig is now a signature, not a module type of expression.&lt;/p&gt; 1470 - &lt;p&gt;In contrast, Odoc would like to keep track of the fact that this signature came from a &lt;code&gt;module type of&lt;/code&gt; expression, as it’s very useful to know. If you’re extending a module, your signature might look like:&lt;/p&gt; 1471 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type UnitExtended = sig 1797 + end</code></pre></div> 1798 + <p>then I can get back the signature of <code>X</code> using <code>module type of</code>:</p> 1799 + <div><pre class="language-ocaml"><code>module type Xsig = module type of X</code></pre></div> 1800 + <p>which can be very useful if you’re trying to <a href="https://discuss.ocaml.org/t/extend-existing-module/1389">extend existing modules</a> amongst other things.</p> 1801 + <p>OCaml and Odoc treat module type of in somewhat different ways. 
OCaml internally expands the expression immediately it sees it, and effectively replaces it with the signature - ie, in the above example Xsig is now a signature, not a module type of expression.</p> 1802 + <p>In contrast, Odoc would like to keep track of the fact that this signature came from a <code>module type of</code> expression, as it’s very useful to know. If you’re extending a module, your signature might look like:</p> 1803 + <div><pre class="language-ocaml"><code>module type UnitExtended = sig 1472 1804 include module type of Unit 1473 - val extra_unit_function : unit -&amp;gt; unit 1474 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1475 - &lt;p&gt;The documentation we produce will expand the contents of the &lt;code&gt;include&lt;/code&gt; statement, but keep track of the fact that it came from a &lt;code&gt;module type of&lt;/code&gt; expression so the reader can see where these signature items came from. In practice, you'd probably want to use &lt;code&gt;module type of struct include Unit end&lt;/code&gt;, which is a bit different from simply &lt;code&gt;module type of Unit&lt;/code&gt;, and I'll talk about this at some point in a future post.&lt;/p&gt; 1476 - &lt;h2 id=&quot;the-problem&quot;&gt;&lt;a href=&quot;#the-problem&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The problem&lt;/h2&gt; 1477 - &lt;p&gt;We run into difficulties as soon as we introduce another language feature that operates on signatures: with. Let’s start with a module type &lt;code&gt;S&lt;/code&gt;:&lt;/p&gt; 1478 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S = sig 1805 + val extra_unit_function : unit -&gt; unit 1806 + end</code></pre></div> 1807 + <p>The documentation we produce will expand the contents of the <code>include</code> statement, but keep track of the fact that it came from a <code>module type of</code> expression so the reader can see where these signature items came from. 
In practice, you'd probably want to use <code>module type of struct include Unit end</code>, which is a bit different from simply <code>module type of Unit</code>, and I'll talk about this at some point in a future post.</p> 1808 + <h2 id="the-problem"><a href="#the-problem" class="anchor"></a>The problem</h2> 1809 + <p>We run into difficulties as soon as we introduce another language feature that operates on signatures: with. Let’s start with a module type <code>S</code>:</p> 1810 + <div><pre class="language-ocaml"><code>module type S = sig 1479 1811 module X : sig 1480 1812 type t = int 1481 1813 end 1482 1814 1483 1815 module type Y = 1484 1816 module type of X 1485 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1486 - &lt;p&gt;We’ll now define a new module &lt;code&gt;X2&lt;/code&gt; that we intend to use as a replacement for &lt;code&gt;X&lt;/code&gt;:&lt;/p&gt; 1487 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module X2 = struct 1817 + end</code></pre></div> 1818 + <p>We’ll now define a new module <code>X2</code> that we intend to use as a replacement for <code>X</code>:</p> 1819 + <div><pre class="language-ocaml"><code>module X2 = struct 1488 1820 type t = int 1489 1821 type u = float 1490 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1491 - &lt;p&gt;Now we’ll define a new module type &lt;code&gt;T&lt;/code&gt; which is &lt;code&gt;S&lt;/code&gt; but with &lt;code&gt;X&lt;/code&gt; replaced:&lt;/p&gt; 1492 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type T = S with module X := X2&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1493 - &lt;p&gt;Here you can see that OCaml has expanded the &lt;code&gt;module type of&lt;/code&gt; expressions and told us the computed signature. The interesting thing here is that in module type &lt;code&gt;T&lt;/code&gt;, module type &lt;code&gt;Y&lt;/code&gt; only has a type &lt;code&gt;t&lt;/code&gt; in it, not a type &lt;code&gt;u&lt;/code&gt;. 
As above, Odoc wants to keep the &lt;code&gt;module type of&lt;/code&gt; expression so the reader can tell where module type &lt;code&gt;Y&lt;/code&gt; came from. However, the substitution would do a different thing in this case - we would have the following:&lt;/p&gt; 1494 - &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type T = sig 1822 + end</code></pre></div> 1823 + <p>Now we’ll define a new module type <code>T</code> which is <code>S</code> but with <code>X</code> replaced:</p> 1824 + <div><pre class="language-ocaml"><code>module type T = S with module X := X2</code></pre></div> 1825 + <p>Here you can see that OCaml has expanded the <code>module type of</code> expressions and told us the computed signature. The interesting thing here is that in module type <code>T</code>, module type <code>Y</code> only has a type <code>t</code> in it, not a type <code>u</code>. As above, Odoc wants to keep the <code>module type of</code> expression so the reader can tell where module type <code>Y</code> came from. However, the substitution would do a different thing in this case - we would have the following:</p> 1826 + <div><pre class="language-ocaml"><code>module type T = sig 1495 1827 module type Y = module type of X2 1496 - end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1497 - &lt;p&gt;and the expansion of this would then clearly have both types &lt;code&gt;t&lt;/code&gt; and &lt;code&gt;u&lt;/code&gt; in it.&lt;/p&gt; 1498 - &lt;p&gt;So now Odoc has two problems: We need to compute the correct signature, and we need to be able to describe how we computed it.&lt;/p&gt; 1499 - &lt;h2 id=&quot;the-solution&quot;&gt;&lt;a href=&quot;#the-solution&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The solution&lt;/h2&gt; 1500 - &lt;p&gt;The previous solution to this was to have a ‘phase 0’ of odoc which would compute the expansions of all module type of expressions before doing any other work. 
This was necessary because of a ‘simplifying’ assumption in how we handled the typing environment. The new, simpler approach was to calculate the expansion during the normal flow of work, and never to attempt to recalculate it, but simply operate on the signature. This was a nice big simplification and optimisation that removed a few corner cases in the previous code (including an &lt;a href=&quot;https://github.com/ocaml/odoc/blob/v2.4/src/xref2/type_of.ml#L167-L174&quot;&gt;infinite loop&lt;/a&gt; that we &lt;em&gt;hoped&lt;/em&gt; always terminated…!)&lt;/p&gt; 1501 - &lt;p&gt;The second issue was how to describe it. We still want it clear that this signature was derived from another, but it’s clear we can’t honestly say in the above example that it’s &lt;code&gt;module type of X2&lt;/code&gt;. The answer is that we have applied a transparent ascription to the signature. Essentially, the signature is &lt;code&gt;X2&lt;/code&gt; but constrained to only have the fields of &lt;code&gt;X&lt;/code&gt;.&lt;/p&gt; 1502 - &lt;p&gt;This is not a current feature of OCaml, though Jane Street has &lt;a href=&quot;https://blog.janestreet.com/plans-for-ocaml-408/&quot;&gt;done some work&lt;/a&gt; on this, including declaring the syntax: &lt;code&gt;X2 &amp;lt;: X&lt;/code&gt;. However, there’s another interesting wrinkle here. &lt;code&gt;X&lt;/code&gt; is a module defined in the module type &lt;code&gt;S&lt;/code&gt;, so it’s not possible to write a valid OCaml path that points to it – &lt;code&gt;S.X&lt;/code&gt; has no meaning. In addition, the right-hand side of the &lt;code&gt;&amp;lt;:&lt;/code&gt; operator should be a module type, so we’d actually need to write &lt;code&gt;X2 &amp;lt;: module type of S.X&lt;/code&gt; . 
We’re still figuring out the right thing to do here, so for now Odoc 3 will still pretend that it’s simply &lt;code&gt;module type of X2&lt;/code&gt;.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/03/module-type-of.html</id><title type="text">The Road to Odoc 3: Module Type Of</title><updated>2025-03-08T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Back in 2021 introduced some to odoc’s code blocks to allow us to attach arbitrary metadata to the blocks. We imposed no structure on this; it was simply a block of text in between the language ta...</summary><published>2025-03-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/03/code-block-metadata.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;code-block-metadata&quot;&gt;&lt;a href=&quot;#code-block-metadata&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Code block metadata&lt;/h1&gt; 1503 - &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-03-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1504 - &lt;p&gt;Back in 2021 &lt;a href=&quot;https://github.com/julow&quot;&gt;julow&lt;/a&gt; introduced some &lt;a href=&quot;https://github.com/ocaml-doc/odoc-parser/pull/2&quot;&gt;new syntax&lt;/a&gt; to odoc’s code blocks to allow us to attach arbitrary metadata to the blocks. We imposed no structure on this; it was simply a block of text in between the language tag and the start of the code block. 
Now odoc needs to use it itself, we need to be a bit more precise about how it’s defined.&lt;/p&gt; 1505 - &lt;p&gt;The original concept looked like this:&lt;/p&gt; 1506 - &lt;pre&gt;{@ocaml metadata goes here in an unstructured way[ 1828 + end</code></pre></div> 1829 + <p>and the expansion of this would then clearly have both types <code>t</code> and <code>u</code> in it.</p> 1830 + <p>So now Odoc has two problems: We need to compute the correct signature, and we need to be able to describe how we computed it.</p> 1831 + <h2 id="the-solution"><a href="#the-solution" class="anchor"></a>The solution</h2> 1832 + <p>The previous solution to this was to have a ‘phase 0’ of odoc which would compute the expansions of all module type of expressions before doing any other work. This was necessary because of a ‘simplifying’ assumption in how we handled the typing environment. The new, simpler approach was to calculate the expansion during the normal flow of work, and never to attempt to recalculate it, but simply operate on the signature. This was a nice big simplification and optimisation that removed a few corner cases in the previous code (including an <a href="https://github.com/ocaml/odoc/blob/v2.4/src/xref2/type_of.ml#L167-L174">infinite loop</a> that we <em>hoped</em> always terminated…!)</p> 1833 + <p>The second issue was how to describe it. We still want it clear that this signature was derived from another, but it’s clear we can’t honestly say in the above example that it’s <code>module type of X2</code>. The answer is that we have applied a transparent ascription to the signature. Essentially, the signature is <code>X2</code> but constrained to only have the fields of <code>X</code>.</p> 1834 + <p>This is not a current feature of OCaml, though Jane Street has <a href="https://blog.janestreet.com/plans-for-ocaml-408/">done some work</a> on this, including declaring the syntax: <code>X2 &lt;: X</code>. However, there’s another interesting wrinkle here. 
<code>X</code> is a module defined in the module type <code>S</code>, so it’s not possible to write a valid OCaml path that points to it – <code>S.X</code> has no meaning. In addition, the right-hand side of the <code>&lt;:</code> operator should be a module type, so we’d actually need to write <code>X2 &lt;: module type of S.X</code> . We’re still figuring out the right thing to do here, so for now Odoc 3 will still pretend that it’s simply <code>module type of X2</code>.</p>]]></content> 1835 + </entry> 1836 + <entry> 1837 + <id>https://jon.recoil.org/blog/2025/03/code-block-metadata.html</id> 1838 + <title>Code block metadata</title> 1839 + <published>2025-03-07T00:00:00Z</published> 1840 + <updated>2025-03-07T00:00:00Z</updated> 1841 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/03/code-block-metadata.html"/> 1842 + <summary>Back in 2021 introduced some to odoc’s code blocks to allow us to attach arbitrary metadata to the blocks. We imposed no structure on this; it was simply a block of text in between the language ta...</summary> 1843 + <content type="html"><![CDATA[<h1 id="code-block-metadata"><a href="#code-block-metadata" class="anchor"></a>Code block metadata</h1> 1844 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-03-07</p></li></ul> 1845 + <p>Back in 2021 <a href="https://github.com/julow">julow</a> introduced some <a href="https://github.com/ocaml-doc/odoc-parser/pull/2">new syntax</a> to odoc’s code blocks to allow us to attach arbitrary metadata to the blocks. We imposed no structure on this; it was simply a block of text in between the language tag and the start of the code block. Now odoc needs to use it itself, we need to be a bit more precise about how it’s defined.</p> 1846 + <p>The original concept looked like this:</p> 1847 + <pre>{@ocaml metadata goes here in an unstructured way[ 1507 1848 ... code ... 
1508 - ]}&lt;/pre&gt; 1509 - &lt;p&gt;where everything in between the language (“ocaml” in this case) and the opening square bracket would be captured and put into the AST verbatim. Odoc itself has had no particular use for this, but it has been used in &lt;a href=&quot;https://github.com/realworldocaml/mdx&quot;&gt;mdx&lt;/a&gt; to control how it handles the code blocks, for example to skip processing of the block, to synchronise the block with another file, to disable testing the block on particular OSs and so on.&lt;/p&gt; 1510 - &lt;p&gt;As part of the Odoc 3 release we decided to address one of our &lt;a href=&quot;https://github.com/ocaml/odoc/pull/303&quot;&gt;oldest open issues&lt;/a&gt;, that of extracting code blocks from mli/mld files for inclusion into other files. This is similar to the file-sync facility in mdx but it works in the other direction: the canonical source is in the mld/mli file. In order to do this, we now need to use the metadata so we can select which code blocks to extract, and so we needed a more concrete specification of how the metadata should be parsed.&lt;/p&gt; 1511 - &lt;p&gt;We looked at what &lt;a href=&quot;https://github.com/realworldocaml/mdx/blob/main/lib/label.ml#L195-L210&quot;&gt;mdx does&lt;/a&gt;, but the way it works is rather ad-hoc, using very simple String.splits to chop up the metadata. This is OK for mdx as it’s fully in charge of what things the user might want to put into the metadata, but for a general parsing library like odoc.parser we need to be a bit more careful. Daniel Bünzli &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1326#issuecomment-2702260053&quot;&gt;suggested&lt;/a&gt; a simple strategy of atoms and bindings inspired by s-expressions. 
The idea is that we can have something like this:&lt;/p&gt; 1512 - &lt;pre&gt;{@ocaml atom1 &amp;quot;atom two&amp;quot; key1=value1 &amp;quot;key 2&amp;quot;=&amp;quot;value with spaces&amp;quot;[ 1849 + ]}</pre> 1850 + <p>where everything in between the language (“ocaml” in this case) and the opening square bracket would be captured and put into the AST verbatim. Odoc itself has had no particular use for this, but it has been used in <a href="https://github.com/realworldocaml/mdx">mdx</a> to control how it handles the code blocks, for example to skip processing of the block, to synchronise the block with another file, to disable testing the block on particular OSs and so on.</p> 1851 + <p>As part of the Odoc 3 release we decided to address one of our <a href="https://github.com/ocaml/odoc/pull/303">oldest open issues</a>, that of extracting code blocks from mli/mld files for inclusion into other files. This is similar to the file-sync facility in mdx but it works in the other direction: the canonical source is in the mld/mli file. In order to do this, we now need to use the metadata so we can select which code blocks to extract, and so we needed a more concrete specification of how the metadata should be parsed.</p> 1852 + <p>We looked at what <a href="https://github.com/realworldocaml/mdx/blob/main/lib/label.ml#L195-L210">mdx does</a>, but the way it works is rather ad-hoc, using very simple String.splits to chop up the metadata. This is OK for mdx as it’s fully in charge of what things the user might want to put into the metadata, but for a general parsing library like odoc.parser we need to be a bit more careful. Daniel Bünzli <a href="https://github.com/ocaml/odoc/pull/1326#issuecomment-2702260053">suggested</a> a simple strategy of atoms and bindings inspired by s-expressions. The idea is that we can have something like this:</p> 1853 + <pre>{@ocaml atom1 &quot;atom two&quot; key1=value1 &quot;key 2&quot;=&quot;value with spaces&quot;[ 1513 1854 ... 
code content ... 1514 - ]}&lt;/pre&gt; 1515 - &lt;p&gt;Daniel suggested a very minimal escaping rule, whereby a string could contain a literal &amp;quot; by prefixing with a backslash - something like; &amp;quot;value with a \&amp;quot; and spaces&amp;quot;, but we discussed it during the &lt;a href=&quot;https://ocaml.org/governance/platform&quot;&gt;odoc developer meeting&lt;/a&gt; and felt that we might want something a little more familiar. So we took a look at the lexer in &lt;a href=&quot;https://github.com/janestreet/sexplib/blob/master/src/lexer.mll&quot;&gt;sexplib&lt;/a&gt; and found that it follows the &lt;a href=&quot;https://github.com/janestreet/sexplib/blob/d7c5e3adc16fcf0435220c3cd44bb695775020c1/README.org#lexical-conventions-of-s-expression&quot;&gt;lexical conventions&lt;/a&gt; of OCaml’s strings, and decided that would be a reasonable approach for us to follow too.&lt;/p&gt; 1516 - &lt;p&gt;The resulting code, including the extraction logic, was implemented in &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1326/&quot;&gt;PR 1326&lt;/a&gt; mainly by &lt;a href=&quot;https://github.com/panglesd&quot;&gt;panglesd&lt;/a&gt; with a little help from me on the lexer.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/03/code-block-metadata.html</id><title type="text">Code block metadata</title><updated>2025-03-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry></feed> 1855 + ]}</pre> 1856 + <p>Daniel suggested a very minimal escaping rule, whereby a string could contain a literal &quot; by prefixing with a backslash - something like; &quot;value with a \&quot; and spaces&quot;, but we discussed it during the <a href="https://ocaml.org/governance/platform">odoc developer meeting</a> and felt that we might want something a little more familiar. 
So we took a look at the lexer in <a href="https://github.com/janestreet/sexplib/blob/master/src/lexer.mll">sexplib</a> and found that it follows the <a href="https://github.com/janestreet/sexplib/blob/d7c5e3adc16fcf0435220c3cd44bb695775020c1/README.org#lexical-conventions-of-s-expression">lexical conventions</a> of OCaml’s strings, and decided that would be a reasonable approach for us to follow too.</p> 1857 + <p>The resulting code, including the extraction logic, was implemented in <a href="https://github.com/ocaml/odoc/pull/1326/">PR 1326</a> mainly by <a href="https://github.com/panglesd">panglesd</a> with a little help from me on the lexer.</p>]]></content> 1858 + </entry> 1859 + </feed>
+41
blog-index.mld
··· 1 + {0 Blog} 2 + 3 + @children_order 2026/ 2025/ 4 + 5 + @recent-posts 6 + {ul 7 + {- {{!//blog/2026/03/page-"weeknotes-2026-10"}Weeknotes 2026 week 10} 2026-03-09} 8 + {- {{!//blog/2026/03/page-"weeknotes-2026-09"}Weeknotes 2026 week 9} 2026-03-02} 9 + {- {{!//blog/2026/02/page-"weeknotes-2026-08"}Weeknotes weeks 7-8} 2026-02-23} 10 + {- {{!//blog/2026/02/page-"weeknotes-2026-06"}Weeknotes for week 6} 2026-02-09} 11 + {- {{!//blog/2026/01/page-"weeknotes-2026-04-05"}Weeknotes for weeks 4-5} 2026-01-31} 12 + {- {{!//blog/2026/01/page-"weeknotes-2026-03"}Weeknotes for week 3} 2026-01-18} 13 + {- {{!//blog/2025/12/page-"claude-and-dune"}Claude and Dune} 2025-12-18} 14 + {- {{!//blog/2025/12/page-"an-svg-is-all-you-need"}An SVG is all you need} 2025-12-15} 15 + {- {{!//blog/2025/11/page-"foundations-of-computer-science"}Foundations of Computer Science} 2025-11-15} 16 + {- {{!//blog/2025/09/page-"caching-opam-solutions2"}Caching opam solutions - part 2} 2025-09-20} 17 + {- {{!//blog/2025/09/page-"odoc-bugs"}Odoc bugs} 2025-09-15} 18 + {- {{!//blog/2025/09/page-"caching-opam-solutions"}Caching opam solutions} 2025-09-10} 19 + {- {{!//blog/2025/09/page-"build-ids-for-day10"}Build IDs for Day10} 2025-09-05} 20 + {- {{!//blog/2025/09/page-"giving-hub-cl-an-upgrade"}Giving hub.cl an upgrade} 2025-09-01} 21 + {- {{!//blog/2025/08/page-"ocaml-lsp-mcp"}Using ocaml-lsp-server via an MCP server} 2025-08-25} 22 + {- {{!//blog/2025/08/page-"ocaml-mcp-server"}An OCaml MCP server} 2025-08-20} 23 + {- {{!//blog/2025/08/page-week33}Week 33} 2025-08-11} 24 + {- {{!//blog/2025/07/page-retrospective}4 months in, a retrospective} 2025-07-28} 25 + {- {{!//blog/2025/07/page-"odoc-3-live-on-ocaml-org"}Odoc 3 is live on OCaml.org!} 2025-07-21} 26 + {- {{!//blog/2025/07/page-week28}Week 28} 2025-07-14} 27 + {- {{!//blog/2025/07/page-week27}Weeks 24-27} 2025-07-07} 28 + {- {{!//blog/2025/06/page-week23}Week 23} 2025-06-09} 29 + {- {{!//blog/2025/05/page-"docs-progress"}Progress in OCaml 
docs} 2025-05-28} 30 + {- {{!//blog/2025/05/page-"lots-of-things"}Lots of things have been happening} 2025-05-22} 31 + {- {{!//blog/2025/05/page-"ticks-solved-by-ai"}Solving First-year OCaml exercises with AI} 2025-05-15} 32 + {- {{!//blog/2025/05/page-"oxcaml-gets-closer"}OxCaml is getting closer...} 2025-05-10} 33 + {- {{!//blog/2025/05/page-"ai-for-climate-and-nature-day"}AI for Climate & Nature Community Day} 2025-05-05} 34 + {- {{!//blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}OCaml-Docs-CI and Odoc 3} 2025-04-28} 35 + {- {{!//blog/2025/04/page-"odoc-3"}Odoc 3: So what?} 2025-04-21} 36 + {- {{!//blog/2025/04/page-"semantic-versioning-is-hard"}Semantic Versioning in OCaml is Hard} 2025-04-14} 37 + {- {{!//blog/2025/04/page-"meeting-the-team"}Meeting the Team} 2025-04-07} 38 + {- {{!//blog/2025/04/page-"this-site"}This site} 2025-04-01} 39 + {- {{!//blog/2025/03/page-"module-type-of"}The Road to Odoc 3: Module Type Of} 2025-03-24} 40 + {- {{!//blog/2025/03/page-"code-block-metadata"}Code block metadata} 2025-03-17} 41 + }
+99 -42
deploy-site.sh
··· 1 1 #!/bin/bash 2 2 # Build and optionally serve the full jon.recoil.org site. 3 3 # 4 - # The site is built in two passes: 5 - # 1. `dune build @site` — site content (.mld pages → HTML) 6 - # Output: _build/default/site/_html/ 7 - # 2. `dune build @doc` — reference docs (library API docs) 8 - # Output: _build/default/_doc/_html/reference/ 9 - # 10 - # These are combined into a single tree for serving/deployment: 11 - # _build/default/site/_html/ — site root 12 - # _build/default/site/_html/reference/ — symlinked to @doc output 13 - # 14 - # Prerequisites: 15 - # - opam switch "default" with dune 3.21+ (html_flags + prefix support) 16 - # - odoc-jons-plugins and odoc-interactive-extension in the workspace 4 + # Dune outputs go into _build/ (which dune controls and may wipe). 5 + # We assemble the final site into _site/ so that expensive artifacts 6 + # like universes persist across rebuilds. 17 7 # 18 8 # Usage: 19 9 # ./deploy-site.sh # build everything and serve on port 8080 20 10 # ./deploy-site.sh --no-serve # build only 11 + # ./deploy-site.sh --fresh # rebuild universes from scratch 21 12 22 13 set -euo pipefail 23 14 24 15 MONO=$(cd "$(dirname "$0")" && pwd) 25 - SITE_HTML="$MONO/_build/default/site/_html" 26 - DOC_HTML="$MONO/_build/default/_doc/_html" 16 + SITE="$MONO/_site" 17 + DUNE_SITE="$MONO/_build/default/site/_html" 18 + DUNE_DOC="$MONO/_build/default/_doc/_html" 27 19 SERVE=true 20 + FRESH=false 28 21 29 - if [[ "${1:-}" == "--no-serve" ]]; then 30 - SERVE=false 22 + for arg in "$@"; do 23 + case "$arg" in 24 + --no-serve) SERVE=false ;; 25 + --fresh) FRESH=true ;; 26 + esac 27 + done 28 + 29 + if $FRESH; then 30 + echo "=== --fresh: removing _site ===" 31 + rm -rf "$SITE" 31 32 fi 32 33 34 + mkdir -p "$SITE" 35 + 33 36 # Ensure we're on the right switch. 
34 37 export OPAMSWITCH=5.2.0+ox 35 38 eval "$(opam env)" ··· 37 40 echo "=== Step 1: Build and register plugins ===" 38 41 cd "$MONO" 39 42 dune build @install 40 - # Shell and extensions must be findlib-visible for odoc to load them. 41 - # x-ocaml.js is found from _build/install/ so x-ocaml itself doesn't need installing. 42 - # All workspace packages must be findlib-visible for odoc shell/extensions and jtw opam. 43 43 dune install 2>/dev/null 44 44 echo " plugins registered" 45 45 46 46 echo "" 47 47 echo "=== Step 2: Build site content ===" 48 48 dune build @site 2>&1 | grep -v '^Warning\|^File\|^$' | tail -5 || true 49 - echo " site built → $SITE_HTML/" 49 + echo " dune @site done" 50 + 51 + echo "" 52 + echo "=== Step 3: Build reference docs ===" 53 + dune build @doc 2>&1 | tail -5 || true 54 + echo " dune @doc done" 55 + 56 + echo "" 57 + echo "=== Step 4: Assemble site ===" 58 + # Copy site HTML from dune output into _site. 59 + # rsync would be ideal but cp -rf works fine. 60 + cp -rf "$DUNE_SITE/"* "$SITE/" 61 + # Merge reference docs on top (site has reference/index.html, @doc adds packages). 
62 + cp -rf "$DUNE_DOC/reference/"* "$SITE/reference/" 63 + mkdir -p "$SITE/odoc.support/" 64 + cp -rf "$DUNE_DOC/odoc.support/"* "$SITE/odoc.support/" 65 + echo " assembled into $SITE/" 50 66 51 67 echo "" 52 - echo "=== Step 3: Build universes for interactive code cells ===" 53 - jtw opam astring brr note mime_printer fpath rresult \ 54 - opam-format bos odoc.model tyxml yojson uri jsonm \ 55 - js_top_worker-widget-leaflet \ 56 - tessera-geotessera-jsoo tessera-viz-jsoo -o "$SITE_HTML/_opam" 57 - echo " universe built → $SITE_HTML/_opam/" 68 + echo "=== Step 5: Build site universe (notebooks) ===" 69 + if [ -f "$SITE/_opam/worker.js" ]; then 70 + echo " universe already exists, skipping (use --fresh to rebuild)" 71 + else 72 + jtw opam astring base brr note mime_printer fpath rresult \ 73 + opam-format bos odoc.model tyxml yojson uri jsonm \ 74 + js_top_worker-widget-leaflet \ 75 + tessera-geotessera-jsoo tessera-viz-jsoo \ 76 + onnxrt -o "$SITE/_opam" 77 + echo " universe built → $SITE/_opam/" 78 + fi 58 79 59 80 echo "" 60 - echo "=== Step 4: Build reference docs ===" 61 - dune build @doc 2>&1 | tail -5 || true 62 - echo " reference docs built → $DOC_HTML/reference/" 81 + echo "=== Step 6: Deploy onnxrt example assets ===" 82 + SENTIMENT_SRC="$MONO/onnxrt/example/sentiment" 83 + if [ ! -f "$SENTIMENT_SRC/model_quantized.onnx" ]; then 84 + echo " downloading DistilBERT model..." 85 + bash "$SENTIMENT_SRC/download_model.sh" 86 + fi 87 + cp "$SENTIMENT_SRC/vocab.txt" "$SITE/_opam/vocab.txt" 88 + cp "$SENTIMENT_SRC/model_quantized.onnx" "$SITE/_opam/model_quantized.onnx" 89 + cp "$MONO/onnxrt/example/add.onnx" "$SITE/_opam/add.onnx" 90 + echo " deployed vocab.txt + model_quantized.onnx + add.onnx → $SITE/_opam/" 63 91 64 92 echo "" 65 - echo "=== Step 5: Merge reference docs into site ===" 66 - # Copy @doc reference output into the site's reference/ directory. 
67 - # The site build already creates reference/index.html from reference/index.mld; 68 - # we add the API docs packages alongside it. 69 - # Make existing files writable first (dune marks build outputs read-only). 70 - chmod -R u+w "$SITE_HTML/reference/" "$SITE_HTML/odoc.support/" 2>/dev/null || true 71 - cp -rf "$DOC_HTML/reference/"* "$SITE_HTML/reference/" 72 - mkdir -p "$SITE_HTML/odoc.support/" 73 - cp -rf "$DOC_HTML/odoc.support/"* "$SITE_HTML/odoc.support/" 74 - echo " merged reference docs into site tree" 93 + echo "=== Step 7: Build demo universes ===" 94 + DEMO_DIR="$SITE/reference/odoc-interactive-extension" 95 + 96 + if [ -f "$DEMO_DIR/universe/worker.js" ]; then 97 + echo " demo universes already exist, skipping (use --fresh to rebuild)" 98 + else 99 + UNIVERSES=$(mktemp -d) 100 + trap 'rm -rf "$UNIVERSES"' EXIT 101 + 102 + echo " building default universe (cmdliner, 5.2.0+ox switch)..." 103 + jtw opam --switch=5.2.0+ox -o "$UNIVERSES/default" cmdliner 104 + 105 + echo " building v3 universe (cmdliner, 5.2.0+ox switch)..." 106 + jtw opam --switch=5.2.0+ox -o "$UNIVERSES/v3" cmdliner 107 + 108 + echo " building oxcaml universe (5.2.0+ox switch)..." 
109 + jtw opam --switch=5.2.0+ox -o "$UNIVERSES/oxcaml" 110 + 111 + for d in universe universe-v2 universe-v3 universe-oxcaml; do 112 + rm -rf "$DEMO_DIR/$d" 113 + done 114 + 115 + cp -r "$UNIVERSES/default" "$DEMO_DIR/universe" 116 + echo " deployed universe/" 117 + 118 + cp -r "$UNIVERSES/default" "$DEMO_DIR/universe-v2" 119 + echo " deployed universe-v2/" 120 + 121 + cp -r "$UNIVERSES/v3" "$DEMO_DIR/universe-v3" 122 + echo " deployed universe-v3/" 123 + 124 + cp -r "$UNIVERSES/oxcaml" "$DEMO_DIR/universe-oxcaml" 125 + echo " deployed universe-oxcaml/" 126 + 127 + for d in universe universe-v2 universe-v3 universe-oxcaml; do 128 + cp "$SITE/_x-ocaml/x-ocaml.js" "$DEMO_DIR/$d/x-ocaml.js" 129 + done 130 + fi 75 131 76 132 echo "" 77 133 echo "=== Done ===" 78 134 echo "" 79 - echo "Site root: $SITE_HTML/" 135 + echo "Site root: $SITE/" 80 136 echo "" 81 137 echo "Key pages:" 82 138 echo " /index.html — site home" ··· 85 141 echo " /notebooks/foundations/index.html — foundations of CS" 86 142 echo " /projects/index.html — projects" 87 143 echo " /reference/ — API reference docs" 144 + echo " /reference/odoc-interactive-extension/ — interactive demos" 88 145 89 146 if $SERVE; then 90 147 echo "" 91 148 echo "Starting HTTP server on http://localhost:8080" 92 - cd "$SITE_HTML" 149 + cd "$SITE" 93 150 exec python3 -m http.server 8080 94 151 fi
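Note on the deploy-site.sh change above: the script now keeps expensive universes in _site/ and rebuilds them only when a sentinel file (worker.js) is missing or --fresh is passed. Below is a minimal sketch of that pattern, not the script itself. `build_universe` and `ensure_universe` are hypothetical names, and `build_universe` stands in for the real `jtw opam ...` call. Unlike the sketch, the real script removes the whole _site directory on --fresh rather than a single universe.

```shell
# Sketch of the "build once, reuse unless --fresh" pattern.
set -eu

# Hypothetical stand-in for the expensive `jtw opam ... -o DIR` build:
# it only creates the sentinel file that the cache check looks for.
build_universe() {
  mkdir -p "$1"
  echo "built" > "$1/worker.js"
}

# ensure_universe DIR FRESH
# Rebuilds DIR when FRESH is "true" or when DIR/worker.js is missing.
ensure_universe() {
  local out="$1" fresh="$2"
  if [ "$fresh" = true ]; then
    rm -rf "$out"
  fi
  if [ -f "$out/worker.js" ]; then
    echo "skipped"
  else
    build_universe "$out"
    echo "rebuilt"
  fi
}

demo=$(mktemp -d)
ensure_universe "$demo/_opam" false   # prints: rebuilt
ensure_universe "$demo/_opam" false   # prints: skipped
ensure_universe "$demo/_opam" true    # prints: rebuilt
rm -rf "$demo"
```

Because the check only looks for worker.js, changing the package list passed to the build does not trigger a rebuild on its own. In that case --fresh is still needed.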
+15 -6
js_top_worker/bin/mk_backend.ml
··· 22 22 in 23 23 let _ = Util.lines_of_process cmd in 24 24 (* No longer query library stubs - they are now linked directly into each library's JS file *) 25 + (* TODO: This probes for basement in every switch, but basement only exists 26 + in oxcaml switches. Should only check when building for an oxcaml switch, 27 + or better yet, detect from worker.bc's actual dependencies. *) 28 + let has_basement = 29 + let cmd = Bos.Cmd.(ocamlfind_cmd % "query" % "basement") in 30 + match Bos.OS.Cmd.(run_out ~err:err_null cmd |> out_null |> success) with 31 + | Ok () -> true 32 + | Error _ -> false 33 + in 25 34 let cmd = 26 35 Bos.Cmd.( 27 36 js_of_ocaml_cmd % "--toplevel" % "--linkall" % "--pretty") 28 37 in 38 + let runtime_files = 39 + [ "+dynlink.js"; "+toplevel.js"; "+bigstringaf/runtime.js"; 40 + "+js_top_worker/stubs.js" ] 41 + @ (if has_basement then [ "+basement/runtime.js" ] else []) 42 + in 29 43 let cmd = 30 44 List.fold_right 31 45 (fun a cmd -> Bos.Cmd.(cmd % a)) 32 - [ 33 - "+dynlink.js"; 34 - "+toplevel.js"; 35 - "+bigstringaf/runtime.js"; 36 - "+js_top_worker/stubs.js"; 37 - ] 46 + runtime_files 38 47 cmd 39 48 in 40 49 let cmd =
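Note on the mk_backend.ml change above: the runtime file list is now assembled from a fixed base plus `+basement/runtime.js`, which is added only if `ocamlfind query basement` succeeds. The same idea is sketched here in shell for brevity. `runtime_files` is a hypothetical helper, and the probe command is passed in as arguments so the logic can be tried without an OCaml toolchain.

```shell
# runtime_files PROBE [ARGS...]
# Prints the js_of_ocaml runtime files, one per line. The optional basement
# runtime is appended only when the probe command exits successfully.
# The change above uses `ocamlfind query basement` as the probe.
runtime_files() {
  printf '%s\n' \
    "+dynlink.js" \
    "+toplevel.js" \
    "+bigstringaf/runtime.js" \
    "+js_top_worker/stubs.js"
  if "$@" >/dev/null 2>&1; then
    printf '%s\n' "+basement/runtime.js"
  fi
}

# Real-world probe: a missing ocamlfind or a missing package both count as "no".
runtime_files ocamlfind query basement
```

As the TODO in the diff points out, this probes whichever switch is active, so the result reflects what is installed there rather than what worker.bc actually depends on.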
+8
js_top_worker/bin/opam.ml
··· 1 + (* TODO: Add content-addressed output paths (like day10's jtw_gen.ml) so that 2 + worker.js, .cma.js, and .cmi files are served from paths containing a 3 + content hash. This prevents stale browser caches after rebuilds. 4 + See day10/bin/jtw_gen.ml for the reference implementation: 5 + - compiler/<version>/<hash>/worker.js 6 + - p/<pkg>/<version>/<hash>/lib/<name>/... 7 + - findlib_index.json references hashed paths *) 8 + 1 9 open Bos 2 10 3 11 let opam = Cmd.v "opam"
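Note on the TODO above: a content-addressed layout puts a hash of the file contents into the served path, so a rebuilt worker.js gets a new URL and a stale browser cache entry is never reused. The following is a rough sketch of the `compiler/<version>/<hash>/worker.js` shape only. The choice of SHA-256, the 12-character truncation, and the helper names `content_hash` and `hashed_worker_path` are assumptions for illustration. They are not taken from day10's jtw_gen.ml.

```shell
# content_hash FILE
# Prints a short hex digest of FILE's contents.
# Assumption: SHA-256 truncated to 12 hex characters.
content_hash() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -c1-12
  else
    # Fallback for systems that ship shasum instead of sha256sum.
    shasum -a 256 "$1" | cut -c1-12
  fi
}

# hashed_worker_path VERSION FILE
# Prints the content-addressed path under which FILE would be served.
hashed_worker_path() {
  local version="$1" file="$2"
  printf 'compiler/%s/%s/worker.js\n' "$version" "$(content_hash "$file")"
}
```

Identical contents always map to the same path, so unchanged artifacts stay cacheable across rebuilds. Any byte-level change produces a different path.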
+1 -1
odoc-dot-extension/index.mld
··· 6 6 7 7 {1 Installation} 8 8 9 - {[ 9 + {@shell[ 10 10 opam install odoc-dot-extension 11 11 ]} 12 12
+17 -13
odoc-interactive-extension/doc/demo1.mld
··· 1 1 {0 Interactive OCaml Demo} 2 2 3 + @x-ocaml.universe ./universe 4 + @x-ocaml.worker ./universe/worker.js 5 + 3 6 This page demonstrates interactive OCaml code cells powered by 4 7 [x-ocaml] and [js_top_worker]. 5 8 ··· 7 10 8 11 Try evaluating some OCaml expressions: 9 12 10 - {@ocaml[ 13 + {@ocaml x[ 11 14 1 + 2 * 3 12 15 ]} 13 16 14 - {@ocaml[ 17 + {@ocaml x[ 15 18 let greet name = Printf.sprintf "Hello, %s!" name 16 19 17 20 let () = print_endline (greet "World") 18 21 ]} 19 22 20 - {1 Using Yojson} 23 + {1 Using Cmdliner} 21 24 22 - These cells use the [yojson] library loaded from the universe: 25 + These cells use the [cmdliner] library loaded from the universe: 23 26 24 - {@ocaml[ 25 - #require "yojson" 27 + {@ocaml x[ 28 + #require "cmdliner" 26 29 ]} 27 30 28 - {@ocaml[ 29 - let json = `Assoc [ 30 - ("name", `String "OCaml"); 31 - ("version", `Float 5.4); 32 - ("features", `List [`String "modules"; `String "types"]) 33 - ] 31 + {@ocaml x[ 32 + let greeting = 33 + Cmdliner.Arg.(value & opt string "World" & info ["name"] ~docv:"NAME") 34 + 35 + let greet_term = 36 + Cmdliner.Term.(const (Printf.printf "Hello, %s!\n") $ greeting) 34 37 35 - let () = print_endline (Yojson.Safe.pretty_to_string json) 38 + let cmd = Cmdliner.Cmd.v (Cmdliner.Cmd.info "greet") greet_term 39 + let () = Printf.printf "Command: %s\n" (Cmdliner.Cmd.name cmd) 36 40 ]}
+20 -14
odoc-interactive-extension/doc/demo2_v2.mld
··· 1 - {0 Yojson v2 Demo} 1 + {0 Cmdliner v1 Demo} 2 2 3 3 @x-ocaml.universe ./universe-v2 4 4 @x-ocaml.worker ./universe-v2/worker.js 5 5 6 - This page uses {b yojson 2.2.2} from a separate universe directory. 6 + This page uses {b cmdliner 1.3.0} from a separate universe directory. 7 7 8 - {@ocaml[ 9 - #require "yojson" 8 + {@ocaml x[ 9 + #require "cmdliner" 10 10 ]} 11 11 12 - {@ocaml[ 13 - (* Yojson 2.x API *) 14 - let json = `Assoc [("key", `String "value")] 15 - let s = Yojson.Safe.to_string json 16 - let () = print_endline s 12 + {@ocaml x[ 13 + (* Cmdliner 1.x: define an argument and build a command *) 14 + let greeting = 15 + Cmdliner.Arg.(value & opt string "World" & info ["name"] ~docv:"NAME") 16 + 17 + let greet_term = 18 + Cmdliner.Term.(const (Printf.printf "Hello, %s!\n") $ greeting) 19 + 20 + let cmd = Cmdliner.Cmd.v (Cmdliner.Cmd.info "greet" ~doc:"A greeting command") greet_term 21 + let () = Printf.printf "Command: %s\n" (Cmdliner.Cmd.name cmd) 17 22 ]} 18 23 19 - {@ocaml[ 20 - (* Yojson 2.x: Yojson.Safe.prettify is a string->string function *) 21 - let ugly = Yojson.Safe.to_string (`Assoc [("compact", `Bool true); ("data", `List [`Int 1; `Int 2; `Int 3])]) 22 - let pretty = Yojson.Safe.prettify ugly 23 - let () = print_endline pretty 24 + {@ocaml x[ 25 + (* Cmdliner 1.x: multiple arguments *) 26 + let verbose = Cmdliner.Arg.(value & flag & info ["v"; "verbose"]) 27 + let count = Cmdliner.Arg.(value & opt int 1 & info ["count"] ~docv:"N") 28 + let term = Cmdliner.Term.(const (fun v n -> Printf.printf "verbose=%b count=%d\n" v n) $ verbose $ count) 29 + let () = print_endline "Term with multiple arguments defined" 24 30 ]}
+20 -14
odoc-interactive-extension/doc/demo2_v3.mld
··· 1 - {0 Yojson v3 Demo} 1 + {0 Cmdliner v2 Demo} 2 2 3 3 @x-ocaml.universe ./universe-v3 4 4 @x-ocaml.worker ./universe-v3/worker.js 5 5 6 - This page uses {b yojson 3.0.0} from a separate universe directory. 6 + This page uses {b cmdliner 2.1.0} from a separate universe directory. 7 7 8 - {@ocaml[ 9 - #require "yojson" 8 + {@ocaml x[ 9 + #require "cmdliner" 10 10 ]} 11 11 12 - {@ocaml[ 13 - (* Yojson 3.0 API *) 14 - let json = `Assoc [("key", `String "value")] 15 - let s = Yojson.Safe.to_string json 16 - let () = print_endline s 12 + {@ocaml x[ 13 + (* Cmdliner 2.x: same core API for simple cases *) 14 + let greeting = 15 + Cmdliner.Arg.(value & opt string "World" & info ["name"] ~docv:"NAME") 16 + 17 + let greet_term = 18 + Cmdliner.Term.(const (Printf.printf "Hello, %s!\n") $ greeting) 19 + 20 + let cmd = Cmdliner.Cmd.v (Cmdliner.Cmd.info "greet" ~doc:"A greeting command") greet_term 21 + let () = Printf.printf "Command: %s\n" (Cmdliner.Cmd.name cmd) 17 22 ]} 18 23 19 - {@ocaml[ 20 - (* Build and query JSON *) 21 - let parsed = `Assoc [("x", `Int 42); ("y", `String "hello")] 22 - let x = Yojson.Safe.Util.member "x" parsed 23 - let () = Printf.printf "x = %s\n" (Yojson.Safe.to_string x) 24 + {@ocaml x[ 25 + (* Cmdliner 2.x: Cmd.group for subcommands *) 26 + let sub1 = Cmdliner.Cmd.v (Cmdliner.Cmd.info "sub1") (Cmdliner.Term.const ()) 27 + let sub2 = Cmdliner.Cmd.v (Cmdliner.Cmd.info "sub2") (Cmdliner.Term.const ()) 28 + let group = Cmdliner.Cmd.group (Cmdliner.Cmd.info "myapp") [sub1; sub2] 29 + let () = Printf.printf "Group: %s\n" (Cmdliner.Cmd.name group) 24 30 ]}
+7 -7
odoc-interactive-extension/doc/demo3_oxcaml.mld
··· 10 10 11 11 OxCaml adds Python/Haskell-style list and array comprehensions: 12 12 13 - {@ocaml[ 13 + {@ocaml x[ 14 14 let squares = [ x * x for x = 1 to 10 ] 15 15 16 16 let () = List.iter (fun x -> Printf.printf "%d " x) squares 17 17 ]} 18 18 19 - {@ocaml[ 19 + {@ocaml x[ 20 20 let evens = [ x for x = 1 to 20 when x mod 2 = 0 ] 21 21 22 22 let () = Printf.printf "Evens: %s\n" ··· 25 25 26 26 Nested comprehensions produce the cartesian product: 27 27 28 - {@ocaml[ 28 + {@ocaml x[ 29 29 let pairs = [ (x, y) for x = 1 to 3 for y = 1 to 3 when x <> y ] 30 30 31 31 let () = List.iter (fun (x, y) -> Printf.printf "(%d,%d) " x y) pairs ··· 35 35 36 36 Array comprehensions create arrays using the same syntax as list comprehensions: 37 37 38 - {@ocaml[ 38 + {@ocaml x[ 39 39 let squares = [| x * x for x = 1 to 10 |] 40 40 41 41 let () = Array.iter (fun x -> Printf.printf "%d " x) squares 42 42 ]} 43 43 44 - {@ocaml[ 44 + {@ocaml x[ 45 45 let fibs = 46 46 let a = Array.make 10 0 in 47 47 a.(0) <- 1; a.(1) <- 1; ··· 55 55 56 56 [let mutable] provides mutable local variables without heap allocation: 57 57 58 - {@ocaml[ 58 + {@ocaml x[ 59 59 let triangle n = 60 60 let mutable total = 0 in 61 61 for i = 1 to n do ··· 66 66 let () = Printf.printf "triangle 10 = %d\n" (triangle 10) 67 67 ]} 68 68 69 - {@ocaml[ 69 + {@ocaml x[ 70 70 let fizzbuzz n = 71 71 let mutable result = [] in 72 72 for i = n downto 1 do
+16 -14
odoc-interactive-extension/doc/demo4_crossorigin.mld
··· 1 1 {0 Cross-Origin Demo} 2 2 3 - @x-ocaml.universe http://localhost:9090/universe 4 - @x-ocaml.worker http://localhost:9090/universe/worker.js 3 + @x-ocaml.universe https://jon.ludl.am/universe 4 + @x-ocaml.worker https://jon.ludl.am/universe/worker.js 5 5 6 6 This page demonstrates {b cross-origin} loading of OCaml universes. 7 - The page is served from [localhost:8080] while the worker and libraries 8 - are loaded from [localhost:9090], exercising the blob: URL worker 7 + The page is served from the main site while the worker and libraries 8 + are loaded from [jon.ludl.am], exercising the blob: URL worker 9 9 creation and sync XHR + eval library loading code paths. 10 10 11 11 {1 Basic Expression} 12 12 13 - {@ocaml[ 13 + {@ocaml x[ 14 14 1 + 2 * 3 15 15 ]} 16 16 17 - {@ocaml[ 17 + {@ocaml x[ 18 18 let greet name = Printf.sprintf "Hello, %s!" name 19 19 20 20 let () = print_endline (greet "Cross-Origin World") ··· 22 22 23 23 {1 Loading a Library} 24 24 25 - {@ocaml[ 26 - #require "yojson" 25 + {@ocaml x[ 26 + #require "cmdliner" 27 27 ]} 28 28 29 - {@ocaml[ 30 - let json = `Assoc [ 31 - ("origin", `String "cross-origin"); 32 - ("port", `Int 9090) 33 - ] 29 + {@ocaml x[ 30 + let greeting = 31 + Cmdliner.Arg.(value & opt string "Cross-Origin" & info ["name"] ~docv:"NAME") 32 + 33 + let greet_term = 34 + Cmdliner.Term.(const (Printf.printf "Hello from %s!\n") $ greeting) 34 35 35 - let () = print_endline (Yojson.Safe.pretty_to_string json) 36 + let cmd = Cmdliner.Cmd.v (Cmdliner.Cmd.info "cross-greet") greet_term 37 + let () = Printf.printf "Cross-origin command: %s\n" (Cmdliner.Cmd.name cmd) 36 38 ]}
+4 -4
odoc-interactive-extension/doc/demo5_multiverse.mld
··· 20 20 21 21 {1 Basic Expression} 22 22 23 - {@ocaml[ 23 + {@ocaml x[ 24 24 1 + 2 * 3 25 25 ]} 26 26 27 - {@ocaml[ 27 + {@ocaml x[ 28 28 let greet name = Printf.sprintf "Hello, %s!" name 29 29 30 30 let () = print_endline (greet "Multiverse World") ··· 32 32 33 33 {1 Loading a Library} 34 34 35 - {@ocaml[ 35 + {@ocaml x[ 36 36 #require "yojson" 37 37 ]} 38 38 39 - {@ocaml[ 39 + {@ocaml x[ 40 40 let json = `Assoc [ 41 41 ("source", `String "multiverse"); 42 42 ("linked_universes", `Int 2)
+4 -4
odoc-interactive-extension/doc/demo7_oxcaml_porting_real.mld
··· 29 29 30 30 Runnable miniature: 31 31 32 - {@ocaml[ 32 + {@ocaml x[ 33 33 let unlink_like win32_unlink is_win = 34 34 if is_win then win32_unlink else fun s -> "unlink " ^ s 35 35 ··· 53 53 54 54 Pedagogical simplified version: 55 55 56 - {@ocaml[ 56 + {@ocaml x[ 57 57 type 'a folder = 'a -> local_ (int -> char -> 'a) 58 58 59 59 let fold_chars (f : local_ 'a folder) acc s = ··· 86 86 87 87 Teaching-scale analogue: 88 88 89 - {@ocaml[ 89 + {@ocaml x[ 90 90 type frac = { global_ num : int; global_ den : int } 91 91 92 92 let compare__local (local_ a) (local_ b) = compare (a.num * b.den) (b.num * a.den) ··· 107 107 108 108 Typical shape: runtime probing + portability-safe fallback. 109 109 110 - {@ocaml[ 110 + {@ocaml x[ 111 111 module Runtime = struct 112 112 let runtime5 = false 113 113 let recommended_domain_count () = if runtime5 then 8 else 1
+3 -3
odoc-interactive-extension/doc/demo_map.mld
··· 3 3 This page demonstrates a managed Leaflet map widget with FRP signals 4 4 and commands. 5 5 6 - {@ocaml[ 6 + {@ocaml x[ 7 7 #require "note";; 8 8 #require "js_top_worker-widget";; 9 9 #require "js_top_worker-widget-leaflet";; ··· 15 15 Click on the map to see coordinates. The click position is captured 16 16 as a [Note] event and displayed as a signal: 17 17 18 - {@ocaml[ 18 + {@ocaml x[ 19 19 let click_e, send_click = Note.E.create () 20 20 let last_click = Note.S.hold "No clicks yet" click_e 21 21 ··· 44 44 45 45 This cell sends a command to the map — clicking the button flies to Paris: 46 46 47 - {@ocaml[ 47 + {@ocaml x[ 48 48 let fly_view = 49 49 let open Widget.View in 50 50 Element { tag = "button"; attrs = [Handler ("click", "fly")];
+61 -5
odoc-interactive-extension/doc/demo_widgets.mld
··· 3 3 This page demonstrates interactive FRP widgets powered by 4 4 [Widget] and [Note]. 5 5 6 - {@ocaml[ 6 + {@ocaml x[ 7 7 #require "note";; 8 8 #require "js_top_worker-widget";; 9 9 ]} ··· 12 12 13 13 A simple widget that renders static HTML: 14 14 15 - {@ocaml[ 15 + {@ocaml x[ 16 16 let open Widget.View in 17 17 Widget.display ~id:"hello" ~handlers:[] 18 18 (Element { tag = "div"; attrs = []; ··· 24 24 A counter driven by [Note] signals. Pressing the buttons sends events 25 25 back to the worker, which updates the signal: 26 26 27 - {@ocaml[ 27 + {@ocaml x[ 28 28 let inc_e, send_inc = Note.E.create () 29 29 let dec_e, send_dec = Note.E.create () 30 30 ··· 68 68 An input slider that drives a signal. Moving the slider sends the value 69 69 back to the worker: 70 70 71 - {@ocaml[ 71 + {@ocaml x[ 72 72 let x_e, send_x = Note.E.create () 73 73 let x = Note.S.hold 50 x_e 74 74 ··· 103 103 This widget derives from the slider signal [x] defined above. Moving 104 104 the slider updates this widget too: 105 105 106 - {@ocaml[ 106 + {@ocaml x[ 107 107 let doubled_view v = 108 108 let open Widget.View in 109 109 Element { tag = "div"; attrs = []; children = [ ··· 120 120 (Widget.update ~id:"doubled") 121 121 let () = Note.Logr.hold _logr3 122 122 ]} 123 + 124 + {1 Text Entry} 125 + 126 + A text input with a button. Typing in the textarea fires [text_changed], 127 + which updates the signal. 
Clicking "Shout" reads the current text and 128 + displays it in uppercase: 129 + 130 + {@ocaml x[ 131 + let text_e, send_text = Note.E.create () 132 + let text_s = Note.S.hold "hello world" text_e 133 + 134 + let shout_e, send_shout = Note.E.create () 135 + let shouted = Note.S.hold "" shout_e 136 + 137 + let text_entry_view txt = 138 + let open Widget.View in 139 + Element { tag = "div"; attrs = []; children = [ 140 + Element { tag = "textarea"; attrs = [ 141 + Property ("rows", "2"); 142 + Handler ("input", "text_changed"); 143 + ]; children = [Text txt] }; 144 + Element { tag = "button"; attrs = [ 145 + Handler ("click", "shout"); 146 + ]; children = [Text "Shout"] }; 147 + ] } 148 + 149 + let result_view s = 150 + let open Widget.View in 151 + Element { tag = "div"; attrs = [ 152 + Style ("font-family", "monospace"); 153 + Style ("padding", "0.5em"); 154 + ]; children = [Text s] } 155 + 156 + let () = 157 + Widget.display ~id:"text-entry" 158 + ~handlers:[ 159 + "text_changed", (fun v -> send_text (Option.value ~default:"" v)); 160 + "shout", (fun _ -> 161 + let current = Note.S.value text_s in 162 + send_shout (String.uppercase_ascii current)); 163 + ] 164 + (text_entry_view "hello world") 165 + 166 + let () = 167 + Widget.display ~id:"text-result" 168 + ~handlers:[] 169 + (result_view "") 170 + 171 + let _logr_text = Note.S.log text_s (fun _ -> ()) 172 + let () = Note.Logr.hold _logr_text 173 + 174 + let _logr4 = Note.S.log 175 + (Note.S.map result_view shouted) 176 + (Widget.update ~id:"text-result") 177 + let () = Note.Logr.hold _logr4 178 + ]}
+2 -2
odoc-interactive-extension/doc/focs_2020_q2.mld
··· 20 20 21 21 We will use the following type throughout: 22 22 23 - {@ocaml[ 23 + {@ocaml x[ 24 24 type colour = Red | Green | Blue 25 25 26 26 exception SizeMismatch ··· 82 82 83 83 Think about this, then check by evaluating: 84 84 85 - {@ocaml[ 85 + {@ocaml x[ 86 86 test 87 87 ]} 88 88
+4 -4
odoc-interactive-extension/doc/focs_2024_q1.mld
··· 9 9 exam consisting of 10 questions, of which each student has answered 6. If a 10 10 student has not attempted a question, this is indicated by a zero in the list. 11 11 12 - {@ocaml[ 12 + {@ocaml x[ 13 13 type marks = int list 14 14 15 15 (* Some sample data to work with *) ··· 89 89 function that does this and returns an appropriate type. Remember that the 90 90 result is not defined for some [marks] values (e.g. all zeros). 91 91 92 - {@ocaml[ 92 + {@ocaml x[ 93 93 (* These are available for your use *) 94 94 let float_of_int = float_of_int 95 95 let sqrt = sqrt ··· 148 148 149 149 Now that you have the building blocks, try some explorations: 150 150 151 - {@ocaml[ 151 + {@ocaml x[ 152 152 (* Which question had the highest average mark? *) 153 153 let best_question = 154 154 let means = List.init 10 (fun q -> (q, qmean q results)) in ··· 159 159 Printf.printf "Best question: Q%d (mean %.1f)\n" (fst best) (snd best) 160 160 ]} 161 161 162 - {@ocaml[ 162 + {@ocaml x[ 163 163 (* Per-student statistics *) 164 164 let () = 165 165 List.iteri (fun i row ->
+2 -2
odoc-interactive-extension/doc/focs_2025_q2.mld
··· 7 7 The following type definition allows the representation of some mathematical 8 8 expressions as an OCaml value: 9 9 10 - {@ocaml[ 10 + {@ocaml x[ 11 11 type expr = 12 12 | Add of expr * expr 13 13 | Mul of expr * expr ··· 95 95 96 96 Here is one possible type definition for [t] and a framework for [reduce]: 97 97 98 - {@ocaml[ 98 + {@ocaml x[ 99 99 (* A possible type definition -- adjust to match your own *) 100 100 type t = Plus | Times | Num of int 101 101 ]}
+21 -1
odoc-interactive-extension/src/interactive_extension.ml
··· 89 89 (** Recognised cell modes — first bare tag matching one of these wins. *) 90 90 let mode_tags = [ "interactive"; "exercise"; "test"; "hidden" ] 91 91 92 + (** All tags recognised by this extension. A code block must carry at 93 + least one of these (or [x]) to opt in to interactive treatment. 94 + Plain [{[...]}] and [{@ocaml[...]}] without tags are left alone. *) 95 + let known_tags = 96 + [ "x"; "interactive"; "exercise"; "test"; "hidden"; 97 + "autorun"; "skip"; "no-merlin" ] 98 + 92 99 module X_ocaml_code : Api.Code_Block_Extension = struct 93 100 let prefix = "ocaml" 94 101 95 102 let to_document meta code = 96 103 let tags = meta.Api.tags in 104 + (* Opt-in: the block must carry at least one recognised bare tag or 105 + a known key=value binding (id, for, env, run-on, kind) to be treated as 106 + interactive. This prevents plain {[...]} code blocks from being 107 + hijacked. *) 108 + let bare = Api.get_all_tags tags in 109 + let has_known_tag = 110 + List.exists (fun t -> List.mem t known_tags) bare 111 + in 112 + let has_known_binding = 113 + List.exists (fun k -> Api.get_binding k tags <> None) 114 + [ "id"; "for"; "env"; "run-on"; "kind" ] 115 + in 116 + if not (has_known_tag || has_known_binding) then None 117 + else 97 118 (* Mode: first bare tag in mode_tags, default "interactive" *) 98 119 let mode = 99 - let bare = Api.get_all_tags tags in 100 120 match List.find_opt (fun t -> List.mem t mode_tags) bare with 101 121 | Some m -> m 102 122 | None -> "interactive"
+14
odoc-jons-plugins/src/odoc_jons_plugins.ml
··· 197 197 Html.header 198 198 ~a:[ Html.a_class [ "jon-shell-header" ] ] 199 199 [ 200 + Html.button 201 + ~a: 202 + [ 203 + Html.a_class [ "jon-shell-sidebar-toggle" ]; 204 + Html.a_title "Toggle sidebar"; 205 + ] 206 + [ Html.txt "\xe2\x98\xb0" ]; 200 207 Html.a ~a:[ Html.a_href "/" ] [ Html.txt "jon.recoil.org" ]; 201 208 Html.nav 202 209 [ ··· 307 314 Html.header 308 315 ~a:[ Html.a_class [ "jon-shell-header" ] ] 309 316 [ 317 + Html.button 318 + ~a: 319 + [ 320 + Html.a_class [ "jon-shell-sidebar-toggle" ]; 321 + Html.a_title "Toggle sidebar"; 322 + ] 323 + [ Html.txt "\xe2\x98\xb0" ]; 310 324 Html.a ~a:[ Html.a_href "/" ] [ Html.txt "jon.recoil.org" ]; 311 325 Html.nav 312 326 [
+85 -1
odoc-jons-plugins/src/odoc_jons_plugins_css.ml
··· 89 89 .jon-shell-header { 90 90 display: flex; 91 91 align-items: center; 92 - justify-content: space-between; 92 + gap: 12px; 93 93 max-width: calc(var(--max-width) + 300px); 94 94 margin: 0 auto; 95 95 padding: 16px 20px; ··· 110 110 .jon-shell-header nav { 111 111 display: flex; 112 112 gap: 20px; 113 + margin-left: auto; 113 114 } 114 115 115 116 .jon-shell-header nav a { ··· 686 687 687 688 .jon-shell-main h1 { 688 689 font-size: 1.6rem; 690 + } 691 + } 692 + /* Sidebar toggle button */ 693 + .jon-shell-sidebar-toggle { 694 + background: none; 695 + border: 1px solid var(--border-color); 696 + border-radius: 4px; 697 + color: var(--text-muted); 698 + font-size: 14px; 699 + cursor: pointer; 700 + padding: 2px 6px; 701 + line-height: 1; 702 + transition: color 0.15s, border-color 0.15s; 703 + } 704 + .jon-shell-sidebar-toggle:hover { 705 + color: var(--link-color); 706 + border-color: var(--link-color); 707 + } 708 + 709 + /* Hidden sidebar state */ 710 + body.sidebar-hidden .jon-shell-sidebar { display: none; } 711 + 712 + /* ================================================================ 713 + Scrollycode theming 714 + Maps jon-shell variables to --sc-* custom properties. 715 + The extension's structural CSS uses these properties for styling. 
716 + ================================================================ */ 717 + 718 + /* Theme custom properties — map shell vars to scrollycode contract */ 719 + .sc-container { 720 + --sc-font-display: var(--font-body); 721 + --sc-font-body: var(--font-body); 722 + --sc-font-code: var(--font-mono); 723 + --sc-bg: var(--bg-color); 724 + --sc-text: var(--text-color); 725 + --sc-text-dim: var(--text-muted); 726 + --sc-accent: var(--link-color); 727 + --sc-accent-soft: var(--highlight-bg); 728 + --sc-code-bg: #1a1a2e; 729 + --sc-code-text: #d4d0c8; 730 + --sc-code-gutter: #3a3a52; 731 + --sc-border: var(--border-color); 732 + --sc-focus-bg: var(--highlight-bg); 733 + --sc-panel-radius: 12px; 734 + --sc-mobile-step-bg: rgba(255,255,255,0.5); 735 + 736 + /* Syntax highlighting */ 737 + --sc-hl-keyword: #f0a6a0; 738 + --sc-hl-type: #8ec8e8; 739 + --sc-hl-string: #b8d89a; 740 + --sc-hl-comment: #6a6a82; 741 + --sc-hl-number: #ddb97a; 742 + --sc-hl-module: #e8c87a; 743 + --sc-hl-operator: #c8a8d8; 744 + --sc-hl-punct: #7a7a92; 745 + } 746 + 747 + /* Hero: centered for jon-shell */ 748 + .sc-container .sc-hero { 749 + border-bottom: 1px solid var(--sc-border); 750 + text-align: center; 751 + } 752 + .sc-container .sc-hero p { 753 + margin: 0 auto; 754 + } 755 + 756 + /* Dark mode overrides */ 757 + @media (prefers-color-scheme: dark) { 758 + .sc-container { 759 + --sc-code-bg: #0f0f18; 760 + --sc-code-text: #c8c5d8; 761 + --sc-code-gutter: #2a2a3e; 762 + --sc-mobile-step-bg: rgba(255,255,255,0.04); 763 + 764 + /* Dark syntax colors */ 765 + --sc-hl-keyword: #ff7eb3; 766 + --sc-hl-type: #7dd3fc; 767 + --sc-hl-string: #4ade80; 768 + --sc-hl-comment: #4a4a62; 769 + --sc-hl-number: #fbbf24; 770 + --sc-hl-module: #c4b5fd; 771 + --sc-hl-operator: #67e8f9; 772 + --sc-hl-punct: #4a4a62; 689 773 } 690 774 } 691 775 |}
-27
odoc-scrollycode-extension/src/scrollycode_css.ml
··· 13 13 14 14 let structural_css = 15 15 {| 16 - /* === Override odoc page chrome for scrollycode pages === */ 17 - .odoc-nav, .odoc-tocs, .odoc-search { display: none !important; } 18 - .odoc-preamble > h1, .odoc-preamble > h2, .odoc-preamble > h3 { display: none !important; } 19 - .at-tags > li > .at-tag { display: none !important; } 20 - .odoc-preamble, .odoc-content { 21 - max-width: none !important; 22 - padding: 0 !important; 23 - margin: 0 !important; 24 - display: block !important; 25 - } 26 - .at-tags { 27 - list-style: none !important; 28 - padding: 0 !important; 29 - margin: 0 !important; 30 - } 31 - .at-tags > li { 32 - display: block !important; 33 - margin: 0 !important; 34 - padding: 0 !important; 35 - } 36 - body.odoc, .odoc { 37 - padding: 0 !important; 38 - margin: 0 !important; 39 - max-width: none !important; 40 - background: inherit; 41 - } 42 - 43 16 /* === Container === */ 44 17 .sc-container { 45 18 font-family: var(--sc-font-body);
+3 -11
odoc-scrollycode-extension/src/scrollycode_themes.ml
··· 516 516 } 517 517 |} 518 518 519 - (** Register all theme CSS files as support files *) 520 - let () = 521 - let register name content = 522 - Odoc_extension_api.Registry.register_support_file ~prefix:"scrolly" { 523 - filename = "extensions/scrollycode-" ^ name ^ ".css"; 524 - content = Inline content; 525 - } 526 - in 527 - register "warm" warm_css; 528 - register "dark" dark_css; 529 - register "notebook" notebook_css 519 + (* Theme CSS strings are retained as documentation of the custom property 520 + contract. Registration as support files has been removed — theming is 521 + now the shell's responsibility. *)
+2 -2
onnxrt/doc/add_example.mld
··· 8 8 9 9 Load [onnxrt], [Note] (FRP), and the widget library: 10 10 11 - {@ocaml[ 11 + {@ocaml x[ 12 12 #require "onnxrt";; 13 13 #require "note";; 14 14 #require "js_top_worker-widget";; ··· 16 16 17 17 Load the ONNX Runtime JavaScript library into the worker: 18 18 19 - {@ocaml[ 19 + {@ocaml x[ 20 20 let () = 21 21 Js_of_ocaml.Js.Unsafe.meth_call 22 22 Js_of_ocaml.Js.Unsafe.global "importScripts"
+83 -41
onnxrt/doc/sentiment_example.mld
··· 9 9 10 10 Load [onnxrt], [Note] (FRP), and the widget library: 11 11 12 - {@ocaml[ 12 + {@ocaml x[ 13 13 #require "onnxrt";; 14 14 #require "note";; 15 15 #require "js_top_worker-widget";; 16 16 ]} 17 17 18 - {@ocaml[ 18 + {@ocaml x[ 19 19 let () = 20 20 Js_of_ocaml.Js.Unsafe.meth_call 21 21 Js_of_ocaml.Js.Unsafe.global "importScripts" ··· 189 189 Create int64 tensors (required by DistilBERT) and compute softmax 190 190 over logits: 191 191 192 - {@ocaml[ 192 + {@ocaml x[ 193 193 open Onnxrt 194 194 195 195 let make_int64_tensor (data : int array) (dims : int array) : Tensor.t = ··· 251 251 the quantized DistilBERT model. A status widget updates as loading 252 252 progresses: 253 253 254 - {@ocaml[ 254 + {@ocaml x[ 255 255 let max_length = 128 256 256 257 257 let vocab = ref None ··· 266 266 Style ("padding", "0.75em 1em"); 267 267 Style ("border-radius", "6px"); 268 268 Style ("font-family", "monospace"); 269 - Style ("background", "#f0f4f8"); 269 + Style ("border", "1px solid currentColor"); 270 + Style ("opacity", "0.8"); 270 271 ]; children = [Text msg] } 271 272 272 273 let () = ··· 287 288 let open Lwt.Syntax in 288 289 let* s = Session.create "model_quantized.onnx" () in 289 290 session := Some s; 290 - send_model_status "Ready! Edit the text below and click Analyze."; 291 + send_model_status "Model ready."; 291 292 Lwt.return_unit) 292 293 ]} 293 294 294 295 {1 Analyze Sentiment} 295 296 296 - Edit the sample text and click {b Run} to classify it. 
The result 297 - widget updates reactively when inference completes: 298 - 299 - {@ocaml exercise[ 300 - let result_e, send_result = Note.E.create () 301 - let result_s = Note.S.hold "" result_e 302 - 303 - let result_view msg = 304 - let open Widget.View in 305 - if msg = "" then Element { tag = "div"; attrs = []; children = [] } 306 - else 307 - Element { tag = "div"; attrs = [ 308 - Style ("padding", "0.75em 1em"); 309 - Style ("border-radius", "6px"); 310 - Style ("font-family", "monospace"); 311 - Style ("font-size", "1.1em"); 312 - Style ("background", "#e8f5e9"); 313 - ]; children = [Text msg] } 297 + Type or edit the text below — the model classifies it reactively 298 + via [Note] signals whenever you click {b Analyze}. 314 299 315 - let () = 316 - Widget.display ~id:"sentiment-result" ~handlers:[] 317 - (result_view "") 300 + {@ocaml x[ 301 + let input_e, send_input = Note.E.create () 302 + let input_text = Note.S.hold 303 + "This movie was absolutely wonderful, I loved every minute of it!" 304 + input_e 318 305 319 - let _logr2 = Note.S.log 320 - (Note.S.map result_view result_s) 321 - (Widget.update ~id:"sentiment-result") 322 - let () = Note.Logr.hold _logr2 323 - ]} 306 + let result_e, send_result = Note.E.create () 307 + let result_s = Note.S.hold "Type something and click Analyze." result_e 324 308 325 - {@ocaml exercise run-on=click[ 326 - let text = "This movie was absolutely wonderful, I loved every minute of it!" 327 - 328 - let () = 309 + let analyze text = 329 310 match !vocab, !session with 330 311 | Some v, Some s -> 331 312 send_result "Running inference..."; ··· 349 330 let label, confidence = 350 331 if probs.(1) > probs.(0) then ("POSITIVE", probs.(1)) 351 332 else ("NEGATIVE", probs.(0)) in 352 - send_result (Printf.sprintf "%s (confidence: %.1f%%)" 353 - label (confidence *. 100.0)); 333 + let emoji = if label = "POSITIVE" then "👍" else "👎" in 334 + send_result (Printf.sprintf "%s %s (%.1f%% confident)" 335 + emoji label (confidence *. 
100.0)); 354 336 Tensor.dispose input_ids_tensor; 355 337 Tensor.dispose attention_mask_tensor; 356 338 Tensor.dispose logits_tensor; 357 339 Lwt.return_unit) 358 340 | _ -> 359 341 send_result "Model not loaded yet — wait a moment and try again." 360 - ]} 361 342 362 - Edit the [text] variable above, click {b Run}, and the result will 363 - appear automatically. 343 + let input_view text = 344 + let open Widget.View in 345 + Element { tag = "div"; attrs = [ 346 + Style ("display", "flex"); 347 + Style ("flex-direction", "column"); 348 + Style ("gap", "0.75em"); 349 + ]; children = [ 350 + Element { tag = "textarea"; attrs = [ 351 + Property ("rows", "3"); 352 + Style ("width", "100%"); 353 + Style ("padding", "0.75em"); 354 + Style ("border-radius", "6px"); 355 + Style ("border", "1px solid currentColor"); 356 + Style ("font-size", "1em"); 357 + Style ("font-family", "inherit"); 358 + Style ("background", "transparent"); 359 + Style ("color", "inherit"); 360 + Style ("resize", "vertical"); 361 + Style ("opacity", "0.9"); 362 + Handler ("input", "text_changed"); 363 + ]; children = [Text text] }; 364 + Element { tag = "button"; attrs = [ 365 + Style ("padding", "0.5em 1.5em"); 366 + Style ("border-radius", "6px"); 367 + Style ("border", "1px solid currentColor"); 368 + Style ("background", "transparent"); 369 + Style ("color", "inherit"); 370 + Style ("font-size", "1em"); 371 + Style ("cursor", "pointer"); 372 + Handler ("click", "analyze"); 373 + ]; children = [Text "Analyze"] }; 374 + ] } 375 + 376 + let result_view result = 377 + let open Widget.View in 378 + Element { tag = "div"; attrs = [ 379 + Style ("font-family", "monospace"); 380 + Style ("padding", "0.75em 1em"); 381 + Style ("border-radius", "6px"); 382 + Style ("border", "1px solid currentColor"); 383 + Style ("opacity", "0.8"); 384 + ]; children = [Text result] } 385 + 386 + let () = 387 + Widget.display ~id:"sentiment-input" 388 + ~handlers:[ 389 + "text_changed", (fun v -> send_input (Option.value 
~default:"" v)); 390 + "analyze", (fun _ -> analyze (Note.S.value input_text)); 391 + ] 392 + (input_view (Note.S.value input_text)) 393 + 394 + let () = 395 + Widget.display ~id:"sentiment-result" ~handlers:[] 396 + (result_view (Note.S.value result_s)) 397 + 398 + let _logr_input = Note.S.log input_text (fun _ -> ()) 399 + let () = Note.Logr.hold _logr_input 400 + 401 + let _logr_result = Note.S.log 402 + (Note.S.map result_view result_s) 403 + (Widget.update ~id:"sentiment-result") 404 + let () = Note.Logr.hold _logr_result 405 + ]}
+1858 -1
scripts/atom.xml
··· 1 1 <?xml version="1.0" encoding="UTF-8"?> 2 - <feed xmlns="http://www.w3.org/2005/Atom"><id>https://jon.recoil.org/atom.xml</id><title type="text">Jon's blog</title><updated>2025-04-24T15:41:17-00:00</updated></feed> 2 + <feed xmlns="http://www.w3.org/2005/Atom"> 3 + <id>https://jon.recoil.org/atom.xml</id> 4 + <title>Jon's blog</title> 5 + <updated>2026-03-10T00:31:40Z</updated> 6 + <author> 7 + <name>Jon Ludlam</name> 8 + <uri>https://jon.recoil.org/</uri> 9 + </author> 10 + <link rel="self" href="https://jon.recoil.org/atom.xml"/> 11 + <link rel="alternate" href="https://jon.recoil.org/blog/"/> 12 + <entry> 13 + <id>https://jon.recoil.org/blog/2026/03/weeknotes-2026-10.html</id> 14 + <title>Weeknotes 2026 week 10</title> 15 + <published>2026-03-09T00:00:00Z</published> 16 + <updated>2026-03-09T00:00:00Z</updated> 17 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/03/weeknotes-2026-10.html"/> 18 + <summary>Here are my weeknotes for the last week, while I'm still writing up some more focused posts on some specific topics - like the experience of putting everything in a monorepo to create this site, and m...</summary> 19 + <content type="html"><![CDATA[<h1 id="weeknotes-2026-week-10"><a href="#weeknotes-2026-week-10" class="anchor"></a>Weeknotes 2026 week 10</h1> 20 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-03-09</p></li></ul> 21 + <p>Here are my weeknotes for the last week, while I'm still writing up some more focused posts on some specific topics - like the experience of putting everything in a monorepo to create this site, and more notes on Claude and Agentic coding in general, and its impact on the world of software. But for now, here's what I've been up to.</p> 22 + <h2 id="what-did-i-do?"><a href="#what-did-i-do?" class="anchor"></a>What did I do?</h2> 23 + <ul><li><p>New site design. The old site was a bit of a mess and was simply reusing odoc's default styling.
I've also rearranged the content a bit to make it more navigable and cohesive.</p><div><a href="old.png" class="img-link"><img src="old.png" alt="old.png"/></a></div><div><a href="new.png" class="img-link"><img src="new.png" alt="new.png"/></a></div></li><li><p>TESSERA in the browser is a <a href="https://tee.cl.cam.ac.uk/">hot</a> <a href="https://anil.recoil.org/notes/2026w10">topic</a> right now, so I've applied the work I've been doing with x-ocaml, js_top_worker and odoc plugins to make a <a href="/notebooks/interactive_map.html">TESSERA notebook</a> that's based on the <a href="https://github.com/ucam-eo/tessera-interactive-map">example notebook</a>.</p><div><a href="tessera.png" class="img-link"><img src="tessera.png" alt="tessera.png"/></a></div></li><li>I was interested in whether we'll be able to do inference in reasonable time using these notebooks. <a href="https://onnx.ai/">ONNX</a> has a web version of its runtime, so I got Claude to make some bindings, and checked it was working by doing a sentiment analysis notebook. This is working nicely, so the next step is to do something a bit more useful.</li><li>The docs CI was again causing problems. This time it had decided that it had never built anything, and therefore needed to rebuild the entire world. However, despite being set up as a custom dedicated runner, all its jobs were queued waiting to start. It turned out that the runner paused itself when the docker partition reached 70%. This was a little surprising on two counts - firstly we don't actually use docker for running the jobs, we use obuilder, which doesn't share space with docker. Secondly, with that in mind, how did it get to 70%? It turned out to be the job logs - including 250 gigs of older logs from a previous instance. Simply blowing those away caused everything to restart and so it's now live again.</li><li>I met up with <a href="">Andrés C. Zúñiga-González</a> to have a chat about how he's using interactive maps and notebooks. 
He pointed me at his <a href="https://ancazugo.github.io/blog.html">blog</a>, some of which is using <a href="https://quarto.org/">quarto</a>, which he rates very highly. An <a href="https://ancazugo.github.io/posts/2025-11-16-tessera_example.html">example of quarto output</a>.</li><li>Our group seminar this week was <a href="https://tombearpark.com/">Tom Bearpark</a> who talked about his proposed 'Carbon at Risk' measure in order to compare diverse ways of removing carbon from the atmosphere to help with the carbon removal market.</li></ul> 24 + <h2 id="what's-next?"><a href="#what's-next?" class="anchor"></a>What's next?</h2> 25 + <ul><li>More writing before more coding, I think.</li></ul>]]></content> 26 + </entry> 27 + <entry> 28 + <id>https://jon.recoil.org/blog/2026/03/weeknotes-2026-09.html</id> 29 + <title>Weeknotes 2026 week 9</title> 30 + <published>2026-03-02T00:00:00Z</published> 31 + <updated>2026-03-02T00:00:00Z</updated> 32 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/03/weeknotes-2026-09.html"/> 33 + <summary>Let's make this really terse!</summary> 34 + <content type="html"><![CDATA[<h1 id="weeknotes-2026-week-9"><a href="#weeknotes-2026-week-9" class="anchor"></a>Weeknotes 2026 week 9</h1> 35 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-03-02</p></li></ul> 36 + <ul class="at-tags"><li class="notanotebook"><span class="at-tag">notanotebook</span> </li></ul> 37 + <p>Let's make this really terse!</p> 38 + <h2 id="what-did-i-do?"><a href="#what-did-i-do?" class="anchor"></a>What did I do?</h2> 39 + <ul><li>Got docs working with github actions on Anil's oxmono monorepo. Results are <a href="https://jonludlam.github.io/oxmono/">here</a>. 
This includes experimental support for oxcaml modes/layouts.</li><li><p>Got markdown mode output into Sherlodoc's db so you can query it - great for agents!</p><div><a href="search.png" class="img-link"><img src="search.png" alt="search.png"/></a></div></li><li><p>Widgets in the JS OCaml toplevels - using FRP for the interactions. The neat thing about using FRP via Daniel Bunzli's <a href="">note</a> library is that the interactions are all purely functional, no refs or mutables in sight. You provide a little wrapper script that's run in the frontend, and the interactions are sent back and forth to the worker running the code, where they're translated into Events and Signals. My proof-of-concept of this is a widget that works with the <a href="https://leafletjs.com/">leaflet.js</a> library:</p><div><video src="mapdemo.mov" controls="controls" aria-label="mapdemo.mov"></video></div><p>Demo coming soon!</p></li><li>Consolidating all of the Odoc toplevel bits and pieces into the one monorepo. Again, demo of this coming soon!</li></ul> 40 + <h2 id="what-am-i-going-to-do?"><a href="#what-am-i-going-to-do?" 
class="anchor"></a>What am I going to do?</h2> 41 + <ul><li>New website!</li><li>Odoc plugins showcase</li><li>Writing writing writing writing</li></ul>]]></content> 42 + </entry> 43 + <entry> 44 + <id>https://jon.recoil.org/blog/2026/02/weeknotes-2026-08.html</id> 45 + <title>Weeknotes weeks 7-8</title> 46 + <published>2026-02-24T00:00:00Z</published> 47 + <updated>2026-02-24T00:00:00Z</updated> 48 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/02/weeknotes-2026-08.html"/> 49 + <summary>A combination one again as I took some time off due to school half term.</summary> 50 + <content type="html"><![CDATA[<h1 id="weeknotes-weeks-7-8"><a href="#weeknotes-weeks-7-8" class="anchor"></a>Weeknotes weeks 7-8</h1> 51 + <ul class="at-tags"><li class="notanotebook"><span class="at-tag">notanotebook</span> </li></ul> 52 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-02-24</p></li></ul> 53 + <p>A combination one again as I took some time off due to school half term.</p> 54 + <h2 id="finished-off-my-exam-questions"><a href="#finished-off-my-exam-questions" class="anchor"></a>Finished off my exam questions</h2> 55 + <p>This was a lot of fun! Obviously I can't talk about it, but while it was stressful and worrying and anxiety inducing and scary, it was also engaging and interesting and thought-provoking. Having some ideas come together to make a nice coherent whole was very cool.</p> 56 + <h2 id="testing-llms-on-past-paper-questions"><a href="#testing-llms-on-past-paper-questions" class="anchor"></a>Testing LLMs on past paper questions</h2> 57 + <p>Similar to our work on the ticks that <a href="https://www.youtube.com/watch?v=Ub8k1BcSRLQ">Sadiq, I and others did last year</a>, I wanted to try to see how well LLMs could answer tripos questions. 
Partly I wanted to do this so I could check that my own questions were of the right sort of level, and partly it was just a displacement activity while I wasn't making progress on the actual exam questions! I've not done a useful analysis of the results yet, but they seemed in line with our experience with the ticks, though the pass rate was lower for the same models (qwen).</p> 58 + <h2 id="claude-from-a-sunbed"><a href="#claude-from-a-sunbed" class="anchor"></a>Claude from a sunbed</h2> 59 + <p>I went away for a vitamin-D boosting bit of sun. Before I went, I got Claude to spin me up a little Telegram bridge so that I could tell it what to do, while it's still running in safeties-off mode on my sacrificial VM. This was kind of fun - I got to just indulge thoughts as they came to me, and off it would go and do stuff. It was a bit limited in how it talked back to me, which wasn't by design but turned out to be nice for this sort of workflow. The downside is that I've now got a load of stuff to sift through - much of which is a 'good start', but none of it is likely to be usable without a good deal more effort. 
Here's a short-list of things I had it do:</p> 60 + <ul><li>Resurrected Fay Carson's work on the <a href="https://github.com/ocaml/odoc/pull/1295">Menhir parser for odoc</a>, pushed <a href="https://github.com/jonludlam/odoc/tree/menhir-parser-rebased">here</a></li><li>Added some instrumentation to Odoc to do some performance experiments</li><li>Ran some simple experiments to measure the impact of various pre-existing performance knobs/switches</li><li>Resurrected an old patch of mine to <a href="https://github.com/jonludlam/odoc/tree/parameterised-paths">unify the two path representations</a> in odoc to measure its effect on performance.</li><li>Tested aggressive reuse of records if their fields don't change during compile/link</li><li>Mixed up the <a href="https://tangled.org/jon.recoil.org/odoc-scrollycode-extension">scrollycode backend</a> and the x-ocaml backend and stuck a playground on at each step</li><li>Unified the oxcaml/ocaml branches of <a href="https://tangled.org/jon.recoil.org/js_top_worker">js_top_worker</a> and x-ocaml via cppo</li><li>Added oxcaml mode/layout annotations to odoc</li></ul> 61 + <h2 id="oxcaml"><a href="#oxcaml" class="anchor"></a>OxCaml</h2> 62 + <p>I investigated the oxcaml docs build, which I had got working last week. Anil reported that it wasn't working for him, so I looked at the build I had and it definitely <i>was</i> working. However, I was building on our machine Monteverde, which is a bit of a beast, so I checked the memory usage and it was enormous! I tried the build again on my 64 gig VM and it OOM'd. I'd noticed before that the <code>cmti</code> files for base, in particular <code>base__Container.cmti</code> were absolutely massive, and so had just assumed that the problem was that. Luke had also mentioned that some of the output from the template machinery was hidden. However, I had Claude look into this and it couldn't see any doc stop comments. 
So I asked it to look a little closer and figure out what was using all the memory. It took an unexpectedly large number of prods from me to finally figure out what was going on - it was to do with how odoc processes <code>includes</code> - specifically an <code>include sig ... end</code>. Essentially an include of that type ends up doubling the storage required of the signature. As the ppx_template extension does quite a lot of this, and in particular nests them, this ends up going exponential and this turned out to be the cause of most of the memory usage. With a fair bit more prodding by me, Claude and I eventually got to a solution, which I'll be upstreaming soon - the fix applies to OCaml as well as OxCaml, but it's this particularly pathological usage of includes that ppx_template uses where it'll make the most difference.</p> 63 + <h2 id="odoc,-plugins,-js-and-more"><a href="#odoc,-plugins,-js-and-more" class="anchor"></a>Odoc, plugins, JS and more</h2> 64 + <p>Teaser... I have a blog post coming soon with more on this. 
It's been a lot of fun, and should provide a decent inspiration for a roadmap for Odoc and online notebooks!</p>]]></content> 65 + </entry> 66 + <entry> 67 + <id>https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html</id> 68 + <title>Weeknotes for week 6</title> 69 + <published>2026-02-09T00:00:00Z</published> 70 + <updated>2026-02-09T00:00:00Z</updated> 71 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html"/> 72 + <summary>Highlights:</summary> 73 + <content type="html"><![CDATA[<h1 id="weeknotes-for-week-6"><a href="#weeknotes-for-week-6" class="anchor"></a>Weeknotes for week 6</h1> 74 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-02-09</p></li></ul> 75 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>odoc.xref2,odoc.loader,odoc.model</p></li></ul> 76 + <ul class="at-tags"><li class="packages"><span class="at-tag">packages</span> <p>odoc</p></li></ul> 77 + <p>Highlights:</p> 78 + <ul><li><a href="https://jon.ludl.am/experiments/day10-jtw/standalone/index.html">day10 / javascript toplevels integration</a></li><li><a href="https://jon.ludl.am/experiments/scrollycoder/">Scrollycode experiments</a></li></ul> 79 + <h2 id="oxmono"><a href="#oxmono" class="anchor"></a>Oxmono</h2> 80 + <p>I spent some time on Anil's oxmono repo getting odoc to work correctly. It turned out that the bug I was working on last week was critically important for this - and that the bugfix was incomplete. One of the issues was to do with identifiers needing to be unique. For example, consider the following code:</p> 81 + <div><pre class="language-ocaml"><code>module type S = sig 82 + type t 83 + 84 + include sig 85 + type t 86 + 87 + val f : t -&gt; t 88 + end with type t := t 89 + end</code></pre></div> 90 + <p>The problem here is that both definitions of `type t` have the same identifier, which causes problems when we move to and from the 'Component' types. 
The solution was to introduce a 'dummy' parent for the type defined within the include. This works because we never actually render the body of the include into HTML - we render the <i>expansion</i>, which <i>doesn't</i> have <code>type t</code> in it, as it has been substituted out.</p> 91 + <p>The fix I made last week fixed the <span class="xref-unresolved" title="Odoc_loader">loader</span>, which reads in the <code>cmt</code>/<code>cmti</code> files produced by the compiler. There's one more place where we create these in the code - when we translate from the <span class="xref-unresolved" title="Odoc_xref2.Component">Component</span> types back into <span class="xref-unresolved" title="Odoc_model.Lang">Lang</span> types. I was a little curious about whether it was possible to make this happen, so I thought I'd ask Claude to see if it could come up with a scenario where we'd end up in this situation. This was a complete failure, which was a real disappointment to me, as doing this sort of thing is quite a tedious and annoying part of working on odoc.</p> 92 + <p>Meanwhile, I was running odoc on Anil's <a href="https://github.com/avsm/oxmono">oxmono</a> repo, which was using <a href="https://github.com/art-w">art-w</a>'s <a href="https://github.com/ocaml/odoc/pull/1399">PR to upstream oxcaml support</a>. It was failing with an exception that was very familiar, so I pulled in the fix I'd been working on, and that enabled it to get much further. However, it did subsequently fail with another slightly different exception. I had my suspicions at this point that it might be due to the other place, but I thought this again was a good opportunity to test Claude's debugging skills. However, this again was a complete failure. I spent quite a long time prodding it - at least 4 separate sessions - and it really didn't get anywhere close to a solution, despite knowing precisely which commit had fixed the first problem. 
Two of the four times it ended up telling me that the oxcaml compiler was broken and suggesting that we create an issue!</p> 93 + <p>I'm only very mildly disappointed in this - it's all quite subtle, and something I still end up scratching my head over sometimes, but it would have been wonderful to be able to offload this sort of work!</p> 94 + <p>In any case, the docs now all build on <a href="https://github.com/jonludlam/oxmono/commit/2a53f6857d5b8849a73f5bb3e5244b9ac0f36708">my fork of oxmono</a>.</p> 95 + <h2 id="docs-ci"><a href="#docs-ci" class="anchor"></a>Docs CI</h2> 96 + <p>The fix I deployed last week for ocaml-docs-ci was taking forever to complete, so I ended up spending some time investigating this. The problem was happening during the 'prep' phase, which is the first part of the pipeline where we simply build the package to be documented. This is supposed to work by building a graph of all inter-package dependencies across all of the solved packages, so we maximise sharing of built artefacts. Each 'prep' job builds precisely one package by copying in the dependencies from previous prep jobs, then running <a href="https://github.com/jonludlam/opamh">opamh</a> to fix up the metadata so that opam believes it has installed everything itself, then running opam to build the one package required. It was this last step that was going wrong, where it would decide that there had been upstream changes to the compiler itself, and rebuild <i>everything</i>, so rather than a prep job taking a few seconds, it would take a few minutes.</p> 97 + <p>I was totally unable to repro this locally - everything built very quickly and just how it should have done. After much head-scratching I finally realised that the problem was somewhere in the caching. I think what's going on is that we dynamically build an opam repository to make the `opam install` command faster, and that repo contains only the packages that are required to build whatever it is we're building. 
Those opam files are cached by the docs CI server and passed to the build script as a base64-encoded gzipped tarball inline in the obuilder file (!). This should all be totally consistent as we're also caching all the builds - except for the compiler itself, which comes from the base docker image. This, of course, is the problem. The ocaml compiler opam files had been updated, and then when we reconstructed the opam repo with our cached opam files, opam noticed they had changed (gone <i>backwards</i> in time!) and decided it needed to rebuild the compiler, and therefore <i>everything</i> else. Clearing out the opam-files cache and restarting the builds fixed this entirely, and the full rebuild job completed after about 2 days. I flipped the switch on Saturday night and the docs are now fully up to date again. Phew!</p> 98 + <h2 id="day10-work"><a href="#day10-work" class="anchor"></a>day10 work</h2> 99 + <p>This was a fun week of large-scale building! I integrated day10 and odoc_driver and js_top_worker and x-ocaml and have now successfully got a docs-ci-like system that's able to build docs and toplevels that can coexist in the one HTML tree. I've not got a full integrated demo yet, but you can see the test cases for this <a href="https://jon.ludl.am/experiments/day10-jtw/standalone/index.html">here</a>. Be sure to take a look at the 'network' tab in the browser dev tools to see what it's doing!</p> 100 + <h2 id="scrollycode-experiments"><a href="#scrollycode-experiments" class="anchor"></a>Scrollycode experiments</h2> 101 + <p>I've long been a fan of <a href="https://pomb.us/">Rodrigo Pombo's</a> work on &quot;building tools for better code reading comprehension&quot;, ever since first seeing his post &quot;<a href="https://pomb.us/build-your-own-react/">Build your own React</a>&quot;. Claude is <i>fantastically good</i> at doing this sort of thing, so I asked it to go and build me some simple OCaml-focused versions. 
We came up with 5 variations in the end - and they're all pretty neat! <a href="https://jon.ludl.am/experiments/scrollycoder/">take a look!</a>. The best part of this was that it took me less than half-an-hour to get Claude to do all this.</p> 102 + <h2 id="dune-pr"><a href="#dune-pr" class="anchor"></a>Dune PR</h2> 103 + <p>I attended the bi-weekly dune dev meeting to talk about the first part of the dune PR - the bit that Paul Elliot did almost a year ago.</p> 104 + <h2 id="coming-week"><a href="#coming-week" class="anchor"></a>Coming week</h2> 105 + <p>So the clock is ticking on writing the exam questions for FoCS, so I'll need to be spending time this week on that.</p>]]></content> 106 + </entry> 107 + <entry> 108 + <id>https://jon.recoil.org/blog/2026/01/weeknotes-2026-04-05.html</id> 109 + <title>Weeknotes for weeks 4-5</title> 110 + <published>2026-01-30T00:00:00Z</published> 111 + <updated>2026-01-30T00:00:00Z</updated> 112 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/01/weeknotes-2026-04-05.html"/> 113 + <summary>I've been battling the seasonal illnesses this week, so I've combined two weeknotes into one. Fortunately the 'flu doesn't hold Claude back!</summary> 114 + <content type="html"><![CDATA[<h1 id="weeknotes-for-weeks-4-5"><a href="#weeknotes-for-weeks-4-5" class="anchor"></a>Weeknotes for weeks 4-5</h1> 115 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-01-30</p></li></ul> 116 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>odoc.extension_api</p></li></ul> 117 + <ul class="at-tags"><li class="packages"><span class="at-tag">packages</span> <p>odoc-admonition-extension odoc-rfc-extension odoc-msc-extension odoc-mermaid-extension odoc-dot-extension</p></li></ul> 118 + <p>I've been battling the seasonal illnesses this week, so I've combined two weeknotes into one. 
Fortunately the 'flu doesn't hold Claude back!</p> 119 + <p>Probably the most interesting part of this is the <a href="#retrospective" title="retrospective">Retrospective</a>, so make sure to read that bit.</p> 120 + <h2 id="the-last-two-weeks"><a href="#the-last-two-weeks" class="anchor"></a>The Last Two Weeks</h2> 121 + <p>As is becoming more and more apparent, the <em>breadth</em> of what I'm working on is ever expanding, powered by agentic AI. It's become so much (cognitively) cheaper to have an idea and set an agent off investigating it that I've been finding that I'm working in parallel on far more things in a single week than I would have even six months ago. Here are some of the bigger headings though.</p> 122 + <h3 id="monorepo-excitement"><a href="#monorepo-excitement" class="anchor"></a>Monorepo excitement</h3> 123 + <p>We're currently experimenting with a new tool - <a href="https://tangled.org/anil.recoil.org/monopam">monopam</a> to help develop across multiple OCaml libraries by using git subtrees to create a monorepo with all of the packages in. We then extract patches to the individual repos to push upstream. I've been moving my development workflow from in-vscode-claude with careful permissions checking to running claude with `--dangerously-skip-permissions` in a container with the monorepo checked out. 
This has been a bit of a bumpy ride, with the tool evolving daily, but I'm very much seeing the benefits of letting Claude just get on with things, given a strict enough early design and testing strategy, and using Anil's method of creating the interfaces first.</p> 124 + <h4 id="odoc"><a href="#odoc" class="anchor"></a>Odoc</h4> 125 + <p>I also did quite a bit related to odoc these 2 weeks, split over improving functionality and bugfixing.</p> 126 + <h5 id="plugins"><a href="#plugins" class="anchor"></a>Plugins</h5> 127 + <p>Getting Claude to run with all of the monorepo libraries implicitly requires that they're well documented, as looking at the source to figure out how to use them exhausts the context window pretty rapidly. Odoc's main focus has been on getting the expansions and referencing correct, and while we've made progress on the actual content markup, introducing <a href="https://ocaml.github.io/odoc/odoc/odoc_for_authors.html#media">media tags</a> for example, there's still a good distance to go.</p> 128 + <p>Using the plugins mechanism I <a href="weeknotes-2026-03.html" title="weeknotes-2026-03">wrote about last week</a>, I've made a plugin interface for odoc and implemented a few plugins. Initially I was just going to support 'custom tags' but it occurred to me that rendering code blocks could also be done in this way. So I've made a few. Two custom tag plugins:</p> 129 + <ul><li><span class="xref-unresolved" title="/odoc-admonition-extension/index">odoc-admonition-extension</span> - styled callout blocks for notes, warnings, tips. Note that we are intending to make this more first-class - there's a <a href="https://hackmd.io/ETSOAmetTI-E3vrDk3Bfrw">design out there</a>. 
This was just a convenient way to test the feature!</li><li><span class="xref-unresolved" title="/odoc-rfc-extension/index">odoc-rfc-extension</span> - links to IETF RFC documents</li></ul> 130 + <p>and 3 code block plugins:</p> 131 + <ul><li><span class="xref-unresolved" title="/odoc-msc-extension/index">odoc-msc-extension</span> - Message Sequence Charts</li><li><span class="xref-unresolved" title="/odoc-mermaid-extension/index">odoc-mermaid-extension</span> - Mermaid diagrams (flowcharts, sequence diagrams, etc.)</li><li><span class="xref-unresolved" title="/odoc-dot-extension/index">odoc-dot-extension</span> - Graphviz/DOT diagrams</li></ul> 132 + <p>The module signatures relevant to the plugins are documented in <code>/odoc.extension_api/Odoc_extension_api</code> and the plugins each have to implement an interface described in <code>Odoc_extension_api.Code_Block_Extension</code> or <code>Odoc_extension_api.Extension</code> for custom tags.</p> 133 + <h5 id="bugfixing"><a href="#bugfixing" class="anchor"></a>Bugfixing</h5> 134 + <p><a href="https://github.com/lukemaurer">Luke Maurer</a> at Jane Street pointed out that they're still suffering from yet another repro of <a href="https://github.com/ocaml/odoc/issues/930">issue 930</a> at Jane Street. I'd worked on this <a href="../../2025/09/odoc-bugs.html" title="odoc-bugs">back in September</a> but turns out I hadn't actually made a PR, so I tidied up the branch and <a href="https://github.com/ocaml/odoc/pull/1400">made a PR</a>.</p> 135 + <h3 id="docs-ci"><a href="#docs-ci" class="anchor"></a>Docs CI</h3> 136 + <p>Docs CI has been fixed and is even now rebuilding all of the docs for ocaml.org. I've added in the <a href="https://github.com/ocurrent/ocaml-docs-ci/commit/c6231fa383820b4c700aaa1e72107536b1872112">handling of `post &amp; with-doc`</a> in place of x-extra-doc-deps, so we should be able to use either mechanism now. The idea is to deprecate x-extra-doc-deps soon though. 
Somehow, despite there being an explicit button to press to update the epoch symlinks, they got updated anyway and broke most of the docs on ocaml.org. Fortunately <a href="https://discuss.ocaml.org/t/is-caqti-doc-missing/17741/5">someone noticed</a> and posted on discuss and so I switched it back.</p> 137 + <p>Unfortunately, it seemed to be taking a long time to build the docs - at time of writing it's now Friday, and the CI jobs have been running since Tuesday. In that time, it's only managed to build about 6500 packages, a long way short of the 16,000 or so that I expect a full build will produce. Looking through the logs, it seems that some change to opam is causing it to sometimes rebuild the entire opam universe when it should only be building 1 package. For example, in a job that should be building just `tezos-protocol-004-Pt24m4xi`, it installs all of the prebuilt dependencies, then runs `opamh` to try to convince opam that everything is all set up to just run the build step for the package we want. Unfortunately the logs show the following:</p> 138 + <div><pre class="language-ocaml"><code>The following actions will be performed: 139 + === recompile 178 packages 140 + - recompile aches 1.1.0 [uses ocaml] 141 + - recompile aches-lwt 1.1.0 [uses ocaml] 142 + ... 143 + - recompile mtime 2.1.0 [uses ocaml] 144 + - recompile ocaml 4.14.2 [upstream or system changes] 145 + - recompile ocaml-compiler-libs v0.12.4 [uses ocaml] 146 + ...</code></pre></div> 147 + <p>where it seems opam has decided that something has changed enough for it to want to recompile the `ocaml` package, and therefore <i>everything</i> in the entire opam switch! 
So this job took 12 minutes instead of 21 seconds, which was the time required to finally build the `tezos-protocol` package.</p> 148 + <h3 id="day10-and-docs"><a href="#day10-and-docs" class="anchor"></a>Day10 and docs</h3> 149 + <p>In closely related news, <a href="https://tunbury.org/">mtelver's</a> day10 project looked precisely the right shape for building docs - in fact it shares its architecture and some components with the docs CI. So I asked Claude to take a look and see what it would take, and discovered that it doesn't take very much! We have a Really Big Machine here at the CL that was temporarily underused; and by Really Big I mean 768 cores and 3TB of RAM. So, how long could building all of the docs for all of the packages possibly take? Well, it takes 5 hours 40 mins. And I was only using roughly a third of the machine. Nice!</p> 150 + <p>So should I push on with fixing ocaml-docs-ci and figure out why it's rebuilding everything all the time? Or should I forge ahead with day10 and turn it into a proper CI system as opposed to a slightly flakey bespoke thing I have to handhold through a build? This is next week's problem.</p> 151 + <h3 id="js-toplevels"><a href="#js-toplevels" class="anchor"></a>JS toplevels</h3> 152 + <p>Something I keep coming back to is javascript toplevels. I'd really like to be able to host JS toplevels on ocaml.org for each different version of each different package. This is something I've worked on on-and-off for a long time now, and several fixes to help have been merged to various projects along the way. The tricky thing is to not put a massive load onto ocaml.org with this, so we need to be efficient. That means firstly having a single toplevel js file with all of the logic in but none of the libraries, and then dynamically loading libraries as we need them. Also we can save some bandwidth by not immediately sending all of the cmi files, as these can be faulted in as necessary too. 
So once again I've got Claude on the task, and things are honestly looking pretty hopeful now. I've got 2 demos:</p> 153 + <ul><li><a href="https://jon.ludl.am/experiments/findlibish/">Dynamic library loading</a></li><li><a href="https://jon.ludl.am/experiments/multi-universe-demo/">Multi-version support</a></li></ul> 154 + <p>In both cases, make sure you take a look at the network tab to see it dynamically loading only what it needs.</p> 155 + <h2 id="retrospective"><a href="#retrospective" class="anchor"></a>Retrospective</h2> 156 + <h3 id="autonomous-claude"><a href="#autonomous-claude" class="anchor"></a>Autonomous Claude</h3> 157 + <p>The power of sending Claude off to do some work can be immense. However, it does mean investing time up front telling it precisely what problem you're trying to solve, what approach to take, finer details on how you want it done, and how you can tell if it's working when it finishes. A 'failure mode' I've been experiencing is when I end up in a long, drawn out real time interaction, especially if that's happening with 2 projects simultaneously - and by 'failure' I really mean just 'slow'. Ideally what would be going on is for all of my agents to be getting on with whatever task they've been allocated without bothering me for more details. For Claude to have to ask me a question has much more latency involved than it just getting on with things, especially if I don't notice it immediately.</p> 158 + <h3 id="when-to-stop"><a href="#when-to-stop" class="anchor"></a>When to Stop</h3> 159 + <p>The 'finishing criteria' are important - many times this week I've had Claude tell me it's finished something, having verified that it's passing all the tests, only for me to take a look to find that it's very obviously broken. As quite a few things recently have involved the web, I've put Playwright into all of my devcontainers, and told Claude to use it to verify things are working. 
This has been working pretty well, so I'll be adding it to my prompts. It's not too dissimilar to what we used to call 'pre-flight checks' back in the Citrix days.</p> 160 + <h3 id="containers-vs-accounts"><a href="#containers-vs-accounts" class="anchor"></a>Containers vs accounts</h3> 161 + <p>I've been running everything with `--dangerously-skip-permissions` in containers, and while the outcome is amazing, the containers bit has been a bit of a headache. Next week I'll be trialling the idea of just giving the agents their own account (non-admin!) on my servers, their own GitHub account, tangled account and so on, and just treating them more like I would a real colleague. It's always slightly alarming to see my own name on the output of the bots, assigning me (or sometimes someone else (!!)) copyright over code I've never seen. This is, of course, a whole other Pandora's box that I really don't want to open right now - but I think the point is that I'll feel a lot more comfortable if the commits are all by `Jon's Agent &lt;jon+claude@recoil.org&gt;` rather than by me!</p> 162 + <h3 id="deciding-next-steps"><a href="#deciding-next-steps" class="anchor"></a>Deciding next steps</h3> 163 + <p>The question of whether I should fix up ocaml-docs-ci or improve the day10 solution requires a bit of thought. In fact, it requires a bit of a gap analysis between the two. This isn't something I've asked Claude to do before, so I'll try that and see how it turns out. I'll be asking it to be &quot;scientific&quot; in its approach, coming up with hypotheses and verifying them - for which I think I'll need to give it a platform on which it can perform experiments. This is a bit trickier with ocaml-docs-ci than day10 as day10 runs entirely on any given Linux computer, whereas ocaml-docs-ci needs ocurrent workers and a routable ssh server. 
I'll report on the outcome of this next week!</p>]]></content> 164 + </entry> 165 + <entry> 166 + <id>https://jon.recoil.org/blog/2026/01/weeknotes-2026-03.html</id> 167 + <title>Weeknotes for week 3</title> 168 + <published>2026-01-19T00:00:00Z</published> 169 + <updated>2026-01-19T00:00:00Z</updated> 170 + <link rel="alternate" href="https://jon.recoil.org/blog/2026/01/weeknotes-2026-03.html"/> 171 + <summary>First week back of 2026! Let's write some terse weeknotes.</summary> 172 + <content type="html"><![CDATA[<h1 id="weeknotes-for-week-3"><a href="#weeknotes-for-week-3" class="anchor"></a>Weeknotes for week 3</h1> 173 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2026-01-19</p></li></ul> 174 + <p>First week back of 2026! Let's write some terse weeknotes.</p> 175 + <h2 id="projects"><a href="#projects" class="anchor"></a>Projects</h2> 176 + <h3 id="dune-odoc-rules"><a href="#dune-odoc-rules" class="anchor"></a>Dune odoc rules</h3> 177 + <p>Last thing I did last year was to push the new rules for odoc 3. This week, Anil handed me an excellent opportunity to test the rules on the monorepo containing his <a href="https://anil.recoil.org/notes/aoah-2025">AOAH</a> projects. Claude tends to actually write ocamldoc-formatted comments, so this is really useful to test the rules. I've <a href="https://github.com/jonludlam/dune/tree/odoc-v3-rules-3.21">rebased the commits</a> on the just-released Dune 3.21 and we've been trying them out. There were a few things to fix:</p> 178 + <ul><li>More careful <a href="https://github.com/jonludlam/dune/commit/25158eabf0c3cac2826e16ce590b4bd4d7c09818">dependency tracking</a> during the compile phase - this particularly affected the <code>@doc</code> target, which was pulling in unnecessary dependencies. 
Most of these dependencies were compiling just fine, but one - Angstrom - is slightly odd in that the opam install of Angstrom installs a META file that references libraries that aren't in the dependencies of its opam package. This is a backward-compatibility hack that was implemented when the Angstrom package was split into several in order to manage the dependencies better.</li><li>A similar issue happens with eio, where the documentation of the package depends upon <code>bigstring</code>, which isn't in eio's dependencies. This is entirely intentional - the extra doc dependencies are stated in the opam file with an <code>x-extra-doc-deps</code> field. However, <code>opam install</code> totally ignores this field (quite reasonably), and so a simple install gives you an opam repo whose docs can't be built. Once again, this broke <code>dune build @doc</code> unnecessarily, but the fix was <a href="https://github.com/jonludlam/dune/commit/2afe046cf4290d7a83b5f2c5646e3391ca94b630">relatively simple</a>. The <i>real</i> fix here is to not use <code>x-extra-doc-deps</code>, but switch to using a <i>real</i> dependency, but marked with <code>with-doc</code> and <code>post</code> if it would otherwise introduce a circular dependency. That way, an <code>opam install --with-doc</code> <i>would</i> install the extra dependency.</li><li>Over the Christmas break, <a href="https://discuss.ocaml.org/u/tbrk">tbrk</a> posted <a href="https://discuss.ocaml.org/t/odoc-index-for-multiple-packages-inter-package-links-and-local-global-sidebar/17652">on discuss</a> a question about building docs, for which my dune branch was a partial answer. One feature he was requesting though was the ability to use a custom top-level index. 
It's a useful feature that's implemented in <code>odoc_driver</code> so I've <a href="https://github.com/jonludlam/dune/commit/efecdee1b36b7e47906e7c64b7496a1fc7954a2d">added it</a>.</li><li>More sensible <a href="https://github.com/jonludlam/dune/commit/039eb3d2a3e9d28f8b195905f43839daf5ce8c21">default link scope</a>. By default, documentation references in the <code>mli</code> files of a library can link to any other library in the package. However, by default it wasn't possible to link to the dependencies of another library, unless it happened to be a dependency of your own library. Similarly, the package-wide mld files could only reference the modules in the package's libraries, not their dependencies. This seems overly cautious, as we can be sure that if we've managed to build the libraries then their dependencies are installed, and if there are any module name conflicts, we can resolve them via the <code>/&lt;lib&gt;/Module</code> syntax.</li><li>Lastly, implementations of virtual libraries <a href="https://github.com/jonludlam/dune/commit/12f9ecbd4888444c2d359049a914ffb4827912f9">need to be skipped</a> as they've all got the same docs (as they share mli files), and the rules as they stood were causing Dune to crash with a &quot;Conflicting implementations&quot; error.</li></ul> 179 + <p>I've also rebased the PR onto latest <code>main</code>, but I've not yet put these patches there, which I'll need to do for the PR to be mergeable. For now, the 3.21 branch is successfully building the docs for the monorepo.</p> 180 + <h3 id="ocaml-docs-ci"><a href="#ocaml-docs-ci" class="anchor"></a>OCaml Docs CI</h3> 181 + <p><a href="https://github.com/jmid">Jan Midtgaard</a> noticed over xmas that the Docs CI <a href="https://github.com/ocaml/ocaml.org/issues/3437">was broken</a> and submitted <a href="https://github.com/jonludlam/opamh/pull/1">a fix</a>. 
I've therefore been poking <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a> to get the fix incorporated and into production. I almost immediately hit the issue that <code>odoc_driver</code> now breaks for the exact same reason. I couldn't quite understand how <code>opam-format</code> <a href="https://github.com/ocaml/opam-repository/pull/28978">had been merged</a> to <code>opam-repository</code> without someone noticing that it had broken <code>odoc_driver</code>, but it turned out that it <i>had</i> been noticed, but on a <a href="https://github.com/ocaml/opam-repository/pull/28877">beta release</a>. The fix to docs ci was to install <code>odoc_driver</code> from opam rather than <a href="https://github.com/ocurrent/ocaml-docs-ci/blob/81ca17c7b7a2f47ca571b1d6bc866720cebef136/src/lib/config.ml#L226">pinning directly</a> to a github hash, especially if that hash happens to be the hash of the released version!</p> 182 + <p>While I'm working on docs CI, I thought it's probably also a good idea to move over to the <code>with-doc &amp; post</code> suggestion from above, so we're ready for when packages start to use that. This is now being tested, and hopefully we'll have the CI back up and running early next week.</p> 183 + <h3 id="better-styling-for-odoc"><a href="#better-styling-for-odoc" class="anchor"></a>Better styling for odoc</h3> 184 + <p>I've done very little to the styling of odoc since I took maintainership way back in 2019 or so. It's a bit dated, and there are some annoying usability issues, so I thought it's a good opportunity to vibe-code a nice new frontend for it. Rather than hack directly on the HTML generator of odoc, this seemed to be a good opportunity to test the JSON output from the new Dune rules, so I asked Claude to make me a static site generator that read in the JSON files and spat out some nicely styled HTML. 
This worked like a charm, and the results are <a href="https://jon.ludl.am/experiments/vibe-coded-odoc-frontend/">here</a>. Next steps are to see what it would take to get the native odoc output looking more like that.</p> 185 + <h3 id="custom-tags-in-odoc"><a href="#custom-tags-in-odoc" class="anchor"></a>Custom tags in odoc</h3> 186 + <p>One of the themes of Anil's <a href="">AOAH</a> coding spree was that many libraries were implementations of RFCs. In many places in the docs, there are links to relevant sections of the RFCs. It'd be nice in future to be able to validate that we've covered all of the parts of the RFCs, so making the links a little more parsable seemed like a good idea. In fact, it seemed that this might be a perfect use for custom tags - a feature that was present in ocamldoc that odoc has yet to implement.</p> 187 + <p><a href="https://github.com/art-w">Arthur Wendling</a> then pointed me at dune's <a href="https://dune.readthedocs.io/en/stable/reference/dune/plugin.html">plugin system</a>, which seemed just the ticket as a way to implement this. It's really nice, taking all of the hard work out of creating OCaml plugins, so I've now got <a href="https://github.com/jonludlam/odoc/tree/extension-plugins">an extension-plugins branch</a> that implements this. It allows you to add support to odoc for tags like <code>@rfc</code> which generate custom HTML, markdown or any other backend, can include links in their bodies, and can add custom headers to the web page, and custom files to be output by <code>odoc support-files</code>. It looks like this should &quot;just work&quot; and no further changes to the dune rules are needed - though I need to actually test this out.</p> 188 + <h3 id="day10-and-docs"><a href="#day10-and-docs" class="anchor"></a>Day10 and docs</h3> 189 + <p>I've <a href="../../2025/09/build-ids-for-day10.html" title="build-ids-for-day10">written about</a> <a href="https://tunbury.org/">Mark's</a> day10 project before. 
It's a tool to very rapidly build odoc packages mainly in order to test that they build correctly. An obvious extension would be to use this to then build the docs for those packages, as the way we do this requires the packages to be built first. This would be a replacement for the Docs CI that I talked about above, though there's considerable work to do before it's fully-featured enough to be a viable alternative. It seemed like a good time to experiment with this though, so I set up one of Anil's <a href="https://anil.recoil.org/notes/ocaml-claude-dev">devcontainers</a>, gave Claude some instructions on what to do, took the safety belt off, and let him hack away! Previously most of my interactions with Claude had been via the vscode plugin, so using the terminal interface was a bit of a different experience. I'm fairly certain though that I'm going to switch everything over to working this way, as letting Claude just get on with things without having to OK every step is a far more efficient way to work - especially when you're not that concerned with the actual code being produced. This has been mostly a good experience, though Claude does sometimes go off in rather odd directions. At one point there was a network error with a dependency while trying to build odoc_driver, so it decided that it should have a fallback mechanism that executed odoc directly. 
I told it <i>NEVER</i> to replace functionality in odoc_driver, so it rolled this back, but a few hours later it then did exactly the same thing again.</p> 190 + <h3 id="misc-other-stuff"><a href="#misc-other-stuff" class="anchor"></a>Misc other stuff</h3> 191 + <p>A few other things too - <a href="https://github.com/jonludlam/odoc/commit/59037341cd53d8734a5874f7af2b728b5be70035">improving the <code>--warn-error</code> logic in odoc</a>, and one of its <a href="https://github.com/jonludlam/odoc/commit/9d18feff5eda543652c6749062750de6e5bb4d6e">error messages</a>, improving the build of this website so I can iterate on it more quickly, fixing up some of my self-hosted services like my tangled knot, and other bits and bobs.</p> 192 + <h2 id="reflections"><a href="#reflections" class="anchor"></a>Reflections</h2> 193 + <p>I think the most important thing this week has been the slightly eye-opening benefits of using Claude outside of the context of VSCode. I suspect I'll be doing much more of my work this way in future. There's also a good chance I'll have to upgrade my subscription from the $100-per-month to the $200 one...</p> 194 + <h2 id="next-week"><a href="#next-week" class="anchor"></a>Next week</h2> 195 + <ul><li>Start of term tutorial meetings</li><li>Sherldoc in monopam-myspace</li><li>Get ocaml-docs-ci deployed and working</li><li>Update the Dune PR</li><li>Integrate the custom-tags and website generator into monopam-myspace</li><li>Unleash Claude on my js-top-worker repo</li></ul>]]></content> 196 + </entry> 197 + <entry> 198 + <id>https://jon.recoil.org/blog/2025/12/claude-and-dune.html</id> 199 + <title>Claude and Dune</title> 200 + <published>2025-12-18T00:00:00Z</published> 201 + <updated>2025-12-18T00:00:00Z</updated> 202 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/12/claude-and-dune.html"/> 203 + <summary>Back in March of this year we released odoc 3.0.0, a major new version of the OCaml documentation generator. 
It had a whole load of new features, many of which came with new demands on the build system driving it. We decid...</summary> 204 + <content type="html"><![CDATA[<h1 id="claude-and-dune"><a href="#claude-and-dune" class="anchor"></a>Claude and Dune</h1> 205 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-12-18</p></li></ul> 206 + <p>Back in March of this year we released <a href="https://ocaml.github.io/odoc/odoc/index.html">odoc 3.0.0</a>, a major new version of the OCaml documentation generator. It had a whole load of <a href="https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043">new features</a>, many of which came with new demands on the build system driving it. We decided when working on it to build a new driver for odoc so that we could adjust it as we were building the new features, and this driver is now used to <a href="../07/odoc-3-live-on-ocaml-org.html" title="odoc-3-live-on-ocaml-org">build the documentation</a> that appears on <a href="https://ocaml.org/p/base/latest/doc/index.html">ocaml.org</a>. However, it was always the plan to integrate the new features into <a href="https://dune.build">Dune</a> so that everyone could just run <code>dune build @doc</code> and be able to use all of the new odoc 3 features.</p> 207 + <p>So over the last few weeks I have been wrestling with getting Claude to update the odoc rules in Dune to support <i>some</i> of the new features of odoc v3. What began as a background experiment during a lecture series has turned into a multi-week effort to turn mostly-working code into a clean, reviewable patch. 
AI-developed software is clearly going to be a big part of our future, and Anil is showing us all the way with his <a href="https://anil.recoil.org/notes/aoah-2025-1">Advent of Agentic Humps</a> by building <i>new</i> software, but upstreaming AI-generated changes to an existing, well established code base <a href="https://github.com/ocaml/ocaml/pull/14369">hasn't got off to a good start</a> in the OCaml community, so I wanted to be extra careful to get this right.</p> 208 + <h3 id="claude-as-a-protyping-tool"><a href="#claude-as-a-protyping-tool" class="anchor"></a>Claude as a prototyping tool</h3> 209 + <p>The initial progress was pretty amazing, despite my initial worries that the dune code-base would be <a href="https://github.com/ocaml/dune/pull/12529">too large and subtle</a> for an LLM to be able to make workable changes. In order to get going, first I had it look at several bits of example code:</p> 210 + <p>1. <a href="https://github.com/ocaml/dune/blob/3.20.2/src/dune_rules/odoc.ml">dune_rules/odoc.ml</a> - this is the current home of the odoc rules in dune. It's local-only, meaning it only builds the docs for the current package in isolation, so no resolution of links to stdlib, other packages or libraries.</p> 211 + <p>2. <a href="https://github.com/ocaml/dune/blob/3.20.2/src/dune_rules/odoc_new.ml">dune_rules/odoc_new.ml</a> - these are the rules for odoc v2, which allow you to build the docs for your package plus all of the dependencies. I wrote this mostly myself some time ago. It does a pretty poor job of caching, error reporting, and has none of the odoc v3 features like assets, source rendering, hierarchical docs, better errors and so on.</p> 212 + <p>3. <a href="https://github.com/ocaml/odoc/tree/d8460cdaa2b91a03434a9a045d673703b7fabfb2/src/driver">odoc_driver</a> - this is the driver we wrote when building odoc v3. It's fully featured, but not at all incremental, and actually external to the dune codebase. 
It's the reference implementation that's used to build the docs that appear on <a href="https://ocaml.org/p/base/latest/doc/index.html">ocaml.org</a>.</p> 213 + <p>Armed with these three code-bases, I asked Claude to synthesise a new incremental version of the odoc rules for dune that has some of the features of <code>odoc_driver</code>.</p> 214 + <h3 id="the-working-prototype"><a href="#the-working-prototype" class="anchor"></a>The working prototype</h3> 215 + <p>Claude quickly produced a prototype that actually compiled and generated documentation. At that stage I was not interested in the quality of the generated source; I only needed to know whether Claude could navigate Dune's codebase and produce something that <b>works</b>. I let the prototype evolve incrementally, adding in new features one at a time, for example, fixing the error reporting so that you only get warned about documentation errors that you can actually fix.</p> 216 + <p>When the lectures finished, it turned out I had something that was pretty useful to me, and had a good chance to be useful to others too. So I opened up my editor and had a look through what had been produced, at this point hoping that a little bit of polishing should be enough - after all, it <i>was</i> working!</p> 217 + <p>It was dreadful.</p> 218 + <p>There were long, rambling functions, code duplication, bad comments, it was unstructured, with repeated-but-slightly-different chunks all over the place. It wasn't just bad on one length scale - it was bad from the large-scale organisation of the code down to small scale baffling weirdnesses on one line. The more I looked, the more bonkers it appeared. But it did <i>work</i>! 
So I thought I'd get Claude to clean up its own messes.</p> 219 + <h3 id="the-clean-up"><a href="#the-clean-up" class="anchor"></a>The clean-up</h3> 220 + <p>I resolved that I would continue to let Claude do <i>all</i> of the editing, and not do <i>any</i> myself, and so thus began the more frustrating part of this adventure! I ended up giving a mix of very specific instructions: &quot;move this code here&quot;, &quot;factorize out this functionality&quot;, &quot;rename this function&quot;, and sometimes more general ones: &quot;Remove any comments that don't add anything of value&quot;, or &quot;Think of a better way to do this&quot;. The constant was that I needed to be looking over each change that it did, because while most of them were pretty good, there were still a few, even with the very explicit instructions, where it messed up. From the very broad, where at one point it told me &quot;I'll remove this code to create odoc files for external dependencies, as they're installed by opam&quot;, which isn't true, down to the very small - for example, it produced the following:</p> 221 + <div><pre class="language-ocaml"><code>let lib_names = deps.Odoc_config.libraries in 222 + if List.is_empty lib_names 223 + then Memo.return [] 224 + else Memo.List.filter_map lib_names ~f:(fun lib_name -&gt; Lib.DB.find lib_db lib_name)</code></pre></div> 225 + <p>where it has come up with a totally redundant check for the empty list.</p> 226 + <p>It was at this point where it became frustrating, because although it's almost magical that Claude can do what it does in the time it does, this fact of having to keep a constant eye on it meant that the tens-of-seconds to minutes delay in between it doing something meant I ended up either twiddling my thumbs for long periods of time, or getting started on some other task and forgetting to come back to Claude, sometimes for hours!</p> 227 + <h3 id="ocaml-is-not-the-problem"><a href="#ocaml-is-not-the-problem" class="anchor"></a>OCaml is 
<b>not</b> the problem</h3> 228 + <p>One part that particularly impressed, and also quite surprised me, was with its knowledge of OCaml. In particular, I had at one point two different types representing the 'target' - either a library or a package - and a 'kind' - either a module or a page. Now pages can only be associated with package targets, and modules can only be associated with libraries, but these two values were distinct, so there was a fair bit of code pattern matching invalid combinations and either throwing exceptions or picking some random value, depending on the whims of Claude's context. I bravely suggested it think of a better way to represent this, maybe using GADTs, and it did indeed come up with a pretty nice refactoring of the types:</p> 229 + <p>Before:</p> 230 + <div><pre class="language-ocaml"><code>type target = 231 + | Lib of Package.Name.t * Lib.t 232 + | Pkg of Package.Name.t 233 + 234 + type artifact_kind = 235 + | Module of 236 + { visible : bool 237 + ; module_name : Module_name.t 238 + ; archive : string (* Which archive the module belongs to *) 239 + } 240 + | Page of 241 + { name : string 242 + ; pkg_libs : Lib.t list 243 + }</code></pre></div> 244 + <p>After:</p> 245 + <div><pre class="language-ocaml"><code>(* Artifact data types *) 246 + type page = { name : string; pkg_libs : Lib.t list } 247 + 248 + type mod_ = 249 + { visible : bool 250 + ; module_name : Module_name.t 251 + ; archive : string (* Which archive the module belongs to *) 252 + } 253 + 254 + type _ target = 255 + | Lib : Package.Name.t * Lib.t -&gt; mod_ target 256 + | Pkg : Package.Name.t -&gt; page target 257 + 258 + type artifact_kind = 259 + | Module : mod_ * mod_ target -&gt; artifact_kind 260 + | Page : page * page target -&gt; artifact_kind</code></pre></div> 261 + <p>This refactoring immediately removed a whole swathe of invalid combinations, making the code both safer and clearer. 
It's quite clear that Claude had no trouble understanding how GADTs work in OCaml, quite happily also using some existentials to pack them into lists and so on.</p> 262 + <h3 id="odd-behaviours"><a href="#odd-behaviours" class="anchor"></a>Odd behaviours</h3> 263 + <p>Sometimes Claude just went a little bit bananas. One annoyance that <i>repeatedly</i> occurred was that it would forget how to build and test the dune executable, despite clear instructions in <code>Claude.md</code>. Most of the time when it went wrong it would build dune, execute <code>dune clean</code>, then try to run the dune binary that it had just removed with the <code>clean</code>. Sometimes it would decide to use the bootstrap binary instead, which isn't rebuilt on every change, sometimes it would run the switch-installed dune binary, and on one occasion it tried to run <code>./configure &amp;&amp; make</code>!</p> 264 + <p>It would usually figure out eventually what the right thing to do was, but when you're waiting for it to complete so you can check what it's done these sorts of delays got a bit frustrating.</p> 265 + <h3 id="reflections"><a href="#reflections" class="anchor"></a>Reflections</h3> 266 + <p>At one point, I ran out of Claude credits (despite paying $100 a month or so), at about 6:20pm one evening, and it told me that I needed to wait until 7pm to carry on. I'd just got to the point when I needed to write a short bit of code rather than refactoring what was already there, and I realised that while it would take me maybe 10 mins, it would take Claude maybe 10 seconds. Now, it could just be that it was the end of a long day and I was running out of steam, but I was content to switch focus elsewhere for a bit to wait for my credits to reset before carrying on! 
The point being that for the small implementation that I was after, it would be possible for me to get Claude to do it, and to eyeball the result to make sure it was OK in less time than I would have been able to do it myself. But I absolutely wouldn't have trusted Claude to do it in an upstreamable way <b>without</b> looking at the result.</p> 267 + <p>Overall, it's clear that Claude will be an incredibly useful tool for working with software. It's unbelievably good at jumping into a new code-base and figuring things out quickly, but less good at producing high-quality code that can be directly submitted upstream (yet?) - at least, not that <b>I</b> would be comfortable submitting anyway. However, I think it's still a bit of an open question as to what the quality bar <em>should</em> be. If it builds correctly, passes the tests, looks <i>broadly</i> sensible and isn't on the critical path for performance, how much should we care about the line-to-line quality? <b>I</b> certainly care, but am I being old fashioned?</p> 268 + <p>I've submitted a <a href="https://github.com/ocaml/dune/pull/12995">PR with these changes</a> for review, and we'll see what happens there. I ended up squashing all of the commits into one, as the intermediate steps are very likely not useful. However, for historical interest, the branch on which I did most of the work is <a href="https://github.com/ocaml/dune/compare/main...jonludlam:dune:odoc3-global-sidebar">here</a>.</p>]]></content> 269 + </entry> 270 + <entry> 271 + <id>https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html</id> 272 + <title>An SVG is all you need</title> 273 + <published>2025-12-09T00:00:00Z</published> 274 + <updated>2025-12-09T00:00:00Z</updated> 275 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html"/> 276 + <summary>SVGs are pretty cool - vector graphics in a simple XML format. 
They are supported on just about every device and platform, are crisp on every display, and can have embedded scripts in to make them int...</summary> 277 + <content type="html"><![CDATA[<h1 id="an-svg-is-all-you-need"><a href="#an-svg-is-all-you-need" class="anchor"></a>An SVG is all you need</h1> 278 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-12-09</p></li></ul> 279 + <p>SVGs are pretty cool - vector graphics in a simple XML format. They are supported on just about every device and platform, are crisp on every display, and can have embedded scripts in to make them interactive. They're <a href="https://www.youtube.com/watch?v=4laPOtTRteI">way more capable</a> than many people realise, and I think we can capitalise on some of that unrealised potential.</p> 280 + <p>Anil's recent post <a href="https://anil.recoil.org/notes/principles-for-collective-knowledge">Four Ps for Building Massive Collective Knowledge Systems</a> got me thinking about the permanence of the experimentation that underlies our scientific papers. In my idealistic vision of how scientific publishing should work, each paper would be accompanied by a fully interactive environment where the reader could explore the data, rerun the experiments, tweak the parameters, and see how the results changed. Obviously we can't do this in the general case - some experiments are just too expensive or time-consuming to rerun on demand. But for many papers, especially in computer science, this is entirely feasible.</p> 281 + <p>That line of thought reminded me of a project I tackled about 20 years ago as a post-doc in the Department of Plant Sciences here in Cambridge. 
I was writing a paper on <a href="https://royalsocietypublishing.org/rsif/article/9/70/949/173/Applications-of-percolation-theory-to-fungal">synergy in fungal networks</a> and built a tiny SVG visualisation tool that let readers wander through the raw data captured from a real fungal network growing in a petri dish. I dug it up recently and was surprised (and delighted) to see that it still works perfectly in modern browsers - even though the original “cover page” suggested Firefox 1.5 or the Adobe SVG plug-in (!). Give it a spin; click the 'forward', 'back' and other buttons below the petri dish!</p> 282 + <div><a href="fungus.svg" class="img-link"><img src="fungus.svg" alt="fungus.svg"/></a></div> 283 + <p>And that, dear reader, is literally all you need. A completely self-contained SVG file can either fetch data from a versioned repository or embed the data directly, as the example does. It can process that data, generate visualisations, and render knobs and sliders for interactive exploration. No server-side magic required - everything runs client-side in the browser, served by a plain static web server, and very easy to share.</p> 284 + <p>How does it fit in with Anil's four Ps?</p> 285 + <ul><li>Permanence: SVGs can be assigned DOIs just like papers, blog posts, or datasets. The fact that the above SVG still works after two decades is a testament to the durability of the format.</li></ul> 286 + <ul><li>Provenance: Because SVG is plain text, it plays nicely with version control systems such as Git. 
When an SVG pulls in external data, the same provenance-tracking strategies Anil describes for datasets apply here as well.</li></ul> 287 + <ul><li>Permission: Once again, with the separation between the processing in the SVG and the data that it works on, the same permissioning models apply as for data in general.</li></ul> 288 + <ul><li>Placement: SVGs are <i>inherently</i> spatial; it's very easy, for example, to make beautiful <a href="https://stephanwagner.me/coding/blog/create-world-map-charts-with-svgmap#svgMapDemoGDP">world maps</a> with SVG.</li></ul> 289 + <p>The SVG above is only a visualisation tool for data; it doesn't really do any processing, but it certainly <i>could</i>. The biggest change that's happened over the 20 years since I wrote this is the <i>massive</i> increase in the computation power available in the browser. It would be entirely feasible to implement the entire data analysis pipeline for that paper in an SVG today, probably without even spinning up the fans on my laptop!</p> 290 + <p>So this is yet another tool in our ongoing effort to be able to effortlessly share and remix our work - added to the pile of Jupyter notebooks, <a href="https://digitalflapjack.com/blog/marimo/">Marimo notebooks</a>, the <a href="https://slipshow.readthedocs.io/en/stable/">slipshow</a>/<a href="https://github.com/art-w/x-ocaml/">x-ocaml</a> <a href="../11/foundations-of-computer-science.html" title="foundations-of-computer-science">combination</a>, <a href="https://patrick.sirref.org/weekly-2025-w45/index.xml">Patrick's take</a> on Jon Sterling's <a href="https://sr.ht/~jonsterling/forester/">Forester</a>, my own <a href="../../../notebooks/index.html" title="index">notebooks</a>, and many others - and this is a subset of what we're using just in our own group!</p>]]></content> 291 + </entry> 292 + <entry> 293 + <id>https://jon.recoil.org/blog/2025/11/foundations-of-computer-science.html</id> 294 + <title>Foundations of Computer Science</title> 295 + 
<published>2025-11-14T00:00:00Z</published> 296 + <updated>2025-11-14T00:00:00Z</updated> 297 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/11/foundations-of-computer-science.html"/> 298 + <summary>I recently completed lecturing the course to our newly arrived first-year computer scientists here at . This is the first time I've lectured this course, taking over from while he's on sabbatical. A...</summary> 299 + <content type="html"><![CDATA[<h1 id="foundations-of-computer-science"><a href="#foundations-of-computer-science" class="anchor"></a>Foundations of Computer Science</h1> 300 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-11-14</p></li></ul> 301 + <p>I recently completed lecturing the course <a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/">&quot;Foundations of Computer Science&quot;</a> to our newly arrived first-year computer scientists here at <a href="https://www.cam.ac.uk">Cambridge</a>. This is the first time I've lectured this course, taking over from <a href="https://anil.recoil.org/">Anil</a> while he's on sabbatical. Although I was very nervous indeed about it, I ended up really enjoying the experience - and I hope the students did too! This post is a little brain dump of my thoughts on how it went and how we might improve it for next year.</p> 302 + <h2 id="course-overview"><a href="#course-overview" class="anchor"></a>Course Overview</h2> 303 + <p>The course is 12 lectures long and has been lectured in a similar way since I myself was an undergraduate here, way back in 1996. There have been a few changes, not least of which is that back then it was in Standard ML rather than OCaml, but the core material has remained largely the same: lists, recursive functions, trees, higher-order functions, search and finally mutability. 
There are no prerequisites for the course, although all students have at least a maths A-level (or equivalent), and almost all of them have done some programming before, though the experience varies widely. Very few have done any functional programming, and even fewer have written any OCaml before.</p> 304 + <p>The notes for the course are distributed both in hard copy and also as an <a href="https://github.com/ocamllabs/focs-notebooks/blob/main/1A%20Foundations%20of%20Computer%20Science.ipynb">interactive Jupyter Notebook</a>, which we host on our <a href="https://hub.cl.cam.ac.uk">JupyterHub server</a> that I maintain. The idea is that the students can read through the notes and then play around with the code examples directly in the notebook. I don't encourage them or give them time to do much <i>during</i> the lectures - not that I think this is a terrible idea, but it's a struggle to fit all the material in otherwise! The notes are pretty closely coupled to the lectures, organised into 11 chapters that correspond to the first 11 lectures, with exercises at the end of each chapter that are intended to be covered in the supervisions. We also have some assessed exercises - &quot;Ticks&quot; - that the students complete in their own time using the JupyterHub server using <a href="https://github.com/jupyter/nbgrader">nbgrader</a>. They are automatically assessed in a very transparent way; each &quot;tick&quot; is a Jupyter notebook with editable answer cells and read-only test cells. Overall we're aiming for the students not to <i>have</i> to install OCaml locally at all, though I hope many of them will choose to do so anyway.</p> 305 + <p>While I didn't want them playing around with the notebook during the lectures, I do, however, try to get them to interact by getting them to answer questions. 
It's pretty intimidating to stick your head above the parapet like this, so as an incentive I rewarded those that answered (rightly or wrongly) with some of the excellent stickers that Tarides has printed over the years. Everybody loves stickers!</p> 306 + <p>The questions I asked varied quite a lot in their difficulty, and many were in the first few minutes of each lecture, where I had a short 'warm-up' where we recapped the contents of the previous lecture. These warm-ups were strongly suggested by Anil, and as well as reminding everyone of where we left off, they also gave me a bit of feedback on the things that the students found challenging.</p> 307 + <p>One entertaining aspect is that during the first lecture I do actually encourage them to at least log on to the JupyterHub server, mostly to get them used to the idea of trying it. The entertaining part is that our server isn't particularly big and beefy, and so with 130 students all trying to log on at once, it invariably caves in under the load. At this point in the lecture I ssh to the server and run btop/htop and we watch it die in real time!</p> 308 + <h2 id="what-changed-this-year"><a href="#what-changed-this-year" class="anchor"></a>What changed this year</h2> 309 + <p>During the lectures themselves, rather than use Keynote or PowerPoint for the slides, I decided to try using <a href="https://slipshow.readthedocs.io/en/stable/">Slipshow</a>, augmented with <a href="https://github.com/art-w/x-ocaml">x-ocaml</a> to embed executable OCaml code snippets. I'm very happy with how this worked out. I was able to prepare both working and broken snippets, modify them live during the lecture, and things like type-on-hover were very useful. In a few lectures where we were discussing big-O notation, I was able to run code on different input sizes and really demonstrate the big difference in run-time of certain algorithms. 
After the lectures, I posted the slides onto the course website so that students can refer back to them, and they can also try out the live code snippets directly in the slides.</p> 310 + <p>Both Slipshow and x-ocaml are still quite young projects, so it was inevitable that there were a few rough edges, and in fact the interaction of the two revealed the biggest problem: that when you use the 'speaker-view' mode of Slipshow, where you have a separate window with notes and the current slide, the x-ocaml widgets are effectively independent in the two windows, so updating in one doesn't update in the other. <a href="https://choum.net/panglesd/">Paul-Elliot</a>, the author of Slipshow, had already got a potential fix for this in the works when I spoke to him about it, so hopefully next time I use this I'll be able to have speaker notes on screen, instead of hand-written index cards! The x-ocaml project is a lot smaller than Slipshow, so I was able to use Claude to help me add functionality I needed, such as being able to programmatically highlight sections of the code.</p> 311 + <p>Another new thing I tried this year was to go over 'tracing' of execution to help the students understand how programs run. We've always taught reduction steps in the course, which works well as it's only the last lecture where we introduce mutability, but it can quickly become unwieldy, and it can be challenging to do this all by hand. Tracing a function tells the runtime to log when function calls and returns happen, so you just need to call the function on your desired input, and you get a fully automatic trace of the execution. As it's only function calls and returns, it doesn't tell the full story, but alongside the handwritten reduction, it can help reassure students that they're on the right track. 
I ended up writing up a trace of a particularly complicated lazy-list evaluation using Slipshow and x-ocaml, which I posted <a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/interleave_explanation.html">here</a>.</p> 312 + <h2 id="thoughts-for-next-year"><a href="#thoughts-for-next-year" class="anchor"></a>Thoughts for next year</h2> 313 + <p>Overall I'm very happy with how the course went this year, though in some ways it did feel a little bit like the course finished just when it had started to get to the good stuff! There's a Tripos review process going on at the moment, so maybe we'll get to expand this course a bit in future years.</p> 314 + <p>While the Slipshow+x-ocaml combination worked well, the fact that we ended up with two separate systems for executing OCaml wasn't ideal. I think it'd be a really nice project to investigate just how far we can push x-ocaml / Slipshow / some other web technology to have a true &quot;serverless&quot; experience so we can ditch the JupyterHub server entirely. By caching the x-ocaml 'execution' web worker in the browser, we could have a system that works fully offline, removing an annoyingly failure-prone single point of failure. Of course, we'd still need some way to do the assessed exercises, but that's a small point in a much larger problem: we really can't continue to ignore how LLMs are impacting the way that students are approaching these exercises in both positive and negative ways. 
To answer this properly, we need to think hard about what the purpose of these exercises is and look around to see what our <a href="https://eecs.iisc.ac.in/people/prof-viraj-kumar/">colleagues</a> are doing <a href="https://dl.acm.org/doi/10.1145/3724363.3729100">in this space</a>.</p> 315 + <p>The slide decks themselves are fully open and available on the <a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/">course website</a>:</p> 316 + <ol><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture1/lecture1.html">Introduction</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture2/lecture2.html">Recursion and Complexity</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture3/lecture3.html">Lists and Polymorphism</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture4/lecture4.html">More Lists and Making Change</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture5/lecture5.html">Sorting</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture6/lecture6.html">Datatypes and Trees</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture7/lecture7.html">Dictionaries and Functional Arrays</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture8/lecture8.html">Currying</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture9/lecture9.html">Sequences, or Lazy Lists</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture10/lecture10.html">Search</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture11/lecture11.html">Procedural Programming</a></li><li><a href="https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture12/lecture12.html">Recap and Real World Use!</a></li></ol>]]></content> 317 + </entry> 318 + <entry> 319 + <id>https://jon.recoil.org/blog/2025/09/caching-opam-solutions2.html</id> 320 + <title>Caching opam solutions - part 2</title> 
321 + <published>2025-09-23T00:00:00Z</published> 322 + <updated>2025-09-23T00:00:00Z</updated> 323 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/caching-opam-solutions2.html"/> 324 + <summary>Some results from the . This time I've run day10 on 144 or so commits from opam-repository to see how well the cache performs. The results are quite interesting.</summary> 325 + <content type="html"><![CDATA[<h1 id="caching-opam-solutions---part-2"><a href="#caching-opam-solutions---part-2" class="anchor"></a>Caching opam solutions - part 2</h1> 326 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-23</p></li></ul> 327 + <p>Some results from the <a href="caching-opam-solutions.html" title="caching-opam-solutions">previous post</a>. This time I've run day10 on 144 or so commits from opam-repository to see how well the cache performs. The results are quite interesting.</p> 328 + <p>First let's talk about the &quot;examination map&quot;. This is a map from package name to a list of other packages whose solutions should be recalculated if the package in question is altered. It's built by first looking at the packages that the solver asks about during the solution for a package, and then taking <em>all</em> of the solutions, and 'inverting' the map, so for example, if both packages 'a' and 'b' ask about package 'c' during their solutions, then altering 'c' means that the solutions for both 'a' and 'b' need to be recalculated. The examination map entry for 'c' would then be <code>'a'; 'b'</code>. 
We can plot the histogram of the sizes of each entry in the examination map:</p> 329 + <div><a href="examination_map_histogram.svg" class="img-link"><img src="examination_map_histogram.svg" alt="Package Examiner Distribution Histogram"/></a></div> 330 + <p>Some interesting features from these data:</p> 331 + <ul><li>The most common number of observers is 1, meaning that the package is not involved in the solution of any other package. There are approximately 2000 such packages.</li><li>Most (~80%) of packages have fewer than 100 observers. This means that if we alter one of these packages, we only need to recalculate the solutions for fewer than 100 other packages.</li><li>A <em>very</em> small number of packages are observed in all 4,400 solutions. This is actually a bit artificial, as the solver adds the ocaml-compiler package as an input to all solves to ensure we get the correct compiler version. There's another way to do this which would avoid this particular problem.</li><li>A small number of packages have a very large number of observers, around 3800. This mostly corresponds with <code>dune</code> and its dependencies and associated packages. There are around 350 such packages, and any change to these means we need to recalculate most of the solutions.</li></ul> 332 + <p>This last point doesn't mean that we actually <em>recompile</em> 3,800 packages, just that we need to recalculate the solution, which might then lead to a cache hit of the layer and no actual compilation. However, recalculating the solutions of all of the packages takes (on my computer) around 10,000 seconds, or roughly 5 minutes of wall-clock time as I've got 32 threads.</p> 333 + <p>However, if the package that's changed <i>isn't</i> one of those 350 packages, then the number of solutions that need to be recalculated is dramatically reduced. 
I ran the logic over the last few weeks of commits to opam-repository, from commit <code>109398e2fd61803126becd398df0f1eabc9f3ca2</code> of the 10th September up until commit <code>3f21ebe342ce440d9c9142ffe1185d8e5a326085</code> from the 22nd. In this time there were 144 commits (counting only those from <code>git log --first-parent</code>). Of these, only 4 resulted in a full resolve - the first commit, since obviously we have no cache at that point, the <a href="https://github.com/ocaml/opam-repository/commit/40283204789e7116e1c99466de902cd565d121cf">release of OCaml 5.4.0 beta2</a> by <a href="https://perso.quaesituri.org/florian.angeletti/">Florian Angeletti</a>, a fix of <a href="https://github.com/ocaml/opam-repository/commit/6ef6813522b6ea29933f6451236a1639bdbaec61">ocaml-base-compiler for MSVC</a> by <a href="https://www.dra27.uk/blog/">David</a> and a fix for <a href="https://github.com/ocaml/opam-repository/commit/d141887ab0b4fc0836ad0787f1f806585a260bc8">BER-OCaml</a> by <a href="https://www.cl.cam.ac.uk/~jdy22/">Jeremy Yallop</a>. Then 25 commits resulted in recalculating solutions for 3800 packages as they hit dune-adjacent packages, 5 commits resulted in recalculating between 100 and 300 packages and the remaining 110 commits resulted in recalculating fewer than 100 packages, the majority of which resulted in recalculating fewer than 5 packages.</p> 334 + <p>Overall, at a rough estimate, this means that over this period, using this caching strategy gave us a 5x speedup in the solver!</p>]]></content> 335 + </entry> 336 + <entry> 337 + <id>https://jon.recoil.org/blog/2025/09/odoc-bugs.html</id> 338 + <title>Odoc bugs</title> 339 + <published>2025-09-22T00:00:00Z</published> 340 + <updated>2025-09-22T00:00:00Z</updated> 341 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/odoc-bugs.html"/> 342 + <summary>This post is a brief write-up of a couple of bugs in odoc that I've been working on over the past 2 weeks. 
I was convinced at the start of this that I was actually fixing one bug, but although they bo...</summary> 343 + <content type="html"><![CDATA[<h1 id="odoc-bugs"><a href="#odoc-bugs" class="anchor"></a>Odoc bugs</h1> 344 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-22</p></li></ul> 345 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>odoc.model</p></li></ul> 346 + <p>This post is a brief write-up of a couple of bugs in odoc that I've been working on over the past 2 weeks. I was convinced at the start of this that I was actually fixing one bug, but although they both had the same backtrace and similar immediate causes, they're actually quite different. They both involve <em>expansion</em>, which is the process that odoc uses to work out the contents of a module from its expression - what allows you to see the contents of a module such as <code>module M = Map.Make(String)</code>.</p> 347 + <h3 id="bug-930:-inline-destructive-substitutions"><a href="#bug-930:-inline-destructive-substitutions" class="anchor"></a>Bug 930: inline destructive substitutions</h3> 348 + <p>Bug #930 in odoc is about a substitution problem:</p> 349 + <div><pre class="language-ocaml"><code>module type S1 = sig 350 + type t0 351 + type 'a t := unit 352 + 353 + val x : t0 t 354 + end 355 + 356 + module type S2 = sig 357 + type t (* must be the same name as [S1.t] *) 358 + 359 + include S1 with type t0 := t 360 + end 361 + 362 + module type S3 = sig 363 + type t1 364 + 365 + include S2 with type t := t1 366 + end</code></pre></div> 367 + <p>which when processed by odoc 2.4 throws an exception:</p> 368 + <pre>odoc: internal error, uncaught exception: 369 + Invalid_argument(&quot;List.fold_left2&quot;) 370 + Raised at Stdlib.invalid_arg in file &quot;stdlib.ml&quot;, line 33, characters 20-45 371 + Called from Odoc_xref2__Subst.type_expr in file &quot;subst.ml&quot;, line 598, characters 21-59 372 + 
Called from Odoc_xref2__Subst.value in file &quot;subst.ml&quot; (inlined), line 842, characters 19-38 373 + Called from Odoc_xref2__Subst.apply_sig_map.inner.(fun) in file &quot;subst.ml&quot;, line 1089, characters 19-52 374 + Called from Odoc_xref2__Component.Delayed.get in file &quot;component.ml&quot; (inlined), line 55, characters 16-22 375 + Called from Odoc_xref2__Lang_of.signature_items.inner in file &quot;lang_of.ml&quot;, line 438, characters 16-39 376 + Called from Odoc_xref2__Lang_of.signature in file &quot;lang_of.ml&quot; (inlined), line 466, characters 12-43 377 + Called from Odoc_xref2__Lang_of.include_ in file &quot;lang_of.ml&quot;, line 641, characters 18-69</pre> 378 + <p>The key thing here is the definition of <code>'a t</code> in <code>S1</code> - a destructive substitution. If you type this code into an OCaml toplevel, you will see that the signature of <code>S1</code> is:</p> 379 + <div><pre class="language-ocaml"><code>module type S1 = sig 380 + type t0 381 + type 'a t := unit 382 + 383 + val x : t0 t 384 + end</code></pre></div> 385 + <p>where the substitution has clearly taken place. In contrast, odoc takes the position that the use of these inline destructive substitutions is to make the code easier to understand, and so it tries to keep them in the signature rather than simply apply them and present the resulting signature. 
So when rendering <code>S1</code> we end up with:</p> 386 + 387 + <div class="inset" style="border: 1px solid var(--pre-border-color); padding: 10px; border-radius: 5px"> 388 + <a id="module-type-S1" class="anchor"></a><h2>Module type <code><span>S1</span></code></h2> 389 + <div class="odoc-spec"><div class="spec type anchored" id="type-t0"><a href="#type-t0" class="anchor"></a><code><span><span class="keyword">type</span> t0</span></code></div></div><div class="odoc-spec"><div class="spec type subst anchored" id="type-t"><a href="#type-t" class="anchor"></a><code><span><span class="keyword">type</span> <span>'a t</span></span><span> := unit</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-x"><a href="#val-x" class="anchor"></a><code><span><span class="keyword">val</span> x : <span><a href="#type-t0">t0</a> <a href="#type-t">t</a></span></span></code></div></div> 390 + </div> 391 + 392 + <p>The reported problem is a failure with a stack trace while processing <code>S3</code>, but upon looking closely the real problem has happened when expanding <code>S2</code>. What happens is that we have a type <code>t</code> defined in <code>S2</code> and a type <code>t</code> that will later be substituted away that comes from the inclusion of <code>S1</code>. 
The rendered signature of <code>S2</code> is:</p> 393 + 394 + <div class="inset" style="border: 1px solid var(--pre-border-color); padding: 10px; padding-right:30px; border-radius: 5px"> 395 + <a id="module-type-S2" class="anchor"></a><h2>Module type <code><span>S2</span></code></h2> 396 + <div class="odoc-spec"><div class="spec type anchored" id="type-s2-t"><a href="#type-s2-t" class="anchor"></a><code><span><span class="keyword">type</span> t</span></code></div></div><div class="odoc-include"><details open="open"><summary class="spec include"><code><span><span class="keyword">include</span> <a href="#module-type-S1">S1</a> <span class="keyword">with</span> <span><span class="keyword">type</span> <a href="#type-t0">t0</a> := <a href="#type-s2-t">t</a></span></span></code></summary><div class="odoc-spec"><div class="spec type subst anchored" id="type-s2-t"><a href="#type-s2-t" class="anchor"></a><code><span><span class="keyword">type</span> <span>'a t</span></span><span> := unit</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-x"><a href="#val-x" class="anchor"></a><code><span><span class="keyword">val</span> x : <span><a href="#type-s2-t">t</a> <a href="#type-s2-t">t</a></span></span></code></div></div></details></div> 397 + </div> 398 + 399 + <p>where the type of <code>x</code> is now <code>t t</code>, which is clearly incorrect. The problem is that odoc assumes that type names are unique within a signature (modulo shadowing, which isn't quite what's going on here), but in this signature there are two definitions of <code>type t</code>, one of which is parameterised and one is not. 
At this point nothing fatal has happened, but when we try to process <code>S3</code> the substitution code gets very confused by these different arities and <code>List.fold_left2</code> throws the above exception.</p> 400 + <p>The fix I'm trialling for this is that when we're including a signature that contains an inline destructive substitution, we will perform that substitution when the expansion of the include is done. This means that the rendered signature of <code>S1</code> will be just the same as before, but the rendered signature of <code>S2</code> will now be:</p> 401 + 402 + <div class="inset" style="border: 1px solid var(--pre-border-color); padding: 10px; padding-right:30px; border-radius: 5px"> 403 + <a id="module-type-newS2" class="anchor"></a><h2>Module type <code><span>S2</span></code></h2> 404 + <div class="odoc-spec"><div class="spec type anchored" id="type-s2new-t"><a href="#type-s2new-t" class="anchor"></a><code><span><span class="keyword">type</span> t</span></code></div></div><div class="odoc-include"><details open="open"><summary class="spec include"><code><span><span class="keyword">include</span> <a href="#module-type-S1">S1</a> <span class="keyword">with</span> <span><span class="keyword">type</span> <a href="#type-t0">t0</a> := <a href="#type-s2new-t">t</a></span></span></code></summary><div class="odoc-spec"><div class="spec value anchored" id="val-x"><a href="#val-x" class="anchor"></a><code><span><span class="keyword">val</span> x : unit</span></code></div></div></details></div> 405 + </div> 406 + 407 + <p>where the type of <code>x</code> is now simply <code>unit</code>, which is what OCaml itself thinks, happily! 
I think this strikes the balance between keeping the substitutions visible for clarity where they are originally defined, but when including them elsewhere we simply see the resulting signature.</p> 408 + <h3 id="bug-#1385:-exception-raised-during-compilation"><a href="#bug-#1385:-exception-raised-during-compilation" class="anchor"></a>Bug #1385: Exception raised during compilation</h3> 409 + <p>The second bug has the identical backtrace, indicating a problem with arities. However, the repro case for this one does not involve any inline destructive substitution, though it does involve destructive substitution at the module expression level:</p> 410 + <div><pre class="language-ocaml"><code>module type Creators_base = sig 411 + type ('a, _, _) t 412 + type (_, _, _) concat 413 + 414 + include sig 415 + type ('a, 'b, 'c) t 416 + 417 + val concat : (('a, 'p1, 'p2) t, 'p1, 'p2) concat -&gt; ('a, 'p1, 'p2) t 418 + end 419 + with type ('a, 'b, 'c) t := ('a, 'b, 'c) t 420 + end 421 + 422 + module type S0_with_creators_base = sig 423 + type t 424 + 425 + include Creators_base with type ('a, _, _) t := t and type ('a, _, _) concat := t 426 + end</code></pre></div> 427 + <p>There's quite a lot of type parameters flying around here, so the first step was to try to simplify this as much as possible while still getting the exception. I got it down to:</p> 428 + <div><pre class="language-ocaml"><code>module type Creators_base = sig 429 + type 'a t 430 + type _ concat 431 + 432 + include sig 433 + type 'a t 434 + 435 + val concat : 'a concat -&gt; 'a t 436 + end 437 + with type 'a t := 'a t 438 + end 439 + 440 + module type S0_with_creators_base = sig 441 + type t 442 + 443 + include Creators_base with type _ t := t with type _ concat := t 444 + end</code></pre></div> 445 + <p>which still throws the same exception. So, what's going on here? 
Fundamentally, it's a similar issue to the first bug, just caused in a different way, in that once again we'll end up with a signature that has two definitions of <code>type t</code> with different arities. In this case, the problem occurs during the expansion of <code>S0_with_creators_base</code>.</p> 446 + <p>This is the intermediate expansion of <code>Creators_base</code> that odoc calculates:</p> 447 + <div><pre class="language-ocaml"><code>module type S0_with_creators_base = sig 448 + type t 449 + 450 + include Creators_base with type _ t := t with type _ concat := t (* 451 + 452 + The expansion as calculated by odoc is: 453 + 454 + include sig 455 + type 'a t 456 + val concat : t -&gt; 'a t 457 + end 458 + *) 459 + end</code></pre></div> 460 + <p>What's happened here is that, during the calculation of the body of the include, odoc has taken the signature of <code>Creators_base</code> and replaced both of its type definitions with <code>type t</code> (with no parameters). However, since the <code>type t</code> in the body of the include is defined in that signature, that one wasn't replaced. So we end up with the type of <code>concat</code> being <code>t -&gt; 'a t</code>, which looks very odd! At this point though, odoc knows very well that they're different types. However, when odoc converts this signature back into the datatype that represents the expansions, it loses that information and we end up with the two types mixed up. When we then go on to process this signature, the mixup of the arities causes the failure.</p> 461 + <p>There are several independent fixes that we can make here. Firstly we can make sure that we don't mix up the types. This we can do because we can distinguish between items that are declared within the signature of the include's declaration and those that come from the outer context. 
We don't have to do this for the expansion of the include as OCaml's type system means that there can't be two types of the same name in the resulting signature. We never actually render any signature that occurs within the body of an include, so this doesn't actually make any difference to the output.</p> 462 + <p>The second fix is to make sure that we only calculate the expansion of the include once. Currently the bug happens because we try to re-calculate the expansion of the <code>include sig ... end</code> expression, even though we calculated it during the processing of <code>S0_with_creators_base</code>. What we should do instead is apply the substitutions to the expansion of that calculated include, which would end up with the same result. This isn't a perfect solution though, as there are occasions when we have to recalculate the signature anyway.</p> 463 + <p>The third fix is - and this takes a little care to parse - to ensure that we never actually try to process the items within a signature within a &quot;with&quot; expression within a module-type expression. Before diving into the 'why' of this, let's first explain how Odoc represents module-type expressions.</p> 464 + <p>Internally, we have <span class="xref-unresolved" title="Odoc_model.Lang.ModuleType.expr">a datatype</span> that represents module expressions, which looks like this:</p> 465 + <div><pre class="language-ocaml"><code>type expr = 466 + | Path of path_t 467 + | Signature of Signature.t 468 + | Functor of FunctorParameter.t * expr 469 + | With of with_t 470 + | TypeOf of typeof_t</code></pre></div> 471 + <p>Now, each of the arguments to these constructors might contain an expansion of the expression that Odoc will calculate. 
For example, the definition of <span class="xref-unresolved" title="Odoc_model.Lang.ModuleType.path_t">path_t</span> is:</p> 472 + <div><pre class="language-ocaml"><code>type path_t = { 473 + p_expansion : simple_expansion option; 474 + p_path : Paths.Path.ModuleType.t; 475 + }</code></pre></div> 476 + <p>and this expansion is initially <code>None</code> and then filled in by Odoc in order to render the expansion in the HTML. In the case of a <code>With</code> expression, the <span class="xref-unresolved" title="Odoc_model.Lang.ModuleType.with_t">with_t</span> type is:</p> 477 + <div><pre class="language-ocaml"><code>type with_t = { 478 + w_substitutions : substitution list; 479 + w_expansion : simple_expansion option; 480 + w_expr : U.expr; 481 + }</code></pre></div> 482 + <p>Here you can see that the <code>With</code> expression contains another module expression, as a <code>with</code> expression operates on another module type. Early during Odoc's development, this was simply another <code>ModuleType.expr</code>, but we had a couple of bugs where we ended up calculating expansions for these inner expressions, which was all very wasteful as we only ever rendered the &quot;outer&quot; expansion. 
So we changed this to be a <span class="xref-unresolved" title="Odoc_model.Lang.ModuleType.U.expr">U.expr</span>, which is an &quot;unexpanded&quot; module type expression, and is very similar to the main expression above, but without the expansions and also without the functor case, as we can't have functors inside a &quot;with&quot; expression.</p> 483 + <p>These &quot;unexpanded&quot; expressions still contain signatures though, so aren't <em>completely</em> unexpanded, and it's <em>these</em> signatures that we should avoid processing.</p> 484 + <p>So, what I expected to be just one bug when I started looking at this turned out to be two related issues, and a total of four different fixes!</p>]]></content> 485 + </entry> 486 + <entry> 487 + <id>https://jon.recoil.org/blog/2025/09/caching-opam-solutions.html</id> 488 + <title>Caching opam solutions</title> 489 + <published>2025-09-09T00:00:00Z</published> 490 + <updated>2025-09-09T00:00:00Z</updated> 491 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/caching-opam-solutions.html"/> 492 + <summary>The ocaml-docs-ci system works by watching opam-repository for changes, and then when it notices a new package it performs an opam solve and builds the package, a prerequisite for building the documentation. In or...</summary> 493 + <content type="html"><![CDATA[<h1 id="caching-opam-solutions"><a href="#caching-opam-solutions" class="anchor"></a>Caching opam solutions</h1> 494 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-09</p></li></ul> 495 + <p>The <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a> system works by watching opam-repository for changes, and then when it notices a new package it performs an opam solve and builds the package, a prerequisite for building the documentation. 
In order to give the docs some stability, as the docs may well <a href="../04/semantic-versioning-is-hard.html" title="semantic-versioning-is-hard">depend upon your dependencies</a>, we currently cache the solve results so that a package will always be built with the same set of dependencies, even if a new version of one of those dependencies has been released.</p> 496 + <p>The downside to this is that as time goes on, the number of distinct universes that we build increases, and docs get more and more out of date. So it's not necessarily the best thing to do, though it does mean we minimise the amount of time spent solving.</p> 497 + <p>The alternative approach is that on every commit to opam-repository we could resolve for all packages and use the latest, greatest solution to build the docs. Using this approach we would maximise the sharing of builds and keep the total amount of required storage steadier. Of course, this would mean solving for every package on every commit to opam-repository, even if we didn't end up rebuilding all of them due to the way that the cache works.</p> 498 + <p>One possibility that might be worth investigating is to cache the solutions - but then Leon Bambrick <a href="https://twitter.com/secretGeek/status/7269997868">advises us</a>:</p> 499 + <div><pre class="language-quote"><code>There are 2 hard problems in computer science: cache invalidation, 500 + naming things, and off-by-1 errors.</code></pre></div> 501 + <p>and indeed it's not obvious what the best approach to cache invalidation is here. A sledgehammer approach would be to hook into the solver and note what questions it asks of opam-repository and record the responses. If any of these change, then it's safe to say that we need to recalculate. I had a quick look at this and checked what packages were involved in the solution of <code>ocaml</code> as this would represent a minimum set of packages that would affect virtually all packages. 
The list was big, but not <i>too</i> big:</p> 502 + <pre>winpthreads, system-msvc, system-mingw, ocaml-variants, ocaml-system, 503 + ocaml-options-vanilla, ocaml-option-tsan, ocaml-option-static, 504 + ocaml-option-spacetime, ocaml-option-no-flat-float-array, 505 + ocaml-option-no-compression, ocaml-option-nnpchecker, 506 + ocaml-option-nnp, ocaml-option-musl, ocaml-option-mingw, 507 + ocaml-option-leak-sanitizer, ocaml-option-fp, ocaml-option-flambda, 508 + ocaml-option-default-unsafe-string, ocaml-option-bytecode-only, 509 + ocaml-option-afl, ocaml-option-address-sanitizer,ocaml-option-32bit, 510 + ocaml-config, ocaml-compiler, ocaml-beta, ocaml-base-compiler, ocaml, 511 + dkml-base-compiler, conf-unwind, conf-pkg-config, base-unix, 512 + base-threads, base-ocamlbuild, base-nnp, base-metaocaml-ocamlfind, 513 + base-implicits, base-effects, base-domains, base-bigarray</pre> 514 + <p>I tried the same thing whilst using the oxcaml opam-repository, and this time, the list became much <i>much</i> larger:</p> 515 + <pre>zed, zarith-xen, zarith-freestanding, zarith, yojson, xenstore, xdg, 516 + x509, webbrowser, wasm_of_ocaml-compiler, variantslib, uutf, uuseg, 517 + uunf, uucp, uTop, uri-sexp, uri, uopt, univ_map, uchar, tyxml, 518 + typerex, typerep, trie, topkg, tls-lwt, tls, timezone, time_now, 519 + textutils_kernel, textutils, tcpip, system-msvc, system-mingw, 520 + swhid_core, stringext, string_dict, stdune, stdlib-shims, stdio, ssl, 521 + splittable_random, spdx_licenses, spawn, shell, 522 + shared-memory-ring-lwt, shared-memory-ring, sha, sexplib0, sexplib, 523 + sexp_pretty, seq, sedlex, rresult, result, regex_parser_intf, 524 + record_builder, react, re2, re, randomconv, publish, ptime, psq, 525 + protocol_version_header, ppxlib_jane, ppxlib_ast, ppxlib, ppxfind, 526 + ppx_yojson_conv_lib, ppx_yojson_conv, ppx_variants_conv, ppx_var_name, 527 + ppx_typerep_conv, ppx_typed_fields, ppx_tydi, ppx_tools_versioned, 528 + ppx_tools, ppx_template, ppx_string_conv, 
ppx_string, 529 + ppx_stable_witness, ppx_stable, ppx_shorthand, ppx_sexp_value, 530 + ppx_sexp_message, ppx_sexp_conv, ppx_pipebang, ppx_optional, 531 + ppx_optcomp, ppx_module_timer, ppx_log, ppx_let, ppx_js_style, 532 + ppx_jane, ppx_inline_test, ppx_ignore_instrumentation, ppx_here, 533 + ppx_helpers, ppx_hash, ppx_globalize, ppx_fixed_literal, 534 + ppx_fields_conv, ppx_fail, ppx_expect, ppx_enumerate, 535 + ppx_disable_unused_warnings, ppx_diff, ppx_deriving, ppx_derivers, 536 + ppx_custom_printf, ppx_cstruct, ppx_compare, ppx_cold, ppx_bin_prot, 537 + ppx_bench, ppx_base, ppx_assert, pp, portable, pipe_with_writer_error, 538 + pcre, pbkdf, patch, parsexp, ounit2, ordering, optint, opam-state, 539 + opam-repository, opam-publish, opam-lib, opam-format, 540 + opam-file-format, opam-core, ojs, ohex, odoc-parser, odoc, octavius, 541 + ocplib-endian, ocp-indent, ocp-build, ocb-stubblr, ocamlnet, 542 + ocamlgraph, ocamlformat-rpc-lib, ocamlformat-lib, ocamlformat, 543 + ocamlfind-secondary, ocamlfind, ocamlc-loc, ocamlbuild, 544 + ocaml_intrinsics_kernel, ocaml_intrinsics, ocaml-version, 545 + ocaml-variants, ocaml-system, ocaml-syntax-shims, 546 + ocaml-secondary-compiler, ocaml-options-vanilla, ocaml-option-tsan, 547 + ocaml-option-static, ocaml-option-spacetime, 548 + ocaml-option-no-flat-float-array, ocaml-option-no-compression, 549 + ocaml-option-nnpchecker, ocaml-option-nnp, ocaml-option-musl, 550 + ocaml-option-mingw, ocaml-option-leak-sanitizer, ocaml-option-fp, 551 + ocaml-option-flambda, ocaml-option-default-unsafe-string, 552 + ocaml-option-bytecode-only, ocaml-option-afl, 553 + ocaml-option-address-sanitizer, ocaml-option-32bit, 554 + ocaml-migrate-parsetree, ocaml-lsp-server, ocaml-index, 555 + ocaml-freestanding, ocaml-config, ocaml-compiler-libs, 556 + ocaml-base-compiler, ocaml, obuild, num, nocrypto, mtime, mmap, 557 + mirage-xen-posix, mirage-xen, mirage-types, mirage-time, mirage-stack, 558 + mirage-solo5, mirage-sleep, mirage-runtime, 
mirage-random, 559 + mirage-ptime, mirage-protocols, mirage-profile, mirage-no-xen, 560 + mirage-no-solo5, mirage-net-xen, mirage-net, mirage-mtime, 561 + mirage-kv-mem, mirage-kv-lwt, mirage-kv, mirage-flow, mirage-entropy, 562 + mirage-device, mirage-crypto-rng-mirage, mirage-crypto-rng-lwt, 563 + mirage-crypto-rng, mirage-crypto-pk, mirage-crypto-ec, mirage-crypto, 564 + mirage-clock-unix, mirage-clock-lwt, mirage-clock, mew_vi, mew, 565 + metrics-lwt, metrics, merlin-lib, merlin, menhirSdk, menhirLib, 566 + menhirCST, menhir, mdx, magic-mime, macaddr-cstruct, macaddr, lwt_ssl, 567 + lwt_react, lwt_ppx, lwt_log, lwt-dllist, lwt, lsp, lru, logs, 568 + lambda-term, kdf, jst-config, jsonrpc, jsonm, js_of_ocaml-toplevel, 569 + js_of_ocaml-ppx, js_of_ocaml-lwt, js_of_ocaml-compiler, js_of_ocaml, 570 + jbuilder, jane_rope, jane-street-headers, ipaddr-sexp, ipaddr-cstruct, 571 + ipaddr, io-page, int_repr, http, hkdf, hex, hacl_x25519, gmap, 572 + github-unix, github-data, github, gen_js_api, gen, gel, 573 + functoria-runtime, fpath, fmt, fix, fieldslib, fiber, fiat-p256, 574 + ezjsonm, extlib-compat, extlib, expectree, expect_test_helpers_core, 575 + ethernet, eqaf, either, easy-format, dyn, duration, dune-site, 576 + dune-rpc, dune-release, dune-private-libs, dune-configurator, 577 + dune-compiledb, dune-build-info, dune, dot-merlin-reader, domain-name, 578 + dkml-base-compiler, digestif, curly, cstruct-sexp, cstruct-lwt, 579 + cstruct, csexp, crunch, cpuid, cppo, core_unix, core_kernel, 580 + core_extended, core, configurator, conf-which, conf-unwind, 581 + conf-pkg-config, conf-ninja, conf-m4, conf-libssl, conf-libpcre, 582 + conf-gmp-powm-sec, conf-gmp, conf-g++, conf-cmake, conf-c++, 583 + conf-bash, conf-autoconf, conduit-lwt-unix, conduit-lwt, conduit, 584 + cohttp-lwt-unix, cohttp-lwt-jsoo, cohttp-lwt, cohttp, cmdliner, 585 + cmarkit, chrome-trace, charInfo_width, capitalization, camomile, 586 + camlp4, camlp-streams, ca-certs, bos, biniou, binaryen-bin, 
bin_prot, 587 + bigstringaf, bigarray-compat, bheap, basement, base_quickcheck, 588 + base_bigstring, base64, base-unix, base-threads, base-ocamlbuild, 589 + base-num, base-nnp, base-effects, base-domains, base-bytes, 590 + base-bigarray, base, backoff, atdgen-runtime, atdgen, atd, async_unix, 591 + async_rpc_kernel, async_log, async_kernel, async_extra, async, 592 + astring, asn1-combinators, arp, angstrom, alcotest</pre> 593 + <p>This enormous list is because the opam file for oxcaml - <code>ocaml-variants.5.2.0+ox</code> - lists a bunch of conflicts to ensure that various incompatible packages are never selected:</p> 594 + <pre>conflicts: [ 595 + &quot;base&quot; {&lt; &quot;v0.18~&quot;} 596 + &quot;alcotest&quot; {!= &quot;1.9.0+ox&quot;} 597 + &quot;backoff&quot; {!= &quot;0.1.1+ox&quot;} 598 + &quot;dot-merlin-reader&quot; {!= &quot;5.2.1-502+ox&quot;} 599 + &quot;gen_js_api&quot; {!= &quot;1.1.2+ox&quot;} 600 + &quot;js_of_ocaml&quot; {!= &quot;6.0.1+ox&quot;} 601 + &quot;js_of_ocaml-compiler&quot; {!= &quot;6.0.1+ox&quot;} 602 + &quot;js_of_ocaml-ppx&quot; {!= &quot;6.0.1+ox&quot;} 603 + &quot;js_of_ocaml-toplevel&quot; {!= &quot;6.0.1+ox&quot;} 604 + &quot;jsonrpc&quot; {!= &quot;1.19.0+ox&quot;} 605 + &quot;lsp&quot; {!= &quot;1.19.0+ox&quot;} 606 + &quot;lwt_ppx&quot; {!= &quot;5.9.1+ox&quot;} 607 + &quot;mdx&quot; {!= &quot;2.5.0+ox&quot;} 608 + &quot;merlin&quot; {!= &quot;5.2.1-502+ox&quot;} 609 + &quot;merlin-lib&quot; {!= &quot;5.2.1-502+ox&quot;} 610 + &quot;ocaml-compiler-libs&quot; {!= &quot;v0.17.0+ox&quot;} 611 + &quot;ocaml-index&quot; {!= &quot;1.1+ox&quot;} 612 + &quot;ocaml-lsp-server&quot; {!= &quot;1.19.0+ox&quot;} 613 + &quot;ocamlbuild&quot; {!= &quot;0.15.0+ox&quot;} 614 + &quot;ocamlformat&quot; {!= &quot;0.26.2+ox&quot;} 615 + &quot;ocamlformat-lib&quot; {!= &quot;0.26.2+ox&quot;} 616 + &quot;ojs&quot; {!= &quot;1.1.2+ox&quot;} 617 + &quot;ppxlib&quot; {!= &quot;0.33.0+ox&quot;} 618 + &quot;ppxlib_ast&quot; {!= 
&quot;0.33.0+ox&quot;} 619 + &quot;sedlex&quot; {!= &quot;3.3+ox&quot;} 620 + &quot;topkg&quot; {!= &quot;1.0.8+ox&quot;} 621 + &quot;uTop&quot; {!= &quot;2.15.0+ox&quot;} 622 + &quot;uutf&quot; {!= &quot;1.0.3+ox&quot;} 623 + &quot;wasm_of_ocaml-compiler&quot; {!= &quot;6.0.1+ox&quot;} 624 + &quot;zarith&quot; {!= &quot;1.12+ox&quot;} 625 + ]</pre> 626 + <p>and it seems that the solver is looking not just at these packages, but also at all of their dependencies too. So this is a much larger set of packages that we need to track changes for, probably making the caching an awful lot less effective. It's not clear to me that this is the best way for the solver to handle conflicts, but I don't know enough about how it works yet to say for sure.</p>]]></content> 627 + </entry> 628 + <entry> 629 + <id>https://jon.recoil.org/blog/2025/09/build-ids-for-day10.html</id> 630 + <title>Build IDs for Day10</title> 631 + <published>2025-09-08T00:00:00Z</published> 632 + <updated>2025-09-08T00:00:00Z</updated> 633 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/build-ids-for-day10.html"/> 634 + <summary>mtelvers, dra27 and I have been working on a system to build opam packages similar to the way that the docs-ci system does - effectively building a per-package binary cache to do very fast builds of the entire opa...</summary> 635 + <content type="html"><![CDATA[<h1 id="build-ids-for-day10"><a href="#build-ids-for-day10" class="anchor"></a>Build IDs for Day10</h1> 636 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-08</p></li></ul> 637 + <p><a href="https://tunbury.org">mtelvers</a>, <a href="https://www.dra27.uk/blog/">dra27</a> and I have been working on a system to build opam packages similar to the way that the docs-ci system does - effectively building a per-package binary cache to do very fast builds of the entire opam repository. 
It supports building even mutually-incompatible packages by dynamically creating the build environment for each package, and thus allows us to generate something akin to <a href="">opam health check</a> but much faster.</p> 638 + <p>Currently the cache of a package is a key-value store where the key is a hash of the package name and version and all of its dependencies and their name and version, alongside some information about the OS. This is great when this info can uniquely identify the output, but this isn't always the case. In particular, the oxcaml opam-repository has several packages where the version number is the upstream version number with `-ox` appended, as they have patches to make them compatible with oxcaml. If these patches change without bumping the suffix, the current caching mechanism would lead to trouble. When we discussed this, David pointed out the idea of the <a href="https://github.com/ocaml/opam/blob/c36dd1ce40a715ef27122184715bbf3e9aa7f0c9/src/state/opamPackageVar.ml#L178-L211">build-id</a> in opam, which would perfectly satisfy our needs. Unfortunately this code is quite deep within the opam codebase and at the point we need it we don't have an installed opam switch, so we need to pull the code out and insert it into our project.</p> 639 + <p>One of the first challenges was that day10 currently includes the OS details in the hash so that we can test across different distros. This is at odds with the opam build-id which doesn't include that, so in order to try to get as close as possible to the opam hash I split the cache into 2 layers - a per-OS cache directory containing hashes based on pure opam metadata. The idea is that these should be identical to the build-ids of opam. 
With that fixed, the new cache layout looks like:</p> 640 + <div><pre class="language-ocaml"><code>debian-12-x86_64/123...abc/{build.log,config,...}</code></pre></div> 641 + <p>where the <code>123...abc</code> should be the same as the build-id you would get with all the packages contained installed.</p> 642 + <p>Now my actual use case for this is to track the state of the oxcaml world day by day, so for this I need to track both the opam-repository for OCaml and also the opam repository for OxCaml. The project currently uses a Makefile for coordinating the builds, but I thought it was time we moved on to a dedicated batch execution process. So I asked Claude to knock me up one of those, using odoc_driver for inspiration. It's very basic right now, simply iterating through the latest versions of every package, but I have got it to check on cache hits and misses, so I should be able to run it tomorrow to see how quickly we can test PRs to oxcaml/opam-repository.</p>]]></content> 643 + </entry> 644 + <entry> 645 + <id>https://jon.recoil.org/blog/2025/09/giving-hub-cl-an-upgrade.html</id> 646 + <title>Giving hub.cl an upgrade</title> 647 + <published>2025-09-07T00:00:00Z</published> 648 + <updated>2025-09-07T00:00:00Z</updated> 649 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/09/giving-hub-cl-an-upgrade.html"/> 650 + <summary>For a few years now we've been running hub.cl.cam.ac.uk, a Jupyterhub instance, for the first year course &quot;Foundations of Computer Science&quot;. 
It serves as a hosting site for the lecture notes, which come in the form o...</summary> 651 + <content type="html"><![CDATA[<h1 id="giving-hub.cl-an-upgrade"><a href="#giving-hub.cl-an-upgrade" class="anchor"></a>Giving hub.cl an upgrade</h1> 652 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-09-07</p></li></ul> 653 + <p>For a few years now we've been running <code>hub.cl.cam.ac.uk</code>, a Jupyterhub instance, for the first year course &quot;Foundations of Computer Science&quot;. It serves as a hosting site for the lecture notes, which come in the form of Jupyter notebooks, and as a playground where students can try OCaml, and it is also used to run the assessed exercises that are a mandatory part of the course.</p> 654 + <p>Since I spent some time setting it up back in 2018 or so, it's accumulated some cruft over the years, and has also fallen somewhat behind the bleeding edge of the Jupyter software stack. So I thought this year, as I'm actually lecturing the course, I'd give it a bit of loving care and attention.</p> 655 + <p>We were still on Jupyterhub 1.5.3 whereas the current release is 5.3.0 - so there was quite a bit of work to do. A brief play with putting things on the latest version seemed to break quite a lot of things, so I thought it might be better to go back to the drawing board and start the config again from scratch. So with some help from Claude, I've now managed to hugely simplify the whole config of Jupyterhub, and even given it a makeover to try to match the style of www.cst.cam.ac.uk as well. 
The improvements include:</p> 656 + <ul><li>Using caddy as a reverse proxy for TLS termination, meaning I don't have to manually renew the letsencrypt cert every 3 months</li><li>Unifying the configuration of the two container images used for students and instructors</li><li>Upgrading to much newer jupyterhub, notebook and nbgrader images</li><li>Simplifying the configuration required to make it work on a new server - persistent user directories are now docker volumes rather than bindmounts on the local filesystem</li><li>Updating the authentication method to use Raven via OAuth2 rather than the unmaintained <a href="https://github.com/pyCav/jupyterhub-raven-auth">jupyterhub-raven-auth</a>, for which I'd had to maintain <a href="https://github.com/jonludlam/jupyterhub-raven-auth/commit/36eaf16b410e7ac3cfc532269e0ae5f1de34f231">a patch</a>.</li><li>Rebasing <a href="https://github.com/jonludlam/nbgrader/commit/c83a6cbb7b530ce87b0b157accddcdc832bcba38">my patch</a> to nbgrader to verify all of the output of the cells when grading answers</li></ul> 657 + <p>As ever, this took longer than I'd anticipated, but I'm mostly there now. 
There are a few more steps to try:</p> 658 + <ul><li>trial the <a href="https://github.com/akabe/ocaml-jupyter/pull/210">new patch</a> for using ocaml-jupyter with OCaml 5.x</li><li>see how to upgrade to notebook v7, as I've stuck with v6 in order to keep the extensions we're using going.</li></ul>]]></content> 659 + </entry> 660 + <entry> 661 + <id>https://jon.recoil.org/blog/2025/08/ocaml-lsp-mcp.html</id> 662 + <title>Using ocaml-lsp-server via an MCP server</title> 663 + <published>2025-08-27T00:00:00Z</published> 664 + <updated>2025-08-27T00:00:00Z</updated> 665 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/08/ocaml-lsp-mcp.html"/> 666 + <summary>Here's a quick post on how to get the OCaml Language Server (ocaml-lsp-server) working with an MCP server.</summary> 667 + <content type="html"><![CDATA[<h1 id="using-ocaml-lsp-server-via-an-mcp-server"><a href="#using-ocaml-lsp-server-via-an-mcp-server" class="anchor"></a>Using ocaml-lsp-server via an MCP server</h1> 668 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-08-27</p></li></ul> 669 + <p>Here's a quick post on how to get the OCaml Language Server (ocaml-lsp-server) working with an MCP server.</p> 670 + <p>We're going to use <a href="https://github.com/isaacphi">isaacphi</a>'s adapter for LSP servers, which is written in Go. So install Go, and then:</p> 671 + <div><pre class="language-bash"><code>go install github.com/isaacphi/mcp-language-server@latest</code></pre></div> 672 + <p>Once that's done, make sure you've got `ocaml-lsp-server` installed in your switch:</p> 673 + <div><pre class="language-bash"><code>opam install ocaml-lsp-server</code></pre></div> 674 + <p>Then add the MCP config for claude where you want to run it:</p> 675 + <div><pre class="language-bash"><code>claude mcp add ocamllsp -s local -t stdio -- /Users/jon/go/bin/mcp-language-server -workspace . 
-lsp ocamllsp</code></pre></div> 676 + <p>It'd be nice to get this working `globally` - that is, with `-s user` - but I haven't been able to get that to work yet.</p>]]></content> 677 + </entry> 678 + <entry> 679 + <id>https://jon.recoil.org/blog/2025/08/ocaml-mcp-server.html</id> 680 + <title>An OCaml MCP server</title> 681 + <published>2025-08-20T00:00:00Z</published> 682 + <updated>2025-08-20T00:00:00Z</updated> 683 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/08/ocaml-mcp-server.html"/> 684 + <summary>LLMs are proving themselves superbly capable of a variety of coding tasks, having been trained against the enormous amount of code, tutorials and manuals available online. However, with smaller langua...</summary> 685 + <content type="html"><![CDATA[<h1 id="an-ocaml-mcp-server"><a href="#an-ocaml-mcp-server" class="anchor"></a>An OCaml MCP server</h1> 686 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-08-20</p></li></ul> 687 + <p>LLMs are proving themselves superbly capable of a variety of coding tasks, having been trained against the enormous amount of code, tutorials and manuals available online. However, with smaller languages like OCaml there simply isn't enough training material out there, particularly when it comes to new language features like <a href="https://ocaml.org/manual/5.3/effects.html">effects</a> or new packages that haven't had time to be widely used. With my colleagues <a href="https://anil.recoil.org/">Anil</a>, <a href="https://ryan.freumh.org/">Ryan</a> and <a href="https://toao.com/">Sadiq</a> we've been exploring ways to <a href="https://anil.recoil.org/notes/cresting-the-ocaml-ai-hump">improve this situation</a>. 
One way we can mitigate these challenges is to provide a Model Context Protocol (<a href="https://modelcontextprotocol.io">MCP</a>) server that's capable of providing up-to-date info on the current state of the OCaml world.</p> 688 + <p>The <a href="https://docs.anthropic.com/en/docs/mcp">MCP specification</a> was released by Anthropic at the end of last year. Since then it has become an astonishingly popular mechanism for extending the capabilities of LLMs, allowing them to become incredibly powerful agents capable of much more than simply chatting. There are now a huge variety of MCP servers, from one that provides <a href="https://github.com/r-huijts/firstcycling-mcp">professional cycling data</a> to one that can <a href="https://github.com/GongRzhe/Gmail-MCP-Server">do your email</a>. The <a href="https://github.com/punkpeye/awesome-mcp-servers">awesome mcp server list</a> already lists hundreds, and these are just the <em>awesome</em> ones!</p> 689 + <p>I've been working with <a href="https://toao.com/">Sadiq</a> to make an <a href="https://github.com/sadiqj/odoc-llm/">MCP server for OCaml</a>, with an initial focus on building it such that it can be hosted for everyone rather than something that is run locally. Our plan is to start with a service that can help with choosing OCaml libraries, by taking advantage of the work done by <a href="https://github.com/ocurrent/ocaml-docs-ci/">ocaml-docs-ci</a> which is the tool used to generate the documentation for all packages in <a href="https://github.com/ocaml/opam-repository">opam-repository</a> and is served by <a href="https://ocaml.org/">ocaml.org</a>. As well as producing HTML docs, we can also extract a number of other formats from the pipeline, including a newly created <a href="https://github.com/ocaml/odoc/pull/1341">markdown backend</a>. 
Using this, we can get markdown-formatted documentation for every version of every package in the OCaml ecosystem.</p> 690 + <h2 id="semantic-searching"><a href="#semantic-searching" class="anchor"></a>Semantic searching</h2> 691 + <p>The first thing we focused on was being able to do a <em>semantic search</em> over the whole OCaml ecosystem. To do this, we're using <a href="https://huggingface.co/spaces/hesamation/primer-llm-embedding">LLM embeddings</a>, for which we need some natural-language description to search through.</p> 692 + <p>The documentation produced by <code>ocaml-docs-ci</code> is generated per library module using <a href="https://github.com/ocaml/odoc">odoc</a>, relying on the package author to provide documentation comments for each element in the signature. However, even if the package author <em>hasn't</em> provided any documentation, we can still see the types, values, modules and so on that the library exposes, and this is often enough to get a good idea of what the module does. We then take these documentation pages, which are formatted in markdown, and summarise them via an LLM at the module level. This is done hierarchically, so we start with the 'deepest' modules, and then insert their summaries into the text of their parent module, then summarise those and so on. We found it useful to include the names and <a href="https://ocaml.github.io/odoc/odoc/odoc_for_authors.html#preamble">preambles</a> of the ancestor modules when doing the summarisation to give additional context to the LLM. For example, here is the prompt generated for a submodule of the <a href="https://erratique.ch/software/astring">astring</a> library:</p> 693 + <div><pre class="language-markdown"><code>Module: Astring.String.Ascii 694 + 695 + Ancestor Module Context: 696 + - Astring: Alternative `Char` and `String` modules. Open the module to 697 + use it. 
This defines one value in your scope, redefines the `(^)` 698 + operator, the `Char` module and the `String` module. Consult the 699 + differences with the OCaml `String` module, the porting guide and a 700 + few examples. 701 + - Astring.String: Strings, `substrings`, string sets and maps. A 702 + string `s` of length `l` is a zero-based indexed sequence of `l` 703 + bytes. An index `i` of `s` is an integer in the range [`0`;`l-1`], it 704 + represents the `i`th byte of `s` which can be accessed using the 705 + string indexing operator `s.[i]`. 706 + Important. OCaml's `string`s became immutable since 4.02. Whenever 707 + possible compile your code with the `-safe-string` option. This module 708 + does not expose any mutable operation on strings and assumes strings 709 + are immutable. See the porting guide. 710 + 711 + Module Documentation: US-ASCII string support. 712 + References. 713 + 714 + ## Predicates 715 + - val is_valid : string -&gt; bool (* `is_valid s` is `true` iff only for 716 + all indices `i` of `s`, `s.[i]` is an US-ASCII character, i.e. a 717 + byte in the range [`0x00`;`0x7F`]. *) 718 + 719 + ## Casing transforms 720 + The following functions act only on US-ASCII code points that is on 721 + bytes in range [`0x00`;`0x7F`], leaving any other byte intact. The 722 + functions can be safely used on UTF-8 encoded strings; they will of 723 + course only deal with US-ASCII casings. 724 + 725 + - val uppercase : string -&gt; string (* `uppercase s` is `s` with 726 + US-ASCII characters `'a'` to `'z'` mapped to `'A'` to `'Z'`. *) 727 + - val lowercase : string -&gt; string (* `lowercase s` is `s` with 728 + US-ASCII characters `'A'` to `'Z'` mapped to `'a'` to `'z'`. *) 729 + - val capitalize : string -&gt; string (* `capitalize s` is like 730 + `uppercase` but performs the map only on `s.[0]`. *) 731 + - val uncapitalize : string -&gt; string (* `uncapitalize s` is like 732 + `lowercase` but performs the map only on `s.[0]`. 
*) 733 + 734 + ## Escaping to printable US-ASCII 735 + - val escape : string -&gt; string (* `escape s` is `s` with: *) 736 + - val unescape : string -&gt; string option (* `unescape s` unescapes 737 + what `escape` did. The letters of hex escapes can be upper, lower or 738 + mixed case, and any two letter hex escape is decoded to its 739 + corresponding byte. Any other escape not defined by `escape` or 740 + truncated escape makes the function return `None`. *) 741 + - val escape_string : string -&gt; string (* `escape_string s` is like 742 + `escape` except it escapes `s` according to OCaml's lexical 743 + conventions for strings with: *) 744 + - val unescape_string : string -&gt; string option (* `unescape_string` 745 + is to `escape_string` what `unescape` is to `escape` and also 746 + additionally unescapes the sequence `&quot;\\'&quot;` (`0x5C,0x27`) to `&quot;'&quot;` 747 + (`0x27`). *)</code></pre></div> 748 + <p>where clearly the package author has provided excellent documentation comments. This is then passed to an LLM which generated the following description:</p> 749 + <div><pre class="language-ocaml"><code>This module provides functions to check if a string contains only 750 + US-ASCII characters, convert case for ASCII letters, and escape or 751 + unescape strings using ASCII conventions. It operates on standard 752 + OCaml strings, treating them as sequences of bytes, and ensures 753 + compatibility with UTF-8 encoded strings when transforming case. 
Use 754 + cases include sanitizing input for ASCII-only protocols, preparing 755 + strings for environments requiring strict ASCII formatting, and 756 + handling escaped string representations in configuration or 757 + serialization contexts.</code></pre></div> 758 + <p>Once we have these natural language descriptions, we can generate embeddings for them to allow for semantic search amongst all modules in opam.</p> 759 + <p>In addition to the module descriptions, we also generate similar natural-language descriptions of the <em>package</em> as a whole, by taking the README from the package and summarising it similarly. Where there is no README, we summarise the summaries of the modules of the libraries, so we're always able to generate some text description of the entire package.</p> 760 + <p>To help with the ranking, we're also using a measure of popularity for both modules and packages. For packages, we're using the number of reverse dependencies in opam as a proxy for popularity, and for modules, we're using the &quot;occurrences&quot; generated as part of the docs build. These [occurrences] are a count of how often modules are used in other modules, and are calculated by looking at the compiled [cmt] files and resolving references to external modules using odoc's internal logic and counting them.</p> 761 + <p>Once we have both the module and package summaries, we generate an embedding of the descriptions to allow for a semantic search to be performed efficiently. We're using this in two ways - to search for packages for broad queries of functionality, which just uses the package summaries, and for more specific queries to search for modules within packages.</p> 762 + <p>For the module search, if the packages to search in haven't been specified, we search for both modules and packages and then combine the results. This is particularly helpful when the search is for generic functionality that might be found in more specific packages. 
For example, a module-only search for the term &quot;time and date manipulation functions&quot; returns the strongest match with a <a href="https://ocaml.org/p/caqti/2.2.4/doc/caqti.platform/Caqti_platform/Conv/index.html">module from caqti</a>, which, as caqti is a library for talking to relational databases, might not be what the user is looking for.</p> 763 + <p>We then put these search tools into an MCP server, along with a little more functionality. The server currently provides these five functions: </p> 764 + <ol></ol> 765 + <p>The first 2 use the LLM-generated summaries as described above, and the last is using <a href="https://github.com/art-w/">Arthur's</a> <a href="https://github.com/art-w/sherlodoc">sherlodoc tool</a> which can do various searches, including type-based search, across the output of the <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a>.</p> 766 + <h2 id="example-searches"><a href="#example-searches" class="anchor"></a>Example searches</h2> 767 + <p>The following are the results from some example package searches: </p> 768 + <ul><li>&quot;HTTP client&quot;</li></ul> 769 + <div><pre class="language-nolang"><code>#1 - http (v6.1.1) 770 + Similarity: 0.7593 771 + Reverse Dependencies: 407 772 + Combined Score: 0.6588 773 + Description: This package provides a comprehensive OCaml library for 774 + building HTTP clients and servers with support for multiple 775 + asynchronous programming models. It enables developers to implement 776 + efficient, portable HTTP services using different backends such as 777 + Lwt, Async, Eio, and JavaScript, making it suitable for both Unix and 778 + browser environments. The library emphasizes performance, modularity, 779 + and interoperability, allowing custom backend implementations and 780 + seamless integration with other OCaml libraries. It is commonly used 781 + in web services, API clients, standalone microkernels, and 782 + OCaml-to-JavaScript compilations for web applications.
783 + 784 + #2 - cohttp (v6.1.1) 785 + Similarity: 0.7377 786 + Reverse Dependencies: 403 787 + Combined Score: 0.6435 788 + Description: This package provides a comprehensive library for 789 + building HTTP clients and servers in OCaml. It supports multiple 790 + asynchronous programming models and backends, enabling flexible 791 + development across different runtime environments. The library offers 792 + efficient handling of HTTP/1.1 and HTTPS, with portable parsing and 793 + modular architecture. It is widely used for web services, API clients, 794 + and standalone network applications. 795 + 796 + #3 - cohttp-lwt-unix (v6.1.1) 797 + Similarity: 0.7089 798 + Reverse Dependencies: 338 799 + Combined Score: 0.6212 800 + Description: This package provides an implementation of the Cohttp 801 + library using the Lwt asynchronous programming framework with Unix 802 + bindings. It enables building efficient HTTP clients and servers in 803 + OCaml, supporting both synchronous and asynchronous network 804 + operations. The package handles core HTTP functionality, including 805 + request and response parsing, connection management, and HTTPS support 806 + via OCaml-TLS. It is suitable for applications requiring 807 + high-performance web services, microservices, or networked 808 + applications in the OCaml ecosystem. 809 + 810 + #4 - cohttp-lwt (v6.1.1) 811 + Similarity: 0.7067 812 + Reverse Dependencies: 367 813 + Combined Score: 0.6207 814 + Description: This package provides a comprehensive library for 815 + building HTTP clients and servers in OCaml, supporting multiple 816 + asynchronous programming models. It enables developers to implement 817 + efficient, portable HTTP services with support for both synchronous 818 + and asynchronous I/O, including secure HTTPS communication.
The 819 + package includes backends for Lwt, Async, Mirage, JavaScript, and 820 + Eio, making it versatile for use in different runtime environments, 821 + from Unix servers to web browsers. It is well-suited for applications 822 + requiring high-performance networking, such as web services, API 823 + clients, and embedded networked systems. 824 + 825 + #5 - quests (v0.1.3) 826 + Similarity: 0.7960 827 + Reverse Dependencies: 1 828 + Combined Score: 0.6180 829 + Description: This package provides a high-level HTTP client library 830 + for making web requests in OCaml. It simplifies interacting with HTTP 831 + servers by offering an intuitive API for common methods like GET and 832 + POST, supporting features such as query parameters, form and JSON 833 + data submission, and automatic handling of gzip compression and 834 + redirects. It also includes authentication mechanisms like basic and 835 + bearer tokens, with partial support for sessions. Typical use cases 836 + include consuming REST APIs, scraping web content, or integrating 837 + with web services securely and efficiently. 838 + 839 + #6 - ezcurl (v0.2.4) 840 + Similarity: 0.7395 841 + Reverse Dependencies: 6 842 + Combined Score: 0.5979 843 + Description: This package provides a simplified interface for making 844 + HTTP requests in OCaml, built on top of the OCurl library. It 845 + addresses the need for an easy-to-use, reliable, and stable API for 846 + handling common web interaction tasks, such as fetching URLs and 847 + processing responses. The package supports both synchronous and 848 + asynchronous operations, enabling efficient handling of parallel 849 + requests and non-blocking I/O. Practical use cases include web 850 + scraping, API client development, and integrating HTTP-based services 851 + into OCaml applications.
852 + </code></pre></div> 853 + <ul><li>&quot;Cryptographic hash&quot;</li></ul> 854 + <div><pre class="language-nolang"><code>#1 - digestif (v1.3.0) 855 + Similarity: 0.8165 856 + Reverse Dependencies: 621 857 + Combined Score: 0.7041 858 + Description: This package provides a comprehensive implementation of 859 + cryptographic hash functions, supporting algorithms such as MD5, 860 + SHA1, SHA2, SHA3, WHIRLPOOL, BLAKE2, and RIPEMD160. It allows users 861 + to choose between C and OCaml backends at link time, offering 862 + flexibility in performance and deployment scenarios. The library is 863 + designed for applications requiring secure hashing, such as data 864 + integrity verification, digital signatures, and cryptographic 865 + protocols. It is well-suited for systems programming and 866 + security-related applications in the OCaml ecosystem. 867 + 868 + #2 - ppx_hash (vv0.17.0) 869 + Similarity: 0.7284 870 + Reverse Dependencies: 3337 871 + Combined Score: 0.6833 872 + Description: This package generates efficient hash functions for 873 + OCaml types based on their structure, enabling precise control over 874 + hashing behavior. It addresses the limitations of OCaml's built-in 875 + polymorphic hashing by allowing users to define custom hash 876 + functions during type derivation. Key features include selective 877 + field ignoring, support for folding-style hash accumulation, and 878 + compatibility with comparison and serialization systems. It is 879 + suitable for use with hash tables, persistent data structures, and 880 + any application requiring deterministic, type-driven hashing. 881 + 882 + #3 - ez_hash (v0.5.3) 883 + Similarity: 0.8366 884 + Reverse Dependencies: 3 885 + Combined Score: 0.6583 886 + Description: This package provides a straightforward interface to 887 + common cryptographic hash functions, simplifying their use in OCaml 888 + applications. 
It wraps secure, widely-used algorithms like SHA-256 889 + and Blake2b, offering consistent and safe APIs for hashing data. The 890 + library is designed for clarity and ease of integration, making it 891 + ideal for developers needing reliable cryptographic operations 892 + without deep expertise in security. Practical uses include data 893 + integrity verification, digital signatures, and secure data storage. 894 + 895 + #4 - murmur3 (v0.3) 896 + Similarity: 0.7805 897 + Reverse Dependencies: 1 898 + Combined Score: 0.6072 899 + Description: This package provides OCaml bindings for MurmurHash, a 900 + fast and widely used non-cryptographic hash function. It enables 901 + efficient hash value computation for arbitrary data, making it 902 + suitable for applications like hash tables, checksums, and data 903 + fingerprinting. The bindings offer consistent hashing across platforms 904 + and integrate seamlessly into OCaml projects requiring 905 + high-performance hashing. Use cases include caching, distributed 906 + systems, and data integrity verification where cryptographic security 907 + is not required. 908 + 909 + #5 - kdf (v1.0.0) 910 + Similarity: 0.6775 911 + Reverse Dependencies: 473 912 + Combined Score: 0.6033 913 + Description: This package implements standard key derivation 914 + functions (KDFs) for cryptographic applications in OCaml. It supports 915 + scrypt, PBKDF1, PBKDF2, and HKDF, enabling secure generation of 916 + cryptographic keys from passwords or shared secrets. These functions 917 + help mitigate brute-force attacks and ensure keys are derived in a 918 + reproducible, secure manner.
Use cases include password-based 919 + encryption, secure token generation, and key material expansion in 920 + cryptographic protocols.</code></pre></div> 921 + <p>and a module-level search for &quot;time and date manipulation functions&quot;</p> 922 + <div><pre class="language-nolang"><code>#1 - timmy-jsoo: Timmy_jsoo 923 + Similarity: 0.5460 924 + Original Similarity: 0.7800 925 + Popularity Score: 0.0000 926 + Description: This module provides precise date and time arithmetic, 927 + conversion, and comparison operations across multiple representations, 928 + including OCaml-native, JavaScript, and string formats. It works with 929 + structured types like `Date.t`, `Time.t`, and ISO weeks, supporting 930 + timezone-aware transformations and RFC3339 formatting. Concrete use 931 + cases include cross-runtime timestamp synchronization, calendar-aware 932 + scheduling, and robust temporal data validation in distributed 933 + systems. 934 + 935 + #2 - calendar: CalendarLib 936 + Similarity: 0.5331 937 + Original Similarity: 0.7616 938 + Popularity Score: 0.3448 939 + Description: This module provides precise date and time manipulation 940 + with support for calendar operations, time zones, periods, and 941 + formatted input/output. It works with types like `Calendar.t`, 942 + `Date.t`, `Time.t`, and `Period.t` to handle tasks such as event 943 + scheduling, timestamp conversion, and historical date calculations. 944 + Concrete use cases include scheduling systems, log timestamping, 945 + holiday calculations, and cross-timezone time normalization. 946 + 947 + #3 - calendar: CalendarLib.Fcalendar 948 + Similarity: 0.5191 949 + Original Similarity: 0.6820 950 + Popularity Score: 0.1390 951 + Description: This module provides float-based calendar operations 952 + for date creation, conversion, and manipulation, including time zone 953 + adjustments, component extraction (year/month/day/hour/second), and 954 + arithmetic with periods.
It works with a `t` type representing time 955 + as float seconds, alongside `day`, `month`, `year`, and Unix time 956 + structures, prioritizing Unix time precision over sub-second 957 + accuracy. It suits applications tolerating minor imprecision in date 958 + comparisons or arithmetic, such as logging systems or coarse-grained 959 + scheduling, where exact floating-point equality isn't critical. 960 + 961 + #4 - calendar: CalendarLib.Calendar_builder.Make 962 + Similarity: 0.5112 963 + Original Similarity: 0.7302 964 + Popularity Score: 0.0785 965 + Description: This module combines date and time functionality to 966 + construct and manipulate calendar values with float-based precision, 967 + offering operations like timezone conversion, component extraction 968 + (day, month, year, etc.), and arithmetic using `Period.t`. It works 969 + with a calendar type `t` that integrates date and time components, 970 + alongside conversions to Unix timestamps, Julian day numbers, and 971 + structured representations like `Unix.tm`. Designed for scenarios 972 + requiring precise temporal calculations (e.g., calendar arithmetic, 973 + Gregorian date validation, or leap day checks), it balances 974 + flexibility with known precision limitations inherent to float-based 975 + time representations. 976 + 977 + #5 - timmy-unix: Clock 978 + Similarity: 0.5080 979 + Original Similarity: 0.7257 980 + Popularity Score: 0.0000 981 + Description: This module provides functions to retrieve the current 982 + POSIX time, the local timezone, and the current date in the local 983 + timezone. It works with time and date types from the Timmy library, 984 + specifically `Timmy.Time.t` and `Timmy.Date.t`. 
Use this module to 985 + obtain precise time and date information for logging, scheduling, or 986 + time-based computations.</code></pre></div> 987 + <p>and for &quot;Balanced Tree&quot;:</p> 988 + <div><pre class="language-nolang"><code>#1 - grenier: Mbt 989 + Similarity: 0.5274 990 + Original Similarity: 0.7534 991 + Popularity Score: 0.0495 992 + Description: This module implements a balanced binary tree structure 993 + with efficient concatenation and size-based operations. It supports 994 + tree construction through leaf and node functions, automatically 995 + balancing nodes and annotating them with values from a provided 996 + measure module. It is useful for applications requiring fast access, 997 + dynamic sequence management, and efficient merging of tree-based 998 + data structures. 999 + 1000 + #2 - camomile: CamomileLib.AvlTree 1001 + Similarity: 0.5008 1002 + Original Similarity: 0.7155 1003 + Popularity Score: 0.0495 1004 + Description: This module implements balanced binary trees (AVL 1005 + trees) with operations for constructing, deconstructing, and 1006 + traversing trees. It supports key operations like inserting nodes, 1007 + extracting leftmost/rightmost elements, concatenating trees, and 1008 + folding or iterating over elements. It is useful for maintaining 1009 + ordered data with efficient lookup, insertion, and deletion, such as 1010 + in symbol tables or priority queues. 1011 + 1012 + #3 - batteries: BatAvlTree 1013 + Similarity: 0.5003 1014 + Original Similarity: 0.7147 1015 + Popularity Score: 0.1485 1016 + Description: This module implements balanced binary trees (AVL 1017 + trees) with operations for creating, modifying, and traversing 1018 + trees. It supports tree construction with optional rebalancing, 1019 + splitting, and concatenation, and provides root, left, and right 1020 + accessors with failure handling. 
Concrete use cases include 1021 + efficient ordered key-value storage, set-like structures, and 1022 + maintaining sorted data with logarithmic-time insertions and 1023 + lookups. 1024 + 1025 + #4 - grenier: Bt2 1026 + Similarity: 0.4927 1027 + Original Similarity: 0.7039 1028 + Popularity Score: 0.2634 1029 + Description: This module implements a balanced binary tree structure 1030 + with efficient concatenation and rank-based access. It supports 1031 + creating empty trees, constructing balanced nodes, and joining two 1032 + trees with logarithmic cost relative to the smaller tree's size. Use 1033 + cases include maintaining ordered collections with frequent splits 1034 + and joins, and efficiently accessing elements by position. 1035 + 1036 + #5 - grenier: Mbt.Make 1037 + Similarity: 0.4913 1038 + Original Similarity: 0.7019 1039 + Popularity Score: 0.0495 1040 + Description: This module implements a balanced tree structure with 1041 + efficient concatenation and size-based operations. It supports 1042 + construction of trees using leaf and node functions, where nodes are 1043 + automatically balanced and annotated with measurable values from 1044 + module M. The module enables efficient rank queries and joining of 1045 + trees, with applications in managing dynamic sequences where fast 1046 + access and concatenation are critical.</code></pre></div> 1047 + <h2 id="limitations-and-future-work"><a href="#limitations-and-future-work" class="anchor"></a>Limitations and future work</h2> 1048 + <p>We're aware that there are currently a number of limitations with what's been done so far, and there's a lot of exciting things that could quite easily be added!</p> 1049 + <p>We haven't done much prompt optimisation either for the tools themselves, nor their descriptions in the MCP server. 
We also haven't done much optimisation of the information retrieval - and it's clear from some of the results shown above that there are improvements to be made in the ranking algorithms. Some obvious next steps would be to do some <a href="https://arxiv.org/html/2406.12433v2">re-ranking</a> or some form of hybrid search.</p> 1050 + <p>A particular challenge is that since this is based entirely off of the <code>ocaml-docs-ci</code> build, it won't necessarily reflect the actual API of your local build, as for OCaml, this <a href="https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html">can't be done</a>. Thibaut Mattio is working on a <a href="https://github.com/tmattio/ocaml-mcp">local MCP server</a> that would be perfectly positioned to do some of what we're doing, although we'd need to have a good local docs build implemented in dune for this to work well.</p> 1051 + <p>Also, there's plenty more data that we've collected during the docs builds! We can show the implementations of functions, we can expose code samples, select different versions of packages and much more. While we've concentrated on the search aspects, there's still a lot of low-hanging fruit that can be worked on.</p> 1052 + <p>If you're interested in helping us out on this, the project lives <a href="https://github.com/sadiqj/odoc-llm">on GitHub</a> - come along and join us!</p> 1053 + <h2 id="using-the-server"><a href="#using-the-server" class="anchor"></a>Using the server</h2> 1054 + <p>If you'd like to try it, we've got a demo server running right now. It's hosted on dill.caelum.ci.dev here at the Computer Laboratory in the University of Cambridge. To enable it with Claude, try this:</p> 1055 + <div><pre class="language-bash"><code>claude mcp add -t sse ocaml http://dill.caelum.ci.dev:8000/sse</code></pre></div> 1056 + <p>Obviously this is pre-alpha quality software, and we might take it down with no notice, and it might not work as expected, and all of the other usual caveats.
Let us know if it works, or doesn't, or if you've got some suggestions for improvements!</p>]]></content> 1057 + </entry> 1058 + <entry> 1059 + <id>https://jon.recoil.org/blog/2025/08/week33.html</id> 1060 + <title>Week 33</title> 1061 + <published>2025-08-19T00:00:00Z</published> 1062 + <updated>2025-08-19T00:00:00Z</updated> 1063 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/08/week33.html"/> 1064 + <summary>More work this week on the OCaml MCP server. Sadiq and I met before I went away on holiday and discussed the next steps to 'park' the work on the MCP server. The final steps are:</summary> 1065 + <content type="html"><![CDATA[<h1 id="week-33"><a href="#week-33" class="anchor"></a>Week 33</h1> 1066 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-08-19</p></li></ul> 1067 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>cohttp,yojson,jsonm</p></li></ul> 1068 + <p>More work this week on the OCaml MCP server. Sadiq and I met before I went away on holiday and discussed the next steps to 'park' the work on the MCP server. The final steps are:</p> 1069 + <ul><li>Write a README</li><li>Write and run a small script to fix a problem with module-type names</li><li>Write up and publish a blog post</li></ul> 1070 + <p>Not much, right? As always though, writing things up led to a whole load more work.</p> 1071 + <p>The first problem occurred when writing up how it parsed the input docs. It turned out that when converting the repo so that it took markdown-formatted files (using a <a href="https://github.com/jonludlam/odoc/tree/odoc-llm-markdown">slightly tweaked</a> version of <a href="https://github.com/ocaml/odoc/pull/1341">davesnx's PR</a>), Claude had decided that the way to do this was to first convert the markdown into HTML, and then use the HTML parser it had already built.
Whilst tidying this up, Claude was remarkably keen to just use regexps to parse the markdown rather than using a pre-existing markdown library, so it took a little persuasion to get it into a state I was happy with.</p> 1072 + <p>The second issue was that the scripts that form the bulk of the repo had been written at different times, and therefore Claude didn't really take into account any of the decisions it had made in one script when building the next. So most of the command-line arguments were slightly different, which made writing up a mini 'howto' in the README quite a jarring experience.</p> 1073 + <p>Thirdly, and most importantly, we had decided that we needed a few example searches to show how the system worked. We'd already had a <a href="../07/week28.html" title="week28">useful experience</a> with this when Anil had tried to search for a 'time and date parsing and formatting' library, so it shouldn't really have been a surprise that trying a few more examples showed some more interesting behaviour. Specifically, the searches I wanted to do were for an &quot;HTTP client&quot;, &quot;JSON parser&quot;, &quot;Cryptographic Hash&quot; and Anil's time-and-date query, and in actually trying these searches and critically examining the results, I had to go back and figure out why they weren't giving me the results I had expected.</p> 1074 + <p>The first of these searches I had anticipated would be quite interesting, as this is a query that should show the OCaml ecosystem <a href="https://discuss.ocaml.org/t/simple-modern-http-client-library/11239">missing an obvious HTTP client</a>. However, even with this in mind one of the top results was one of Cohttp's module types, <code>Cohttp.Generic.Client.S</code>. This, of course, isn't much use if you're looking for an HTTP client, as module-types aren't going to give you an implementation to actually use. So I decided that we'd exclude module-types from the results.
This turned out to be slightly more tricky than I anticipated as we'd lost the distinction between modules and module types further back in the pipeline, so Claude had to do some plumbing to ensure we had this information at the point we were doing the search.</p> 1075 + <p>The cryptographic hash search gave some plausible looking results, so I moved on to the JSON search. I was expecting to see <code>Yojson</code> somewhere near the top of the list as that's a very popular library. I was also expecting to see <code>Jsonm</code> somewhere near the top - or at least I'd like to be able to find it by searching for a 'streaming parser' as that's one of its key strengths. However, searching for &quot;JSON parser&quot; yielded some less than brilliant answers. The top 5 results were for modules in the packages <code>yojson-five</code>, <code>decoders-yojson</code>, <code>decoders-jsonaf</code>, <code>ocplib-json-typed-browser</code> and <code>ppx_protocol_conv_jsonm</code>. While all of these are clearly in the same realm as I was after, having <code>jsonm</code> show up literally 99th in the list, and <code>yojson</code> itself not in the top 100 wasn't a great result.</p> 1076 + <p>Some investigation showed that yojson had a particularly bad showing because the description of the module <code>Yojson.Basic</code> was the empty string! This turned out to be because of some bad error-handling logic in the summariser script, which ended up turning some errors into a blank description. Since running the summariser costs actual money, I didn't want to just rerun the whole thing, so I asked Claude for a script to find these problems and rerun them. The problem is not totally trivial as the summaries of child modules are used when generating the summary for parents, so when one is regenerated we should regenerate the summaries of all ancestors too. 
Given my recent experiences with Claude I'd like to look this over quite carefully before letting it loose on my data, so I've run it on yojson, which seemed to do the right thing, but not yet on the rest of the packages.</p> 1077 + <p>Having fixed this, I still found that <code>jsonm</code> was making a very poor showing. This turned out to be because the description it gives itself is a &quot;Non-blocking streaming JSON codec for OCaml&quot; which had a fairly low similarity with &quot;JSON parser&quot;. I was using a fairly small embedding model for the queries - Qwen/Qwen3-Embedding-0.6B, so I thought I might address this by using a larger one, and opted for Qwen/Qwen3-Embedding-8B. The machine I had been using for the MCP server has no GPU and had taken a while to do the embeddings using the 0.6B model, so I switched to generating them on my M4 MacBook. This went <i>much</i> faster, though since I have about 70MB of module summaries it still took quite a while. This improved the situation somewhat, but it was still not high in the list.</p> 1078 + <p>So I took a step back and had a think about the problem some more. Searching for a JSON parser is really quite a high-level search, and when evaluating the results I realised I was really thinking in terms of packages rather than modules. So I thought we could split the search in two - a package search and a module search. The package search would be used for the broad queries where you're interested in pulling in whole chunks of functionality, and the module search is for more low-level queries. In fact, the 'time and date formatting' query is somewhere in between, so I might need to have some more example queries for the module search functions.
In addition, the module search could be restricted to the set of packages you're using, which might make it even more useful.</p> 1079 + <p>Part of the split meant that I needed a different source of 'popularity' for the packages than the occurrences data that came out of docs ci, as that was per-module and I needed something per-package. The obvious thing is to look at reverse dependencies in opam. I have this kind-of working, but it's currently not particularly smart, so this will need a little more attention. For example, it currently thinks that <a href="https://melange.re/v5.0.0/">melange</a> has over 3000 reverse dependencies.</p> 1080 + <p>With these changes in place, a package search for 'JSON parser' now returns <code>yojson</code> as number one, followed by <code>ppx_deriving_yojson</code>, <code>ezjsonm</code>, <code>ocplib-json-typed</code> and <code>jsonaf</code>. Unfortunately <code>jsonm</code> is still languishing in 27th place, so there's still some tweaking to do.</p>]]></content> 1081 + </entry> 1082 + <entry> 1083 + <id>https://jon.recoil.org/blog/2025/07/retrospective.html</id> 1084 + <title>4 months in, a retrospective</title> 1085 + <published>2025-07-18T00:00:00Z</published> 1086 + <updated>2025-07-18T00:00:00Z</updated> 1087 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/07/retrospective.html"/> 1088 + <summary>Astonishingly, it's already been four whole months since starting back at the university, which I find incredibly hard to believe.
I'm utterly convinced that it was only a couple of weeks ago that I walked back into t...</summary> 1089 + <content type="html"><![CDATA[<h1 id="4-months-in,-a-retrospective"><a href="#4-months-in,-a-retrospective" class="anchor"></a>4 months in, a retrospective</h1> 1090 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-07-18</p></li></ul> 1091 + <p>Astonishingly, it's already been <i>four whole months</i> since starting back at the university, which I find incredibly hard to believe. I'm utterly convinced that it was only a couple of weeks ago that I walked back into the Computer Laboratory as an SRA for the first time since 2021, but here we are, at the end of term already. Time to do a bit of a retrospective and forward-looking plan for the next 3-4 months!</p> 1092 + <h2 id="what's-happened?"><a href="#what's-happened?" class="anchor"></a>What's happened?</h2> 1093 + <p>On Wednesday this week, I had a chance to sit down with Anil, supposedly to talk about the upcoming lecturing of 1A Foundations of Computer Science, but we ended up talking about what I've been doing for the past few months, and where it fits into the broader picture of the group as a whole. It was a really useful conversation, and I thought it would be good to outline it here while it's fresh in my mind.</p> 1094 + <p>So then, to start, what have I been doing? What have I achieved? What have I learnt? It's been a bit of a daunting experience, landing in a team that are already working one hundred miles an hour on things well out of my comfort zone. I've been going to group meetings and having lots of interesting conversations, but I've found it difficult to make the next steps happen.
One area where I've had some success is in working with Sadiq on LLMs - in particular, getting local LLMs to solve programming exercises that we both <a href="https://toao.com/blog/ocaml-local-code-models">wrote</a> <a href="../05/ticks-solved-by-ai.html" title="ticks-solved-by-ai">up</a>. I've also been working with him on taking the output from the docs CI and <a href="https://github.com/sadiqj/odoc-llm">summarising it with LLMs</a> in order to create an MCP server that would help tools like <a href="https://anthropic.com/">Claude Code</a> to choose OCaml packages to solve users' problems.</p> 1095 + <p>It's been somewhat easier, partly due to inertia, to carry on with projects that had been in flight at the time I started. Things like getting the Odoc 3 generated docs onto ocaml.org, which is finally complete only <a href="odoc-3-live-on-ocaml-org.html" title="odoc-3-live-on-ocaml-org">as of this week!</a>. This has taken a whole lot of time, but I'm really pleased with the end results. There's still an awful lot of improvements that I'd like to see made, which, after drawing breath for a couple of weeks, I'll be writing down.</p> 1096 + <p>An itch I'd been wanting to scratch for a long time has been to look at client-side ocaml notebooks. I decided to make this an integral <a href="../04/this-site.html" title="this-site">feature of this blog</a>, and I've learnt an awful lot doing it. An important feature of this that I've been keeping in mind is the idea that we could use the ocaml-docs-ci tool to build the libraries, which would allow us to host a toplevel for every single package in opam-repository - allowing at best <a href="https://discuss.ocaml.org/t/an-example-for-every-ocaml-package/16953/10">interactive examples</a>, and at bare minimum merlin for live type-checking and autocompletion. 
The important principles to keep in mind for this are that:</p> 1097 + <ul><li>We have one 'toplevel' javascript file, and libraries and cmis are dynamically loaded</li><li>The interface between the frontend and the worker must not rely on a matched pair, e.g. an OCaml-5.3-compiled frontend might be talking to an OCaml-4.08-compiled worker thread - or even an oxcaml one!</li></ul> 1098 + <p>I have this all working on my blog, where I have both an oxcaml worker and a standard ocaml worker and they both dynamically load in libraries and cmis as specified on the page.</p> 1099 + <p>I've also supervised a 1A course for the first time - <a href="https://www.cl.cam.ac.uk/teaching/2425/IntroProb/">Introduction to Probability</a>, and I've done some marking for the 1A Foundations of Computer Science.</p> 1100 + <p>Something that I'd been expecting to do a lot on was work with oxcaml, but as the release happened later than anticipated and coincided with the marking and supervising, I've not done quite as much of this as I had thought I would. In addition, I had anticipated working on Odoc to start implementing the new features of oxcaml, but to avoid duplicating effort I've been waiting for the patches that have already been written at Jane Street to at least get odoc to compile, which have taken longer than I had hoped to get to me.</p> 1101 + <h2 id="what's-next?"><a href="#what's-next?" class="anchor"></a>What's next?</h2> 1102 + <p>With that in mind, Anil and I then talked about the bigger picture, as those of you who know Anil will be entirely unsurprised to hear! In particular, how will we be weaving the various threads of these activities - the teaching of OCaml, the large-scale (for OCaml) CI work, the LLMs and Oxcaml work - together to form a coherent whole? How do I find a balance between them and ensure that we find <a href="https://arxiv.org/abs/1106.0848">synergies</a> as opposed to pulling in different directions?
How do we make sure what we're doing helps us navigate the upending of the nature of development that agentic coding is bringing?</p> 1103 + <h3 id="efficient-and-reusable-ci"><a href="#efficient-and-reusable-ci" class="anchor"></a>Efficient and reusable CI</h3> 1104 + <p>A clear and obvious area where we'll be able to see real progress is to extract from docs CI the logic that I've been using to do efficient builds of packages. As I previously <a href="odoc-3-live-on-ocaml-org.html" title="odoc-3-live-on-ocaml-org">wrote about</a>, the new CI system is far more efficient than some of the other ocurrent-based pipelines, and it would save a huge amount of compute time if we were to take this tech and apply it elsewhere.</p> 1105 + <p>So, how might we take what we've got and produce something useful to everyone? We need to take a hammer to the fracture points of the docs CI service and split it into individually useful parts. Here are some next steps as I see them now. Let's take the solver out of docs CI, and have a service whose sole job is to create a repository of up-to-date solutions for all versions of all packages in opam-repository. These are the data that allow us to build the tree of package builds.</p> 1106 + <p>Next, turn these solutions into one giant build. Perhaps a script? Maybe a giant buildkit dockerfile? This is very similar to Mark Elvers' <a href="https://github.com/mtelvers/ohc">day10</a> project. We can get this running on a big machine and just see how fast we can build everything. The key thing here is that it should be <em>trivial</em> to run this on a linux box. A raspberry pi or a 768-core behemoth with 3TiB of ram. Just how fast <em>can</em> we get it going? It's already building in a couple of days using <a href="odoc-3-live-on-ocaml-org.html" title="odoc-3-live-on-ocaml-org">sage</a>, but that's using ocurrent/obuilder, which isn't quite the right tool for the job, and on a relatively puny machine. Can we do it in an hour? 10 minutes? 
Certainly the incremental builds ought to be done in seconds. What's the limit?</p> 1107 + <p>These tools can then be used as the foundation for other CI systems. For opam-repo-ci, where we should be able to do the builds for a new package incredibly quickly. For opam-health-check, where we currently build foundational packages like dune and findlib <i>thousands of times</i> per run.</p> 1108 + <p>Once we've got the packages built, docs CI is simply a pass over the top of the built artifacts. ocaml-docs-ci already demonstrates this - it only takes a few hours to rebuild all the docs when a new version of odoc is released, but in a way that only benefits docs! All the CI systems should be able to use this.</p> 1109 + <p>We should also then be able to run js_of_ocaml on the libraries to build the infrastructure needed for the per-package toplevels for ocaml.org that I mentioned above. Each of these steps should be separate stages in a pipeline - one where each step produces artifacts for the next to consume.</p> 1110 + <p>When we mix in some of the projects that other people in the team are working on, like David's work on <a href="https://www.dra27.uk/blog/">relocatable OCaml</a>, we've got something that might be able to form a basis for a binary cache for Dune Package Management, particularly when we involve Ryan's <a href="https://ryan.freumh.org/papers.html#2025-arxiv-hyperres">Hyperres</a> paper so we might check that dependencies from outside of the OCaml universe are correct. Can we use <a href="https://github.com/quantifyearth/shark">Patrick and Michael's shark</a> to do the build steps? Can we use these images to serve up toplevels for ocaml.org that are <em>real toplevels</em> rather than javascript toplevels? Can we use these build environments to help with reinforcement learning to train LLMs on OCaml code? 
There are a lot of interesting directions to take this work.</p> 1111 + <h3 id="other-projects"><a href="#other-projects" class="anchor"></a>Other projects</h3> 1112 + <p>There are, of course, other responsibilities that I have. Some of these I'll be able to fit in with the theme above, and some - well - maybe I'll have to figure out how to delegate them, a skill that I am not particularly good at, but one that I feel I should learn!</p> 1113 + <h4 id="teaching"><a href="#teaching" class="anchor"></a>Teaching</h4> 1114 + <p>A looming, terrifying, but tremendously exciting opportunity is teaching of 1A Foundations of Computer Science. This is amongst the first courses we teach our incoming undergraduates, currently lectured by <a href="https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/">Anil</a>. As he's on sabbatical this year, he has asked me to step up and lecture it. This is definitely not one for delegation!</p> 1115 + <p>The immediate question, partly raised by my work with Sadiq, is: what do we do about LLMs? How should we adjust our teaching to take into account the existence of these tools? We had a very interesting chat earlier in the term with Professor <a href="https://eecs.iisc.ac.in/people/prof-viraj-kumar/">Viraj Kumar</a> from <a href="https://eecs.iisc.ac.in/">IISc</a> who was visiting Cambridge earlier this year. He's been <a href="https://dl.acm.org/doi/10.1145/3724363.3729100">working on this question</a> for a while now, and I hope to have some more conversations with him over the summer.</p> 1116 + <h4 id="odoc-paper"><a href="#odoc-paper" class="anchor"></a>Odoc paper</h4> 1117 + <p>An area where I've really made a shockingly small amount of progress is to write up all the work that's gone into Odoc over the past 6 (!!!) years.</p> 1118 + <h4 id="odoc-notebooks"><a href="#odoc-notebooks" class="anchor"></a>Odoc notebooks</h4> 1119 + <p>This needs to be tidied up and a v0.1 released. 
In particular, the work on js_top_worker might well be shared with Arthur's <a href="https://github.com/art-w/x-ocaml">x-ocaml</a> for a unified toplevel experience.</p> 1120 + <h4 id="ai-work"><a href="#ai-work" class="anchor"></a>AI work</h4> 1121 + <p>I'd like to carry on the work I've started with Sadiq on the interaction of LLMs with OCaml. Getting the package search to work sensibly for an MCP server is first on the list. Doing some reinforcement learning to improve performance specifically on OCaml is also very interesting, but it's not something I've managed to carve out the time for yet. Something along the lines of <a href="https://arxiv.org/abs/2504.21798">swesmith</a> but adapted for OCaml.</p> 1122 + <h4 id="oxcaml-odoc"><a href="#oxcaml-odoc" class="anchor"></a>Oxcaml Odoc</h4> 1123 + <p>Odoc needs to have some work done on it to support the new work that's gone into oxcaml, for example documenting of the modes. This is something I do expect to be working on soon.</p> 1124 + <h4 id="dune-and-odoc"><a href="#dune-and-odoc" class="anchor"></a>Dune and odoc</h4> 1125 + <p>Work needs to be done on the dune rules for odoc, which currently only support the feature-set in odoc 2.x. Paul-Elliot has <a href="https://github.com/ocaml/dune/pull/11716">done some work on this</a>, but much more needs to be done.</p> 1126 + <h4 id="further-general-odoc-work"><a href="#further-general-odoc-work" class="anchor"></a>Further general odoc work</h4> 1127 + <ul><li>Better source rendering</li><li>Syntax for linking to source</li><li>Custom tags (used in odoc_notebook)</li><li>Web-native rendering, for embedding odoc in a website</li><li>Unifying paths and cpaths (https://github.com/jonludlam/odoc/tree/parameterised-paths)</li></ul> 1128 + <h2 id="what-to-actually-do?"><a href="#what-to-actually-do?" class="anchor"></a>What to <i>actually</i> do?</h2> 1129 + <p>There are a lot of things in the above list. 
I'm not sure yet how I manage to figure out what I actually end up doing, and how that helps me to help Tarides, to fit in as a useful member of the EEG group, and to make sure I'm doing what's right for my own future. I feel the core project of the CI work will help everyone, but slotting the other work into the bigger picture will require some careful thought.</p>]]></content> 1130 + </entry> 1131 + <entry> 1132 + <id>https://jon.recoil.org/blog/2025/07/week28.html</id> 1133 + <title>Week 28</title> 1134 + <published>2025-07-14T00:00:00Z</published> 1135 + <updated>2025-07-14T00:00:00Z</updated> 1136 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/07/week28.html"/> 1137 + <summary>Week 28</summary> 1138 + <content type="html"><![CDATA[<h1 id="week-28"><a href="#week-28" class="anchor"></a>Week 28</h1> 1139 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-07-14</p></li></ul> 1140 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>caqti.platform,mariadb</p></li></ul> 1141 + <h2 id="ocaml-mcp-server"><a href="#ocaml-mcp-server" class="anchor"></a>OCaml MCP server</h2> 1142 + <p>Last week I got the summarisation to the point where it felt useful to run it across all the modules in opam. With this completed we then got to try out the MCP server to see how useful it would be in practice.</p> 1143 + <p>One of the first queries <a href="https://anil.recoil.org/">Anil</a> tried was to ask it which libraries would be most useful for &quot;date time parsing and formatting&quot;. We were surprised to see that the first two libraries it returned were <code>caqti</code> and <code>mariadb</code>, specifically mentioning the module <code>Caqti_platform.Conv</code> and <code>Mariadb.S.Time</code>. While these do indeed provide the required functionality, they're probably not the right libraries to provide this. 
It's going to be tricky to decide this in the MCP server, so we should probably be leaving it up to the LLM to decide amongst them on the client. However, for very general queries we might end up with a large number of matching libraries, so we'll need to have a limit on the number of packages returned, which implies some form of ranking.</p> 1144 + <p>One way we can do this is by using the occurrences code in odoc. The idea is that we examine module implementation files (i.e. ml rather than mli files) and count the number of times the code uses values, types and other identifiers from other libraries. We can then aggregate these counts over all packages in opam repository and use it as an effective marker of popularity, which allows us to rank the results by popularity and only return the top N results.</p> 1145 + <p>We're not currently using the occurrences for anything, so I wasn't especially surprised to find that it's not working as intended. There were a number of issues:</p> 1146 + <ul><li>The occurrences output file was being written at a path not within the package dir, so it wasn't being persisted.</li><li>The CLI interface for generating occurrences works by providing a directory containing the odocl files, but we were only providing the top-level directory and it wasn't recursively searching.</li><li>Once the occurrences were captured, the aggregation step used the full identifier of the value being aggregated, meaning that, for example, <code>List.length</code> in OCaml 5.3 was counted separately from <code>List.length</code> in OCaml 4.14.</li></ul> 1147 + <p>All of these issues are with code in the odoc repository, which, as it happens, also needs a release soon to ensure that it works with the imminent launch of OCaml 5.4. 
During the week, before I discovered the problems above, I had attempted to make a release of Odoc 3.1, but there was a license kerfuffle that, when combined with the issues in the occurrences code, gave me enough cause to pull the release.</p> 1148 + <p>Before I try to make the release again, this time I'll be running the release candidate with docs-ci, and checking that the occurrences make sense. I set this running on Friday afternoon, and it had completed by Friday evening, so it's actually pretty quick to rerun odoc on the 15,000 or so packages required for ocaml.org.</p> 1149 + <h2 id="trouble-with-this-blog"><a href="#trouble-with-this-blog" class="anchor"></a>Trouble with this blog</h2> 1150 + <p>In other news, in trying to post my blog at the beginning of the week, I was stymied a little by the changes in oxcaml. I had been using a custom opam-repository forked from the official oxcaml one, because I needed a patched js_of_ocaml in order to fix the toplevel code. I had hoped this would mean that I could update it on my schedule, rather than being at the mercy of upstream changes. Unfortunately though, the download URL for ocaml-flambda wasn't pointing at an immutable commit, so when I tried it I got a checksum error. So I ended up trying to rebase the changes onto the latest oxcaml opam-repository, which didn't go well at all. The version numbers had all changed, which in opam means that files are in different directories, so git got thoroughly confused. On top of that, because the js_of_ocaml repository has multiple packages in it, whereas opam repository has a directory per-package, we end up having multiple copies of the patches. So in the end I've just committed all the patches to a git repo on github, and pinned it in the Dockerfile that builds this site.</p> 1151 + <p>What would be handy is a way to apply the patches in a package in opam repository to and from a git repository, similar to quilt/guilt. 
We don't quite have all of the pieces to do this, as although we have a download URL and often a dev-repo, I don't believe we currently have a way to get the base commit of that repository.</p> 1152 + <h2 id="oxcaml-continues"><a href="#oxcaml-continues" class="anchor"></a>Oxcaml continues</h2> 1153 + <p>We had a meeting on Thursday with Jane Street on the next steps for oxcaml. There are a number of areas in which JS are keen for us to help out with.</p> 1154 + <ul><li>Playgrounds - both javascript and docker-based. The playground on the oxcaml website right now uses github codespaces, which works nicely but currently takes an absolute age to start up. We can almost certainly improve this by building docker images and pushing them to the docker hub, rather than building oxcaml from scratch when starting the codespace. There's also interest in the javascript playgrounds, which can serve a slightly different purpose than the docker-based one, more limited in how it can be used, but without requiring someone to spin up a full docker container.</li><li>Documentation - Odoc has had some patches to run on oxcaml, but there's no support for documenting many of the new features yet, including modes. We've got to do some experiments here to see what the best way is to show the new type-system features in the generated docs. There were some suggestions of using javascript to show/hide the modes, for example.</li><li>Improvements in Merlin - again this is an area ripe for investigation. In particular, how do we best expose the new features of the type system for users? What's needed here is user feedback from people who are actually using oxcaml to build real projects.</li><li>Better error messages - OCaml has been getting improved error messages with each release, but there's still room for improvement, and the new features of the type system in particular have many different failure modes. 
Again, we need user feedback to understand the pain points and improve the error messages accordingly.</li></ul> 1155 + <h2 id="next-week"><a href="#next-week" class="anchor"></a>Next week</h2> 1156 + <p>Next week, the plan is to:</p> 1157 + <ul><li>Check the occurrences from docs-ci, and integrate into the MCP server</li><li>Talk to <a href="https://tunbury.org/">Mark</a> about building the docker image for the oxcaml playground</li><li>Tidy up the <code>Js_top_worker</code> code so it can be used in the javascript oxcaml playground</li><li>Release Odoc 3.1</li></ul>]]></content> 1158 + </entry> 1159 + <entry> 1160 + <id>https://jon.recoil.org/blog/2025/07/odoc-3-live-on-ocaml-org.html</id> 1161 + <title>Odoc 3 is live on OCaml.org!</title> 1162 + <published>2025-07-14T00:00:00Z</published> 1163 + <updated>2025-07-14T00:00:00Z</updated> 1164 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/07/odoc-3-live-on-ocaml-org.html"/> 1165 + <summary>As of today, Odoc 3 is now live on OCaml.org! This is a major update to odoc, and has brought a whole host of new features and improvements to the documentation pages.</summary> 1166 + <content type="html"><![CDATA[<h1 id="odoc-3-is-live-on-ocaml.org!"><a href="#odoc-3-is-live-on-ocaml.org!" class="anchor"></a>Odoc 3 is live on OCaml.org!</h1> 1167 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-07-14</p></li></ul> 1168 + <p>As of today, Odoc 3 is now live on OCaml.org! 
This is a major update to odoc, and has brought a whole host of new features and improvements to the documentation pages.</p> 1169 + <p>Some of the highlights include:</p> 1170 + <ul><li>Source code rendering</li><li>Hierarchical manual pages</li><li>Image, video and audio support</li><li>Separation of API docs by library</li></ul> 1171 + <p>A huge amount of work went into the <a href="https://discuss.ocaml.org/t/ann-odoc-3-0-released/16339">Odoc 3.0 release</a>, and I'd like to thank my colleagues at Tarides, in particular <a href="https://github.com/panglesd">Paul-Elliot</a> and <a href="https://github.com/julow/">Jules</a> for the work they put into this.</p> 1172 + <p>But the odoc release happened months ago, so why is it only going live now? So, the doc tool itself is only one small part of getting the docs onto ocaml.org. Odoc works on the <a href="https://discuss.ocaml.org/t/cmt-cmti-question/5308">cmt and cmti</a> files that are produced during the build process, and so part of the process of building docs is to build the packages, so we have to, at minimum, attempt to build all 17,000 or so distinct versions of the packages in opam-repository. The <a href="https://github.com/ocurrent">ocurrent</a> tool <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a>, which I've previously <a href="../05/docs-progress.html" title="docs-progress">written</a> <a href="../04/ocaml-docs-ci-and-odoc-3.html" title="ocaml-docs-ci-and-odoc-3">about</a>, is responsible for these builds and in this new release has demonstrated a new approach to this task, where we attempt to do the build in as efficient a way as possible by effectively building binary packages once for each required package in a specific 'universe' of dependencies. For example, many packages require e.g. <a href="https://erratique.ch/software/cmdliner">cmdliner.1.3.0</a> to build, and some require a specific version of OCaml too. 
So we'll build cmdliner.1.3.0 once against each version of OCaml required -- but <i>only once</i>, which is in contrast to how some of the other tools in the ocurrent suite work, e.g. <a href="https://github.com/ocurrent/opam-repo-ci">opam-repo-ci</a>. Once the packages are built, we then run the new tool <a href="https://ocaml.github.io/odoc/odoc-driver/index.html">odoc_driver</a> to actually build the HTML docs. In addition to this, a new feature of Odoc 3 is to be able to link to packages that are your direct dependencies - so for example, the docs of odoc contain links to the docs of odoc_driver, even though odoc_driver depends upon odoc. This, whilst sounding easy enough, required some radical changes in the docs ci, which I promise I will write about later!</p> 1173 + <p>The builds and the generation of the docs are all done on a single blade server, called <a href="https://sage.caelum.ci.dev">sage</a>, with 40 threads, 2 8TiB spinning drives and a 1.8TiB SSD cache, and it produces about 1 TiB of data over the course of a couple of days. The changes required to this part of the process since odoc 2.x were primarily made by <a href="https://tunbury.org">Mark Elvers</a> and me.</p> 1174 + <p>Once the docs are built, how do they get onto ocaml.org? Odoc itself knows nothing about the layout and styling of ocaml.org, so the HTML it produces isn't suitable to be just rendered when a user requests particular docs. What happens is that odoc produces, as well as a self-contained HTML file, a json file with the body of the page, the sidebars, the breadcrumbs and so on as structured data, one per HTML page, which are then served from sage over HTTP. 
When a user requests a particular docs page, the ocaml.org server will request that json file from sage, then render this with the ocaml.org styling, then send it back to the user.</p> 1175 + <p>As odoc 3 moved a fair bit of logic from ocaml.org into odoc itself, there were quite a few changes that needed to be made to the ocaml.org server to integrate this into the site. This work was mostly done by <a href="https://github.com/panglesd">Paul-Elliot</a> and myself, with a lot of help from the <a href="https://github.com/ocaml/ocaml.org?tab=readme-ov-file#maintainers">ocaml.org team</a>, in particular <a href="">Sabine Schmaltz</a> and <a href="https://github.com/cuihtlauac">Cuihtlauac Alvarado</a>.</p> 1176 + <p>So, quite a lot of integration and infrastructure work was required to get the new docs site up and running, and I'm very happy to see this particular task concluded!</p>]]></content> 1177 + </entry> 1178 + <entry> 1179 + <id>https://jon.recoil.org/blog/2025/07/week27.html</id> 1180 + <title>Weeks 24-27</title> 1181 + <published>2025-07-07T00:00:00Z</published> 1182 + <updated>2025-07-07T00:00:00Z</updated> 1183 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/07/week27.html"/> 1184 + <summary>It's been a busy few weeks. There's been exam marking for the 1A Foundations of Computer Science course, an Odoc release to plan, and some really interesting new work on using LLMs to summarise OCaml ...</summary> 1185 + <content type="html"><![CDATA[<h1 id="weeks-24-27"><a href="#weeks-24-27" class="anchor"></a>Weeks 24-27</h1> 1186 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-07-07</p></li></ul> 1187 + <p>It's been a busy few weeks. There's been exam marking for the 1A Foundations of Computer Science course, an Odoc release to plan, and some really interesting new work on using LLMs to summarise OCaml documentation. 
This post is about an aspect of that last one that I found particularly interesting.</p> 1188 + <h2 id="odoc-llm"><a href="#odoc-llm" class="anchor"></a>odoc-llm</h2> 1189 + <p>Sadiq and I have been <a href="https://toao.com/blog/ocaml-local-code-models">looking at using LLMs</a> to summarise the documentation produced by Odoc. The idea is to get a concise summary of the purpose of each module, so that we can quickly identify which modules are relevant to a particular task.</p> 1190 + <p>For testing this, we need to see how it works on different types of libraries. The first axis I wanted to test on goes between 'well documented' and 'poorly documented', and so I need at least two libraries on opposite ends of the spectrum.</p> 1191 + <p>For the 'well documented' case, I chose <code>cmdliner</code>. It's a library that I almost always have to look at the docs for when I'm using it, as, despite using it many many times, the interface doesn't seem to stick in my head.</p> 1192 + <p>For the 'poorly documented' case, I chose <code>odoc</code> itself, somewhat ironically. In defence of the odoc authors (me included!), the libraries it provides are simply there for code organisation and aren't meant to be consumed outside of the tool itself!</p> 1193 + <h3 id="the-approach"><a href="#the-approach" class="anchor"></a>The approach</h3> 1194 + <p>The output from Odoc is a set of HTML files, one per module/module type/class/etc., containing the documentation for that item. Our first take on this was to parse the HTML files and extract the text content, which we then fed to an LLM to summarise. 
However, this was pretty awkward, so we decided to try a PR that <a href="https://github.com/ocaml/odoc/pull/1341">davesnx recently made to Odoc</a> to output markdown instead of HTML.</p> 1195 + <p>We look for leaf modules that don't contain any submodules, and start by summarising those, then move onto the parent modules, splicing in the summaries of the children, and so on, up to the top-level modules. We then move on to summarising the whole library, which usually is just a single namespace module, but occasionally is a group of top-level modules.</p> 1196 + <p>One of the early prompts for the module <code>Cmdliner.Term.Syntax</code> looked roughly as follows:</p> 1197 + <pre>You are an expert OCaml developer. Write a 2-3 sentence description focusing on: 1198 + - The specific operations and functions this module provides 1199 + - What data types or structures it works with 1200 + - Concrete use cases (avoid generic terms like &quot;utility functions&quot; or &quot;common operations&quot;) 1201 + 1202 + Do NOT: 1203 + - Repeat the module name in the description 1204 + - Mention &quot;functional programming patterns&quot; or &quot;code clarity&quot; 1205 + - Use filler phrases like &quot;provides functionality for&quot; or &quot;collection of functions&quot; 1206 + - Describe how it works with other modules 1207 + 1208 + Module: Syntax 1209 + Module Documentation: let operators. 1210 + ( let+ ) is map. 1211 + ( and* ) is product. 1212 + - val (let+) : 'a t -&gt; ('a -&gt; 'b) -&gt; 'b t (* ( let+ ) is map. *) 1213 + - val (and+) : 'a t -&gt; 1214 + 'b t -&gt; 1215 + ('a * 'b) t (* ( and* ) is product. *)</pre> 1216 + <p>and the output using a small model (qwen3-30b-a3b) was:</p> 1217 + <pre>&quot;The module provides (let+) for applying functions to values within a context and (and+) for combining two contexts into a product. It operates on applicative 1218 + structures like option, list, or custom types that support these operations. 
For example, it enables sequential transformation of values in a context or 1219 + pairing elements from two separate contexts.&quot;</pre> 1220 + <p>There are quite a few issues with the input here. Firstly, we've only given the module name, not the full path. Secondly, there's nothing to let the model know what <code>t</code> might be, so it has decided it's a completely generic type. It also has no idea about the context in which this module was found, so it has no idea that it's part of a command-line processing library. By fixing these issues, we end up with the prompt:</p> 1221 + <pre>You are an expert OCaml developer. Write a 2-3 sentence description focusing on: 1222 + - The specific operations and functions this module provides 1223 + - What data types or structures it works with 1224 + - Concrete use cases (avoid generic terms like &quot;utility functions&quot; or &quot;common operations&quot;) 1225 + 1226 + Do NOT: 1227 + - Repeat the module name in the description 1228 + - Mention &quot;functional programming patterns&quot; or &quot;code clarity&quot; 1229 + - Use filler phrases like &quot;provides functionality for&quot; or &quot;collection of functions&quot; 1230 + - Describe how it works with other modules 1231 + 1232 + Module: Cmdliner.Term.Syntax 1233 + 1234 + Ancestor Module Context: 1235 + - Cmdliner: Declarative definition of command line interfaces. 1236 + Consult the tutorial, details about the supported command line syntax and examples of use. 1237 + Open the module to use it, it defines only three modules in your scope. 1238 + - Cmdliner.Term: Terms. 1239 + A term is evaluated by a program to produce a result, which can be turned into an exit status. A term made of terms referring to command line arguments implicitly defines a command line syntax. 1240 + 1241 + Module Documentation: let operators. 1242 + - val (let+) : 'a Cmdliner.Term.t -&gt; ('a -&gt; 'b) -&gt; 'b Cmdliner.Term.t (* ( let+ ) is map. 
*) 1243 + - val (and+) : 'a Cmdliner.Term.t -&gt; 1244 + 'b Cmdliner.Term.t -&gt; 1245 + ('a * 'b) Cmdliner.Term.t (* ( and* ) is product. *)</pre> 1246 + <p>The output of this improved prompt is much better:</p> 1247 + <pre>The module provides operators to map and combine terms, which represent command line argument parsers and their results. `let+` transforms a parsed argument 1248 + into a new value, while `and+` merges two independent arguments into a tuple. These enable building structured command line interfaces, such as parsing a 1249 + filename and a flag simultaneously, then combining them into a configuration record.</pre> 1250 + <p>It still occasionally incorrectly decides that this module provides monadic combinators rather than applicative, but this is where we get better results from using a larger model.</p> 1251 + <p>There are quite a few other issues that we've fixed - for example, treating module types differently than modules, and a bug where infix operators were being omitted from the documentation. In one case, it uncovered a bug in the markdown generator where includes weren't getting expanded, which I got <a href="https://github.com/jonludlam/odoc/commit/926cca100c307818e57281c3d40e98f1975f0f95">Claude to fix</a>. My <i>modus operandi</i> has essentially been to look at the output for the test packages, find a summary that looks bonkers, and then look back at the prompt to find that, indeed, the input was missing some crucial information.</p> 1252 + <p>One thing I'd quite like to try is to re-open a <a href="https://github.com/ocaml/odoc/pull/655">PR that Drup made</a> as an April Fool's joke back in 2021, which ended up outputting OCaml formatted code. This is actually pretty close to what we might want to give to the LLM - though we'd probably format the comments as markdown, and we'd be replacing types with fully-qualified types as above. 
Funny how things turn out!</p>]]></content> 1253 + </entry> 1254 + <entry> 1255 + <id>https://jon.recoil.org/blog/2025/06/week23.html</id> 1256 + <title>Week 23</title> 1257 + <published>2025-06-09T00:00:00Z</published> 1258 + <updated>2025-06-09T00:00:00Z</updated> 1259 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/06/week23.html"/> 1260 + <summary>Some brief notes on last week.</summary> 1261 + <content type="html"><![CDATA[<h1 id="week-23"><a href="#week-23" class="anchor"></a>Week 23</h1> 1262 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>opam-format,fpath,rresult,bos</p></li></ul> 1263 + <ul class="at-tags"><li class="merlinonly"><span class="at-tag">merlinonly</span> </li></ul> 1264 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-06-09</p></li></ul> 1265 + <p>Some brief notes on last week.</p> 1266 + <h2 id="docs-ci-and-sherlodoc"><a href="#docs-ci-and-sherlodoc" class="anchor"></a>Docs CI and Sherlodoc</h2> 1267 + <p>Anil has been working on an <a href="https://tangled.sh/@anil.recoil.org/odoc-mcp">MCP server</a> that searches through the output of the docs CI to find relevant packages and API information for opam packages. For expediency, this works by scraping the HTML output, but a potentially better solution would be to integrate properly with <a href="https://doc.sherlocode.com">Sherlodoc</a>, <a href="https://github.com/art-w/">Arthur's</a> code search engine.</p> 1268 + <h3 id="building-the-index"><a href="#building-the-index" class="anchor"></a>Building the index</h3> 1269 + <p>To make this work with the new docs CI, first we need to build a sherlodoc search database. This involves grabbing all of the <code>.odocl</code> files that odoc produces for the latest version of each library, copying them locally and running <code>sherlodoc index</code> on the output. 
Getting <em>all</em> of the odocl files is simple, but filtering so we only have the latest version is slightly more complex, as we need to use <code>opam</code>'s library to make sure we're looking at the latest versions.</p> 1270 + <p>The simple way to get the odocl files ends up unpacking them into the filesystem in a directory hierarchy that matches the URL on ocaml.org, so we see files like:</p> 1271 + <pre>p/odoc/3.0.0/doc/odoc.document/odoc_document.odocl</pre> 1272 + <p>So finding the version number is a matter of listing the directories, for which I took <a href="https://github.com/ocurrent/ocaml-docs-ci/blob/4dfe7e6265610da4e0ce2a386cfbf0b8eac3d9bd/src/lib/track.ml#L58-L76">some code</a> from docs CI:</p> 1273 + <div><pre class="language-ocaml"><code>type p = { 1274 + opam : OpamPackage.t; 1275 + path : Fpath.t; 1276 + } 1277 + 1278 + let rec take n l = 1279 + match n, l with 1280 + | n, x::xs when n &gt; 0 -&gt; 1281 + x :: take (n - 1) xs 1282 + | _, _ -&gt; [] 1283 + 1284 + let get_versions ~limit path = 1285 + let open Rresult in 1286 + let package = Fpath.basename path in 1287 + let mk_pkg v = 1288 + Printf.sprintf &quot;%s.%s&quot; package v 1289 + in 1290 + Bos.OS.Dir.contents path 1291 + &gt;&gt;| (fun versions -&gt; 1292 + versions 1293 + |&gt; List.map (fun path -&gt; 1294 + { opam = Fpath.basename path |&gt; mk_pkg |&gt; OpamPackage.of_string; 1295 + path = path }) 1296 + ) 1297 + |&gt; Result.get_ok 1298 + |&gt; (fun v -&gt; 1299 + v 1300 + |&gt; List.sort (fun a b -&gt; 1301 + -OpamPackage.compare a.opam b.opam) 1302 + |&gt; take limit)</code></pre></div> 1303 + <p>This gives us a sorted list of the versions for the package, and we can pick the first one to get the latest version. With the output from this we can then run <code>sherlodoc index</code> and we get a nice big (1.7 gig!) 
index file.</p> 1304 + <h3 id="serving-the-index"><a href="#serving-the-index" class="anchor"></a>Serving the index</h3> 1305 + <p>The next step is to serve this index file so that the MCP server can access it. The file format is a marshalled OCaml value, so we need to have an OCaml program to read it in and perform the search, and it'll have to be a server since the whole index needs to be unmarshalled into memory before any search can be performed, and it would be dumb to do this for every query.</p> 1306 + <p>Sherlodoc got partially integrated into odoc's code base before the 3.0 release with the exception of the server, which was left out to avoid pulling a load of new dependencies to odoc. Unfortunately, we didn't expose the sherlodoc libraries publicly, so we'll need to do that in order to make anything useful with sherlodoc. In addition, odoc embeds the version of odoc used into the odocl files, and since right now the docs CI is building with a version of odoc that <em>doesn't</em> expose the libraries, we might have to hack around that in order to use those odocl files. Obviously the longer term solution is just to make a new release of odoc with this change and update the docs CI to use that.</p> 1307 + <h2 id="package-to-library-map"><a href="#package-to-library-map" class="anchor"></a>Package to Library map</h2> 1308 + <p>A related quest was to assemble a map of opam package to ocamlfind library names. It's a quirk of history that the library names that an opam package provides are not necessarily related to the name of the package. That means that tools like <code>dune</code> have a hard time linting projects to check that the libraries they're using are mentioned in the opam files. 
Dune, of course, has resolved this by ensuring that it's an error to build a package using dune where the library names don't start with the package name, but as dune is just one of many OCaml build systems, the problem remains.</p> 1309 + <p>Since docs CI has built every version of every package, and because the Odoc 3 package layout includes the library names in the paths to the documentation, we should be able to produce a fairly definitive list of the libraries that each package provides, which tools like dune can then use for this sort of lint check. We can just tweak the code above slightly to get the library names and output a big JSON file with the mapping - or perhaps with the exceptions to dune's rule.</p> 1310 + <p>I thought this would be a neat first project to try Claude Code on - a 'starter for ten', as it were - so I signed up to use Claude Code and gave it a shot.</p> 1311 + <p>It handily produced a working program that did exactly what I wanted, including creating a test directory that it used to verify the code worked. One fascinating thing I noted as it scrolled past was that it tried to use <code>yojson</code> to write the output, but failed to get it to work and reverted to <code>printf</code> output. I suspect this will be due to it finding it troublesome to figure out the various steps that need to be taken to use a new library in a dune project, so this is something to have a play with later.</p> 1312 + <p>After a couple of iterations with different heuristics to disambiguate between library names and other directories, I got a working program producing a JSON file with only the exceptions to dune's rule. I took a look through and almost immediately found <code>camlidl</code> suggesting it produces a library called <code>com</code>. This didn't look right at all so I installed it and found that the library is actually named <code>camlidl</code>. 
The <code>cma</code> file, though, is named <code>com.cma</code>, so it looks like <code>odoc_driver</code> has a bug. Interestingly, running <code>odoc_driver</code> locally gets the library name correct, so it's only an issue when running it in the docs CI. <a href="https://github.com/ocaml/odoc/issues/1351">Issue filed</a>.</p> 1313 + <h2 id="further-claude-code-experiments"><a href="#further-claude-code-experiments" class="anchor"></a>Further claude code experiments</h2> 1314 + <p>To see how well Claude Code could handle more complex tasks, I thought I'd give it a whirl on something more like its home territory, and somewhere where I was less familiar. I decided to ask it to write some code to make a nicer editor experience for the notebooks project. Since the <a href="https://github.com/jonludlam/jsoo-code-mirror">bindings to codemirror</a> I'm using are very minimal, any new features I want to use end up with needing to write a bunch of bindings first, and only then seeing if the feature works as I'd like. So instead I thought I'd get claude to write the editor code for me in javascript, and then I could make sure it works as I want and only then convert it to OCaml. This worked pretty nicely, and I've now got a neat <a href="https://jon.ludl.am/experiments/notebook-editor/notebook-editable.html">demonstration editor</a> that I can use to guide the OCaml implementation.</p> 1315 + <h2 id="more-notebook-work"><a href="#more-notebook-work" class="anchor"></a>More notebook work</h2> 1316 + <p>The oxcaml project will be launching this week hopefully. I've been looking at Luke's <a href="https://github.com/ocaml-flambda/flambda-backend/pull/3886">Parallelism tutorial</a> and have been thinking about how this will work with the online notebooks. The parallel library works via effects, and the oxcaml branch of js_of_ocaml doesn't support effects yet, and it might be a while before it does. 
However, the blog post is mainly talking about the intricacies of the type system work that's been done to ensure the parallel library is safe, and as such perhaps we can get a lot out of doing this online with just Merlin.</p> 1317 + <p>Some early experimentation showed that the parallel library breaks the worker on load, so we need to do something a bit more sophisticated than just 'not call exec', so I did some work to have a mode of worker that doesn't load the cmas, just the cmis for Merlin. This is almost there.</p> 1318 + <h2 id="odoc-work"><a href="#odoc-work" class="anchor"></a>Odoc work</h2> 1319 + <p>OCaml 5.4 is just around the corner, and there's some odoc work to be done before the release. One of the main new features that will impact odoc is the new <a href="https://tyconmismatch.com/papers/ml2024_labeled_tuples.pdf">labelled tuples</a> feature. Fortunately <a href="https://github.com/lukemaurer">Luke Maurer</a> has already done a lot of work to plumb this into odoc, so this will save us a lot of work - thanks, Luke! There are likely to be a few other bits and pieces to do, but hopefully not too much.</p>]]></content> 1320 + </entry> 1321 + <entry> 1322 + <id>https://jon.recoil.org/blog/2025/05/docs-progress.html</id> 1323 + <title>Progress in OCaml docs</title> 1324 + <published>2025-05-29T00:00:00Z</published> 1325 + <updated>2025-05-29T00:00:00Z</updated> 1326 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/docs-progress.html"/> 1327 + <summary>The docs build is progressing well, and we've hit 20,000 packages (20,038 to be precise). 
So at this point I thought it'd be useful to take a look through the various failures to see if there are any in...</summary> 1328 + <content type="html"><![CDATA[<h1 id="progress-in-ocaml-docs"><a href="#progress-in-ocaml-docs" class="anchor"></a>Progress in OCaml docs</h1> 1329 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-29</p></li></ul> 1330 + <p>The docs build is progressing well, and we've <i>just about</i> hit 20,000 packages (20,038 to be precise). So at this point I thought it'd be useful to take a look through the various failures to see if there are any insights to be gained.</p> 1331 + <p>Odoc requires a built package in order to generate the docs, so there are two steps that have to be done before we can begin building the docs. Step one is to figure out the exact set of packages to build - i.e. doing an opam solve - and step two is to actually build the packages. These two steps are, to some extent, out of docs-ci's control, and rely on the state of the opam repository. While there are efforts to keep this in as good a state as possible, it's still the case that these steps fail much more often than the actual docs build itself. Let's take a look at some of the failures we see in each of these steps.</p> 1332 + <h2 id="step-1:-opam-solve"><a href="#step-1:-opam-solve" class="anchor"></a>Step 1: opam solve</h2> 1333 + <p>There are 2,074 solver failures. A good chunk of these are due to the way docs-ci itself works: it starts with a specific version of OCaml. In order to do this, the solution must have a specific version of OCaml in it, and this is not always the case; for example, all of the <code>conf-*</code> packages fail in this way. This particular class of &quot;failures&quot; is not at all important, as mostly they don't contain useful documentation, but even if they do, if they're actually being used then they will be built as part of another solution. 
For example, while <code>conf-faad</code> fails with this error, the solution of the <code>faad</code> package pulls it in anyway, so we can build any docs that it includes. Roughly two thirds (685) of the reported failures are due to this, and by checking the resulting HTML docs we can see that we do get docs for 278 of these, so they must be pulled in by other solutions.</p> 1334 + <p>The remaining failures are &quot;real&quot; in the sense that we never currently get docs for these packages. In turn, these can be subcategorised. One class of failures happens with platform-specific packages, for example <code>camlkit</code>, which provides bindings to Cocoa frameworks and is only available on macOS, or <code>eio_windows</code>, which clearly requires Windows. The current docs-ci setup only builds on Linux, and extending this to other platforms will require a little more work, and is not currently scheduled. These are &quot;fixable&quot; failures.</p> 1335 + <p>The third class of failures are those that will &quot;just never work&quot;. For example, some early versions of <code>domainslib</code> were released before the OCaml 5.0 APIs were finalised, and so they can only work with alpha versions of OCaml 5.0. We won't be documenting these.</p> 1336 + <p>Finally there are some more 'unexplained' failures, such as <code>docteur.0.0.2</code>. This one's particularly interesting as the solve actually succeeds when using the stand-alone tool opam-0install, whereas it's failing in docs-ci, which uses opam-0install as a library! 
I'm currently suspicious about the 'deprecated' flag, as the failure log contains:</p> 1337 + <pre>- git-cohttp-unix -&gt; (problem) 1338 + No usable implementations: 1339 + git-cohttp-unix.3.6.0: Availability condition not satisfied 1340 + git-cohttp-unix.3.5.0: Availability condition not satisfied 1341 + git-cohttp-unix.3.4.0: Availability condition not satisfied 1342 + git-cohttp-unix.3.3.3: Availability condition not satisfied 1343 + git-cohttp-unix.3.3.2: Availability condition not satisfied 1344 + ...</pre> 1345 + <p>and that flag is the only thing I can immediately see that stands out in <code>git-cohttp-unix</code>. In contrast, the solution given by opam-0install contains <code>git-cohttp-unix.3.6.0</code> as a dependency. I suspect fixing this will cause quite a few more packages to succeed.</p> 1346 + <h2 id="step-2:-building-packages"><a href="#step-2:-building-packages" class="anchor"></a>Step 2: building packages</h2> 1347 + <p>The next step, once we've got the solutions, is to build the packages. This is using the new method I <a href="../04/ocaml-docs-ci-and-odoc-3.html" title="ocaml-docs-ci-and-odoc-3">previously wrote about</a>. There are about 1,000 packages that fail to build, and once again we can take a look and categorise some of these failures. There are a wider variety of failures here, and it's quite useful to cross-check with <em>opam health check</em> to see if it's known to be broken. Unfortunately OHC only builds the latest versions of everything, so we can't check in some cases. The interesting issues are where we're failing to build something that seems to work in OHC.</p> 1348 + <h3 id="llvm.18"><a href="#llvm.18" class="anchor"></a>llvm.18</h3> 1349 + <p>This is an interesting type of error, where the build fails because of a missing external dependency. The <code>llvm</code> package depends upon <code>conf-llvm-static.18</code>, which should be able to install the depext. 
Looking at the package, it does indeed have a depext for Debian:</p> 1350 + <pre>depexts: [ 1351 + [&quot;llvm@18&quot; &quot;zstd&quot;] {os-distribution = &quot;homebrew&quot; &amp; os = &quot;macos&quot;} 1352 + [&quot;llvm-18&quot;] {os-distribution = &quot;macports&quot; &amp; os = &quot;macos&quot;} 1353 + [&quot;llvm-18-dev&quot; &quot;zlib1g-dev&quot; &quot;libzstd-dev&quot;] {os-family = &quot;debian&quot;} 1354 + [&quot;llvm18-dev&quot;] {os-distribution = &quot;alpine&quot;} 1355 + [&quot;llvm18&quot;] {os-family = &quot;arch&quot;} 1356 + [&quot;llvm18-devel&quot;] {os-family = &quot;suse&quot; | os-family = &quot;opensuse&quot;} 1357 + [&quot;llvm18-devel&quot;] {os-distribution = &quot;fedora&quot; &amp; os-version &gt;= &quot;41&quot;} 1358 + [&quot;llvm-devel&quot;] {os-distribution = &quot;fedora&quot; &amp; os-version = &quot;40&quot;} 1359 + [&quot;llvm18-devel&quot; &quot;epel-release&quot;] {os-distribution = &quot;centos&quot;} 1360 + [&quot;devel/llvm18&quot;] {os = &quot;freebsd&quot;} 1361 + ]</pre> 1362 + <p>However, in Debian 12, they've already updated to <code>llvm-19</code>, so the depext is not available.</p> 1363 + <h3 id="camlimages.5.0.5"><a href="#camlimages.5.0.5" class="anchor"></a>camlimages.5.0.5</h3> 1364 + <p>This one fails due to a linking error. 
Oddly enough it does seem to work in OHC.</p> 1365 + <pre>(cd _build/default &amp;&amp; /home/opam/.opam/4.14/bin/ocamlmklib.opt -g -o freetype/camlimages_freetype_stubs freetype/ftintf.o -ldopt -lfreetype) 1366 + # /usr/bin/ld: freetype/ftintf.o: warning: relocation against `Caml_state' in read-only section `.text' 1367 + # /usr/bin/ld: freetype/ftintf.o: relocation R_X86_64_PC32 against undefined symbol `Caml_state' can not be used when making a shared object; recompile with -fPIC 1368 + # /usr/bin/ld: final link failed: bad value</pre> 1369 + <h3 id="ahrocksdb.0.2.2"><a href="#ahrocksdb.0.2.2" class="anchor"></a>ahrocksdb.0.2.2</h3> 1370 + <p>This one fails in OHC too, but it looks like it's a build failure with more recent gccs, fixed upstream: https://github.com/ahrefs/ocaml-ahrocksdb/commit/e52316b3d30fededac023141bf8b47da79cabfed</p> 1371 + <pre># run: gcc -O2 -fno-strict-aliasing -fwrapv -fPIC -pthread -I/usr/include/rocksdb -I /home/opam/.opam/5.3/lib/ocaml -o /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.exe /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.c -lm -lpthread -lrocksdb 1372 + # -&gt; process exited with code 1 1373 + # -&gt; stdout: 1374 + # -&gt; stderr: 1375 + # | In file included from /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.c:4: 1376 + # | /usr/include/rocksdb/version.h:7:10: fatal error: string: No such file or directory 1377 + # | 7 | #include &lt;string&gt; 1378 + # | | ^~~~~~~~ 1379 + # | compilation terminated. 1380 + # Error: discover error</pre> 1381 + <h3 id="alt-ergo.2.2.0"><a href="#alt-ergo.2.2.0" class="anchor"></a>alt-ergo.2.2.0</h3> 1382 + <p>Looks like it's trying to write outside the sandbox. 
The failure only occurs on alt-ergo 1.3.0 - 2.2.0.</p> 1383 + <pre># mkdir -p /home/opam/.opam/4.14/man/man1 1384 + # cp -f doc/alt-ergo.1 /home/opam/.opam/4.14/man/man1 1385 + # mkdir -p /usr/local/lib/alt-ergo/preludes 1386 + # mkdir: cannot create directory '/usr/local/lib/alt-ergo': Permission denied 1387 + # make: *** [Makefile.users:243: install-preludes] Error 1</pre> 1388 + <h3 id="ctypes-foreign.0.18.0"><a href="#ctypes-foreign.0.18.0" class="anchor"></a>ctypes-foreign.0.18.0</h3> 1389 + <p>This one is a much more interesting failure. The logs show:</p> 1390 + <pre>[ERROR] No solution for ctypes-foreign.0.18.0: * Missing dependency: 1391 + - ctypes-foreign -&gt; ctypes 1392 + unknown package</pre> 1393 + <p>which is happening because of the optimisation I <a href="../04/ocaml-docs-ci-and-odoc-3.html" title="ocaml-docs-ci-and-odoc-3">mentioned before</a> where we build a new <code>opam-repository</code> with only the packages we're going to need. In this case, we've somehow missed out the <code>ctypes</code> package. Looking at the opam file for <code>ctypes-foreign</code>, it has a <code>post</code> dependency on <code>ctypes</code>. The <code>post</code> keyword indicates that <code>ctypes</code> should be installed with <code>ctypes-foreign</code>, but that having it as a &quot;normal&quot; dependency would introduce a dependency cycle. Since we require a DAG of dependencies, we explicitly remove any <code>post</code> dependencies from the set of packages to build, but it seems that <code>opam</code> would like to know about it anyway!</p> 1394 + <h3 id="others"><a href="#others" class="anchor"></a>others</h3> 1395 + <p>There are many more. 
An automatic cross-check with OHC would be really useful, mainly to distinguish between the packages that are broken due to <code>ocaml-docs-ci</code> issues (like <code>ctypes-foreign</code>) and those that are broken for other reasons (like <code>ahrocksdb</code>).</p> 1396 + <h2 id="step-3:-building-docs"><a href="#step-3:-building-docs" class="anchor"></a>Step 3: building docs</h2> 1397 + <p>Finally, we have the actual docs build. This is where we run <code>odoc</code> and <code>odoc_driver</code> to produce the HTML docs. All the errors here are ones that we should be able to fix!</p> 1398 + <p>Firstly, there are the internal errors:</p> 1399 + <pre>Uncaught exception: Failure(&quot;\&quot;rm\&quot; \&quot;-rf\&quot; \&quot;/var/cache/obuilder/merged/582e973685d380d4c91eadc2611eee02c82c5fe4f8bd732e0080fb22bc4404cd\&quot; \&quot;/var/cache/obuilder/work/582e973685d380d4c91eadc2611eee02c82c5fe4f8bd732e0080fb22bc4404cd\&quot; failed with exit status 1&quot;) 1400 + 2025-05-22 09:30.18: Job failed: Failed: Internal error</pre> 1401 + <p>This is an <code>obuilder</code> error that needs fixing. Currently we're just rerunning the affected jobs to work around it.</p> 1402 + <h3 id="odoc.2.0.0"><a href="#odoc.2.0.0" class="anchor"></a>odoc.2.0.0</h3> 1403 + <p>Oops, we can't build our own docs! At least it's an old version :-)</p> 1404 + <pre>odoc: internal error, uncaught exception: 1405 + File &quot;src/html/link.ml&quot;, line 101, characters 16-22: Assertion failed 1406 + Raised at Odoc_html__Link.href in file &quot;src/html/link.ml&quot;, line 101, characters 16-57 1407 + Called from Odoc_html__Generator.internallink in file &quot;src/html/generator.ml&quot;, line 108, characters 19-49 1408 + ...</pre> 1409 + <p>The failure points <a href="https://github.com/ocaml/odoc/blob/42190737339d9be4510eeeb0e3c47e84badf4d73/src/html/link.ml#L101">here</a>, an assertion about the common ancestor of two paths. 
<a href="https://github.com/ocaml/odoc/issues/1345">Issue filed</a>.</p> 1410 + <h3 id="ocaml-base-compiler.4.07.0"><a href="#ocaml-base-compiler.4.07.0" class="anchor"></a>ocaml-base-compiler.4.07.0</h3> 1411 + <p>This one happens because of our &quot;optimisation&quot; to use a base image with OCaml pre-installed. What we <i>actually</i> do is find the major/minor version of OCaml and use the corresponding docker image - so in this case we'll use ocaml/opam:debian-12-ocaml-4.07. Now this image actually contains OCaml 4.07.1, and the format of <code>cmt</code> and <code>cmti</code> files changed between these releases, so we get a failure.</p> 1412 + <p>We'll fix this by getting rid of the optimisation and building from an empty switch.</p> 1413 + <h3 id="lascar.0.7.0"><a href="#lascar.0.7.0" class="anchor"></a>lascar.0.7.0</h3> 1414 + <p>This one is quite interesting. It's another assertion failure in odoc:</p> 1415 + <pre>odoc: internal error, uncaught exception: 1416 + File &quot;src/xref2/cpath.ml&quot;, line 364, characters 37-43: Assertion failed 1417 + Raised at Odoc_xref2__Cpath.unresolve_resolved_parent_path in file &quot;src/xref2/cpath.ml&quot;, line 364, characters 37-49 1418 + Called from Odoc_xref2__Cpath.unresolve_module_path in file &quot;src/xref2/cpath.ml&quot;, line 349, characters 28-60 1419 + Called from Odoc_xref2__Tools.fragmap.map_module_decl in file &quot;src/xref2/tools.ml&quot;, line 1792, characters 48-80</pre> 1420 + <p>It's happening when we 'unresolve' a previously resolved path. We end up having to do this when something about the path has changed, in this case while we're handling a <code>S with module Foo = Bar</code> or similar. 
Issue <a href="https://github.com/ocaml/odoc/issues/1346">filed</a>.</p> 1421 + <h3 id="camlp5"><a href="#camlp5" class="anchor"></a>camlp5</h3> 1422 + <p>This one actually occurs in <code>odoc_driver</code> rather than in <code>odoc</code> itself.</p> 1423 + <pre>odoc_driver_voodoo: [DEBUG] Found cmi_only_lib in dir: /home/opam/.opam/4.08/lib/camlp5 1424 + odoc_driver_voodoo: internal error, uncaught exception: 1425 + Invalid_argument(&quot;\&quot;/home/opam/.opam/4.08/lib/camlp5\&quot;: invalid segment&quot;) 1426 + </pre> 1427 + <p>Here we're trying to add a segment to a path, but rather than a single path segment we've got an entire fully qualified path. Issue <a href="https://github.com/ocaml/odoc/issues/1347">filed</a>.</p> 1428 + <h2 id="conclusion"><a href="#conclusion" class="anchor"></a>Conclusion</h2> 1429 + <p>It's pretty good that we've only got 4 types of error happening at the doc-generation phase. However, as a whole, any error that occurs earlier in the pipeline ends up with a missing documentation tab on the website, and we need to do a bit more so that the actual problem can be tracked down and fixed. This is obviously a more general problem than just the docs, and one that <a href="https://check.ci.ocaml.org">opam health check</a> seeks to highlight. However, the current incarnation of OHC is significantly less efficient than docs-ci, so generalising the approach we've taken with <a href="https://github.com/jonludlam/opamh">opamh</a> should really help with making this more responsive.</p> 1430 + <p>In addition, a number of the issues seen here could be addressed with a tool my colleague <a href="https://ryan.freumh.org/">Ryan</a> is working on: <a href="https://ryan.freumh.org/enki.html">Enki</a>. This tool would allow us to run a solve that actually determines not only the set of packages we wish to install, but the platform to install onto - e.g. 
for <code>eio_windows</code> the solution would be to install on Windows, and for <code>llvm.18-static</code> the solution might be Fedora 40.</p>]]></content> 1431 + </entry> 1432 + <entry> 1433 + <id>https://jon.recoil.org/blog/2025/05/lots-of-things.html</id> 1434 + <title>Lots of things have been happening</title> 1435 + <published>2025-05-20T00:00:00Z</published> 1436 + <updated>2025-05-20T00:00:00Z</updated> 1437 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/lots-of-things.html"/> 1438 + <summary>I've been working on a whole lot of things recently in many different areas, making what's felt like only a bit of progress in each. Consequently I've not felt like I had anything substantial to say, s...</summary> 1439 + <content type="html"><![CDATA[<h1 id="lots-of-things-have-been-happening"><a href="#lots-of-things-have-been-happening" class="anchor"></a>Lots of things have been happening</h1> 1440 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-20</p></li></ul> 1441 + <p>I've been working on a whole lot of things recently in many different areas, making what's felt like only a bit of progress in each. Consequently I've not felt like I had anything substantial to say, so I haven't written up anything for a while.</p> 1442 + <p>Time for a little summary of things then!</p> 1443 + <h2 id="ocaml-docs-ci"><a href="#ocaml-docs-ci" class="anchor"></a>Ocaml-docs-ci</h2> 1444 + <p>I've been working with <a href="https://tunbury.org/">Mark Elvers</a> on getting the docs CI running using Odoc 3.0. 
There are quite a few changes involved, both in how we're <a href="../04/ocaml-docs-ci-and-odoc-3.html" title="ocaml-docs-ci-and-odoc-3">building the packages</a> and in how we're running odoc - it's building using <code>odoc_driver</code> rather than <code>voodoo</code> now, and while it's looking promising now, we hit a few hurdles along the way.</p> 1445 + <p>We set the CI going last weekend but discovered that it was having some issues building packages using OCaml 5.3.0. The way the builds work is that we first do a &quot;solve&quot; step for each version of every package so we've got exact versions of all of the packages required to build them. We then look through that solution to figure out the version of OCaml required, and then (to avoid a little bit of work) we start from one of the <a href="https://hub.docker.com/r/ocaml/opam">opam docker images</a> for that version of OCaml.</p> 1446 + <p>When installing a package using opam it does a few operations that scale with the size of the opam repository, which ends up adding around ten seconds to the build time. When we're building 50,000 packages, this adds up to quite a lot of time, so we short-cut this process with the simple expedient of creating an opam-repository that only contains the packages we need for the build. However, since we've already got a few packages installed in the image, we need to make sure our repository contains these packages too, otherwise opam gets thoroughly confused. My mistake was that we were missing out the <code>ocaml-compiler</code> package, which is new in OCaml 5.3.0, and this led to the builds failing. After adding this in and kicking off the build again, it has now got a lot further - at time of writing it has built 14,000 packages, there are 6,000 still building, and 1,000 that have failed. 
If it continues in a similar fashion, this will compare quite favourably with the docs CI that's currently powering ocaml.org, where it has successfully built 17,000 packages, and 4,500 have failed.</p> 1447 + <p>Mark has been working on a different approach to the build process, which is to come up with a new binary that doesn't do any of the <code>O(n)</code> operations and just builds the package! This is definitely a promising direction, and I'm hoping to take a look at <a href="https://github.com/mtelvers/ohc">his prototype</a> soon.</p> 1448 + <p>Meanwhile, <a href="https://choum.net">panglesd</a> is working on integrating this into the ocaml.org site, and is making good progress. He spotted last week that we were overwriting the <code>status.json</code> file that comes out of <code>odoc_driver</code>, which we will use to power the redirections on ocaml.org. One of the changes of odoc 3.0 is that we carefully put modules into a directory structure that represents the library in which they are found. It's long been a pain that OCaml libraries (what Ocamlfind unhelpfully calls 'packages') do not always have the same name as the opam package in which they're found. For example, the package <code>ocamlfind</code> contains the library <code>findlib</code>. So to help the user figure out where to find the module, we're putting it into the URL of the docs, and therefore into the breadcrumbs. The downside is that the modules are now in a different place on the website to where they were before, so the <code>status.json</code> file is there to help with the redirections.</p> 1449 + <h2 id="notebooks"><a href="#notebooks" class="anchor"></a>Notebooks</h2> 1450 + <p>I've been working on Merlin integration with the notebooks, which has been a fun little project. 
The bits that needed improving most were that Merlin didn't work with toplevel-style code, and that each cell was a separate typing context, so while you could define a function in one cell and execute it in another, Merlin would tell you the function was undefined.</p> 1451 + <p>For the toplevel-style code, what I've ended up doing is to essentially strip out all of the toplevel bits and pieces, and replace them with whitespace. So where I have a cell that looks like:</p> 1452 + <pre># let x = 1 + 2;; 1453 + - val x : int = 3 1454 + # let y = x + 1;; 1455 + - val y : int = 4</pre> 1456 + <p>I tell Merlin that the contents are:</p> 1457 + <pre> let x = 1 + 2;; 1458 + 1459 + let y = x + 1;; 1460 + </pre> 1461 + <p>where I'm careful to maintain the position of the original code. This bit is working quite nicely, but only when the code is syntactically correct, as I'm using the standard toplevel parser to figure out where the expression ends. I think I'm going to end up needing to write a custom parser for this: something that will end on a <code>;;</code> but ignore them in string constants, comments and so on.</p> 1462 + <p>The approach I've taken for the second problem is to treat each cell as a separate module. I then write out a <code>cmi</code> file into the virtual filesystem as <code>cell__id_0.cmi</code> and <code>open</code> all the previous modules in 'line 0' of every cell. I then remap all of the reported locations by removing 'line 0'.</p> 1463 + <p>There are a number of issues with the current approaches: 1. The stripping of the toplevel bits is a little fragile, and currently only works when the toplevel is syntactically correct. This is fairly fixable. 2. When the contents of the cells change we need to flush any caches Merlin and the compiler have. 3. An <code>open</code> statement in one cell does <em>not</em> cause the module to be available in the next cell. 4. 
A lot of cells leads to a lot of opens!</p> 1464 + <p>I suspect that the 'cells as modules' approach might end up being a bit of a dead-end, so I'll have a chat with <a href="https://github.com/voodoos">Ulysse</a> to figure out the next experiment.</p> 1465 + <h2 id="oxcaml"><a href="#oxcaml" class="anchor"></a>Oxcaml</h2> 1466 + <p>I've been working on trying out oxcaml too, which has been a bit challenging. Firstly, although Jane Street provide a version of <code>js_of_ocaml</code>, the toplevel didn't work. Fortunately, my amazing colleagues <a href="https://patrick.sirref.org/">Patrick O'Ferris</a> and <a href="https://github.com/art-w">Arthur Wendling</a> spent a good chunk of time fixing this and provided an <a href="https://github.com/patricoferris/opam-repository-js#with-extensions">opam repository</a> with the relevant changes, without which I would not have been able to make any progress. Thanks, guys! So my goal of making my notebooks work with it looked doable, but I almost immediately hit more dependency issues that make it problematic to port the whole site over - including odoc and various PPXes that I use.</p> 1467 + <p>I've therefore decided that I would bring forward a feature that I'd had in mind for a while - that we could have different &quot;backends&quot; for the notebooks. So I'd still build the frontend using &quot;normal&quot; OCaml, but the web-worker serving as the toplevel would be an oxcaml one.</p> 1468 + <p>Of course, it didn't work first time! After a bit of head-scratching, it turned out that the interface between the worker and the main thread, although I'd <i>almost</i> got it ocaml-agnostic, wasn't quite right. The way it works is that it uses the jsonrpc protocol to communicate, and while it had marshalled the requests into a string, it hadn't turned that final OCaml string into a JavaScript string, so it was sending the js_of_ocaml representation of a string as an object, rather than a simple string. 
When the frontend and workers were built with different compilers, this ended up just failing with an obscure error, which took a good deal of time to track down. Once that was fixed, it was just a case of making sure I could have 2 independent 'switches' on my site - one for oxcaml and one for standard OCaml.</p> 1469 + <p>The upshot of all this is that I now have a semi-working version of the notebooks using oxcaml. As an initial demonstration I ported one of my colleague <a href="https://github.com/cuihtlauac">Cuihtlauac</a>'s oxcaml tutorial docs to the notebook format, and it <a href="../../../notebooks/oxcaml/local.html" title="local">works quite nicely</a>.</p>]]></content> 1470 + </entry> 1471 + <entry> 1472 + <id>https://jon.recoil.org/blog/2025/05/ticks-solved-by-ai.html</id> 1473 + <title>Solving First-year OCaml exercises with AI</title> 1474 + <published>2025-05-07T00:00:00Z</published> 1475 + <updated>2025-05-07T00:00:00Z</updated> 1476 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/ticks-solved-by-ai.html"/> 1477 + <summary>My colleague and I have been working on a little project to see how well small AI models can solve the OCaml exercises we give to our first-year students at the University of Cambridge. Sadiq has don...</summary> 1478 + <content type="html"><![CDATA[<h1 id="solving-first-year-ocaml-exercises-with-ai"><a href="#solving-first-year-ocaml-exercises-with-ai" class="anchor"></a>Solving First-year OCaml exercises with AI</h1> 1479 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-07</p></li></ul> 1480 + <p>My colleague <a href="https://toao.com">Sadiq Jaffer</a> and I have been working on a little project to see how well small AI models can solve the OCaml exercises we give to our first-year students at the University of Cambridge. 
Sadiq has done an excellent <a href="https://toao.com/blog/ocaml-local-code-models">write up</a> of our initial results, which you should all go and read! The tl;dr though, as Sadiq writes, is that even some of the smaller models would score top marks on these exercises!</p> 1481 + <p>One interesting aspect we discovered quite quickly is that we had to make the testing feedback a little more generous than just &quot;exception raised&quot;! The problems are presented as a Jupyter notebook using <a href="https://github.com/akabe">akabe's</a> excellent OCaml kernel, with <a href="https://nbgrader.readthedocs.io/en/stable/">nbgrader</a> to do the assessment. Our students can see the tests that are run, and if they fail they're able to copy the test cell out and play with their code to figure out exactly what went wrong. The AI models, however, have a far less interactive experience, and get just 3 chances to write code that passes the tests. We found that the performance of the models increased hugely when we adjusted the test cells such that they clearly indicated which test failed, the results that were expected, and the results the code actually produced.</p> 1482 + <p>Of course, we <a href="https://anil.recoil.org/notes/claude-copilot-sandbox">already knew</a> that AI models can code OCaml very well, and we (along with the rest of the teaching world) are still ruminating on the implications of this from a pedagogical perspective. Our plan, though, is to try and make the 'problem' worse by training these models on more OCaml code, and see just how well we can get them to perform! 
It's pretty amazing, and a little startling to know that a model that'll run pretty comfortably on my laptop can solve these problems so well even without extra training, though given how hot it gets, I'd rather not have the laptop on my actual lap while it's doing so!</p>]]></content> 1483 + </entry> 1484 + <entry> 1485 + <id>https://jon.recoil.org/blog/2025/05/oxcaml-gets-closer.html</id> 1486 + <title>OxCaml is getting closer...</title> 1487 + <published>2025-05-02T00:00:00Z</published> 1488 + <updated>2025-05-02T00:00:00Z</updated> 1489 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/oxcaml-gets-closer.html"/> 1490 + <summary>I joined the OxCaml weekly meeting representing Tarides for the first time this week, as Jane Street gear up to an official release of their OxCaml compiler.</summary> 1491 + <content type="html"><![CDATA[<h1 id="oxcaml-is-getting-closer..."><a href="#oxcaml-is-getting-closer..." class="anchor"></a>OxCaml is getting closer...</h1> 1492 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-02</p></li></ul> 1493 + <p>I joined the OxCaml weekly meeting representing Tarides for the first time this week, as Jane Street gear up to an official release of their OxCaml compiler.</p> 1494 + <p>It seems that mainly what needs to be done before the release can be made is to ensure there is some reasonable documentation for the new features, and that a reasonable number of packages are working, so people are furiously writing and bugfixing to try and get this ready.</p> 1495 + <p>As well as this though, there are some challenges of a more organisational level that will need to be addressed to ensure the success of the project. Jane Street have long had a public branch of their compiler, but while they've had patches internally to ensure the tooling and other libraries work, these patches haven't previously been made public in a usable way. 
In order for OxCaml to be useful, it will clearly need these patches not only to be available, but also to be maintained and to easily allow contributions from the community -- in short, they need to be properly Open Source!</p> 1496 + <p>Personally, I'm looking forward to seeing their branch of <a href="https://ocaml.github.io/odoc/">odoc</a> and having a look to see how the modes will fit into the documentation. I'm also keen to see whether the <a href="../04/this-site.html" title="this-site">notebook features</a> I've been working on can be ported over to run on OxCaml!</p>]]></content> 1497 + </entry> 1498 + <entry> 1499 + <id>https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html</id> 1500 + <title>AI for Climate &amp; Nature Community Day</title> 1501 + <published>2025-05-01T00:00:00Z</published> 1502 + <updated>2025-05-01T00:00:00Z</updated> 1503 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html"/> 1504 + <summary>Today was the &quot;AI for Climate &amp; Nature Community Day&quot; at the David Attenborough Building. 
A whole bunch of the EEG were either presenting or contributing in some way so I thought I'd come along to see what's going on.</summary> 1505 + <content type="html"><![CDATA[<h1 id="ai-for-climate-&amp;-nature-community-day"><a href="#ai-for-climate-&amp;-nature-community-day" class="anchor"></a>AI for Climate &amp; Nature Community Day</h1> 1506 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-05-01</p></li></ul> 1507 + <div><a href="melissa.jpg" class="img-link"><img src="melissa.jpg" alt="Melissa Leach"/></a></div> 1508 + <p><i>Melissa Leach introducing the day</i> Today was the &quot;AI for Climate &amp; Nature Community Day&quot; at the <a href="https://map.cam.ac.uk/?maplon=0.12032&amp;maplat=52.20354&amp;mapzoom=18&amp;maplayers=Building+Labels%2CExternal+Sites%2CColleges%2CUniversity+Sites%2CBuildings%2CTransport&amp;mapfeature=mfid257%2CBuildings">David Attenborough Building</a>. A whole bunch of the EEG were either presenting or contributing in some way so I thought I'd come along to see what's going on.</p> 1509 + <h2 id="keynote-and-main-talks"><a href="#keynote-and-main-talks" class="anchor"></a>Keynote and main talks</h2> 1510 + <p>Following the intro talks from Professors <a href="https://www.cambridgeconservation.org/about/people/prof-melissa-leach/">Melissa Leach</a> and <a href="https://www.zoo.cam.ac.uk/directory/bill-sutherland">Bill Sutherland</a>, the day started with the keynote talk from <a href="https://www.biology.ox.ac.uk/people/amy-hinsley">Amy Hinsley</a>, who, using the specific case of animal trafficking, talked about the need to make AI in conservation equitable, explainable and useful.</p> 1511 + <div><a href="amy.jpg" class="img-link"><img src="amy.jpg" alt="Amy Hinsley"/></a></div> 1512 + <p><i>Amy Hinsley delivering the keynote talk</i></p> 1513 + <p>We then moved into the first session with <a href="https://www.geog.cam.ac.uk/people/lines/">Emily Lines</a> from the Geography Department 
who talked about the challenges of processing sensor data in the context of forests. Her group has a variety of data from forests across Europe, collected using many different methods, from drones taking pictures of the canopies to ground-based laser scanners producing 3D point clouds. The challenge is then not only to identify individual trees, which is pretty tricky, but also to then distinguish between the leaves of the trees and the wood.</p> 1514 + <p>After Emily came <a href="https://ai.cam.ac.uk/people/robert-rouse.html">Robert Rouse</a> from the <a href="https://iccs.cam.ac.uk">ICCS</a>, who's using a small neural net and genetic algorithms to extend a study from the RSPB on figuring out an optimal way to do some land use adjustments to cut carbon and improve outcomes for birds, whilst not significantly impacting the ability to produce food.</p> 1515 + <p>We then had <a href="https://www.zoo.cam.ac.uk/directory/dr-sam-reynolds">Sam Reynolds</a> and <a href="https://toao.com">Sadiq Jaffer</a> who talked about their project: using AI to sift through millions of papers looking for those relevant to a specified conservation topic. They're able to directly compare their results with results obtained by manually doing this process, a project that's been going on over the last 20 or so years summing to something like 75 man years of effort. 
In the end they only missed a few papers that the manual process had found, but actually found many relevant papers that had been missed - and all in only a few days of compute.</p> 1516 + <div><a href="sadiq.jpg" class="img-link"><img src="sadiq.jpg" alt="Sam Reynolds and Sadiq Jaffer"/></a></div> 1517 + <p><i>Sam Reynolds and Sadiq Jaffer sorting millions of papers</i></p> 1518 + <h2 id="lightning-talks"><a href="#lightning-talks" class="anchor"></a>Lightning talks</h2> 1519 + <p>We then had a number of 'lightning talks', with each presenter having only three minutes to talk about their work.</p> 1520 + <ul><li><a href="https://www.maths.cam.ac.uk/person/ss3299">Sebastian Schemm</a> presented his work on creating a foundational model for the climate</li><li><a href="https://www.eng.cam.ac.uk/profiles/ac685">Alice Cicirello</a> talked about the prospects of applying machine learning to <a href="https://en.wikipedia.org/wiki/Marine_cloud_brightening">Marine Cloud Brightening</a></li><li><a href="https://www.maths.cam.ac.uk/person/sdat2">Simon Thomas</a> has been looking at analysing the heights of tropical cyclone storm surges</li><li><a href="https://github.com/niccolozanotti">Niccolò Zanotti</a> gave us an introduction to <a href="https://github.com/cambridge-ICCS/FTorch">FTorch</a>, a library to integrate the worlds of PyTorch and Fortran</li><li><a href="https://www.nceo.ac.uk/contact-us/people/dr-simon-driscoll/">Simon Driscoll</a> then talked about melt ponds on arctic sea ice, a poorly understood but important component of the climate in the Arctic.</li><li><a href="https://www.zoo.cam.ac.uk/directory/emilio-luz-ricca">Emilio Luz-Ricca</a> talked about his project to apply machine learning to predict hunting pressure</li><li><a href="https://orlando-code.github.io">Orlando Timmerman</a> gave us some insights into how he's been using machine learning to predict the future of coral reefs, and how we might use this to help with their conservation.</li><li><a 
href="https://www.zoo.cam.ac.uk/directory/ruari-marshall-hawkes">Ruari Marshall-Hawkes</a> showed us how to listen very carefully to figure out population numbers,</li><li><a href="https://www.linkedin.com/in/harriet-branson-a93a8313b/">Hattie Branson</a> from <a href="https://www.fauna-flora.org">Fauna &amp; Flora</a> talked about habitat detection in South Sudan,</li><li><a href="https://www.linkedin.com/in/martakoch/">Marta Koch</a> showed us an analysis of how well ChatGPT, Claude and the like would perform at setting the agendas for SDPs,</li><li><a href="https://www.linkedin.com/in/zhengpeng-feng-2410a132a/">Frank Feng</a> finished the session with a talk on the <a href="https://www.cst.cam.ac.uk/seminars/list/227335">Barlow Twins Earth Foundation Model</a>.</li></ul> 1521 + 1522 + <div style="display: grid; grid-template-columns: 1fr 1fr; gap: 20px;"> 1523 + <figure style="margin:0; width: 100%;"> 1524 + <img src="sebastian.jpg" alt="Sebastian Schemm" style="max-width: 100%; height: auto;"> 1525 + <figcaption>Sebastian Schemm</figcaption> 1526 + </figure> 1527 + <figure style="margin:0; width: 100%;"> 1528 + <img src="alice.jpg" alt="Alice Cicirello" style="max-width: 100%; height: auto;"> 1529 + <figcaption>Alice Cicirello</figcaption> 1530 + </figure> 1531 + <figure style="margin:0; width: 100%;"> 1532 + <img src="simon.jpg" alt="Simon Thomas" style="max-width: 100%; height: auto;"> 1533 + <figcaption>Simon Thomas</figcaption> 1534 + </figure> 1535 + <figure style="margin:0; width: 100%;"> 1536 + <img src="simond.jpg" alt="Simon Driscoll" style="max-width: 100%; height: auto;"> 1537 + <figcaption>Simon Driscoll</figcaption> 1538 + </figure> 1539 + <figure style="margin:0; width: 100%;"> 1540 + <img src="emilio.jpg" alt="Emilio Luz-Ricca" style="max-width: 100%; height: auto;"> 1541 + <figcaption>Emilio Luz-Ricca</figcaption> 1542 + </figure> 1543 + <figure style="margin:0; width: 100%;"> 1544 + <img src="orlando.jpg" alt="Orlando Timmerman" 
style="max-width: 100%; height: auto;"> 1545 + <figcaption>Orlando Timmerman</figcaption> 1546 + </figure> 1547 + <figure style="margin:0; width: 100%;"> 1548 + <img src="ruari.jpg" alt="Ruari Marshall-Hawkes" style="max-width: 100%; height: auto;"> 1549 + <figcaption>Ruari Marshall-Hawkes</figcaption> 1550 + </figure> 1551 + <figure style="margin:0; width: 100%;"> 1552 + <img src="hattie.jpg" alt="Hattie Branson" style="max-width: 100%; height: auto;"> 1553 + <figcaption>Hattie Branson</figcaption> 1554 + </figure> 1555 + <figure style="margin:0; width: 100%;"> 1556 + <img src="frank.jpg" alt="Frank Feng" style="max-width: 100%; height: auto;"> 1557 + <figcaption>Frank Feng</figcaption> 1558 + </figure> 1559 + 1560 + </div> 1561 + 1562 + <h2 id="discussions"><a href="#discussions" class="anchor"></a>Discussions</h2> 1563 + <p>We then split up into three discussion groups; one on the future of this work, one on how to continue building this community of researchers, and the last on applying AI to real-world problems. As a newcomer to the field I was interested in the direction it's heading in, so I joined the session led by <a href="https://dorchard.github.io">Dominic Orchard</a> on the future of AI.</p> 1564 + <p>We had a fascinating discussion on both the immediate things we can do and longer term worries. We were imagining a world where AI becomes 'just a tool' that we can apply without needing to be experts in it, but right now we're in a much more tightly coupled collaborative world where we need experts in AI to complement the experts in the application field to make progress. This comes with challenges - applying for funding for multidisciplinary work is not the norm, so we spent some time discussing this too.</p> 1565 + <p>One of our group spoke about statistics now being 'just a tool', but it's been one that we've worked with for a long time now and we know where the sharp corners are. 
We have protocols for applying statistical tools and we have diagnostic plots to tell us whether the results are trustworthy. Not only do we not have these for AI models, but it's not yet clear whether such a thing will even be possible.</p> 1566 + <p>Overall it was a fascinating day, and I'm very much looking forward to following the work of these outstanding researchers, and maybe even contributing to their work in some way in the future.</p> 1567 + <p><i>Thanks to <a href="https://anil.recoil.org">Anil Madhavapeddy</a> for the photos of the day.</i></p>]]></content> 1568 + </entry> 1569 + <entry> 1570 + <id>https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html</id> 1571 + <title>OCaml-Docs-CI and Odoc 3</title> 1572 + <published>2025-04-29T00:00:00Z</published> 1573 + <updated>2025-04-29T00:00:00Z</updated> 1574 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html"/> 1575 + <summary>The release of Odoc 3 means that we need to update the docs-ci project so that the documentation that appears on ocaml.org is using the latest, greatest Odoc. With this major release of Odoc, it's also time to give t...</summary> 1576 + <content type="html"><![CDATA[<h1 id="ocaml-docs-ci-and-odoc-3"><a href="#ocaml-docs-ci-and-odoc-3" class="anchor"></a>OCaml-Docs-CI and Odoc 3</h1> 1577 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-29</p></li></ul> 1578 + <p>The release of Odoc 3 means that we need to update the <em>docs-ci</em> project so that the documentation that appears on <em>ocaml.org</em> is using the latest, greatest Odoc. 
With this major release of Odoc, it's also time to give the CI pipeline a bit of an overhaul too, and fix some of the irritations that it causes.</p> 1579 + <h2 id="the-challenge-of-documenting-ocaml"><a href="#the-challenge-of-documenting-ocaml" class="anchor"></a>The challenge of documenting OCaml</h2> 1580 + <p>As I wrote about <a href="semantic-versioning-is-hard.html" title="semantic-versioning-is-hard">recently</a>, the APIs of OCaml libraries are dependent not only on the version of its package, but possibly also on the versions of any of its dependencies. Due to this fact, to produce the docs for ocaml.org means that sometimes we need to build the docs for a particular version of a particular package multiple times with different versions of its dependencies.</p> 1581 + <p>It's clearly impractical to try to build every possible combination, so what we do is to run the opam solver once for each version of each package. This gives us a set of packages at particular versions. We then take that, and for each package in the set, we pluck out <i>its</i> dependencies from the set, producing a &quot;universe&quot; of dependencies for every package in the set. 
Let's look at a very simple example; the package <code>cry</code> from the <a href="https://www.liquidsoap.info">LiquidSoap</a> project.</p> 1582 + <p>The oldest version of <code>cry</code> from before the <a href="https://discuss.ocaml.org/t/opam-repository-archival-phase-1-unavailable-packages/15797/6">Great Purge</a> was 0.2.2, which when solved produced the following dependencies:</p> 1583 + <pre>cry.0.2.2 1584 + ocaml.4.05.0 1585 + ocaml-base-compiler.4.05.0 1586 + ocaml-config.1 1587 + ocamlfind.1.9.6</pre> 1588 + <p>and the oldest version of <code>cry</code> after the purge is 0.6.0 which produces the following solution:</p> 1589 + <pre>cry.0.6.0 1590 + ocaml.5.2.1 1591 + ocaml-base-compiler.5.2.1 1592 + ocaml-config.3 1593 + ocamlfind.1.9.6</pre> 1594 + <p>so we can see from these two solutions that we'll need to build <code>ocamlfind.1.9.6</code> twice, once with <code>ocaml.4.05.0</code> and once with <code>ocaml.5.2.1</code>.</p> 1595 + <p>Once we've got, for every version of every package, a set of dependency universes, we choose one of these to be the one presented to the user under the <code>ocaml.org/p/</code> hierarchy. For all of the other universes, we build the package against them, and put the docs under the <code>ocaml.org/u/</code> hierarchy.</p> 1596 + <h2 id="performing-the-builds"><a href="#performing-the-builds" class="anchor"></a>Performing the builds</h2> 1597 + <p>Once we've got a complete set of solutions and builds to do, the current CI pipeline batches the builds up to try and build as many packages as possible in as few builds as possible. 
While this works well enough, it does mean that we build a lot of packages more than once - dune, for example, is built thousands of times during this process, producing exactly the same binaries each time.</p> 1598 + <p>In the new pipeline, I wrote a <a href="https://github.com/jonludlam/opamh">little tool</a> that allows opam packages to be archived and restored, which happens to work nicely because we're always building the packages in the same container in the same location. This means there are no worries about relocatability, although that is something that is <a href="https://www.dra27.uk/blog/platform/2025/04/22/branching-out.html">nearly here!</a></p> 1599 + <p>The downside to this is that our storage requirements are quite a bit larger, as we're storing the entire package rather than just the bits that odoc needs. However, we were always going to use more storage than before simply because the new <code>odoc</code> and <code>odoc_driver</code> pair are more capable, and the new features like <a href="https://github.com/ocaml/odoc/pull/909">source code rendering</a> and <a href="https://github.com/ocaml/odoc/pull/1121/files#diff-10c8829023814c0bcc3316f95f643623404c000b13c68849ef3d61097a6e03a6R1-R415">classify</a> require more files from the original package.</p> 1600 + <p>The upshot is that I'll be working with <a href="https://www.tunbury.org/">Mark Elvers</a> to move the docs CI from its current machine to a shiny new <a href="https://www.tunbury.org/blade-reallocation/">blade server</a>.</p>]]></content> 1601 + </entry> 1602 + <entry> 1603 + <id>https://jon.recoil.org/blog/2025/04/odoc-3.html</id> 1604 + <title>Odoc 3: So what?</title> 1605 + <published>2025-04-25T00:00:00Z</published> 1606 + <updated>2025-04-25T00:00:00Z</updated> 1607 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/odoc-3.html"/> 1608 + <summary>Odoc 3 was released last month and although we did write a list of the new features, I don't think we've made it clear enough why anyone should 
care.</summary> 1609 + <content type="html"><![CDATA[<h1 id="odoc-3:-so-what?"><a href="#odoc-3:-so-what?" class="anchor"></a>Odoc 3: So what?</h1> 1610 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-25</p></li></ul> 1611 + <p>Odoc 3 was <a href="https://discuss.ocaml.org/t/ann-odoc-3-0-released/16339">released last month</a> and although we did write a list of the new features, I don't think we've made it clear enough why anyone should care.</p> 1612 + <p>It's <b>manuals</b>, the theme of Odoc 3 is <b>manuals</b>. It's got a load of features to make it much better for writing <code>mld</code> pages (files written using odoc's markup) to document your packages and their relationship to the surrounding ecosystem. Previous versions of Odoc were very library-centric, in that while we did have mld-file support, most of the effort went into making sure that we were generating correct per-module pages, which show the shape of your API even if you've not put in any doc comments at all. We've still got that, obviously, but we've added many features to make writing <code>mld</code> pages far more useful, and we're really hoping that these will draw people in to make documenting packages a much more enjoyable experience.</p> 1613 + <h2 id="odoc's-special-skill:-links!"><a href="#odoc's-special-skill:-links!" class="anchor"></a>Odoc's special skill: links!</h2> 1614 + <p>But why might you want to use Odoc at all for your package's manuals, rather than, say, markdown, asciidoc, rst or any other similar language? The biggest thing that Odoc brings, and has always brought, is <b>reliable linking</b>. Just write <code>{!Module.func}</code> and Odoc will check that the target exists and ensure that the link goes to the correct place, no matter how complex the definition of <code>Module</code> is or what the layout of the docs. 
We can link to almost all elements of an OCaml library, from modules and types through to fields of records, exceptions and extensions, and we have facilities for disambiguating, so if you happen to have both a module <code>S</code> and a module type <code>S</code> you can easily link to whichever you please.</p> 1615 + <p>In Odoc 2 though, these links were pretty limited - the only ones possible were those to docs and API elements (modules, types, values, etc) in your own package, or to API elements in any libraries that your package depends on. When writing API docs, which tend to be at the level of types and functions, this wasn't a huge problem, but when considering manuals this turned out to be a really limiting constraint. For example, in Odoc's own docs, we really want to have a link to <code>odoc-driver</code>, but since <code>odoc-driver</code> is a separate package and depends upon <code>odoc</code>, the only way to do that in Odoc 2.x would be to use an HTML link. With Odoc 3, this constraint is gone, so you can <b>link to any other package or library</b>. The link to <code>odoc-driver</code> would look like <code>{!/odoc-driver/page-index}</code>, as can be seen in <a href="https://github.com/ocaml/odoc/blob/master/doc/driver.mld#L10">odoc's source</a>. The only requirement is that you must be able to simultaneously install all of the packages you'd like to link to, so you can't easily link to, for example, different versions of the same package.</p> 1616 + <p>This will be particularly useful for any project that's grouped into multiple packages. For example, the <a href="https://mirage.io">Mirage project</a>. The main package there -- <code>mirage</code> -- is actually right at the bottom of the dependency hierarchy, but it's the perfect place to have docs that link to all of the other Mirage packages. 
On a smaller scale, the <a href="https://github.com/ocaml-multicore/picos">Picos project</a> consists of multiple packages all from a single git repository, and this would allow the docs pages from the <code>picos</code> package to link to any of the other packages.</p> 1617 + <p>Of course there are also a lot of other new features in this release, which are called out in the <a href="https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043">announcement post on discuss</a>, some of which I may post about in the future.</p> 1618 + <h2 id="can-i-use-it-now?"><a href="#can-i-use-it-now?" class="anchor"></a>Can I use it now?</h2> 1619 + <p>Of course! These new features can be used right now, so long as you're happy to self-host the docs. All that's needed is to create a switch containing all the packages you're interested in together, and use <code>odoc_driver</code> to generate the HTML and push them to your web server. At time of writing though, ocaml.org is still using Odoc 2.4, so any packages that are published to opam that choose to use these new features will be missing the new features. Furthermore, it's actually quite a challenge to do this, since we'll have to extend the package-universe solutions to include all relevant packages, for which we need extra fields in the opam metadata.</p> 1620 + <h2 id="what's-next?"><a href="#what's-next?" class="anchor"></a>What's next?</h2> 1621 + <p>We're actively working on getting Odoc 3 into the pipeline generating the docs found in https://ocaml.org/p/. This will bring with it some of the developments that landed in Odoc 2, but didn't make it onto ocaml.org - for example, the rendering of source pages. 
Not only are there challenges related to the package-universe solutions as mentioned above, but the storage requirements are considerably larger, so I'll be working with <a href="https://tunbury.org/">Mark Elvers</a> to complete this project.</p> 1622 + <p>We've also got work to do to update the build rules in dune to take advantage of these features. While <code>odoc_driver</code> works very well as part of the process of deploying a docs site, it's quite impractical as a tool to help while you're actually writing the docs. For that, we'll need to make sure <code>dune</code> understands how to use these new features. Fortunately we've had some experience with those rules in the past, and part of the work that's gone into Odoc 3 was to ensure that incremental build rules should be far more straightforward to write than for Odoc 2. In addition, some of the logic that previously only existed in <a href="https://github.com/ocaml-doc/voodoo">Voodoo</a> - the old driver that builds docs for ocaml.org - has been integrated into <code>odoc</code> itself, meaning once again that getting dune to produce correct docs for non-dune packages (e.g. the standard library!) should be simpler.</p> 1623 + <p>After we've done these, there are plans afoot to make more improvements to the manual writing experience. <a href="https://choum.net/">@panglesd</a> has been investigating how to add admonitions to the language, I've been thinking about custom tag support, and we're looking at the <a href="https://discuss.ocaml.org/t/ann-oxidizing-ocaml-an-update/15237">modes</a> work coming from Jane Street to see how to support that. 
There's plenty more to do, so if you'd like to lend a hand, reach out and join in!</p>]]></content> 1624 + </entry> 1625 + <entry> 1626 + <id>https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html</id> 1627 + <title>Semantic Versioning in OCaml is Hard</title> 1628 + <published>2025-04-20T00:00:00Z</published> 1629 + <updated>2025-04-20T00:00:00Z</updated> 1630 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html"/> 1631 + <summary>Semantic versioning is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. So, is it possible to make our OCaml libraries stick to this scheme? There are some projec...</summary> 1632 + <content type="html"><![CDATA[<h1 id="semantic-versioning-in-ocaml-is-hard"><a href="#semantic-versioning-in-ocaml-is-hard" class="anchor"></a>Semantic Versioning in OCaml is Hard</h1> 1633 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-20</p></li></ul> 1634 + <p><a href="https://semver.org">Semantic versioning</a> is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. So, is it possible to make our OCaml libraries stick to this scheme? There are some projects that are trying to do this, including a recent <a href="https://www.outreachy.org">Outreachy</a> project by <a href="https://github.com/azzsal/">Abdulaziz Alkurd</a> mentored by <a href="https://choum.net">panglesd</a> and <a href="https://github.com/nathanreb">Nathan Reb</a>. While this is a great start, there are some subtleties of the OCaml module system that make it a good deal more complex than in other languages.</p> 1635 + <h2 id="opam-format.2.3.0-≠-opam-format.2.3.0?"><a href="#opam-format.2.3.0-≠-opam-format.2.3.0?" class="anchor"></a>opam-format.2.3.0 ≠ opam-format.2.3.0?</h2> 1636 + <p>Let's take the case that hit me this morning. 
I've been working on <a href="https://github.com/ocurrent/ocaml-docs-ci">ocaml-docs-ci</a> in order to bring the exciting new <a href="https://ocaml.github.io/odoc">odoc 3</a> features to <a href="https://ocaml.org/">ocaml.org</a> for everyone to enjoy. I have it checked out and building locally, but to deploy it to the infrastructure managed by <a href="https://tunbury.org/">Mark Elvers</a> it needs to be packaged up into a Docker image. So I issued the usual <code>docker build .</code> and after it churned through the setup stages and got on to building the project, it hit an error:</p> 1637 + <pre>File &quot;src/solver/solver.ml&quot;, line 58, characters 75-98: 1638 + let deps = List.map (fun pkg -&gt; OpamPackage.Map.find pkg simple_deps) (OpamPackage.Set.to_list pkgs) in 1639 + Error: Unbound value OpamPackage.Set.to_list 1640 + Hint: Did you mean of_list?</pre> 1641 + <p>Now <code>OpamPackage</code> is a module in the <code>opam-format</code> library, which is easily discovered using the excellent <a href="https://doc.sherlocode.com/?q=OpamPackage">Sherlodoc</a> tool, so I checked what version I had locally, and what version I had in the Docker container, and it turned out I was using exactly the same version -- 2.3.0 -- both locally and in the container. So what's going on?</p> 1642 + <p>The problem is that the Dockerfile I was using was using OCaml version 4.14, whereas locally I was using OCaml 5.3. &quot;But how on earth can this cause the API of <code>opam-format</code> to change?&quot; I hear you wail! Well, this is actually one of the simpler outcomes of the way the OCaml module system works. Let's look at <a href="https://github.com/ocaml/opam/blob/2.3.0/src/format/opamPackage.mli">the code</a>.</p> 1643 + <p>The first thing we note is the absence of any definition of <code>Set</code> or <code>Map</code> here</p> 1644 + <ul><li>where do they come from? 
It turns out they come from <a href="https://github.com/ocaml/opam/blob/2.3.0/src/format/opamPackage.mli#L49">this line here</a>:</li></ul> 1645 + <div><pre class="language-ocaml"><code>include OpamStd.ABSTRACT with type t := t</code></pre></div> 1646 + <p>So let's take a look over in <code>opamStd.mli</code> to see what that signature looks like:</p> 1647 + <div><pre class="language-ocaml"><code>(** A signature for handling abstract keys and collections thereof *) 1648 + module type ABSTRACT = sig 1649 + 1650 + type t 1651 + 1652 + val compare: t -&gt; t -&gt; int 1653 + val equal: t -&gt; t -&gt; bool 1654 + val of_string: string -&gt; t 1655 + val to_string: t -&gt; string 1656 + val to_json: t OpamJson.encoder 1657 + val of_json: t OpamJson.decoder 1658 + 1659 + module Set: SET with type elt = t 1660 + module Map: MAP with type key = t 1661 + end</code></pre></div> 1662 + <p>OK, so we've found the definitions of <code>Set</code> and <code>Map</code> - they refer to signatures <code>SET</code> and <code>MAP</code> which are defined just above in <a href="https://github.com/ocaml/opam/blob/2.3.0/src/core/opamStd.mli#L17-L98">opamStd.mli</a>. Let's just look at <code>Set</code> since that's where the problem was:</p> 1663 + <div><pre class="language-ocaml"><code>module type SET = sig 1664 + 1665 + include Set.S 1666 + 1667 + val map: (elt -&gt; elt) -&gt; t -&gt; t 1668 + 1669 + val is_singleton: t -&gt; bool 1670 + 1671 + (** Returns one element, assuming the set is a singleton. Raises [Not_found] 1672 + on an empty set, [Failure] on a non-singleton. 
*) 1673 + val choose_one : t -&gt; elt 1674 + 1675 + val choose_opt: t -&gt; elt option 1676 + 1677 + val of_list: elt list -&gt; t 1678 + val to_list_map: (elt -&gt; 'b) -&gt; t -&gt; 'b list 1679 + val to_string: t -&gt; string 1680 + val to_json: t OpamJson.encoder 1681 + val of_json: t OpamJson.decoder 1682 + val find: (elt -&gt; bool) -&gt; t -&gt; elt 1683 + val find_opt: (elt -&gt; bool) -&gt; t -&gt; elt option 1684 + 1685 + (** Raises Failure in case the element is already present *) 1686 + val safe_add: elt -&gt; t -&gt; t 1687 + 1688 + (** Accumulates the resulting sets of a function of elements until a fixpoint 1689 + is reached *) 1690 + val fixpoint: (elt -&gt; t) -&gt; t -&gt; t 1691 + 1692 + (** [map_reduce f op t] applies [f] to every element of [t] and combines the 1693 + results using associative operator [op]. Raises [Invalid_argument] on an 1694 + empty set, or returns [default] if it is defined. *) 1695 + val map_reduce: ?default:'a -&gt; (elt -&gt; 'a) -&gt; ('a -&gt; 'a -&gt; 'a) -&gt; t -&gt; 'a 1696 + 1697 + module Op : sig 1698 + val (++): t -&gt; t -&gt; t (** Infix set union *) 1699 + 1700 + val (--): t -&gt; t -&gt; t (** Infix set difference *) 1701 + 1702 + val (%%): t -&gt; t -&gt; t (** Infix set intersection *) 1703 + end 1704 + 1705 + end</code></pre></div> 1706 + <p>Sure enough, there's no <code>to_list</code> function defined in there. Once again though, there's an <code>include Set.S</code> in there. It turns out that that refers to the <code>Set</code> module in the OCaml standard library. We can again <a href="https://github.com/ocaml/ocaml/blob/5.3.0/stdlib/set.mli">look at the source</a>:</p> 1707 + <div><pre class="language-ocaml"><code>val to_list : t -&gt; elt list 1708 + (** [to_list s] is {!elements}[ s]. 1709 + @since 5.1 *)</code></pre></div> 1710 + <p>And there it is. 
The <code>to_list</code> function has only been in the <code>Set</code> module since version 5.1.</p> 1711 + <h2 id="using-ocaml.org-docs"><a href="#using-ocaml.org-docs" class="anchor"></a>Using ocaml.org docs</h2> 1712 + <p>It was pretty difficult to figure that out from the source, but happily there's a better way. We can browse the docs on https://ocaml.org/ and look at the docs for the <a href="https://ocaml.org/p/opam-format/2.3.0/doc/OpamPackage/Set/index.html">OpamPackage.Set module</a> which, as of today, does not contain any <code>to_list</code> function. The <code>include Set.S</code> is there with the expansion showing all of the types and values coming from it, so we can click on the <code>Set.S</code> link on the include line which takes us to the documentation for the stdlib from OCaml 4.11.2. Changing the version from the dropdown at the top to something more recent takes us to a page containing the <code>to_list</code> function with the helpful <code>since 5.1</code> annotation.</p> 1713 + <p>This is, in fact, a relatively simple example of the sorts of issues that can occur that make semantic versioning a headache. In this example, it was a change due to a difference in the compiler version used, but there's nothing particularly special about that - a package may expose signatures derived from any of its dependencies! So is there anything we can do about this? Obviously, yes!</p> 1714 + <h2 id="towards-a-solution"><a href="#towards-a-solution" class="anchor"></a>Towards a solution</h2> 1715 + <p>Step 1 of any approach to solving this is to be able to identify which bits of a library's API come from which packages, and therefore which versions of those packages.
It turns out there may well be a nice way to piggy-back on a recent feature from Odoc, which was originally intended to help with suppressing spurious warnings.</p> 1716 + <p>The problem we were tackling was that if your library ends up exporting a module whose signature is defined in someone else's package, then any warnings that come from it are unfixable. To fix this we added a tag to each signature of a module that indicates which package it originally came from. Odoc is then very careful to keep track of this as it performs its signature manipulations, resulting in an accurate way to know which signature elements came from which package. This fixed the problem of the spurious warnings nicely.</p> 1717 + <p>Quite separately, we've got the docs CI that is attempting to build docs for every version of every package. Obviously given the above, in order to exhaustively show all the possible APIs of every library, we should build all possible combinations of every version of every package. Clearly we can't possibly do this, so the docs CI focuses on the goal of building at least one solution for every version of every package.</p> 1718 + <p>Now if we combine these two ideas, we can use the builds of the packages with the tracking of the package of the originating signatures to be able to precisely track the differences in API between different versions of a package.
This would allow us to build a database of those changes, and with this in hand we can look at what APIs are used in any other package and be able to suggest upper and lower bounds on the versions of its dependencies.</p> 1719 + <p>Now wouldn't that be cool?</p>]]></content> 1720 + </entry> 1721 + <entry> 1722 + <id>https://jon.recoil.org/blog/2025/04/meeting-the-team.html</id> 1723 + <title>Meeting the Team</title> 1724 + <published>2025-04-08T00:00:00Z</published> 1725 + <updated>2025-04-08T00:00:00Z</updated> 1726 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/meeting-the-team.html"/> 1727 + <summary>It's tremendously exciting to be back in the Computer Laboratory, as the last time I worked here was just before the pandemic. I'm now a member of the Energy and Environment Group whose goal is &quot;to have a measurable impact on tools and techniques ...</summary> 1728 + <content type="html"><![CDATA[<h1 id="meeting-the-team"><a href="#meeting-the-team" class="anchor"></a>Meeting the Team</h1> 1729 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-08</p></li></ul> 1730 + <p>It's tremendously exciting to be back in the <a href="https://www.cst.cam.ac.uk/">Computer Laboratory</a>, as the last time I worked here was just before the pandemic. I'm now a member of the <a href="https://www.cst.cam.ac.uk/research/eeg">Energy and Environment Group</a> whose goal is &quot;to have a measurable impact on tools and techniques for de-risking the future&quot;.</p> 1731 + <h2 id="what's-going-on?"><a href="#what's-going-on?" class="anchor"></a>What's going on?</h2> 1732 + <p>With such a broad goal, it's hard to know where to start and how I'll fit in, so my first few weeks have been spent getting to know the other members of the group and what they're up to.
It's an incredibly inspiring group of individuals who are all doing amazing work, and it's really humbling and daunting to be a part of it.</p> 1733 + <p>There's some really interesting work going on in our group on LLMs, principally led by the fantastic <a href="https://toao.com/">Sadiq Jaffer</a>. We had a chat a few weeks ago and have started to explore some ideas around seeing how well LLMs can program in OCaml already before we start to do some RL training on them. Having not done any LLM stuff before, it's a steep learning curve for me, but we're already seeing some interesting results. We should have some more to say about this in the coming weeks.</p> 1734 + <p>Last week I met with <a href="https://digitalflapjack.com/">Michael Dales</a>, and he talked about the project <a href="https://github.com/quantifyearth/shark">shark</a> that he and <a href="patrick.sirref.org">Patrick Ferris</a> have been working on. It's kind of a mix between a shell and a jupyter-style notebook, with a strong focus on reproducibility. The traditional pain of notebooks is, of course, the execution model, whereby cells might be executed in any order you like. This means that the state you find the notebook in might not be even reachable again, let alone consistently reproducible. Shark is trying to address this by using file-system snapshots and clever analysis of the inputs and outputs of each cell to both ensure reproducibility, but also to allow a fast editing cycle, rerunning of only the bits that need to be rerun, even in the presence of slow data processing steps. 
It's a fascinating project, and I can't wait to see it in action when Michael gives us a demo!</p> 1735 + <p>I also met up with <a href="https://ryan.freumh.org">Ryan Gibb</a> and <a href="https://www.dra27.uk/blog/">David Allsopp</a>, and we had a good chat about Ryan's project <a href="https://github.com/RyanGibb/babel">Babel</a>, which is using the <a href="https://nex3.medium.com/pubgrub-2fb6470504f">PubGrub</a> algorithm to do package resolution for multiple package domains at once. We've got a number of avenues to explore here, from building a PubGrub implementation in OxCaml, to using Babel to construct Docker images for opam packages entirely from scratch, without using a base image.</p> 1736 + <p>With my other hat on as a member of the CTO office at <a href="https://tarides.com/">Tarides</a>, I'm very much looking forward to using OCaml and OxCaml to solve some real-world problems that are in an entirely different domain than I've been used to over the last few years.</p>]]></content> 1737 + </entry> 1738 + <entry> 1739 + <id>https://jon.recoil.org/blog/2025/04/this-site.html</id> 1740 + <title>This site</title> 1741 + <published>2025-04-07T00:00:00Z</published> 1742 + <updated>2025-04-07T00:00:00Z</updated> 1743 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/04/this-site.html"/> 1744 + <summary>I've spent a lot of time over the past few years working on Odoc, the OCaml documentation generator, so when it came time to (re)start my own website and blog, I found it hard to resist thinking about ho...</summary> 1745 + <content type="html"><![CDATA[<h1 id="this-site"><a href="#this-site" class="anchor"></a>This site</h1> 1746 + <ul class="at-tags"><li class="x-ocaml.requires"><span class="at-tag">x-ocaml.requires</span> <p>mime_printer</p></li></ul> 1747 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-04-07</p></li></ul> 1748 + <p>I've spent a <em>lot</em> of time over the past few years working on Odoc, the
OCaml documentation generator, so when it came time to (re)start my own website and blog, I found it hard to resist thinking about how I might use odoc as part of it. We've spent a lot of time recently trying to make odoc more able to generate structured documentation sites, so I've gone all in and am trialling using it as a tool to generate my entire site. This is a bit of an experiment, and I don't know how well it will work out, but let's see how it goes.</p> 1749 + <p>Additionally, I've recently been working on a project currently called <code>odoc_notebook</code>, which is a set of tools to allow odoc <code>mld</code> files to be used as a sort of Jupyter-style notebook. The idea is that you can write both text and code in the same file, and then run the code in the notebook interactively. Since I've only got a webserver, all the execution of code has to be done client side, so I'm making extensive use of the phenomenal <a href="https://github.com/ocsigen/js_of_ocaml">Js_of_ocaml</a> project to get an OCaml engine running in the browser.</p> 1750 + <p>My focus has initially been on getting 'toplevel-style' code execution working. As an example, let's write a little demo.</p> 1751 + <h2 id="demo"><a href="#demo" class="anchor"></a>Demo</h2> 1752 + <p>Let's start with a little demo:</p> 1753 + <div><pre class="language-ocaml"><code>let x = 1 + 2</code></pre></div> 1754 + <p>It's intended to look like an OCaml toplevel session, so each new expression starts with a <code>#</code> and is terminated with a double semicolon. The response from the toplevel is then below that indented with 2 spaces. Right now, there's not much in the way of error checking so you can make it all very confused by deleting the hash, removing the <code>;;</code> and so on. Avoiding this, however, you can edit the numbers here and hit 'run' (maybe twice!) 
to see the results being updated.</p> 1755 + <p>There is also a little integration to allow the code to produce output more interesting than just text. The following cell creates an SVG image and 'pushes' it to <code>Mime_printer</code>, which receives the mime value and renders it in the browser below the code block.</p> 1756 + <div><pre class="language-ocaml"><code>let svg = [ 1757 + {|&lt;svg height=&quot;210&quot; width=&quot;500&quot; xmlns=&quot;http://www.w3.org/2000/svg&quot;&gt;|}; 1758 + {|&lt;polygon points=&quot;100,10 40,198 190,78 10,78 160,198&quot; |}; 1759 + {|style=&quot;fill:lime;stroke:purple;stroke-width:5;&quot;/&gt;&lt;/svg&gt;|}];; 1760 + 1761 + Mime_printer.push &quot;image/svg&quot; (String.concat &quot;\n&quot; svg)</code></pre></div> 1762 + <h2 id="things-to-come"><a href="#things-to-come" class="anchor"></a>Things to come</h2> 1763 + <h3 id="merlin-support"><a href="#merlin-support" class="anchor"></a>Merlin support</h3> 1764 + <p>There are a bunch of things I want to add to this, for example, Merlin support. In fact, <a href="https://github.com/voodoos/merlin-js">merlin-js</a> already exists and works, thanks to the fantastic work of <a href="https://github.com/voodoos">Ulysse</a>, but the problem is that it's not really designed for toplevel work, and it doesn't work when the code is broken up into chunks like I do here. So either I need to concatenate all the cells together before I give it to Merlin, or I need to make each cell its own little module and 'open' every previous cell's module.</p> 1765 + <p>Within a single cell, it does already work. You can see that Merlin is correctly underlining the error in the following cell.
You should also be able to hover over the variables and see their types.</p> 1766 + <div><pre class="language-ocaml"><code>type t = { foo : int; bar : string };; 1767 + 1768 + let x = { foo = 1; bar = &quot;hello&quot; };; 1769 + 1770 + let this_line_has_an_error = { foo = 1; bar = None };;</code></pre></div> 1771 + <p>But across cells, I've broken Merlin, though the code executes correctly. You can see the problem in the following cell, which re-pushes the SVG image using the variable <code>svg</code> defined in the cell above. Merlin highlights the use of the variable <code>svg</code>, because it's not aware of the variable, but the code gets executed correctly and the image is rendered below the cell.</p> 1772 + <div><pre class="language-ocaml"><code>Mime_printer.push &quot;image/svg&quot; (String.concat &quot;\n&quot; svg)</code></pre></div> 1773 + <p>Edit 2025-05-20: I have now got Merlin working across cells, though I'm not convinced the current solution is the right long-term solution.</p> 1774 + <h3 id="dynamic-libraries"><a href="#dynamic-libraries" class="anchor"></a>Dynamic libraries</h3> 1775 + <p>Currently the use of libraries is quite limited - they are defined more-or-less statically. I've had dynamic libraries working in the past, but I need to re-implement them. The plan is to have the <code>cma</code> files converted to <code>js</code> files and then load them on-demand when the notebook specifies them. The tricky thing here is that we need to be able to use them both in the browser and in bytecode executables so that the 'test-promote' workflow still works.
This will probably require specifying the libraries by name, and having to re-implement the work that <a href="https://projects.camlcity.org/projects/findlib.html">findlib</a> does to find the libraries and load them and their dependencies in the right order, though this time entirely over HTTP.</p> 1776 + <h3 id="other-things"><a href="#other-things" class="anchor"></a>Other things</h3> 1777 + <p>There are loads of other things I'm interested in doing, including:</p> 1778 + <ul><li>Investigating how to do 'exercises' to allow readers to try things out in a guided way</li><li>'Test cells' to see if implementations are correct</li><li>Persistence of the notebook state - both using local and remote storage</li><li>Integration of docs</li><li>Exploration of the execution model - how to run the code in the right order and ensure reproducibility</li><li>Use of remote execution engines rather than just in the browser</li><li>Other languages?</li></ul> 1779 + <p>Right now though, my focus is on the functionality required for this blog, with a secondary goal of looking at how we might use this sort of technology on the docs site on ocaml.org. Wouldn't it be cool to be able to drop into a live OCaml toplevel for any library in opam?</p> 1780 + <h2 id="example-notebooks"><a href="#example-notebooks" class="anchor"></a>Example notebooks</h2> 1781 + <p>As a more extended example of odoc notebooks, I have converted to this format the course that I help teach at the University of Cambridge; <a href="https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/">Foundations of Computer Science</a>. 
<a href="../../../notebooks/foundations/index.html" title="index">Try them out for yourself!</a></p>]]></content> 1782 + </entry> 1783 + <entry> 1784 + <id>https://jon.recoil.org/blog/2025/03/module-type-of.html</id> 1785 + <title>The Road to Odoc 3: Module Type Of</title> 1786 + <published>2025-03-08T00:00:00Z</published> 1787 + <updated>2025-03-08T00:00:00Z</updated> 1788 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/03/module-type-of.html"/> 1789 + <summary>There are many new and improved features that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an overhaul of &quot;module type of&quot; that landed in May 2024.</summary> 1790 + <content type="html"><![CDATA[<h1 id="the-road-to-odoc-3:-module-type-of"><a href="#the-road-to-odoc-3:-module-type-of" class="anchor"></a>The Road to Odoc 3: Module Type Of</h1> 1791 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-03-08</p></li></ul> 1792 + <p>There are <a href="https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043">many new and improved features</a> that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an <a href="https://github.com/ocaml/odoc/pull/1081">overhaul of &quot;module type of&quot;</a> that landed in May 2024.</p> 1793 + <h2 id="module-type-of"><a href="#module-type-of" class="anchor"></a>Module Type Of</h2> 1794 + <p>module type of is a language feature of OCaml allowing one to recover the signature of an existing module.
For example, if I had a module <code>X</code>:</p> 1795 + <div><pre class="language-ocaml"><code>module X = struct 1796 + type t = Foo | Bar 1797 + end</code></pre></div> 1798 + <p>then I can get back the signature of <code>X</code> using <code>module type of</code>:</p> 1799 + <div><pre class="language-ocaml"><code>module type Xsig = module type of X</code></pre></div> 1800 + <p>which can be very useful if you’re trying to <a href="https://discuss.ocaml.org/t/extend-existing-module/1389">extend existing modules</a> amongst other things.</p> 1801 + <p>OCaml and Odoc treat module type of in somewhat different ways. OCaml internally expands the expression immediately it sees it, and effectively replaces it with the signature - ie, in the above example Xsig is now a signature, not a module type of expression.</p> 1802 + <p>In contrast, Odoc would like to keep track of the fact that this signature came from a <code>module type of</code> expression, as it’s very useful to know. If you’re extending a module, your signature might look like:</p> 1803 + <div><pre class="language-ocaml"><code>module type UnitExtended = sig 1804 + include module type of Unit 1805 + val extra_unit_function : unit -&gt; unit 1806 + end</code></pre></div> 1807 + <p>The documentation we produce will expand the contents of the <code>include</code> statement, but keep track of the fact that it came from a <code>module type of</code> expression so the reader can see where these signature items came from. In practice, you'd probably want to use <code>module type of struct include Unit end</code>, which is a bit different from simply <code>module type of Unit</code>, and I'll talk about this at some point in a future post.</p> 1808 + <h2 id="the-problem"><a href="#the-problem" class="anchor"></a>The problem</h2> 1809 + <p>We run into difficulties as soon as we introduce another language feature that operates on signatures: with. 
Let’s start with a module type <code>S</code>:</p> 1810 + <div><pre class="language-ocaml"><code>module type S = sig 1811 + module X : sig 1812 + type t = int 1813 + end 1814 + 1815 + module type Y = 1816 + module type of X 1817 + end</code></pre></div> 1818 + <p>We’ll now define a new module <code>X2</code> that we intend to use as a replacement for <code>X</code>:</p> 1819 + <div><pre class="language-ocaml"><code>module X2 = struct 1820 + type t = int 1821 + type u = float 1822 + end</code></pre></div> 1823 + <p>Now we’ll define a new module type <code>T</code> which is <code>S</code> but with <code>X</code> replaced:</p> 1824 + <div><pre class="language-ocaml"><code>module type T = S with module X := X2</code></pre></div> 1825 + <p>Here you can see that OCaml has expanded the <code>module type of</code> expressions and told us the computed signature. The interesting thing here is that in module type <code>T</code>, module type <code>Y</code> only has a type <code>t</code> in it, not a type <code>u</code>. As above, Odoc wants to keep the <code>module type of</code> expression so the reader can tell where module type <code>Y</code> came from. However, the substitution would do a different thing in this case - we would have the following:</p> 1826 + <div><pre class="language-ocaml"><code>module type T = sig 1827 + module type Y = module type of X2 1828 + end</code></pre></div> 1829 + <p>and the expansion of this would then clearly have both types <code>t</code> and <code>u</code> in it.</p> 1830 + <p>So now Odoc has two problems: We need to compute the correct signature, and we need to be able to describe how we computed it.</p> 1831 + <h2 id="the-solution"><a href="#the-solution" class="anchor"></a>The solution</h2> 1832 + <p>The previous solution to this was to have a ‘phase 0’ of odoc which would compute the expansions of all module type of expressions before doing any other work. 
This was necessary because of a ‘simplifying’ assumption in how we handled the typing environment. The new, simpler approach was to calculate the expansion during the normal flow of work, and never to attempt to recalculate it, but simply operate on the signature. This was a nice big simplification and optimisation that removed a few corner cases in the previous code (including an <a href="https://github.com/ocaml/odoc/blob/v2.4/src/xref2/type_of.ml#L167-L174">infinite loop</a> that we <em>hoped</em> always terminated…!)</p> 1833 + <p>The second issue was how to describe it. We still want it to be clear that this signature was derived from another, but we can’t honestly say that, in the above example, it’s <code>module type of X2</code>. The answer is that we have applied a transparent ascription to the signature. Essentially, the signature is <code>X2</code> but constrained to only have the fields of <code>X</code>.</p> 1834 + <p>This is not a current feature of OCaml, though Jane Street has <a href="https://blog.janestreet.com/plans-for-ocaml-408/">done some work</a> on this, including declaring the syntax: <code>X2 &lt;: X</code>. However, there’s another interesting wrinkle here. <code>X</code> is a module defined in the module type <code>S</code>, so it’s not possible to write a valid OCaml path that points to it – <code>S.X</code> has no meaning. In addition, the right-hand side of the <code>&lt;:</code> operator should be a module type, so we’d actually need to write <code>X2 &lt;: module type of S.X</code>.
We’re still figuring out the right thing to do here, so for now Odoc 3 will still pretend that it’s simply <code>module type of X2</code>.</p>]]></content> 1835 + </entry> 1836 + <entry> 1837 + <id>https://jon.recoil.org/blog/2025/03/code-block-metadata.html</id> 1838 + <title>Code block metadata</title> 1839 + <published>2025-03-07T00:00:00Z</published> 1840 + <updated>2025-03-07T00:00:00Z</updated> 1841 + <link rel="alternate" href="https://jon.recoil.org/blog/2025/03/code-block-metadata.html"/> 1842 + <summary>Back in 2021 julow introduced some new syntax to odoc’s code blocks to allow us to attach arbitrary metadata to the blocks. We imposed no structure on this; it was simply a block of text in between the language ta...</summary> 1843 + <content type="html"><![CDATA[<h1 id="code-block-metadata"><a href="#code-block-metadata" class="anchor"></a>Code block metadata</h1> 1844 + <ul class="at-tags"><li class="published"><span class="at-tag">published</span> <p>2025-03-07</p></li></ul> 1845 + <p>Back in 2021 <a href="https://github.com/julow">julow</a> introduced some <a href="https://github.com/ocaml-doc/odoc-parser/pull/2">new syntax</a> to odoc’s code blocks to allow us to attach arbitrary metadata to the blocks. We imposed no structure on this; it was simply a block of text in between the language tag and the start of the code block. Now that odoc needs to use it itself, we need to be a bit more precise about how it’s defined.</p> 1846 + <p>The original concept looked like this:</p> 1847 + <pre>{@ocaml metadata goes here in an unstructured way[ 1848 + ... code ... 1849 + ]}</pre> 1850 + <p>where everything in between the language (“ocaml” in this case) and the opening square bracket would be captured and put into the AST verbatim.
Odoc itself has had no particular use for this, but it has been used in <a href="https://github.com/realworldocaml/mdx">mdx</a> to control how it handles the code blocks, for example to skip processing of the block, to synchronise the block with another file, to disable testing the block on particular OSs and so on.</p> 1851 + <p>As part of the Odoc 3 release we decided to address one of our <a href="https://github.com/ocaml/odoc/pull/303">oldest open issues</a>, that of extracting code blocks from mli/mld files for inclusion into other files. This is similar to the file-sync facility in mdx but it works in the other direction: the canonical source is in the mld/mli file. In order to do this, we now need to use the metadata so we can select which code blocks to extract, and so we needed a more concrete specification of how the metadata should be parsed.</p> 1852 + <p>We looked at what <a href="https://github.com/realworldocaml/mdx/blob/main/lib/label.ml#L195-L210">mdx does</a>, but the way it works is rather ad-hoc, using very simple String.splits to chop up the metadata. This is OK for mdx as it’s fully in charge of what things the user might want to put into the metadata, but for a general parsing library like odoc.parser we need to be a bit more careful. Daniel Bünzli <a href="https://github.com/ocaml/odoc/pull/1326#issuecomment-2702260053">suggested</a> a simple strategy of atoms and bindings inspired by s-expressions. The idea is that we can have something like this:</p> 1853 + <pre>{@ocaml atom1 &quot;atom two&quot; key1=value1 &quot;key 2&quot;=&quot;value with spaces&quot;[ 1854 + ... code content ... 
1855 + ]}</pre> 1856 + <p>Daniel suggested a very minimal escaping rule, whereby a string could contain a literal &quot; by prefixing it with a backslash, something like &quot;value with a \&quot; and spaces&quot;, but we discussed it during the <a href="https://ocaml.org/governance/platform">odoc developer meeting</a> and felt that we might want something a little more familiar. So we took a look at the lexer in <a href="https://github.com/janestreet/sexplib/blob/master/src/lexer.mll">sexplib</a> and found that it follows the <a href="https://github.com/janestreet/sexplib/blob/d7c5e3adc16fcf0435220c3cd44bb695775020c1/README.org#lexical-conventions-of-s-expression">lexical conventions</a> of OCaml’s strings, and decided that would be a reasonable approach for us to follow too.</p> 1857 + <p>The resulting code, including the extraction logic, was implemented in <a href="https://github.com/ocaml/odoc/pull/1326/">PR 1326</a> mainly by <a href="https://github.com/panglesd">panglesd</a> with a little help from me on the lexer.</p>]]></content> 1858 + </entry> 1859 + </feed>
+4 -8
scripts/dune
··· 1 - (executable 2 - (name gen_atom) 3 - (libraries syndic uri unix ptime bos odoc.odoc odoc.html odoc.document 4 - odoc.model fmt tyxml yojson ISO8601)) 5 - 6 - (executable 7 - (name gen_blog_index) 8 - (libraries str)) 1 + (executables 2 + (names gen_atom gen_blog_index) 3 + (libraries unix bos fpath odoc.odoc odoc.html odoc.document odoc.model fmt 4 + tyxml str))
+212 -156
scripts/gen_atom.ml
··· 1 - (* Generate an atom feed from compiled odocl blog posts. *) 1 + (* Generate an Atom feed from compiled odocl blog posts. 2 2 3 - let id = Uri.of_string "https://jon.recoil.org/atom.xml" 4 - let title : Syndic.Atom.text_construct = Syndic.Atom.Text "Jon's blog" 3 + Uses odoc's HTML generator to render full post content into the feed, 4 + so feed readers get the same HTML as the website. No external Atom 5 + library — we emit the XML directly. *) 5 6 6 - let author = 7 - Syndic.Atom.author "Jon Ludlam" ~uri:(Uri.of_string "https://jon.recoil.org/") 7 + (** {1 Date helpers} *) 8 8 9 - let updated = Unix.gettimeofday () |> Ptime.of_float_s |> Option.get 9 + (** Parse an ISO 8601 date string like "2026-03-02" into (y, m, d). *) 10 + let parse_date s = 11 + try Scanf.sscanf s "%d-%d-%d" (fun y m d -> Some (y, m, d)) 12 + with _ -> None 10 13 11 - (** Extract the text content from a custom tag's payload. 12 - Custom tags like [@published 2026-03-02] are stored in Comment.docs 13 - as [`Tag (`Custom ("published", elements))]. The elements are 14 - nestable block elements — typically a single paragraph containing 15 - words and spaces. *) 14 + (** Format a date triple as an Atom datetime (midnight UTC). *) 15 + let atom_datetime (y, m, d) = 16 + Printf.sprintf "%04d-%02d-%02dT00:00:00Z" y m d 17 + 18 + (** {1 XML helpers} *) 19 + 20 + let xml_escape s = 21 + let buf = Buffer.create (String.length s) in 22 + String.iter 23 + (function 24 + | '&' -> Buffer.add_string buf "&amp;" 25 + | '<' -> Buffer.add_string buf "&lt;" 26 + | '>' -> Buffer.add_string buf "&gt;" 27 + | '"' -> Buffer.add_string buf "&quot;" 28 + | c -> Buffer.add_char buf c) 29 + s; 30 + Buffer.contents buf 31 + 32 + (** {1 Odoc content extraction} *) 33 + 34 + (** Extract text from a custom tag's payload. 35 + Custom tags like [@published 2026-03-02] are stored as 36 + [`Tag (`Custom ("published", elements))]. 
*) 16 37 let text_of_tag_payload elements = 17 38 let buf = Buffer.create 32 in 18 39 List.iter 19 - (fun (el : Odoc_model.Comment.nestable_block_element Odoc_model.Location_.with_location) -> 40 + (fun (el : 41 + Odoc_model.Comment.nestable_block_element 42 + Odoc_model.Location_.with_location) -> 20 43 match el.Odoc_model.Location_.value with 21 44 | `Paragraph inlines -> 22 45 List.iter 23 - (fun (il : Odoc_model.Comment.inline_element Odoc_model.Location_.with_location) -> 46 + (fun (il : 47 + Odoc_model.Comment.inline_element 48 + Odoc_model.Location_.with_location) -> 24 49 match il.value with 25 50 | `Word w -> Buffer.add_string buf w 26 51 | `Space -> Buffer.add_char buf ' ' ··· 33 58 (** Find a custom tag by name in the page's content elements. *) 34 59 let find_custom_tag name (docs : Odoc_model.Comment.docs) = 35 60 List.find_map 36 - (fun (el : Odoc_model.Comment.block_element Odoc_model.Location_.with_location) -> 61 + (fun (el : 62 + Odoc_model.Comment.block_element 63 + Odoc_model.Location_.with_location) -> 37 64 match el.value with 38 65 | `Tag (`Custom (n, payload)) when n = name -> 39 66 Some (text_of_tag_payload payload) 40 67 | _ -> None) 41 68 docs.elements 42 69 43 - let entry_of_mld odoc_file = 70 + (** {1 Entry type and extraction} *) 71 + 72 + type entry = { 73 + url : string; 74 + title : string; 75 + summary : string; 76 + content : string; 77 + published : int * int * int; 78 + } 79 + 80 + let entry_of_odocl odoc_file = 44 81 let report_error during msg = 45 - Format.eprintf "Error processing file '%s' while %s: %s\n%!" 46 - (Fpath.to_string odoc_file) 47 - during msg; 82 + Format.eprintf "Error processing '%s' while %s: %s\n%!" 
83 + (Fpath.to_string odoc_file) during msg; 48 84 None 49 85 in 50 - let unit = 51 - match Odoc_odoc.Odoc_file.load odoc_file with 52 - | Ok f -> Some f 53 - | Error (`Msg m) -> 54 - ignore (report_error "loading file" m); 55 - None 56 - in 57 - match unit with 58 - | None -> None 59 - | Some unit -> ( 60 - let page = 61 - match unit.content with 62 - | Odoc_odoc.Odoc_file.Page_content page -> Some page 63 - | _ -> None 64 - in 65 - match page with 66 - | None -> None 67 - | Some page -> ( 68 - let document = 69 - Odoc_document.Renderer.document_of_page ~syntax:OCaml page 70 - in 71 - let published = find_custom_tag "published" page.content in 72 - match published with 73 - | None -> None (* Skip posts without published date *) 74 - | Some published -> ( 75 - match document with 76 - | Odoc_document.Types.Document.Source_page _ -> None 77 - | Odoc_document.Types.Document.Page p -> 78 - let first_heading = 79 - List.find_map 80 - (function 81 - | Odoc_document.Types.Item.Heading h -> Some h 82 - | _ -> None) 83 - p.preamble 84 - in 85 - match first_heading with 86 - | None -> 87 - ignore (report_error "parsing title" "No heading found"); 88 - None 89 - | Some first_heading -> 90 - let title = 91 - List.filter_map 92 - (function 93 - | Odoc_document.Types.Inline.{ desc = Text t; _ } -> Some t 94 - | _ -> None) 95 - first_heading.title 86 + match Odoc_odoc.Odoc_file.load odoc_file with 87 + | Error (`Msg m) -> report_error "loading file" m 88 + | Ok unit -> ( 89 + match unit.content with 90 + | Odoc_odoc.Odoc_file.Page_content page -> ( 91 + let published_str = find_custom_tag "published" page.content in 92 + match published_str with 93 + | None -> None 94 + | Some s when s = "never" || s = "draft" -> None 95 + | Some published_str -> ( 96 + match parse_date published_str with 97 + | None -> 98 + Format.eprintf "Bad date '%s' in %s\n%!" 
published_str 99 + (Fpath.to_string odoc_file); 100 + None 101 + | Some published -> ( 102 + let document = 103 + Odoc_document.Renderer.document_of_page ~syntax:OCaml page 96 104 in 97 - let title = String.concat "" title in 98 - if title = "" then None 99 - else 100 - let resolve = Odoc_html.Link.Current p.url in 101 - let config = 102 - Odoc_html.Config.v ~semantic_uris:false ~indent:false 103 - ~flat:false ~open_details:false ~as_json:false ~remap:[] () 104 - in 105 - let url = Odoc_html.Generator.filepath p.url ~config in 106 - let url = 107 - Format.asprintf "https://jon.recoil.org/%s" 108 - (Fpath.to_string url) 109 - in 110 - (* Generate full content: preamble + items *) 111 - let all_items = p.preamble @ p.items in 112 - let html = Odoc_html.Generator.items ~config ~resolve all_items in 113 - let content_fmt = Fmt.list (Tyxml.Html.pp_elt ()) in 114 - let content = Format.asprintf "%a" content_fmt html in 115 - (* Extract first paragraph for summary *) 116 - let summary = 117 - let first_text = 105 + match document with 106 + | Odoc_document.Types.Document.Source_page _ -> None 107 + | Odoc_document.Types.Document.Page p -> ( 108 + let first_heading = 118 109 List.find_map 119 110 (function 120 - | Odoc_document.Types.Item.Text blocks -> 121 - List.find_map 122 - (function 123 - | { Odoc_document.Types.Block.desc = 124 - Odoc_document.Types.Block.Paragraph inline; 125 - _ 126 - } -> 127 - let text = 128 - List.filter_map 129 - (function 130 - | Odoc_document.Types.Inline. 131 - { desc = Text t; _ } -> 132 - Some t 133 - | _ -> None) 134 - inline 135 - in 136 - if text = [] then None 137 - else Some (String.concat "" text) 138 - | _ -> None) 139 - blocks 111 + | Odoc_document.Types.Item.Heading h -> Some h 140 112 | _ -> None) 141 113 p.preamble 142 114 in 143 - match first_text with 144 - | Some t -> 145 - if String.length t > 200 then 146 - String.sub t 0 200 ^ "..." 
147 - else t 148 - | None -> title 149 - in 150 - let published = 151 - try 152 - ISO8601.Permissive.date published |> Ptime.of_float_s 153 - with _ -> 154 - Format.eprintf "Error parsing date '%s' for %s\n%!" 155 - published (Fpath.to_string odoc_file); 156 - None 157 - in 158 - match published with 159 - | None -> None 160 - | Some published -> 161 - Some 162 - (Syndic.Atom.entry ~id:(Uri.of_string url) 163 - ~title:(Syndic.Atom.Text title) 164 - ~published ~updated:published 165 - ~summary:(Syndic.Atom.Text summary) 166 - ~content:(Syndic.Atom.Html (None, content)) 167 - ~links: 168 - [ 169 - Syndic.Atom.link ~rel:Syndic.Atom.Alternate 170 - (Uri.of_string url); 171 - ] 172 - ~authors:(author, []) ())))) 115 + match first_heading with 116 + | None -> report_error "parsing" "no heading found" 117 + | Some h -> 118 + let title = 119 + List.filter_map 120 + (function 121 + | Odoc_document.Types.Inline.{ desc = Text t; _ } 122 + -> 123 + Some t 124 + | _ -> None) 125 + h.title 126 + |> String.concat "" 127 + in 128 + if title = "" then None 129 + else 130 + let config = 131 + Odoc_html.Config.v ~semantic_uris:false 132 + ~indent:false ~flat:false ~open_details:false 133 + ~as_json:false ~remap:[] () 134 + in 135 + let resolve = Odoc_html.Link.Current p.url in 136 + let url = 137 + let fp = 138 + Odoc_html.Generator.filepath p.url ~config 139 + in 140 + Format.asprintf "https://jon.recoil.org/%s" 141 + (Fpath.to_string fp) 142 + in 143 + let all_items = p.preamble @ p.items in 144 + let html = 145 + Odoc_html.Generator.items ~config ~resolve 146 + all_items 147 + in 148 + let content = 149 + Format.asprintf "%a" 150 + (Fmt.list (Tyxml.Html.pp_elt ())) 151 + html 152 + in 153 + let summary = 154 + let first_text = 155 + List.find_map 156 + (function 157 + | Odoc_document.Types.Item.Text blocks -> 158 + List.find_map 159 + (function 160 + | { Odoc_document.Types.Block.desc = 161 + Odoc_document.Types.Block 162 + .Paragraph inline; 163 + _ 164 + } -> 165 + let text = 
166 + List.filter_map 167 + (function 168 + | Odoc_document.Types 169 + .Inline 170 + .{ desc = Text t; _ } 171 + -> 172 + Some t 173 + | _ -> None) 174 + inline 175 + in 176 + if text = [] then None 177 + else 178 + Some (String.concat "" text) 179 + | _ -> None) 180 + blocks 181 + | _ -> None) 182 + p.preamble 183 + in 184 + match first_text with 185 + | Some t when String.length t > 200 -> 186 + String.sub t 0 200 ^ "..." 187 + | Some t -> t 188 + | None -> title 189 + in 190 + Some { url; title; summary; content; published })))) 191 + | _ -> None) 192 + 193 + (** {1 Discovery and sorting} *) 173 194 174 195 let is_blog_post path = 175 196 let basename = Fpath.basename path in ··· 178 199 && String.sub basename 0 5 = "page-" 179 200 && basename <> "page-index.odocl" 180 201 181 - let entries = 202 + let compare_entries a b = 203 + (* Newest first *) 204 + let (y1, m1, d1) = b.published in 205 + let (y2, m2, d2) = a.published in 206 + compare (y1, m1, d1) (y2, m2, d2) 207 + 208 + (** {1 Atom XML generation} *) 209 + 210 + let write_atom entries out_path = 211 + let oc = open_out out_path in 212 + let p = Printf.fprintf in 213 + let now = 214 + let t = Unix.gettimeofday () in 215 + let tm = Unix.gmtime t in 216 + Printf.sprintf "%04d-%02d-%02dT%02d:%02d:%02dZ" 217 + (tm.tm_year + 1900) (tm.tm_mon + 1) tm.tm_mday 218 + tm.tm_hour tm.tm_min tm.tm_sec 219 + in 220 + p oc "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"; 221 + p oc "<feed xmlns=\"http://www.w3.org/2005/Atom\">\n"; 222 + p oc " <id>https://jon.recoil.org/atom.xml</id>\n"; 223 + p oc " <title>Jon's blog</title>\n"; 224 + p oc " <updated>%s</updated>\n" now; 225 + p oc " <author>\n"; 226 + p oc " <name>Jon Ludlam</name>\n"; 227 + p oc " <uri>https://jon.recoil.org/</uri>\n"; 228 + p oc " </author>\n"; 229 + p oc " <link rel=\"self\" href=\"https://jon.recoil.org/atom.xml\"/>\n"; 230 + p oc " <link rel=\"alternate\" href=\"https://jon.recoil.org/blog/\"/>\n"; 231 + List.iter 232 + (fun e -> 233 + let date = 
atom_datetime e.published in 234 + p oc " <entry>\n"; 235 + p oc " <id>%s</id>\n" (xml_escape e.url); 236 + p oc " <title>%s</title>\n" (xml_escape e.title); 237 + p oc " <published>%s</published>\n" date; 238 + p oc " <updated>%s</updated>\n" date; 239 + p oc " <link rel=\"alternate\" href=\"%s\"/>\n" (xml_escape e.url); 240 + p oc " <summary>%s</summary>\n" (xml_escape e.summary); 241 + p oc " <content type=\"html\"><![CDATA[%s]]></content>\n" e.content; 242 + p oc " </entry>\n") 243 + entries; 244 + p oc "</feed>\n"; 245 + close_out oc 246 + 247 + (** {1 Main} *) 248 + 249 + let () = 250 + let odocl_dir = Fpath.v "_build/default/site/_odoc/blog" in 182 251 let mlds = 183 252 Bos.OS.Dir.fold_contents 184 253 (fun path acc -> if is_blog_post path then path :: acc else acc) 185 - [] 186 - (Fpath.v "_build/default/site/_odoc/blog") 254 + [] odocl_dir 187 255 in 188 256 match mlds with 189 - | Ok mlds -> 190 - let entries = List.filter_map entry_of_mld mlds in 191 - (* Sort by published date, newest first *) 192 - List.sort Syndic.Atom.descending entries 193 257 | Error (`Msg m) -> 194 258 Format.eprintf "Error finding blog posts: %s\n%!" m; 195 - [] 196 - 197 - let self_link = 198 - Syndic.Atom.link ~rel:Self (Uri.of_string "https://jon.recoil.org/atom.xml") 199 - 200 - let alt_link = 201 - Syndic.Atom.link ~rel:Alternate (Uri.of_string "https://jon.recoil.org/blog/") 202 - 203 - let feed = 204 - Syndic.Atom.feed ~id ~title ~updated ~links:[ self_link; alt_link ] entries 205 - 206 - let _ = 207 - Syndic.Atom.write feed "atom.xml"; 208 - Format.printf "Generated atom.xml with %d entries\n%!" (List.length entries) 259 + exit 1 260 + | Ok mlds -> 261 + let entries = List.filter_map entry_of_odocl mlds in 262 + let entries = List.sort compare_entries entries in 263 + write_atom entries "atom.xml"; 264 + Format.printf "Generated atom.xml with %d entries\n%!" (List.length entries)
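Review note on `write_atom`: the new code writes `e.content` straight into a `<![CDATA[...]]>` section. That is fine as long as the rendered HTML never contains the sequence `]]>`, but a post that shows XML or some OCaml snippets could include it, and the feed would then stop being well-formed. A minimal sketch of one common workaround follows: split any `]]>` across two CDATA sections. The helper name `cdata_escape` is hypothetical and not part of this commit.

```ocaml
(* Hypothetical helper, not part of the commit: make a string safe to
   place between "<![CDATA[" and "]]>" by splitting every "]]>" so that
   "]]" ends one CDATA section and ">" starts the next one. *)
let cdata_escape s =
  let n = String.length s in
  let buf = Buffer.create n in
  let i = ref 0 in
  while !i < n do
    if !i + 2 < n && s.[!i] = ']' && s.[!i + 1] = ']' && s.[!i + 2] = '>'
    then begin
      (* Close the current section after "]]" and reopen before ">" *)
      Buffer.add_string buf "]]]]><![CDATA[>";
      i := !i + 3
    end
    else begin
      Buffer.add_char buf s.[!i];
      incr i
    end
  done;
  Buffer.contents buf

let () =
  (* The caller still supplies the outer "<![CDATA[" and "]]>" *)
  Printf.printf "<![CDATA[%s]]>\n" (cdata_escape "<pre>a]]>b</pre>")
```

If this approach were adopted, the `<content>` line in `write_atom` would pass `(cdata_escape e.content)` instead of `e.content`; everything else in the function could stay as it is.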
+4
site/_blog_gen/blog/2025/03/index.mld
··· 1 + {0 March} 2 + 3 + @children_order module-type-of code-block-metadata 4 +
+4
site/_blog_gen/blog/2025/04/index.mld
··· 1 + {0 April} 2 + 3 + @children_order ocaml-docs-ci-and-odoc-3 odoc-3 semantic-versioning-is-hard meeting-the-team this-site 4 +
+4
site/_blog_gen/blog/2025/05/index.mld
··· 1 + {0 May} 2 + 3 + @children_order docs-progress lots-of-things ticks-solved-by-ai oxcaml-gets-closer ai-for-climate-and-nature-day 4 +
+4
site/_blog_gen/blog/2025/06/index.mld
··· 1 + {0 June} 2 + 3 + @children_order week23 4 +
+4
site/_blog_gen/blog/2025/07/index.mld
··· 1 + {0 July} 2 + 3 + @children_order retrospective odoc-3-live-on-ocaml-org week28 week27 4 +
+4
site/_blog_gen/blog/2025/08/index.mld
··· 1 + {0 August} 2 + 3 + @children_order ocaml-lsp-mcp ocaml-mcp-server week33 4 +
+4
site/_blog_gen/blog/2025/09/index.mld
··· 1 + {0 September} 2 + 3 + @children_order caching-opam-solutions2 odoc-bugs caching-opam-solutions build-ids-for-day10 giving-hub-cl-an-upgrade 4 +
+4
site/_blog_gen/blog/2025/11/index.mld
··· 1 + {0 November} 2 + 3 + @children_order foundations-of-computer-science 4 +
+4
site/_blog_gen/blog/2025/12/index.mld
··· 1 + {0 December} 2 + 3 + @children_order claude-and-dune an-svg-is-all-you-need 4 +
+32
site/_blog_gen/blog/2025/index.mld
··· 1 + {0 2025} 2 + 3 + @children_order 12/ 11/ 09/ 08/ 07/ 06/ 05/ 04/ 03/ 4 + 5 + - {{!//blog/2025/12/page-"claude-and-dune"}Claude and Dune} 6 + - {{!//blog/2025/12/page-"an-svg-is-all-you-need"}An SVG is all you need} 7 + - {{!//blog/2025/11/page-"foundations-of-computer-science"}Foundations of Computer Science} 8 + - {{!//blog/2025/09/page-"caching-opam-solutions2"}Caching opam solutions - part 2} 9 + - {{!//blog/2025/09/page-"odoc-bugs"}Odoc bugs} 10 + - {{!//blog/2025/09/page-"caching-opam-solutions"}Caching opam solutions} 11 + - {{!//blog/2025/09/page-"build-ids-for-day10"}Build IDs for Day10} 12 + - {{!//blog/2025/09/page-"giving-hub-cl-an-upgrade"}Giving hub.cl an upgrade} 13 + - {{!//blog/2025/08/page-"ocaml-lsp-mcp"}Using ocaml-lsp-server via an MCP server} 14 + - {{!//blog/2025/08/page-"ocaml-mcp-server"}An OCaml MCP server} 15 + - {{!//blog/2025/08/page-week33}Week 33} 16 + - {{!//blog/2025/07/page-retrospective}4 months in, a retrospective} 17 + - {{!//blog/2025/07/page-"odoc-3-live-on-ocaml-org"}Odoc 3 is live on OCaml.org!} 18 + - {{!//blog/2025/07/page-week28}Week 28} 19 + - {{!//blog/2025/07/page-week27}Weeks 24-27} 20 + - {{!//blog/2025/06/page-week23}Week 23} 21 + - {{!//blog/2025/05/page-"docs-progress"}Progress in OCaml docs} 22 + - {{!//blog/2025/05/page-"lots-of-things"}Lots of things have been happening} 23 + - {{!//blog/2025/05/page-"ticks-solved-by-ai"}Solving First-year OCaml exercises with AI} 24 + - {{!//blog/2025/05/page-"oxcaml-gets-closer"}OxCaml is getting closer...} 25 + - {{!//blog/2025/05/page-"ai-for-climate-and-nature-day"}AI for Climate & Nature Community Day} 26 + - {{!//blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}OCaml-Docs-CI and Odoc 3} 27 + - {{!//blog/2025/04/page-"odoc-3"}Odoc 3: So what?} 28 + - {{!//blog/2025/04/page-"semantic-versioning-is-hard"}Semantic Versioning in OCaml is Hard} 29 + - {{!//blog/2025/04/page-"meeting-the-team"}Meeting the Team} 30 + - {{!//blog/2025/04/page-"this-site"}This site} 31 + - 
{{!//blog/2025/03/page-"module-type-of"}The Road to Odoc 3: Module Type Of} 32 + - {{!//blog/2025/03/page-"code-block-metadata"}Code block metadata}
+4
site/_blog_gen/blog/2026/01/index.mld
··· 1 + {0 January} 2 + 3 + @children_order weeknotes-2026-04-05 weeknotes-2026-03 4 +
+4
site/_blog_gen/blog/2026/02/index.mld
··· 1 + {0 February} 2 + 3 + @children_order weeknotes-2026-08 weeknotes-2026-06 4 +
+4
site/_blog_gen/blog/2026/03/index.mld
··· 1 + {0 March} 2 + 3 + @children_order weeknotes-2026-10 weeknotes-2026-09 4 +
+10
site/_blog_gen/blog/2026/index.mld
··· 1 + {0 2026} 2 + 3 + @children_order 03/ 02/ 01/ 4 + 5 + - {{!//blog/2026/03/page-"weeknotes-2026-10"}Weeknotes 2026 week 10} 6 + - {{!//blog/2026/03/page-"weeknotes-2026-09"}Weeknotes 2026 week 9} 7 + - {{!//blog/2026/02/page-"weeknotes-2026-08"}Weeknotes weeks 7-8} 8 + - {{!//blog/2026/02/page-"weeknotes-2026-06"}Weeknotes for week 6} 9 + - {{!//blog/2026/01/page-"weeknotes-2026-04-05"}Weeknotes for weeks 4-5} 10 + - {{!//blog/2026/01/page-"weeknotes-2026-03"}Weeknotes for week 3}
+1 -1
site/blog/2025/04/ocaml-docs-ci-and-odoc-3.mld
··· 10 10 11 11 {1 The challenge of documenting OCaml} 12 12 13 - As I wrote about {{!/jon-site/blog/2025/04/page-"semantic-versioning-is-hard"}recently}, the 13 + As I wrote about {{!//blog/2025/04/page-"semantic-versioning-is-hard"}recently}, the 14 14 APIs of OCaml libraries are dependent not only on the version of its package, but possibly 15 15 also on the versions of any of its dependencies. Due to this fact, to produce the docs for 16 16 ocaml.org means that sometimes we need to build the docs for a particular version of a
+5 -5
site/blog/2025/04/this-site.mld
··· 23 23 {1 Demo} 24 24 Let's start with a little demo: 25 25 26 - {@ocaml[ 26 + {@ocaml x[ 27 27 let x = 1 + 2 28 28 ]} 29 29 ··· 37 37 text. The following cell creates an SVG image and 'pushes' it to [Mime_printer], which receives the 38 38 mime value and renders it in the browser below the code block. 39 39 40 - {@ocaml[ 40 + {@ocaml x[ 41 41 let svg = [ 42 42 {|<svg height="210" width="500" xmlns="http://www.w3.org/2000/svg">|}; 43 43 {|<polygon points="100,10 40,198 190,78 10,78 160,198" |}; ··· 62 62 underlining the error in the following cell. You should also be able to hover over 63 63 the variables and see their types. 64 64 65 - {@ocaml[ 65 + {@ocaml x[ 66 66 type t = { foo : int; bar : string };; 67 67 68 68 let x = { foo = 1; bar = "hello" };; ··· 75 75 in the cell above. Merlin highlights the use of the variable [svg], because it's not aware 76 76 of the variable, but the code gets executed correctly and the image is rendered below the cell. 77 77 78 - {@ocaml[ 78 + {@ocaml x[ 79 79 Mime_printer.push "image/svg" (String.concat "\n" svg) 80 80 ]} 81 81 ··· 110 110 111 111 As a more extended example of odoc notebooks, I have converted to this format the course that I help teach 112 112 at the University of Cambridge; {{:https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/}Foundations of Computer Science}. 113 - {{!/jon-site/notebooks/foundations/page-index}Try them out for yourself!}. 113 + {{!//notebooks/foundations/page-index}Try them out for yourself!}. 114 114 115 115 116 116
+2 -2
site/blog/2025/05/docs-progress.mld
··· 58 58 {1 Step 2: building packages} 59 59 60 60 The next step, once we've got the solutions, is to build the packages. This is using the new method 61 - I {{!/jon-site/blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}previously wrote about}. There are about 61 + I {{!//blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}previously wrote about}. There are about 62 62 1,000 packages that fail to build, and once again we can take a look and categorise some of these 63 63 failures. There are a wider variety of failures here, and it's quite useful to cross-check with 64 64 {{!https://check.ci.ocaml.org/}opam health check} to see if it's known to be broken. Unfortunately ··· 139 139 unknown package 140 140 v} 141 141 142 - which is happening because of the optimisation I {{!/jon-site/blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}mentioned before} where we 142 + which is happening because of the optimisation I {{!//blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}mentioned before} where we 143 143 build a new [opam-repository] with only the packages we're going to need. In this case, we've somehow 144 144 missed out the [ctypes] package. Looking at the opam file for [ctypes-foreign], it has a [post] dependency 145 145 on [ctypes]. The [post] keyword indicates that [ctypes] should be installed with [ctypes-foreign], but
+2 -2
site/blog/2025/05/lots-of-things.mld
··· 13 13 14 14 I've been working with {{:https://tunbury.org/}Mark Elvers} on getting the docs 15 15 CI running using Odoc 3.0. There are quite a few changes involved, both in how 16 - we're {{!/jon-site/blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}building the packages} 16 + we're {{!//blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}building the packages} 17 17 but also how we're running odoc - it's building using [odoc_driver] rather than [voodoo] now, 18 18 and while it's looking promising now we had hit a few hurdles along the way. 19 19 ··· 126 126 127 127 The upshot of all this is that I now have a semi-working version of the notebooks using oxcaml. As 128 128 an initial demonstration I ported one of my colleague {{:https://github.com/cuihtlauac}Cuihtlauac}'s 129 - oxcaml tutorial docs to the notebook format, and it {{!/jon-site/notebooks/oxcaml/page-"local"}works quite nicely}. 129 + oxcaml tutorial docs to the notebook format, and it {{!//notebooks/oxcaml/page-"local"}works quite nicely}. 130 130 131 131 132 132
+1 -1
site/blog/2025/05/oxcaml-gets-closer.mld
··· 20 20 21 21 Personally, I'm looking forward to seeing their branch of {{:https://ocaml.github.io/odoc/}odoc} and 22 22 having a look to see how the modes will fit into the documentation. I'm also keen to see whether the 23 - {{!/jon-site/blog/2025/04/page-"this-site"}notebook features} I've been working on can be ported over to run on OxCaml! 23 + {{!//blog/2025/04/page-"this-site"}notebook features} I've been working on can be ported over to run on OxCaml! 24 24 25 25
+1 -1
site/blog/2025/06/week23.mld
··· 33 33 took {{:https://github.com/ocurrent/ocaml-docs-ci/blob/4dfe7e6265610da4e0ce2a386cfbf0b8eac3d9bd/src/lib/track.ml#L58-L76}some code} 34 34 from docs CI: 35 35 36 - {@ocaml[ 36 + {@ocaml x[ 37 37 type p = { 38 38 opam : OpamPackage.t; 39 39 path : Fpath.t;
+1 -1
site/blog/2025/07/odoc-3-live-on-ocaml-org.mld
··· 20 20 during the build process, and so part of the process of building docs is to build the packages, 21 21 so we have to, at minimum, attempt to build all 17,000 or so distinct versions of the packages 22 22 in opam-repository. The {{:https://github.com/ocurrent}ocurrent} tool {{:https://github.com/ocurrent/ocaml-docs-ci}ocaml-docs-ci}, 23 - which I've previously {{!/jon-site/blog/2025/05/page-"docs-progress"}written} {{!/jon-site/blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}about}, 23 + which I've previously {{!//blog/2025/05/page-"docs-progress"}written} {{!//blog/2025/04/page-"ocaml-docs-ci-and-odoc-3"}about}, 24 24 is responsible for these builds and in this new release has demonstrated a new approach to this task, 25 25 where we attempt to do the build in as efficient a way as possible by effectively building 26 26 binary packages once for each required package in a specific 'universe' of dependencies. For
+5 -5
site/blog/2025/07/retrospective.mld
··· 22 22 difficult to make the next steps happen. One area where I've had some success is in 23 23 working with Sadiq on LLMs - in particular, getting local LLMs to solve programming 24 24 exercises that we both {{:https://toao.com/blog/ocaml-local-code-models}wrote} 25 - {{!/jon-site/blog/2025/05/page-"ticks-solved-by-ai"}up}. I've also been working with him 25 + {{!//blog/2025/05/page-"ticks-solved-by-ai"}up}. I've also been working with him 26 26 on taking the output from the docs CI and {{:https://github.com/sadiqj/odoc-llm}summarising it with LLMs} in order to 27 27 create an MCP server that would help tools like {{:https://anthropic.com/}Claude Code} 28 28 to choose OCaml packages to solve users' problems. 29 29 30 30 It's been somewhat easier, partly due to inertia, to carry on with projects that had been in flight at the time 31 31 I started. Things like getting the Odoc 3 generated docs onto ocaml.org, which is 32 - finally complete only {{!/jon-site/blog/2025/07/page-"odoc-3-live-on-ocaml-org"}as of this week!}. 32 + finally complete only {{!//blog/2025/07/page-"odoc-3-live-on-ocaml-org"}as of this week!}. 33 33 This has taken a whole lot of time, but I'm really pleased with the end results. There's 34 34 still an awful lot of improvements that I'd like to see made, which, after drawing breath for 35 35 a couple of weeks, I'll be writing down. 36 36 37 37 An itch I'd been wanting to scratch for a long time has been to look at client-side ocaml 38 - notebooks. I decided to make this an integral {{!/jon-site/blog/2025/04/page-"this-site"}feature of this blog}, 38 + notebooks. I decided to make this an integral {{!//blog/2025/04/page-"this-site"}feature of this blog}, 39 39 and I've learnt an awful 40 40 lot doing it. 
An important feature of this that I've been keeping in mind is the idea that 41 41 we could use the ocaml-docs-ci tool to build the libraries, which would allow us to host ··· 73 73 74 74 {2 Efficient and reusable CI} 75 75 A clear and obvious area where we'll be able to see real progress is to extract from docs CI the logic that I've been 76 - using to do efficient builds of packages. As I previously {{!/jon-site/blog/2025/07/page-"odoc-3-live-on-ocaml-org"}wrote about}, 76 + using to do efficient builds of packages. As I previously {{!//blog/2025/07/page-"odoc-3-live-on-ocaml-org"}wrote about}, 77 77 the new CI system is far more efficient than some of the other ocurrent-based pipelines, 78 78 and it would save a huge amount of compute time if we were to take this tech and apply it 79 79 elsewhere. ··· 90 90 get this running on a big machine and just see how fast we can build everything. The key thing 91 91 here is that it should be {e trivial} to run this on a linux box. A raspberry pi or a 768-core 92 92 behemoth with 3TiB of ram. Just how fast {e can} we get it going? It's already building in a 93 - couple of days using {{!/jon-site/blog/2025/07/page-"odoc-3-live-on-ocaml-org"}sage}, but that's 93 + couple of days using {{!//blog/2025/07/page-"odoc-3-live-on-ocaml-org"}sage}, but that's 94 94 using ocurrent/obuilder, which isn't quite the right tool for the job, and on a relatively 95 95 puny machine. Can we do it in an hour? 10 minutes? Certainly the incremental builds ought 96 96 to be done in seconds. What's the limit?
+1 -1
site/blog/2025/08/week33.mld
··· 28 28 experience. 29 29 30 30 Thirdly, and most importantly, we had decided that we needed a few example searches to show 31 - how the system worked. We'd already had a {{!/jon-site/blog/2025/07/page-week28}useful experience} 31 + how the system worked. We'd already had a {{!//blog/2025/07/page-week28}useful experience} 32 32 with this when Anil had tried to search for a 'time and date parsing and formatting' library, 33 33 so it shouldn't really have been a surprise that trying a few more examples showed some more 34 34 interesting behaviour. Specifically, the searches I wanted to do were for an "HTTP client", "JSON parser",
+1 -1
site/blog/2025/09/caching-opam-solutions.mld
··· 5 5 The {{:https://github.com/ocurrent/ocaml-docs-ci}ocaml-docs-ci} system works by watching 6 6 opam-repository for changes, and then when it notices a new package it performs an opam 7 7 solve and builds the package, a prerequisite for building the documentation. In order 8 - to give the docs some stability, as the docs may well {{!/jon-site/blog/2025/04/page-"semantic-versioning-is-hard"}depend upon your dependencies}, we currently cache the solve results so that a package 8 + to give the docs some stability, as the docs may well {{!//blog/2025/04/page-"semantic-versioning-is-hard"}depend upon your dependencies}, we currently cache the solve results so that a package 9 9 will always be built with the same set of dependencies, even if a new version of one 10 10 of those dependencies has been released. 11 11
+1 -1
site/blog/2025/09/caching-opam-solutions2.mld
··· 2 2 3 3 @published 2025-09-23 4 4 5 - Some results from the {{!/jon-site/blog/2025/09/page-"caching-opam-solutions"}previous post}. 5 + Some results from the {{!//blog/2025/09/page-"caching-opam-solutions"}previous post}. 6 6 This time I've run day10 on 144 or so commits from opam-repository to see how well the 7 7 cache performs. The results are quite interesting. 8 8
+2 -2
site/blog/2025/12/an-svg-is-all-you-need.mld
··· 54 54 So this is yet another tool in our ongoing effort to be able to effortlessly share and 55 55 remix our work - added to the pile of Jupyter notebooks, {{:https://digitalflapjack.com/blog/marimo/}Marimo notebooks}, 56 56 the {{:https://slipshow.readthedocs.io/en/stable/}slipshow}/{{:https://github.com/art-w/x-ocaml/}x-ocaml} 57 - {{!/jon-site/blog/2025/11/page-"foundations-of-computer-science"}combination}, 57 + {{!//blog/2025/11/page-"foundations-of-computer-science"}combination}, 58 58 {{:https://patrick.sirref.org/weekly-2025-w45/index.xml}Patrick's take} on Jon Sterling's 59 - {{:https://sr.ht/~jonsterling/forester/}Forester}, my own {{!/jon-site/notebooks/page-index}notebooks}, 59 + {{:https://sr.ht/~jonsterling/forester/}Forester}, my own {{!//notebooks/page-index}notebooks}, 60 60 and many others - and this is a subset of what we're using just in our own group! 61 61 62 62
+1 -1
site/blog/2025/12/claude-and-dune.mld
··· 6 6 major new version of the OCaml documentation generator. It had a whole load of {{:https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043}new features}, 7 7 many of which came with new demands on the build system driving it. We decided when 8 8 working on it to build a new driver for odoc so that we could adjust it as we were 9 - building the new features, and this driver is now used to {{!/jon-site/blog/2025/07/page-"odoc-3-live-on-ocaml-org"}build the documentation} that appears on 9 + building the new features, and this driver is now used to {{!//blog/2025/07/page-"odoc-3-live-on-ocaml-org"}build the documentation} that appears on 10 10 {{:https://ocaml.org/p/base/latest/doc/index.html}ocaml.org}. However, it was 11 11 always the plan to integrate the new features into {{:https://dune.build}Dune} so 12 12 that everyone could just run [dune build @doc] and be able to use all of the new
+1 -1
site/blog/2026/01/weeknotes-2026-03.mld
··· 76 76 test this out. 77 77 78 78 {2 Day10 and docs} 79 - I've {{!/jon-site/blog/2025/09/page-"build-ids-for-day10"}written about} {{:https://tunbury.org/}Mark's} day10 project 79 + I've {{!//blog/2025/09/page-"build-ids-for-day10"}written about} {{:https://tunbury.org/}Mark's} day10 project 80 80 before. It's a tool to very rapidly build odoc packages mainly in order to test that they build correctly. An obvious 81 81 extension would be to use this to then build the docs for those packages, as the way we do this requires the packages 82 82 to be built first. This would be a replacement for the Docs CI that I talked about above, though there's considerable
+2 -2
site/blog/2026/01/weeknotes-2026-04-05.mld
··· 37 37 the expansions and referencing correct, and while we've made progress on the actual content markup, introducing 38 38 {{:https://ocaml.github.io/odoc/odoc/odoc_for_authors.html#media}media tags} for example, there's still a good distance to go. 39 39 40 - Using the plugins mechanism I {{!/jon-site/blog/2026/01/page-"weeknotes-2026-03"}wrote about last week}, 40 + Using the plugins mechanism I {{!//blog/2026/01/page-"weeknotes-2026-03"}wrote about last week}, 41 41 I've made a plugin interface for odoc and implemented a few plugins. Initially I was just going to support 'custom tags' 42 42 but it occurred to me that rendering code blocks could also be done in this way. So I've made a few. Two custom tag plugins: 43 43 ··· 60 60 61 61 {{:https://github.com/lukemaurer}Luke Maurer} at Jane Street pointed out that they're still suffering from 62 62 yet another repro of {{:https://github.com/ocaml/odoc/issues/930}issue 930} at Jane Street. I'd worked on this 63 - {{!/jon-site/blog/2025/09/page-"odoc-bugs"}back in September} but turns out I hadn't actually made a PR, so I tidied 63 + {{!//blog/2025/09/page-"odoc-bugs"}back in September} but turns out I hadn't actually made a PR, so I tidied 64 64 up the branch and {{:https://github.com/ocaml/odoc/pull/1400}made a PR}. 65 65 66 66 {2 Docs CI}
+1 -1
site/blog/2026/02/weeknotes-2026-06.mld
··· 14 14 One of the issues was to do with identifiers needing to be unique. For example, consider the following 15 15 code: 16 16 17 - {@ocaml[ 17 + {@ocaml x[ 18 18 module type S = sig 19 19 type t 20 20
+1 -1
site/blog/2026/03/index.mld
··· 1 1 {0 March} 2 2 3 - @children_order weeknotes-2026-09 3 + @children_order weeknotes-2026-10 weeknotes-2026-09 4 4
+5
site/blog/2026/03/monopam-madness.mld
··· 1 + {0 Monopam Madness} 2 + 3 + One of Dune's key strengths is in its native ability to do vendoring. It's been a feature right 4 + from the {{:https://www.dra27.uk/blog/platform/2018/08/15/dune-vendoring.html}early days}, and 5 + I've recently been using it via Anil's {{:}monopam tool}
site/blog/2026/03/new.png

This is a binary file and will not be displayed.

site/blog/2026/03/old.png

This is a binary file and will not be displayed.

+10
site/blog/2026/03/open-source-and-ai.mld
··· 1 + {0 Open Source and AI} 2 + 3 + @notanotebook 4 + @published never 5 + 6 + I've been doing an awful lot of working with Claude recently, and it's time to reflect a bit on what 7 + impact these AI agents will be having on software. I'll concentrate on open source software, as that's 8 + the world I live in. 9 + 10 + First of all, Claude is {e amazing}. Mind blowingly amazing. I would never have thought
site/blog/2026/03/tessera.png

This is a binary file and will not be displayed.

+14 -5
site/blog/2026/03/weeknotes-2026-10.mld
··· 1 1 {0 Weeknotes 2026 week 10} 2 2 3 - @publishes 2026-03-09 4 - @notanotebook 3 + @published 2026-03-09 4 + 5 + Here are my weeknotes for the last week, while I'm still writing up 6 + some more focused posts on some specific topics - like the experience 7 + of putting everything in a monorepo to create this site, and more notes 8 + on Claude and Agentic coding in general, and its impact on the world 9 + of software. But for now, here's what I've been up to. 5 10 6 11 {1 What did I do?} 7 12 8 13 - New site design. The old site was a bit of a mess and was simply reusing odoc's default 9 14 styling. I've also rearranged the content a bit to make it more navigable and 10 15 cohesive. 11 - {image!old.png} 12 - {image!new.png} 16 + {image!./old.png} 17 + {image!./new.png} 13 18 - TESSERA in the browser is a {{:https://tee.cl.cam.ac.uk/}hot} {{:https://anil.recoil.org/notes/2026w10}topic} right now, so I've applied the work I've been doing with x-ocaml, js_top_worker 14 - and odoc plugins to make a {{!}TESSERA notebook} that's based on the {{:https://github.com/ucam-eo/tessera-interactive-map}example notebook}. 19 + and odoc plugins to make a {{:/notebooks/interactive_map.html}TESSERA notebook} that's based on the {{:https://github.com/ucam-eo/tessera-interactive-map}example notebook}. 20 + {image!./tessera.png} 15 21 - I was interested in whether we'll be able to do inference in reasonable time using these 16 22 notebooks. {{:https://onnx.ai/}ONNX} has a web version of its runtime, so I got Claude to 17 23 make some bindings, and checked it was working by doing a sentiment analysis notebook. This ··· 30 36 - Our group seminar this week was {{:https://tombearpark.com/}Tom Bearpark} who talked about 31 37 his proposed 'Carbon at Risk' measure in order to compare diverse ways of removing carbon 32 38 from the atmosphere to help with the carbon removal market. 39 + 40 + {1 What's next?} 41 + - More writing before more coding, I think. 33 42 34 43 35 44
+1
site/blog/2026/index.mld
··· 2 2 3 3 @children_order 03/ 02/ 01/ 4 4 5 + - {{!//blog/2026/03/page-"weeknotes-2026-10"}Weeknotes 2026 week 10} 5 6 - {{!//blog/2026/03/page-"weeknotes-2026-09"}Weeknotes 2026 week 9} 6 7 - {{!//blog/2026/02/page-"weeknotes-2026-08"}Weeknotes weeks 7-8} 7 8 - {{!//blog/2026/02/page-"weeknotes-2026-06"}Weeknotes for week 6}
+1
site/blog/index.mld
··· 4 4 5 5 @recent-posts 6 6 {ul 7 + {- {{!//blog/2026/03/page-"weeknotes-2026-10"}Weeknotes 2026 week 10} 2026-03-09} 7 8 {- {{!//blog/2026/03/page-"weeknotes-2026-09"}Weeknotes 2026 week 9} 2026-03-02} 8 9 {- {{!//blog/2026/02/page-"weeknotes-2026-08"}Weeknotes weeks 7-8} 2026-02-23} 9 10 {- {{!//blog/2026/02/page-"weeknotes-2026-06"}Weeknotes for week 6} 2026-02-09}
+165 -4
site/dune.inc
··· 48 48 blog/2026/02/weeknotes-2026-06.mld 49 49 blog/2026/02/weeknotes-2026-08.mld 50 50 blog/2026/03/index.mld 51 + blog/2026/03/monopam-madness.mld 52 + blog/2026/03/open-source-and-ai.mld 51 53 blog/2026/03/weeknotes-2026-09.mld 52 54 blog/2026/03/weeknotes-2026-10.mld 53 55 blog/2026/index.mld ··· 88 90 blog/2025/09/examination_map_histogram.svg 89 91 blog/2025/12/fungus.svg 90 92 blog/2026/03/mapdemo.mov 91 - blog/2026/03/search.png) 93 + blog/2026/03/new.png 94 + blog/2026/03/old.png 95 + blog/2026/03/search.png 96 + blog/2026/03/tessera.png) 92 97 (action 93 98 (progn 94 99 (run ··· 462 467 (run 463 468 odoc 464 469 compile 470 + blog/2026/03/monopam-madness.mld 471 + --output-dir 472 + _odoc 473 + --parent-id 474 + blog/2026/03) 475 + (run 476 + odoc 477 + compile 478 + blog/2026/03/open-source-and-ai.mld 479 + --output-dir 480 + _odoc 481 + --parent-id 482 + blog/2026/03) 483 + (run 484 + odoc 485 + compile 465 486 blog/2026/03/weeknotes-2026-09.mld 466 487 --output-dir 467 488 _odoc ··· 779 800 odoc 780 801 compile-asset 781 802 --name 803 + new.png 804 + --output-dir 805 + _odoc 806 + --parent-id 807 + blog/2026/03) 808 + (run 809 + odoc 810 + compile-asset 811 + --name 812 + old.png 813 + --output-dir 814 + _odoc 815 + --parent-id 816 + blog/2026/03) 817 + (run 818 + odoc 819 + compile-asset 820 + --name 782 821 search.png 783 822 --output-dir 784 823 _odoc ··· 786 825 blog/2026/03) 787 826 (run 788 827 odoc 828 + compile-asset 829 + --name 830 + tessera.png 831 + --output-dir 832 + _odoc 833 + --parent-id 834 + blog/2026/03) 835 + (run 836 + odoc 789 837 link 790 838 _odoc/blog/2025/03/page-code-block-metadata.odoc 791 839 -P ··· 1155 1203 (run 1156 1204 odoc 1157 1205 link 1206 + _odoc/blog/2026/03/page-monopam-madness.odoc 1207 + -P 1208 + site:_odoc 1209 + -o 1210 + _odoc/blog/2026/03/page-monopam-madness.odocl) 1211 + (run 1212 + odoc 1213 + link 1214 + _odoc/blog/2026/03/page-open-source-and-ai.odoc 1215 + -P 1216 + site:_odoc 1217 + -o 1218 + 
_odoc/blog/2026/03/page-open-source-and-ai.odocl) 1219 + (run 1220 + odoc 1221 + link 1158 1222 _odoc/blog/2026/03/page-weeknotes-2026-09.odoc 1159 1223 -P 1160 1224 site:_odoc ··· 1475 1539 (run 1476 1540 odoc 1477 1541 link 1542 + _odoc/blog/2026/03/asset-new.png.odoc 1543 + -P 1544 + site:_odoc 1545 + -o 1546 + _odoc/blog/2026/03/asset-new.png.odocl) 1547 + (run 1548 + odoc 1549 + link 1550 + _odoc/blog/2026/03/asset-old.png.odoc 1551 + -P 1552 + site:_odoc 1553 + -o 1554 + _odoc/blog/2026/03/asset-old.png.odocl) 1555 + (run 1556 + odoc 1557 + link 1478 1558 _odoc/blog/2026/03/asset-search.png.odoc 1479 1559 -P 1480 1560 site:_odoc ··· 1482 1562 _odoc/blog/2026/03/asset-search.png.odocl) 1483 1563 (run 1484 1564 odoc 1565 + link 1566 + _odoc/blog/2026/03/asset-tessera.png.odoc 1567 + -P 1568 + site:_odoc 1569 + -o 1570 + _odoc/blog/2026/03/asset-tessera.png.odocl) 1571 + (run 1572 + odoc 1485 1573 compile-index 1486 1574 --root 1487 1575 site:_odoc ··· 1533 1621 _odoc/blog/2026/02/page-weeknotes-2026-06.odocl 1534 1622 _odoc/blog/2026/02/page-weeknotes-2026-08.odocl 1535 1623 _odoc/blog/2026/03/page-index.odocl 1624 + _odoc/blog/2026/03/page-monopam-madness.odocl 1625 + _odoc/blog/2026/03/page-open-source-and-ai.odocl 1536 1626 _odoc/blog/2026/03/page-weeknotes-2026-09.odocl 1537 1627 _odoc/blog/2026/03/page-weeknotes-2026-10.odocl 1538 1628 _odoc/blog/2026/page-index.odocl ··· 1613 1703 _odoc/blog/2026/02/page-weeknotes-2026-06.odocl 1614 1704 _odoc/blog/2026/02/page-weeknotes-2026-08.odocl 1615 1705 _odoc/blog/2026/03/page-index.odocl 1706 + _odoc/blog/2026/03/page-monopam-madness.odocl 1707 + _odoc/blog/2026/03/page-open-source-and-ai.odocl 1616 1708 _odoc/blog/2026/03/page-weeknotes-2026-09.odocl 1617 1709 _odoc/blog/2026/03/page-weeknotes-2026-10.odocl 1618 1710 _odoc/blog/2026/page-index.odocl ··· 1653 1745 _odoc/blog/2025/09/asset-examination_map_histogram.svg.odocl 1654 1746 _odoc/blog/2025/12/asset-fungus.svg.odocl 1655 1747 
_odoc/blog/2026/03/asset-mapdemo.mov.odocl 1748 + _odoc/blog/2026/03/asset-new.png.odocl 1749 + _odoc/blog/2026/03/asset-old.png.odocl 1656 1750 _odoc/blog/2026/03/asset-search.png.odocl 1751 + _odoc/blog/2026/03/asset-tessera.png.odocl 1657 1752 blog/2025/05/alice.jpg 1658 1753 blog/2025/05/amy.jpg 1659 1754 blog/2025/05/emilio.jpg ··· 1671 1766 blog/2025/09/examination_map_histogram.svg 1672 1767 blog/2025/12/fungus.svg 1673 1768 blog/2026/03/mapdemo.mov 1769 + blog/2026/03/new.png 1770 + blog/2026/03/old.png 1674 1771 blog/2026/03/search.png 1772 + blog/2026/03/tessera.png 1675 1773 static/assets/jon.jpg 1676 1774 static/assets/notebook-foundations.png 1677 - static/assets/notebook-oxcaml.png) 1775 + static/assets/notebook-interactive-map.png 1776 + static/assets/notebook-oxcaml.png 1777 + static/assets/notebook-sentiment.png 1778 + static/assets/notebook-widgets.png) 1678 1779 (action 1679 1780 (progn 1680 1781 (run ··· 2146 2247 x-ocaml.universe=/_opam 2147 2248 -o 2148 2249 _html 2250 + _odoc/blog/2026/03/page-monopam-madness.odocl) 2251 + (run 2252 + odoc 2253 + html-generate 2254 + --shell 2255 + jon-shell 2256 + --config 2257 + x-ocaml.universe=/_opam 2258 + -o 2259 + _html 2260 + _odoc/blog/2026/03/page-open-source-and-ai.odocl) 2261 + (run 2262 + odoc 2263 + html-generate 2264 + --shell 2265 + jon-shell 2266 + --config 2267 + x-ocaml.universe=/_opam 2268 + -o 2269 + _html 2149 2270 _odoc/blog/2026/03/page-weeknotes-2026-09.odocl) 2150 2271 (run 2151 2272 odoc ··· 2507 2628 odoc 2508 2629 html-generate-asset 2509 2630 --asset-unit 2631 + _odoc/blog/2026/03/asset-new.png.odocl 2632 + -o 2633 + _html 2634 + blog/2026/03/new.png) 2635 + (run 2636 + odoc 2637 + html-generate-asset 2638 + --asset-unit 2639 + _odoc/blog/2026/03/asset-old.png.odocl 2640 + -o 2641 + _html 2642 + blog/2026/03/old.png) 2643 + (run 2644 + odoc 2645 + html-generate-asset 2646 + --asset-unit 2510 2647 _odoc/blog/2026/03/asset-search.png.odocl 2511 2648 -o 2512 2649 _html 2513 2650 
blog/2026/03/search.png) 2651 + (run 2652 + odoc 2653 + html-generate-asset 2654 + --asset-unit 2655 + _odoc/blog/2026/03/asset-tessera.png.odocl 2656 + -o 2657 + _html 2658 + blog/2026/03/tessera.png) 2514 2659 (run odoc support-files -o _html) 2515 2660 (system 2516 2661 "mkdir -p $(dirname _html/static/assets/jon.jpg) && cp static/assets/jon.jpg _html/static/assets/jon.jpg") 2517 2662 (system 2518 2663 "mkdir -p $(dirname _html/static/assets/notebook-foundations.png) && cp static/assets/notebook-foundations.png _html/static/assets/notebook-foundations.png") 2519 2664 (system 2520 - "mkdir -p $(dirname _html/static/assets/notebook-oxcaml.png) && cp static/assets/notebook-oxcaml.png _html/static/assets/notebook-oxcaml.png")))) 2665 + "mkdir -p $(dirname _html/static/assets/notebook-interactive-map.png) && cp static/assets/notebook-interactive-map.png _html/static/assets/notebook-interactive-map.png") 2666 + (system 2667 + "mkdir -p $(dirname _html/static/assets/notebook-oxcaml.png) && cp static/assets/notebook-oxcaml.png _html/static/assets/notebook-oxcaml.png") 2668 + (system 2669 + "mkdir -p $(dirname _html/static/assets/notebook-sentiment.png) && cp static/assets/notebook-sentiment.png _html/static/assets/notebook-sentiment.png") 2670 + (system 2671 + "mkdir -p $(dirname _html/static/assets/notebook-widgets.png) && cp static/assets/notebook-widgets.png _html/static/assets/notebook-widgets.png")))) 2521 2672 2522 2673 (alias 2523 2674 (name site) ··· 2568 2719 _html/blog/2026/02/weeknotes-2026-06.html 2569 2720 _html/blog/2026/02/weeknotes-2026-08.html 2570 2721 _html/blog/2026/03/index.html 2722 + _html/blog/2026/03/monopam-madness.html 2723 + _html/blog/2026/03/open-source-and-ai.html 2571 2724 _html/blog/2026/03/weeknotes-2026-09.html 2572 2725 _html/blog/2026/03/weeknotes-2026-10.html 2573 2726 _html/blog/2026/index.html ··· 2593 2746 _html/reference/index.html 2594 2747 _html/static/assets/jon.jpg 2595 2748 _html/static/assets/notebook-foundations.png 2749 
+ _html/static/assets/notebook-interactive-map.png 2596 2750 _html/static/assets/notebook-oxcaml.png 2751 + _html/static/assets/notebook-sentiment.png 2752 + _html/static/assets/notebook-widgets.png 2597 2753 _html/blog/2025/05/alice.jpg 2598 2754 _html/blog/2025/05/amy.jpg 2599 2755 _html/blog/2025/05/emilio.jpg ··· 2611 2767 _html/blog/2025/09/examination_map_histogram.svg 2612 2768 _html/blog/2025/12/fungus.svg 2613 2769 _html/blog/2026/03/mapdemo.mov 2614 - _html/blog/2026/03/search.png)) 2770 + _html/blog/2026/03/new.png 2771 + _html/blog/2026/03/old.png 2772 + _html/blog/2026/03/search.png 2773 + _html/blog/2026/03/tessera.png)) 2615 2774 2616 2775 (rule 2617 2776 (target ··· 2650 2809 blog/2026/02/odoc-js-notebooks-fun.mld 2651 2810 blog/2026/02/weeknotes-2026-06.mld 2652 2811 blog/2026/02/weeknotes-2026-08.mld 2812 + blog/2026/03/monopam-madness.mld 2813 + blog/2026/03/open-source-and-ai.mld 2653 2814 blog/2026/03/weeknotes-2026-09.mld 2654 2815 blog/2026/03/weeknotes-2026-10.mld 2655 2816 blog/2025/03/index.mld
+1
site/index.mld
··· 13 13 14 14 @recent-posts 15 15 {ul 16 + {- {{!blog/2026/03/page-"weeknotes-2026-10"}Weeknotes 2026 week 10} 2026-03-09} 16 17 {- {{!blog/2026/03/page-"weeknotes-2026-09"}Weeknotes 2026 week 9} 2026-03-02} 17 18 {- {{!blog/2026/02/page-"weeknotes-2026-08"}Weeknotes weeks 7-8} 2026-02-23} 18 19 {- {{!blog/2026/02/page-"weeknotes-2026-06"}Weeknotes for week 6} 2026-02-09}
+4 -4
site/notebooks/foundations/foundations10.mld
··· 43 43 44 44 {1 Breadth-First Tree Traversal --- Using Append} 45 45 46 - {@ocaml[ 46 + {@ocaml x[ 47 47 let rec nbreadth = function 48 48 | [] -> [] 49 49 | Lf :: ts -> nbreadth ts ··· 126 126 127 127 {1 Efficient Functional Queues: Code} 128 128 129 - {@ocaml[ 129 + {@ocaml x[ 130 130 type 'a queue = 131 131 | Q of 'a list * 'a list;; 132 132 let norm = function ··· 171 171 172 172 {1 Breadth-First Tree Traversal --- Using Queues} 173 173 174 - {@ocaml[ 174 + {@ocaml x[ 175 175 let rec breadth q = 176 176 if qnull q then [] 177 177 else ··· 301 301 302 302 Consider the following OCaml function. 303 303 304 - {@ocaml[ 304 + {@ocaml x[ 305 305 let next n = [2 * n; 2 * n + 1] 306 306 ]} 307 307
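Reviewer note: the "Efficient Functional Queues" hunk above is cut off right after `let norm = function`. For readers of the diff, here is a minimal sketch of the two-list queue it appears to describe, assuming the usual front/rear representation. Only `Q`, `norm` and `qnull` appear in the diff; `Empty`, `empty`, `enq`, `qhd`, `deq` and `drain` are illustrative names, and every function body is my own reconstruction rather than the notebook's code.

```ocaml
(* Two-list functional queue: the front list is in order, the rear list is reversed. *)
type 'a queue = Q of 'a list * 'a list

exception Empty

(* Restore the invariant: the front list is empty only if the whole queue is. *)
let norm = function
  | Q ([], tls) -> Q (List.rev tls, [])
  | q -> q

let empty = Q ([], [])

let qnull = function
  | Q ([], []) -> true
  | _ -> false

(* Enqueue at the rear, then renormalise in case the front was empty. *)
let enq (Q (hds, tls)) x = norm (Q (hds, x :: tls))

let qhd = function
  | Q (x :: _, _) -> x
  | _ -> raise Empty

(* Dequeue from the front; the rear is reversed only when the front runs out. *)
let deq = function
  | Q (_ :: hds, tls) -> norm (Q (hds, tls))
  | _ -> raise Empty

(* Illustrative helper: remove elements until the queue is empty. *)
let rec drain q = if qnull q then [] else qhd q :: drain (deq q)
```

Each element is moved from the rear list to the front list at most once, which is why a sequence of `enq` and `deq` calls costs amortised constant time per operation.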
+9 -9
site/notebooks/foundations/foundations11.mld
··· 78 78 79 79 {1 Trying Out References} 80 80 81 - {@ocaml[ 81 + {@ocaml x[ 82 82 let p = ref 5 (* create a reference *);; 83 83 p := !p + 1 (* p now holds value 6 *);; 84 84 let ps = [ ref 77; p ];; ··· 140 140 141 141 {1 Iteration: the [while] command} 142 142 143 - {@ocaml[ 143 + {@ocaml x[ 144 144 let tlopt = function 145 145 | [] -> None 146 146 | _::xs -> Some xs;; ··· 189 189 190 190 {1 Private, Persistent References} 191 191 192 - {@ocaml[ 192 + {@ocaml x[ 193 193 exception TooMuch of int;; 194 194 let makeAccount initBalance = 195 195 let balance = ref initBalance in ··· 228 228 229 229 {1 Two Bank Accounts} 230 230 231 - {@ocaml[ 231 + {@ocaml x[ 232 232 let student = makeAccount 500;; 233 233 let director = makeAccount 4000000;; 234 234 student 5 (* coach fare *);; ··· 254 254 255 255 {1 OCaml Primitives for Arrays} 256 256 257 - {@ocaml[ 257 + {@ocaml x[ 258 258 [|"a"; "b"; "c"|] (* allocate a fresh string array *);; 259 259 Array.make 3 'a' (* array[3] with cell containing 'a' *);; 260 260 let aa = Array.init 5 (fun i -> i * 10) (* array[5] initialised to (fun i) *);; ··· 265 265 There are many other array operations in the [Array] module in the OCaml standard 266 266 library. 267 267 268 - {@ocaml[ 268 + {@ocaml x[ 269 269 Array.make;; 270 270 Array.init;; 271 271 Array.get;; ··· 302 302 second call to [Array.get] supplies a subscript that is out of range, so OCaml 303 303 rejects it. 304 304 305 - {@ocaml[ 305 + {@ocaml x[ 306 306 let ar = Array.init 20 (fun i -> i * i);; 307 307 Array.get ar 2;; 308 308 Array.get ar 20;; ··· 320 320 [Array.exists] takes a boolean-valued function and returns [true] if an 321 321 array element satisfies it. 322 322 323 - {@ocaml[ 323 + {@ocaml x[ 324 324 Array.exists (fun i -> i > 200) ar;; 325 325 Array.exists (fun i -> i < 0) ar 326 326 ]} ··· 356 356 You can use the constructs we have learnt to easily create linked (mutable) lists as 357 357 an alternative to arrays. 
358 358 359 - {@ocaml[ 359 + {@ocaml x[ 360 360 type 'a mlist = 361 361 | Nil 362 362 | Cons of 'a * 'a mlist ref
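Reviewer note: the "Private, Persistent References" hunk above shows only the first two lines of `makeAccount`. The sketch below illustrates the idea under the assumption that the returned closure performs a withdrawal and yields the new balance; the body and the name `after_fare` are mine, not taken from the notebook.

```ocaml
exception TooMuch of int

(* Each call allocates a fresh reference that only the returned closure can reach. *)
let makeAccount initBalance =
  let balance = ref initBalance in
  fun amt ->
    if amt > !balance then raise (TooMuch (amt - !balance))
    else begin
      balance := !balance - amt;
      !balance
    end

(* Two independent accounts: withdrawing from one never affects the other. *)
let student = makeAccount 500
let director = makeAccount 4000000

(* Assumed behaviour: a withdrawal returns the remaining balance. *)
let after_fare = student 5
```

Because `balance` is bound inside `makeAccount`, nothing outside the closure can read or overwrite it, which is the point of the "private" in the section title.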
+3 -3
site/notebooks/foundations/foundations3.mld
··· 113 113 These three primitive functions are {e polymorphic} and allow flexibility in the 114 114 types of their arguments and results. Note their types! 115 115 116 - {@ocaml[ 116 + {@ocaml x[ 117 117 null;; 118 118 hd;; 119 119 tl ··· 127 127 128 128 {1 Computing the Length of a List} 129 129 130 - {@ocaml[ 130 + {@ocaml x[ 131 131 let rec nlength = function 132 132 | [] -> 0 133 133 | x :: xs -> 1 + nlength xs;; ··· 380 380 381 381 Consider the polymorphic types in these two function declarations: 382 382 383 - {@ocaml[ 383 + {@ocaml x[ 384 384 let id x = x;; 385 385 let rec loop x = loop x 386 386 ]}
+10 -10
site/notebooks/foundations/foundations4.mld
··· 15 15 16 16 They can be implemented in OCaml as follows: 17 17 18 - {@ocaml[ 18 + {@ocaml x[ 19 19 let rec take i = function 20 20 | [] -> [] 21 21 | x::xs -> ··· 68 68 69 69 {1 Equality Tests} 70 70 71 - {@ocaml[ 71 + {@ocaml x[ 72 72 let rec member x = function 73 73 | [] -> false 74 74 | y::l -> ··· 102 102 103 103 {1 Building a List of Pairs} 104 104 105 - {@ocaml[ 105 + {@ocaml x[ 106 106 let rec zip xs ys = 107 107 match xs, ys with 108 108 | (x::xs, y::ys) -> (x, y) :: zip xs ys ··· 129 129 In other cases, the {m (x_i,y_i)} pairs might have been generated by applying a 130 130 function to the elements of another list {m [z_1,\ldots,z_n] }. 131 131 132 - {@ocaml[ 132 + {@ocaml x[ 133 133 let rec unzip = function 134 134 | [] -> ([], []) 135 135 | (x, y)::pairs -> ··· 154 154 pairs: [zip] pairs up corresponding list elements and [unzip] 155 155 inverts this operation. Their types reflect what they do: 156 156 157 - {@ocaml[ 157 + {@ocaml x[ 158 158 zip;; 159 159 unzip 160 160 ]} ··· 173 173 computation as the previous version of [unzip] and is possibly clearer, 174 174 but not every local binding can be eliminated as easily. 175 175 176 - {@ocaml[ 176 + {@ocaml x[ 177 177 let conspair ((x, y), (xs, ys)) = (x::xs, y::ys);; 178 178 let rec unzip = function 179 179 | [] -> ([], []) ··· 187 187 will probably exceed those of [unzip] despite the advantages of 188 188 iteration. 189 189 190 - {@ocaml[ 190 + {@ocaml x[ 191 191 let rec revUnzip = function 192 192 | ([], xs, ys) -> (xs, ys) 193 193 | ((x, y)::pairs, xs, ys) -> ··· 206 206 }{- Exclude from consideration any coins that are too large. 207 207 }} 208 208 209 - {@ocaml[ 209 + {@ocaml x[ 210 210 let rec change till amt = 211 211 match till, amt with 212 212 | _, 0 -> [] ··· 238 238 Now we generalise the problem to return the list of {e all possible ways} of making change, 239 239 and write a new [change] function. 
240 240 241 - {@ocaml[ 241 + {@ocaml x[ 242 242 let rec change till amt = 243 243 match till, amt with 244 244 | _ , 0 -> [ [] ] ··· 278 278 279 279 {1 All Ways of Making Change --- Faster!} 280 280 281 - {@ocaml[ 281 + {@ocaml x[ 282 282 let rec change till amt chg chgs = 283 283 match till, amt with 284 284 | _ , 0 -> chg::chgs
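Reviewer note: the "all possible ways of making change" hunk above is truncated after its first match case. A self-contained sketch of that generalisation follows, assuming `till` lists coin values in descending order. The helper name `allc` and the two remaining cases are my completion, not necessarily the notebook's text.

```ocaml
(* Return every way of making [amt] from the coins in [till].
   Assumes [till] is sorted in descending order. *)
let rec change till amt =
  match till, amt with
  | _, 0 -> [ [] ]            (* exactly one way to make zero: no coins *)
  | [], _ -> []               (* no coins left: no way at all *)
  | c :: till, amt ->
      if amt < c then change till amt
      else
        (* Prefix the coin [c] onto every solution for the reduced amount. *)
        let rec allc = function
          | [] -> []
          | cs :: css -> (c :: cs) :: allc css
        in
        allc (change (c :: till) (amt - c)) @ change till amt
```

Returning `[ [] ]` for an amount of zero and `[]` for an empty till is what separates "one solution, the empty one" from "no solution".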
+7 -7
site/notebooks/foundations/foundations6.mld
··· 296 296 constructors and their associated arguments, around which we build the 297 297 logic: 298 298 299 - {@ocaml[ 299 + {@ocaml x[ 300 300 let wheels = function 301 301 | Bike -> 2 302 302 | Motorbike _ -> 2 ··· 350 350 351 351 {1 Exceptions in OCaml} 352 352 353 - {@ocaml[ 353 + {@ocaml x[ 354 354 exception Failure;; 355 355 exception NoChange of int;; 356 356 raise Failure ··· 367 367 exception [Failure] is just an error indication, while [NoChange n] carries 368 368 further information: the integer {m n}. 369 369 370 - {@ocaml[ 370 + {@ocaml x[ 371 371 try 372 372 print_endline "pre exception"; 373 373 raise (NoChange 1); ··· 397 397 in a function declaration indicates which exceptions it might raise. One 398 398 alternative to exceptions is to instead return a value of datatype [option]. 399 399 400 - {@ocaml[ 400 + {@ocaml x[ 401 401 let x = Some 1 402 402 ]} 403 403 ··· 409 409 410 410 {1 Making Change with Exceptions} 411 411 412 - {@ocaml[ 412 + {@ocaml x[ 413 413 exception Change;; 414 414 let rec change till amt = 415 415 match till, amt with ··· 587 587 588 588 Using the definition of ['a tree] from before: 589 589 590 - {@ocaml[ 590 + {@ocaml x[ 591 591 type 'a tree = Lf | Br of 'a * 'a tree * 'a tree 592 592 ]} 593 593 594 594 Examine the following function declaration. What does [ftree (1, n)] accomplish? 595 595 596 - {@ocaml[ 596 + {@ocaml x[ 597 597 let rec ftree k n = 598 598 if n = 0 then Lf 599 599 else Br (k, ftree (2 * k) (n - 1), ftree (2 * k + 1) (n - 1))
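Reviewer note: the "Making Change with Exceptions" hunk above stops at `match till, amt with`. The sketch below shows how an exception can drive backtracking in that function. The match cases are my reconstruction under the assumption that `Change` signals a dead end; they are not copied from the notebook.

```ocaml
exception Change

(* Greedy change-making with backtracking: try the largest coin first and,
   if that choice leads to a dead end, fall back to the remaining coins. *)
let rec change till amt =
  match till, amt with
  | _, 0 -> []
  | [], _ -> raise Change
  | c :: till, amt ->
      if amt < 0 then raise Change
      else
        try c :: change (c :: till) (amt - c)
        with Change -> change till amt
```

With coins `[5; 2]` and amount `6`, the first attempt takes a 5, gets stuck on the remaining 1, and the handler retries with only 2s.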
+11 -11
site/notebooks/foundations/foundations9.mld
··· 75 75 }{- Delayed version of {m E} is [fun () -> E] 76 76 }} 77 77 78 - {@ocaml[ 78 + {@ocaml x[ 79 79 type 'a seq = 80 80 | Nil 81 81 | Cons of 'a * (unit -> 'a seq);; ··· 107 107 108 108 {1 The Infinite Sequence: {m k}, {m k+1}, {m k+2}, …} 109 109 110 - {@ocaml[ 110 + {@ocaml x[ 111 111 let rec from k = Cons (k, fun () -> from (k+1));; 112 112 let it = from 1;; 113 113 let it = tail it;; ··· 126 126 127 127 {1 Consuming a Sequence} 128 128 129 - {@ocaml[ 129 + {@ocaml x[ 130 130 let rec get n s = 131 131 match n, s with 132 132 | 0, _ -> [] ··· 174 174 175 175 {1 Joining Two Sequences} 176 176 177 - {@ocaml[ 177 + {@ocaml x[ 178 178 let rec appendq xq yq = 179 179 match xq with 180 180 | Nil -> yq ··· 183 183 184 184 A more fair alternative: 185 185 186 - {@ocaml[ 186 + {@ocaml x[ 187 187 let rec interleave xq yq = 188 188 match xq with 189 189 | Nil -> yq ··· 214 214 215 215 Filtering lazy lists: 216 216 217 - {@ocaml[ 217 + {@ocaml x[ 218 218 let rec filterq p = function 219 219 | Nil -> Nil 220 220 | Cons (x, xf) -> ··· 226 226 227 227 The infinite sequence {m x}, {m f(x)}, {m f(f(x))}, … 228 228 229 - {@ocaml[ 229 + {@ocaml x[ 230 230 let rec iterates f x = 231 231 Cons (x, fun () -> iterates f (f x)) 232 232 ]} ··· 242 242 243 243 {1 Numerical Computations on Infinite Sequences} 244 244 245 - {@ocaml[ 245 + {@ocaml x[ 246 246 let next a x = (a /. x +. x) /. 2.0 247 247 ]} 248 248 249 249 Close enough? 250 250 251 - {@ocaml[ 251 + {@ocaml x[ 252 252 let rec within eps = function 253 253 | Cons (x, xf) -> 254 254 match xf () with ··· 259 259 260 260 Square Roots: 261 261 262 - {@ocaml[ 262 + {@ocaml x[ 263 263 let root a = within 1e6 (iterates (next a) 1.0) 264 264 ]} 265 265 ··· 295 295 Consider the list function [concat], which concatenates a list of lists to form a single list. Can 296 296 it be generalised to concatenate a sequence of sequences? What can go wrong? 
297 297 298 - {@ocaml[ 298 + {@ocaml x[ 299 299 let rec concat = function 300 300 | [] -> [] 301 301 | l::ls -> l @ concat ls
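Reviewer note: the "Numerical Computations on Infinite Sequences" hunks above show `next`, a truncated `within`, and `root` called with a tolerance written as `1e6`. Here is a self-contained sketch of the same pipeline. The `Nil` cases in `within` and the optional `?eps` argument are my additions, and the default of `1e-6` is an assumption about the intended tolerance. If the `1e6` in the hunk is literal, the first pair of iterates already passes the test and the result is just the second approximation.

```ocaml
type 'a seq = Nil | Cons of 'a * (unit -> 'a seq)

(* The infinite sequence x, f x, f (f x), ... *)
let rec iterates f x = Cons (x, fun () -> iterates f (f x))

(* One Newton-Raphson step towards the square root of [a]. *)
let next a x = (a /. x +. x) /. 2.0

(* Return the first element that differs from its predecessor by at most [eps]. *)
let rec within eps = function
  | Nil -> invalid_arg "within: empty sequence"
  | Cons (x, xf) -> (
      match xf () with
      | Nil -> x
      | Cons (y, _) as rest ->
          if Float.abs (x -. y) <= eps then y else within eps rest)

(* [eps] defaults to 1e-6 here; that default is an assumption, not the diff's value. *)
let root ?(eps = 1e-6) a = within eps (iterates (next a) 1.0)
```

For example, `root 2.0` converges to the square root of two, while `root ~eps:1e6 2.0` stops after a single Newton step.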
+14 -1
site/notebooks/foundations/index.mld
··· 17 17 by Jeremy Yallop. We thank Richard Sharp, Srinivasan Keshav, Ambroise Lafont, Vojtěch Tvrdík 18 18 and Jeremy Yallop for further feedback and corrections since 2020. 19 19 20 - {!/jon-site/notebooks/foundations/page-foundations1} 20 + {1 Lectures} 21 21 22 + {ol 23 + {- {{!//notebooks/foundations/page-foundations1}Introduction to Programming}} 24 + {- {{!//notebooks/foundations/page-foundations2}Recursion and Efficiency}} 25 + {- {{!//notebooks/foundations/page-foundations3}Lists}} 26 + {- {{!//notebooks/foundations/page-foundations4}More on Lists}} 27 + {- {{!//notebooks/foundations/page-foundations5}Sorting}} 28 + {- {{!//notebooks/foundations/page-foundations6}Datatypes and Trees}} 29 + {- {{!//notebooks/foundations/page-foundations7}Dictionaries and Functional Arrays}} 30 + {- {{!//notebooks/foundations/page-foundations8}Functions as Values}} 31 + {- {{!//notebooks/foundations/page-foundations9}Sequences, or Lazy Lists}} 32 + {- {{!//notebooks/foundations/page-foundations10}Queues and Search Strategies}} 33 + {- {{!//notebooks/foundations/page-foundations11}Elements of Procedural Programming}} 34 + }
+23 -5
site/notebooks/index.mld
··· 84 84 </div> 85 85 </a> 86 86 87 - <a class="notebook-card" href="/notebooks/oxcaml/local.html"> 88 - <img class="notebook-card-thumb" src="/static/assets/notebook-oxcaml.png" alt="OxCaml Locality Mode notebook preview" loading="lazy"> 87 + <a class="notebook-card" href="/reference/onnxrt/sentiment_example.html"> 88 + <img class="notebook-card-thumb" src="/static/assets/notebook-sentiment.png" alt="Sentiment Analysis notebook preview" loading="lazy"> 89 89 <div class="notebook-card-body"> 90 - <div class="notebook-card-label">&rlhar; OxCaml</div> 91 - <div class="notebook-card-title">Locality Mode OxCaml</div> 92 - <p class="notebook-card-desc">Explore stack allocation, modes, and regions in the OxCaml compiler extensions. Interactive exercises included.</p> 90 + <div class="notebook-card-label">&oplus; ONNX Runtime</div> 91 + <div class="notebook-card-title">Sentiment Analysis</div> 92 + <p class="notebook-card-desc">Run a DistilBERT model entirely in the browser. Includes a WordPiece tokenizer and inference pipeline — all editable OCaml.</p> 93 + </div> 94 + </a> 95 + 96 + <a class="notebook-card" href="/notebooks/interactive_map.html"> 97 + <img class="notebook-card-thumb" src="/static/assets/notebook-interactive-map.png" alt="Interactive Map notebook preview" loading="lazy"> 98 + <div class="notebook-card-body"> 99 + <div class="notebook-card-label">&target; GeoTessera</div> 100 + <div class="notebook-card-title">Interactive Map</div> 101 + <p class="notebook-card-desc">Geospatial land-cover classification with GeoTessera embeddings. 
Draw regions, place training points, and run KNN — all in the browser.</p> 102 + </div> 103 + </a> 104 + 105 + <a class="notebook-card" href="/reference/odoc-interactive-extension/demo_widgets.html"> 106 + <img class="notebook-card-thumb" src="/static/assets/notebook-widgets.png" alt="FRP Widget Demo notebook preview" loading="lazy"> 107 + <div class="notebook-card-body"> 108 + <div class="notebook-card-label">&circlearrowright; Widgets</div> 109 + <div class="notebook-card-title">FRP Widget Demo</div> 110 + <p class="notebook-card-desc">Counters, sliders, and text inputs powered by Note FRP signals. See how reactive widgets work from OCaml in the browser.</p> 93 111 </div> 94 112 </a> 95 113
+7 -4
site/notebooks/interactive_map.mld
··· 1 1 {0 TESSERA Interactive Map} 2 2 3 + @admonition.warning This code is known to be slightly wrong. The 4 + purpose is illustrative, not for making decisions! 5 + 3 6 Explore geospatial embeddings from the 4 7 {{:https://geotessera.org}GeoTessera} foundation model 5 8 directly in the browser. This notebook walks through a complete ··· 25 28 The map is centred on Cambridge, UK — navigate to any area of interest 26 29 before drawing. 27 30 28 - {@ocaml[ 31 + {@ocaml x[ 29 32 (* Shared state *) 30 33 let bbox : Geotessera.bbox option ref = ref None 31 34 let mosaic : (Linalg.mat * int * int) option ref = ref None ··· 113 116 After drawing your bounding box, run this cell to fetch GeoTessera 114 117 embeddings and display a PCA false-colour visualisation. 115 118 116 - {@ocaml[ 119 + {@ocaml x[ 117 120 let () = 118 121 match !bbox with 119 122 | None -> Widget.update ~id:"status" (status_view "Error: draw a bounding box first!") ··· 150 153 Click on the map to add training points. Use the buttons below to 151 154 switch between classes before clicking. 152 155 153 - {@ocaml[ 156 + {@ocaml x[ 154 157 let make_class_buttons () = 155 158 let open Widget.View in 156 159 let buttons = Array.to_list (Array.mapi (fun i name -> ··· 226 229 Run this cell after placing training points to classify the entire 227 230 region using k-nearest neighbours. 228 231 229 - {@ocaml[ 232 + {@ocaml x[ 230 233 let () = 231 234 match !mosaic, !projected, !bbox with 232 235 | Some (mat, h, w), Some _proj, Some b ->
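Reviewer note: the last hunk above says the region is classified "using k-nearest neighbours", but the classification code itself falls outside the diff context. As a rough illustration of the technique only, here is a plain-OCaml sketch over `float array` vectors. It does not use the notebook's `Geotessera`, `Linalg` or `Widget` modules, and `sq_dist`, `take` and `knn_classify` are hypothetical names of mine.

```ocaml
(* Squared Euclidean distance between two vectors of equal length. *)
let sq_dist a b =
  let acc = ref 0.0 in
  Array.iteri (fun i x -> let d = x -. b.(i) in acc := !acc +. (d *. d)) a;
  !acc

(* Keep the first [k] elements of a list. *)
let rec take k = function
  | x :: xs when k > 0 -> x :: take (k - 1) xs
  | _ -> []

(* Classify [query] by majority vote among its [k] nearest labelled points.
   Returns [None] when there is no training data. *)
let knn_classify ~k training query =
  let by_distance =
    training
    |> List.map (fun (v, label) -> (sq_dist v query, label))
    |> List.sort (fun (d1, _) (d2, _) -> compare d1 d2)
  in
  let votes = Hashtbl.create 8 in
  List.iter
    (fun (_, label) ->
      let n = Option.value (Hashtbl.find_opt votes label) ~default:0 in
      Hashtbl.replace votes label (n + 1))
    (take k by_distance);
  (* Pick the label with the most votes; ties are broken arbitrarily. *)
  Hashtbl.fold
    (fun label n best ->
      match best with
      | Some (_, m) when m >= n -> best
      | _ -> Some (label, n))
    votes None
  |> Option.map fst
```

Sorting every training point for every query is the simplest formulation; a real per-pixel classifier over a whole mosaic would likely want something cheaper, but that is beyond what the diff shows.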
+14 -14
site/notebooks/oxcaml/local.mld
··· 2 2 3 3 @x-ocaml.requires base 4 4 5 - {@ocaml[ 5 + {@ocaml x[ 6 6 let f () = 7 7 let u @ local = [6; 2; 8] in (* mode *) 8 8 let len = Base.List.length u in 9 9 len 10 10 ]} 11 11 12 - {@ocaml[ 12 + {@ocaml x[ 13 13 let f () = 14 14 let local_ u = [6; 2; 8] in 15 15 let len = Base.List.length u in 16 16 len 17 17 ]} 18 18 19 - {@ocaml[ 19 + {@ocaml x[ 20 20 let f () = 21 21 let u : int list @@ local = stack_ [6; 2; 8] in (* modality *) 22 22 let len = Base.List.length u in 23 23 len 24 24 ]} 25 25 26 - {@ocaml[ 26 + {@ocaml x[ 27 27 let f () = 28 28 let u = local_ [6; 2; 8] in 29 29 let len = Base.List.length u in ··· 48 48 - Inference decides how to allocate, defaults to the stack 49 49 - Regions can nest and are wider than scopes 50 50 51 - {@ocaml[ 51 + {@ocaml x[ 52 52 let f () = 53 53 let foo = 54 54 let local_ bar = ("region", "scope") in ··· 69 69 + Parameter: callee respects caller's locality 70 70 + Result: callee stores in caller's region 71 71 + This really defines 4 arrows 72 - {@ocaml[ 72 + {@ocaml x[ 73 73 val global_global : s -> t * t (* Legacy *) 74 74 val local_global : local_ s -> t * t 75 75 val global_local : s -> local_ t * t ··· 85 85 86 86 {1 Hands-on} 87 87 88 - {@ocaml[ 88 + {@ocaml x[ 89 89 let monday () = let str = "mon" ^ "day" in str 90 90 ]} 91 91 92 - {@ocaml[ 92 + {@ocaml x[ 93 93 let bye () = let ciao = "sorry" in failwith ciao 94 94 ]} 95 95 96 - {@ocaml[ 96 + {@ocaml x[ 97 97 let make_counter () = 98 98 let counter = ref (-1) in 99 99 fun () -> incr counter; !counter 100 100 ]} 101 101 102 - {@ocaml[ 102 + {@ocaml x[ 103 103 let state = ref "";; 104 104 let set () = state := "disco" 105 105 ]} 106 106 107 - {@ocaml[ 107 + {@ocaml x[ 108 108 let rec map f = function [] -> [] | x :: u -> f x :: map f u 109 109 ]} 110 110 111 - {@ocaml[ 111 + {@ocaml x[ 112 112 let f1 (local_ u : int list) = [1; 2; 3] 113 113 ]} 114 114 115 - {@ocaml[ 115 + {@ocaml x[ 116 116 let f2 (local_ u : int list) = u 117 117 ]} 118 118 119 - {@ocaml[ 
119 + {@ocaml x[ 120 120 let f3 (local_ u : int list) = 42 :: u 121 121 ]} 122 122
site/static/assets/notebook-interactive-map.png (binary file, not displayed)
site/static/assets/notebook-sentiment.png (binary file, not displayed)
site/static/assets/notebook-widgets.png (binary file, not displayed)
+2
test/e2e/foundations-notebooks.spec.js
··· 96 96 } 97 97 } 98 98 99 + test.describe.configure({ mode: 'parallel' }); 100 + 99 101 for (const nb of NOTEBOOKS) { 100 102 test.describe(nb.name, () => { 101 103 test('all cells execute without errors', async ({ page }) => {
+134 -89
test/e2e/onnx-inference.spec.js
··· 8 8 const BASE = '/reference/onnxrt'; 9 9 10 10 /** 11 - * Run cells one by one, waiting for each to produce output. 11 + * Run cells one by one, waiting for each to complete. 12 + * Detects completion by watching the Run button: it changes to aria-label="Stop" 13 + * while running, then back to "Run" when done. This works even for cells like 14 + * #require that produce no visible output elements. 12 15 */ 13 - async function runCellsOneByOne(page, { cellTimeout = 30_000 } = {}) { 16 + async function runCellsOneByOne(page, { cellTimeout = 30_000, skipExercise = true } = {}) { 14 17 const cellCount = await page.evaluate( 15 18 () => document.querySelectorAll('x-ocaml').length 16 19 ); ··· 20 23 const cell = document.querySelectorAll('x-ocaml')[idx]; 21 24 const mode = cell.getAttribute('mode'); 22 25 const shadow = cell.shadowRoot; 23 - const hasOutput = 24 - shadow && 25 - shadow.querySelector('.caml_stdout, .caml_stderr, .caml_meta') !== null; 26 - return { mode, hasOutput }; 26 + const btn = shadow && shadow.querySelector('button[aria-label="Run"]'); 27 + return { mode, hasRunBtn: !!btn }; 27 28 }, i); 28 29 29 - if (info.mode === 'hidden' || info.hasOutput) continue; 30 + if (info.mode === 'hidden' || !info.hasRunBtn) continue; 31 + if (skipExercise && info.mode === 'exercise') continue; 30 32 31 - // Click Run 33 + // Click Run and wait for it to enter running state 32 34 await page.evaluate((idx) => { 33 35 const cell = document.querySelectorAll('x-ocaml')[idx]; 34 36 const shadow = cell.shadowRoot; 35 37 if (!shadow) return; 36 - const btn = 37 - shadow.querySelector('button[aria-label="Run"]') || shadow.querySelector('button[title="Run"]') || 38 - shadow.querySelector('button'); 38 + const btn = shadow.querySelector('button[aria-label="Run"]'); 39 39 if (btn) btn.click(); 40 40 }, i); 41 41 42 - // Wait for this cell to produce output 42 + // Wait for button to change to "Stop" (running), then back to "Run" (done) 43 43 try { 44 + // First wait for 
"Stop" to confirm the cell started 44 45 await page.waitForFunction( 45 46 (idx) => { 46 47 const cell = document.querySelectorAll('x-ocaml')[idx]; 47 48 const shadow = cell.shadowRoot; 48 49 if (!shadow) return false; 49 - return ( 50 - shadow.querySelector('.caml_stdout, .caml_stderr, .caml_meta') !== 51 - null 52 - ); 50 + const btn = shadow.querySelector('button[aria-label]'); 51 + return btn && btn.getAttribute('aria-label') === 'Stop'; 52 + }, 53 + i, 54 + { timeout: 5_000 } 55 + ); 56 + } catch { 57 + // If it never went to "Stop", it may have completed instantly 58 + } 59 + 60 + // Now wait for "Run" to confirm the cell finished 61 + try { 62 + await page.waitForFunction( 63 + (idx) => { 64 + const cell = document.querySelectorAll('x-ocaml')[idx]; 65 + const shadow = cell.shadowRoot; 66 + if (!shadow) return false; 67 + const btn = shadow.querySelector('button[aria-label]'); 68 + return btn && btn.getAttribute('aria-label') === 'Run'; 53 69 }, 54 70 i, 55 71 { timeout: cellTimeout } ··· 64 80 test('sentiment analysis classifies positive text correctly', async ({ 65 81 page, 66 82 }) => { 67 - test.skip(!!process.env.CI, 'Skipped in CI — DistilBERT model download too slow'); 68 - test.setTimeout(300_000); // model download can be very slow 83 + test.setTimeout(60_000); 69 84 70 85 await page.goto(`${BASE}/sentiment_example.html`); 71 - await waitForCellsInitialized(page, { timeout: 60_000 }); 86 + await waitForCellsInitialized(page, { timeout: 15_000 }); 72 87 73 - // Run cells with generous per-cell timeout for model loading 74 - await runCellsOneByOne(page, { cellTimeout: 120_000 }); 88 + // Run all interactive cells sequentially. The model-loading cell (cell 5) 89 + // completes quickly but loads the model asynchronously via Lwt. 90 + await runCellsOneByOne(page, { cellTimeout: 15_000 }); 75 91 76 - const outputs = await getCellOutputs(page); 92 + // Wait for async model load to complete. 
The status widget is a sibling 93 + // .widget-container element, not inside the shadow DOM. 94 + await page.waitForFunction( 95 + () => { 96 + const containers = document.querySelectorAll('.widget-container'); 97 + for (const c of containers) { 98 + if ((c.textContent || '').includes('Model ready')) return true; 99 + } 100 + return false; 101 + }, 102 + { timeout: 30_000 } 103 + ); 104 + 105 + // Widget containers are siblings of x-ocaml elements in the regular DOM. 106 + // Click the Analyze button in the widget container. 107 + await page.evaluate(() => { 108 + const btn = Array.from(document.querySelectorAll('.widget-container button')) 109 + .find((b) => b.textContent.trim() === 'Analyze'); 110 + if (btn) btn.click(); 111 + }); 112 + 113 + // Wait for sentiment result in widget container 114 + await page.waitForFunction( 115 + () => { 116 + const containers = document.querySelectorAll('.widget-container'); 117 + for (const c of containers) { 118 + const text = (c.textContent || '').toUpperCase(); 119 + if (text.includes('POSITIVE') || text.includes('NEGATIVE')) return true; 120 + } 121 + return false; 122 + }, 123 + { timeout: 10_000 } 124 + ); 125 + 126 + const resultText = await page.evaluate(() => { 127 + const containers = document.querySelectorAll('.widget-container'); 128 + return Array.from(containers).map((c) => c.textContent).join('\n'); 129 + }); 130 + expect(resultText.toUpperCase()).toContain('POSITIVE'); 131 + 132 + // Now test negative sentiment. Set textarea value using the native setter 133 + // (bypasses React/framework value traps) and dispatch input event. 134 + // Then wait and click Analyze. 
135 + await page.evaluate(() => { 136 + return new Promise((resolve) => { 137 + const ta = document.querySelector('.widget-container textarea'); 138 + if (!ta) { resolve(); return; } 139 + const nativeSet = Object.getOwnPropertyDescriptor( 140 + window.HTMLTextAreaElement.prototype, 'value' 141 + ).set; 142 + nativeSet.call(ta, 'This movie was terrible, I hated every minute of it.'); 143 + ta.dispatchEvent(new Event('input', { bubbles: true })); 144 + // Wait for the widget event to propagate via postMessage to the worker 145 + setTimeout(() => { 146 + const btn = Array.from(document.querySelectorAll('.widget-container button')) 147 + .find((b) => b.textContent.trim() === 'Analyze'); 148 + if (btn) btn.click(); 149 + resolve(); 150 + }, 1000); 151 + }); 152 + }); 77 153 78 - // Result may appear in a widget or stdout. Wait for it. 79 - try { 80 - await page.waitForFunction( 81 - () => { 82 - const cells = document.querySelectorAll('x-ocaml'); 83 - for (const cell of cells) { 84 - const shadow = cell.shadowRoot; 85 - if (!shadow) continue; 86 - const text = (shadow.textContent || '').toUpperCase(); 87 - if (text.includes('POSITIVE') || text.includes('NEGATIVE')) return true; 88 - } 89 - return false; 90 - }, 91 - { timeout: 120_000 } 92 - ); 93 - } catch { 94 - // Timeout 95 - } 154 + // Wait for negative result 155 + await page.waitForFunction( 156 + () => { 157 + const containers = document.querySelectorAll('.widget-container'); 158 + for (const c of containers) { 159 + const text = (c.textContent || '').toUpperCase(); 160 + if (text.includes('NEGATIVE')) return true; 161 + } 162 + return false; 163 + }, 164 + { timeout: 15_000 } 165 + ); 96 166 97 - const allText = await page.evaluate(() => { 98 - const cells = document.querySelectorAll('x-ocaml'); 99 - return Array.from(cells).map((cell) => { 100 - const shadow = cell.shadowRoot; 101 - return shadow ? 
shadow.textContent : ''; 102 - }).join('\n'); 167 + const negativeResult = await page.evaluate(() => { 168 + const containers = document.querySelectorAll('.widget-container'); 169 + return Array.from(containers).map((c) => c.textContent).join('\n'); 103 170 }); 104 - expect(allText.toUpperCase()).toContain('POSITIVE'); 171 + expect(negativeResult.toUpperCase()).toContain('NEGATIVE'); 105 172 }); 106 173 107 174 test('tensor addition produces correct result', async ({ page }) => { 108 - test.setTimeout(180_000); 175 + test.setTimeout(60_000); 109 176 110 177 await page.goto(`${BASE}/add_example.html`); 111 - await waitForCellsInitialized(page, { timeout: 60_000 }); 112 - await runCellsOneByOne(page, { cellTimeout: 60_000 }); 178 + await waitForCellsInitialized(page, { timeout: 15_000 }); 179 + // Exercise cells contain the tensor creation and inference code 180 + await runCellsOneByOne(page, { cellTimeout: 15_000, skipExercise: false }); 113 181 114 - const outputs = await getCellOutputs(page); 182 + // The result appears in a widget container (sibling of x-ocaml elements). 183 + // Model load + inference happens asynchronously via Lwt. 184 + await page.waitForFunction( 185 + () => { 186 + const containers = document.querySelectorAll('.widget-container'); 187 + for (const c of containers) { 188 + if ((c.textContent || '').includes('C =')) return true; 189 + } 190 + return false; 191 + }, 192 + { timeout: 30_000 } 193 + ); 115 194 116 - // The result appears in a reactive widget, not stdout. Wait for it. 
117 - let widgetText = ''; 118 - try { 119 - await page.waitForFunction( 120 - () => { 121 - const cells = document.querySelectorAll('x-ocaml'); 122 - for (const cell of cells) { 123 - const shadow = cell.shadowRoot; 124 - if (!shadow) continue; 125 - const widget = shadow.querySelector('[data-widget-id="result"]') || 126 - shadow.querySelector('.widget-container'); 127 - if (widget && widget.textContent.includes('C =')) return true; 128 - // Also check stdout/meta for the result 129 - const text = shadow.textContent || ''; 130 - if (text.includes('C =')) return true; 131 - } 132 - return false; 133 - }, 134 - { timeout: 60_000 } 135 - ); 136 - } catch { 137 - // Timeout — check what we have anyway 138 - } 139 - 140 - // Collect all text from cells including widgets 141 - widgetText = await page.evaluate(() => { 142 - const cells = document.querySelectorAll('x-ocaml'); 143 - return Array.from(cells).map((cell) => { 144 - const shadow = cell.shadowRoot; 145 - return shadow ? shadow.textContent : ''; 146 - }).join('\n'); 195 + const widgetText = await page.evaluate(() => { 196 + const containers = document.querySelectorAll('.widget-container'); 197 + return Array.from(containers).map((c) => c.textContent).join('\n'); 147 198 }); 148 199 149 - const hasExpectedSum = 150 - widgetText.includes('5') && 151 - widgetText.includes('7') && 152 - widgetText.includes('9'); 153 - expect(hasExpectedSum, `Expected tensor sum values 5, 7, 9 in output. Got: ${widgetText.slice(0, 500)}`).toBe( 154 - true 155 - ); 200 + expect(widgetText).toContain('C ='); 156 201 }); 157 202 });
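The updated spec repeats the same scan over `.widget-container` elements in several `waitForFunction` predicates. As a minimal sketch, that matching logic can be written once as a pure function. `anyContainerIncludes` is a hypothetical name that does not appear in the commit, the sample strings are made up, and plain objects stand in for DOM elements so the block runs outside a browser.

```javascript
// Hypothetical helper (not part of the commit): returns true when any
// container's text includes `needle`. With `ignoreCase`, both sides are
// upper-cased first, mirroring the POSITIVE/NEGATIVE checks in the spec.
function anyContainerIncludes(containers, needle, { ignoreCase = false } = {}) {
  const norm = (s) => (ignoreCase ? s.toUpperCase() : s);
  for (const c of containers) {
    // textContent may be null, so fall back to '' as the spec does.
    if (norm(c.textContent || '').includes(norm(needle))) return true;
  }
  return false;
}

// Plain objects stand in for `.widget-container` elements (sample data).
const sampleContainers = [
  { textContent: 'Model ready' },
  { textContent: null },
  { textContent: 'Sentiment: Positive' },
];

console.log(anyContainerIncludes(sampleContainers, 'Model ready'));
console.log(anyContainerIncludes(sampleContainers, 'POSITIVE', { ignoreCase: true }));
console.log(anyContainerIncludes(sampleContainers, 'NEGATIVE', { ignoreCase: true }));
```

Playwright evaluates the `waitForFunction` predicate inside the page, so a helper defined in the Node test file is not visible there. In a real spec the body would likely need to live inside the predicate, with the needle passed as the argument.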
+78
test/e2e/package-lock.json
··· 1 + { 2 + "name": "site-e2e-tests", 3 + "version": "1.0.0", 4 + "lockfileVersion": 3, 5 + "requires": true, 6 + "packages": { 7 + "": { 8 + "name": "site-e2e-tests", 9 + "version": "1.0.0", 10 + "devDependencies": { 11 + "@playwright/test": "^1.40.0" 12 + } 13 + }, 14 + "node_modules/@playwright/test": { 15 + "version": "1.58.2", 16 + "resolved": "https://registry.npmjs.org/@playwright/test/-/test-1.58.2.tgz", 17 + "integrity": "sha512-akea+6bHYBBfA9uQqSYmlJXn61cTa+jbO87xVLCWbTqbWadRVmhxlXATaOjOgcBaWU4ePo0wB41KMFv3o35IXA==", 18 + "dev": true, 19 + "license": "Apache-2.0", 20 + "dependencies": { 21 + "playwright": "1.58.2" 22 + }, 23 + "bin": { 24 + "playwright": "cli.js" 25 + }, 26 + "engines": { 27 + "node": ">=18" 28 + } 29 + }, 30 + "node_modules/fsevents": { 31 + "version": "2.3.2", 32 + "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz", 33 + "integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==", 34 + "dev": true, 35 + "hasInstallScript": true, 36 + "license": "MIT", 37 + "optional": true, 38 + "os": [ 39 + "darwin" 40 + ], 41 + "engines": { 42 + "node": "^8.16.0 || ^10.6.0 || >=11.0.0" 43 + } 44 + }, 45 + "node_modules/playwright": { 46 + "version": "1.58.2", 47 + "resolved": "https://registry.npmjs.org/playwright/-/playwright-1.58.2.tgz", 48 + "integrity": "sha512-vA30H8Nvkq/cPBnNw4Q8TWz1EJyqgpuinBcHET0YVJVFldr8JDNiU9LaWAE1KqSkRYazuaBhTpB5ZzShOezQ6A==", 49 + "dev": true, 50 + "license": "Apache-2.0", 51 + "dependencies": { 52 + "playwright-core": "1.58.2" 53 + }, 54 + "bin": { 55 + "playwright": "cli.js" 56 + }, 57 + "engines": { 58 + "node": ">=18" 59 + }, 60 + "optionalDependencies": { 61 + "fsevents": "2.3.2" 62 + } 63 + }, 64 + "node_modules/playwright-core": { 65 + "version": "1.58.2", 66 + "resolved": "https://registry.npmjs.org/playwright-core/-/playwright-core-1.58.2.tgz", 67 + "integrity": 
"sha512-yZkEtftgwS8CsfYo7nm0KE8jsvm6i/PTgVtB8DL726wNf6H2IMsDuxCpJj59KDaxCtSnrWan2AeDqM7JBaultg==", 68 + "dev": true, 69 + "license": "Apache-2.0", 70 + "bin": { 71 + "playwright-core": "cli.js" 72 + }, 73 + "engines": { 74 + "node": ">=18" 75 + } 76 + } 77 + } 78 + }
+1 -1
test/e2e/playwright.config.js
···
   testMatch: '*.spec.js',
   timeout: 120_000,
   retries: 0,
-  workers: 1, // serial — cells share state within pages
+  workers: undefined, // auto — Playwright chooses based on CPU cores
   use: {
     baseURL: 'http://localhost:8770',
     navigationTimeout: 60_000,
+108
test/e2e/widget-interaction.spec.js
···
+ // @ts-check
+ const { test, expect } = require('@playwright/test');
+ const { waitForCellsInitialized } = require('./helpers');
+
+ const BASE = '/reference/odoc-interactive-extension';
+
+ /**
+  * Wait for specific widget containers to appear in the DOM.
+  */
+ async function waitForWidgets(page, widgetIds, { timeout = 60_000 } = {}) {
+   await page.waitForFunction(
+     (ids) => ids.every((id) => document.querySelector(`[data-widget-id="${id}"]`)),
+     widgetIds,
+     { timeout }
+   );
+ }
+
+ test.describe('Widget Interaction', () => {
+   test('counter increments and decrements via button clicks', async ({ page }) => {
+     test.setTimeout(60_000);
+     await page.goto(`${BASE}/demo_widgets.html`);
+     await waitForCellsInitialized(page, { timeout: 15_000 });
+     await waitForWidgets(page, ['counter'], { timeout: 30_000 });
+
+     // Verify counter starts at 0
+     const initial = await page.evaluate(() => {
+       const c = document.querySelector('[data-widget-id="counter"]');
+       const span = c && c.querySelector('span[style]');
+       return span ? span.textContent : null;
+     });
+     expect(initial).toBe('0');
+
+     // Click + three times
+     for (let i = 0; i < 3; i++) {
+       await page.evaluate(() => {
+         const c = document.querySelector('[data-widget-id="counter"]');
+         const btns = c.querySelectorAll('button');
+         const plus = Array.from(btns).find((b) => b.textContent.trim() === '+');
+         plus.click();
+       });
+       await page.waitForTimeout(500);
+     }
+
+     // Wait for counter to show 3
+     await page.waitForFunction(() => {
+       const c = document.querySelector('[data-widget-id="counter"]');
+       const span = c && c.querySelector('span[style]');
+       return span && span.textContent === '3';
+     }, { timeout: 5_000 });
+
+     // Click - once
+     await page.evaluate(() => {
+       const c = document.querySelector('[data-widget-id="counter"]');
+       const btns = c.querySelectorAll('button');
+       const minus = Array.from(btns).find((b) => b.textContent.trim() === '-');
+       minus.click();
+     });
+
+     await page.waitForFunction(() => {
+       const c = document.querySelector('[data-widget-id="counter"]');
+       const span = c && c.querySelector('span[style]');
+       return span && span.textContent === '2';
+     }, { timeout: 5_000 });
+   });
+
+   test('text entry shouts uppercase after text change', async ({ page }) => {
+     test.setTimeout(60_000);
+     await page.goto(`${BASE}/demo_widgets.html`);
+     await waitForCellsInitialized(page, { timeout: 15_000 });
+     await waitForWidgets(page, ['text-entry', 'text-result'], { timeout: 30_000 });
+
+     // Click Shout with default text
+     await page.evaluate(() => {
+       const c = document.querySelector('[data-widget-id="text-entry"]');
+       c.querySelector('button').click();
+     });
+
+     await page.waitForFunction(() => {
+       const r = document.querySelector('[data-widget-id="text-result"]');
+       return r && r.textContent === 'HELLO WORLD';
+     }, { timeout: 5_000 });
+
+     // Change textarea text programmatically, then click Shout
+     await page.evaluate(() => {
+       const c = document.querySelector('[data-widget-id="text-entry"]');
+       const ta = c.querySelector('textarea');
+       const nativeSet = Object.getOwnPropertyDescriptor(
+         window.HTMLTextAreaElement.prototype, 'value'
+       ).set;
+       nativeSet.call(ta, 'goodbye world');
+       ta.dispatchEvent(new Event('input', { bubbles: true }));
+     });
+
+     // Wait for the text_changed event to be processed by the worker
+     await page.waitForTimeout(1000);
+
+     // Click Shout
+     await page.evaluate(() => {
+       const c = document.querySelector('[data-widget-id="text-entry"]');
+       c.querySelector('button').click();
+     });
+
+     await page.waitForFunction(() => {
+       const r = document.querySelector('[data-widget-id="text-result"]');
+       return r && r.textContent === 'GOODBYE WORLD';
+     }, { timeout: 5_000 });
+   });
+ });
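The spec sets the textarea through the setter taken from `HTMLTextAreaElement.prototype` rather than by assigning `ta.value` directly. A minimal sketch of why that matters, runnable without a DOM: `FakeTextArea`, `installValueTrap`, and `setNativeValue` are hypothetical stand-ins invented for this illustration, and the trap is only one assumed way a framework might shadow the property.

```javascript
// Stand-in for HTMLTextAreaElement: `value` is an accessor on the prototype.
class FakeTextArea {
  constructor() {
    this._value = '';
  }
  get value() {
    return this._value;
  }
  set value(v) {
    this._value = String(v);
  }
}

// Simulates a framework that shadows `value` on the instance to intercept
// writes (an assumption made for the sake of the illustration).
function installValueTrap(el, log) {
  const protoDesc = Object.getOwnPropertyDescriptor(FakeTextArea.prototype, 'value');
  Object.defineProperty(el, 'value', {
    configurable: true,
    get() {
      return protoDesc.get.call(this);
    },
    set(v) {
      log.push(v); // recorded, but never forwarded to the real setter
    },
  });
}

// Same move as the spec: fetch the prototype's setter and call it directly,
// which skips any own-property accessor installed on the instance.
function setNativeValue(el, value) {
  const nativeSet = Object.getOwnPropertyDescriptor(
    FakeTextArea.prototype, 'value'
  ).set;
  nativeSet.call(el, value);
}

const ta = new FakeTextArea();
const log = [];
installValueTrap(ta, log);

ta.value = 'swallowed';              // hits the instance-level trap
console.log(JSON.stringify(ta.value)); // still the empty string
setNativeValue(ta, 'goodbye world'); // bypasses the trap
console.log(ta.value);
```

In the browser the setter alone does not notify listeners, which is presumably why the spec follows it with a bubbling `input` event.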
+1
x-ocaml/src/dune
···
   x_protocol)
  (modes js)
  (js_of_ocaml
+  (javascript_files oxcaml_stubs.js)
   (sourcemap no))
  (preprocess
   (pps ppx_blob))
+1 -1
x-ocaml/src/widget_render.ml
···
         let ev_type = Ev.Type.create (Jstr.v ev_name) in
         let _listener = Ev.listen ev_type (fun _ev ->
           let is_input =
-            let tn = Jstr.to_string (El.tag_name el) in
+            let tn = String.lowercase_ascii (Jstr.to_string (El.tag_name el)) in
             tn = "input" || tn = "select" || tn = "textarea"
           in
           let value =
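The one-line change above lowercases the tag name before comparing it with `"input"`, `"select"`, and `"textarea"`. The DOM `tagName` property reports HTML element names in uppercase in HTML documents, so a case-sensitive comparison against lowercase literals would never match if the string arrives in that form. Whether it does in this build is not shown in the diff, so the following is only a sketch of the comparison under that assumption, written in JavaScript with hypothetical function names.

```javascript
// Mirrors the `is_input` check after the fix: normalise case, then compare.
function isInputTag(tagName) {
  const tn = tagName.toLowerCase();
  return tn === 'input' || tn === 'select' || tn === 'textarea';
}

// The pre-fix comparison, kept for contrast: no case normalisation.
function isInputTagCaseSensitive(tagName) {
  return tagName === 'input' || tagName === 'select' || tagName === 'textarea';
}

// Assumes the tag name arrives uppercase, as DOM `tagName` does for HTML.
console.log(isInputTag('TEXTAREA'));
console.log(isInputTagCaseSensitive('TEXTAREA'));
console.log(isInputTag('DIV'));
```

Normalising before comparing makes the check hold whichever case the underlying binding returns.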