A fork of mtelver's day10 project

Rename odoc-jon-shell to odoc-jons-plugins and add atom.xml

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

+1564 -27
+1516
atom.xml
··· 1 + <?xml version="1.0" encoding="UTF-8"?> 2 + <feed xmlns="http://www.w3.org/2005/Atom"><link href="https://jon.recoil.org/blog/" rel="alternate"/><link href="https://jon.recoil.org/atom.xml" rel="self"/><id>https://jon.recoil.org/atom.xml</id><title type="text">Jon's blog</title><updated>2026-03-06T15:40:56-00:00</updated><entry><summary type="text">Highlights:</summary><published>2026-02-09T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;weeknotes-for-week-6&quot;&gt;&lt;a href=&quot;#weeknotes-for-week-6&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Weeknotes for week 6&lt;/h1&gt; 3 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2026-02-09&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 4 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;odoc.xref2,odoc.loader,odoc.model&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 5 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;packages&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;packages&lt;/span&gt; &lt;p&gt;odoc&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 6 + &lt;p&gt;Highlights:&lt;/p&gt; 7 + &lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://jon.ludl.am/experiments/day10-jtw/standalone/index.html&quot;&gt;day10 / javascript toplevels integration&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://jon.ludl.am/experiments/scrollycoder/&quot;&gt;Scrollycode experiments&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt; 8 + &lt;h2 id=&quot;oxmono&quot;&gt;&lt;a href=&quot;#oxmono&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Oxmono&lt;/h2&gt; 9 + &lt;p&gt;I spent some time on Anil's oxmono repo getting odoc to work correctly. It turned out that the bug I was working on last week was critically important for this - and that the bugfix was incomplete. 
One of the issues was to do with identifiers needing to be unique. For example, consider the following code:&lt;/p&gt; 10 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S = sig 11 + type t 12 + 13 + include sig 14 + type t 15 + 16 + val f : t -&amp;gt; t 17 + end with type t := t 18 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 19 + &lt;p&gt;The problem here is that both definitions of `type t` have the same identifier, which causes problems when we move to and from the 'Component' types. The solution was to introduce a 'dummy' parent for the type defined within the include. This works because we never actually render the body of the include into HTML - we render the &lt;i&gt;expansion&lt;/i&gt;, which &lt;i&gt;doesn't&lt;/i&gt; have &lt;code&gt;type t&lt;/code&gt; in it, as it has been substituted out.&lt;/p&gt; 20 + &lt;p&gt;The fix I made last week fixed the &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_loader&quot;&gt;loader&lt;/span&gt;, which reads in the &lt;code&gt;cmt&lt;/code&gt;/&lt;code&gt;cmti&lt;/code&gt; files produced by the compiler. There's one more place where we create these in the code - when we translate from the &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_xref2.Component&quot;&gt;Component&lt;/span&gt; types back into &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang&quot;&gt;Lang&lt;/span&gt; types. I was a little curious about whether it was possible to make this happen, so I thought I'd ask Claude to see if it could come up with a scenario where we'd end up in this situation. 
This was a complete failure, which was a real disappointment to me, as doing this sort of thing is quite a tedious and annoying part of working on odoc.&lt;/p&gt; 21 + &lt;p&gt;Meanwhile, I was running odoc on Anil's &lt;a href=&quot;https://github.com/avsm/oxmono&quot;&gt;oxmono&lt;/a&gt; repo, which was using &lt;a href=&quot;https://github.com/art-w&quot;&gt;art-w&lt;/a&gt;'s &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1399&quot;&gt;PR to upstream oxcaml support&lt;/a&gt;. It was failing with an exception that was very familiar, so I pulled in the fix I'd been working on, and that enabled it to get much further. However, it did subsequently fail with another slightly different exception. I had my suspicions at this point that it might be due to the other place, but I thought this again was a good opportunity to test Claude's debugging skills. However, this again was a complete failure. I spent quite a long time prodding it - at least 4 separate sessions - and it really didn't get anywhere close to a solution, despite knowing precisely which commit we'd made that had fixed the first problem. Two of the four times it ended up telling me that the oxcaml compiler was broken and suggesting that we create an issue!&lt;/p&gt; 22 + &lt;p&gt;I'm only very mildly disappointed in this - it's all quite subtle, and something I still end up scratching my head over sometimes, but it would have been wonderful to be able to offload this sort of work!&lt;/p&gt; 23 + &lt;p&gt;In any case, the docs now all build on &lt;a href=&quot;https://github.com/jonludlam/oxmono/commit/2a53f6857d5b8849a73f5bb3e5244b9ac0f36708&quot;&gt;my fork of oxmono&lt;/a&gt;.&lt;/p&gt; 24 + &lt;h2 id=&quot;docs-ci&quot;&gt;&lt;a href=&quot;#docs-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Docs CI&lt;/h2&gt; 25 + &lt;p&gt;The fix I deployed last week for ocaml-docs-ci was taking forever to complete, so I ended up spending some time investigating this. 
The problem was happening during the 'prep' phase, which is the first part of the pipeline where we simply build the package to be documented. This is supposed to work by building a graph of all inter-package dependencies across all of the solved packages, so we maximise sharing of built artefacts. Each 'prep' job builds precisely one package by copying in the dependencies from previous prep jobs, then running &lt;a href=&quot;https://github.com/jonludlam/opamh&quot;&gt;opamh&lt;/a&gt; to fix up the metadata so that opam believes it has installed everything itself, then running opam to build the one package required. It was this last step that was going wrong, where it would decide that there had been upstream changes to the compiler itself, and rebuild &lt;i&gt;everything&lt;/i&gt;, so rather than a prep job taking a few seconds, it would take a few minutes.&lt;/p&gt; 26 + &lt;p&gt;I was totally unable to repro this locally - everything built very quickly, just as it should have done. After much head-scratching I finally realised that the problem was somewhere in the caching. I think what's going on is that we dynamically build an opam repository to make the `opam install` command faster, and that repo contains only the packages that are required to build whatever it is we're building. Those opam files are cached by the docs CI server and passed to the build script as a base64-encoded gzipped tarball inline in the obuilder file (!). This should all be totally consistent as we're also caching all the builds - except for the compiler itself, which comes from the base docker image. This, of course, is the problem. The ocaml compiler opam files had been updated, and then when we reconstructed the opam repo with our cached opam files, opam noticed they had changed (gone &lt;i&gt;backwards&lt;/i&gt; in time!) and decided it needed to rebuild the compiler, and therefore &lt;i&gt;everything&lt;/i&gt; else. 
Clearing out the opam-files cache and restarting the builds fixed this entirely, and the full rebuild job completed after about 2 days. I flipped the switch on Saturday night and the docs are now fully up to date again. Phew!&lt;/p&gt; 27 + &lt;h2 id=&quot;day10-work&quot;&gt;&lt;a href=&quot;#day10-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;day10 work&lt;/h2&gt; 28 + &lt;p&gt;This was a fun week of large-scale building! I integrated day10 and odoc_driver and js_top_worker and x-ocaml and have now successfully got a docs-ci-like system that's able to build docs and toplevels that can coexist in the one HTML tree. I've not got a full integrated demo yet, but you can see the test cases for this &lt;a href=&quot;https://jon.ludl.am/experiments/day10-jtw/standalone/index.html&quot;&gt;here&lt;/a&gt;. Be sure to take a look at the 'network' tab in the browser dev tools to see what it's doing!&lt;/p&gt; 29 + &lt;h2 id=&quot;scrollycode-experiments&quot;&gt;&lt;a href=&quot;#scrollycode-experiments&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Scrollycode experiments&lt;/h2&gt; 30 + &lt;p&gt;I've long been a fan of &lt;a href=&quot;https://pomb.us/&quot;&gt;Rodrigo Pombo's&lt;/a&gt; work on &amp;quot;building tools for better code reading comprehension&amp;quot;, ever since first seeing his post &amp;quot;&lt;a href=&quot;https://pomb.us/build-your-own-react/&quot;&gt;Build your own React&lt;/a&gt;&amp;quot;. Claude is &lt;i&gt;fantastically good&lt;/i&gt; at doing this sort of thing, so I asked it to go and build me some simple OCaml-focused versions. We came up with 5 variations in the end - and they're all pretty neat! &lt;a href=&quot;https://jon.ludl.am/experiments/scrollycoder/&quot;&gt;take a look!&lt;/a&gt;. 
The best part of this was that it took me less than half-an-hour to get Claude to do all this.&lt;/p&gt; 31 + &lt;h2 id=&quot;dune-pr&quot;&gt;&lt;a href=&quot;#dune-pr&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Dune PR&lt;/h2&gt; 32 + &lt;p&gt;I attended the bi-weekly dune dev meeting to talk about the first part of the dune PR - the bit that Paul Elliot did almost a year ago.&lt;/p&gt; 33 + &lt;h2 id=&quot;coming-week&quot;&gt;&lt;a href=&quot;#coming-week&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Coming week&lt;/h2&gt; 34 + &lt;p&gt;So the clock is ticking on writing the exam questions for FoCS, so I'll need to be spending time this week on that.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html</id><title type="text">Weeknotes for week 6</title><updated>2026-02-09T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I've been battling the seasonal illnesses this week, so I've combined two weeknotes into one. 
Fortunately the 'flu doesn't hold Claude back!</summary><published>2026-01-30T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2026/01/weeknotes-2026-04-05.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;weeknotes-for-weeks-4-5&quot;&gt;&lt;a href=&quot;#weeknotes-for-weeks-4-5&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Weeknotes for weeks 4-5&lt;/h1&gt; 35 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2026-01-30&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 36 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;odoc.extension_api&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 37 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;packages&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;packages&lt;/span&gt; &lt;p&gt;odoc-admonition-extension odoc-rfc-extension odoc-msc-extension odoc-mermaid-extension odoc-dot-extension&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 38 + &lt;p&gt;I've been battling the seasonal illnesses this week, so I've combined two weeknotes into one. Fortunately the 'flu doesn't hold Claude back!&lt;/p&gt; 39 + &lt;p&gt;Probably the most interesting part of this is the &lt;a href=&quot;#retrospective&quot; title=&quot;retrospective&quot;&gt;Retrospective&lt;/a&gt;, so make sure to read that bit.&lt;/p&gt; 40 + &lt;h2 id=&quot;the-last-two-weeks&quot;&gt;&lt;a href=&quot;#the-last-two-weeks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The Last Two Weeks&lt;/h2&gt; 41 + &lt;p&gt;As is becoming more and more apparent, the &lt;em&gt;breadth&lt;/em&gt; of what I'm working on is ever expanding, powered by agentic AI. It's become so much (cognitively) cheaper to have an idea and set an agent off investigating it that I'm working in parallel on far more things in a single week than I would have even six months ago. 
Here are some of the bigger headings though.&lt;/p&gt; 42 + &lt;h3 id=&quot;monorepo-excitement&quot;&gt;&lt;a href=&quot;#monorepo-excitement&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Monorepo excitement&lt;/h3&gt; 43 + &lt;p&gt;We're currently experimenting with a new tool - &lt;a href=&quot;https://tangled.org/anil.recoil.org/monopam&quot;&gt;monopam&lt;/a&gt; to help develop across multiple OCaml libraries by using git subtrees to create a monorepo with all of the packages in. We then extract patches to the individual repos to push upstream. I've been moving my development workflow from in-vscode-claude with careful permissions checking to running claude with `--dangerously-skip-permissions` in a container with the monorepo checked out. This has been a bit of a bumpy ride, with the tool evolving daily, but I'm very much seeing the benefits of letting Claude just get on with things, given a strict enough early design and testing strategy, and using Anil's method of creating the interfaces first.&lt;/p&gt; 44 + &lt;h4 id=&quot;odoc&quot;&gt;&lt;a href=&quot;#odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc&lt;/h4&gt; 45 + &lt;p&gt;I also did quite a bit related to odoc these 2 weeks, split over improving functionality and bugfixing.&lt;/p&gt; 46 + &lt;h5 id=&quot;plugins&quot;&gt;&lt;a href=&quot;#plugins&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Plugins&lt;/h5&gt; 47 + &lt;p&gt;Getting Claude to run with all of the monorepo libraries implicitly requires that they're well documented, as looking at the source to figure out how to use them exhausts the context window pretty rapidly. 
Odoc's main focus has been on getting the expansions and referencing correct, and while we've made progress on the actual content markup, introducing &lt;a href=&quot;https://ocaml.github.io/odoc/odoc/odoc_for_authors.html#media&quot;&gt;media tags&lt;/a&gt; for example, there's still a good distance to go.&lt;/p&gt; 48 + &lt;p&gt;Using the plugins mechanism I &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2026/01/weeknotes-2026-03&quot;&gt;wrote about last week&lt;/span&gt;, I've made a plugin interface for odoc and implemented a few plugins. Initially I was just going to support 'custom tags' but it occurred to me that rendering code blocks could also be done in this way. So I've made a few. Two custom tag plugins:&lt;/p&gt; 49 + &lt;ul&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-admonition-extension/index&quot;&gt;odoc-admonition-extension&lt;/span&gt; - styled callout blocks for notes, warnings, tips. Note that we are intending to make this more first-class - there's a &lt;a href=&quot;https://hackmd.io/ETSOAmetTI-E3vrDk3Bfrw&quot;&gt;design out there&lt;/a&gt;. 
This was just a convenient way to test the feature!&lt;/li&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-rfc-extension/index&quot;&gt;odoc-rfc-extension&lt;/span&gt; - links to IETF RFC documents&lt;/li&gt;&lt;/ul&gt; 50 + &lt;p&gt;and 3 code block plugins:&lt;/p&gt; 51 + &lt;ul&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-msc-extension/index&quot;&gt;odoc-msc-extension&lt;/span&gt; - Message Sequence Charts&lt;/li&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-mermaid-extension/index&quot;&gt;odoc-mermaid-extension&lt;/span&gt; - Mermaid diagrams (flowcharts, sequence diagrams, etc.)&lt;/li&gt;&lt;li&gt;&lt;span class=&quot;xref-unresolved&quot; title=&quot;/odoc-dot-extension/index&quot;&gt;odoc-dot-extension&lt;/span&gt; - Graphviz/DOT diagrams&lt;/li&gt;&lt;/ul&gt; 52 + &lt;p&gt;The module signatures relevant to the plugins are documented in &lt;code&gt;/odoc.extension_api/Odoc_extension_api&lt;/code&gt; and the plugins each have to implement an interface described in &lt;code&gt;Odoc_extension_api.Code_Block_Extension&lt;/code&gt; or &lt;code&gt;Odoc_extension_api.Extension&lt;/code&gt; for custom tags.&lt;/p&gt; 53 + &lt;h5 id=&quot;bugfixing&quot;&gt;&lt;a href=&quot;#bugfixing&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Bugfixing&lt;/h5&gt; 54 + &lt;p&gt;&lt;a href=&quot;https://github.com/lukemaurer&quot;&gt;Luke Maurer&lt;/a&gt; at Jane Street pointed out that they're still suffering from yet another repro of &lt;a href=&quot;https://github.com/ocaml/odoc/issues/930&quot;&gt;issue 930&lt;/a&gt;. 
I'd worked on this &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/09/odoc-bugs&quot;&gt;back in September&lt;/span&gt; but it turns out I hadn't actually made a PR, so I tidied up the branch and &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1400&quot;&gt;made a PR&lt;/a&gt;.&lt;/p&gt; 55 + &lt;h3 id=&quot;docs-ci&quot;&gt;&lt;a href=&quot;#docs-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Docs CI&lt;/h3&gt; 56 + &lt;p&gt;Docs CI has been fixed and is even now rebuilding all of the docs for ocaml.org. I've added in the &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci/commit/c6231fa383820b4c700aaa1e72107536b1872112&quot;&gt;handling of `post &amp;amp; with-doc`&lt;/a&gt; in place of x-extra-doc-deps, so we should be able to use either mechanism now. The idea is to deprecate x-extra-doc-deps soon though. Somehow despite an explicit button to press to update the epoch symlinks, it got updated anyway and broke most of the docs on ocaml.org. Fortunately &lt;a href=&quot;https://discuss.ocaml.org/t/is-caqti-doc-missing/17741/5&quot;&gt;someone noticed&lt;/a&gt; and posted on discuss and so I switched it back.&lt;/p&gt; 57 + &lt;p&gt;Unfortunately, it seemed to be taking a long time to build the docs - at time of writing it's now Friday, and the CI jobs have been running since Tuesday. In that time, it's only managed to build about 6500 packages, a long way short of the 16,000 or so that I expect a full build will produce. Looking through the logs, it seems that some change to opam is causing it to sometimes rebuild the entire opam universe when it should only be building 1 package. For example, in a job that should be building just `tezos-protocol-004-Pt24m4xi`, it installs all of the prebuilt dependencies, then runs `opamh` to try to convince opam that everything is all set up to just run the build step for the package we want. 
Unfortunately the logs show the following:&lt;/p&gt; 58 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;The following actions will be performed: 59 + === recompile 178 packages 60 + - recompile aches 1.1.0 [uses ocaml] 61 + - recompile aches-lwt 1.1.0 [uses ocaml] 62 + ... 63 + - recompile mtime 2.1.0 [uses ocaml] 64 + - recompile ocaml 4.14.2 [upstream or system changes] 65 + - recompile ocaml-compiler-libs v0.12.4 [uses ocaml] 66 + ...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 67 + &lt;p&gt;where it seems opam has decided that something has changed enough for it to want to recompile the `ocaml` package, and therefore &lt;i&gt;everything&lt;/i&gt; in the entire opam switch! So this job took 12 minutes instead of 21 seconds, which was the time required to finally build the `tezos-protocol` package.&lt;/p&gt; 68 + &lt;h3 id=&quot;day10-and-docs&quot;&gt;&lt;a href=&quot;#day10-and-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Day10 and docs&lt;/h3&gt; 69 + &lt;p&gt;In closely related news, &lt;a href=&quot;https://tunbury.org/&quot;&gt;mtelver's&lt;/a&gt; day10 project looked precisely the right shape for building docs - in fact it shares its architecture and some components with the docs CI. So I asked Claude to take a look and see what it would take, and discovered that it doesn't take very much! We have a Really Big Machine here at the CL that was temporarily underused; and by Really Big I mean 768 cores and 3TB of RAM. So, how long could building all of the docs for all of the packages possibly take? Well, it takes 5 hours 40 mins. And I was only using roughly a third of the machine. Nice!&lt;/p&gt; 70 + &lt;p&gt;So should I push on with fixing ocaml-docs-ci and figure out why it's rebuilding everything all the time? Or should I forge ahead with day10 and turn it into a proper CI system as opposed to a slightly flakey bespoke thing I have to handhold through a build? 
This is next week's problem.&lt;/p&gt; 71 + &lt;h3 id=&quot;js-toplevels&quot;&gt;&lt;a href=&quot;#js-toplevels&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;JS toplevels&lt;/h3&gt; 72 + &lt;p&gt;Something I keep coming back to is javascript toplevels. I'd really like to be able to host JS toplevels on ocaml.org for each different version of each different package. This is something I've worked on on-and-off for a long time now, and several fixes to help have been merged to various projects along the way. The tricky thing is to not put a massive load onto ocaml.org with this, so we need to be efficient. That means firstly having a single toplevel js file with all of the logic in but none of the libraries, and then dynamically loading libraries as we need them. Also we can save some bandwidth by not immediately sending all of the cmi files, as these can be faulted in as necessary too. So once again I've got Claude on the task, and things are honestly looking pretty hopeful now. I've got 2 demos:&lt;/p&gt; 73 + &lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://jon.ludl.am/experiments/findlibish/&quot;&gt;Dynamic library loading&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://jon.ludl.am/experiments/multi-universe-demo/&quot;&gt;Multi-version support&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt; 74 + &lt;p&gt;In both cases, make sure you take a look at the network tab to see it dynamically loading only what it needs.&lt;/p&gt; 75 + &lt;h2 id=&quot;retrospective&quot;&gt;&lt;a href=&quot;#retrospective&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Retrospective&lt;/h2&gt; 76 + &lt;h3 id=&quot;autonomous-claude&quot;&gt;&lt;a href=&quot;#autonomous-claude&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Autonomous Claude&lt;/h3&gt; 77 + &lt;p&gt;The power of sending Claude off to do some work can be immense. 
However, it does mean investing time up front telling it precisely what problem you're trying to solve, what approach to take, finer details on how you want it done, and how you can tell if it's working when it finishes. A 'failure mode' I've been experiencing is when I end up in a long, drawn out real time interaction, especially if that's happening with 2 projects simultaneously - and by 'failure' I really mean just 'slow'. Ideally what would be going on is for all of my agents to be getting on with whatever task they've been allocated without bothering me for more details. For Claude to have to ask me a question has much more latency involved than it just getting on with things, especially if I don't notice it immediately.&lt;/p&gt; 78 + &lt;h3 id=&quot;when-to-stop&quot;&gt;&lt;a href=&quot;#when-to-stop&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;When to Stop&lt;/h3&gt; 79 + &lt;p&gt;The 'finishing criteria' are important - many times this week I've had Claude tell me it's finished something, having verified that it's passing all the tests, only for me to take a look to find that it's very obviously broken. As quite a few things recently have involved the web, I've put Playwright into all of my devcontainers, and told Claude to use it to verify things are working. This has been working pretty well, so I'll be adding it to my prompts. It's not too dissimilar to what we used to call 'pre-flight checks' back in the Citrix days.&lt;/p&gt; 80 + &lt;h3 id=&quot;containers-vs-accounts&quot;&gt;&lt;a href=&quot;#containers-vs-accounts&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Containers vs accounts&lt;/h3&gt; 81 + &lt;p&gt;I've been running everything with `--dangerously-skip-permissions` in containers, and while the outcome is amazing, the containers bit has been a bit of a headache. Next week I'll be trialling the idea of just giving the agents their own account (non-admin!) 
on my servers, their own github account, tangled account and so on, and just treating them more like I would if I had a real colleague. It's always slightly alarming to see my own name on the output of the bots, assigning me (or sometimes someone else (!!)) copyright over code I've never seen. This is, of course, a whole other pandora's box that I really don't want to open right now - but I think the point is that I'll feel a lot more comfortable if the commits are all by `Jon's Agent &amp;lt;jon+claude@recoil.org&amp;gt;` rather than by me!&lt;/p&gt; 82 + &lt;h3 id=&quot;deciding-next-steps&quot;&gt;&lt;a href=&quot;#deciding-next-steps&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Deciding next steps&lt;/h3&gt; 83 + &lt;p&gt;The question of whether I should fix up ocaml-docs-ci or improve the day10 solution requires a bit of thought. In fact, it requires a bit of a gap analysis between the two. This isn't something I've asked Claude to do before, so I'll try that and see how it turns out. I'll be asking it to be &amp;quot;scientific&amp;quot; in its approach, coming up with hypotheses and verifying them - for which I think I'll need to give it a platform on which it can perform experiments. This is a bit trickier with ocaml-docs-ci than day10 as day10 runs entirely on any given linux computer, whereas ocaml-docs-ci needs ocurrent workers and a routable ssh server. I'll report on the outcome of this next week!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2026/01/weeknotes-2026-04-05.html</id><title type="text">Weeknotes for weeks 4-5</title><updated>2026-01-30T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">First week back of 2026! 
Let's write some terse weeknotes.</summary><published>2026-01-19T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2026/01/weeknotes-2026-03.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;weeknotes-for-week-3&quot;&gt;&lt;a href=&quot;#weeknotes-for-week-3&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Weeknotes for week 3&lt;/h1&gt; 84 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2026-01-19&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 85 + &lt;p&gt;First week back of 2026! Let's write some terse weeknotes.&lt;/p&gt; 86 + &lt;h2 id=&quot;projects&quot;&gt;&lt;a href=&quot;#projects&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Projects&lt;/h2&gt; 87 + &lt;h3 id=&quot;dune-odoc-rules&quot;&gt;&lt;a href=&quot;#dune-odoc-rules&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Dune odoc rules&lt;/h3&gt; 88 + &lt;p&gt;Last thing I did last year was to push the new rules for odoc 3. This week, Anil handed me an excellent opportunity to test the rules on the monorepo containing his &lt;a href=&quot;https://anil.recoil.org/notes/aoah-2025&quot;&gt;AOAH&lt;/a&gt; projects. Claude tends to actually write ocamldoc-formatted comments, so this is really useful to test the rules. I've &lt;a href=&quot;https://github.com/jonludlam/dune/tree/odoc-v3-rules-3.21&quot;&gt;rebased the commits&lt;/a&gt; on the just-released Dune 3.21 and we've been trying them out. There were a few things to fix:&lt;/p&gt; 89 + &lt;ul&gt;&lt;li&gt;More careful &lt;a href=&quot;https://github.com/jonludlam/dune/commit/25158eabf0c3cac2826e16ce590b4bd4d7c09818&quot;&gt;dependency tracking&lt;/a&gt; during the compile phase - this particularly affected the &lt;code&gt;@doc&lt;/code&gt; target, which was pulling in unnecessary dependencies. 
Most of these dependencies were compiling just fine, but one - Angstrom - is slightly odd in that the opam install of Angstrom installs a META file that references libraries that aren't in the dependencies of its opam package. This is a backward-compatibility hack that was implemented when the Angstrom package was split into several in order to manage the dependencies better.&lt;/li&gt;&lt;li&gt;A similar issue happens with eio, where the documentation of the package depends upon &lt;code&gt;bigstring&lt;/code&gt;, which isn't in eio's dependencies. This is entirely intentional - the extra doc dependency is stated in the opam file with an &lt;code&gt;x-extra-doc-deps&lt;/code&gt; field. However, &lt;code&gt;opam install&lt;/code&gt; totally ignores this field (quite reasonably), and so a simple install gives you an opam repo whose docs can't be built. Once again, this broke &lt;code&gt;dune build @doc&lt;/code&gt; unnecessarily, but the fix was &lt;a href=&quot;https://github.com/jonludlam/dune/commit/2afe046cf4290d7a83b5f2c5646e3391ca94b630&quot;&gt;relatively simple&lt;/a&gt;. The &lt;i&gt;real&lt;/i&gt; fix here is to not use &lt;code&gt;x-extra-doc-deps&lt;/code&gt;, but switch to using a &lt;i&gt;real&lt;/i&gt; dependency, but marked with &lt;code&gt;with-doc&lt;/code&gt; and &lt;code&gt;post&lt;/code&gt; if it would otherwise introduce a circular dependency. That way, an &lt;code&gt;opam install --with-doc&lt;/code&gt; &lt;i&gt;would&lt;/i&gt; install the extra dependency.&lt;/li&gt;&lt;li&gt;Over the Christmas break, &lt;a href=&quot;https://discuss.ocaml.org/u/tbrk&quot;&gt;tbrk&lt;/a&gt; posted &lt;a href=&quot;https://discuss.ocaml.org/t/odoc-index-for-multiple-packages-inter-package-links-and-local-global-sidebar/17652&quot;&gt;on discuss&lt;/a&gt; a question about building docs, for which my dune branch was a partial answer. One feature he was requesting though was the ability to use a custom top-level index. 
It's a useful feature that's implemented in &lt;code&gt;odoc_driver&lt;/code&gt; so I've &lt;a href=&quot;https://github.com/jonludlam/dune/commit/efecdee1b36b7e47906e7c64b7496a1fc7954a2d&quot;&gt;added it&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;More sensible &lt;a href=&quot;https://github.com/jonludlam/dune/commit/039eb3d2a3e9d28f8b195905f43839daf5ce8c21&quot;&gt;default link scope&lt;/a&gt;. By default, documentation references in the &lt;code&gt;mli&lt;/code&gt; files of a library can link to any other library in the package. However, by default it wasn't possible to link to the dependencies of another library, unless it happened to be a dependency of your own library. Similarly, the package-wide mld files could only reference the modules in the package's libraries, not the dependencies. This seems overly cautious, as we can be sure that if we've managed to build the libraries then their dependencies are installed, and if there are any module name conflicts, we can resolve them via the &lt;code&gt;/&amp;lt;lib&amp;gt;/Module&lt;/code&gt; syntax.&lt;/li&gt;&lt;li&gt;Lastly, implementations of virtual libraries &lt;a href=&quot;https://github.com/jonludlam/dune/commit/12f9ecbd4888444c2d359049a914ffb4827912f9&quot;&gt;need to be skipped&lt;/a&gt; as they've all got the same docs (as they share mli files), and the rules as they stood were causing Dune to crash with a &amp;quot;Conflicting implementations&amp;quot; error.&lt;/li&gt;&lt;/ul&gt; 90 + &lt;p&gt;I've also rebased the PR onto latest &lt;code&gt;main&lt;/code&gt;, but I've not yet put these patches there, which I'll need to do for the PR to be mergeable. 
For now, the 3.21 branch is successfully building the docs for the monorepo.&lt;/p&gt; 91 + &lt;h3 id=&quot;ocaml-docs-ci&quot;&gt;&lt;a href=&quot;#ocaml-docs-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OCaml Docs CI&lt;/h3&gt; 92 + &lt;p&gt;&lt;a href=&quot;https://github.com/jmid&quot;&gt;Jan Midtgaard&lt;/a&gt; noticed over xmas that the Docs CI &lt;a href=&quot;https://github.com/ocaml/ocaml.org/issues/3437&quot;&gt;was broken&lt;/a&gt; and submitted &lt;a href=&quot;https://github.com/jonludlam/opamh/pull/1&quot;&gt;a fix&lt;/a&gt;. I've therefore been poking &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt; to get the fix incorporated and into production. I almost immediately hit the issue that &lt;code&gt;odoc_driver&lt;/code&gt; now breaks for the exact same reason. I couldn't quite understand how &lt;code&gt;opam-format&lt;/code&gt; &lt;a href=&quot;https://github.com/ocaml/opam-repository/pull/28978&quot;&gt;had been merged&lt;/a&gt; to &lt;code&gt;opam-repository&lt;/code&gt; without someone noticing that it had broken &lt;code&gt;odoc_driver&lt;/code&gt;, but it turned out that it &lt;i&gt;had&lt;/i&gt; been noticed, but on a &lt;a href=&quot;https://github.com/ocaml/opam-repository/pull/28877&quot;&gt;beta release&lt;/a&gt;. The fix to docs ci was to install &lt;code&gt;odoc_driver&lt;/code&gt; from opam rather than &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci/blob/81ca17c7b7a2f47ca571b1d6bc866720cebef136/src/lib/config.ml#L226&quot;&gt;pinning directly&lt;/a&gt; to a github hash, especially if that hash happens to be the hash of the released version!&lt;/p&gt; 93 + &lt;p&gt;While I'm working on docs CI, I thought it's probably also a good idea to move over to the &lt;code&gt;with-doc &amp;amp; post&lt;/code&gt; suggestion from above, so we're ready for when packages start to use that. 
This is now being tested, and hopefully we'll have the CI back up and running early next week.&lt;/p&gt; 94 + &lt;h3 id=&quot;better-styling-for-odoc&quot;&gt;&lt;a href=&quot;#better-styling-for-odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Better styling for odoc&lt;/h3&gt; 95 + &lt;p&gt;I've done very little to the styling of odoc since I took maintainership way back in 2019 or so. It's a bit dated, and there are some annoying usability issues, so I thought it's a good opportunity to vibe-code a nice new frontend for it. Rather than hack directly on the HTML generator of odoc, this seemed to be a good opportunity to test the JSON output from the new Dune rules, so I asked Claude to make me a static site generator that read in the JSON files and spat out some nicely styled HTML. This worked like a charm, and the results are &lt;a href=&quot;https://jon.ludl.am/experiments/vibe-coded-odoc-frontend/&quot;&gt;here&lt;/a&gt;. Next steps are to see what it would take to get the native odoc output looking more like that.&lt;/p&gt; 96 + &lt;h3 id=&quot;custom-tags-in-odoc&quot;&gt;&lt;a href=&quot;#custom-tags-in-odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Custom tags in odoc&lt;/h3&gt; 97 + &lt;p&gt;One of the themes of Anil's &lt;a href=&quot;&quot;&gt;AOAH&lt;/a&gt; coding spree was that many libraries were implementations of RFCs. In many places in the docs, there are links to relevant sections of the RFCs. It'd be nice in future to be able to validate that we've covered all of the parts of the RFCs, so making the links a little more parsable seemed like a good idea. 
In fact, it seemed that this might be a perfect use for custom tags - a feature that was present in ocamldoc that odoc has yet to implement.&lt;/p&gt; 98 + &lt;p&gt;&lt;a href=&quot;https://github.com/art-w&quot;&gt;Arthur Wendling&lt;/a&gt; then pointed me at dune's &lt;a href=&quot;https://dune.readthedocs.io/en/stable/reference/dune/plugin.html&quot;&gt;plugin system&lt;/a&gt;, which seemed just the ticket as a way to implement this. It's really nice, taking all of the hard work out of creating OCaml plugins, so I've now got &lt;a href=&quot;https://github.com/jonludlam/odoc/tree/extension-plugins&quot;&gt;an extension-plugins branch&lt;/a&gt; that implements this. It allows you to add support to odoc for tags like &lt;code&gt;@rfc&lt;/code&gt; which generate custom HTML, markdown or any other backend, can include links in their bodies, and can add custom headers to the web page, and custom files to be output by &lt;code&gt;odoc support-files&lt;/code&gt;. It looks like this should &amp;quot;just work&amp;quot; and no further changes to the dune rules are needed - though I need to actually test this out.&lt;/p&gt; 99 + &lt;h3 id=&quot;day10-and-docs&quot;&gt;&lt;a href=&quot;#day10-and-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Day10 and docs&lt;/h3&gt; 100 + &lt;p&gt;I've &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/09/build-ids-for-day10&quot;&gt;written about&lt;/span&gt; &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark's&lt;/a&gt; day10 project before. It's a tool to very rapidly build odoc packages mainly in order to test that they build correctly. An obvious extension would be to use this to then build the docs for those packages, as the way we do this requires the packages to be built first. This would be a replacement for the Docs CI that I talked about above, though there's considerable work to do before it's fully-featured enough to be a viable alternative. 
It seemed like a good time to experiment with this though, so I set up one of Anil's &lt;a href=&quot;https://anil.recoil.org/notes/ocaml-claude-dev&quot;&gt;devcontainers&lt;/a&gt;, gave Claude some instructions on what to do, took the safety belt off, and let it hack away! Previously most of my interactions with Claude had been via the vscode plugin, so using the terminal interface was a bit of a different experience. I'm fairly certain though that I'm going to switch everything over to working this way, as letting Claude just get on with things without having to OK every step is a far more efficient way to work - especially when you're not that concerned with the actual code being produced. This has been mostly a good experience, though Claude does sometimes go off in rather odd directions. At one point there was a network error with a dependency while trying to build odoc_driver, so it decided that it should have a fallback mechanism that executed odoc directly. I told it &lt;i&gt;NEVER&lt;/i&gt; to replace functionality in odoc_driver, so it rolled this back, but a few hours later it then did exactly the same thing again.&lt;/p&gt; 101 + &lt;h3 id=&quot;misc-other-stuff&quot;&gt;&lt;a href=&quot;#misc-other-stuff&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Misc other stuff&lt;/h3&gt; 102 + &lt;p&gt;A few other things too - &lt;a href=&quot;https://github.com/jonludlam/odoc/commit/59037341cd53d8734a5874f7af2b728b5be70035&quot;&gt;improving the &lt;code&gt;--warn-error&lt;/code&gt; logic in odoc&lt;/a&gt;, and one of its &lt;a href=&quot;https://github.com/jonludlam/odoc/commit/9d18feff5eda543652c6749062750de6e5bb4d6e&quot;&gt;error messages&lt;/a&gt;, improving the build of this website so I can iterate on it more quickly, fixing up some of my self-hosted services like my tangled knot, and other bits and bobs.&lt;/p&gt; 103 + &lt;h2 id=&quot;reflections&quot;&gt;&lt;a href=&quot;#reflections&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Reflections&lt;/h2&gt; 
104 + &lt;p&gt;I think the most important thing this week has been the slightly eye-opening benefits of using Claude outside of the context of VSCode. I suspect I'll be doing much more of my work this way in future. There's also a good chance I'll have to upgrade my subscription from the $100-per-month to the $200 one...&lt;/p&gt; 105 + &lt;h2 id=&quot;next-week&quot;&gt;&lt;a href=&quot;#next-week&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Next week&lt;/h2&gt; 106 + &lt;ul&gt;&lt;li&gt;Start of term tutorial meetings&lt;/li&gt;&lt;li&gt;Sherldoc in monopam-myspace&lt;/li&gt;&lt;li&gt;Get ocaml-docs-ci deployed and working&lt;/li&gt;&lt;li&gt;Update the Dune PR&lt;/li&gt;&lt;li&gt;Integrate the custom-tags and website generator into monopam-myspace&lt;/li&gt;&lt;li&gt;Unleash Claude on my js-top-worker repo&lt;/li&gt;&lt;/ul&gt;</content><id>https://jon.recoil.org/blog/2026/01/weeknotes-2026-03.html</id><title type="text">Weeknotes for week 3</title><updated>2026-01-19T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Back in March of this year we released odoc 3.0.0, a major new version of the OCaml documentation generator. It had a whole load of new features, many of which came with new demands on the build system driving it. 
We decid...</summary><published>2025-12-18T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/12/claude-and-dune.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;claude-and-dune&quot;&gt;&lt;a href=&quot;#claude-and-dune&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Claude and Dune&lt;/h1&gt; 107 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-12-18&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 108 + &lt;p&gt;Back in March of this year we released &lt;a href=&quot;https://ocaml.github.io/odoc/odoc/index.html&quot;&gt;odoc 3.0.0&lt;/a&gt;, a major new version of the OCaml documentation generator. It had a whole load of &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043&quot;&gt;new features&lt;/a&gt;, many of which came with new demands on the build system driving it. We decided when working on it to build a new driver for odoc so that we could adjust it as we were building the new features, and this driver is now used to &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/odoc-3-live-on-ocaml-org&quot;&gt;build the documentation&lt;/span&gt; that appears on &lt;a href=&quot;https://ocaml.org/p/base/latest/doc/index.html&quot;&gt;ocaml.org&lt;/a&gt;. However, it was always the plan to integrate the new features into &lt;a href=&quot;https://dune.build&quot;&gt;Dune&lt;/a&gt; so that everyone could just run &lt;code&gt;dune build @doc&lt;/code&gt; and be able to use all of the new odoc 3 features.&lt;/p&gt; 109 + &lt;p&gt;So over the last few weeks I have been wrestling with getting Claude to update the odoc rules in Dune to support &lt;i&gt;some&lt;/i&gt; of the new features of odoc v3. What began as a background experiment during a lecture series has turned into a multi-week effort to turn mostly-working code into a clean, reviewable patch. 
AI-developed software is clearly going to be a big part of our future, and Anil is showing us all the way with his &lt;a href=&quot;https://anil.recoil.org/notes/aoah-2025-1&quot;&gt;Advent of Agentic Humps&lt;/a&gt; by building &lt;i&gt;new&lt;/i&gt; software, but upstreaming AI-generated changes to an existing, well-established code base &lt;a href=&quot;https://github.com/ocaml/ocaml/pull/14369&quot;&gt;hasn't got off to a good start&lt;/a&gt; in the OCaml community, so I wanted to be extra careful to get this right.&lt;/p&gt; 110 + &lt;h3 id=&quot;claude-as-a-protyping-tool&quot;&gt;&lt;a href=&quot;#claude-as-a-protyping-tool&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Claude as a prototyping tool&lt;/h3&gt; 111 + &lt;p&gt;The initial progress was pretty amazing, despite my initial worries that the dune code-base would be &lt;a href=&quot;https://github.com/ocaml/dune/pull/12529&quot;&gt;too large and subtle&lt;/a&gt; for an LLM to be able to make workable changes. In order to get going, first I had it look at several bits of example code:&lt;/p&gt; 112 + &lt;p&gt;1. &lt;a href=&quot;https://github.com/ocaml/dune/blob/3.20.2/src/dune_rules/odoc.ml&quot;&gt;dune_rules/odoc.ml&lt;/a&gt; - this is the current home of the odoc rules in dune. It's local-only, meaning it only builds the docs for the current package in isolation, so no resolution of links to stdlib, other packages or libraries.&lt;/p&gt; 113 + &lt;p&gt;2. &lt;a href=&quot;https://github.com/ocaml/dune/blob/3.20.2/src/dune_rules/odoc_new.ml&quot;&gt;dune_rules/odoc_new.ml&lt;/a&gt; - these are the rules for odoc v2, which allow you to build the docs for your package plus all of the dependencies. I wrote this mostly myself some time ago. It does a pretty poor job of caching and error reporting, and has none of the odoc v3 features like assets, source rendering, hierarchical docs, better errors and so on.&lt;/p&gt; 114 + &lt;p&gt;3. 
&lt;a href=&quot;https://github.com/ocaml/odoc/tree/d8460cdaa2b91a03434a9a045d673703b7fabfb2/src/driver&quot;&gt;odoc_driver&lt;/a&gt; - this is the driver we wrote when building odoc v3. It's fully featured, but not at all incremental, and actually external to the dune codebase. It's the reference implementation that's used to build the docs that appear on &lt;a href=&quot;https://ocaml.org/p/base/latest/doc/index.html&quot;&gt;ocaml.org&lt;/a&gt;.&lt;/p&gt; 115 + &lt;p&gt;Armed with these three code-bases, I asked Claude to synthesise a new incremental version of the odoc rules for dune that has some of the features of &lt;code&gt;odoc_driver&lt;/code&gt;.&lt;/p&gt; 116 + &lt;h3 id=&quot;the-working-prototype&quot;&gt;&lt;a href=&quot;#the-working-prototype&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The working prototype&lt;/h3&gt; 117 + &lt;p&gt;Claude quickly produced a prototype that actually compiled and generated documentation. At that stage I was not interested in the quality of the generated source; I only needed to know whether Claude could navigate Dune's codebase and produce something that &lt;b&gt;works&lt;/b&gt;. I let the prototype evolve incrementally, adding in new features one at a time, for example, fixing the error reporting so that you only get warned about documentation errors that you can actually fix.&lt;/p&gt; 118 + &lt;p&gt;When the lectures finished, it turned out I had something that was pretty useful to me, and had a good chance to be useful to others too. So I opened up my editor and had a look through what had been produced, at this point hoping that a little bit of polishing should be enough - after all, it &lt;i&gt;was&lt;/i&gt; working!&lt;/p&gt; 119 + &lt;p&gt;It was dreadful.&lt;/p&gt; 120 + &lt;p&gt;There were long, rambling functions, code duplication, bad comments, it was unstructured, with repeated-but-slightly-different chunks all over the place. 
It wasn't just bad on one length scale - it was bad from the large-scale organisation of the code down to small scale baffling weirdnesses on one line. The more I looked, the more bonkers it appeared. But it did &lt;i&gt;work&lt;/i&gt;! So I thought I'd get Claude to clean up its own messes.&lt;/p&gt; 121 + &lt;h3 id=&quot;the-clean-up&quot;&gt;&lt;a href=&quot;#the-clean-up&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The clean-up&lt;/h3&gt; 122 + &lt;p&gt;I resolved that I would continue to let Claude do &lt;i&gt;all&lt;/i&gt; of the editing, and not do &lt;i&gt;any&lt;/i&gt; myself, and so thus began the more frustrating part of this adventure! I ended up giving a mix of very specific instructions: &amp;quot;move this code here&amp;quot;, &amp;quot;factorize out this functionality&amp;quot;, &amp;quot;rename this function&amp;quot;, and sometimes more general ones: &amp;quot;Remove any comments that don't add anything of value&amp;quot;, or &amp;quot;Think of a better way to do this&amp;quot;. The constant was that I needed to be looking over each change that it did, because while most of them were pretty good, there were still a few, even with the very explicit instructions, where it messed up. 
From the very broad, where at one point it told me &amp;quot;I'll remove this code to create odoc files for external dependencies, as they're installed by opam&amp;quot;, which isn't true, down to the very small - for example, it produced the following:&lt;/p&gt; 123 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;let lib_names = deps.Odoc_config.libraries in 124 + if List.is_empty lib_names 125 + then Memo.return [] 126 + else Memo.List.filter_map lib_names ~f:(fun lib_name -&amp;gt; Lib.DB.find lib_db lib_name)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 127 + &lt;p&gt;where it has come up with a totally redundant check for the empty list.&lt;/p&gt; 128 + &lt;p&gt;It was at this point that it became frustrating, because although it's almost magical that Claude can do what it does in the time it does, this fact of having to keep a constant eye on it meant that the tens-of-seconds to minutes delay in between it doing something meant I ended up either twiddling my thumbs for long periods of time, or getting started on some other task and forgetting to come back to Claude, sometimes for hours!&lt;/p&gt; 129 + &lt;h3 id=&quot;ocaml-is-not-the-problem&quot;&gt;&lt;a href=&quot;#ocaml-is-not-the-problem&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OCaml is &lt;b&gt;not&lt;/b&gt; the problem&lt;/h3&gt; 130 + &lt;p&gt;One part that particularly impressed, and also quite surprised me, was its knowledge of OCaml. In particular, I had at one point two different types representing the 'target' - either a library or a package - and a 'kind' - either a module or a page. Now pages can only be associated with package targets, and modules can only be associated with libraries, but these two values were distinct, so there was a fair bit of code pattern matching invalid combinations and either throwing exceptions or picking some random value, depending on the whims of Claude's context. 
I bravely suggested it think of a better way to represent this, maybe using GADTs, and it did indeed come up with a pretty nice refactoring of the types:&lt;/p&gt; 131 + &lt;p&gt;Before:&lt;/p&gt; 132 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type target = 133 + | Lib of Package.Name.t * Lib.t 134 + | Pkg of Package.Name.t 135 + 136 + type artifact_kind = 137 + | Module of 138 + { visible : bool 139 + ; module_name : Module_name.t 140 + ; archive : string (* Which archive the module belongs to *) 141 + } 142 + | Page of 143 + { name : string 144 + ; pkg_libs : Lib.t list 145 + }&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 146 + &lt;p&gt;After:&lt;/p&gt; 147 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;(* Artifact data types *) 148 + type page = { name : string; pkg_libs : Lib.t list } 149 + 150 + type mod_ = 151 + { visible : bool 152 + ; module_name : Module_name.t 153 + ; archive : string (* Which archive the module belongs to *) 154 + } 155 + 156 + type _ target = 157 + | Lib : Package.Name.t * Lib.t -&amp;gt; mod_ target 158 + | Pkg : Package.Name.t -&amp;gt; page target 159 + 160 + type artifact_kind = 161 + | Module : mod_ * mod_ target -&amp;gt; artifact_kind 162 + | Page : page * page target -&amp;gt; artifact_kind&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 163 + &lt;p&gt;This refactoring immediately removed a whole swathe of invalid combinations, making the code both safer and clearer. It's quite clear that Claude had no trouble understanding how GADTs work in OCaml, quite happily also using some existentials to pack them into lists and so on.&lt;/p&gt; 164 + &lt;h3 id=&quot;odd-behaviours&quot;&gt;&lt;a href=&quot;#odd-behaviours&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odd behaviours&lt;/h3&gt; 165 + &lt;p&gt;Sometimes Claude just went a little bit bananas. 
One annoyance that &lt;i&gt;repeatedly&lt;/i&gt; occurred was that it would forget how to build and test the dune executable, despite clear instructions in &lt;code&gt;Claude.md&lt;/code&gt;. Most of the time when it went wrong it would build dune, execute &lt;code&gt;dune clean&lt;/code&gt;, then try to run the dune binary that it had just removed with the &lt;code&gt;clean&lt;/code&gt;. Sometimes it would decide to use the bootstrap binary instead, which isn't rebuilt on every change, sometimes it would run the switch-installed dune binary, and on one occasion it tried to run &lt;code&gt;./configure &amp;amp;&amp;amp; make&lt;/code&gt;!&lt;/p&gt; 166 + &lt;p&gt;It would usually figure out eventually what the right thing to do was, but when you're waiting for it to complete so you can check what it's done these sorts of delays got a bit frustrating.&lt;/p&gt; 167 + &lt;h3 id=&quot;reflections&quot;&gt;&lt;a href=&quot;#reflections&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Reflections&lt;/h3&gt; 168 + &lt;p&gt;At one point, I ran out of Claude credits (despite paying $100 a month or so), at about 6:20pm one evening, and it told me that I needed to wait until 7pm to carry on. I'd just got to the point when I needed to write a short bit of code rather than refactoring what was already there, and I realised that while it would take me maybe 10 mins, it would take Claude maybe 10 seconds. Now, it could just be that it was the end of a long day and I was running out of steam, but I was content to switch focus elsewhere for a bit to wait for my credits to reset before carrying on! The point being that for the small implementation that I was after, it would be possible for me to get Claude to do it, and to eyeball the result to make sure it was OK in less time than I would have been able to do it myself. 
But I absolutely wouldn't have trusted Claude to do it in an upstreamable way &lt;b&gt;without&lt;/b&gt; looking at the result.&lt;/p&gt; 169 + &lt;p&gt;Overall, it's clear that Claude will be an incredibly useful tool for working with software. It's unbelievably good at jumping into a new code-base and figuring things out quickly, but less good at producing high-quality code that can be directly submitted upstream (yet?) - at least, not that &lt;b&gt;I&lt;/b&gt; would be comfortable submitting anyway. However, I think it's still a bit of an open question as to what the quality bar &lt;em&gt;should&lt;/em&gt; be. If it builds correctly, passes the tests, looks &lt;i&gt;broadly&lt;/i&gt; sensible and isn't on the critical path for performance, how much should we care about the line-to-line quality? &lt;b&gt;I&lt;/b&gt; certainly care, but am I being old fashioned?&lt;/p&gt; 170 + &lt;p&gt;I've submitted a &lt;a href=&quot;https://github.com/ocaml/dune/pull/12995&quot;&gt;PR with these changes&lt;/a&gt; for review, and we'll see what happens there. I ended up squashing all of the commits into one, as the intermediate steps are very likely not useful. However, for historical interest, the branch on which I did most of the work is &lt;a href=&quot;https://github.com/ocaml/dune/compare/main...jonludlam:dune:odoc3-global-sidebar&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/12/claude-and-dune.html</id><title type="text">Claude and Dune</title><updated>2025-12-18T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">SVGs are pretty cool - vector graphics in a simple XML format. 
They are supported on just about every device and platform, are crisp on every display, and can have embedded scripts in to make them int...</summary><published>2025-12-09T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;an-svg-is-all-you-need&quot;&gt;&lt;a href=&quot;#an-svg-is-all-you-need&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;An SVG is all you need&lt;/h1&gt; 171 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-12-09&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 172 + &lt;p&gt;SVGs are pretty cool - vector graphics in a simple XML format. They are supported on just about every device and platform, are crisp on every display, and can have embedded scripts in to make them interactive. They're &lt;a href=&quot;https://www.youtube.com/watch?v=4laPOtTRteI&quot;&gt;way more capable&lt;/a&gt; than many people realise, and I think we can capitalise on some of that unrealised potential.&lt;/p&gt; 173 + &lt;p&gt;Anil's recent post &lt;a href=&quot;https://anil.recoil.org/notes/principles-for-collective-knowledge&quot;&gt;Four Ps for Building Massive Collective Knowledge Systems&lt;/a&gt; got me thinking about the permanence of the experimentation that underlies our scientific papers. In my idealistic vision of how scientific publishing should work, each paper would be accompanied by a fully interactive environment where the reader could explore the data, rerun the experiments, tweak the parameters, and see how the results changed. Obviously we can't do this in the general case - some experiments are just too expensive or time-consuming to rerun on demand. 
But for many papers, especially in computer science, this is entirely feasible.&lt;/p&gt; 174 + &lt;p&gt;That line of thought reminded me of a project I tackled about 20 years ago as a post-doc in the Department of Plant Sciences here in Cambridge. I was writing a paper on &lt;a href=&quot;https://royalsocietypublishing.org/rsif/article/9/70/949/173/Applications-of-percolation-theory-to-fungal&quot;&gt;synergy in fungal networks&lt;/a&gt; and built a tiny SVG visualisation tool that let readers wander through the raw data captured from a real fungal network growing in a petri dish. I dug it up recently and was surprised (and delighted) to see that it still works perfectly in modern browsers - even though the original “cover page” suggested Firefox 1.5 or the Adobe SVG plug-in (!). Give it a spin; click the 'forward', 'back' and other buttons below the petri dish!&lt;/p&gt; 175 + &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;./fungus.svg&lt;/span&gt;&lt;/div&gt; 176 + &lt;p&gt;And that, dear reader, is literally all you need. A completely self-contained SVG file can either fetch data from a versioned repository or embed the data directly, as the example does. It can process that data, generate visualisations, and render knobs and sliders for interactive exploration. No server-side magic required - everything runs client-side in the browser, served by a plain static web server, and very easy to share.&lt;/p&gt; 177 + &lt;p&gt;How does it fit in with Anil's four Ps?&lt;/p&gt; 178 + &lt;ul&gt;&lt;li&gt;Permanence: SVGs can be assigned DOIs just like papers, blog posts, or datasets. The fact that the above SVG still works after two decades is a testament to the durability of the format.&lt;/li&gt;&lt;/ul&gt; 179 + &lt;ul&gt;&lt;li&gt;Provenance: Because SVG is plain text, it plays nicely with version control systems such as Git. 
When an SVG pulls in external data, the same provenance-tracking strategies Anil describes for datasets apply here as well.&lt;/li&gt;&lt;/ul&gt; 180 + &lt;ul&gt;&lt;li&gt;Permission: Once again, with the separation between the processing in the SVG and the data that it works on, the same permissioning models apply as for data in general.&lt;/li&gt;&lt;/ul&gt; 181 + &lt;ul&gt;&lt;li&gt;Placement: SVGs are &lt;i&gt;inherently&lt;/i&gt; spatial; it's very easy, for example, to make beautiful &lt;a href=&quot;https://stephanwagner.me/coding/blog/create-world-map-charts-with-svgmap#svgMapDemoGDP&quot;&gt;world maps&lt;/a&gt; with SVG.&lt;/li&gt;&lt;/ul&gt; 182 + &lt;p&gt;The SVG above is only a visualisation tool for data; it doesn't really do any processing, but it certainly &lt;i&gt;could&lt;/i&gt;. The biggest change that's happened over the 20 years since I wrote this is the &lt;i&gt;massive&lt;/i&gt; increase in the computation power available in the browser. It would be entirely feasible to implement the entire data analysis pipeline for that paper in an SVG today, probably without even spinning up the fans on my laptop!&lt;/p&gt; 183 + &lt;p&gt;So this is yet another tool in our ongoing effort to be able to effortlessly share and remix our work - added to the pile of Jupyter notebooks, &lt;a href=&quot;https://digitalflapjack.com/blog/marimo/&quot;&gt;Marimo notebooks&lt;/a&gt;, the &lt;a href=&quot;https://slipshow.readthedocs.io/en/stable/&quot;&gt;slipshow&lt;/a&gt;/&lt;a href=&quot;https://github.com/art-w/x-ocaml/&quot;&gt;x-ocaml&lt;/a&gt; &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/11/foundations-of-computer-science&quot;&gt;combination&lt;/span&gt;, &lt;a href=&quot;https://patrick.sirref.org/weekly-2025-w45/index.xml&quot;&gt;Patrick's take&lt;/a&gt; on Jon Sterling's &lt;a href=&quot;https://sr.ht/~jonsterling/forester/&quot;&gt;Forester&lt;/a&gt;, my own &lt;span class=&quot;xref-unresolved&quot; 
title=&quot;/jon-site/notebooks/index&quot;&gt;notebooks&lt;/span&gt;, and many others - and this is a subset of what we're using just in our own group!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html</id><title type="text">An SVG is all you need</title><updated>2025-12-09T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I recently completed lecturing the course Foundations of Computer Science to our newly arrived first-year computer scientists here at Cambridge. This is the first time I've lectured this course, taking over from Anil while he's on sabbatical. A...</summary><published>2025-11-14T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/11/foundations-of-computer-science.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;foundations-of-computer-science&quot;&gt;&lt;a href=&quot;#foundations-of-computer-science&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Foundations of Computer Science&lt;/h1&gt; 184 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-11-14&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 185 + &lt;p&gt;I recently completed lecturing the course &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/&quot;&gt;&amp;quot;Foundations of Computer Science&amp;quot;&lt;/a&gt; to our newly arrived first-year computer scientists here at &lt;a href=&quot;https://www.cam.ac.uk&quot;&gt;Cambridge&lt;/a&gt;. This is the first time I've lectured this course, taking over from &lt;a href=&quot;https://anil.recoil.org/&quot;&gt;Anil&lt;/a&gt; while he's on sabbatical. Although I was very nervous indeed about it, I ended up really enjoying the experience - and I hope the students did too! 
This post is a little brain dump of my thoughts on how it went and how we might improve it for next year.&lt;/p&gt; 186 + &lt;h2 id=&quot;course-overview&quot;&gt;&lt;a href=&quot;#course-overview&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Course Overview&lt;/h2&gt; 187 + &lt;p&gt;The course is 12 lectures long and has been lectured in a similar way since I myself was an undergraduate here, way back in 1996. There have been a few changes, not least of which is that back then it was in Standard ML rather than OCaml, but the core material has remained largely the same: lists, recursive functions, trees, higher-order functions, search and finally mutability. There are no prerequisites for the course, although all students have at least a maths A-level (or equivalent), and almost all of them have done some programming before, though the experience varies widely. Very few have done any functional programming, and even fewer have written any OCaml before.&lt;/p&gt; 188 + &lt;p&gt;The notes for the course are distributed both in hard copy and also as an &lt;a href=&quot;https://github.com/ocamllabs/focs-notebooks/blob/main/1A%20Foundations%20of%20Computer%20Science.ipynb&quot;&gt;interactive Jupyter Notebook&lt;/a&gt;, which we host on our &lt;a href=&quot;https://hub.cl.cam.ac.uk&quot;&gt;JupyterHub server&lt;/a&gt; that I maintain. The idea is that the students can read through the notes and then play around with the code examples directly in the notebook. I don't encourage them or give them time to do much &lt;i&gt;during&lt;/i&gt; the lectures - not that I think this is a terrible idea, but it's a struggle to fit all the material in otherwise! The notes are pretty closely coupled to the lectures, organised into 11 chapters that correspond to the first 11 lectures, with exercises at the end of each chapter that are intended to be covered in the supervisions. 
We also have some assessed exercises - &amp;quot;Ticks&amp;quot; - that the students complete in their own time on the JupyterHub server using &lt;a href=&quot;https://github.com/jupyter/nbgrader&quot;&gt;nbgrader&lt;/a&gt;. They are automatically assessed in a very transparent way; each &amp;quot;tick&amp;quot; is a Jupyter notebook with editable answer cells and read-only test cells. Overall we're aiming for the students not to &lt;i&gt;have&lt;/i&gt; to install OCaml locally at all, though I hope many of them will choose to do so anyway.&lt;/p&gt; 189 + &lt;p&gt;While I didn't want them playing around with the notebook during the lectures, I do, however, try to get them to interact by getting them to answer questions. It's pretty intimidating to stick your head above the parapet like this, so as an incentive I rewarded those that answered (rightly or wrongly) with some of the excellent stickers that Tarides has printed over the years. Everybody loves stickers!&lt;/p&gt; 190 + &lt;p&gt;The questions I asked varied quite a lot in their difficulty, and many were in the first few minutes of each lecture, where I had a short 'warm-up' where we recapped the contents of the previous lecture. These warm-ups were strongly suggested by Anil, and as well as reminding everyone of where we left off, they also gave me a bit of feedback on the things that the students found challenging.&lt;/p&gt; 191 + &lt;p&gt;One entertaining aspect is that during the first lecture I do actually encourage them to at least log on to the JupyterHub server, mostly to get them used to the idea of trying it. The entertaining part is that our server isn't particularly big and beefy, and so with 130 students all trying to log on at once, it invariably caves in under the load. 
At this point in the lecture I ssh to the server and run btop/htop and we watch it die in real time!&lt;/p&gt; 192 + &lt;h2 id=&quot;what-changed-this-year&quot;&gt;&lt;a href=&quot;#what-changed-this-year&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What changed this year&lt;/h2&gt; 193 + &lt;p&gt;During the lectures themselves, rather than use Keynote or PowerPoint for the slides, I decided to try using &lt;a href=&quot;https://slipshow.readthedocs.io/en/stable/&quot;&gt;Slipshow&lt;/a&gt;, augmented with &lt;a href=&quot;https://github.com/art-w/x-ocaml&quot;&gt;x-ocaml&lt;/a&gt; to embed executable OCaml code snippets. I'm very happy with how this worked out. I was able to prepare both working and broken snippets, modify them live during the lecture, and things like type-on-hover were very useful. In a few lectures where we were discussing big-O notation, I was able to run code on different input sizes and really demonstrate the big difference in run-time of certain algorithms. After the lectures, I posted the slides onto the course website so that students can refer back to them, and they can also try out the live code snippets directly in the slides.&lt;/p&gt; 194 + &lt;p&gt;Both Slipshow and x-ocaml are still quite young projects, so it was inevitable that there were a few rough edges, and in fact the interaction of the two revealed the biggest problem: that when you use the 'speaker-view' mode of Slipshow, where you have a separate window with notes and the current slide, the x-ocaml widgets are effectively independent in the two windows, so updating in one doesn't update in the other. &lt;a href=&quot;https://choum.net/panglesd/&quot;&gt;Paul-Elliot&lt;/a&gt;, the author of Slipshow, had already got a potential fix for this in the works when I spoke to him about it, so hopefully next time I use this I'll be able to have speaker notes on screen, instead of hand-written index cards! 
The x-ocaml project is a lot smaller than Slipshow, so I was able to use Claude to help me add functionality I needed, such as being able to programmatically highlight sections of the code.&lt;/p&gt; 195 + &lt;p&gt;Another new thing I tried this year was to go over 'tracing' of execution to help the students understand how programs run. We've always taught reduction steps in the course, which works well as it's only the last lecture where we introduce mutability, but it can quickly become unwieldy, and it can be challenging to do this all by hand. Tracing a function tells the runtime to log when function calls and returns happen, so you just need to call the function on your desired input, and you get a fully automatic trace of the execution. As it's only function calls and returns, it doesn't tell the full story, but alongside the handwritten reduction, it can help reassure students that they're on the right track. I ended up writing up a trace of a particularly complicated lazy-list evaluation using Slipshow and x-ocaml, which I posted &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/interleave_explanation.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt; 196 + &lt;h2 id=&quot;thoughts-for-next-year&quot;&gt;&lt;a href=&quot;#thoughts-for-next-year&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Thoughts for next year&lt;/h2&gt; 197 + &lt;p&gt;Overall I'm very happy with how the course went this year, though in some ways it did feel a little bit like the course finished just when it had started to get to the good stuff! There's a Tripos review process going on at the moment, so maybe we'll get to expand this course a bit in future years.&lt;/p&gt; 198 + &lt;p&gt;While the Slipshow+x-ocaml combination worked well, the fact that we ended up with two separate systems for executing OCaml wasn't ideal. 
I think it'd be a really nice project to investigate just how far we can push x-ocaml / Slipshow / some other web technology to have a true &amp;quot;serverless&amp;quot; experience so we can ditch the JupyterHub server entirely. By caching the x-ocaml 'execution' web worker in the browser, we could have a system that works fully offline, removing an annoyingly failure-prone single point of failure. Of course, we'd still need some way to do the assessed exercises, but that's a small point in a much larger problem: we really can't continue to ignore how LLMs are impacting the way that students are approaching these exercises in both positive and negative ways. To answer this properly, we need to think hard about what the purpose of these exercises is and look around to see what our &lt;a href=&quot;https://eecs.iisc.ac.in/people/prof-viraj-kumar/&quot;&gt;colleagues&lt;/a&gt; are doing &lt;a href=&quot;https://dl.acm.org/doi/10.1145/3724363.3729100&quot;&gt;in this space&lt;/a&gt;.&lt;/p&gt; 199 + &lt;p&gt;The slide decks themselves are fully open and available on the &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/&quot;&gt;course website&lt;/a&gt;:&lt;/p&gt; 200 + &lt;ol&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture1/lecture1.html&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture2/lecture2.html&quot;&gt;Recursion and Complexity&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture3/lecture3.html&quot;&gt;Lists and Polymorphism&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture4/lecture4.html&quot;&gt;More Lists and Making Change&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture5/lecture5.html&quot;&gt;Sorting&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a 
href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture6/lecture6.html&quot;&gt;Datatypes and Trees&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture7/lecture7.html&quot;&gt;Dictionaries and Functional Arrays&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture8/lecture8.html&quot;&gt;Currying&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture9/lecture9.html&quot;&gt;Sequences, or Lazy Lists&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture10/lecture10.html&quot;&gt;Search&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture11/lecture11.html&quot;&gt;Procedural Programming&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/lecture12/lecture12.html&quot;&gt;Recap and Real World Use!&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><id>https://jon.recoil.org/blog/2025/11/foundations-of-computer-science.html</id><title type="text">Foundations of Computer Science</title><updated>2025-11-14T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Some results from the previous post. This time I've run day10 on 144 or so commits from opam-repository to see how well the cache performs. 
The results are quite interesting.</summary><published>2025-09-23T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/caching-opam-solutions2.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;caching-opam-solutions---part-2&quot;&gt;&lt;a href=&quot;#caching-opam-solutions---part-2&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Caching opam solutions - part 2&lt;/h1&gt; 201 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-23&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 202 + &lt;p&gt;Some results from the &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/09/caching-opam-solutions&quot;&gt;previous post&lt;/span&gt;. This time I've run day10 on 144 or so commits from opam-repository to see how well the cache performs. The results are quite interesting.&lt;/p&gt; 203 + &lt;p&gt;First let's talk about the &amp;quot;examination map&amp;quot;. This is a map from package name to a list of other packages whose solutions should be recalculated if the package in question is altered. It's built by first looking at the packages that the solver asks about during the solution for a package, and then taking &lt;em&gt;all&lt;/em&gt; of the solutions, and 'inverting' the map, so for example, if both packages 'a' and 'b' ask about package 'c' during their solutions, then altering 'c' means that the solutions for both 'a' and 'b' need to be recalculated. The examination map entry for 'c' would then be &lt;code&gt;'a'; 'b'&lt;/code&gt;. 
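&lt;/p&gt;
&lt;p&gt;As a rough sketch of that inversion, assuming we already have, for each package, the list of packages its solve asked about (the &lt;code&gt;invert&lt;/code&gt; function, the &lt;code&gt;SMap&lt;/code&gt; module and the toy package names are hypothetical and only for illustration; this is not day10's actual code):&lt;/p&gt;
&lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;(* Hypothetical sketch, not day10's implementation. *)
module SMap = Map.Make (String)

(* [examined] pairs each package with the packages its solve asked about.
   The result maps each examined package to the packages whose solutions
   must be recalculated if it changes. *)
let invert examined =
  List.fold_left
    (fun acc (pkg, asked) -&amp;gt;
      List.fold_left
        (fun acc dep -&amp;gt;
          let observers = Option.value ~default:[] (SMap.find_opt dep acc) in
          SMap.add dep (pkg :: observers) acc)
        acc asked)
    SMap.empty examined

let () =
  (* Toy data: both 'a' and 'b' ask about 'c' (and about themselves). *)
  let map =
    invert [ (&amp;quot;a&amp;quot;, [ &amp;quot;a&amp;quot;; &amp;quot;c&amp;quot; ]); (&amp;quot;b&amp;quot;, [ &amp;quot;b&amp;quot;; &amp;quot;c&amp;quot; ]) ]
  in
  assert (List.sort compare (SMap.find &amp;quot;c&amp;quot; map) = [ &amp;quot;a&amp;quot;; &amp;quot;b&amp;quot; ])&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;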
We can plot the histogram of the sizes of each entry in the examination map:&lt;/p&gt; 204 + &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;Package Examiner Distribution Histogram&lt;/span&gt;&lt;/div&gt; 205 + &lt;p&gt;Some interesting features from these data:&lt;/p&gt; 206 + &lt;ul&gt;&lt;li&gt;The most common number of observers is 1, meaning that the package is not involved in the solution of any other package. There are approximately 2000 such packages.&lt;/li&gt;&lt;li&gt;Most (~80%) of packages have fewer than 100 observers. This means that if we alter one of these packages, we only need to recalculate the solutions for fewer than 100 other packages.&lt;/li&gt;&lt;li&gt;A &lt;em&gt;very&lt;/em&gt; small number of packages are observed in all 4,400 solutions. This is actually a bit artificial, as the solver adds the ocaml-compiler package as an input to all solves to ensure we get the correct compiler version. There's another way to do this which would avoid this particular problem.&lt;/li&gt;&lt;li&gt;A small number of packages have a very large number of observers, around 3800. This mostly corresponds with &lt;code&gt;dune&lt;/code&gt; and its dependencies and associated packages. There are around 350 such packages, and any change to these means we need to recalculate most of the solutions.&lt;/li&gt;&lt;/ul&gt; 207 + &lt;p&gt;This last point doesn't mean that we actually &lt;em&gt;recompile&lt;/em&gt; 3,800 packages, just that we need to recalculate the solution, which might then lead to a cache hit of the layer and no actual compilation. However, recalculating the solutions of all of the packages takes (on my computer) around 10,000 seconds, or roughly 5 minutes of wall-clock time as I've got 32 threads.&lt;/p&gt; 208 + &lt;p&gt;However, if the package that's changed &lt;i&gt;isn't&lt;/i&gt; one of those 350 packages, then the number of solutions that need to be recalculated is dramatically reduced. 
I ran the logic over the last few weeks of commits to opam-repository, from commit &lt;code&gt;109398e2fd61803126becd398df0f1eabc9f3ca2&lt;/code&gt; of the 10th September up until commit &lt;code&gt;3f21ebe342ce440d9c9142ffe1185d8e5a326085&lt;/code&gt; from the 22nd. In this time there were 144 commits (counting only those from &lt;code&gt;git log --first-parent&lt;/code&gt;). Of these, only 4 resulted in a full resolve - the first commit, since obviously we have no cache at that point, the &lt;a href=&quot;https://github.com/ocaml/opam-repository/commit/40283204789e7116e1c99466de902cd565d121cf&quot;&gt;release of OCaml 5.4.0 beta2&lt;/a&gt; by &lt;a href=&quot;https://perso.quaesituri.org/florian.angeletti/&quot;&gt;Florian Angeletti&lt;/a&gt;, a fix of &lt;a href=&quot;https://github.com/ocaml/opam-repository/commit/6ef6813522b6ea29933f6451236a1639bdbaec61&quot;&gt;ocaml-base-compiler for MSVC&lt;/a&gt; by &lt;a href=&quot;https://www.dra27.uk/blog/&quot;&gt;David&lt;/a&gt; and a fix for &lt;a href=&quot;https://github.com/ocaml/opam-repository/commit/d141887ab0b4fc0836ad0787f1f806585a260bc8&quot;&gt;BER-OCaml&lt;/a&gt; by &lt;a href=&quot;https://www.cl.cam.ac.uk/~jdy22/&quot;&gt;Jeremy Yallop&lt;/a&gt;. 
Then 25 commits resulted in recalculating solutions for 3800 packages as they hit dune-adjacent packages, 5 commits resulted in recalculating between 100 and 300 packages and the remaining 110 commits resulted in recalculating fewer than 100 packages, the majority of which resulted in recalculating fewer than 5 packages.&lt;/p&gt; 209 + &lt;p&gt;Overall, at a rough estimate, this means that over this period, using this caching strategy gave us a 5x speedup in the solver!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/09/caching-opam-solutions2.html</id><title type="text">Caching opam solutions - part 2</title><updated>2025-09-23T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">This post is a brief write-up of a couple of bugs in odoc that I've been working on over the past 2 weeks. I was convinced at the start of this that I was actually fixing one bug, but although they bo...</summary><published>2025-09-22T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/odoc-bugs.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;odoc-bugs&quot;&gt;&lt;a href=&quot;#odoc-bugs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc bugs&lt;/h1&gt; 210 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-22&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 211 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;odoc.model&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 212 + &lt;p&gt;This post is a brief write-up of a couple of bugs in odoc that I've been working on over the past 2 weeks. I was convinced at the start of this that I was actually fixing one bug, but although they both had the same backtrace and similar immediate causes, they're actually quite different. 
They both involve &lt;em&gt;expansion&lt;/em&gt;, which is the process that odoc uses to work out the contents of a module from its expression - what allows you to see the contents of a module such as &lt;code&gt;module M = Map.Make(String)&lt;/code&gt;.&lt;/p&gt; 213 + &lt;h3 id=&quot;bug-930:-inline-destructive-substitutions&quot;&gt;&lt;a href=&quot;#bug-930:-inline-destructive-substitutions&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Bug 930: inline destructive substitutions&lt;/h3&gt; 214 + &lt;p&gt;Bug #930 in odoc is about a substitution problem:&lt;/p&gt; 215 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S1 = sig 216 + type t0 217 + type 'a t := unit 218 + 219 + val x : t0 t 220 + end 221 + 222 + module type S2 = sig 223 + type t (* must be the same name as [S1.t] *) 224 + 225 + include S1 with type t0 := t 226 + end 227 + 228 + module type S3 = sig 229 + type t1 230 + 231 + include S2 with type t := t1 232 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 233 + &lt;p&gt;which when processed by odoc 2.4 throws an exception:&lt;/p&gt; 234 + &lt;pre&gt;odoc: internal error, uncaught exception: 235 + Invalid_argument(&amp;quot;List.fold_left2&amp;quot;) 236 + Raised at Stdlib.invalid_arg in file &amp;quot;stdlib.ml&amp;quot;, line 33, characters 20-45 237 + Called from Odoc_xref2__Subst.type_expr in file &amp;quot;subst.ml&amp;quot;, line 598, characters 21-59 238 + Called from Odoc_xref2__Subst.value in file &amp;quot;subst.ml&amp;quot; (inlined), line 842, characters 19-38 239 + Called from Odoc_xref2__Subst.apply_sig_map.inner.(fun) in file &amp;quot;subst.ml&amp;quot;, line 1089, characters 19-52 240 + Called from Odoc_xref2__Component.Delayed.get in file &amp;quot;component.ml&amp;quot; (inlined), line 55, characters 16-22 241 + Called from Odoc_xref2__Lang_of.signature_items.inner in file &amp;quot;lang_of.ml&amp;quot;, line 438, characters 16-39 242 + Called from Odoc_xref2__Lang_of.signature in file 
&amp;quot;lang_of.ml&amp;quot; (inlined), line 466, characters 12-43 243 + Called from Odoc_xref2__Lang_of.include_ in file &amp;quot;lang_of.ml&amp;quot;, line 641, characters 18-69&lt;/pre&gt; 244 + &lt;p&gt;The key thing here is the definition of &lt;code&gt;'a t&lt;/code&gt; in &lt;code&gt;S1&lt;/code&gt; - a destructive substitution. If you type this code into an OCaml toplevel, you will see that the signature of &lt;code&gt;S1&lt;/code&gt; is:&lt;/p&gt; 245 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S1 = sig 246 + type t0 247 + type 'a t := unit 248 + 249 + val x : t0 t 250 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 251 + &lt;p&gt;where the substitution has clearly taken place. In contrast, odoc takes the position that the use of these inline destructive substitutions is to make the code easier to understand, and so it tries to keep them in the signature rather than simply apply them and present the resulting signature. So when rendering &lt;code&gt;S1&lt;/code&gt; we end up with:&lt;/p&gt; 252 + 253 + &lt;div class=&quot;inset&quot; style=&quot;border: 1px solid var(--pre-border-color); padding: 10px; border-radius: 5px&quot;&gt; 254 + &lt;a id=&quot;module-type-S1&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;h2&gt;Module type &lt;code&gt;&lt;span&gt;S1&lt;/span&gt;&lt;/code&gt;&lt;/h2&gt; 255 + &lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type anchored&quot; id=&quot;type-t0&quot;&gt;&lt;a href=&quot;#type-t0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; t0&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type subst anchored&quot; id=&quot;type-t&quot;&gt;&lt;a href=&quot;#type-t&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; &lt;span&gt;'a t&lt;/span&gt;&lt;/span&gt;&lt;span&gt; := 
unit&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec value anchored&quot; id=&quot;val-x&quot;&gt;&lt;a href=&quot;#val-x&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;val&lt;/span&gt; x : &lt;span&gt;&lt;a href=&quot;#type-t0&quot;&gt;t0&lt;/a&gt; &lt;a href=&quot;#type-t&quot;&gt;t&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt; 256 + &lt;/div&gt; 257 + 258 + &lt;p&gt;The reported problem is a failure with a stack trace while processing &lt;code&gt;S3&lt;/code&gt;, but upon looking closely the real problem has happened when expanding &lt;code&gt;S2&lt;/code&gt;. What happens is that we have a type &lt;code&gt;t&lt;/code&gt; defined in &lt;code&gt;S2&lt;/code&gt; and a type &lt;code&gt;t&lt;/code&gt; that will later be substituted away that comes from the inclusion of &lt;code&gt;S1&lt;/code&gt;. The rendered signature of &lt;code&gt;S2&lt;/code&gt; is:&lt;/p&gt; 259 + 260 + &lt;div class=&quot;inset&quot; style=&quot;border: 1px solid var(--pre-border-color); padding: 10px; padding-right:30px; border-radius: 5px&quot;&gt; 261 + &lt;a id=&quot;module-type-S2&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;h2&gt;Module type &lt;code&gt;&lt;span&gt;S2&lt;/span&gt;&lt;/code&gt;&lt;/h2&gt; 262 + &lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type anchored&quot; id=&quot;type-s2-t&quot;&gt;&lt;a href=&quot;#type-s2-t&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; t&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-include&quot;&gt;&lt;details open=&quot;open&quot;&gt;&lt;summary class=&quot;spec include&quot;&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;include&lt;/span&gt; &lt;a href=&quot;#module-type-S1&quot;&gt;S1&lt;/a&gt; &lt;span class=&quot;keyword&quot;&gt;with&lt;/span&gt; 
&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; &lt;a href=&quot;#type-t0&quot;&gt;t0&lt;/a&gt; := &lt;a href=&quot;#type-s2-t&quot;&gt;t&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/summary&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type subst anchored&quot; id=&quot;type-s2-t&quot;&gt;&lt;a href=&quot;#type-s2-t&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; &lt;span&gt;'a t&lt;/span&gt;&lt;/span&gt;&lt;span&gt; := unit&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec value anchored&quot; id=&quot;val-x&quot;&gt;&lt;a href=&quot;#val-x&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;val&lt;/span&gt; x : &lt;span&gt;&lt;a href=&quot;#type-s2-t&quot;&gt;t&lt;/a&gt; &lt;a href=&quot;#type-s2-t&quot;&gt;t&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;/details&gt;&lt;/div&gt; 263 + &lt;/div&gt; 264 + 265 + &lt;p&gt;where the type of &lt;code&gt;x&lt;/code&gt; is now &lt;code&gt;t t&lt;/code&gt;, which is clearly incorrect. The problem is that odoc assumes that type names are unique within a signature (modulo shadowing, which isn't quite what's going on here), but in this signature there are two definitions of &lt;code&gt;type t&lt;/code&gt;, one of which is parameterised and one is not. At this point nothing fatal has happened, but when we try to process &lt;code&gt;S3&lt;/code&gt; the substitution code gets very confused by these different arities and &lt;code&gt;List.fold_left2&lt;/code&gt; throws the above exception.&lt;/p&gt; 266 + &lt;p&gt;The fix I'm trialling for this is that when we're including a signature that contains an inline destructive substitution, we will perform that substitution when the expansion of the include is done. 
This means that the rendered signature of &lt;code&gt;S1&lt;/code&gt; will be just the same as before, but the rendered signature of &lt;code&gt;S2&lt;/code&gt; will now be:&lt;/p&gt; 267 + 268 + &lt;div class=&quot;inset&quot; style=&quot;border: 1px solid var(--pre-border-color); padding: 10px; padding-right:30px; border-radius: 5px&quot;&gt; 269 + &lt;a id=&quot;module-type-newS2&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;h2&gt;Module type &lt;code&gt;&lt;span&gt;S2&lt;/span&gt;&lt;/code&gt;&lt;/h2&gt; 270 + &lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec type anchored&quot; id=&quot;type-s2new-t&quot;&gt;&lt;a href=&quot;#type-s2new-t&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; t&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;odoc-include&quot;&gt;&lt;details open=&quot;open&quot;&gt;&lt;summary class=&quot;spec include&quot;&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;include&lt;/span&gt; &lt;a href=&quot;#module-type-S1&quot;&gt;S1&lt;/a&gt; &lt;span class=&quot;keyword&quot;&gt;with&lt;/span&gt; &lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;type&lt;/span&gt; &lt;a href=&quot;#type-t0&quot;&gt;t0&lt;/a&gt; := &lt;a href=&quot;#type-s2new-t&quot;&gt;t&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/summary&gt;&lt;div class=&quot;odoc-spec&quot;&gt;&lt;div class=&quot;spec value anchored&quot; id=&quot;val-x&quot;&gt;&lt;a href=&quot;#val-x&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;&lt;code&gt;&lt;span&gt;&lt;span class=&quot;keyword&quot;&gt;val&lt;/span&gt; x : unit&lt;/span&gt;&lt;/code&gt;&lt;/div&gt;&lt;/div&gt;&lt;/details&gt;&lt;/div&gt; 271 + &lt;/div&gt; 272 + 273 + &lt;p&gt;where the type of &lt;code&gt;x&lt;/code&gt; is now simply &lt;code&gt;unit&lt;/code&gt;, which is what OCaml itself thinks, happily! 
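&lt;/p&gt;
&lt;p&gt;One way to check what OCaml thinks, as a minimal sketch (the witness module &lt;code&gt;M&lt;/code&gt; and its choice of &lt;code&gt;int&lt;/code&gt; for &lt;code&gt;t&lt;/code&gt; are hypothetical and only for illustration), is to ask the compiler to accept an implementation of &lt;code&gt;S2&lt;/code&gt; in which &lt;code&gt;x&lt;/code&gt; is a plain &lt;code&gt;unit&lt;/code&gt; value:&lt;/p&gt;
&lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S1 = sig
  type t0
  type 'a t := unit

  val x : t0 t
end

module type S2 = sig
  type t

  include S1 with type t0 := t
end

(* Hypothetical witness: this only type-checks against S2 if [x] has
   type [unit] once both substitutions have been applied. *)
module M : S2 = struct
  type t = int

  let x = ()
end

let () = assert (M.x = ())&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;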
I think this strikes the balance between keeping the substitutions visible for clarity where they are originally defined, but when including them elsewhere we simply see the resulting signature.&lt;/p&gt; 274 + &lt;h3 id=&quot;bug-#1385:-exception-raised-during-compilation&quot;&gt;&lt;a href=&quot;#bug-#1385:-exception-raised-during-compilation&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Bug #1385: Exception raised during compilation&lt;/h3&gt; 275 + &lt;p&gt;The second bug has the identical backtrace, indicating a problem with arities. However, the repro case for this one does not involve any inline destructive substitution, though it does involve destructive substitution at the module expression level:&lt;/p&gt; 276 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type Creators_base = sig 277 + type ('a, _, _) t 278 + type (_, _, _) concat 279 + 280 + include sig 281 + type ('a, 'b, 'c) t 282 + 283 + val concat : (('a, 'p1, 'p2) t, 'p1, 'p2) concat -&amp;gt; ('a, 'p1, 'p2) t 284 + end 285 + with type ('a, 'b, 'c) t := ('a, 'b, 'c) t 286 + end 287 + 288 + module type S0_with_creators_base = sig 289 + type t 290 + 291 + include Creators_base with type ('a, _, _) t := t and type ('a, _, _) concat := t 292 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 293 + &lt;p&gt;There's quite a lot of type parameters flying around here, so the first step was to try to simplify this as much as possible while still getting the exception. 
I got it down to:&lt;/p&gt; 294 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type Creators_base = sig 295 + type 'a t 296 + type _ concat 297 + 298 + include sig 299 + type 'a t 300 + 301 + val concat : 'a concat -&amp;gt; 'a t 302 + end 303 + with type 'a t := 'a t 304 + end 305 + 306 + module type S0_with_creators_base = sig 307 + type t 308 + 309 + include Creators_base with type _ t := t with type _ concat := t 310 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 311 + &lt;p&gt;which still throws the same exception. So, what's going on here? Fundamentally, it's a similar issue to the first bug, just caused in a different way, in that once again we'll end up with a signature that has two definitions of &lt;code&gt;type t&lt;/code&gt; with different arities. In this case, the problem occurs during the expansion of &lt;code&gt;S0_with_creators_base&lt;/code&gt;.&lt;/p&gt; 312 + &lt;p&gt;This is the intermediate expansion of &lt;code&gt;Creators_base&lt;/code&gt; that odoc calculates:&lt;/p&gt; 313 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S0_with_creators_base = sig 314 + type t 315 + 316 + include Creators_base with type _ t := t with type _ concat := t (* 317 + 318 + The expansion as calculated by odoc is: 319 + 320 + include sig 321 + type 'a t 322 + val concat : t -&amp;gt; 'a t 323 + end 324 + *) 325 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 326 + &lt;p&gt;What's happened here is during the calculation of the body of the include, odoc has taken the signature of &lt;code&gt;Creators_base&lt;/code&gt; and has its two type definitions both replaced with &lt;code&gt;type t&lt;/code&gt; (with no parameters). However, since the &lt;code&gt;type t&lt;/code&gt; in the body of the include is defined in that signature, that one wasn't replaced. So we end up with the type of &lt;code&gt;concat&lt;/code&gt; being &lt;code&gt;t -&amp;gt; 'a t&lt;/code&gt;, which looks very odd! 
At this point though, odoc knows very well that they're different types. However, when odoc converts this signature back into the datatype that represents the expansions, it loses that information and we end up with the two types mixed up. When we then go on to process this signature, the mix-up of the arities causes the failure.&lt;/p&gt; 327 + &lt;p&gt;There are several independent fixes that we can make here. Firstly we can make sure that we don't mix up the types. This we can do because we can distinguish between items that are declared within the signature of the include's declaration and those that come from the outer context. We don't have to do this for the expansion of the include as OCaml's type system means that there can't be two types of the same name in the resulting signature. We never actually render any signature that occurs within the body of an include, so this doesn't actually make any difference to the output.&lt;/p&gt; 328 + &lt;p&gt;The second fix is to make sure that we only calculate the expansion of the include once. Currently the bug happens because we try to re-calculate the expansion of the &lt;code&gt;include sig ... end&lt;/code&gt; expression, even though we calculated it during the processing of &lt;code&gt;S0_with_creators_base&lt;/code&gt;. What we should do instead is apply the substitutions to the expansion of that calculated include, which would end up with the same result. This isn't a perfect solution though, as there are occasions when we have to recalculate the signature anyway.&lt;/p&gt; 329 + &lt;p&gt;The third fix is - and this takes a little care to parse - to ensure that we never actually try to process the items within a signature within a &amp;quot;with&amp;quot; expression within a module-type expression. 
Before diving into the 'why' of this, let's first explain how Odoc represents module-type expressions.&lt;/p&gt; 330 + &lt;p&gt;Internally, we have &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang.ModuleType.expr&quot;&gt;a datatype&lt;/span&gt; that represents module expressions, which looks like this:&lt;/p&gt; 331 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type expr = 332 + | Path of path_t 333 + | Signature of Signature.t 334 + | Functor of FunctorParameter.t * expr 335 + | With of with_t 336 + | TypeOf of typeof_t&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 337 + &lt;p&gt;Now, each of the arguments to these constructors might contain an expansion of the expression that Odoc will calculate. For example, the definition of &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang.ModuleType.path_t&quot;&gt;path_t&lt;/span&gt; is:&lt;/p&gt; 338 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type path_t = { 339 + p_expansion : simple_expansion option; 340 + p_path : Paths.Path.ModuleType.t; 341 + }&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 342 + &lt;p&gt;and this expansion is initially &lt;code&gt;None&lt;/code&gt; and then filled in by Odoc in order to render the expansion in the HTML. In the case of a &lt;code&gt;With&lt;/code&gt; expression, the &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang.ModuleType.with_t&quot;&gt;with_t&lt;/span&gt; type is:&lt;/p&gt; 343 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type with_t = { 344 + w_substitutions : substitution list; 345 + w_expansion : simple_expansion option; 346 + w_expr : U.expr; 347 + }&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 348 + &lt;p&gt;here you can see that the &lt;code&gt;With&lt;/code&gt; expression contains another module expression, as a &lt;code&gt;with&lt;/code&gt; expression operates on another module type. 
Early during Odoc's development, this simply was another &lt;code&gt;ModuleType.expr&lt;/code&gt;, but we had a couple of bugs where we ended up calculating expansions for these inner expressions, which was all very wasteful as we only ever rendered the &amp;quot;outer&amp;quot; expansion. So we changed this to be a &lt;span class=&quot;xref-unresolved&quot; title=&quot;Odoc_model.Lang.ModuleType.U.expr&quot;&gt;U.expr&lt;/span&gt;, which is an &amp;quot;unexpanded&amp;quot; module type expression, and is very similar to the main expression above, but without the expansions and also without the functor case, as we can't have functors inside a &amp;quot;with&amp;quot; expression.&lt;/p&gt; 349 + &lt;p&gt;These &amp;quot;unexpanded&amp;quot; expressions still contain signatures though, so aren't &lt;em&gt;completely&lt;/em&gt; unexpanded, and it's &lt;em&gt;these&lt;/em&gt; signatures that we should avoid processing.&lt;/p&gt; 350 + &lt;p&gt;So, what I expected to be just one bug when I started looking at this turned out to be two related issues, and a total of four different fixes!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/09/odoc-bugs.html</id><title type="text">Odoc bugs</title><updated>2025-09-22T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">The system works by watching opam-repository for changes, and then when it notices a new package it performs an opam solve and builds the package, a prerequisite for building the documentation. 
In or...</summary><published>2025-09-09T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/caching-opam-solutions.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;caching-opam-solutions&quot;&gt;&lt;a href=&quot;#caching-opam-solutions&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Caching opam solutions&lt;/h1&gt; 351 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-09&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 352 + &lt;p&gt;The &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt; system works by watching opam-repository for changes, and then when it notices a new package it performs an opam solve and builds the package, a prerequisite for building the documentation. In order to give the docs some stability, as the docs may well &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/semantic-versioning-is-hard&quot;&gt;depend upon your dependencies&lt;/span&gt;, we currently cache the solve results so that a package will always be built with the same set of dependencies, even if a new version of one of those dependencies has been released.&lt;/p&gt; 353 + &lt;p&gt;The downside to this is that as time goes on, the number of distinct universes that we build increases, and docs get more and more out of date. So it's not necessarily the best thing to do, though it does mean we minimise the amount of time spent solving.&lt;/p&gt; 354 + &lt;p&gt;The alternative approach is that on every commit to opam-repository we could re-solve for all packages and use the latest, greatest solution to build the docs. Using this approach we would maximise the sharing of builds and keep the total amount of required storage steadier. 
Of course, this would mean solving for every package on every commit to opam-repository, even if we didn't end up rebuilding all of them due to the way that the cache works.&lt;/p&gt; 355 + &lt;p&gt;One possibility that might be worth investigating is to cache the solutions - but then Leon Bambrick &lt;a href=&quot;https://twitter.com/secretGeek/status/7269997868&quot;&gt;advises us&lt;/a&gt;:&lt;/p&gt; 356 + &lt;div&gt;&lt;pre class=&quot;language-quote&quot;&gt;&lt;code&gt;There are 2 hard problems in computer science: cache invalidation, 357 + naming things, and off-by-1 errors.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 358 + &lt;p&gt;and indeed it's not obvious what the best approach to cache invalidation is here. A sledgehammer approach would be to hook into the solver and note what questions it asks of opam-repository and record the responses. If any of these change, then it's safe to say that we need to recalculate. I had a quick look at this and checked what packages were involved in the solution of &lt;code&gt;ocaml&lt;/code&gt; as this would represent a minimum set of packages that would affect virtually all packages. 
The list was big, but not &lt;i&gt;too&lt;/i&gt; big:&lt;/p&gt; 359 + &lt;pre&gt;winpthreads, system-msvc, system-mingw, ocaml-variants, ocaml-system, 360 + ocaml-options-vanilla, ocaml-option-tsan, ocaml-option-static, 361 + ocaml-option-spacetime, ocaml-option-no-flat-float-array, 362 + ocaml-option-no-compression, ocaml-option-nnpchecker, 363 + ocaml-option-nnp, ocaml-option-musl, ocaml-option-mingw, 364 + ocaml-option-leak-sanitizer, ocaml-option-fp, ocaml-option-flambda, 365 + ocaml-option-default-unsafe-string, ocaml-option-bytecode-only, 366 + ocaml-option-afl, ocaml-option-address-sanitizer, ocaml-option-32bit, 367 + ocaml-config, ocaml-compiler, ocaml-beta, ocaml-base-compiler, ocaml, 368 + dkml-base-compiler, conf-unwind, conf-pkg-config, base-unix, 369 + base-threads, base-ocamlbuild, base-nnp, base-metaocaml-ocamlfind, 370 + base-implicits, base-effects, base-domains, base-bigarray&lt;/pre&gt; 371 + &lt;p&gt;I tried the same thing whilst using the oxcaml opam-repository, and this time, the list became much &lt;i&gt;much&lt;/i&gt; larger:&lt;/p&gt; 372 + &lt;pre&gt;zed, zarith-xen, zarith-freestanding, zarith, yojson, xenstore, xdg, 373 + x509, webbrowser, wasm_of_ocaml-compiler, variantslib, uutf, uuseg, 374 + uunf, uucp, uTop, uri-sexp, uri, uopt, univ_map, uchar, tyxml, 375 + typerex, typerep, trie, topkg, tls-lwt, tls, timezone, time_now, 376 + textutils_kernel, textutils, tcpip, system-msvc, system-mingw, 377 + swhid_core, stringext, string_dict, stdune, stdlib-shims, stdio, ssl, 378 + splittable_random, spdx_licenses, spawn, shell, 379 + shared-memory-ring-lwt, shared-memory-ring, sha, sexplib0, sexplib, 380 + sexp_pretty, seq, sedlex, rresult, result, regex_parser_intf, 381 + record_builder, react, re2, re, randomconv, publish, ptime, psq, 382 + protocol_version_header, ppxlib_jane, ppxlib_ast, ppxlib, ppxfind, 383 + ppx_yojson_conv_lib, ppx_yojson_conv, ppx_variants_conv, ppx_var_name, 384 + ppx_typerep_conv, ppx_typed_fields, ppx_tydi, 
ppx_tools_versioned, 385 + ppx_tools, ppx_template, ppx_string_conv, ppx_string, 386 + ppx_stable_witness, ppx_stable, ppx_shorthand, ppx_sexp_value, 387 + ppx_sexp_message, ppx_sexp_conv, ppx_pipebang, ppx_optional, 388 + ppx_optcomp, ppx_module_timer, ppx_log, ppx_let, ppx_js_style, 389 + ppx_jane, ppx_inline_test, ppx_ignore_instrumentation, ppx_here, 390 + ppx_helpers, ppx_hash, ppx_globalize, ppx_fixed_literal, 391 + ppx_fields_conv, ppx_fail, ppx_expect, ppx_enumerate, 392 + ppx_disable_unused_warnings, ppx_diff, ppx_deriving, ppx_derivers, 393 + ppx_custom_printf, ppx_cstruct, ppx_compare, ppx_cold, ppx_bin_prot, 394 + ppx_bench, ppx_base, ppx_assert, pp, portable, pipe_with_writer_error, 395 + pcre, pbkdf, patch, parsexp, ounit2, ordering, optint, opam-state, 396 + opam-repository, opam-publish, opam-lib, opam-format, 397 + opam-file-format, opam-core, ojs, ohex, odoc-parser, odoc, octavius, 398 + ocplib-endian, ocp-indent, ocp-build, ocb-stubblr, ocamlnet, 399 + ocamlgraph, ocamlformat-rpc-lib, ocamlformat-lib, ocamlformat, 400 + ocamlfind-secondary, ocamlfind, ocamlc-loc, ocamlbuild, 401 + ocaml_intrinsics_kernel, ocaml_intrinsics, ocaml-version, 402 + ocaml-variants, ocaml-system, ocaml-syntax-shims, 403 + ocaml-secondary-compiler, ocaml-options-vanilla, ocaml-option-tsan, 404 + ocaml-option-static, ocaml-option-spacetime, 405 + ocaml-option-no-flat-float-array, ocaml-option-no-compression, 406 + ocaml-option-nnpchecker, ocaml-option-nnp, ocaml-option-musl, 407 + ocaml-option-mingw, ocaml-option-leak-sanitizer, ocaml-option-fp, 408 + ocaml-option-flambda, ocaml-option-default-unsafe-string, 409 + ocaml-option-bytecode-only, ocaml-option-afl, 410 + ocaml-option-address-sanitizer, ocaml-option-32bit, 411 + ocaml-migrate-parsetree, ocaml-lsp-server, ocaml-index, 412 + ocaml-freestanding, ocaml-config, ocaml-compiler-libs, 413 + ocaml-base-compiler, ocaml, obuild, num, nocrypto, mtime, mmap, 414 + mirage-xen-posix, mirage-xen, mirage-types, mirage-time, 
mirage-stack, 415 + mirage-solo5, mirage-sleep, mirage-runtime, mirage-random, 416 + mirage-ptime, mirage-protocols, mirage-profile, mirage-no-xen, 417 + mirage-no-solo5, mirage-net-xen, mirage-net, mirage-mtime, 418 + mirage-kv-mem, mirage-kv-lwt, mirage-kv, mirage-flow, mirage-entropy, 419 + mirage-device, mirage-crypto-rng-mirage, mirage-crypto-rng-lwt, 420 + mirage-crypto-rng, mirage-crypto-pk, mirage-crypto-ec, mirage-crypto, 421 + mirage-clock-unix, mirage-clock-lwt, mirage-clock, mew_vi, mew, 422 + metrics-lwt, metrics, merlin-lib, merlin, menhirSdk, menhirLib, 423 + menhirCST, menhir, mdx, magic-mime, macaddr-cstruct, macaddr, lwt_ssl, 424 + lwt_react, lwt_ppx, lwt_log, lwt-dllist, lwt, lsp, lru, logs, 425 + lambda-term, kdf, jst-config, jsonrpc, jsonm, js_of_ocaml-toplevel, 426 + js_of_ocaml-ppx, js_of_ocaml-lwt, js_of_ocaml-compiler, js_of_ocaml, 427 + jbuilder, jane_rope, jane-street-headers, ipaddr-sexp, ipaddr-cstruct, 428 + ipaddr, io-page, int_repr, http, hkdf, hex, hacl_x25519, gmap, 429 + github-unix, github-data, github, gen_js_api, gen, gel, 430 + functoria-runtime, fpath, fmt, fix, fieldslib, fiber, fiat-p256, 431 + ezjsonm, extlib-compat, extlib, expectree, expect_test_helpers_core, 432 + ethernet, eqaf, either, easy-format, dyn, duration, dune-site, 433 + dune-rpc, dune-release, dune-private-libs, dune-configurator, 434 + dune-compiledb, dune-build-info, dune, dot-merlin-reader, domain-name, 435 + dkml-base-compiler, digestif, curly, cstruct-sexp, cstruct-lwt, 436 + cstruct, csexp, crunch, cpuid, cppo, core_unix, core_kernel, 437 + core_extended, core, configurator, conf-which, conf-unwind, 438 + conf-pkg-config, conf-ninja, conf-m4, conf-libssl, conf-libpcre, 439 + conf-gmp-powm-sec, conf-gmp, conf-g++, conf-cmake, conf-c++, 440 + conf-bash, conf-autoconf, conduit-lwt-unix, conduit-lwt, conduit, 441 + cohttp-lwt-unix, cohttp-lwt-jsoo, cohttp-lwt, cohttp, cmdliner, 442 + cmarkit, chrome-trace, charInfo_width, capitalization, camomile, 443 + 
camlp4, camlp-streams, ca-certs, bos, biniou, binaryen-bin, bin_prot, 444 + bigstringaf, bigarray-compat, bheap, basement, base_quickcheck, 445 + base_bigstring, base64, base-unix, base-threads, base-ocamlbuild, 446 + base-num, base-nnp, base-effects, base-domains, base-bytes, 447 + base-bigarray, base, backoff, atdgen-runtime, atdgen, atd, async_unix, 448 + async_rpc_kernel, async_log, async_kernel, async_extra, async, 449 + astring, asn1-combinators, arp, angstrom, alcotest&lt;/pre&gt; 450 + &lt;p&gt;This enormous list is because the opam file for oxcaml - &lt;code&gt;ocaml-variants.5.2.0+ox&lt;/code&gt; - lists a bunch of conflicts to ensure that various incompatible packages are never selected:&lt;/p&gt; 451 + &lt;pre&gt;conflicts: [ 452 + &amp;quot;base&amp;quot; {&amp;lt; &amp;quot;v0.18~&amp;quot;} 453 + &amp;quot;alcotest&amp;quot; {!= &amp;quot;1.9.0+ox&amp;quot;} 454 + &amp;quot;backoff&amp;quot; {!= &amp;quot;0.1.1+ox&amp;quot;} 455 + &amp;quot;dot-merlin-reader&amp;quot; {!= &amp;quot;5.2.1-502+ox&amp;quot;} 456 + &amp;quot;gen_js_api&amp;quot; {!= &amp;quot;1.1.2+ox&amp;quot;} 457 + &amp;quot;js_of_ocaml&amp;quot; {!= &amp;quot;6.0.1+ox&amp;quot;} 458 + &amp;quot;js_of_ocaml-compiler&amp;quot; {!= &amp;quot;6.0.1+ox&amp;quot;} 459 + &amp;quot;js_of_ocaml-ppx&amp;quot; {!= &amp;quot;6.0.1+ox&amp;quot;} 460 + &amp;quot;js_of_ocaml-toplevel&amp;quot; {!= &amp;quot;6.0.1+ox&amp;quot;} 461 + &amp;quot;jsonrpc&amp;quot; {!= &amp;quot;1.19.0+ox&amp;quot;} 462 + &amp;quot;lsp&amp;quot; {!= &amp;quot;1.19.0+ox&amp;quot;} 463 + &amp;quot;lwt_ppx&amp;quot; {!= &amp;quot;5.9.1+ox&amp;quot;} 464 + &amp;quot;mdx&amp;quot; {!= &amp;quot;2.5.0+ox&amp;quot;} 465 + &amp;quot;merlin&amp;quot; {!= &amp;quot;5.2.1-502+ox&amp;quot;} 466 + &amp;quot;merlin-lib&amp;quot; {!= &amp;quot;5.2.1-502+ox&amp;quot;} 467 + &amp;quot;ocaml-compiler-libs&amp;quot; {!= &amp;quot;v0.17.0+ox&amp;quot;} 468 + &amp;quot;ocaml-index&amp;quot; {!= &amp;quot;1.1+ox&amp;quot;} 469 + 
&amp;quot;ocaml-lsp-server&amp;quot; {!= &amp;quot;1.19.0+ox&amp;quot;} 470 + &amp;quot;ocamlbuild&amp;quot; {!= &amp;quot;0.15.0+ox&amp;quot;} 471 + &amp;quot;ocamlformat&amp;quot; {!= &amp;quot;0.26.2+ox&amp;quot;} 472 + &amp;quot;ocamlformat-lib&amp;quot; {!= &amp;quot;0.26.2+ox&amp;quot;} 473 + &amp;quot;ojs&amp;quot; {!= &amp;quot;1.1.2+ox&amp;quot;} 474 + &amp;quot;ppxlib&amp;quot; {!= &amp;quot;0.33.0+ox&amp;quot;} 475 + &amp;quot;ppxlib_ast&amp;quot; {!= &amp;quot;0.33.0+ox&amp;quot;} 476 + &amp;quot;sedlex&amp;quot; {!= &amp;quot;3.3+ox&amp;quot;} 477 + &amp;quot;topkg&amp;quot; {!= &amp;quot;1.0.8+ox&amp;quot;} 478 + &amp;quot;uTop&amp;quot; {!= &amp;quot;2.15.0+ox&amp;quot;} 479 + &amp;quot;uutf&amp;quot; {!= &amp;quot;1.0.3+ox&amp;quot;} 480 + &amp;quot;wasm_of_ocaml-compiler&amp;quot; {!= &amp;quot;6.0.1+ox&amp;quot;} 481 + &amp;quot;zarith&amp;quot; {!= &amp;quot;1.12+ox&amp;quot;} 482 + ]&lt;/pre&gt; 483 + &lt;p&gt;and it seems that the solver is looking not just at these packages, but also at all of their dependencies too. So this is a much larger set of packages that we need to track changes for, probably making the caching an awful lot less effective. 
It's not clear to me that this is the best way for the solver to handle conflicts, but I don't know enough about how it works yet to say for sure.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/09/caching-opam-solutions.html</id><title type="text">Caching opam solutions</title><updated>2025-09-09T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">mtelvers, dra27 and I have been working on a system to build opam packages similar to the way that the docs-ci system does - effectively building a per-package binary cache to do very fast builds of the entire opa...</summary><published>2025-09-08T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/build-ids-for-day10.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;build-ids-for-day10&quot;&gt;&lt;a href=&quot;#build-ids-for-day10&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Build IDs for Day10&lt;/h1&gt; 484 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-08&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 485 + &lt;p&gt;&lt;a href=&quot;https://tunbury.org&quot;&gt;mtelvers&lt;/a&gt;, &lt;a href=&quot;https://www.dra27.uk/blog/&quot;&gt;dra27&lt;/a&gt; and I have been working on a system to build opam packages similar to the way that the docs-ci system does - effectively building a per-package binary cache to do very fast builds of the entire opam repository. It supports building even mutually-incompatible packages by dynamically creating the build environment for each package, and thus allows us to generate something akin to &lt;a href=&quot;&quot;&gt;opam health check&lt;/a&gt; but much faster.&lt;/p&gt; 486 + &lt;p&gt;Currently the cache of a package is a key-value store where the key is a hash of the package name and version and the names and versions of all of its dependencies, alongside some information about the OS. 
This is great when this info can uniquely identify the output, but this isn't always the case. In particular, the oxcaml opam-repository has several packages where the version number is the upstream version number with &lt;code&gt;+ox&lt;/code&gt; appended, as they have patches to make them compatible with oxcaml. If these patches change without bumping the suffix, the current caching mechanism would lead to trouble. When we discussed this, David pointed out the idea of the &lt;a href=&quot;https://github.com/ocaml/opam/blob/c36dd1ce40a715ef27122184715bbf3e9aa7f0c9/src/state/opamPackageVar.ml#L178-L211&quot;&gt;build-id&lt;/a&gt; in opam, which would perfectly satisfy our needs. Unfortunately this code is quite deep within the opam codebase and at the point we need it we don't have an installed opam switch, so we need to pull the code out and insert it into our project.&lt;/p&gt; 487 + &lt;p&gt;One of the first challenges was that day10 currently includes the OS details in the hash so that we can test across different distros. This is at odds with the opam build-id, which doesn't include that, so in order to try to get as close as possible to the opam hash I split the cache into 2 layers - a per-OS cache directory containing hashes based on pure opam metadata. The idea is that these should be identical to the build-ids of opam. With that fixed, the new cache layout looks like:&lt;/p&gt; 488 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;debian-12-x86_64/123...abc/{build.log,config,...}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 489 + &lt;p&gt;where the &lt;code&gt;123...abc&lt;/code&gt; should be the same as the build-id you would get with all of the contained packages installed.&lt;/p&gt; 490 + &lt;p&gt;Now my actual use case for this is to track the state of the oxcaml world day by day, so for this I need to track both the opam-repository for OCaml and also the opam repository for OxCaml. 
The project currently uses a Makefile for coordinating the builds, but I thought it was time we moved on to a dedicated batch execution process. So I asked Claude to knock me up one of those, using odoc_driver for inspiration. It's very basic right now, simply iterating through the latest versions of every package, but I have got it to check on cache hits and misses, so I should be able to run it tomorrow to see how quickly we can test PRs to oxcaml/opam-repository.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/09/build-ids-for-day10.html</id><title type="text">Build IDs for Day10</title><updated>2025-09-08T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">For a few years now we've been running hub.cl.cam.ac.uk, a Jupyterhub instance, for the first-year course &quot;Foundations of Computer Science&quot;. It serves as a hosting site for the lecture notes, which come in the form o...</summary><published>2025-09-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/09/giving-hub-cl-an-upgrade.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;giving-hub.cl-an-upgrade&quot;&gt;&lt;a href=&quot;#giving-hub.cl-an-upgrade&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Giving hub.cl an upgrade&lt;/h1&gt; 491 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-09-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 492 + &lt;p&gt;For a few years now we've been running &lt;code&gt;hub.cl.cam.ac.uk&lt;/code&gt;, a Jupyterhub instance, for the first-year course &amp;quot;Foundations of Computer Science&amp;quot;. 
It serves as a hosting site for the lecture notes, which come in the form of Jupyter notebooks, and as a playground where students can try OCaml, and it is also used to run the assessed exercises that are a mandatory part of the course.&lt;/p&gt; 493 + &lt;p&gt;Since I spent some time setting it up back in 2018 or so, it's accumulated some cruft over the years, and has also fallen somewhat behind the bleeding edge of the Jupyter software stack. So I thought this year, as I'm actually lecturing the course, I'd give it a bit of loving care and attention.&lt;/p&gt; 494 + &lt;p&gt;We were still on Jupyterhub 1.5.3 whereas the current release is 5.3.0 - so there was quite a bit of work to do. A brief play with putting things on the latest version seemed to break quite a lot of things, so I thought it might be better to go back to the drawing board and start the config again from scratch. So with some help from Claude, I've now managed to hugely simplify the whole config of Jupyterhub, and even given it a makeover to try to match the style of www.cst.cam.ac.uk as well. 
The improvements include:&lt;/p&gt; 495 + &lt;ul&gt;&lt;li&gt;Using caddy as a reverse proxy for TLS termination, meaning I don't have to manually renew the letsencrypt cert every 3 months&lt;/li&gt;&lt;li&gt;Unifying the configuration of the two container images used for students and instructors&lt;/li&gt;&lt;li&gt;Upgrading to much newer jupyterhub, notebook and nbgrader images&lt;/li&gt;&lt;li&gt;Simplifying the configuration required to make it work on a new server - persistent user directories are now docker volumes rather than bind mounts on the local filesystem&lt;/li&gt;&lt;li&gt;Updating the authentication method to use Raven via OAuth2 rather than the unmaintained &lt;a href=&quot;https://github.com/pyCav/jupyterhub-raven-auth&quot;&gt;jupyterhub-raven-auth&lt;/a&gt;, for which I'd had to maintain &lt;a href=&quot;https://github.com/jonludlam/jupyterhub-raven-auth/commit/36eaf16b410e7ac3cfc532269e0ae5f1de34f231&quot;&gt;a patch&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Rebasing &lt;a href=&quot;https://github.com/jonludlam/nbgrader/commit/c83a6cbb7b530ce87b0b157accddcdc832bcba38&quot;&gt;my patch&lt;/a&gt; to nbgrader to verify all of the output of the cells when grading answers&lt;/li&gt;&lt;/ul&gt; 496 + &lt;p&gt;As ever, this took longer than I'd anticipated, but I'm mostly there now. 
There are a few more steps to try:&lt;/p&gt; 497 + &lt;ul&gt;&lt;li&gt;trial the &lt;a href=&quot;https://github.com/akabe/ocaml-jupyter/pull/210&quot;&gt;new patch&lt;/a&gt; for using ocaml-jupyter with OCaml 5.x&lt;/li&gt;&lt;li&gt;see how to upgrade to notebook v7, as I've stuck with v6 in order to keep the extensions we're using going.&lt;/li&gt;&lt;/ul&gt;</content><id>https://jon.recoil.org/blog/2025/09/giving-hub-cl-an-upgrade.html</id><title type="text">Giving hub.cl an upgrade</title><updated>2025-09-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Here's a quick post on how to get the OCaml Language Server (ocaml-lsp-server) working with an MCP server.</summary><published>2025-08-27T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/08/ocaml-lsp-mcp.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;using-ocaml-lsp-server-via-an-mcp-server&quot;&gt;&lt;a href=&quot;#using-ocaml-lsp-server-via-an-mcp-server&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Using ocaml-lsp-server via an MCP server&lt;/h1&gt; 498 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-08-27&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 499 + &lt;p&gt;Here's a quick post on how to get the OCaml Language Server (ocaml-lsp-server) working with an MCP server.&lt;/p&gt; 500 + &lt;p&gt;We're going to use &lt;a href=&quot;https://github.com/isaacphi&quot;&gt;isaacphi&lt;/a&gt;'s adapter for LSP servers, which is written in Go. 
So install Go, and then:&lt;/p&gt; 501 + &lt;div&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code&gt;go install github.com/isaacphi/mcp-language-server@latest&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 502 + &lt;p&gt;Once that's done, make sure you've got &lt;code&gt;ocaml-lsp-server&lt;/code&gt; installed in your switch:&lt;/p&gt; 503 + &lt;div&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code&gt;opam install ocaml-lsp-server&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 504 + &lt;p&gt;Then add the MCP config for claude where you want to run it:&lt;/p&gt; 505 + &lt;div&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code&gt;claude mcp add ocamllsp -s local -t stdio -- /Users/jon/go/bin/mcp-language-server -workspace . -lsp ocamllsp&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 506 + &lt;p&gt;It'd be nice to get this working &amp;quot;globally&amp;quot; - that is, with &lt;code&gt;-s user&lt;/code&gt; - but I haven't been able to get that to work yet.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/08/ocaml-lsp-mcp.html</id><title type="text">Using ocaml-lsp-server via an MCP server</title><updated>2025-08-27T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">LLMs are proving themselves superbly capable of a variety of coding tasks, having been trained against the enormous amount of code, tutorials and manuals available online. 
However, with smaller langua...</summary><published>2025-08-20T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/08/ocaml-mcp-server.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;an-ocaml-mcp-server&quot;&gt;&lt;a href=&quot;#an-ocaml-mcp-server&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;An OCaml MCP server&lt;/h1&gt; 507 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-08-20&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 508 + &lt;p&gt;LLMs are proving themselves superbly capable of a variety of coding tasks, having been trained against the enormous amount of code, tutorials and manuals available online. However, with smaller languages like OCaml there simply isn't enough training material out there, particularly when it comes to new language features like &lt;a href=&quot;https://ocaml.org/manual/5.3/effects.html&quot;&gt;effects&lt;/a&gt; or new packages that haven't had time to be widely used. With my colleagues &lt;a href=&quot;https://anil.recoil.org/&quot;&gt;Anil&lt;/a&gt;, &lt;a href=&quot;https://ryan.freumh.org/&quot;&gt;Ryan&lt;/a&gt; and &lt;a href=&quot;https://toao.com/&quot;&gt;Sadiq&lt;/a&gt; we've been exploring ways to &lt;a href=&quot;https://anil.recoil.org/notes/cresting-the-ocaml-ai-hump&quot;&gt;improve this situation&lt;/a&gt;. One way we can mitigate these challenges is to provide a Model Context Protocol (&lt;a href=&quot;https://modelcontextprotocol.io&quot;&gt;MCP&lt;/a&gt;) server that's capable of providing up-to-date info on the current state of the OCaml world.&lt;/p&gt; 509 + &lt;p&gt;The &lt;a href=&quot;https://docs.anthropic.com/en/docs/mcp&quot;&gt;MCP specification&lt;/a&gt; was released by Anthropic at the end of last year. 
Since then it has become an astonishingly popular mechanism for extending the capabilities of LLMs, allowing them to become incredibly powerful agents capable of much more than simply chatting. There are now a huge variety of MCP servers, from one that provides &lt;a href=&quot;https://github.com/r-huijts/firstcycling-mcp&quot;&gt;professional cycling data&lt;/a&gt; to one that can &lt;a href=&quot;https://github.com/GongRzhe/Gmail-MCP-Server&quot;&gt;do your email&lt;/a&gt;. The &lt;a href=&quot;https://github.com/punkpeye/awesome-mcp-servers&quot;&gt;awesome mcp server list&lt;/a&gt; already lists hundreds, and these are just the &lt;em&gt;awesome&lt;/em&gt; ones!&lt;/p&gt; 510 + &lt;p&gt;I've been working with &lt;a href=&quot;https://toao.com/&quot;&gt;Sadiq&lt;/a&gt; to make an &lt;a href=&quot;https://github.com/sadiqj/odoc-llm/&quot;&gt;MCP server for OCaml&lt;/a&gt;, with an initial focus on building it such that it can be hosted for everyone rather than something that is run locally. Our plan is to start with a service that can help with choosing OCaml libraries, by taking advantage of the work done by &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci/&quot;&gt;ocaml-docs-ci&lt;/a&gt; which is the tool used to generate the documentation for all packages in &lt;a href=&quot;https://github.com/ocaml/opam-repository&quot;&gt;opam-repository&lt;/a&gt; and is served by &lt;a href=&quot;https://ocaml.org/&quot;&gt;ocaml.org&lt;/a&gt;. As well as producing HTML docs, we can also extract a number of other formats from the pipeline, including a newly created &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1341&quot;&gt;markdown backend&lt;/a&gt;. 
Using this, we can get markdown-formatted documentation for every version of every package in the OCaml ecosystem.&lt;/p&gt; 511 + &lt;h2 id=&quot;semantic-searching&quot;&gt;&lt;a href=&quot;#semantic-searching&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Semantic searching&lt;/h2&gt; 512 + &lt;p&gt;The first thing we focused on was being able to do a &lt;em&gt;semantic search&lt;/em&gt; over the whole OCaml ecosystem. To do this, we're using &lt;a href=&quot;https://huggingface.co/spaces/hesamation/primer-llm-embedding&quot;&gt;LLM embeddings&lt;/a&gt;, for which we need some natural-language description to search through.&lt;/p&gt; 513 + &lt;p&gt;The documentation produced by &lt;code&gt;ocaml-docs-ci&lt;/code&gt; is generated per library module using &lt;a href=&quot;https://github.com/ocaml/odoc&quot;&gt;odoc&lt;/a&gt;, relying on the package author to provide documentation comments for each element in the signature. However, even if the package author &lt;em&gt;hasn't&lt;/em&gt; provided any documentation, we can still see the types, values, modules and so on that the library exposes, and this is often enough to get a good idea of what the module does. We then take these documentation pages, which are formatted in markdown, and summarise them via an LLM at the module level. This is done hierarchically, so we start with the 'deepest' modules, and then insert their summaries into the text of their parent module, then summarise those and so on. We found it useful to include the names and &lt;a href=&quot;https://ocaml.github.io/odoc/odoc/odoc_for_authors.html#preamble&quot;&gt;preambles&lt;/a&gt; of the ancestor modules when doing the summarisation to give additional context to the LLM. 
For example, here is the prompt generated for a submodule of the &lt;a href=&quot;https://erratique.ch/software/astring&quot;&gt;astring&lt;/a&gt; library:&lt;/p&gt; 514 + &lt;div&gt;&lt;pre class=&quot;language-markdown&quot;&gt;&lt;code&gt;Module: Astring.String.Ascii 515 + 516 + Ancestor Module Context: 517 + - Astring: Alternative `Char` and `String` modules. Open the module to 518 + use it. This defines one value in your scope, redefines the `(^)` 519 + operator, the `Char` module and the `String` module. Consult the 520 + differences with the OCaml `String` module, the porting guide and a 521 + few examples. 522 + - Astring.String: Strings, `substrings`, string sets and maps. A 523 + string `s` of length `l` is a zero-based indexed sequence of `l` 524 + bytes. An index `i` of `s` is an integer in the range [`0`;`l-1`], it 525 + represents the `i`th byte of `s` which can be accessed using the 526 + string indexing operator `s.[i]`. 527 + Important. OCaml's `string`s became immutable since 4.02. Whenever 528 + possible compile your code with the `-safe-string` option. This module 529 + does not expose any mutable operation on strings and assumes strings 530 + are immutable. See the porting guide. 531 + 532 + Module Documentation: US-ASCII string support. 533 + References. 534 + 535 + ## Predicates 536 + - val is_valid : string -&amp;gt; bool (* `is_valid s` is `true` iff only for 537 + all indices `i` of `s`, `s.[i]` is an US-ASCII character, i.e. a 538 + byte in the range [`0x00`;`0x7F`]. *) 539 + 540 + ## Casing transforms 541 + The following functions act only on US-ASCII code points that is on 542 + bytes in range [`0x00`;`0x7F`], leaving any other byte intact. The 543 + functions can be safely used on UTF-8 encoded strings; they will of 544 + course only deal with US-ASCII casings. 545 + 546 + - val uppercase : string -&amp;gt; string (* `uppercase s` is `s` with 547 + US-ASCII characters `'a'` to `'z'` mapped to `'A'` to `'Z'`. 
*) 548 + - val lowercase : string -&amp;gt; string (* `lowercase s` is `s` with 549 + US-ASCII characters `'A'` to `'Z'` mapped to `'a'` to `'z'`. *) 550 + - val capitalize : string -&amp;gt; string (* `capitalize s` is like 551 + `uppercase` but performs the map only on `s.[0]`. *) 552 + - val uncapitalize : string -&amp;gt; string (* `uncapitalize s` is like 553 + `lowercase` but performs the map only on `s.[0]`. *) 554 + 555 + ## Escaping to printable US-ASCII 556 + - val escape : string -&amp;gt; string (* `escape s` is `s` with: *) 557 + - val unescape : string -&amp;gt; string option (* `unescape s` unescapes 558 + what `escape` did. The letters of hex escapes can be upper, lower or 559 + mixed case, and any two letter hex escape is decoded to its 560 + corresponding byte. Any other escape not defined by `escape` or 561 + truncated escape makes the function return `None`. *) 562 + - val escape_string : string -&amp;gt; string (* `escape_string s` is like 563 + `escape` except it escapes `s` according to OCaml's lexical 564 + conventions for strings with: *) 565 + - val unescape_string : string -&amp;gt; string option (* `unescape_string` 566 + is to `escape_string` what `unescape` is to `escape` and also 567 + additionally unescapes the sequence `&amp;quot;\\'&amp;quot;` (`0x5C,0x27`) to `&amp;quot;'&amp;quot;` 568 + (`0x27`). *)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 569 + &lt;p&gt;where clearly the package author has provided excellent documentation comments. This is then passed to an LLM which generated the following description:&lt;/p&gt; 570 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;This module provides functions to check if a string contains only 571 + US-ASCII characters, convert case for ASCII letters, and escape or 572 + unescape strings using ASCII conventions. It operates on standard 573 + OCaml strings, treating them as sequences of bytes, and ensures 574 + compatibility with UTF-8 encoded strings when transforming case. 
Use 575 + cases include sanitizing input for ASCII-only protocols, preparing 576 + strings for environments requiring strict ASCII formatting, and 577 + handling escaped string representations in configuration or 578 + serialization contexts.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 579 + &lt;p&gt;Once we have these natural language descriptions, we can generate embeddings for them to allow for semantic search amongst all modules in opam.&lt;/p&gt; 580 + &lt;p&gt;In addition to the module descriptions, we also generate similar natural-language descriptions of the &lt;em&gt;package&lt;/em&gt; as a whole, by taking the README from the package and summarising it similarly. Where there is no README, we summarise the summaries of the modules of the libraries, so we're always able to generate some text description of the entire package.&lt;/p&gt; 581 + &lt;p&gt;To help with the ranking, we're also using a measure of popularity for both modules and packages. For packages, we're using the number of reverse dependencies in opam as a proxy for popularity, and for modules, we're using the &amp;quot;occurrences&amp;quot; generated as part of the docs build. These occurrences are a count of how often modules are used in other modules, and are calculated by looking at the compiled &lt;code&gt;cmt&lt;/code&gt; files, resolving references to external modules using odoc's internal logic, and counting them.&lt;/p&gt; 582 + &lt;p&gt;Once we have both the module and package summaries, we generate an embedding of the descriptions to allow for a semantic search to be performed efficiently. We're using this in two ways - to search for packages for broad queries of functionality, which just uses the package summaries, and for more specific queries to search for modules within packages.&lt;/p&gt; 583 + &lt;p&gt;For the module search, if the packages to search in haven't been specified, we search for both modules and packages and then combine the results. 
This is particularly helpful when the search is for generic functionality that might be found in more specific packages. For example, a module-only search for the term &amp;quot;time and date manipulation functions&amp;quot; returns the strongest match with a &lt;a href=&quot;https://ocaml.org/p/caqti/2.2.4/doc/caqti.platform/Caqti_platform/Conv/index.html&quot;&gt;module from caqti&lt;/a&gt;, which, as caqti is a library for talking to relational databases, might not be what the user is looking for.&lt;/p&gt; 584 + &lt;p&gt;We then put these search tools into an MCP server, along with a little more functionality. The server currently provides these five functions: &lt;/p&gt; 585 + &lt;ol&gt;&lt;/ol&gt; 586 + &lt;p&gt;The first two use the LLM-generated summaries as described above, and the last uses &lt;a href=&quot;https://github.com/art-w/&quot;&gt;Arthur's&lt;/a&gt; &lt;a href=&quot;https://github.com/art-w/sherlodoc&quot;&gt;sherlodoc tool&lt;/a&gt;, which can do various searches, including type-based search, across the output of the &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt;.&lt;/p&gt; 587 + &lt;h2 id=&quot;example-searches&quot;&gt;&lt;a href=&quot;#example-searches&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Example searches&lt;/h2&gt; 588 + &lt;p&gt;The following are the results from some example package searches: &lt;/p&gt; 589 + &lt;ul&gt;&lt;li&gt;&amp;quot;HTTP client&amp;quot;&lt;/li&gt;&lt;/ul&gt; 590 + &lt;div&gt;&lt;pre class=&quot;language-nolang&quot;&gt;&lt;code&gt;#1 - http (v6.1.1) 591 + Similarity: 0.7593 592 + Reverse Dependencies: 407 593 + Combined Score: 0.6588 594 + Description: This package provides a comprehensive OCaml library for 595 + building HTTP clients and servers with support for multiple 596 + asynchronous programming models. 
It enables developers to implement 597 + efficient, portable HTTP services using different backends such as 598 + Lwt, Async, Eio, and JavaScript, making it suitable for both Unix and 599 + browser environments. The library emphasizes performance, modularity, 600 + and interoperability, allowing custom backend implementations and 601 + seamless integration with other OCaml libraries. It is commonly used 602 + in web services, API clients, standalone microkernels, and 603 + OCaml-to-JavaScript compilations for web applications. 604 + 605 + #2 - cohttp (v6.1.1) 606 + Similarity: 0.7377 607 + Reverse Dependencies: 403 608 + Combined Score: 0.6435 609 + Description: This package provides a comprehensive library for 610 + building HTTP clients and servers in OCaml. It supports multiple 611 + asynchronous programming models and backends, enabling flexible 612 + development across different runtime environments. The library offers 613 + efficient handling of HTTP/1.1 and HTTPS, with portable parsing and 614 + modular architecture. It is widely used for web services, API clients, 615 + and standalone network applications. 616 + 617 + #3 - cohttp-lwt-unix (v6.1.1) 618 + Similarity: 0.7089 619 + Reverse Dependencies: 338 620 + Combined Score: 0.6212 621 + Description: This package provides an implementation of the Cohttp 622 + library using the Lwt asynchronous programming framework with Unix 623 + bindings. It enables building efficient HTTP clients and servers in 624 + OCaml, supporting both synchronous and asynchronous network 625 + operations. The package handles core HTTP functionality, including 626 + request and response parsing, connection management, and HTTPS support 627 + via OCaml-TLS. It is suitable for applications requiring 628 + high-performance web services, microservices, or networked 629 + applications in the OCaml ecosystem. 
630 + 631 + #4 - cohttp-lwt (v6.1.1) 632 + Similarity: 0.7067 633 + Reverse Dependencies: 367 634 + Combined Score: 0.6207 635 + Description: This package provides a comprehensive library for 636 + building HTTP clients and servers in OCaml, supporting multiple 637 + asynchronous programming models. It enables developers to implement 638 + efficient, portable HTTP services with support for both synchronous 639 + and asynchronous I/O, including secure HTTPS communication. The 640 + package includes backends for Lwt, Async, Mirage, JavaScript, and 641 + Eio, making it versatile for use in different runtime environments, 642 + from Unix servers to web browsers. It is well-suited for applications 643 + requiring high-performance networking, such as web services, API 644 + clients, and embedded networked systems. 645 + 646 + #5 - quests (v0.1.3) 647 + Similarity: 0.7960 648 + Reverse Dependencies: 1 649 + Combined Score: 0.6180 650 + Description: This package provides a high-level HTTP client library 651 + for making web requests in OCaml. It simplifies interacting with HTTP 652 + servers by offering an intuitive API for common methods like GET and 653 + POST, supporting features such as query parameters, form and JSON 654 + data submission, and automatic handling of gzip compression and 655 + redirects. It also includes authentication mechanisms like basic and 656 + bearer tokens, with partial support for sessions. Typical use cases 657 + include consuming REST APIs, scraping web content, or integrating 658 + with web services securely and efficiently. 659 + 660 + #6 - ezcurl (v0.2.4) 661 + Similarity: 0.7395 662 + Reverse Dependencies: 6 663 + Combined Score: 0.5979 664 + Description: This package provides a simplified interface for making 665 + HTTP requests in OCaml, built on top of the OCurl library. 
It 666 + addresses the need for an easy-to-use, reliable, and stable API for 667 + handling common web interaction tasks, such as fetching URLs and 668 + processing responses. The package supports both synchronous and 669 + asynchronous operations, enabling efficient handling of parallel 670 + requests and non-blocking I/O. Practical use cases include web 671 + scraping, API client development, and integrating HTTP-based services 672 + into OCaml applications. 673 + &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 674 + &lt;ul&gt;&lt;li&gt;&amp;quot;Cryptographic hash&amp;quot;&lt;/li&gt;&lt;/ul&gt; 675 + &lt;div&gt;&lt;pre class=&quot;language-nolang&quot;&gt;&lt;code&gt;#1 - digestif (v1.3.0) 676 + Similarity: 0.8165 677 + Reverse Dependencies: 621 678 + Combined Score: 0.7041 679 + Description: This package provides a comprehensive implementation of 680 + cryptographic hash functions, supporting algorithms such as MD5, 681 + SHA1, SHA2, SHA3, WHIRLPOOL, BLAKE2, and RIPEMD160. It allows users 682 + to choose between C and OCaml backends at link time, offering 683 + flexibility in performance and deployment scenarios. The library is 684 + designed for applications requiring secure hashing, such as data 685 + integrity verification, digital signatures, and cryptographic 686 + protocols. It is well-suited for systems programming and 687 + security-related applications in the OCaml ecosystem. 688 + 689 + #2 - ppx_hash (vv0.17.0) 690 + Similarity: 0.7284 691 + Reverse Dependencies: 3337 692 + Combined Score: 0.6833 693 + Description: This package generates efficient hash functions for 694 + OCaml types based on their structure, enabling precise control over 695 + hashing behavior. It addresses the limitations of OCaml's built-in 696 + polymorphic hashing by allowing users to define custom hash 697 + functions during type derivation. 
Key features include selective 698 + field ignoring, support for folding-style hash accumulation, and 699 + compatibility with comparison and serialization systems. It is 700 + suitable for use with hash tables, persistent data structures, and 701 + any application requiring deterministic, type-driven hashing. 702 + 703 + #3 - ez_hash (v0.5.3) 704 + Similarity: 0.8366 705 + Reverse Dependencies: 3 706 + Combined Score: 0.6583 707 + Description: This package provides a straightforward interface to 708 + common cryptographic hash functions, simplifying their use in OCaml 709 + applications. It wraps secure, widely-used algorithms like SHA-256 710 + and Blake2b, offering consistent and safe APIs for hashing data. The 711 + library is designed for clarity and ease of integration, making it 712 + ideal for developers needing reliable cryptographic operations 713 + without deep expertise in security. Practical uses include data 714 + integrity verification, digital signatures, and secure data storage. 715 + 716 + #4 - murmur3 (v0.3) 717 + Similarity: 0.7805 718 + Reverse Dependencies: 1 719 + Combined Score: 0.6072 720 + Description: This package provides OCaml bindings for MurmurHash, a 721 + fast and widely used non-cryptographic hash function. It enables 722 + efficient hash value computation for arbitrary data, making it 723 + suitable for applications like hash tables, checksums, and data 724 + fingerprinting. The bindings offer consistent hashing across platforms 725 + and integrate seamlessly into OCaml projects requiring 726 + high-performance hashing. Use cases include caching, distributed 727 + systems, and data integrity verification where cryptographic security 728 + is not required. 729 + 730 + #5 - kdf (v1.0.0) 731 + Similarity: 0.6775 732 + Reverse Dependencies: 473 733 + Combined Score: 0.6033 734 + Description: This package implements standard key derivation 735 + functions (KDFs) for cryptographic applications in OCaml. 
It supports 736 + scrypt, PBKDF1, PBKDF2, and HKDF, enabling secure generation of 737 + cryptographic keys from passwords or shared secrets. These functions 738 + help mitigate brute-force attacks and ensure keys are derived in a 739 + reproducible, secure manner. Use cases include password-based 740 + encryption, secure token generation, and key material expansion in 741 + cryptographic protocols.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 742 + &lt;p&gt;and a module-level search for &amp;quot;time and date manipulation functions&amp;quot;&lt;/p&gt; 743 + &lt;div&gt;&lt;pre class=&quot;language-nolang&quot;&gt;&lt;code&gt;#1 - timmy-jsoo: Timmy_jsoo 744 + Similarity: 0.5460 745 + Original Similarity: 0.7800 746 + Popularity Score: 0.0000 747 + Description: This module provides precise date and time arithmetic, 748 + conversion, and comparison operations across multiple representations, 749 + including OCaml-native, JavaScript, and string formats. It works with 750 + structured types like `Date.t`, `Time.t`, and ISO weeks, supporting 751 + timezone-aware transformations and RFC3339 formatting. Concrete use 752 + cases include cross-runtime timestamp synchronization, calendar-aware 753 + scheduling, and robust temporal data validation in distributed 754 + systems. 755 + 756 + #2 - calendar: CalendarLib 757 + Similarity: 0.5331 758 + Original Similarity: 0.7616 759 + Popularity Score: 0.3448 760 + Description: This module provides precise date and time manipulation 761 + with support for calendar operations, time zones, periods, and 762 + formatted input/output. It works with types like `Calendar.t`, 763 + `Date.t`, `Time.t`, and `Period.t` to handle tasks such as event 764 + scheduling, timestamp conversion, and historical date calculations. 765 + Concrete use cases include scheduling systems, log timestamping, 766 + holiday calculations, and cross-timezone time normalization. 
767 + 768 + #3 - calendar: CalendarLib.Fcalendar 769 + Similarity: 0.5191 770 + Original Similarity: 0.6820 771 + Popularity Score: 0.1390 772 + Description: This module provides float-based calendar operations 773 + for date creation, conversion, and manipulation, including time zone 774 + adjustments, component extraction (year/month/day/hour/second), and 775 + arithmetic with periods. It works with a `t` type representing time 776 + as float seconds, alongside `day`, `month`, `year`, and Unix time 777 + structures, prioritizing Unix time precision over sub-second 778 + accuracy. It suits applications tolerating minor imprecision in date 779 + comparisons or arithmetic, such as logging systems or coarse-grained 780 + scheduling, where exact floating-point equality isn't critical. 781 + 782 + #4 - calendar: CalendarLib.Calendar_builder.Make 783 + Similarity: 0.5112 784 + Original Similarity: 0.7302 785 + Popularity Score: 0.0785 786 + Description: This module combines date and time functionality to 787 + construct and manipulate calendar values with float-based precision, 788 + offering operations like timezone conversion, component extraction 789 + (day, month, year, etc.), and arithmetic using `Period.t`. It works 790 + with a calendar type `t` that integrates date and time components, 791 + alongside conversions to Unix timestamps, Julian day numbers, and 792 + structured representations like `Unix.tm`. Designed for scenarios 793 + requiring precise temporal calculations (e.g., calendar arithmetic, 794 + Gregorian date validation, or leap day checks), it balances 795 + flexibility with known precision limitations inherent to float-based 796 + time representations. 797 + 798 + #5 - timmy-unix: Clock 799 + Similarity: 0.5080 800 + Original Similarity: 0.7257 801 + Popularity Score: 0.0000 802 + Description: This module provides functions to retrieve the current 803 + POSIX time, the local timezone, and the current date in the local 804 + timezone. 
It works with time and date types from the Timmy library, 805 + specifically `Timmy.Time.t` and `Timmy.Date.t`. Use this module to 806 + obtain precise time and date information for logging, scheduling, or 807 + time-based computations.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 808 + &lt;p&gt;and for &amp;quot;Balanced Tree&amp;quot;:&lt;/p&gt; 809 + &lt;div&gt;&lt;pre class=&quot;language-nolang&quot;&gt;&lt;code&gt;#1 - grenier: Mbt 810 + Similarity: 0.5274 811 + Original Similarity: 0.7534 812 + Popularity Score: 0.0495 813 + Description: This module implements a balanced binary tree structure 814 + with efficient concatenation and size-based operations. It supports 815 + tree construction through leaf and node functions, automatically 816 + balancing nodes and annotating them with values from a provided 817 + measure module. It is useful for applications requiring fast access, 818 + dynamic sequence management, and efficient merging of tree-based 819 + data structures. 820 + 821 + #2 - camomile: CamomileLib.AvlTree 822 + Similarity: 0.5008 823 + Original Similarity: 0.7155 824 + Popularity Score: 0.0495 825 + Description: This module implements balanced binary trees (AVL 826 + trees) with operations for constructing, deconstructing, and 827 + traversing trees. It supports key operations like inserting nodes, 828 + extracting leftmost/rightmost elements, concatenating trees, and 829 + folding or iterating over elements. It is useful for maintaining 830 + ordered data with efficient lookup, insertion, and deletion, such as 831 + in symbol tables or priority queues. 832 + 833 + #3 - batteries: BatAvlTree 834 + Similarity: 0.5003 835 + Original Similarity: 0.7147 836 + Popularity Score: 0.1485 837 + Description: This module implements balanced binary trees (AVL 838 + trees) with operations for creating, modifying, and traversing 839 + trees. 
It supports tree construction with optional rebalancing, 840 + splitting, and concatenation, and provides root, left, and right 841 + accessors with failure handling. Concrete use cases include 842 + efficient ordered key-value storage, set-like structures, and 843 + maintaining sorted data with logarithmic-time insertions and 844 + lookups. 845 + 846 + #4 - grenier: Bt2 847 + Similarity: 0.4927 848 + Original Similarity: 0.7039 849 + Popularity Score: 0.2634 850 + Description: This module implements a balanced binary tree structure 851 + with efficient concatenation and rank-based access. It supports 852 + creating empty trees, constructing balanced nodes, and joining two 853 + trees with logarithmic cost relative to the smaller tree's size. Use 854 + cases include maintaining ordered collections with frequent splits 855 + and joins, and efficiently accessing elements by position. 856 + 857 + #5 - grenier: Mbt.Make 858 + Similarity: 0.4913 859 + Original Similarity: 0.7019 860 + Popularity Score: 0.0495 861 + Description: This module implements a balanced tree structure with 862 + efficient concatenation and size-based operations. It supports 863 + construction of trees using leaf and node functions, where nodes are 864 + automatically balanced and annotated with measurable values from 865 + module M. 
The module enables efficient rank queries and joining of 866 + trees, with applications in managing dynamic sequences where fast 867 + access and concatenation are critical.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 868 + &lt;h2 id=&quot;limitations-and-future-work&quot;&gt;&lt;a href=&quot;#limitations-and-future-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Limitations and future work&lt;/h2&gt; 869 + &lt;p&gt;We're aware that there are currently a number of limitations with what's been done so far, and there are a lot of exciting things that could quite easily be added!&lt;/p&gt; 870 + &lt;p&gt;We haven't done much prompt optimisation for either the tools themselves or their descriptions in the MCP server. We also haven't done much optimisation of the information retrieval - and it's clear from some of the results shown above that there are improvements to be made in the ranking algorithms. Some obvious next steps would be to do some &lt;a href=&quot;https://arxiv.org/html/2406.12433v2&quot;&gt;re-ranking&lt;/a&gt; or some form of hybrid search.&lt;/p&gt; 871 + &lt;p&gt;A particular challenge is that since this is based entirely on the &lt;code&gt;ocaml-docs-ci&lt;/code&gt; build, it won't necessarily reflect the actual API of your local build, as for OCaml, this &lt;a href=&quot;https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html&quot;&gt;can't be done&lt;/a&gt;. Thibaut Mattio is working on a &lt;a href=&quot;https://github.com/tmattio/ocaml-mcp&quot;&gt;local MCP server&lt;/a&gt; that would be perfectly positioned to do some of what we're doing, although we'd need to have a good local docs build implemented in dune for this to work well.&lt;/p&gt; 872 + &lt;p&gt;Also, there's plenty more data that we've collected during the docs builds! We can show the implementations of functions, we can expose code samples, select different versions of packages and much more. 
While we've concentrated on the search aspects, there's still a lot of low-hanging fruit that can be worked on.&lt;/p&gt; 873 + &lt;p&gt;If you're interested in helping us out on this, the project lives &lt;a href=&quot;https://github.com/sadiqj/odoc-llm&quot;&gt;on github&lt;/a&gt; - come along and join us!&lt;/p&gt; 874 + &lt;h2 id=&quot;using-the-server&quot;&gt;&lt;a href=&quot;#using-the-server&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Using the server&lt;/h2&gt; 875 + &lt;p&gt;If you'd like to try it, we've got a demo server running right now. It's hosted on dill.caelum.ci.dev here at the Computer Laboratory in the University of Cambridge. To enable it with Claude, try this:&lt;/p&gt; 876 + &lt;div&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code&gt;claude mcp add -t sse ocaml http://dill.caelum.ci.dev:8000/sse&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 877 + &lt;p&gt;Obviously this is pre-alpha quality software, and we might take it down with no notice, and it might not work as expected, and all of the other usual caveats. Let us know if it works, or doesn't, or if you've got some suggestions for improvements!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/08/ocaml-mcp-server.html</id><title type="text">An OCaml MCP server</title><updated>2025-08-20T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">More work this week on the OCaml MCP server. Sadiq and I met before I went away on holiday and discussed the next steps to 'park' the work on the MCP server. 
The final steps are:</summary><published>2025-08-19T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/08/week33.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;week-33&quot;&gt;&lt;a href=&quot;#week-33&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Week 33&lt;/h1&gt; 878 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-08-19&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 879 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;cohttp,yojson,jsonm&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 880 + &lt;p&gt;More work this week on the OCaml MCP server. Sadiq and I met before I went away on holiday and discussed the next steps to 'park' the work on the MCP server. The final steps are:&lt;/p&gt; 881 + &lt;ul&gt;&lt;li&gt;Write a README&lt;/li&gt;&lt;li&gt;Write and run a small script to fix a problem with module-type names&lt;/li&gt;&lt;li&gt;Write up and publish a blog post&lt;/li&gt;&lt;/ul&gt; 882 + &lt;p&gt;Not much, right? As always though, writing things up led to a whole load more work.&lt;/p&gt; 883 + &lt;p&gt;The first problem occurred when writing up how it parsed the input docs. It turned out that when converting the repo so that it took markdown formatted files (using a &lt;a href=&quot;https://github.com/jonludlam/odoc/tree/odoc-llm-markdown&quot;&gt;slightly tweaked&lt;/a&gt; version of &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1341&quot;&gt;davesnx's PR&lt;/a&gt;), Claude had decided that the way to do this was to first convert the markdown into HTML, and then use the HTML parser it had already built. 
Whilst tidying this up, Claude was remarkably keen to just use regexps to parse the markdown rather than using a pre-existing markdown library, so it took a little persuasion to get it into a state I was happy with.&lt;/p&gt; 884 + &lt;p&gt;The second issue was that the scripts that form the bulk of the repo had been written at different times, and therefore Claude didn't really take into account any of the decisions it had made in one script when building the next. So most of the command-line arguments were slightly different, which made writing up a mini 'howto' in the README quite a jarring experience.&lt;/p&gt; 885 + &lt;p&gt;Thirdly, and most importantly, we had decided that we needed a few example searches to show how the system worked. We'd already had a &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/week28&quot;&gt;useful experience&lt;/span&gt; with this when Anil had tried to search for a 'time and date parsing and formatting' library, so it shouldn't really have been a surprise that trying a few more examples showed some more interesting behaviour. Specifically, the searches I wanted to do were for an &amp;quot;HTTP client&amp;quot;, &amp;quot;JSON parser&amp;quot;, &amp;quot;Cryptographic Hash&amp;quot; and Anil's time-and-date query, and in actually trying these searches and critically examining the results, I had to go back and figure out why they weren't giving me the results I had expected.&lt;/p&gt; 886 + &lt;p&gt;The first of these searches I had anticipated would be quite interesting, as this is a query that should show the OCaml ecosystem &lt;a href=&quot;https://discuss.ocaml.org/t/simple-modern-http-client-library/11239&quot;&gt;missing an obvious HTTP client&lt;/a&gt;. However, even with this in mind, one of the top results was one of Cohttp's module types, &lt;code&gt;Cohttp.Generic.Client.S&lt;/code&gt;. 
This, of course, isn't much use if you're looking for an HTTP client, as module-types aren't going to give you an implementation to actually use. So I decided that we'd exclude module-types from the results. This turned out to be slightly more tricky than I anticipated as we'd lost the distinction between modules and module types further back in the pipeline, so Claude had to do some plumbing to ensure we had this information at the point we were doing the search.&lt;/p&gt; 887 + &lt;p&gt;The cryptographic hash search gave some plausible looking results, so I moved on to the JSON search. I was expecting to see &lt;code&gt;Yojson&lt;/code&gt; somewhere near the top of the list as that's a very popular library. I was also expecting to see &lt;code&gt;Jsonm&lt;/code&gt; somewhere near the top - or at least I'd like to be able to find it by searching for a 'streaming parser' as that's one of its key strengths. However, searching for &amp;quot;JSON parser&amp;quot; yielded some less than brilliant answers. The top 5 results were for modules in the packages &lt;code&gt;yojson-five&lt;/code&gt;, &lt;code&gt;decoders-yojson&lt;/code&gt;, &lt;code&gt;decoders-jsonaf&lt;/code&gt;, &lt;code&gt;ocplib-json-typed-browser&lt;/code&gt; and &lt;code&gt;ppx_protocol_conv_jsonm&lt;/code&gt;. While all of these are clearly in the same realm as I was after, having &lt;code&gt;jsonm&lt;/code&gt; show up literally 99th in the list, and &lt;code&gt;yojson&lt;/code&gt; itself not in the top 100 wasn't a great result.&lt;/p&gt; 888 + &lt;p&gt;Some investigation showed that yojson had a particularly bad showing because the description of the module &lt;code&gt;Yojson.Basic&lt;/code&gt; was the empty string! This turned out to be because of some bad error-handling logic in the summariser script, which ended up turning some errors into a blank description. 
Since running the summariser costs actual money, I didn't want to just rerun the whole thing, so I asked Claude for a script to find these problems and rerun them. The problem is not totally trivial as the summaries of child modules are used when generating the summary for parents, so when one is regenerated we should regenerate the summaries of all ancestors too. Given my recent experiences with Claude I'd like to look this over quite carefully before letting it loose on my data, so I've run it on yojson, which seemed to do the right thing, but not yet on the rest of the packages.&lt;/p&gt; 889 + &lt;p&gt;Having fixed this, I still found that &lt;code&gt;jsonm&lt;/code&gt; was making a very poor showing. This turned out to be because the description it gives itself is a &amp;quot;Non-blocking streaming JSON codec for OCaml&amp;quot; which had a fairly low similarity with &amp;quot;JSON parser&amp;quot;. I was using a fairly small embedding model for the queries - Qwen/Qwen3-Embedding-0.6B, so I thought I might address this by using a larger one, and opted for Qwen/Qwen3-Embedding-8B. The machine I had been using for the MCP server has no GPU and had taken a while to do the embeddings using the 0.6B model, so I switched to generating them on my M4 macbook. This went &lt;i&gt;much&lt;/i&gt; faster, though since I have about 70Mb of module summaries it still took quite a while. This improved the situation somewhat, but it was still not high in the list.&lt;/p&gt; 890 + &lt;p&gt;So I took a step back and had a think about the problem some more. Searching for a JSON parser is really quite a high-level search, and when evaluating the results I realised I was really thinking in terms of packages rather than modules. So I thought we could split the search in two - a package search and a module search. 
The package search would be used for the broad queries where you're interested in pulling in whole chunks of functionality, and the module search is for more low-level queries. In fact, the 'time and date formatting' query is somewhere in between, so I might need to have some more example queries for the module search functions. In addition, the module search could be restricted to the set of packages you're using, which might make it even more useful.&lt;/p&gt; 891 + &lt;p&gt;Part of the split meant that I needed a different source of 'popularity' for the packages than the occurrences data that came out of docs ci, as that was per-module and I needed something per-package. The obvious thing is to look at reverse dependencies in opam. I have this kind-of working, but it's currently not particularly smart, so this will need a little more attention. For example, it currently thinks that &lt;a href=&quot;https://melange.re/v5.0.0/&quot;&gt;melange&lt;/a&gt; has over 3000 reverse dependencies.&lt;/p&gt; 892 + &lt;p&gt;With these changes in place, a package search for 'JSON parser' now returns &lt;code&gt;yojson&lt;/code&gt; as number one, followed by &lt;code&gt;ppx_deriving_yojson&lt;/code&gt;, &lt;code&gt;ezjsonm&lt;/code&gt;, &lt;code&gt;ocplib-json-typed&lt;/code&gt; and &lt;code&gt;jsonaf&lt;/code&gt;. Unfortunately &lt;code&gt;jsonm&lt;/code&gt; is still languishing in 27th place, so there's still some tweaking to do.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/08/week33.html</id><title type="text">Week 33</title><updated>2025-08-19T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Astonishingly, it's already been four whole months since starting back at the university, which I find incredibly hard to believe. 
I'm utterly convinced that it was only a couple of weeks ago that I walked back into t...</summary><published>2025-07-18T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/07/retrospective.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;4-months-in,-a-retrospective&quot;&gt;&lt;a href=&quot;#4-months-in,-a-retrospective&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;4 months in, a retrospective&lt;/h1&gt; 893 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-07-18&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 894 + &lt;p&gt;Astonishingly, it's already been &lt;i&gt;four whole months&lt;/i&gt; since starting back at the university, which I find incredibly hard to believe. I'm utterly convinced that it was only a couple of weeks ago that I walked back into the Computer Laboratory as an SRA for the first time since 2021, but here we are, at the end of term already. Time to do a bit of a retrospective and forward-looking plan for the next 3-4 months!&lt;/p&gt; 895 + &lt;h2 id=&quot;what's-happened?&quot;&gt;&lt;a href=&quot;#what's-happened?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What's happened?&lt;/h2&gt; 896 + &lt;p&gt;On Wednesday this week, I had a chance to sit down with Anil, supposedly to talk about the upcoming lecturing of 1A Foundations of Computer Science, but we ended up talking about what I've been doing for the past few months, and where it fits into the broader picture of the group as a whole. It was a really useful conversation, and I thought it would be good to outline it here while it's fresh in my mind.&lt;/p&gt; 897 + &lt;p&gt;So then, to start, what have I been doing? What have I achieved? What have I learnt? It's been a bit of a daunting experience, landing in a team that are already working at one hundred miles an hour on things well out of my comfort zone. 
I've been going to group meetings and having lots of interesting conversations, but I've found it difficult to make the next steps happen. One area where I've had some success is in working with Sadiq on LLMs - in particular, getting local LLMs to solve programming exercises that we both &lt;a href=&quot;https://toao.com/blog/ocaml-local-code-models&quot;&gt;wrote&lt;/a&gt; &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/05/ticks-solved-by-ai&quot;&gt;up&lt;/span&gt;. I've also been working with him on taking the output from the docs CI and &lt;a href=&quot;https://github.com/sadiqj/odoc-llm&quot;&gt;summarising it with LLMs&lt;/a&gt; in order to create an MCP server that would help tools like &lt;a href=&quot;https://anthropic.com/&quot;&gt;Claude Code&lt;/a&gt; to choose OCaml packages to solve users' problems.&lt;/p&gt; 898 + &lt;p&gt;It's been somewhat easier, partly due to inertia, to carry on with projects that had been in flight at the time I started. Things like getting the Odoc 3 generated docs onto ocaml.org, which is finally complete only &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/odoc-3-live-on-ocaml-org&quot;&gt;as of this week!&lt;/span&gt; This has taken a whole lot of time, but I'm really pleased with the end results. There are still an awful lot of improvements that I'd like to see made, which, after drawing breath for a couple of weeks, I'll be writing down.&lt;/p&gt; 899 + &lt;p&gt;An itch I'd been wanting to scratch for a long time has been to look at client-side ocaml notebooks. I decided to make this an integral &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/this-site&quot;&gt;feature of this blog&lt;/span&gt;, and I've learnt an awful lot doing it. 
An important feature of this that I've been keeping in mind is the idea that we could use the ocaml-docs-ci tool to build the libraries, which would allow us to host a toplevel for every single package in opam-repository - allowing at best &lt;a href=&quot;https://discuss.ocaml.org/t/an-example-for-every-ocaml-package/16953/10&quot;&gt;interactive examples&lt;/a&gt;, and at bare minimum merlin for live type-checking and autocompletion. The important principles to keep in mind for this are that:&lt;/p&gt; 900 + &lt;ul&gt;&lt;li&gt;We have one 'toplevel' javascript file, and libraries and cmis are dynamically loaded&lt;/li&gt;&lt;li&gt;The interface between the frontend and the worker must not rely on a matched pair, e.g. an OCaml-5.3-compiled frontend might be talking to an OCaml-4.08-compiled worker thread - or even an oxcaml one!&lt;/li&gt;&lt;/ul&gt; 901 + &lt;p&gt;I have this all working on my blog, where I have both an oxcaml worker and a standard ocaml worker and they both dynamically load in libraries and cmis as specified on the page.&lt;/p&gt; 902 + &lt;p&gt;I've also supervised a 1A course for the first time - &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2425/IntroProb/&quot;&gt;Introduction to Probability&lt;/a&gt;, and I've done some marking for the 1A Foundations of Computer Science.&lt;/p&gt; 903 + &lt;p&gt;Something that I'd been expecting to do a lot on was work with oxcaml, but as the release happened later than anticipated and coincided with the marking and supervising, I've not done quite as much of this as I had thought I would. 
In addition, I had anticipated working on Odoc to start implementing the new features of oxcaml, but to avoid duplicating effort I've been waiting for the patches that have already been written at Jane Street to at least get odoc to compile, which have taken longer than I had hoped to get to me.&lt;/p&gt; 904 + &lt;h2 id=&quot;what's-next?&quot;&gt;&lt;a href=&quot;#what's-next?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What's next?&lt;/h2&gt; 905 + &lt;p&gt;With that in mind, Anil and I then talked about the bigger picture, as those of you who know Anil will be entirely unsurprised to hear! In particular, how will we be weaving the various threads of these activities - the teaching of OCaml, the large-scale (for OCaml) CI work, the LLMs and Oxcaml work together to form a coherent whole? How do I find a balance between them and ensure that we find &lt;a href=&quot;https://arxiv.org/abs/1106.0848&quot;&gt;synergies&lt;/a&gt; as opposed to pulling in different directions? How do we make sure what we're doing helps us navigate the upending of the nature of development that agentic coding is bringing?&lt;/p&gt; 906 + &lt;h3 id=&quot;efficient-and-reusable-ci&quot;&gt;&lt;a href=&quot;#efficient-and-reusable-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Efficient and reusable CI&lt;/h3&gt; 907 + &lt;p&gt;A clear and obvious area where we'll be able to see real progress is to extract from docs CI the logic that I've been using to do efficient builds of packages. As I previously &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/odoc-3-live-on-ocaml-org&quot;&gt;wrote about&lt;/span&gt;, the new CI system is far more efficient than some of the other ocurrent-based pipelines, and it would save a huge amount of compute time if we were to take this tech and apply it elsewhere.&lt;/p&gt; 908 + &lt;p&gt;So, how might we take what we've got and produce something useful to everyone? 
We need to take a hammer to the fracture points of the docs CI service and split it into individually useful parts. Here are some next steps as I see them now. Let's take the solver out of docs CI, and have a service whose sole job is to create a repository of up-to-date solutions for all versions of all packages in opam-repository. These are the data that allow us to build the tree of package builds.&lt;/p&gt; 909 + &lt;p&gt;Next, turn these solutions into one giant build. Perhaps a script? Maybe a giant buildkit dockerfile? This is very similar to Mark Elvers' &lt;a href=&quot;https://github.com/mtelvers/ohc&quot;&gt;day10&lt;/a&gt; project. We can get this running on a big machine and just see how fast we can build everything. The key thing here is that it should be &lt;em&gt;trivial&lt;/em&gt; to run this on a linux box. A raspberry pi or a 768-core behemoth with 3TiB of ram. Just how fast &lt;em&gt;can&lt;/em&gt; we get it going? It's already building in a couple of days using &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/07/odoc-3-live-on-ocaml-org&quot;&gt;sage&lt;/span&gt;, but that's using ocurrent/obuilder, which isn't quite the right tool for the job, and on a relatively puny machine. Can we do it in an hour? 10 minutes? Certainly the incremental builds ought to be done in seconds. What's the limit?&lt;/p&gt; 910 + &lt;p&gt;These tools can then be used as the foundation for other CI systems. For opam-repo-ci, where we should be able to do the builds for a new package incredibly quickly. For opam-health-check, where we currently build foundational packages like dune and findlib &lt;i&gt;thousands of times&lt;/i&gt; per run.&lt;/p&gt; 911 + &lt;p&gt;Once we've got the packages built, docs CI is simply a pass over the top of the built artifacts. ocaml-docs-ci already demonstrates this - it only takes a few hours to rebuild all the docs when a new version of odoc is released, but in a way that only benefits docs! 
All the CI systems should be able to use this.&lt;/p&gt; 912 + &lt;p&gt;We should also then be able to run js_of_ocaml on the libraries to build the infrastructure needed for the per-package toplevels for ocaml.org that I mentioned above. Each of these steps should be separate stages in a pipeline - one where each step produces artifacts for the next to consume.&lt;/p&gt; 913 + &lt;p&gt;When we mix in some of the projects that other people in the team are working on, like David's work on &lt;a href=&quot;https://www.dra27.uk/blog/&quot;&gt;relocatable OCaml&lt;/a&gt;, we've got something that might be able to form a basis for a binary cache for Dune Package Management, particularly when we involve Ryan's &lt;a href=&quot;https://ryan.freumh.org/papers.html#2025-arxiv-hyperres&quot;&gt;Hyperres&lt;/a&gt; paper so we might check that dependencies from outside of the OCaml universe are correct. Can we use &lt;a href=&quot;https://github.com/quantifyearth/shark&quot;&gt;Patrick and Michael's shark&lt;/a&gt; to do the build steps? Can we use these images to serve up toplevels for ocaml.org that are &lt;em&gt;real toplevels&lt;/em&gt; rather than javascript toplevels? Can we use these build environments to help with reinforcement learning to train LLMs on OCaml code? There are a lot of interesting directions to take this work.&lt;/p&gt; 914 + &lt;h3 id=&quot;other-projects&quot;&gt;&lt;a href=&quot;#other-projects&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Other projects&lt;/h3&gt; 915 + &lt;p&gt;There are, of course, other responsibilities that I have. 
Some of these I'll be able to fit in with the theme above, and some - well - maybe I'll have to figure out how to delegate them, a skill that I am not particularly good at, but one that I feel I should learn!&lt;/p&gt; 916 + &lt;h4 id=&quot;teaching&quot;&gt;&lt;a href=&quot;#teaching&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Teaching&lt;/h4&gt; 917 + &lt;p&gt;A looming, terrifying, but tremendously exciting opportunity is teaching of 1A Foundations of Computer Science. This is amongst the first courses we teach our incoming undergraduates, currently lectured by &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/&quot;&gt;Anil&lt;/a&gt;. As he's on sabbatical this year, he has asked me to step up and lecture it. This is definitely not one for delegation!&lt;/p&gt; 918 + &lt;p&gt;The immediate question, partly raised by my work with Sadiq, is: what do we do about LLMs? How should we adjust our teaching to take into account the existence of these tools? We had a very interesting chat earlier in the term with Professor &lt;a href=&quot;https://eecs.iisc.ac.in/people/prof-viraj-kumar/&quot;&gt;Viraj Kumar&lt;/a&gt; from &lt;a href=&quot;https://eecs.iisc.ac.in/&quot;&gt;IISc&lt;/a&gt; who was visiting Cambridge earlier this year. He's been &lt;a href=&quot;https://dl.acm.org/doi/10.1145/3724363.3729100&quot;&gt;working on this question&lt;/a&gt; for a while now, and I hope to have some more conversations with him over the summer.&lt;/p&gt; 919 + &lt;h4 id=&quot;odoc-paper&quot;&gt;&lt;a href=&quot;#odoc-paper&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc paper&lt;/h4&gt; 920 + &lt;p&gt;An area where I've really made a shockingly small amount of progress is to write up all the work that's gone into Odoc over the past 6 (!!!) years.&lt;/p&gt; 921 + &lt;h4 id=&quot;odoc-notebooks&quot;&gt;&lt;a href=&quot;#odoc-notebooks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc notebooks&lt;/h4&gt; 922 + &lt;p&gt;This needs to be tidied up and a v0.1 released. 
In particular, the work on js_top_worker might well be shared with Arthur's &lt;a href=&quot;https://github.com/art-w/x-ocaml&quot;&gt;x-ocaml&lt;/a&gt; for a unified toplevel experience.&lt;/p&gt; 923 + &lt;h4 id=&quot;ai-work&quot;&gt;&lt;a href=&quot;#ai-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;AI work&lt;/h4&gt; 924 + &lt;p&gt;I'd like to carry on the work I've started with Sadiq on the interaction of LLMs with OCaml. Getting the package search to work sensibly for an MCP server is first on the list. Doing some reinforcement learning to improve performance specifically on OCaml is also very interesting, but not something I've managed to carve out the time for yet. Something along the lines of &lt;a href=&quot;https://arxiv.org/abs/2504.21798&quot;&gt;swesmith&lt;/a&gt; but adapted for OCaml.&lt;/p&gt; 925 + &lt;h4 id=&quot;oxcaml-odoc&quot;&gt;&lt;a href=&quot;#oxcaml-odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Oxcaml Odoc&lt;/h4&gt; 926 + &lt;p&gt;Odoc needs to have some work done on it to support the new work that's gone into oxcaml, for example documenting of the modes. This is something I do expect to be working on soon.&lt;/p&gt; 927 + &lt;h4 id=&quot;dune-and-odoc&quot;&gt;&lt;a href=&quot;#dune-and-odoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Dune and odoc&lt;/h4&gt; 928 + &lt;p&gt;Work needs to be done on the dune rules for odoc, which currently only support the feature-set in odoc 2.x. 
Paul-Elliot has &lt;a href=&quot;https://github.com/ocaml/dune/pull/11716&quot;&gt;done some work on this&lt;/a&gt;, but much more needs to be done.&lt;/p&gt; 929 + &lt;h4 id=&quot;further-general-odoc-work&quot;&gt;&lt;a href=&quot;#further-general-odoc-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Further general odoc work&lt;/h4&gt; 930 + &lt;ul&gt;&lt;li&gt;Better source rendering&lt;/li&gt;&lt;li&gt;Syntax for linking to source&lt;/li&gt;&lt;li&gt;Custom tags (used in odoc_notebook)&lt;/li&gt;&lt;li&gt;Web-native rendering, for embedding odoc in a website&lt;/li&gt;&lt;li&gt;Unifying paths and cpaths (https://github.com/jonludlam/odoc/tree/parameterised-paths)&lt;/li&gt;&lt;/ul&gt; 931 + &lt;h2 id=&quot;what-to-actually-do?&quot;&gt;&lt;a href=&quot;#what-to-actually-do?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What to &lt;i&gt;actually&lt;/i&gt; do?&lt;/h2&gt; 932 + &lt;p&gt;There are a lot of things in the above list. I'm not sure yet how I manage to figure out what I actually end up doing, and how that helps me to help Tarides, to fit in as a useful member of the EEG group, and to make sure I'm doing what's right for my own future. 
I feel the core project of the CI work will help everyone, but slotting the other work into the bigger picture will require some careful thought.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/07/retrospective.html</id><title type="text">4 months in, a retrospective</title><updated>2025-07-18T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Week 28</summary><published>2025-07-14T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/07/week28.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;week-28&quot;&gt;&lt;a href=&quot;#week-28&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Week 28&lt;/h1&gt; 933 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-07-14&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 934 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;caqti.platform,mariadb&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 935 + &lt;h2 id=&quot;ocaml-mcp-server&quot;&gt;&lt;a href=&quot;#ocaml-mcp-server&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OCaml MCP server&lt;/h2&gt; 936 + &lt;p&gt;Last week I got the summarisation to the point where it felt useful to run it across all the modules in opam. With this completed we then got to try out the MCP server to see how useful it would be in practice.&lt;/p&gt; 937 + &lt;p&gt;One of the first queries &lt;a href=&quot;https://anil.recoil.org/&quot;&gt;Anil&lt;/a&gt; tried was to ask it which libraries would be most useful for &amp;quot;date time parsing and formatting&amp;quot;. 
We were surprised to see that the first two libraries it returned were &lt;code&gt;caqti&lt;/code&gt; and &lt;code&gt;mariadb&lt;/code&gt;, specifically mentioning the module &lt;code&gt;Caqti_platform.Conv&lt;/code&gt; and &lt;code&gt;Mariadb.S.Time&lt;/code&gt;. While these do indeed provide the required functionality, they're probably not the right libraries to provide this. It's going to be tricky to decide this in the MCP server, so we should probably be leaving it up to the LLM to decide amongst them on the client. However, for very general queries we might end up with a large number of matching libraries, so we'll need to have a limit on the number of packages returned, which implies some form of ranking.&lt;/p&gt; 938 + &lt;p&gt;One way we can do this is by using the occurrences code in odoc. The idea is that we examine module implementation files (i.e., ml rather than mli files), and count the number of times the code uses values, types and other identifiers from other libraries. We can then aggregate these counts over all packages in opam repository and use it as an effective marker of popularity, which allows us to rank the results by popularity and only return the top N results.&lt;/p&gt; 939 + &lt;p&gt;We're not currently using the occurrences for anything, so I wasn't especially surprised to find that it's not working as intended. 
There were a number of issues:&lt;/p&gt; 940 + &lt;ul&gt;&lt;li&gt;The occurrences output file was being written at a path not within the package dir, so it wasn't being persisted.&lt;/li&gt;&lt;li&gt;The CLI interface for generating occurrences works by providing a directory containing the odocl files, but we were only providing the top-level directory and it wasn't recursively searching.&lt;/li&gt;&lt;li&gt;Once the occurrences were captured, the aggregation step used the full identifier of the value being aggregated, meaning that, for example, &lt;code&gt;List.length&lt;/code&gt; in OCaml 5.3 was counted separately from &lt;code&gt;List.length&lt;/code&gt; in OCaml 4.14.&lt;/li&gt;&lt;/ul&gt; 941 + &lt;p&gt;All of these issues are with code in the odoc repository, which, as it happens, also needs a release soon to ensure that it works with the imminent launch of OCaml 5.4. During the week, before I discovered the problems above, I had attempted to make a release of Odoc 3.1, but there was a license kerfuffle that, when combined with the issues in the occurrences code, gave me enough cause to pull the release.&lt;/p&gt; 942 + &lt;p&gt;Before I try to make the release again, this time I'll be running the release candidate with docs-ci, and checking that the occurrences make sense. I set this running on Friday afternoon, and it had completed by Friday evening, so it's actually pretty quick to rerun odoc on the 15,000 or so packages required for ocaml.org.&lt;/p&gt; 943 + &lt;h2 id=&quot;trouble-with-this-blog&quot;&gt;&lt;a href=&quot;#trouble-with-this-blog&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Trouble with this blog&lt;/h2&gt; 944 + &lt;p&gt;In other news, in trying to post my blog at the beginning of the week, I was stymied a little by the changes in oxcaml. I had been using a custom opam-repository forked from the official oxcaml one, because I needed a patched js_of_ocaml in order to fix the toplevel code. 
I had hoped this would mean that I could update it on my schedule, rather than being at the mercy of upstream changes. Unfortunately though, the download URL for ocaml-flambda wasn't pointing at an immutable commit, so when I tried it I got a checksum error. So I ended up trying to rebase the changes onto the latest oxcaml opam-repository, which didn't go well at all. The version numbers had all changed, which in opam means that files are in different directories, so git got thoroughly confused. On top of that, because the js_of_ocaml repository has multiple packages in it, whereas opam repository has a directory per-package, we end up having multiple copies of the patches. So in the end I've just committed all the patches to a git repo on github, and pinned it in the Dockerfile that builds this site.&lt;/p&gt; 945 + &lt;p&gt;What would be handy is a way to apply the patches in a package in opam repository to and from a git repository, similar to quilt/guilt. We don't quite have all of the pieces to do this, as although we have a download URL and often a dev-repo, I don't believe we currently have a way to get the base commit of that repository.&lt;/p&gt; 946 + &lt;h2 id=&quot;oxcaml-continues&quot;&gt;&lt;a href=&quot;#oxcaml-continues&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Oxcaml continues&lt;/h2&gt; 947 + &lt;p&gt;We had a meeting on Thursday with Jane Street on the next steps for oxcaml. There are a number of areas in which JS are keen for us to help out with.&lt;/p&gt; 948 + &lt;ul&gt;&lt;li&gt;Playgrounds - both javascript and docker-based. The playground on the oxcaml website right now uses github codespaces, which works nicely but currently takes an absolute age to start up. We can almost certainly improve this by building docker images and pushing them to the docker hub, rather than building oxcaml from scratch when starting the codespace. 
There's also interest in the javascript playgrounds, which can serve a slightly different purpose than the docker-based one, more limited in how it can be used, but without requiring someone to spin up a full docker container.&lt;/li&gt;&lt;li&gt;Documentation - Odoc has had some patches to run on oxcaml, but there's no support for documenting many of the new features yet, including modes. We've got to do some experiments here to see what the best way is to show the new type-system features in the generated docs. There were some suggestions of using javascript to show/hide the modes, for example.&lt;/li&gt;&lt;li&gt;Improvements in Merlin - again this is an area ripe for investigation. In particular, how do we best expose the new features of the type system for users? What's needed here is user feedback from people who are actually using oxcaml to build real projects.&lt;/li&gt;&lt;li&gt;Better error messages - OCaml has been getting improved error messages with each release, but there's still room for improvement, and the new features of the type system in particular have many different failure modes. 
Again, we need user feedback to understand the pain points and improve the error messages accordingly.&lt;/li&gt;&lt;/ul&gt; 949 + &lt;h2 id=&quot;next-week&quot;&gt;&lt;a href=&quot;#next-week&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Next week&lt;/h2&gt; 950 + &lt;p&gt;Next week, the plan is to:&lt;/p&gt; 951 + &lt;ul&gt;&lt;li&gt;Check the occurrences from docs-ci, and integrate into the MCP server&lt;/li&gt;&lt;li&gt;Talk to &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark&lt;/a&gt; about building the docker image for the oxcaml playground&lt;/li&gt;&lt;li&gt;Tidy up the &lt;code&gt;Js_top_worker&lt;/code&gt; code so it can be used in the javascript oxcaml playground&lt;/li&gt;&lt;li&gt;Release Odoc 3.1&lt;/li&gt;&lt;/ul&gt;</content><id>https://jon.recoil.org/blog/2025/07/week28.html</id><title type="text">Week 28</title><updated>2025-07-14T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">As of today, Odoc 3 is now live on OCaml.org! This is a major update to odoc, and has brought a whole host of new features and improvements to the documentation pages.</summary><published>2025-07-14T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/07/odoc-3-live-on-ocaml-org.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;odoc-3-is-live-on-ocaml.org!&quot;&gt;&lt;a href=&quot;#odoc-3-is-live-on-ocaml.org!&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc 3 is live on OCaml.org!&lt;/h1&gt; 952 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-07-14&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 953 + &lt;p&gt;As of today, Odoc 3 is now live on OCaml.org! 
This is a major update to odoc, and has brought a whole host of new features and improvements to the documentation pages.&lt;/p&gt; 954 + &lt;p&gt;Some of the highlights include:&lt;/p&gt; 955 + &lt;ul&gt;&lt;li&gt;Source code rendering&lt;/li&gt;&lt;li&gt;Hierarchical manual pages&lt;/li&gt;&lt;li&gt;Image, video and audio support&lt;/li&gt;&lt;li&gt;Separation of API docs by library&lt;/li&gt;&lt;/ul&gt; 956 + &lt;p&gt;A huge amount of work went into the &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-0-released/16339&quot;&gt;Odoc 3.0 release&lt;/a&gt;, and I'd like to thank my colleagues at Tarides, in particular &lt;a href=&quot;https://github.com/panglesd&quot;&gt;Paul-Elliot&lt;/a&gt; and &lt;a href=&quot;https://github.com/julow/&quot;&gt;Jules&lt;/a&gt; for the work they put into this.&lt;/p&gt; 957 + &lt;p&gt;But the odoc release happened months ago, so why is it only going live now? So, the doc tool itself is only one small part of getting the docs onto ocaml.org. Odoc works on the &lt;a href=&quot;https://discuss.ocaml.org/t/cmt-cmti-question/5308&quot;&gt;cmt and cmti&lt;/a&gt; files that are produced during the build process, and so part of the process of building docs is to build the packages, so we have to, at minimum, attempt to build all 17,000 or so distinct versions of the packages in opam-repository. 
The &lt;a href=&quot;https://github.com/ocurrent&quot;&gt;ocurrent&lt;/a&gt; tool &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt;, which I've previously &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/05/docs-progress&quot;&gt;written&lt;/span&gt; &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/ocaml-docs-ci-and-odoc-3&quot;&gt;about&lt;/span&gt;, is responsible for these builds and in this new release has demonstrated a new approach to this task, where we attempt to do the build in as efficient a way as possible by effectively building binary packages once for each required package in a specific 'universe' of dependencies. For example, many packages require e.g. &lt;a href=&quot;https://erratique.ch/software/cmdliner&quot;&gt;cmdliner.1.3.0&lt;/a&gt; to build, and some require a specific version of OCaml too. So we'll build cmdliner.1.3.0 once against each version of OCaml required -- but &lt;i&gt;only once&lt;/i&gt;, which is in contrast to how some of the other tools in the ocurrent suite work, e.g. &lt;a href=&quot;https://github.com/ocurrent/opam-repo-ci&quot;&gt;opam-repo-ci&lt;/a&gt;. Once the packages are built, we then run the new tool &lt;a href=&quot;https://ocaml.github.io/odoc/odoc-driver/index.html&quot;&gt;odoc_driver&lt;/a&gt; to actually build the HTML docs. In addition to this, a new feature of Odoc 3 is to be able to link to packages that aren't your direct dependencies - so for example, the docs of odoc contain links to the docs of odoc_driver, even though odoc_driver depends upon odoc. 
This, whilst sounding easy enough, required some radical changes in the docs ci, which I promise I will write about later!&lt;/p&gt; 958 + &lt;p&gt;The builds and the generation of the docs are all done on a single blade server, called &lt;a href=&quot;https://sage.caelum.ci.dev&quot;&gt;sage&lt;/a&gt;, with 40 threads, two 8TiB spinning drives and a 1.8TiB SSD cache, and it produces about 1 TiB of data over the course of a couple of days. The changes required to this part of the process since odoc 2.x were primarily made by myself and &lt;a href=&quot;https://tunbury.org&quot;&gt;Mark Elvers&lt;/a&gt;.&lt;/p&gt; 959 + &lt;p&gt;Once the docs are built, how do they get onto ocaml.org? Odoc itself knows nothing about the layout and styling of ocaml.org, so the HTML it produces isn't suitable to be just rendered when a user requests particular docs. What happens is that odoc produces, as well as a self-contained HTML file, a json file with the body of the page, the sidebars, the breadcrumbs and so on as structured data, one per HTML page, which are then served from sage over HTTP. When a user requests a particular docs page, the ocaml.org server will request that json file from sage, then render this with the ocaml.org styling, then send it back to the user.&lt;/p&gt; 960 + &lt;p&gt;As odoc 3 moved a fair bit of logic from ocaml.org into odoc itself, there were quite a few changes that needed to be made to the ocaml.org server to integrate this into the site. 
This work was mostly done by &lt;a href=&quot;https://github.com/panglesd&quot;&gt;Paul-Elliot&lt;/a&gt; and myself, with a lot of help from the &lt;a href=&quot;https://github.com/ocaml/ocaml.org?tab=readme-ov-file#maintainers&quot;&gt;ocaml.org team&lt;/a&gt;, in particular &lt;a href=&quot;&quot;&gt;Sabine Schmaltz&lt;/a&gt; and &lt;a href=&quot;https://github.com/cuihtlauac&quot;&gt;Cuihtlauac Alvarado&lt;/a&gt;.&lt;/p&gt; 961 + &lt;p&gt;So, quite a lot of integration and infrastructure work was required to get the new docs site up and running, and I'm very happy to see this particular task concluded!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/07/odoc-3-live-on-ocaml-org.html</id><title type="text">Odoc 3 is live on OCaml.org!</title><updated>2025-07-14T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">It's been a busy few weeks. There's been exam marking for the 1A Foundations of Computer Science course, an Odoc release to plan, and some really interesting new work on using LLMs to summarise OCaml ...</summary><published>2025-07-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/07/week27.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;weeks-24-27&quot;&gt;&lt;a href=&quot;#weeks-24-27&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Weeks 24-27&lt;/h1&gt; 962 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-07-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 963 + &lt;p&gt;It's been a busy few weeks. There's been exam marking for the 1A Foundations of Computer Science course, an Odoc release to plan, and some really interesting new work on using LLMs to summarise OCaml documentation. 
This post is about an aspect of that last one that I found particularly interesting.&lt;/p&gt; 964 + &lt;h2 id=&quot;odoc-llm&quot;&gt;&lt;a href=&quot;#odoc-llm&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;odoc-llm&lt;/h2&gt; 965 + &lt;p&gt;Sadiq and I have been &lt;a href=&quot;https://toao.com/blog/ocaml-local-code-models&quot;&gt;looking at using LLMs&lt;/a&gt; to summarise the documentation produced by Odoc. The idea is to get a concise summary of the purpose of each module, so that we can quickly identify which modules are relevant to a particular task.&lt;/p&gt; 966 + &lt;p&gt;For testing this, we need to see how it works on different types of libraries. The first axis I wanted to test on goes between 'well documented' and 'poorly documented', and so I need at least two libraries on opposite ends of the spectrum.&lt;/p&gt; 967 + &lt;p&gt;For the 'well documented' case, I chose &lt;code&gt;cmdliner&lt;/code&gt;. It's a library that I almost always have to look at the docs for when I'm using it, as, despite using it many many times, the interface doesn't seem to stick in my head.&lt;/p&gt; 968 + &lt;p&gt;For the 'poorly documented' case, I chose &lt;code&gt;odoc&lt;/code&gt; itself, somewhat ironically. In defence of the odoc authors (me included!), the libraries it provides are simply there for code organisation and aren't meant to be consumed outside of the tool itself!&lt;/p&gt; 969 + &lt;h3 id=&quot;the-approach&quot;&gt;&lt;a href=&quot;#the-approach&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The approach&lt;/h3&gt; 970 + &lt;p&gt;The output from Odoc is a set of HTML files, one per module/module type/class/etc., containing the documentation for that item. Our first take on this was to parse the HTML files and extract the text content, which we then fed to an LLM to summarise. 
However, this was pretty awkward, so we decided to try a PR that &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1341&quot;&gt;davesnx recently made to Odoc&lt;/a&gt; to output markdown instead of HTML.&lt;/p&gt; 971 + &lt;p&gt;We look for leaf modules that don't contain any submodules, and start by summarising those, then move onto the parent modules, splicing in the summaries of the children, and so on, up to the top-level modules. We then move on to summarising the whole library, which usually is just a single namespace module, but occasionally is a group of top-level modules.&lt;/p&gt; 972 + &lt;p&gt;One of the early prompts for the module &lt;code&gt;Cmdliner.Term.Syntax&lt;/code&gt; looked roughly as follows:&lt;/p&gt; 973 + &lt;pre&gt;You are an expert OCaml developer. Write a 2-3 sentence description focusing on: 974 + - The specific operations and functions this module provides 975 + - What data types or structures it works with 976 + - Concrete use cases (avoid generic terms like &amp;quot;utility functions&amp;quot; or &amp;quot;common operations&amp;quot;) 977 + 978 + Do NOT: 979 + - Repeat the module name in the description 980 + - Mention &amp;quot;functional programming patterns&amp;quot; or &amp;quot;code clarity&amp;quot; 981 + - Use filler phrases like &amp;quot;provides functionality for&amp;quot; or &amp;quot;collection of functions&amp;quot; 982 + - Describe how it works with other modules 983 + 984 + Module: Syntax 985 + Module Documentation: let operators. 986 + ( let+ ) is map. 987 + ( and* ) is product. 988 + - val (let+) : 'a t -&amp;gt; ('a -&amp;gt; 'b) -&amp;gt; 'b t (* ( let+ ) is map. *) 989 + - val (and+) : 'a t -&amp;gt; 990 + 'b t -&amp;gt; 991 + ('a * 'b) t (* ( and* ) is product. 
*)&lt;/pre&gt; 992 + &lt;p&gt;and the output using a small model (qwen3-30b-a3b) was:&lt;/p&gt; 993 + &lt;pre&gt;&amp;quot;The module provides (let+) for applying functions to values within a context and (and+) for combining two contexts into a product. It operates on applicative 994 + structures like option, list, or custom types that support these operations. For example, it enables sequential transformation of values in a context or 995 + pairing elements from two separate contexts.&amp;quot;&lt;/pre&gt; 996 + &lt;p&gt;There are quite a few issues with the input here. Firstly, we've only given the module name, not the full path. Secondly, there's nothing to let the model know what &lt;code&gt;t&lt;/code&gt; might be, so it has decided it's a completely generic type. It also has no idea about the context in which this module was found, so it has no idea that it's part of a command-line processing library. By fixing these issues, we end up with the prompt:&lt;/p&gt; 997 + &lt;pre&gt;You are an expert OCaml developer. Write a 2-3 sentence description focusing on: 998 + - The specific operations and functions this module provides 999 + - What data types or structures it works with 1000 + - Concrete use cases (avoid generic terms like &amp;quot;utility functions&amp;quot; or &amp;quot;common operations&amp;quot;) 1001 + 1002 + Do NOT: 1003 + - Repeat the module name in the description 1004 + - Mention &amp;quot;functional programming patterns&amp;quot; or &amp;quot;code clarity&amp;quot; 1005 + - Use filler phrases like &amp;quot;provides functionality for&amp;quot; or &amp;quot;collection of functions&amp;quot; 1006 + - Describe how it works with other modules 1007 + 1008 + Module: Cmdliner.Term.Syntax 1009 + 1010 + Ancestor Module Context: 1011 + - Cmdliner: Declarative definition of command line interfaces. 1012 + Consult the tutorial, details about the supported command line syntax and examples of use. 
1013 + Open the module to use it, it defines only three modules in your scope. 1014 + - Cmdliner.Term: Terms. 1015 + A term is evaluated by a program to produce a result, which can be turned into an exit status. A term made of terms referring to command line arguments implicitly defines a command line syntax. 1016 + 1017 + Module Documentation: let operators. 1018 + - val (let+) : 'a Cmdliner.Term.t -&amp;gt; ('a -&amp;gt; 'b) -&amp;gt; 'b Cmdliner.Term.t (* ( let+ ) is map. *) 1019 + - val (and+) : 'a Cmdliner.Term.t -&amp;gt; 1020 + 'b Cmdliner.Term.t -&amp;gt; 1021 + ('a * 'b) Cmdliner.Term.t (* ( and* ) is product. *)&lt;/pre&gt; 1022 + &lt;p&gt;The output of this improved prompt is much better:&lt;/p&gt; 1023 + &lt;pre&gt;The module provides operators to map and combine terms, which represent command line argument parsers and their results. `let+` transforms a parsed argument 1024 + into a new value, while `and+` merges two independent arguments into a tuple. These enable building structured command line interfaces, such as parsing a 1025 + filename and a flag simultaneously, then combining them into a configuration record.&lt;/pre&gt; 1026 + &lt;p&gt;It still occasionally incorrectly decides that this module provides monadic combinators rather than applicative, but this is where we get better results from using a larger model.&lt;/p&gt; 1027 + &lt;p&gt;There are quite a few other issues that we've fixed - for example, treating module types differently than modules, and a bug where infix operators were being omitted from the documentation. In one case, it uncovered a bug in the markdown generator where includes weren't getting expanded, which I got &lt;a href=&quot;https://github.com/jonludlam/odoc/commit/926cca100c307818e57281c3d40e98f1975f0f95&quot;&gt;Claude to fix&lt;/a&gt;. 
My &lt;i&gt;modus operandi&lt;/i&gt; has essentially been to look at the output for the test packages, find a summary that looks bonkers, and then look back at the prompt to find that, indeed, the input was missing some crucial information.&lt;/p&gt; 1028 + &lt;p&gt;One thing I'd quite like to try is to re-open a &lt;a href=&quot;https://github.com/ocaml/odoc/pull/655&quot;&gt;PR that Drup made&lt;/a&gt; as an April Fool's joke back in 2021, which ended up outputting OCaml formatted code. This is actually pretty close to what we might want to give to the LLM - though we'd probably format the comments as markdown, and we'd be replacing types with fully-qualified types as above. Funny how things turn out!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/07/week27.html</id><title type="text">Weeks 24-27</title><updated>2025-07-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Some brief notes on last week.</summary><published>2025-06-09T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/06/week23.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;week-23&quot;&gt;&lt;a href=&quot;#week-23&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Week 23&lt;/h1&gt; 1029 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;opam-format,fpath,rresult,bos&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1030 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;merlinonly&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;merlinonly&lt;/span&gt; &lt;/li&gt;&lt;/ul&gt; 1031 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-06-09&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1032 + &lt;p&gt;Some brief notes on last week.&lt;/p&gt; 1033 + &lt;h2 id=&quot;docs-ci-and-sherlodoc&quot;&gt;&lt;a
href=&quot;#docs-ci-and-sherlodoc&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Docs CI and Sherlodoc&lt;/h2&gt; 1034 + &lt;p&gt;Anil has been working on an &lt;a href=&quot;https://tangled.sh/@anil.recoil.org/odoc-mcp&quot;&gt;MCP server&lt;/a&gt; that searches through the output of the docs CI to find relevant packages and API information for opam packages. For expediency, this works by scraping the HTML output, but a potentially better solution would be to integrate properly with &lt;a href=&quot;https://doc.sherlocode.com&quot;&gt;Sherlodoc&lt;/a&gt;, &lt;a href=&quot;https://github.com/art-w/&quot;&gt;Arthur's&lt;/a&gt; code search engine.&lt;/p&gt; 1035 + &lt;h3 id=&quot;building-the-index&quot;&gt;&lt;a href=&quot;#building-the-index&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Building the index&lt;/h3&gt; 1036 + &lt;p&gt;To make this work with the new docs CI, first we need to build a sherlodoc search database. This involves grabbing all of the &lt;code&gt;.odocl&lt;/code&gt; files that odoc produces for the latest version of each library, copying them locally and running &lt;code&gt;sherlodoc index&lt;/code&gt; on the output. 
Getting &lt;em&gt;all&lt;/em&gt; of the odocl files is simple, but filtering so we only have the latest version is slightly more complex, as we need to use &lt;code&gt;opam&lt;/code&gt;'s library to make sure we're looking at the latest versions.&lt;/p&gt; 1037 + &lt;p&gt;The simple way to get the odocl files ends up unpacking them into the filesystem in a directory hierarchy that matches the URL on ocaml.org, so we see files like:&lt;/p&gt; 1038 + &lt;pre&gt;p/odoc/3.0.0/doc/odoc.document/odoc_document.odocl&lt;/pre&gt; 1039 + &lt;p&gt;So finding the version number is a matter of listing the directories, for which I took &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci/blob/4dfe7e6265610da4e0ce2a386cfbf0b8eac3d9bd/src/lib/track.ml#L58-L76&quot;&gt;some code&lt;/a&gt; from docs CI:&lt;/p&gt; 1040 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type p = { 1041 + opam : OpamPackage.t; 1042 + path : Fpath.t; 1043 + } 1044 + 1045 + let rec take n l = 1046 + match n, l with 1047 + | n, x::xs when n &amp;gt; 0 -&amp;gt; 1048 + x :: take (n - 1) xs 1049 + | _, _ -&amp;gt; [] 1050 + 1051 + let get_versions ~limit path = 1052 + let open Rresult in 1053 + let package = Fpath.basename path in 1054 + let mk_pkg v = 1055 + Printf.sprintf &amp;quot;%s.%s&amp;quot; package v 1056 + in 1057 + Bos.OS.Dir.contents path 1058 + &amp;gt;&amp;gt;| (fun versions -&amp;gt; 1059 + versions 1060 + |&amp;gt; List.map (fun path -&amp;gt; 1061 + { opam = Fpath.basename path |&amp;gt; mk_pkg |&amp;gt; OpamPackage.of_string; 1062 + path = path }) 1063 + ) 1064 + |&amp;gt; Result.get_ok 1065 + |&amp;gt; (fun v -&amp;gt; 1066 + v 1067 + |&amp;gt; List.sort (fun a b -&amp;gt; 1068 + -OpamPackage.compare a.opam b.opam) 1069 + |&amp;gt; take limit)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1070 + &lt;p&gt;This gives us a sorted list of the versions for the package, and we can pick the first one to get the latest version. 
With the output from this we can then run &lt;code&gt;sherlodoc index&lt;/code&gt; and we get a nice big (1.7 gig!) index file.&lt;/p&gt; 1071 + &lt;h3 id=&quot;serving-the-index&quot;&gt;&lt;a href=&quot;#serving-the-index&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Serving the index&lt;/h3&gt; 1072 + &lt;p&gt;The next step is to serve this index file so that the MCP server can access it. The file format is a marshalled OCaml value, so we need to have an OCaml program to read it in and perform the search, and it'll have to be a server since the whole index needs to be unmarshalled into memory before any search can be performed, and it would be dumb to do this for every query.&lt;/p&gt; 1073 + &lt;p&gt;Sherlodoc got partially integrated into odoc's code base before the 3.0 release with the exception of the server, which was left out to avoid pulling a load of new dependencies to odoc. Unfortunately, we didn't expose the sherlodoc libraries publicly, so we'll need to do that in order to make anything useful with sherlodoc. In addition, odoc embeds the version of odoc used into the odocl files, and since right now the docs CI is building with a version of odoc that &lt;em&gt;doesn't&lt;/em&gt; expose the libraries, we might have to hack around that in order to use those odocl files. Obviously the longer term solution is just to make a new release of odoc with this change and update the docs CI to use that.&lt;/p&gt; 1074 + &lt;h2 id=&quot;package-to-library-map&quot;&gt;&lt;a href=&quot;#package-to-library-map&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Package to Library map&lt;/h2&gt; 1075 + &lt;p&gt;A related quest was to assemble a map of opam package to ocamlfind library names. It's a quirk of history that the library names that an opam package provides are not necessarily related to the name of the package. 
That means that tools like &lt;code&gt;dune&lt;/code&gt; have a hard time linting projects to check that the libraries they're using are mentioned in the opam files. Dune, of course, has resolved this by ensuring that it's an error to build a package using dune where the library names don't start with the package name, but as dune is just one of many OCaml build systems, the problem remains.&lt;/p&gt; 1076 + &lt;p&gt;Since docs CI has built every version of every package, and because the Odoc 3 package layout includes the library names in the paths to the documentation, we should be able to produce a fairly definitive list of the libraries that each package provides, which tools like dune can then use for this sort of lint check. We can just tweak the code above slightly to get the library names and output a big JSON file with the mapping - or perhaps with the exceptions to dune's rule.&lt;/p&gt; 1077 + &lt;p&gt;I thought this would be a neat first project to try Claude Code on - a 'starter for ten', as it were - so I signed up to use Claude Code and gave it a shot.&lt;/p&gt; 1078 + &lt;p&gt;It handily produced a working program that did exactly what I wanted, including creating a test directory that it used to verify the code worked. One fascinating thing I noted as it scrolled past was that it tried to use &lt;code&gt;yojson&lt;/code&gt; to write the output, but failed to get it to work and reverted to &lt;code&gt;printf&lt;/code&gt; output. I suspect this will be due to it finding it troublesome to figure out the various steps that need to be taken to use a new library in a dune project, so this is something to have a play with later.&lt;/p&gt; 1079 + &lt;p&gt;After a couple of iterations with different heuristics to disambiguate between library names and other directories, I got a working program producing a JSON file with only the exceptions to dune's rule.
I took a look through and almost immediately found &lt;code&gt;camlidl&lt;/code&gt; suggesting it produces a library called &lt;code&gt;com&lt;/code&gt;. This didn't look right at all so I installed it and found that the library is actually named &lt;code&gt;camlidl&lt;/code&gt;. The &lt;code&gt;cma&lt;/code&gt; file, though, is named &lt;code&gt;com.cma&lt;/code&gt;, so it looks like &lt;code&gt;odoc_driver&lt;/code&gt; has a bug. Interestingly, running &lt;code&gt;odoc_driver&lt;/code&gt; locally gets the library name correct, so it's only an issue when running it in the docs CI. &lt;a href=&quot;https://github.com/ocaml/odoc/issues/1351&quot;&gt;Issue filed&lt;/a&gt;.&lt;/p&gt; 1080 + &lt;h2 id=&quot;further-claude-code-experiments&quot;&gt;&lt;a href=&quot;#further-claude-code-experiments&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Further Claude Code experiments&lt;/h2&gt; 1081 + &lt;p&gt;To see how well Claude Code could handle more complex tasks, I thought I'd give it a whirl on something more like its home territory, and somewhere I was less familiar. I decided to ask it to write some code to make a nicer editor experience for the notebooks project. Since the &lt;a href=&quot;https://github.com/jonludlam/jsoo-code-mirror&quot;&gt;bindings to codemirror&lt;/a&gt; I'm using are very minimal, any new feature I want to use means writing a bunch of bindings first, and only then seeing if the feature works as I'd like. So instead I thought I'd get Claude to write the editor code for me in JavaScript, and then I could make sure it works as I want and only then convert it to OCaml.
This worked pretty nicely, and I've now got a neat &lt;a href=&quot;https://jon.ludl.am/experiments/notebook-editor/notebook-editable.html&quot;&gt;demonstration editor&lt;/a&gt; that I can use to guide the OCaml implementation.&lt;/p&gt; 1082 + &lt;h2 id=&quot;more-notebook-work&quot;&gt;&lt;a href=&quot;#more-notebook-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;More notebook work&lt;/h2&gt; 1083 + &lt;p&gt;The oxcaml project will hopefully be launching this week. I've been looking at Luke's &lt;a href=&quot;https://github.com/ocaml-flambda/flambda-backend/pull/3886&quot;&gt;Parallelism tutorial&lt;/a&gt; and have been thinking about how this will work with the online notebooks. The parallel library works via effects, and the oxcaml branch of js_of_ocaml doesn't support effects yet, and it might be a while before it does. However, the blog post is mainly talking about the intricacies of the type system work that's been done to ensure the parallel library is safe, and as such perhaps we can get a lot out of doing this online with just Merlin.&lt;/p&gt; 1084 + &lt;p&gt;Some early experimentation showed that the parallel library breaks the worker on load, so we need to do something a bit more sophisticated than just 'not call exec'. I did some work to have a mode of worker that doesn't load the cmas, just the cmis for Merlin. This is almost there.&lt;/p&gt; 1085 + &lt;h2 id=&quot;odoc-work&quot;&gt;&lt;a href=&quot;#odoc-work&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc work&lt;/h2&gt; 1086 + &lt;p&gt;OCaml 5.4 is just around the corner, and there's some odoc work to be done before the release. One of the main new features that will impact odoc is the new &lt;a href=&quot;https://tyconmismatch.com/papers/ml2024_labeled_tuples.pdf&quot;&gt;labelled tuples&lt;/a&gt; feature.
Fortunately &lt;a href=&quot;https://github.com/lukemaurer&quot;&gt;Luke Maurer&lt;/a&gt; has already done a lot of work to plumb this into odoc, which will save us a lot of effort - thanks, Luke! There are likely to be a few other bits and pieces to do, but hopefully not too much.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/06/week23.html</id><title type="text">Week 23</title><updated>2025-06-09T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">The docs build is progressing well, and we've hit 20,000 packages (20,038 to be precise). So at this point I thought it'd be useful to take a look through the various failures to see if there are any in...</summary><published>2025-05-29T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/docs-progress.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;progress-in-ocaml-docs&quot;&gt;&lt;a href=&quot;#progress-in-ocaml-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Progress in OCaml docs&lt;/h1&gt; 1087 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-29&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1088 + &lt;p&gt;The docs build is progressing well, and we've &lt;i&gt;just about&lt;/i&gt; hit 20,000 packages (20,038 to be precise). So at this point I thought it'd be useful to take a look through the various failures to see if there are any insights to be gained.&lt;/p&gt; 1089 + &lt;p&gt;Since odoc requires a built package in order to generate the docs, there are two steps that have to be done before we can begin building the docs. Step one is to figure out the exact set of packages to build, i.e. doing an opam solve, and step two is to actually build the packages. These two steps are, to some extent, out of docs-ci's control, and rely on the state of the opam repository.
While there are efforts to keep this in as good a state as possible, it's still the case that these steps fail much more often than the actual docs build itself. Let's take a look at some of the failures we see in each of these steps.&lt;/p&gt; 1090 + &lt;h2 id=&quot;step-1:-opam-solve&quot;&gt;&lt;a href=&quot;#step-1:-opam-solve&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Step 1: opam solve&lt;/h2&gt; 1091 + &lt;p&gt;There are 2,074 solver failures. A good chunk of these are due to the way docs ci itself works: it starts with a specific version of OCaml. In order to do this, the solution must have a specific version of OCaml in it, and this is not always the case; for example, all of the &lt;code&gt;conf-*&lt;/code&gt; packages fail in this way. This particular class of &amp;quot;failures&amp;quot; is not at all important, as mostly they don't contain useful documentation, but even if they do, if they're actually being used then they will be built as part of another solution. For example, while &lt;code&gt;conf-faad&lt;/code&gt; fails with this error, the solution of the &lt;code&gt;faad&lt;/code&gt; package pulls it in anyway, so we can build any docs that it includes. Roughly a third (685) of the reported failures are due to this, and by checking the resulting HTML docs we can see that we do get docs for 278 of these, so they must be pulled in by other solutions.&lt;/p&gt; 1092 + &lt;p&gt;The remaining failures are &amp;quot;real&amp;quot; in the sense that we never currently get docs for these packages. In turn, these can be subcategorised. One class of failures happens with platform-specific packages, for example &lt;code&gt;camlkit&lt;/code&gt;, which provides bindings to Cocoa frameworks and is only available on macOS, or &lt;code&gt;eio_windows&lt;/code&gt;, which clearly requires Windows. The current docs-ci setup only builds on Linux, and extending this to other platforms will require a little more work, and is not currently scheduled.
These are &amp;quot;fixable&amp;quot; failures.&lt;/p&gt; 1093 + &lt;p&gt;The third class of failures are those that will &amp;quot;just never work&amp;quot;. For example, some early versions of &lt;code&gt;domainslib&lt;/code&gt; were released before the OCaml 5.0 APIs were finalised, and so they can only work with alpha versions of OCaml 5.0. We won't be documenting these.&lt;/p&gt; 1094 + &lt;p&gt;Finally there are some more 'unexplained' failures, such as &lt;code&gt;docteur.0.0.2&lt;/code&gt;. This one's particularly interesting as the solve actually succeeds when using the stand-alone tool opam-0install, whereas it's failing in docs-ci, which uses opam-0install as a library! I'm currently suspicious about the 'deprecated' flag, as the failure log contains:&lt;/p&gt; 1095 + &lt;pre&gt;- git-cohttp-unix -&amp;gt; (problem) 1096 + No usable implementations: 1097 + git-cohttp-unix.3.6.0: Availability condition not satisfied 1098 + git-cohttp-unix.3.5.0: Availability condition not satisfied 1099 + git-cohttp-unix.3.4.0: Availability condition not satisfied 1100 + git-cohttp-unix.3.3.3: Availability condition not satisfied 1101 + git-cohttp-unix.3.3.2: Availability condition not satisfied 1102 + ...&lt;/pre&gt; 1103 + &lt;p&gt;and that flag is the only thing I can immediately see that stands out in &lt;code&gt;git-cohttp-unix&lt;/code&gt;. In contrast, the solution given by opam-0install contains &lt;code&gt;git-cohttp-unix.3.6.0&lt;/code&gt; as a dependency. I suspect fixing this will cause quite a few more packages to succeed.&lt;/p&gt; 1104 + &lt;h2 id=&quot;step-2:-building-packages&quot;&gt;&lt;a href=&quot;#step-2:-building-packages&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Step 2: building packages&lt;/h2&gt; 1105 + &lt;p&gt;The next step, once we've got the solutions, is to build the packages. 
This uses the new method I &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/ocaml-docs-ci-and-odoc-3&quot;&gt;previously wrote about&lt;/span&gt;. There are about 1,000 packages that fail to build, and once again we can take a look and categorise some of these failures. There is a wider variety of failures here, and it's quite useful to cross-check with &lt;em&gt;opam health check&lt;/em&gt; (OHC) to see whether a package is already known to be broken. Unfortunately OHC only builds the latest versions of everything, so we can't check in some cases. The interesting issues are where we're failing to build something that seems to work in OHC.&lt;/p&gt; 1106 + &lt;h3 id=&quot;llvm.18&quot;&gt;&lt;a href=&quot;#llvm.18&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;llvm.18&lt;/h3&gt; 1107 + &lt;p&gt;This is an interesting type of error, where the build fails because of a missing external dependency. The &lt;code&gt;llvm&lt;/code&gt; package depends upon &lt;code&gt;conf-llvm-static.18&lt;/code&gt;, which should be able to install the depext.
Looking at the package, it does indeed have a depext for Debian:&lt;/p&gt; 1108 + &lt;pre&gt;depexts: [ 1109 + [&amp;quot;llvm@18&amp;quot; &amp;quot;zstd&amp;quot;] {os-distribution = &amp;quot;homebrew&amp;quot; &amp;amp; os = &amp;quot;macos&amp;quot;} 1110 + [&amp;quot;llvm-18&amp;quot;] {os-distribution = &amp;quot;macports&amp;quot; &amp;amp; os = &amp;quot;macos&amp;quot;} 1111 + [&amp;quot;llvm-18-dev&amp;quot; &amp;quot;zlib1g-dev&amp;quot; &amp;quot;libzstd-dev&amp;quot;] {os-family = &amp;quot;debian&amp;quot;} 1112 + [&amp;quot;llvm18-dev&amp;quot;] {os-distribution = &amp;quot;alpine&amp;quot;} 1113 + [&amp;quot;llvm18&amp;quot;] {os-family = &amp;quot;arch&amp;quot;} 1114 + [&amp;quot;llvm18-devel&amp;quot;] {os-family = &amp;quot;suse&amp;quot; | os-family = &amp;quot;opensuse&amp;quot;} 1115 + [&amp;quot;llvm18-devel&amp;quot;] {os-distribution = &amp;quot;fedora&amp;quot; &amp;amp; os-version &amp;gt;= &amp;quot;41&amp;quot;} 1116 + [&amp;quot;llvm-devel&amp;quot;] {os-distribution = &amp;quot;fedora&amp;quot; &amp;amp; os-version = &amp;quot;40&amp;quot;} 1117 + [&amp;quot;llvm18-devel&amp;quot; &amp;quot;epel-release&amp;quot;] {os-distribution = &amp;quot;centos&amp;quot;} 1118 + [&amp;quot;devel/llvm18&amp;quot;] {os = &amp;quot;freebsd&amp;quot;} 1119 + ]&lt;/pre&gt; 1120 + &lt;p&gt;However, in Debian 12, they've already updated to &lt;code&gt;llvm-19&lt;/code&gt;, so the depext is not available.&lt;/p&gt; 1121 + &lt;h3 id=&quot;camlimages.5.0.5&quot;&gt;&lt;a href=&quot;#camlimages.5.0.5&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;camlimages.5.0.5&lt;/h3&gt; 1122 + &lt;p&gt;This one fails due to a linking error. 
Oddly enough it does seem to work in OHC.&lt;/p&gt; 1123 + &lt;pre&gt;(cd _build/default &amp;amp;&amp;amp; /home/opam/.opam/4.14/bin/ocamlmklib.opt -g -o freetype/camlimages_freetype_stubs freetype/ftintf.o -ldopt -lfreetype) 1124 + # /usr/bin/ld: freetype/ftintf.o: warning: relocation against `Caml_state' in read-only section `.text' 1125 + # /usr/bin/ld: freetype/ftintf.o: relocation R_X86_64_PC32 against undefined symbol `Caml_state' can not be used when making a shared object; recompile with -fPIC 1126 + # /usr/bin/ld: final link failed: bad value&lt;/pre&gt; 1127 + &lt;h3 id=&quot;ahrocksdb.0.2.2&quot;&gt;&lt;a href=&quot;#ahrocksdb.0.2.2&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;ahrocksdb.0.2.2&lt;/h3&gt; 1128 + &lt;p&gt;This one fails in OHC too, but it looks like it's a build failure with more recent gccs, fixed upstream: https://github.com/ahrefs/ocaml-ahrocksdb/commit/e52316b3d30fededac023141bf8b47da79cabfed&lt;/p&gt; 1129 + &lt;pre&gt;# run: gcc -O2 -fno-strict-aliasing -fwrapv -fPIC -pthread -I/usr/include/rocksdb -I /home/opam/.opam/5.3/lib/ocaml -o /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.exe /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.c -lm -lpthread -lrocksdb 1130 + # -&amp;gt; process exited with code 1 1131 + # -&amp;gt; stdout: 1132 + # -&amp;gt; stderr: 1133 + # | In file included from /tmp/build_02b340_dune/ocaml-configuratordc5e55/c-test-2/test.c:4: 1134 + # | /usr/include/rocksdb/version.h:7:10: fatal error: string: No such file or directory 1135 + # | 7 | #include &amp;lt;string&amp;gt; 1136 + # | | ^~~~~~~~ 1137 + # | compilation terminated. 1138 + # Error: discover error&lt;/pre&gt; 1139 + &lt;h3 id=&quot;alt-ergo.2.2.0&quot;&gt;&lt;a href=&quot;#alt-ergo.2.2.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;alt-ergo.2.2.0&lt;/h3&gt; 1140 + &lt;p&gt;Looks like it's trying to write outside the sandbox. 
The failure only occurs on alt-ergo 1.3.0 - 2.2.0.&lt;/p&gt; 1141 + &lt;pre&gt;# mkdir -p /home/opam/.opam/4.14/man/man1 1142 + # cp -f doc/alt-ergo.1 /home/opam/.opam/4.14/man/man1 1143 + # mkdir -p /usr/local/lib/alt-ergo/preludes 1144 + # mkdir: cannot create directory '/usr/local/lib/alt-ergo': Permission denied 1145 + # make: *** [Makefile.users:243: install-preludes] Error 1&lt;/pre&gt; 1146 + &lt;h3 id=&quot;ctypes-foreign.0.18.0&quot;&gt;&lt;a href=&quot;#ctypes-foreign.0.18.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;ctypes-foreign.0.18.0&lt;/h3&gt; 1147 + &lt;p&gt;This one is a much more interesting failure. The logs show:&lt;/p&gt; 1148 + &lt;pre&gt;[ERROR] No solution for ctypes-foreign.0.18.0: * Missing dependency: 1149 + - ctypes-foreign -&amp;gt; ctypes 1150 + unknown package&lt;/pre&gt; 1151 + &lt;p&gt;which is happening because of the optimisation I &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/ocaml-docs-ci-and-odoc-3&quot;&gt;mentioned before&lt;/span&gt; where we build a new &lt;code&gt;opam-repository&lt;/code&gt; with only the packages we're going to need. In this case, we've somehow missed out the &lt;code&gt;ctypes&lt;/code&gt; package. Looking at the opam file for &lt;code&gt;ctypes-foreign&lt;/code&gt;, it has a &lt;code&gt;post&lt;/code&gt; dependency on &lt;code&gt;ctypes&lt;/code&gt;. The &lt;code&gt;post&lt;/code&gt; keyword indicates that &lt;code&gt;ctypes&lt;/code&gt; should be installed with &lt;code&gt;ctypes-foreign&lt;/code&gt;, but that having it as a &amp;quot;normal&amp;quot; dependency would introduce a dependency cycle. 
Since we require a DAG of dependencies, we explicitly remove any &lt;code&gt;post&lt;/code&gt; dependencies from the set of packages to build, but it seems that &lt;code&gt;opam&lt;/code&gt; would like to know about them anyway!&lt;/p&gt; 1152 + &lt;h3 id=&quot;others&quot;&gt;&lt;a href=&quot;#others&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;others&lt;/h3&gt; 1153 + &lt;p&gt;There are many more. An automatic cross-check with OHC would be really useful, mainly to distinguish between the packages that are broken due to &lt;code&gt;ocaml-docs-ci&lt;/code&gt; issues (like &lt;code&gt;ctypes-foreign&lt;/code&gt;) and those that are broken for other reasons (like &lt;code&gt;ahrocksdb&lt;/code&gt;).&lt;/p&gt; 1154 + &lt;h2 id=&quot;step-3:-building-docs&quot;&gt;&lt;a href=&quot;#step-3:-building-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Step 3: building docs&lt;/h2&gt; 1155 + &lt;p&gt;Finally, we have the actual docs build. This is where we run &lt;code&gt;odoc&lt;/code&gt; and &lt;code&gt;odoc_driver&lt;/code&gt; to produce the HTML docs. All the errors here are ones that we should be able to fix!&lt;/p&gt; 1156 + &lt;p&gt;Firstly, there are the internal errors:&lt;/p&gt; 1157 + &lt;pre&gt;Uncaught exception: Failure(&amp;quot;\&amp;quot;rm\&amp;quot; \&amp;quot;-rf\&amp;quot; \&amp;quot;/var/cache/obuilder/merged/582e973685d380d4c91eadc2611eee02c82c5fe4f8bd732e0080fb22bc4404cd\&amp;quot; \&amp;quot;/var/cache/obuilder/work/582e973685d380d4c91eadc2611eee02c82c5fe4f8bd732e0080fb22bc4404cd\&amp;quot; failed with exit status 1&amp;quot;) 1158 + 2025-05-22 09:30.18: Job failed: Failed: Internal error&lt;/pre&gt; 1159 + &lt;p&gt;These come from an &lt;code&gt;obuilder&lt;/code&gt; error that needs fixing. Currently we're just rerunning the job to work around them.&lt;/p&gt; 1160 + &lt;h3 id=&quot;odoc.2.0.0&quot;&gt;&lt;a href=&quot;#odoc.2.0.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;odoc.2.0.0&lt;/h3&gt; 1161 + &lt;p&gt;Oops, we can't build our own docs!
At least it's an old version :-)&lt;/p&gt; 1162 + &lt;pre&gt;odoc: internal error, uncaught exception: 1163 + File &amp;quot;src/html/link.ml&amp;quot;, line 101, characters 16-22: Assertion failed 1164 + Raised at Odoc_html__Link.href in file &amp;quot;src/html/link.ml&amp;quot;, line 101, characters 16-57 1165 + Called from Odoc_html__Generator.internallink in file &amp;quot;src/html/generator.ml&amp;quot;, line 108, characters 19-49 1166 + ...&lt;/pre&gt; 1167 + &lt;p&gt;The failure points &lt;a href=&quot;https://github.com/ocaml/odoc/blob/42190737339d9be4510eeeb0e3c47e84badf4d73/src/html/link.ml#L101&quot;&gt;here&lt;/a&gt;, an assertion about the common ancestor of two paths. &lt;a href=&quot;https://github.com/ocaml/odoc/issues/1345&quot;&gt;Issue filed&lt;/a&gt;.&lt;/p&gt; 1168 + &lt;h3 id=&quot;ocaml-base-compiler.4.07.0&quot;&gt;&lt;a href=&quot;#ocaml-base-compiler.4.07.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;ocaml-base-compiler.4.07.0&lt;/h3&gt; 1169 + &lt;p&gt;This one happens because of our &amp;quot;optimisation&amp;quot; to use a base image with OCaml pre-installed. What we &lt;i&gt;actually&lt;/i&gt; do is find the major/minor version of OCaml and use the corresponding docker image - so in this case we'll use ocaml/opam:debian-12-ocaml-4.07. Now this image actually contains OCaml 4.07.1, and the format of &lt;code&gt;cmt&lt;/code&gt; and &lt;code&gt;cmti&lt;/code&gt; files changed between these releases, so we get a failure.&lt;/p&gt; 1170 + &lt;p&gt;We'll fix this by getting rid of the optimisation and building from an empty switch.&lt;/p&gt; 1171 + &lt;h3 id=&quot;lascar.0.7.0&quot;&gt;&lt;a href=&quot;#lascar.0.7.0&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;lascar.0.7.0&lt;/h3&gt; 1172 + &lt;p&gt;This one is quite interesting. 
It's another assertion failure in odoc:&lt;/p&gt; 1173 + &lt;pre&gt;odoc: internal error, uncaught exception: 1174 + File &amp;quot;src/xref2/cpath.ml&amp;quot;, line 364, characters 37-43: Assertion failed 1175 + Raised at Odoc_xref2__Cpath.unresolve_resolved_parent_path in file &amp;quot;src/xref2/cpath.ml&amp;quot;, line 364, characters 37-49 1176 + Called from Odoc_xref2__Cpath.unresolve_module_path in file &amp;quot;src/xref2/cpath.ml&amp;quot;, line 349, characters 28-60 1177 + Called from Odoc_xref2__Tools.fragmap.map_module_decl in file &amp;quot;src/xref2/tools.ml&amp;quot;, line 1792, characters 48-80&lt;/pre&gt; 1178 + &lt;p&gt;It's happening when we 'unresolve' a previously resolved path. We end up having to do this when something about the path has changed, in this case while we're handling a &lt;code&gt;S with module Foo = Bar&lt;/code&gt; or similar. Issue &lt;a href=&quot;https://github.com/ocaml/odoc/issues/1346&quot;&gt;filed&lt;/a&gt;.&lt;/p&gt; 1179 + &lt;h3 id=&quot;camlp5&quot;&gt;&lt;a href=&quot;#camlp5&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;camlp5&lt;/h3&gt; 1180 + &lt;p&gt;This one actually occurs in &lt;code&gt;odoc_driver&lt;/code&gt; rather than in &lt;code&gt;odoc&lt;/code&gt; itself.&lt;/p&gt; 1181 + &lt;pre&gt;odoc_driver_voodoo: [DEBUG] Found cmi_only_lib in dir: /home/opam/.opam/4.08/lib/camlp5 1182 + odoc_driver_voodoo: internal error, uncaught exception: 1183 + Invalid_argument(&amp;quot;\&amp;quot;/home/opam/.opam/4.08/lib/camlp5\&amp;quot;: invalid segment&amp;quot;) 1184 + &lt;/pre&gt; 1185 + &lt;p&gt;Here we're trying to add a segment to a path, but rather than a single path segment we've got an entire fully qualified path. 
Issue &lt;a href=&quot;https://github.com/ocaml/odoc/issues/1347&quot;&gt;filed&lt;/a&gt;.&lt;/p&gt; 1186 + &lt;h2 id=&quot;conclusion&quot;&gt;&lt;a href=&quot;#conclusion&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Conclusion&lt;/h2&gt; 1187 + &lt;p&gt;It's pretty good that we've only got 4 types of error happening at the doc-generation phase. However, as a whole, any error that occurs earlier in the pipeline ends up with a missing documentation tab on the website, and we need to do a bit more so that the actual problem can be tracked down and fixed. This is obviously a more general problem than just the docs, and one that &lt;a href=&quot;https://check.ci.ocaml.org&quot;&gt;opam health check&lt;/a&gt; seeks to highlight. However, the current incarnation of OHC is significantly less efficient than docs-ci, so generalising the approach we've taken with &lt;a href=&quot;https://github.com/jonludlam/opamh&quot;&gt;opamh&lt;/a&gt; should really help with making this more responsive.&lt;/p&gt; 1188 + &lt;p&gt;In addition, a number of the issues seen here could be addressed with a tool my colleague &lt;a href=&quot;https://ryan.freumh.org/&quot;&gt;Ryan&lt;/a&gt; is working on: &lt;a href=&quot;https://ryan.freumh.org/enki.html&quot;&gt;Enki&lt;/a&gt;. This tool would allow us to run a solve that actually determines not only the set of packages we wish to install, but the platform to install onto - e.g. 
for &lt;code&gt;eio_windows&lt;/code&gt; the solution would be to install on Windows, and for &lt;code&gt;llvm.18-static&lt;/code&gt; the solution might be Fedora 40.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/docs-progress.html</id><title type="text">Progress in OCaml docs</title><updated>2025-05-29T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I've been working on a whole lot of things recently in many different areas, making what's felt like only a bit of progress in each. Consequently I've not felt like I had anything substantial to say, s...</summary><published>2025-05-20T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/lots-of-things.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;lots-of-things-have-been-happening&quot;&gt;&lt;a href=&quot;#lots-of-things-have-been-happening&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Lots of things have been happening&lt;/h1&gt; 1189 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-20&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1190 + &lt;p&gt;I've been working on a whole lot of things recently in many different areas, making what's felt like only a bit of progress in each. Consequently I've not felt like I had anything substantial to say, so I haven't written up anything for a while.&lt;/p&gt; 1191 + &lt;p&gt;Time for a little summary of things then!&lt;/p&gt; 1192 + &lt;h2 id=&quot;ocaml-docs-ci&quot;&gt;&lt;a href=&quot;#ocaml-docs-ci&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Ocaml-docs-ci&lt;/h2&gt; 1193 + &lt;p&gt;I've been working with &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark Elvers&lt;/a&gt; on getting the docs CI running using Odoc 3.0. 
There are quite a few changes involved, both in how we're &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/ocaml-docs-ci-and-odoc-3&quot;&gt;building the packages&lt;/span&gt; and in how we're running odoc - it's building using &lt;code&gt;odoc_driver&lt;/code&gt; rather than &lt;code&gt;voodoo&lt;/code&gt; now, and while it's looking promising now, we hit a few hurdles along the way.&lt;/p&gt; 1194 + &lt;p&gt;We set the CI going last weekend but discovered that it was having some issues building packages using OCaml 5.3.0. The way the builds work is that we first do a &amp;quot;solve&amp;quot; step for each version of every package so we've got exact versions of all of the packages required to build them. We then look through that solution to figure out the version of OCaml required, and then (to avoid a little bit of work) we start from one of the &lt;a href=&quot;https://hub.docker.com/r/ocaml/opam&quot;&gt;opam docker images&lt;/a&gt; for that version of OCaml.&lt;/p&gt; 1195 + &lt;p&gt;When installing a package using opam it does a few operations that scale with the size of the opam repository, which ends up adding tens of seconds to the build time. When we're building 50,000 packages, this adds up to quite a lot of time, so we short-cut this process with the simple expedient of creating an opam-repository that only contains the packages we need for the build. However, since we've already got a few packages installed in the image, we need to make sure our repository contains these packages too, otherwise opam gets thoroughly confused. My mistake was that we were missing out the `ocaml-compiler` package, which is new in OCaml 5.3.0, which led to the builds failing. After adding this in and kicking off the build again, it's now got a lot further - at time of writing it has built 14,000 packages, there are 6,000 still building, and 1,000 that have failed. 
If it continues in a similar fashion, this will compare quite favourably with the docs CI that's currently powering ocaml.org, where it has successfully built 17,000 packages, and 4,500 have failed.&lt;/p&gt; 1196 + &lt;p&gt;Mark has been working on a different approach to the build process, which is to come up with a new binary that doesn't do any of the &lt;code&gt;O(n)&lt;/code&gt; operations and just builds the package! This is definitely a promising direction, and I'm hoping to take a look at &lt;a href=&quot;https://github.com/mtelvers/ohc&quot;&gt;his prototype&lt;/a&gt; soon.&lt;/p&gt; 1197 + &lt;p&gt;Meanwhile, &lt;a href=&quot;https://choum.net&quot;&gt;panglesd&lt;/a&gt; is working on integrating this into the ocaml.org site, and is making good progress. He spotted last week that we were overwriting the `status.json` file that comes out of `odoc_driver` which we will use to power the redirections on ocaml.org. One of the changes of odoc 3.0 is that we carefully put modules into a directory structure that represents the library in which they are found. It's long been a pain that OCaml libraries (what Ocamlfind unhelpfully calls 'packages') are not always the same name as the opam package in which they're found. For example, the package &lt;code&gt;ocamlfind&lt;/code&gt; contains the library &lt;code&gt;findlib&lt;/code&gt;. So to help the user figure out where to find the module, we're putting it into the URL of the docs, and therefore into the breadcrumbs. The downside is that the modules are now in a different place on the website to where they were before, so the &lt;code&gt;status.json&lt;/code&gt; file is there to help with the redirections.&lt;/p&gt; 1198 + &lt;h2 id=&quot;notebooks&quot;&gt;&lt;a href=&quot;#notebooks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Notebooks&lt;/h2&gt; 1199 + &lt;p&gt;I've been working on Merlin integration with the notebooks, which has been a fun little project. 
The bits that needed improving most were that merlin didn't work with toplevel-style code, and that each cell was a separate typing context, so while you could define a function in one cell and execute it in another, Merlin would tell you the function was undefined.&lt;/p&gt; 1200 + &lt;p&gt;For the toplevel-style code, what I've ended up doing is to essentially strip out all of the toplevel bits and pieces, and replace them with whitespace. So where I have a cell that looks like:&lt;/p&gt; 1201 + &lt;pre&gt;# let x = 1 + 2;; 1202 + - val x : int = 3 1203 + # let y = x + 1;; 1204 + - val y : int = 4&lt;/pre&gt; 1205 + &lt;p&gt;I tell Merlin that the contents are:&lt;/p&gt; 1206 + &lt;pre&gt; let x = 1 + 2;; 1207 + 1208 + let y = x + 1;; 1209 + &lt;/pre&gt; 1210 + &lt;p&gt;where I'm careful to maintain the position of the original code. This bit is working quite nicely, but only when the code is syntactically correct, as I'm using the standard toplevel parser to figure out where the expression ends. I think I'm going to end up needing to write a custom parser for this, so something that will end on a &lt;code&gt;;;&lt;/code&gt; but ignore them in string constants, comments and so on.&lt;/p&gt; 1211 + &lt;p&gt;The approach I've taken for the second problem is to treat each cell as a separate module. I then write out a &lt;code&gt;cmi&lt;/code&gt; file into the virtual filesystem as &lt;code&gt;cell__id_0.cmi&lt;/code&gt; and &lt;code&gt;open&lt;/code&gt; all the previous modules in 'line 0' of every cell. I then remap all of the reported locations by removing 'line 0'.&lt;/p&gt; 1212 + &lt;p&gt;There are a number of issues with the current approaches: 1. The stripping of the toplevel bits is a little fragile, and currently only works when the toplevel is syntactically correct. This is fairly fixable. 2. When the contents of the cells change we need to flush any caches merlin and the compiler have. 3. 
An &lt;code&gt;open&lt;/code&gt; statement in one cell does _not_ cause the module to be available in the next cell. 4. A lot of cells leads to a lot of opens!&lt;/p&gt; 1213 + &lt;p&gt;I suspect that the 'cells as modules' approach might end up being a bit of a dead-end, so I'll have a chat with &lt;a href=&quot;https://github.com/voodoos&quot;&gt;Ulysse&lt;/a&gt; to figure out the next experiment.&lt;/p&gt; 1214 + &lt;h2 id=&quot;oxcaml&quot;&gt;&lt;a href=&quot;#oxcaml&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Oxcaml&lt;/h2&gt; 1215 + &lt;p&gt;I've been working on trying out oxcaml too, which has been a bit challenging. Firstly, although Jane Street provide a version of &lt;code&gt;js_of_ocaml&lt;/code&gt;, the toplevel didn't work. Fortunately, my amazing colleagues &lt;a href=&quot;https://patrick.sirref.org/&quot;&gt;Patrick O'Ferris&lt;/a&gt; and &lt;a href=&quot;https://github.com/art-w&quot;&gt;Arthur Wendling&lt;/a&gt; spent a good chunk of time fixing this and provided an &lt;a href=&quot;https://github.com/patricoferris/opam-repository-js#with-extensions&quot;&gt;opam repository&lt;/a&gt; with the relevant changes, without which I would not have been able to make any progress. Thanks, guys! So my goal of making my notebooks work with it looked doable, but I almost immediately hit more dependency issues that make it problematic to port the whole site over - including odoc and various PPXes that I use.&lt;/p&gt; 1216 + &lt;p&gt;I've therefore decided that I would bring forward a feature that I'd had in mind for a while - that we could have different &amp;quot;backends&amp;quot; for the notebooks. So I'd still build the frontend using &amp;quot;normal&amp;quot; OCaml, but the web-worker serving as the toplevel would be an oxcaml one.&lt;/p&gt; 1217 + &lt;p&gt;Of course, it didn't work first time! 
After a bit of head-scratching, it turned out that the interface between the worker and the main thread, although I'd &lt;i&gt;almost&lt;/i&gt; got it ocaml-agnostic, wasn't quite right. The way it works is that it uses the jsonrpc protocol to communicate, and while it had marshalled the requests into a string, it hadn't turned that final OCaml string into a Javascript string, so it was sending the js_of_ocaml representation of a string as an object, rather than a simple string. When the frontend and workers were built with different compilers, this ended up just failing with an obscure error, which took a good deal of time to track down. Once that was fixed, it was just a case of making sure I could have 2 independent 'switches' on my site - one for oxcaml and one for standard OCaml.&lt;/p&gt; 1218 + &lt;p&gt;The upshot of all this is that I now have a semi-working version of the notebooks using oxcaml. As an initial demonstration I ported one of my colleague &lt;a href=&quot;https://github.com/cuihtlauac&quot;&gt;Cuihtlauac&lt;/a&gt;'s oxcaml tutorial docs to the notebook format, and it &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/notebooks/oxcaml/local&quot;&gt;works quite nicely&lt;/span&gt;.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/lots-of-things.html</id><title type="text">Lots of things have been happening</title><updated>2025-05-20T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">My colleague and I have been working on a little project to see how well small AI models can solve the OCaml exercises we give to our first-year students at the University of Cambridge. 
Sadiq has don...</summary><published>2025-05-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/ticks-solved-by-ai.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;solving-first-year-ocaml-exercises-with-ai&quot;&gt;&lt;a href=&quot;#solving-first-year-ocaml-exercises-with-ai&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Solving First-year OCaml exercises with AI&lt;/h1&gt; 1219 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1220 + &lt;p&gt;My colleague &lt;a href=&quot;https://toao.com&quot;&gt;Sadiq Jaffer&lt;/a&gt; and I have been working on a little project to see how well small AI models can solve the OCaml exercises we give to our first-year students at the University of Cambridge. Sadiq has done an excellent &lt;a href=&quot;https://toao.com/blog/ocaml-local-code-models&quot;&gt;write up&lt;/a&gt; of our initial results, which you should all go and read! The tl;dr though, as Sadiq writes, is that even some of the smaller models would score top marks on these exercises!&lt;/p&gt; 1221 + &lt;p&gt;One interesting aspect we discovered quite quickly is that we had to make the testing feedback a little more generous than just &amp;quot;exception raised&amp;quot;! The problems are presented as a Jupyter notebook using &lt;a href=&quot;https://github.com/akabe&quot;&gt;akabe's&lt;/a&gt; excellent OCaml kernel, with &lt;a href=&quot;https://nbgrader.readthedocs.io/en/stable/&quot;&gt;nbgrader&lt;/a&gt; to do the assessment. Our students can see the tests that are run, and if they fail they're able to copy the test cell out and play with their code to figure out exactly what went wrong. The AI models, however, have a far less interactive experience, and get just 3 chances to write code that passes the tests. 
We found that the performance of the models increased hugely when we adjusted the test cells such that they clearly indicated which test failed, the results that were expected, and the results the code actually produced.&lt;/p&gt; 1222 + &lt;p&gt;Of course, we &lt;a href=&quot;https://anil.recoil.org/notes/claude-copilot-sandbox&quot;&gt;already knew&lt;/a&gt; that AI models can code OCaml very well, and we (along with the rest of the teaching world) are still ruminating on the implications of this from a pedagogical perspective. Our plan, though, is to try and make the 'problem' worse by training these models on more OCaml code, and see just how well we can get them to perform! It's pretty amazing, and a little startling to know that a model that'll run pretty comfortably on my laptop can solve these problems so well even without extra training, though given how hot it gets, I'd rather not have the laptop on my actual lap while it's doing so!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/ticks-solved-by-ai.html</id><title type="text">Solving First-year OCaml exercises with AI</title><updated>2025-05-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I joined the OxCaml weekly meeting representing Tarides for the first time this week, as Jane Street gear up to an official release of their OxCaml compiler.</summary><published>2025-05-02T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/oxcaml-gets-closer.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;oxcaml-is-getting-closer...&quot;&gt;&lt;a href=&quot;#oxcaml-is-getting-closer...&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OxCaml is getting closer...&lt;/h1&gt; 1223 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-02&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1224 + &lt;p&gt;I joined the 
OxCaml weekly meeting representing Tarides for the first time this week, as Jane Street gear up to an official release of their OxCaml compiler.&lt;/p&gt; 1225 + &lt;p&gt;It seems that mainly what needs to be done before the release can be made is to ensure there is some reasonable documentation for the new features, and that a reasonable number of packages are working, so people are furiously writing and bugfixing to try and get this ready.&lt;/p&gt; 1226 + &lt;p&gt;As well as this though, there are some challenges of a more organisational level that will need to be addressed to ensure the success of the project. Jane Street have long had a public branch of their compiler, but while they've had patches internally to ensure the tooling and other libraries work, these patches haven't previously been made public in a usable way. In order for OxCaml to be useful, it will clearly need these patches not only to be available, but also to be maintained and to easily allow contributions from the community -- in short, they need to be properly Open Source!&lt;/p&gt; 1227 + &lt;p&gt;Personally, I'm looking forward to seeing their branch of &lt;a href=&quot;https://ocaml.github.io/odoc/&quot;&gt;odoc&lt;/a&gt; and having a look to see how the modes will fit into the documentation. I'm also keen to see whether the &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/this-site&quot;&gt;notebook features&lt;/span&gt; I've been working on can be ported over to run on OxCaml!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/oxcaml-gets-closer.html</id><title type="text">OxCaml is getting closer...</title><updated>2025-05-02T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Today was the &quot;AI for Climate &amp; Nature Community Day&quot; at the David Attenborough Building. 
A whole bunch of the EEG were either presenting or contributing in some way so I thought I'd come along to see what's going on.</summary><published>2025-05-01T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;ai-for-climate-&amp;amp;-nature-community-day&quot;&gt;&lt;a href=&quot;#ai-for-climate-&amp;amp;-nature-community-day&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;AI for Climate &amp;amp; Nature Community Day&lt;/h1&gt; 1228 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-05-01&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1229 + &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;Melissa Leach&lt;/span&gt;&lt;/div&gt; 1230 + &lt;p&gt;&lt;i&gt;Melissa Leach introducing the day&lt;/i&gt; Today was the &amp;quot;AI for Climate &amp;amp; Nature Community Day&amp;quot; at the &lt;a href=&quot;https://map.cam.ac.uk/?maplon=0.12032&amp;amp;maplat=52.20354&amp;amp;mapzoom=18&amp;amp;maplayers=Building+Labels%2CExternal+Sites%2CColleges%2CUniversity+Sites%2CBuildings%2CTransport&amp;amp;mapfeature=mfid257%2CBuildings&quot;&gt;David Attenborough Building&lt;/a&gt;. 
A whole bunch of the EEG were either presenting or contributing in some way so I thought I'd come along to see what's going on.&lt;/p&gt; 1231 + &lt;h2 id=&quot;keynote-and-main-talks&quot;&gt;&lt;a href=&quot;#keynote-and-main-talks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Keynote and main talks&lt;/h2&gt; 1232 + &lt;p&gt;Following the intro talks from Professors &lt;a href=&quot;https://www.cambridgeconservation.org/about/people/prof-melissa-leach/&quot;&gt;Melissa Leach&lt;/a&gt; and &lt;a href=&quot;https://www.zoo.cam.ac.uk/directory/bill-sutherland&quot;&gt;Bill Sutherland&lt;/a&gt;, the day started with the keynote talk from &lt;a href=&quot;https://www.biology.ox.ac.uk/people/amy-hinsley&quot;&gt;Amy Hinsley&lt;/a&gt;, who, using the specific case of animal trafficking, talked about the need to make AI in conservation equitable, explainable and useful.&lt;/p&gt; 1233 + &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;Amy Hinsley&lt;/span&gt;&lt;/div&gt; 1234 + &lt;p&gt;&lt;i&gt;Amy Hinsley delivering the keynote talk&lt;/i&gt;&lt;/p&gt; 1235 + &lt;p&gt;We then moved into the first session with &lt;a href=&quot;https://www.geog.cam.ac.uk/people/lines/&quot;&gt;Emily Lines&lt;/a&gt; from the Geography Department who talked about the challenges processing sensor data in the context of forests. Her group has a variety of data from forests across Europe, collected using many different methods, from drones taking pictures of the canopies to ground-based laser scanners producing 3d point clouds. 
The challenge is then not only to identify individual trees, which is pretty tricky, but also to then distinguish between the leaves of the trees and the wood.&lt;/p&gt; 1236 + &lt;p&gt;After Emily came &lt;a href=&quot;https://ai.cam.ac.uk/people/robert-rouse.html&quot;&gt;Robert Rouse&lt;/a&gt; from the &lt;a href=&quot;https://iccs.cam.ac.uk&quot;&gt;ICCS&lt;/a&gt;, who's using a small neural net and genetic algorithms to extend a study from the RSPB on figuring out an optimal way to do some land use adjustments to cut carbon and improve outcomes for birds, whilst not significantly impacting the ability to produce food.&lt;/p&gt; 1237 + &lt;p&gt;We then had &lt;a href=&quot;https://www.zoo.cam.ac.uk/directory/dr-sam-reynolds&quot;&gt;Sam Reynolds&lt;/a&gt; and &lt;a href=&quot;https://toao.com&quot;&gt;Sadiq Jaffer&lt;/a&gt; who talked about their project; using AI to sift through millions of papers looking for those relevant to a specified conservation topic. They're able to directly compare their results with results obtained by manually doing this process, a project that's been going on over the last 20 or so years summing to something like 75 man years of effort. 
In the end they only missed a few papers that the manual process had found, but actually found many relevant papers that had been missed - and all in only a few days of compute.&lt;/p&gt; 1238 + &lt;div&gt;&lt;span class=&quot;xref-unresolved&quot;&gt;Sam Reynolds and Sadiq Jaffer&lt;/span&gt;&lt;/div&gt; 1239 + &lt;p&gt;&lt;i&gt;Sam Reynolds and Sadiq Jaffer sorting millions of papers&lt;/i&gt;&lt;/p&gt; 1240 + &lt;h2 id=&quot;lightning-talks&quot;&gt;&lt;a href=&quot;#lightning-talks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Lightning talks&lt;/h2&gt; 1241 + &lt;p&gt;We then had a number of 'lightning talks', with each presenter having only three minutes to talk about their work.&lt;/p&gt; 1242 + &lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://www.maths.cam.ac.uk/person/ss3299&quot;&gt;Sebastian Schemm&lt;/a&gt; presented his work on creating a foundational model for the climate&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.eng.cam.ac.uk/profiles/ac685&quot;&gt;Alice Cicirello&lt;/a&gt; talked about the prospects of applying machine learning to &lt;a href=&quot;https://en.wikipedia.org/wiki/Marine_cloud_brightening&quot;&gt;Marine Cloud Brightening&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.maths.cam.ac.uk/person/sdat2&quot;&gt;Simon Thomas&lt;/a&gt; has been looking at analysing the heights of tropical cyclone storm surges&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://github.com/niccolozanotti&quot;&gt;Niccolò Zanotti&lt;/a&gt; gave us an introduction to &lt;a href=&quot;https://github.com/cambridge-ICCS/FTorch&quot;&gt;FTorch&lt;/a&gt;, a library to integrate the worlds of PyTorch and Fortran&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.nceo.ac.uk/contact-us/people/dr-simon-driscoll/&quot;&gt;Simon Driscoll&lt;/a&gt; then talked about melt ponds on arctic sea ice, a poorly understood but important component of the climate in the Arctic.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.zoo.cam.ac.uk/directory/emilio-luz-ricca&quot;&gt;Emilio 
Luz-Ricca&lt;/a&gt; talked about his project to apply machine learning to predict hunting pressure&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://orlando-code.github.io&quot;&gt;Orlando Timmerman&lt;/a&gt; gave us some insights into how he's been using machine learning to predict the future of coral reefs, and how we might use this to help with their conservation.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.zoo.cam.ac.uk/directory/ruari-marshall-hawkes&quot;&gt;Ruari Marshall-Hawkes&lt;/a&gt; showed us how to listen very carefully to figure out population numbers,&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/harriet-branson-a93a8313b/&quot;&gt;Hattie Branson&lt;/a&gt; from &lt;a href=&quot;https://www.fauna-flora.org&quot;&gt;Fauna &amp;amp; Flora&lt;/a&gt; talked about habitat detection in South Sudan,&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/martakoch/&quot;&gt;Marta Koch&lt;/a&gt; showed us an analysis of how well ChatGPT, Claude and the like would perform at setting the agendas for SDPs,&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/zhengpeng-feng-2410a132a/&quot;&gt;Frank Feng&lt;/a&gt; finished the session with a talk on the &lt;a href=&quot;https://www.cst.cam.ac.uk/seminars/list/227335&quot;&gt;Barlow Twins Earth Foundation Model&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt; 1243 + 1244 + &lt;div style=&quot;display: grid; grid-template-columns: 1fr 1fr; gap: 20px;&quot;&gt; 1245 + &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1246 + &lt;img src=&quot;sebastian.jpg&quot; alt=&quot;Sebastian Schemm&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1247 + &lt;figcaption&gt;Sebastian Schemm&lt;/figcaption&gt; 1248 + &lt;/figure&gt; 1249 + &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1250 + &lt;img src=&quot;alice.jpg&quot; alt=&quot;Alice Cicirello&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1251 + &lt;figcaption&gt;Alice Cicirello&lt;/figcaption&gt; 1252 + 
&lt;/figure&gt; 1253 + &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1254 + &lt;img src=&quot;simon.jpg&quot; alt=&quot;Simon Thomas&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1255 + &lt;figcaption&gt;Simon Thomas&lt;/figcaption&gt; 1256 + &lt;/figure&gt; 1257 + &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1258 + &lt;img src=&quot;simond.jpg&quot; alt=&quot;Simon Driscoll&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1259 + &lt;figcaption&gt;Simon Driscoll&lt;/figcaption&gt; 1260 + &lt;/figure&gt; 1261 + &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1262 + &lt;img src=&quot;emilio.jpg&quot; alt=&quot;Emilio Luz-Ricca&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1263 + &lt;figcaption&gt;Emilio Luz-Ricca&lt;/figcaption&gt; 1264 + &lt;/figure&gt; 1265 + &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1266 + &lt;img src=&quot;orlando.jpg&quot; alt=&quot;Orlando Timmerman&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1267 + &lt;figcaption&gt;Orlando Timmerman&lt;/figcaption&gt; 1268 + &lt;/figure&gt; 1269 + &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1270 + &lt;img src=&quot;ruari.jpg&quot; alt=&quot;Ruari Marshall-Hawkes&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1271 + &lt;figcaption&gt;Ruari Marshall-Hawkes&lt;/figcaption&gt; 1272 + &lt;/figure&gt; 1273 + &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1274 + &lt;img src=&quot;hattie.jpg&quot; alt=&quot;Hattie Branson&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1275 + &lt;figcaption&gt;Hattie Branson&lt;/figcaption&gt; 1276 + &lt;/figure&gt; 1277 + &lt;figure style=&quot;margin:0; width: 100%;&quot;&gt; 1278 + &lt;img src=&quot;frank.jpg&quot; alt=&quot;Frank Feng&quot; style=&quot;max-width: 100%; height: auto;&quot;&gt; 1279 + &lt;figcaption&gt;Frank Feng&lt;/figcaption&gt; 1280 + &lt;/figure&gt; 1281 + 1282 + &lt;/div&gt; 1283 + 1284 + &lt;h2 
id=&quot;discussions&quot;&gt;&lt;a href=&quot;#discussions&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Discussions&lt;/h2&gt; 1285 + &lt;p&gt;We then split up into three discussion groups; one on the future of this work, one on how to continue building this community of researchers, and the last on applying AI to real-world problems. As a newcomer to the field I was interested in the direction it's heading in, so I joined the session led by &lt;a href=&quot;https://dorchard.github.io&quot;&gt;Dominic Orchard&lt;/a&gt; on the future of AI.&lt;/p&gt; 1286 + &lt;p&gt;We had a fascinating discussion on both the immediate things we can do and longer term worries. We were imagining a world where AI becomes 'just a tool' that we don't need to be experts in to apply, but right now we're in a much more tightly coupled collaborative world where we need experts in AI to complement the experts in the application field to make progress. This comes with challenges - applying for funding for multidisciplinary work is not the norm, so we spent some time discussing this too.&lt;/p&gt; 1287 + &lt;p&gt;One of our group spoke about statistics now being 'just a tool', but it's been one that we've worked with for a long time now and we know where the sharp corners are. 
We have protocols for applying statistical tools and we have diagnostic plots to tell us whether the results are trustworthy, but not only do we not have these for AI models, it's not yet clear whether such a thing will even be possible.&lt;/p&gt; 1288 + &lt;p&gt;Overall it was a fascinating day, and I'm very much looking forward to following the work of these outstanding researchers, and maybe even contributing to their work in some way in the future.&lt;/p&gt; 1289 + &lt;p&gt;&lt;i&gt;Thanks to &lt;a href=&quot;https://anil.recoil.org&quot;&gt;Anil Madhavapeddy&lt;/a&gt; for the photos of the day.&lt;/i&gt;&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html</id><title type="text">AI for Climate &amp; Nature Community Day</title><updated>2025-05-01T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">The release of Odoc 3 means that we need to update the docs-ci project so that the documentation that appears on ocaml.org is using the latest, greatest Odoc. With this major release of Odoc, it's also time to give t...</summary><published>2025-04-29T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;ocaml-docs-ci-and-odoc-3&quot;&gt;&lt;a href=&quot;#ocaml-docs-ci-and-odoc-3&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;OCaml-Docs-CI and Odoc 3&lt;/h1&gt; 1290 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-29&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1291 + &lt;p&gt;The release of Odoc 3 means that we need to update the &lt;em&gt;docs-ci&lt;/em&gt; project so that the documentation that appears on &lt;em&gt;ocaml.org&lt;/em&gt; is using the latest, greatest Odoc. 
With this major release of Odoc, it's also time to give the CI pipeline a bit of an overhaul too, and fix some of the irritations that it causes.&lt;/p&gt; 1292 + &lt;h2 id=&quot;the-challenge-of-documenting-ocaml&quot;&gt;&lt;a href=&quot;#the-challenge-of-documenting-ocaml&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The challenge of documenting OCaml&lt;/h2&gt; 1293 + &lt;p&gt;As I wrote about &lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/blog/2025/04/semantic-versioning-is-hard&quot;&gt;recently&lt;/span&gt;, the APIs of OCaml libraries are dependent not only on the version of its package, but possibly also on the versions of any of its dependencies. Due to this fact, to produce the docs for ocaml.org means that sometimes we need to build the docs for a particular version of a particular package multiple times with different versions of its dependencies.&lt;/p&gt; 1294 + &lt;p&gt;It's clearly impractical to try to build every possible combination, so what we do is to run the opam solver once for each version of each package. This gives us a set of packages at particular versions. We then take that, and for each package in the set, we pluck out &lt;i&gt;its&lt;/i&gt; dependencies from the set, producing a &amp;quot;universe&amp;quot; of dependencies for every package in the set. 
Let's look at a very simple example: the package &lt;code&gt;cry&lt;/code&gt; from the &lt;a href=&quot;https://www.liquidsoap.info&quot;&gt;LiquidSoap&lt;/a&gt; project.&lt;/p&gt; 1295 + &lt;p&gt;The oldest version of &lt;code&gt;cry&lt;/code&gt; from before the &lt;a href=&quot;https://discuss.ocaml.org/t/opam-repository-archival-phase-1-unavailable-packages/15797/6&quot;&gt;Great Purge&lt;/a&gt; was 0.2.2, which when solved produced the following dependencies:&lt;/p&gt; 1296 + &lt;pre&gt;cry.0.2.2 1297 + ocaml.4.05.0 1298 + ocaml-base-compiler.4.05.0 1299 + ocaml-config.1 1300 + ocamlfind.1.9.6&lt;/pre&gt; 1301 + &lt;p&gt;and the oldest version of &lt;code&gt;cry&lt;/code&gt; after the purge is 0.6.0 which produces the following solution:&lt;/p&gt; 1302 + &lt;pre&gt;cry.0.6.0 1303 + ocaml.5.2.1 1304 + ocaml-base-compiler.5.2.1 1305 + ocaml-config.3 1306 + ocamlfind.1.9.6&lt;/pre&gt; 1307 + &lt;p&gt;so we can see from these two solutions that we'll need to build &lt;code&gt;ocamlfind.1.9.6&lt;/code&gt; twice, once with &lt;code&gt;ocaml.4.05.0&lt;/code&gt; and once with &lt;code&gt;ocaml.5.2.1&lt;/code&gt;.&lt;/p&gt; 1308 + &lt;p&gt;Once we've got, for every version of every package, a set of dependency universes, we choose one of these to be the one presented to the user under the &lt;code&gt;ocaml.org/p/&lt;/code&gt; hierarchy. For all of the other universes, we build the package against them, and put the docs under the &lt;code&gt;ocaml.org/u/&lt;/code&gt; hierarchy.&lt;/p&gt; 1309 + &lt;h2 id=&quot;performing-the-builds&quot;&gt;&lt;a href=&quot;#performing-the-builds&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Performing the builds&lt;/h2&gt; 1310 + &lt;p&gt;Once we've got a complete set of solutions and builds to do, the current CI pipeline batches the builds up to try and build as many packages as possible in as few builds as possible. 
While this works well enough, it does mean that we build a lot of packages more than once - dune, for example, is built thousands of times during this process, producing exactly the same binaries each time.&lt;/p&gt; 1311 + &lt;p&gt;In the new pipeline, I wrote a &lt;a href=&quot;https://github.com/jonludlam/opamh&quot;&gt;little tool&lt;/a&gt; that allows opam packages to be archived and restored, which happens to work nicely because we're always building the packages in the same container in the same location. This means there are no worries about relocatability, although that is something that is &lt;a href=&quot;https://www.dra27.uk/blog/platform/2025/04/22/branching-out.html&quot;&gt;nearly here!&lt;/a&gt;&lt;/p&gt; 1312 + &lt;p&gt;The downside to this is that our storage requirements are quite a bit larger, as we're storing the entire package rather than just the bits that odoc needs. However, we were always going to use more storage than before simply because the new &lt;code&gt;odoc&lt;/code&gt; and &lt;code&gt;odoc_driver&lt;/code&gt; pair are more capable, and the new features like &lt;a href=&quot;https://github.com/ocaml/odoc/pull/909&quot;&gt;source code rendering&lt;/a&gt; and &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1121/files#diff-10c8829023814c0bcc3316f95f643623404c000b13c68849ef3d61097a6e03a6R1-R415&quot;&gt;classify&lt;/a&gt; require more files from the original package.&lt;/p&gt; 1313 + &lt;p&gt;The upshot is that I'll be working with &lt;a href=&quot;https://www.tunbury.org/&quot;&gt;Mark Elvers&lt;/a&gt; to move the docs CI from its current machine to a shiny new &lt;a href=&quot;https://www.tunbury.org/blade-reallocation/&quot;&gt;blade server&lt;/a&gt;.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html</id><title type="text">OCaml-Docs-CI and Odoc 3</title><updated>2025-04-29T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary 
type="text">Odoc 3 was released last month and although we did write a list of the new features, I don't think we've made it clear enough why anyone should care.</summary><published>2025-04-25T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/odoc-3.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;odoc-3:-so-what?&quot;&gt;&lt;a href=&quot;#odoc-3:-so-what?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc 3: So what?&lt;/h1&gt; 1314 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-25&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1315 + &lt;p&gt;Odoc 3 was &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-0-released/16339&quot;&gt;released last month&lt;/a&gt; and although we did write a list of the new features, I don't think we've made it clear enough why anyone should care.&lt;/p&gt; 1316 + &lt;p&gt;It's &lt;b&gt;manuals&lt;/b&gt;, the theme of Odoc 3 is &lt;b&gt;manuals&lt;/b&gt;. It's got a load of features to make it much better for writing &lt;code&gt;mld&lt;/code&gt; pages (files written using odoc's markup) to document your packages and their relationship to the surrounding ecosystem. Previous versions of Odoc were very library-centric, in that while we did have mld-file support, most of the effort went into making sure that we were generating correct per-module pages, which show the shape of your API even if you've not put in any doc comments at all. 
We've still got that, obviously, but we've added many features to make writing &lt;code&gt;mld&lt;/code&gt; pages far more useful, and we're really hoping that these will draw people in to make documenting packages a much more enjoyable experience.&lt;/p&gt; 1317 + &lt;h2 id=&quot;odoc's-special-skill:-links!&quot;&gt;&lt;a href=&quot;#odoc's-special-skill:-links!&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Odoc's special skill: links!&lt;/h2&gt; 1318 + &lt;p&gt;But why might you want to use Odoc at all for your package's manuals, rather than, say, markdown, asciidoc, rst or any other similar language? The biggest thing that Odoc brings, and has always brought, is &lt;b&gt;reliable linking&lt;/b&gt;. Just write &lt;code&gt;{!Module.func}&lt;/code&gt; and Odoc will check that the target exists and ensure that the link goes to the correct place, no matter how complex the definition of &lt;code&gt;Module&lt;/code&gt; is or what the layout of the docs is. We can link to almost all elements of an OCaml library, from modules and types through to fields of records, exceptions and extensions, and we have facilities for disambiguating, so if you happen to have both a module &lt;code&gt;S&lt;/code&gt; and a module type &lt;code&gt;S&lt;/code&gt; you can easily link to whichever you please.&lt;/p&gt; 1319 + &lt;p&gt;In Odoc 2 though, these links were pretty limited - the only ones possible were those to docs and API elements (modules, types, values, etc) in your own package, or to API elements in any libraries that your package depends on. When writing API docs, which tend to be at the level of types and functions, this wasn't a huge problem, but when considering manuals this turned out to be a really limiting constraint. 
For example, in Odoc's own docs, we really want to have a link to &lt;code&gt;odoc-driver&lt;/code&gt;, but since &lt;code&gt;odoc-driver&lt;/code&gt; is a separate package and depends upon &lt;code&gt;odoc&lt;/code&gt;, the only way to do that in Odoc 2.x would be to use an HTML link. With Odoc 3, this constraint is gone, so you can &lt;b&gt;link to any other package or library&lt;/b&gt;. The link to &lt;code&gt;odoc-driver&lt;/code&gt; would look like &lt;code&gt;{!/odoc-driver/page-index}&lt;/code&gt;, as can be seen in &lt;a href=&quot;https://github.com/ocaml/odoc/blob/master/doc/driver.mld#L10&quot;&gt;odoc's source&lt;/a&gt;. The only requirement is that you must be able to simultaneously install all of the packages you'd like to link to, so you can't easily link to, for example, different versions of the same package.&lt;/p&gt; 1320 + &lt;p&gt;This will be particularly useful for any project that's grouped into multiple packages. Take the &lt;a href=&quot;https://mirage.io&quot;&gt;Mirage project&lt;/a&gt;, for example. The main package there -- &lt;code&gt;mirage&lt;/code&gt; -- is actually right at the bottom of the dependency hierarchy, but it's the perfect place to have docs that link to all of the other Mirage packages. 
On a smaller scale, the &lt;a href=&quot;https://github.com/ocaml-multicore/picos&quot;&gt;Picos project&lt;/a&gt; consists of multiple packages all from a single git repository, and this would allow the docs pages from the &lt;code&gt;picos&lt;/code&gt; package to link to any of the other packages.&lt;/p&gt; 1321 + &lt;p&gt;Of course there are also a lot of other new features in this release, which are called out in the &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043&quot;&gt;announcement post on discuss&lt;/a&gt;, some of which I may post about in the future.&lt;/p&gt; 1322 + &lt;h2 id=&quot;can-i-use-it-now?&quot;&gt;&lt;a href=&quot;#can-i-use-it-now?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Can I use it now?&lt;/h2&gt; 1323 + &lt;p&gt;Of course! These new features can be used right now, so long as you're happy to self-host the docs. All that's needed is to create a switch containing all the packages you're interested in together, and use &lt;code&gt;odoc_driver&lt;/code&gt; to generate the HTML and push it to your web server. At time of writing though, ocaml.org is still using Odoc 2.4, so any packages published to opam that use these new features will be missing them in their ocaml.org docs. Furthermore, it's actually quite a challenge to bring them to ocaml.org, since we'll have to extend the package-universe solutions to include all relevant packages, for which we need extra fields in the opam metadata.&lt;/p&gt; 1324 + &lt;h2 id=&quot;what's-next?&quot;&gt;&lt;a href=&quot;#what's-next?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What's next?&lt;/h2&gt; 1325 + &lt;p&gt;We're actively working on getting Odoc 3 into the pipeline generating the docs found in https://ocaml.org/p/. This will bring with it some of the developments that landed in Odoc 2, but didn't make it onto ocaml.org - for example, the rendering of source pages. 
Not only are there challenges related to the package-universe solutions as mentioned above, but the storage requirements are considerably larger, so I'll be working with &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark Elvers&lt;/a&gt; to complete this project.&lt;/p&gt; 1326 + &lt;p&gt;We've also got work to do to update the build rules in dune to take advantage of these features. While &lt;code&gt;odoc_driver&lt;/code&gt; works very well as part of the process of deploying a docs site, it's quite impractical as a tool to help while you're actually writing the docs. For that, we'll need to make sure &lt;code&gt;dune&lt;/code&gt; understands how to use these new features. Fortunately we've had some experience with those rules in the past, and part of the work that's gone into Odoc 3 was to ensure that incremental build rules should be far more straightforward to write than for Odoc 2. In addition, some of the logic that previously only existed in &lt;a href=&quot;https://github.com/ocaml-doc/voodoo&quot;&gt;Voodoo&lt;/a&gt; - the old driver that builds docs for ocaml.org - has been integrated into &lt;code&gt;odoc&lt;/code&gt; itself, meaning once again that getting dune to produce correct docs for non-dune packages (e.g. the standard library!) should be simpler.&lt;/p&gt; 1327 + &lt;p&gt;After we've done these, there are plans afoot to make more improvements to the manual writing experience. &lt;a href=&quot;https://choum.net/&quot;&gt;@panglesd&lt;/a&gt; has been investigating how to add admonitions to the language, I've been thinking about custom tag support, and we're looking at the &lt;a href=&quot;https://discuss.ocaml.org/t/ann-oxidizing-ocaml-an-update/15237&quot;&gt;modes&lt;/a&gt; work coming from Jane Street to see how to support that. 
There's plenty more to do, so if you'd like to lend a hand, reach out and join in!&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/odoc-3.html</id><title type="text">Odoc 3: So what?</title><updated>2025-04-25T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Semantic versioning is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. So, is it possible to make our OCaml libraries stick to this scheme? There are some projec...</summary><published>2025-04-20T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;semantic-versioning-in-ocaml-is-hard&quot;&gt;&lt;a href=&quot;#semantic-versioning-in-ocaml-is-hard&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Semantic Versioning in OCaml is Hard&lt;/h1&gt; 1328 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-20&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1329 + &lt;p&gt;&lt;a href=&quot;https://semver.org&quot;&gt;Semantic versioning&lt;/a&gt; is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. So, is it possible to make our OCaml libraries stick to this scheme? There are some projects that are trying to do this, including a recent &lt;a href=&quot;https://www.outreachy.org&quot;&gt;Outreachy&lt;/a&gt; project by &lt;a href=&quot;https://github.com/azzsal/&quot;&gt;Abdulaziz Alkurd&lt;/a&gt; mentored by &lt;a href=&quot;https://choum.net&quot;&gt;panglesd&lt;/a&gt; and &lt;a href=&quot;https://github.com/nathanreb&quot;&gt;Nathan Reb&lt;/a&gt;. 
While this is a great start, there are some subtleties of the OCaml module system that make it a good deal more complex than in other languages.&lt;/p&gt; 1330 + &lt;h2 id=&quot;opam-format.2.3.0-≠-opam-format.2.3.0?&quot;&gt;&lt;a href=&quot;#opam-format.2.3.0-≠-opam-format.2.3.0?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;opam-format.2.3.0 ≠ opam-format.2.3.0?&lt;/h2&gt; 1331 + &lt;p&gt;Let's take the case that hit me this morning. I've been working on &lt;a href=&quot;https://github.com/ocurrent/ocaml-docs-ci&quot;&gt;ocaml-docs-ci&lt;/a&gt; in order to bring the exciting new &lt;a href=&quot;https://ocaml.github.io/odoc&quot;&gt;odoc 3&lt;/a&gt; features to &lt;a href=&quot;https://ocaml.org/&quot;&gt;ocaml.org&lt;/a&gt; for everyone to enjoy. I have it checked out and building locally, but to deploy it to the infrastructure managed by &lt;a href=&quot;https://tunbury.org/&quot;&gt;Mark Elvers&lt;/a&gt; it needs to be packaged up into a Docker image. So I issued the usual &lt;code&gt;docker build .&lt;/code&gt; and after it churned through the setup stages and got on to building the project, it hit an error:&lt;/p&gt; 1332 + &lt;pre&gt;File &amp;quot;src/solver/solver.ml&amp;quot;, line 58, characters 75-98: 1333 + let deps = List.map (fun pkg -&amp;gt; OpamPackage.Map.find pkg simple_deps) (OpamPackage.Set.to_list pkgs) in 1334 + Error: Unbound value OpamPackage.Set.to_list 1335 + Hint: Did you mean of_list?&lt;/pre&gt; 1336 + &lt;p&gt;Now &lt;code&gt;OpamPackage&lt;/code&gt; is a module in the &lt;code&gt;opam-format&lt;/code&gt; library, which is easily discovered using the excellent &lt;a href=&quot;https://doc.sherlocode.com/?q=OpamPackage&quot;&gt;Sherlodoc&lt;/a&gt; tool, so I checked what version I had locally, and what version I had in the Docker container, and it turned out I was using exactly the same version -- 2.3.0 -- both locally and in the container. 
So what's going on?&lt;/p&gt; 1337 + &lt;p&gt;The problem is that the Dockerfile I was using was using OCaml version 4.14, whereas locally I was using OCaml 5.3. &amp;quot;But how on earth can this cause the API of &lt;code&gt;opam-format&lt;/code&gt; to change?&amp;quot; I hear you wail! Well, this is actually one of the simpler outcomes of the way the OCaml module system works. Let's look at &lt;a href=&quot;https://github.com/ocaml/opam/blob/2.3.0/src/format/opamPackage.mli&quot;&gt;the code&lt;/a&gt;.&lt;/p&gt; 1338 + &lt;p&gt;The first thing we note is the absence of any definition of &lt;code&gt;Set&lt;/code&gt; or &lt;code&gt;Map&lt;/code&gt; here&lt;/p&gt; 1339 + &lt;ul&gt;&lt;li&gt;where do they come from? It turns out they come from &lt;a href=&quot;https://github.com/ocaml/opam/blob/2.3.0/src/format/opamPackage.mli#L49&quot;&gt;this line here&lt;/a&gt;:&lt;/li&gt;&lt;/ul&gt; 1340 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;include OpamStd.ABSTRACT with type t := t&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1341 + &lt;p&gt;So let's take a look over in &lt;code&gt;opamStd.mli&lt;/code&gt; to see what that signature looks like:&lt;/p&gt; 1342 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;(** A signature for handling abstract keys and collections thereof *) 1343 + module type ABSTRACT = sig 1344 + 1345 + type t 1346 + 1347 + val compare: t -&amp;gt; t -&amp;gt; int 1348 + val equal: t -&amp;gt; t -&amp;gt; bool 1349 + val of_string: string -&amp;gt; t 1350 + val to_string: t -&amp;gt; string 1351 + val to_json: t OpamJson.encoder 1352 + val of_json: t OpamJson.decoder 1353 + 1354 + module Set: SET with type elt = t 1355 + module Map: MAP with type key = t 1356 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1357 + &lt;p&gt;OK, so we've found the definitions of &lt;code&gt;Set&lt;/code&gt; and &lt;code&gt;Map&lt;/code&gt; - they refer to signatures &lt;code&gt;SET&lt;/code&gt; and &lt;code&gt;MAP&lt;/code&gt; which are 
defined just above in &lt;a href=&quot;https://github.com/ocaml/opam/blob/2.3.0/src/core/opamStd.mli#L17-L98&quot;&gt;opamStd.mli&lt;/a&gt;. Let's just look at &lt;code&gt;Set&lt;/code&gt; since that's where the problem was:&lt;/p&gt; 1358 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type SET = sig 1359 + 1360 + include Set.S 1361 + 1362 + val map: (elt -&amp;gt; elt) -&amp;gt; t -&amp;gt; t 1363 + 1364 + val is_singleton: t -&amp;gt; bool 1365 + 1366 + (** Returns one element, assuming the set is a singleton. Raises [Not_found] 1367 + on an empty set, [Failure] on a non-singleton. *) 1368 + val choose_one : t -&amp;gt; elt 1369 + 1370 + val choose_opt: t -&amp;gt; elt option 1371 + 1372 + val of_list: elt list -&amp;gt; t 1373 + val to_list_map: (elt -&amp;gt; 'b) -&amp;gt; t -&amp;gt; 'b list 1374 + val to_string: t -&amp;gt; string 1375 + val to_json: t OpamJson.encoder 1376 + val of_json: t OpamJson.decoder 1377 + val find: (elt -&amp;gt; bool) -&amp;gt; t -&amp;gt; elt 1378 + val find_opt: (elt -&amp;gt; bool) -&amp;gt; t -&amp;gt; elt option 1379 + 1380 + (** Raises Failure in case the element is already present *) 1381 + val safe_add: elt -&amp;gt; t -&amp;gt; t 1382 + 1383 + (** Accumulates the resulting sets of a function of elements until a fixpoint 1384 + is reached *) 1385 + val fixpoint: (elt -&amp;gt; t) -&amp;gt; t -&amp;gt; t 1386 + 1387 + (** [map_reduce f op t] applies [f] to every element of [t] and combines the 1388 + results using associative operator [op]. Raises [Invalid_argument] on an 1389 + empty set, or returns [default] if it is defined. 
*) 1390 + val map_reduce: ?default:'a -&amp;gt; (elt -&amp;gt; 'a) -&amp;gt; ('a -&amp;gt; 'a -&amp;gt; 'a) -&amp;gt; t -&amp;gt; 'a 1391 + 1392 + module Op : sig 1393 + val (++): t -&amp;gt; t -&amp;gt; t (** Infix set union *) 1394 + 1395 + val (--): t -&amp;gt; t -&amp;gt; t (** Infix set difference *) 1396 + 1397 + val (%%): t -&amp;gt; t -&amp;gt; t (** Infix set intersection *) 1398 + end 1399 + 1400 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1401 + &lt;p&gt;Sure enough, there's no &lt;code&gt;to_list&lt;/code&gt; function defined in there. Once again though, there's an &lt;code&gt;include Set.S&lt;/code&gt; in there. It turns out that that refers to the &lt;code&gt;Set&lt;/code&gt; module in the OCaml standard library. We can again &lt;a href=&quot;https://github.com/ocaml/ocaml/blob/5.3.0/stdlib/set.mli&quot;&gt;look at the source&lt;/a&gt;:&lt;/p&gt; 1402 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;val to_list : t -&amp;gt; elt list 1403 + (** [to_list s] is {!elements}[ s]. 1404 + @since 5.1 *)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1405 + &lt;p&gt;And there it is. The &lt;code&gt;to_list&lt;/code&gt; function has only been in the &lt;code&gt;Set&lt;/code&gt; module since version 5.1.&lt;/p&gt; 1406 + &lt;h2 id=&quot;using-ocaml.org-docs&quot;&gt;&lt;a href=&quot;#using-ocaml.org-docs&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Using ocaml.org docs&lt;/h2&gt; 1407 + &lt;p&gt;It was pretty difficult to figure that out from the source, but happily there's a better way. We can browse the docs on https://ocaml.org/ - We can look at the docs for the &lt;a href=&quot;https://ocaml.org/p/opam-format/2.3.0/doc/OpamPackage/Set/index.html&quot;&gt;OpamPackage.Set module&lt;/a&gt; which, as of today, does not contain any &lt;code&gt;to_list&lt;/code&gt; function. 
The &lt;code&gt;include Set.S&lt;/code&gt; is there with the expansion showing all of the types and values coming from it, so we can click on the &lt;code&gt;Set.S&lt;/code&gt; link on the include line which takes us to the documentation for the stdlib from OCaml 4.11.2. Changing the version from the dropdown at the top to something more recent takes us to a page containing the &lt;code&gt;to_list&lt;/code&gt; function with the helpful &lt;code&gt;since 5.1&lt;/code&gt; annotation.&lt;/p&gt; 1408 + &lt;p&gt;This is, in fact, a relatively simple example of the sorts of issues that can occur that make semantic versioning a headache. In this example, it was a change due to a difference in the compiler version used, but there's nothing particularly special about that - a package may expose signatures derived from any of its dependencies! So is there anything we can do about this? Obviously, yes!&lt;/p&gt; 1409 + &lt;h2 id=&quot;towards-a-solution&quot;&gt;&lt;a href=&quot;#towards-a-solution&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Towards a solution&lt;/h2&gt; 1410 + &lt;p&gt;Step 1 of any approach to solving this is to be able to identify which bits of a library's API come from which packages, and therefore which versions of those packages. It turns out there may well be a nice way to piggy-back on a recent feature from Odoc, which was originally intended to help with suppressing spurious warnings.&lt;/p&gt; 1411 + &lt;p&gt;The problem we were tackling was that if your library ends up exporting a module whose signature is defined in someone else's package, then any warnings that come from it are unfixable. To fix this we added a tag to each signature of a module that indicates which package it originally came from. Odoc is then very careful to keep track of this as it performs its signature manipulations, resulting in an accurate way to know which signature elements came from which package. 
This fixed the problem of the spurious warnings nicely.&lt;/p&gt; 1412 + &lt;p&gt;Quite separately, we've got the docs CI that is attempting to build docs for every version of every package. Obviously given the above, in order to exhaustively show all the possible APIs of every library, we should build all possible combinations of every version of every package. Clearly we can't possibly do this, so the docs CI focuses on the goal of building at least one solution for every version of every package.&lt;/p&gt; 1413 + &lt;p&gt;Now if we combine these two ideas, we can use the builds of the packages, together with the tracking of which package each signature originated from, to precisely track the differences in API between different versions of a package. This would allow us to build a database of those changes, and with this in hand we can look at what APIs are used in any other package and be able to suggest upper and lower bounds on the versions of its dependencies.&lt;/p&gt; 1414 + &lt;p&gt;Now wouldn't that be cool?&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html</id><title type="text">Semantic Versioning in OCaml is Hard</title><updated>2025-04-20T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">It's tremendously exciting to be back in the Computer Laboratory, as the last time I worked here was just before the pandemic. 
I'm now a member of the Energy and Environment Group whose goal is &quot;to have a measurable impact on tools and techniques ...</summary><published>2025-04-08T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/meeting-the-team.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;meeting-the-team&quot;&gt;&lt;a href=&quot;#meeting-the-team&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Meeting the Team&lt;/h1&gt; 1415 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-08&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1416 + &lt;p&gt;It's tremendously exciting to be back in the &lt;a href=&quot;https://www.cst.cam.ac.uk/&quot;&gt;Computer Laboratory&lt;/a&gt;, as the last time I worked here was just before the pandemic. I'm now a member of the &lt;a href=&quot;https://www.cst.cam.ac.uk/research/eeg&quot;&gt;Energy and Environment Group&lt;/a&gt; whose goal is &amp;quot;to have a measurable impact on tools and techniques for de-risking the future&amp;quot;.&lt;/p&gt; 1417 + &lt;h2 id=&quot;what's-going-on?&quot;&gt;&lt;a href=&quot;#what's-going-on?&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;What's going on?&lt;/h2&gt; 1418 + &lt;p&gt;With such a broad goal, it's hard to know where to start and how I'll fit in, so my first few weeks have been spent getting to know the other members of the group and what they're up to. It's an incredibly inspiring group of individuals who are all doing amazing work, and it's really humbling and daunting to be a part of it.&lt;/p&gt; 1419 + &lt;p&gt;There's some really interesting work going on in our group on LLMs, principally led by the fantastic &lt;a href=&quot;https://toao.com/&quot;&gt;Sadiq Jaffer&lt;/a&gt;. We had a chat a few weeks ago and have started to explore some ideas around seeing how well LLMs can program in OCaml already before we start to do some RL training on them. 
Having not done any LLM stuff before, it's a steep learning curve for me, but we're already seeing some interesting results. We should have some more to say about this in the coming weeks.&lt;/p&gt; 1420 + &lt;p&gt;Last week I met with &lt;a href=&quot;https://digitalflapjack.com/&quot;&gt;Michael Dales&lt;/a&gt;, and he talked about the project &lt;a href=&quot;https://github.com/quantifyearth/shark&quot;&gt;shark&lt;/a&gt; that he and &lt;a href=&quot;patrick.sirref.org&quot;&gt;Patrick Ferris&lt;/a&gt; have been working on. It's kind of a mix between a shell and a jupyter-style notebook, with a strong focus on reproducibility. The traditional pain of notebooks is, of course, the execution model, whereby cells might be executed in any order you like. This means that the state you find the notebook in might not be even reachable again, let alone consistently reproducible. Shark is trying to address this by using file-system snapshots and clever analysis of the inputs and outputs of each cell to both ensure reproducibility, but also to allow a fast editing cycle, rerunning of only the bits that need to be rerun, even in the presence of slow data processing steps. It's a fascinating project, and I can't wait to see it in action when Michael gives us a demo!&lt;/p&gt; 1421 + &lt;p&gt;I also met up with &lt;a href=&quot;https://ryan.freumh.org&quot;&gt;Ryan Gibb&lt;/a&gt; with &lt;a href=&quot;https://www.dra27.uk/blog/&quot;&gt;David Allsopp&lt;/a&gt; and we had a good chat about his project &lt;a href=&quot;https://github.com/RyanGibb/babel&quot;&gt;Babel&lt;/a&gt;, which is using the &lt;a href=&quot;https://nex3.medium.com/pubgrub-2fb6470504f&quot;&gt;PubGrub&lt;/a&gt; algorithm to do package resolution for multiple package domains at once. 
We've got a number of avenues to explore here, from building a PubGrub implementation in OxCaml, to using Babel to construct Docker images for opam packages entirely from scratch, without using a base image.&lt;/p&gt; 1422 + &lt;p&gt;With my other hat on as a member of the CTO office at &lt;a href=&quot;https://tarides.com/&quot;&gt;Tarides&lt;/a&gt;, I'm very much looking forward to using OCaml and OxCaml to solve some real-world problems that are in an entirely different domain than I've been used to over the last few years.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/meeting-the-team.html</id><title type="text">Meeting the Team</title><updated>2025-04-08T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">I've spent a lot of time over the past few years working on Odoc, the OCaml documentation generator, so when it came time to (re)start my own website and blog, I found it hard to resist thinking about ho...</summary><published>2025-04-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/04/this-site.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;this-site&quot;&gt;&lt;a href=&quot;#this-site&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;This site&lt;/h1&gt; 1423 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;x-ocaml.requires&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;x-ocaml.requires&lt;/span&gt; &lt;p&gt;mime_printer&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1424 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-04-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1425 + &lt;p&gt;I've spent a &lt;em&gt;lot&lt;/em&gt; of time over the past few years working on Odoc, the OCaml documentation generator, so when it came time to (re)start my own website and blog, I found it hard to resist thinking about how I might use odoc as part of it. 
We've spent a lot of time recently trying to make odoc more able to generate structured documentation sites, so I've gone all in and am trialling using it as a tool to generate my entire site. This is a bit of an experiment, and I don't know how well it will work out, but let's see how it goes.&lt;/p&gt; 1426 + &lt;p&gt;Additionally, I've recently been working on a project currently called &lt;code&gt;odoc_notebook&lt;/code&gt;, which is a set of tools to allow odoc &lt;code&gt;mld&lt;/code&gt; files to be used as a sort of Jupyter-style notebook. The idea is that you can write both text and code in the same file, and then run the code in the notebook interactively. Since I've only got a webserver, all the execution of code has to be done client side, so I'm making extensive use of the phenomenal &lt;a href=&quot;https://github.com/ocsigen/js_of_ocaml&quot;&gt;Js_of_ocaml&lt;/a&gt; project to get an OCaml engine running in the browser.&lt;/p&gt; 1427 + &lt;p&gt;My focus has initially been on getting 'toplevel-style' code execution working. As an example, let's write a little demo.&lt;/p&gt; 1428 + &lt;h2 id=&quot;demo&quot;&gt;&lt;a href=&quot;#demo&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Demo&lt;/h2&gt; 1429 + &lt;p&gt;Let's start with a little demo:&lt;/p&gt; 1430 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;let x = 1 + 2&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1431 + &lt;p&gt;It's intended to look like an OCaml toplevel session, so each new expression starts with a &lt;code&gt;#&lt;/code&gt; and is terminated with a double semicolon. The response from the toplevel is then below that indented with 2 spaces. Right now, there's not much in the way of error checking so you can make it all very confused by deleting the hash, removing the &lt;code&gt;;;&lt;/code&gt; and so on. Avoiding this, however, you can edit the numbers here and hit 'run' (maybe twice!) 
to see the results being updated.&lt;/p&gt; 1432 + &lt;p&gt;There is also a little integration to allow the code to produce output more interesting than just text. The following cell creates an SVG image and 'pushes' it to &lt;code&gt;Mime_printer&lt;/code&gt;, which receives the mime value and renders it in the browser below the code block.&lt;/p&gt; 1433 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;let svg = [ 1434 + {|&amp;lt;svg height=&amp;quot;210&amp;quot; width=&amp;quot;500&amp;quot; xmlns=&amp;quot;http://www.w3.org/2000/svg&amp;quot;&amp;gt;|}; 1435 + {|&amp;lt;polygon points=&amp;quot;100,10 40,198 190,78 10,78 160,198&amp;quot; |}; 1436 + {|style=&amp;quot;fill:lime;stroke:purple;stroke-width:5;&amp;quot;/&amp;gt;&amp;lt;/svg&amp;gt;|}];; 1437 + 1438 + Mime_printer.push &amp;quot;image/svg&amp;quot; (String.concat &amp;quot;\n&amp;quot; svg)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1439 + &lt;h2 id=&quot;things-to-come&quot;&gt;&lt;a href=&quot;#things-to-come&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Things to come&lt;/h2&gt; 1440 + &lt;h3 id=&quot;merlin-support&quot;&gt;&lt;a href=&quot;#merlin-support&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Merlin support&lt;/h3&gt; 1441 + &lt;p&gt;There are a bunch of things I want to add to this, for example, Merlin support. In fact, &lt;a href=&quot;https://github.com/voodoos/merlin-js&quot;&gt;merlin-js&lt;/a&gt; already exists and works, thanks to the fantastic work of &lt;a href=&quot;https://github.com/voodoos&quot;&gt;Ulysse&lt;/a&gt;, but the problem is that it's not really designed for toplevel work, and it doesn't work when the code is broken up into chunks like I do here. So either I need to concatenate all the cells together before I give it to Merlin, or I need to make each cell its own little module and 'open' every previous cell's module.&lt;/p&gt; 1442 + &lt;p&gt;Within a single cell, it does already work. 
You can see that Merlin is correctly underlining the error in the following cell. You should also be able to hover over the variables and see their types.&lt;/p&gt; 1443 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;type t = { foo : int; bar : string };; 1444 + 1445 + let x = { foo = 1; bar = &amp;quot;hello&amp;quot; };; 1446 + 1447 + let this_line_has_an_error = { foo = 1; bar = None };;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1448 + &lt;p&gt;But across cells, I've broken Merlin, though the code executes correctly. You can see the problem in the following cell, which re-pushes the SVG image using the variable &lt;code&gt;svg&lt;/code&gt; defined in the cell above. Merlin highlights the use of the variable &lt;code&gt;svg&lt;/code&gt; as an error, because it's not aware of the variable, but the code gets executed correctly and the image is rendered below the cell.&lt;/p&gt; 1449 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;Mime_printer.push &amp;quot;image/svg&amp;quot; (String.concat &amp;quot;\n&amp;quot; svg)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1450 + &lt;p&gt;Edit 2025-05-20: I have now got merlin working across cells, though I'm not convinced the current solution is the right long-term solution.&lt;/p&gt; 1451 + &lt;h3 id=&quot;dynamic-libraries&quot;&gt;&lt;a href=&quot;#dynamic-libraries&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Dynamic libraries&lt;/h3&gt; 1452 + &lt;p&gt;Currently the use of libraries is quite limited - they are defined more-or-less statically. I've had dynamic libraries working in the past, but I need to re-implement them. The plan is to have the &lt;code&gt;cma&lt;/code&gt; files converted to &lt;code&gt;js&lt;/code&gt; files and then load them on-demand when the notebook specifies them. The tricky thing here is that we need to be able to use them both in the browser and in bytecode executables so that the 'test-promote' workflow still works. 
This will probably require specifying the libraries by name, and having to re-implement the work that &lt;a href=&quot;https://projects.camlcity.org/projects/findlib.html&quot;&gt;findlib&lt;/a&gt; does to find the libraries and load them and their dependencies in the right order, though this time entirely over HTTP.&lt;/p&gt; 1453 + &lt;h3 id=&quot;other-things&quot;&gt;&lt;a href=&quot;#other-things&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Other things&lt;/h3&gt; 1454 + &lt;p&gt;There are loads of other things I'm interested in doing, including:&lt;/p&gt; 1455 + &lt;ul&gt;&lt;li&gt;Investigating how to do 'exercises' to allow readers to try things out in a guided way&lt;/li&gt;&lt;li&gt;'Test cells' to see if implementations are correct&lt;/li&gt;&lt;li&gt;Persistence of the notebook state - both using local and remote storage&lt;/li&gt;&lt;li&gt;Integration of docs&lt;/li&gt;&lt;li&gt;Exploration of the execution model - how to run the code in the right order and ensure reproducibility&lt;/li&gt;&lt;li&gt;Use of remote execution engines rather than just in the browser&lt;/li&gt;&lt;li&gt;Other languages?&lt;/li&gt;&lt;/ul&gt; 1456 + &lt;p&gt;Right now though, my focus is on the functionality required for this blog, with a secondary goal of looking at how we might use this sort of technology on the docs site on ocaml.org. Wouldn't it be cool to be able to drop into a live OCaml toplevel for any library in opam?&lt;/p&gt; 1457 + &lt;h2 id=&quot;example-notebooks&quot;&gt;&lt;a href=&quot;#example-notebooks&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Example notebooks&lt;/h2&gt; 1458 + &lt;p&gt;As a more extended example of odoc notebooks, I have converted to this format the course that I help teach at the University of Cambridge; &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/&quot;&gt;Foundations of Computer Science&lt;/a&gt;. 
&lt;span class=&quot;xref-unresolved&quot; title=&quot;/jon-site/notebooks/foundations/index&quot;&gt;Try them out for yourself!&lt;/span&gt;&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/04/this-site.html</id><title type="text">This site</title><updated>2025-04-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">There are many new and improved features that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an overhaul of "module type of" that landed in May 2024.</summary><published>2025-03-08T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/03/module-type-of.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;the-road-to-odoc-3:-module-type-of&quot;&gt;&lt;a href=&quot;#the-road-to-odoc-3:-module-type-of&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The Road to Odoc 3: Module Type Of&lt;/h1&gt; 1459 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-03-08&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1460 + &lt;p&gt;There are &lt;a href=&quot;https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043&quot;&gt;many new and improved features&lt;/a&gt; that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1081&quot;&gt;overhaul of &amp;quot;module type of&amp;quot;&lt;/a&gt; that landed in May 2024.&lt;/p&gt; 1461 + &lt;h2 id=&quot;module-type-of&quot;&gt;&lt;a href=&quot;#module-type-of&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Module Type Of&lt;/h2&gt; 1462 + &lt;p&gt;module type of is a language feature of OCaml allowing one to recover the signature of an existing module. 
For example, if I had a module &lt;code&gt;X&lt;/code&gt;:&lt;/p&gt; 1463 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module X = struct 1464 + type t = Foo | Bar 1465 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1466 + &lt;p&gt;then I can get back the signature of &lt;code&gt;X&lt;/code&gt; using &lt;code&gt;module type of&lt;/code&gt;:&lt;/p&gt; 1467 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type Xsig = module type of X&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1468 + &lt;p&gt;which can be very useful if you’re trying to &lt;a href=&quot;https://discuss.ocaml.org/t/extend-existing-module/1389&quot;&gt;extend existing modules&lt;/a&gt; amongst other things.&lt;/p&gt; 1469 + &lt;p&gt;OCaml and Odoc treat module type of in somewhat different ways. OCaml internally expands the expression immediately it sees it, and effectively replaces it with the signature - ie, in the above example Xsig is now a signature, not a module type of expression.&lt;/p&gt; 1470 + &lt;p&gt;In contrast, Odoc would like to keep track of the fact that this signature came from a &lt;code&gt;module type of&lt;/code&gt; expression, as it’s very useful to know. If you’re extending a module, your signature might look like:&lt;/p&gt; 1471 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type UnitExtended = sig 1472 + include module type of Unit 1473 + val extra_unit_function : unit -&amp;gt; unit 1474 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1475 + &lt;p&gt;The documentation we produce will expand the contents of the &lt;code&gt;include&lt;/code&gt; statement, but keep track of the fact that it came from a &lt;code&gt;module type of&lt;/code&gt; expression so the reader can see where these signature items came from. 
In practice, you'd probably want to use &lt;code&gt;module type of struct include Unit end&lt;/code&gt;, which is a bit different from simply &lt;code&gt;module type of Unit&lt;/code&gt;, and I'll talk about this at some point in a future post.&lt;/p&gt; 1476 + &lt;h2 id=&quot;the-problem&quot;&gt;&lt;a href=&quot;#the-problem&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The problem&lt;/h2&gt; 1477 + &lt;p&gt;We run into difficulties as soon as we introduce another language feature that operates on signatures: with. Let’s start with a module type &lt;code&gt;S&lt;/code&gt;:&lt;/p&gt; 1478 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type S = sig 1479 + module X : sig 1480 + type t = int 1481 + end 1482 + 1483 + module type Y = 1484 + module type of X 1485 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1486 + &lt;p&gt;We’ll now define a new module &lt;code&gt;X2&lt;/code&gt; that we intend to use as a replacement for &lt;code&gt;X&lt;/code&gt;:&lt;/p&gt; 1487 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module X2 = struct 1488 + type t = int 1489 + type u = float 1490 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1491 + &lt;p&gt;Now we’ll define a new module type &lt;code&gt;T&lt;/code&gt; which is &lt;code&gt;S&lt;/code&gt; but with &lt;code&gt;X&lt;/code&gt; replaced:&lt;/p&gt; 1492 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type T = S with module X := X2&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1493 + &lt;p&gt;Here you can see that OCaml has expanded the &lt;code&gt;module type of&lt;/code&gt; expressions and told us the computed signature. The interesting thing here is that in module type &lt;code&gt;T&lt;/code&gt;, module type &lt;code&gt;Y&lt;/code&gt; only has a type &lt;code&gt;t&lt;/code&gt; in it, not a type &lt;code&gt;u&lt;/code&gt;. 
As above, Odoc wants to keep the &lt;code&gt;module type of&lt;/code&gt; expression so the reader can tell where module type &lt;code&gt;Y&lt;/code&gt; came from. However, the substitution would do a different thing in this case - we would have the following:&lt;/p&gt; 1494 + &lt;div&gt;&lt;pre class=&quot;language-ocaml&quot;&gt;&lt;code&gt;module type T = sig 1495 + module type Y = module type of X2 1496 + end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; 1497 + &lt;p&gt;and the expansion of this would then clearly have both types &lt;code&gt;t&lt;/code&gt; and &lt;code&gt;u&lt;/code&gt; in it.&lt;/p&gt; 1498 + &lt;p&gt;So now Odoc has two problems: We need to compute the correct signature, and we need to be able to describe how we computed it.&lt;/p&gt; 1499 + &lt;h2 id=&quot;the-solution&quot;&gt;&lt;a href=&quot;#the-solution&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;The solution&lt;/h2&gt; 1500 + &lt;p&gt;The previous solution to this was to have a ‘phase 0’ of odoc which would compute the expansions of all module type of expressions before doing any other work. This was necessary because of a ‘simplifying’ assumption in how we handled the typing environment. The new, simpler approach was to calculate the expansion during the normal flow of work, and never to attempt to recalculate it, but simply operate on the signature. This was a nice big simplification and optimisation that removed a few corner cases in the previous code (including an &lt;a href=&quot;https://github.com/ocaml/odoc/blob/v2.4/src/xref2/type_of.ml#L167-L174&quot;&gt;infinite loop&lt;/a&gt; that we &lt;em&gt;hoped&lt;/em&gt; always terminated…!)&lt;/p&gt; 1501 + &lt;p&gt;The second issue was how to describe it. We still want it clear that this signature was derived from another, but it’s clear we can’t honestly say in the above example that it’s &lt;code&gt;module type of X2&lt;/code&gt;. The answer is that we have applied a transparent ascription to the signature. 
Essentially, the signature is &lt;code&gt;X2&lt;/code&gt; but constrained to only have the fields of &lt;code&gt;X&lt;/code&gt;.&lt;/p&gt; 1502 + &lt;p&gt;This is not a current feature of OCaml, though Jane Street has &lt;a href=&quot;https://blog.janestreet.com/plans-for-ocaml-408/&quot;&gt;done some work&lt;/a&gt; on this, including declaring the syntax: &lt;code&gt;X2 &amp;lt;: X&lt;/code&gt;. However, there’s another interesting wrinkle here. &lt;code&gt;X&lt;/code&gt; is a module defined in the module type &lt;code&gt;S&lt;/code&gt;, so it’s not possible to write a valid OCaml path that points to it – &lt;code&gt;S.X&lt;/code&gt; has no meaning. In addition, the right-hand side of the &lt;code&gt;&amp;lt;:&lt;/code&gt; operator should be a module type, so we’d actually need to write &lt;code&gt;X2 &amp;lt;: module type of S.X&lt;/code&gt;. We’re still figuring out the right thing to do here, so for now Odoc 3 will still pretend that it’s simply &lt;code&gt;module type of X2&lt;/code&gt;.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/03/module-type-of.html</id><title type="text">The Road to Odoc 3: Module Type Of</title><updated>2025-03-08T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry><entry><summary type="text">Back in 2021 julow introduced some new syntax to odoc’s code blocks to allow us to attach arbitrary metadata to the blocks. 
We imposed no structure on this; it was simply a block of text in between the language ta...</summary><published>2025-03-07T00:00:00-00:00</published><link href="https://jon.recoil.org/blog/2025/03/code-block-metadata.html" rel="alternate"/><content type="html">&lt;h1 id=&quot;code-block-metadata&quot;&gt;&lt;a href=&quot;#code-block-metadata&quot; class=&quot;anchor&quot;&gt;&lt;/a&gt;Code block metadata&lt;/h1&gt; 1503 + &lt;ul class=&quot;at-tags&quot;&gt;&lt;li class=&quot;published&quot;&gt;&lt;span class=&quot;at-tag&quot;&gt;published&lt;/span&gt; &lt;p&gt;2025-03-07&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt; 1504 + &lt;p&gt;Back in 2021 &lt;a href=&quot;https://github.com/julow&quot;&gt;julow&lt;/a&gt; introduced some &lt;a href=&quot;https://github.com/ocaml-doc/odoc-parser/pull/2&quot;&gt;new syntax&lt;/a&gt; to odoc’s code blocks to allow us to attach arbitrary metadata to the blocks. We imposed no structure on this; it was simply a block of text in between the language tag and the start of the code block. Now odoc needs to use it itself, we need to be a bit more precise about how it’s defined.&lt;/p&gt; 1505 + &lt;p&gt;The original concept looked like this:&lt;/p&gt; 1506 + &lt;pre&gt;{@ocaml metadata goes here in an unstructured way[ 1507 + ... code ... 1508 + ]}&lt;/pre&gt; 1509 + &lt;p&gt;where everything in between the language (“ocaml” in this case) and the opening square bracket would be captured and put into the AST verbatim. 
Odoc itself has had no particular use for this, but it has been used in &lt;a href=&quot;https://github.com/realworldocaml/mdx&quot;&gt;mdx&lt;/a&gt; to control how it handles the code blocks, for example to skip processing of the block, to synchronise the block with another file, to disable testing the block on particular OSs and so on.&lt;/p&gt; 1510 + &lt;p&gt;As part of the Odoc 3 release we decided to address one of our &lt;a href=&quot;https://github.com/ocaml/odoc/pull/303&quot;&gt;oldest open issues&lt;/a&gt;, that of extracting code blocks from mli/mld files for inclusion into other files. This is similar to the file-sync facility in mdx but it works in the other direction: the canonical source is in the mld/mli file. In order to do this, we now need to use the metadata so we can select which code blocks to extract, and so we needed a more concrete specification of how the metadata should be parsed.&lt;/p&gt; 1511 + &lt;p&gt;We looked at what &lt;a href=&quot;https://github.com/realworldocaml/mdx/blob/main/lib/label.ml#L195-L210&quot;&gt;mdx does&lt;/a&gt;, but the way it works is rather ad-hoc, using very simple String.splits to chop up the metadata. This is OK for mdx as it’s fully in charge of what things the user might want to put into the metadata, but for a general parsing library like odoc.parser we need to be a bit more careful. Daniel Bünzli &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1326#issuecomment-2702260053&quot;&gt;suggested&lt;/a&gt; a simple strategy of atoms and bindings inspired by s-expressions. The idea is that we can have something like this:&lt;/p&gt; 1512 + &lt;pre&gt;{@ocaml atom1 &amp;quot;atom two&amp;quot; key1=value1 &amp;quot;key 2&amp;quot;=&amp;quot;value with spaces&amp;quot;[ 1513 + ... code content ... 
1514 + ]}&lt;/pre&gt; 1515 + &lt;p&gt;Daniel suggested a very minimal escaping rule, whereby a string could contain a literal &amp;quot; by prefixing with a backslash - something like; &amp;quot;value with a \&amp;quot; and spaces&amp;quot;, but we discussed it during the &lt;a href=&quot;https://ocaml.org/governance/platform&quot;&gt;odoc developer meeting&lt;/a&gt; and felt that we might want something a little more familiar. So we took a look at the lexer in &lt;a href=&quot;https://github.com/janestreet/sexplib/blob/master/src/lexer.mll&quot;&gt;sexplib&lt;/a&gt; and found that it follows the &lt;a href=&quot;https://github.com/janestreet/sexplib/blob/d7c5e3adc16fcf0435220c3cd44bb695775020c1/README.org#lexical-conventions-of-s-expression&quot;&gt;lexical conventions&lt;/a&gt; of OCaml’s strings, and decided that would be a reasonable approach for us to follow too.&lt;/p&gt; 1516 + &lt;p&gt;The resulting code, including the extraction logic, was implemented in &lt;a href=&quot;https://github.com/ocaml/odoc/pull/1326/&quot;&gt;PR 1326&lt;/a&gt; mainly by &lt;a href=&quot;https://github.com/panglesd&quot;&gt;panglesd&lt;/a&gt; with a little help from me on the lexer.&lt;/p&gt;</content><id>https://jon.recoil.org/blog/2025/03/code-block-metadata.html</id><title type="text">Code block metadata</title><updated>2025-03-07T00:00:00-00:00</updated><author><uri>https://jon.recoil.org/</uri><name>Jon Ludlam</name></author></entry></feed>
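Note on the "Code block metadata" entry above: the atoms-and-bindings format it describes can be illustrated with a small tokenizer. This is a minimal sketch under stated assumptions, not odoc's parser. The `item` type and the `parse` function are hypothetical names, only spaces separate items, and it implements the minimal backslash-escaping rule mentioned in the post rather than the OCaml string conventions that were eventually adopted.

```ocaml
(* Hypothetical representation of metadata items; not odoc's actual AST. *)
type item = Atom of string | Binding of string * string

(* Split a metadata string into atoms and key=value bindings.
   Assumption: items are separated by spaces only. *)
let parse (s : string) : item list =
  let n = String.length s in
  (* Skip spaces starting at index [i]. *)
  let rec skip i = if i < n && s.[i] = ' ' then skip (i + 1) else i in
  (* Read one token at [i]; return it with the index just past it. *)
  let token i =
    let buf = Buffer.create 16 in
    if i < n && s.[i] = '"' then begin
      (* Quoted token: a backslash makes the next character literal. *)
      let rec go j =
        if j >= n then failwith "unterminated string"
        else
          match s.[j] with
          | '"' -> j + 1
          | '\\' when j + 1 < n ->
              Buffer.add_char buf s.[j + 1];
              go (j + 2)
          | c ->
              Buffer.add_char buf c;
              go (j + 1)
      in
      let j = go (i + 1) in
      (Buffer.contents buf, j)
    end
    else begin
      (* Bare token: runs until a space or an '=' sign. *)
      let rec go j =
        if j < n && s.[j] <> ' ' && s.[j] <> '=' then begin
          Buffer.add_char buf s.[j];
          go (j + 1)
        end
        else j
      in
      let j = go i in
      (Buffer.contents buf, j)
    end
  in
  let rec items i acc =
    let i = skip i in
    if i >= n then List.rev acc
    else
      let key, j = token i in
      if j < n && s.[j] = '=' then
        (* A token immediately followed by '=' starts a binding. *)
        let value, k = token (j + 1) in
        items k (Binding (key, value) :: acc)
      else items j (Atom key :: acc)
  in
  items 0 []

(* Usage: the example metadata string from the post. *)
let example =
  parse {|atom1 "atom two" key1=value1 "key 2"="value with spaces"|}
```

Quoted tokens may appear on either side of `=`, which is what lets `"key 2"="value with spaces"` parse as a single binding.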
+2 -2
deploy-site.sh
···
  #
  # Prerequisites:
  # - opam switch "default" with dune 3.21+ (html_flags + prefix support)
- # - odoc-jon-shell and odoc-interactive-extension in the workspace
+ # - odoc-jons-plugins and odoc-interactive-extension in the workspace
  #
  # Usage:
  # ./deploy-site.sh # build everything and serve on port 8080
···
  dune build @install
  # Shell and extensions must be findlib-visible for odoc to load them.
  # x-ocaml.js is found from _build/install/ so x-ocaml itself doesn't need installing.
- dune install odoc-jon-shell odoc-interactive-extension odoc-scrollycode-extension 2>/dev/null
+ dune install odoc-jons-plugins odoc-interactive-extension odoc-scrollycode-extension 2>/dev/null
  echo " plugins registered"

  echo ""
-11
odoc-jon-shell/dune-project
···
- (lang dune 3.18)
- (using dune_site 0.1)
- (name odoc-jon-shell)
- (generate_opam_files true)
-
- (package
-  (name odoc-jon-shell)
-  (synopsis "Minimal content-first shell for jon.recoil.org")
-  (depends
-   (ocaml (>= 4.14))
-   odoc))
+1 -1
odoc-jon-shell/odoc-jon-shell.opam odoc-jons-plugins/odoc-jons-plugins.opam
···
  # This file is generated by dune, edit dune-project instead
  opam-version: "2.0"
- synopsis: "Minimal content-first shell for jon.recoil.org"
+ synopsis: "odoc shell and extensions for jon.recoil.org"
  depends: [
    "dune" {>= "3.18"}
    "ocaml" {>= "4.14"}
-9
odoc-jon-shell/src/dune
···
- (library
-  (public_name odoc-jon-shell.impl)
-  (name odoc_jon_shell)
-  (libraries odoc.html odoc.extension_api))
-
- (plugin
-  (name odoc-jon-shell)
-  (libraries odoc-jon-shell.impl)
-  (site (odoc extensions)))
+25 -4
odoc-jon-shell/src/odoc_jon_shell.ml odoc-jons-plugins/src/odoc_jons_plugins.ml
···
- (* odoc-jon-shell: A minimal, content-first shell plugin for odoc.
-    Registers as the "jon-shell" shell, usable with --shell jon-shell. *)
+ (* odoc-jons-plugins: Shell and extensions for jon.recoil.org.
+    Registers the "jon-shell" shell and metadata tag extensions. *)

  open Odoc_utils
  module Html = Tyxml.Html
···
    Odoc_extension_registry.register_support_file ~prefix:"jon-shell"
      {
        filename = "extensions/jon-shell.css";
-       content = Inline Odoc_jon_shell_css.css;
+       content = Inline Odoc_jons_plugins_css.css;
      };
    Odoc_extension_registry.register_support_file ~prefix:"jon-shell"
      {
        filename = "extensions/jon-shell.js";
-       content = Inline Odoc_jon_shell_js.js;
+       content = Inline Odoc_jons_plugins_js.js;
      }

  (* Serialize sidebar data to JSON for inline embedding *)
···
        make_src ~config ~url:data.url ~header:data.header
          ~sidebar_data:data.sidebar_data data.title data.content
    end)
+
+ (* --- Metadata tag extensions ---
+
+    Custom tags like @published, @notanotebook, and @packages are used as
+    metadata for tooling (feed generation, blog indexing) but should not
+    appear in the rendered HTML. We register extension handlers that
+    suppress them by returning empty content. *)
+
+ module Api = Odoc_extension_api
+
+ let hidden_tag_extension prefix =
+   let module E = struct
+     let prefix = prefix
+
+     let to_document ~tag:_ _content =
+       Api.simple_output []
+   end in
+   Api.Registry.register (module E)
+
+ let () =
+   List.iter hidden_tag_extension [ "published"; "notanotebook"; "packages" ]
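Note on the added `hidden_tag_extension` function: it builds a local module per tag name and registers it as a first-class module. The pattern can be tried in isolation with the sketch below. Everything here is a hypothetical stand-in: the `EXT` signature, the `registry` table, `register` and `render` are assumptions made for illustration and are not the `Odoc_extension_api` interface, whose actual types are not shown in this diff.

```ocaml
(* Hypothetical extension signature; NOT odoc's real extension API. *)
module type EXT = sig
  val prefix : string
  val to_document : tag:string -> string -> string list
end

(* Hypothetical registry mapping a tag name to its handler module. *)
let registry : (string, (module EXT)) Hashtbl.t = Hashtbl.create 8

let register (module E : EXT) =
  Hashtbl.replace registry E.prefix (module E : EXT)

(* Build and register a handler that discards the content of [prefix] tags. *)
let hidden_tag_extension (prefix : string) =
  let module E = struct
    (* [let] is non-recursive, so the right-hand side is the function argument. *)
    let prefix = prefix
    let to_document ~tag:_ _content = []
  end in
  register (module E)

let () =
  List.iter hidden_tag_extension [ "published"; "notanotebook"; "packages" ]

(* Render a tag: registered handlers win, anything else is shown verbatim. *)
let render tag content =
  match Hashtbl.find_opt registry tag with
  | Some (module E : EXT) -> E.to_document ~tag content
  | None -> [ "@" ^ tag ^ " " ^ content ]
```

Generating the modules from a list of names keeps each suppressed tag to a single string, at the cost of all hidden tags sharing exactly the same behaviour.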
odoc-jon-shell/src/odoc_jon_shell_css.ml odoc-jons-plugins/src/odoc_jons_plugins_css.ml
odoc-jon-shell/src/odoc_jon_shell_js.ml odoc-jons-plugins/src/odoc_jons_plugins_js.ml
+11
odoc-jons-plugins/dune-project
···
+ (lang dune 3.18)
+ (using dune_site 0.1)
+ (name odoc-jons-plugins)
+ (generate_opam_files true)
+
+ (package
+  (name odoc-jons-plugins)
+  (synopsis "odoc shell and extensions for jon.recoil.org")
+  (depends
+   (ocaml (>= 4.14))
+   odoc))
+9
odoc-jons-plugins/src/dune
···
+ (library
+  (public_name odoc-jons-plugins.impl)
+  (name odoc_jons_plugins)
+  (libraries odoc.html odoc.extension_api))
+
+ (plugin
+  (name odoc-jons-plugins)
+  (libraries odoc-jons-plugins.impl)
+  (site (odoc extensions)))