···1+#!/usr/bin/env bash
2+3+# this does the same thing as typst-unlit.lhs, but depends on typst and jq
4+# this script does clobber the line numbers, so users beware
5+6+typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
7+typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
+16
examples/quicksort.lhs
···1+= Quicksort in Haskell
2+The first thing to know about Haskell's syntax is that parentheses
3+are used for grouping, and not for function application.
4+5+```haskell
6+quicksort :: Ord a => [a] -> [a]
7+quicksort [] = []
8+quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
9+ where
10+ lesser = filter (< p) xs
11+ greater = filter (>= p) xs
12+```
13+14+The parentheses indicate the grouping of operands on the
15+right-hand side of equations.
16+
···1+# Typst-Unlit
2+3+*[tangled.org/@oppi.li/typst-unlit](https://tangled.org/@oppi.li/typst-unlit)*
4+5+*Serves: 1 Prep Time: 10min Compile Time: 10ms*
6+7+A literate program is one where comments are first-class citizens, and
8+code is explicitly demarcated, as opposed to a regular program, where
9+comments are explicitly marked, and code is a first-class entity.
10+11+GHC supports literate programming out of the box, by using a
12+preprocessor to extract code from documents. This preprocessor is known
13+as *unlit*[^1]. GHC also supports *custom* preprocessors, which can be
14+passed in via the `-pgmL` flag. This very document you are reading is
15+one such preprocessor that allows embedding Haskell code inside typst
16+files[^2].
17+18+This recipe not only gives you a fish (the typst-unlit preprocessor),
19+but also teaches you how to fish (write your own preprocessors).
20+21+## Ingredients
22+23+<table>
24+<colgroup>
25+<col style="width: 50%" />
26+<col style="width: 50%" />
27+</colgroup>
28+<tbody>
29+<tr>
30+<td><p>To write your own preprocessor:</p>
31+<ul>
32+<li>GHC: the Glorious Haskell Compiler</li>
33+<li>Typst: to generate PDFs</li>
34+<li>And that's it! No stacking, shaking or caballing here.</li>
35+</ul></td>
36+<td><p>To compile this very document:</p>
37+<ul>
38+<li>The bootstrap program</li>
39+<li>GHC: to produce an executable program</li>
40+<li>Typst: to produce a readable PDF</li>
41+</ul></td>
42+</tr>
43+</tbody>
44+</table>
45+46+**Pro Tip:** If you’re missing any ingredients, your local nixpkgs
47+should stock them!
48+49+## Instructions
50+51+The idea behind the unlit program is super simple: iterate over the
52+lines in the supplied input file and replace lines that aren’t Haskell
53+with an empty line! To detect lines that are Haskell, we look for the
54+```` ```haskell ```` directive and stop at the end of the code fence.
55+Simple enough! Annoyingly, Haskell requires that imports be declared at
56+the top of the file. This results in literate Haskell programs always
57+starting with a giant block of imports:
58+59+> -- So first we need to get some boilerplate and imports out of the way.
60+61+— Every literate programmer
62+63+Oh gee, if only we had a tool to put the important stuff first. Our
64+preprocessor will remedy this wart, with the `haskell-top` directive to
65+move blocks to the top. With that out of the way, let's move on to the
66+program itself!
67+68+### Step 1: The main course
69+70+I prefer starting with `main` but you do you. Any program that is passed
71+to `ghc -pgmL` has to accept exactly 4 arguments:
72+73+- `-h`: ignore this for now
74+- `<label>`: ignore this for now
75+- `<infile>`: the input lhaskell source code
76+- `<outfile>`: the output haskell source code
77+78+Invoke the runes to handle CLI arguments:
79+80+```
81+main = do
82+  args <- getArgs
83+  case args of
84+    ["-h", _label, infile, outfile] -> process infile outfile
85+    _ -> die "Usage: typst-unlit -h <label> <source> <destination>"
86+```
87+88+You will need these imports accordingly (notice how I am writing my
89+imports *after* the main function!):
90+91+```
92+import System.Environment (getArgs)
93+import System.Exit (die)
94+```
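To see the four-argument contract on its own, here is a minimal sketch of a preprocessor that only copies `<infile>` to `<outfile>`. The names `parseArgs` and `identity-unlit` are hypothetical and not part of typst-unlit; the argument order is the one listed above.

```haskell
import System.Environment (getArgs)
import System.Exit (die)

-- Hypothetical helper (not part of typst-unlit): pick the input and
-- output paths out of the four arguments described above.
parseArgs :: [String] -> Either String (FilePath, FilePath)
parseArgs ["-h", _label, infile, outfile] = Right (infile, outfile)
parseArgs _ = Left "Usage: identity-unlit -h <label> <source> <destination>"

main :: IO ()
main = do
  args <- getArgs
  case parseArgs args of
    -- Copy the input unchanged; a real preprocessor transforms it here.
    Right (infile, outfile) -> readFile infile >>= writeFile outfile
    Left usage -> die usage
```

Keeping the argument handling in a pure function makes it easy to check in GHCi before wiring up any file IO.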
95+96+Now, we move on to defining `process`:
97+98+### Step 2: The processor
99+100+`process` does a bit of IO to read from the input file, remove comments,
101+and write to the output file; `removeComments`, however, is a pure
102+function:
103+104+```
105+process :: FilePath -> FilePath -> IO ()
106+process infile outfile = do
107+ ls <- lines <$> readFile infile
108+ writeFile outfile $ unlines $ removeComments ls
109+```
110+111+### Step 3: Removing comments
112+113+We will be iterating over lines in the file, and wiping clean those
114+lines that are not Haskell. To do so, we must track some state as we
115+will be jumping in and out of code fences:
116+117+```
118+data State
119+ = OutsideCode
120+ | InHaskell
121+ | InHaskellTop
122+ deriving (Eq, Show)
123+```
124+125+To detect the code fences themselves, we can define a few matcher functions.
126+Here is one for the ```` ```haskell ```` pattern:
127+128+```
129+withTag :: (String -> Bool) -> String -> Bool
130+withTag pred line = length ticks > 2 && pred tag
131+ where (ticks, tag) = span (== '`') line
132+133+isHaskell :: String -> Bool
134+isHaskell = withTag (== "haskell")
135+```
136+137+You will notice that this will also match ````` ````haskell `````, and
138+this is intentional. If your text already contains 3 backticks inside
139+it, you will need 4 backticks in the code fence and so on.
140+141+We do the same exercise for `haskell-top`:
142+143+```
144+isHaskellTop = withTag (== "haskell-top")
145+```
146+147+And for the closing code fences:
148+149+```
150+isCodeEnd = withTag null
151+```
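As a quick sanity check, the sketch below copies the three matchers into a standalone program and runs them over a few sample lines. The `classify` helper and the sample lines are made up for illustration, and the predicate argument is named `p` here only to avoid shadowing the Prelude's `pred`.

```haskell
-- Self-contained copy of the matchers above, applied to a few sample lines.
withTag :: (String -> Bool) -> String -> Bool
withTag p line = length ticks > 2 && p tag
  where (ticks, tag) = span (== '`') line

isHaskell, isHaskellTop, isCodeEnd :: String -> Bool
isHaskell    = withTag (== "haskell")
isHaskellTop = withTag (== "haskell-top")
isCodeEnd    = withTag null

-- Hypothetical helper: name the kind of fence a line is, if any.
classify :: String -> String
classify l
  | isHaskellTop l = "haskell-top"
  | isHaskell l    = "haskell"
  | isCodeEnd l    = "end"
  | otherwise      = "text"

main :: IO ()
main = mapM_ (\l -> putStrLn (show l ++ " -> " ++ classify l)) samples
  where
    samples =
      [ "```haskell"      -- opening fence
      , "````haskell"     -- four backticks also match
      , "```haskell-top"  -- block to be hoisted
      , "```"             -- closing fence
      , "``haskell"       -- only two backticks: plain text
      , "  ```haskell"    -- indented fence: plain text
      , "```haskell "     -- trailing space after the tag: plain text
      ]
```

The last two samples show that the match is strict: a fence is only recognized when the backticks start the line and the tag matches exactly.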
152+153+`removeComments` itself is just a filter that takes a list of lines
154+and removes comments from those lines:
155+156+```
157+removeComments :: [String] -> [String]
158+removeComments ls = go OutsideCode ls [] []
159+```
160+161+Finally, `go` is a recursive function that starts with some `State`, a
162+list of input lines, and two more empty lists that are used to store the
163+lines of code that go at the top (using the `haskell-top` directive),
164+and the ones that go below, using the `haskell` directive:
165+166+```
167+go :: State -> [String] -> [String] -> [String] -> [String]
168+```
169+170+When the input lines are exhausted, we just combine the `top` and `bot`
171+stacks of lines (reversed, since lines were prepended) to form the file:
172+173+```
174+go _ [] top bot = reverse top ++ reverse bot
175+```
176+177+Next, whenever we are `OutsideCode` and the current line contains a
178+directive, we must update the state to enter a code block:
179+180+```
181+go OutsideCode (x : rest) top bot
182+ | isHaskellTop x = go InHaskellTop rest top ("" : bot)
183+ | isHaskell x = go InHaskell rest top ("" : bot)
184+ | otherwise = go OutsideCode rest top ("" : bot)
185+```
186+187+When we are already inside a Haskell code block, encountering a
188+triple-tick should exit the code block, and any other line encountered
189+in the block is to be included in the final file, but below the imports:
190+191+```
192+go InHaskell (x : rest) top bot
193+ | isCodeEnd x = go OutsideCode rest top ("" : bot)
194+ | otherwise = go InHaskell rest top (x : bot)
195+```
196+197+And similarly, for blocks that start with the `haskell-top` directive,
198+lines encountered here go into the `top` stack:
199+200+```
201+go InHaskellTop (x : rest) top bot
202+ | isCodeEnd x = go OutsideCode rest top ("" : bot)
203+ | otherwise = go InHaskellTop rest (x : top) bot
204+```
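To watch the whole state machine at work, the sketch below restates the definitions from this step in one standalone module and runs them on a made-up eight-line document. The `sample` input is an assumption for illustration, and the matchers are inlined through `withTag` to keep the block short.

```haskell
data State = OutsideCode | InHaskell | InHaskellTop

withTag :: (String -> Bool) -> String -> Bool
withTag p line = length ticks > 2 && p tag
  where (ticks, tag) = span (== '`') line

removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []

-- Same clauses as above: prose and fences become blank lines in `bot`,
-- haskell lines go to `bot`, haskell-top lines go to `top`.
go :: State -> [String] -> [String] -> [String] -> [String]
go _ [] top bot = reverse top ++ reverse bot
go OutsideCode (x : rest) top bot
  | withTag (== "haskell-top") x = go InHaskellTop rest top ("" : bot)
  | withTag (== "haskell") x     = go InHaskell rest top ("" : bot)
  | otherwise                    = go OutsideCode rest top ("" : bot)
go InHaskell (x : rest) top bot
  | withTag null x = go OutsideCode rest top ("" : bot)
  | otherwise      = go InHaskell rest top (x : bot)
go InHaskellTop (x : rest) top bot
  | withTag null x = go OutsideCode rest top ("" : bot)
  | otherwise      = go InHaskellTop rest (x : top) bot

-- Made-up sample document: the import appears after the code that needs it.
sample :: [String]
sample =
  [ "= Title"
  , "```haskell"
  , "main = print (sort [3, 1, 2])"
  , "```"
  , "Some prose."
  , "```haskell-top"
  , "import Data.List (sort)"
  , "```"
  ]

main :: IO ()
main = mapM_ print (removeComments sample)
```

For this input the result has the same eight lines as the source: the import comes first, and the `main` line lands on line 4 instead of line 3, because one hoisted line now sits above it.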
205+206+And that's it! Gently tap the baking pan against the table and let your
207+code settle. Once it is set, you can compile the preprocessor like so:
208+209+```
210+ghc -o typst-unlit typst-unlit.hs
211+```
212+213+And now, we can execute our preprocessor on literate Haskell files!
214+215+## Serving
216+217+To test our preprocessor, first, write a literate Haskell file
218+containing your typst code:
219+220+````
221+ = Quicksort in Haskell
222+ The first thing to know about Haskell's syntax is that parentheses
223+ are used for grouping, and not for function application.
224+225+ ```haskell
226+ quicksort :: Ord a => [a] -> [a]
227+ quicksort [] = []
228+ quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
229+ where
230+ lesser = filter (< p) xs
231+ greater = filter (>= p) xs
232+ ```
233+234+ The parentheses indicate the grouping of operands on the
235+ right-hand side of equations.
236+````
237+238+Remember to save that as a `.lhs` file, say `quicksort.lhs`. Now you can
239+compile it with both `ghc` …
240+241+```
242+ghci -pgmL ./typst-unlit quicksort.lhs
243+GHCi, version 9.10.3: https://www.haskell.org/ghc/ :? for help
244+[1 of 2] Compiling Main ( quicksort.lhs, interpreted )
245+Ok, one module loaded.
246+ghci> quicksort [3,2,4,1,5,4]
247+[1,2,3,4,4,5]
248+```
249+250+… and `typst`:
251+252+```
253+typst compile quicksort.lhs
254+```
255+256+And there you have it! One file that can be interpreted by `ghc` and
257+rendered beautifully with `typst` simultaneously.
258+259+#### Notes
260+261+This entire document is just a bit of ceremony around writing
262+preprocessors; the Haskell code in this file can be summarized in this
263+shell script:
264+265+```
266+#!/usr/bin/env bash
267+268+# this does the same thing as typst-unlit.lhs, but depends on `typst` and `jq`
269+270+typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
271+typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
272+```
273+274+[^1]: <https://gitlab.haskell.org/ghc/ghc/-/tree/master/utils/unlit>
275+276+[^2]: This document needs itself to compile itself! This is why a
277+ bootstrap program is included.
···1+#set document(title: [Typst-Unlit])
2+#set par(justify: true)
3+#show raw.where(lang: "haskell"): set align(center)
4+#show raw.where(lang: "haskell-top"): set align(center)
5+#show title: set align(center)
6+#show <subtitle>: set align(center)
7+8+#title()
9+10+Write literate Haskell programs in Typst <subtitle>
11+12+_#link("https://tangled.org/@oppi.li/typst-unlit")[tangled.org/\@oppi.li/typst-unlit]_ <subtitle>
13+14+*Serves: 1 #h(20pt) Prep Time: 10min #h(20pt) Compile Time: 10ms* <subtitle>
15+16+A literate program is one where comments are first-class citizens, and code is explicitly demarcated, as opposed to a regular program, where comments are explicitly marked, and code is a first-class entity.
17+18+GHC supports literate programming out of the box, by using a preprocessor to extract code from documents. This preprocessor is known as _unlit_ #footnote[https://gitlab.haskell.org/ghc/ghc/-/tree/master/utils/unlit]. GHC also supports _custom_ preprocessors, which can be passed in via the `-pgmL` flag. This very document you are reading is one such preprocessor that allows embedding Haskell code inside typst files #footnote[This document needs itself to compile itself! This is why a bootstrap program is included.].
19+20+This recipe not only gives you a fish (the typst-unlit preprocessor), but also teaches you how to fish (write your own preprocessors).
21+22+= Ingredients
23+24+#table(
25+ columns: (1fr, 1fr),
26+ gutter: 3pt,
27+ stroke: none,
28+ table.cell(inset: 10pt)[
29+ To write your own preprocessor:
30+ - GHC: the Glorious Haskell Compiler
31+ - Typst: to generate PDFs
32+ - And that's it! No stacking, shaking or caballing here.
33+ ],
34+ table.cell(inset: 10pt)[
35+ To compile this very document:
36+ - The bootstrap program
37+ - GHC: to produce an executable program
38+ - Typst: to produce a readable PDF
39+ ],
40+)
41+42+*Pro Tip:* If you're missing any ingredients, your local nixpkgs should stock them!
43+44+= Instructions
45+46+The idea behind the unlit program is super simple: iterate over the lines in the supplied input file and replace lines that aren't Haskell with an empty line! To detect lines that are Haskell, we look for the #raw("\u{0060}\u{0060}\u{0060}haskell") directive and stop at the end of the code fence. Simple enough! Annoyingly, Haskell requires that imports be declared at the top of the file. This results in literate Haskell programs always starting with a giant block of imports:
47+48+#set quote(block: true)
49+#quote(attribution: [Every literate programmer])[
50+```
51+ -- So first we need to get some boilerplate and imports out of the way.
52+```
53+]
54+55+Oh gee, if only we had a tool to put the important stuff first. Our preprocessor will remedy this wart, with the `haskell-top` directive to move blocks to the top. With that out of the way, let's move on to the program itself!
56+57+#pagebreak()
58+59+== Step 1: The main course
60+61+I prefer starting with `main` but you do you. Any program that is passed to `ghc -pgmL` has to accept exactly 4 arguments:
62+63+- `-h`: ignore this for now
64+- `<label>`: ignore this for now
65+- `<infile>`: the input lhaskell source code
66+- `<outfile>`: the output Haskell source code
67+68+Invoke the runes to handle CLI arguments:
69+70+```haskell
71+main = do
72+  args <- getArgs
73+  case args of
74+    ["-h", _label, infile, outfile] -> process infile outfile
75+    _ -> die "Usage: typst-unlit -h <label> <source> <destination>"
76+```
77+78+You will need these imports accordingly (notice how I am writing my imports _after_ the main function!):
79+80+```haskell-top
81+import System.Environment (getArgs)
82+import System.Exit (die)
83+```
84+85+Now, we move on to defining `process`:
86+87+== Step 2: The processor
88+89+`process` does a bit of IO to read from the input file, remove comments, and write to the output file; `removeComments`, however, is a pure function:
90+91+```haskell
92+process :: FilePath -> FilePath -> IO ()
93+process infile outfile = do
94+ ls <- lines <$> readFile infile
95+ writeFile outfile $ unlines $ removeComments ls
96+```
97+98+== Step 3: Removing comments
99+100+We will be iterating over lines in the file, and wiping clean those lines that are not Haskell. To do so, we must track some state as we will be jumping in and out of code fences:
101+102+```haskell
103+data State
104+ = OutsideCode
105+ | InHaskell
106+ | InHaskellTop
107+ deriving (Eq, Show)
108+```
109+110+To detect the code fences themselves, we can define a few matcher functions. Here is one for the #raw("\u{0060}\u{0060}\u{0060}haskell") pattern:
111+112+```haskell
113+withTag :: (String -> Bool) -> String -> Bool
114+withTag pred line = length ticks > 2 && pred tag
115+ where (ticks, tag) = span (== '`') line
116+117+isHaskell :: String -> Bool
118+isHaskell = withTag (== "haskell")
119+```
120+121+You will notice that this will also match #raw("\u{0060}\u{0060}\u{0060}\u{0060}haskell"), and this is intentional. If your text already contains 3 backticks inside it, you will need 4 backticks in the code fence and so on.
122+123+We do the same exercise for `haskell-top`:
124+125+```haskell
126+isHaskellTop = withTag (== "haskell-top")
127+```
128+129+And for the closing code fences:
130+131+```haskell
132+isCodeEnd = withTag null
133+```
134+135+`removeComments` itself is just a filter that takes a list of lines and removes comments from those lines:
136+137+```haskell
138+removeComments :: [String] -> [String]
139+removeComments ls = go OutsideCode ls [] []
140+```
141+142+Finally, `go` is a recursive function that starts with some `State`, a list of input lines, and two more empty lists that are used to store the lines of code that go at the top (using the `haskell-top` directive), and the ones that go below, using the `haskell` directive:
143+144+```haskell
145+go :: State -> [String] -> [String] -> [String] -> [String]
146+```
147+148+When the input lines are exhausted, we just combine the `top` and `bot` stacks of lines (reversed, since lines were prepended) to form the file:
149+150+```haskell
151+go _ [] top bot = reverse top ++ reverse bot
152+```
153+154+Next, whenever we are `OutsideCode` and the current line contains a directive, we must update the state to enter a code block:
155+156+```haskell
157+go OutsideCode (x : rest) top bot
158+ | isHaskellTop x = go InHaskellTop rest top ("" : bot)
159+ | isHaskell x = go InHaskell rest top ("" : bot)
160+ | otherwise = go OutsideCode rest top ("" : bot)
161+```
162+163+When we are already inside a Haskell code block, encountering a triple-tick should exit the code block, and any other line encountered in the block is to be included in the final file, but below the imports:
164+165+```haskell
166+go InHaskell (x : rest) top bot
167+ | isCodeEnd x = go OutsideCode rest top ("" : bot)
168+ | otherwise = go InHaskell rest top (x : bot)
169+```
170+171+And similarly, for blocks that start with the `haskell-top` directive, lines encountered here go into the `top` stack:
172+173+```haskell
174+go InHaskellTop (x : rest) top bot
175+ | isCodeEnd x = go OutsideCode rest top ("" : bot)
176+ | otherwise = go InHaskellTop rest (x : top) bot
177+```
178+179+And that's it! Gently tap the baking pan against the table and let your code settle. Once it is set, you can compile the preprocessor like so:
180+181+```bash
182+ghc -o typst-unlit typst-unlit.hs
183+```
184+185+And now, we can execute our preprocessor on literate Haskell files!
186+187+#pagebreak()
188+189+= Serving
190+191+To test our preprocessor, first, write a literate Haskell file containing your typst code:
192+193+````typst
194+ = Quicksort in Haskell
195+ The first thing to know about Haskell's syntax is that parentheses
196+ are used for grouping, and not for function application.
197+198+ ```haskell
199+ quicksort :: Ord a => [a] -> [a]
200+ quicksort [] = []
201+ quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
202+ where
203+ lesser = filter (< p) xs
204+ greater = filter (>= p) xs
205+ ```
206+207+ The parentheses indicate the grouping of operands on the
208+ right-hand side of equations.
209+````
210+211+Remember to save that as a `.lhs` file, say `quicksort.lhs`. Now you can compile it with both `ghc` ...
212+213+```bash
214+ghci -pgmL ./typst-unlit quicksort.lhs
215+GHCi, version 9.10.3: https://www.haskell.org/ghc/ :? for help
216+[1 of 2] Compiling Main ( quicksort.lhs, interpreted )
217+Ok, one module loaded.
218+ghci> quicksort [3,2,4,1,5,4]
219+[1,2,3,4,4,5]
220+```
221+222+... and `typst`:
223+224+```bash
225+typst compile quicksort.lhs
226+```
227+228+And there you have it! One file that can be interpreted by `ghc` and rendered beautifully with `typst` simultaneously.
229+230+=== Notes
231+232+This entire document is just a bit of ceremony around writing preprocessors; the Haskell code in this file can be summarized in this shell script:
233+234+```bash
235+#!/usr/bin/env bash
236+237+# this does the same thing as typst-unlit.lhs, but depends on typst and jq
238+# this script does clobber the line numbers, so users beware
239+240+typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
241+typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
242+```
243+244+This document mentions the word "Haskell" 60 times.