#!/usr/bin/env bash

# this does the same thing as typst-unlit.lhs, but depends on typst and jq
# this script does clobber the line numbers, so users beware

typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
+16
examples/quicksort.lhs
= Quicksort in Haskell
The first thing to know about Haskell's syntax is that parentheses
are used for grouping, and not for function application.

```haskell
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
  where
    lesser = filter (< p) xs
    greater = filter (>= p) xs
```

The parentheses indicate the grouping of operands on the
right-hand side of equations.
# Typst-Unlit

*[tangled.org/@oppi.li/typst-unlit](https://tangled.org/@oppi.li/typst-unlit)*

*Serves: 1 Prep Time: 10min Compile Time: 10ms*

A literate program is one where comments are first-class citizens and
code is explicitly demarcated, as opposed to a regular program, where
comments are explicitly marked and code is the first-class entity.

GHC supports literate programming out of the box, by using a
preprocessor to extract code from documents. This preprocessor is known
as *unlit*[^1]. GHC also supports *custom* preprocessors, which can be
passed in via the `-pgmL` flag. The very document you are reading is
one such preprocessor, one that allows embedding Haskell code inside
Typst files[^2].

This recipe not only gives you a fish (the typst-unlit preprocessor),
but also teaches you how to fish (write your own preprocessors).

## Ingredients

<table>
<colgroup>
<col style="width: 50%" />
<col style="width: 50%" />
</colgroup>
<tbody>
<tr>
<td><p>To write your own preprocessor:</p>
<ul>
<li>GHC: the Glorious Haskell Compiler</li>
<li>Typst: to generate PDFs</li>
<li>And that’s it! No stacking, shaking or caballing here.</li>
</ul></td>
<td><p>To compile this very document:</p>
<ul>
<li>The bootstrap program</li>
<li>GHC: to produce an executable program</li>
<li>Typst: to produce a readable PDF</li>
</ul></td>
</tr>
</tbody>
</table>

**Pro Tip:** If you’re missing any ingredients, your local nixpkgs
should stock them!

## Instructions

The idea behind the unlit program is super simple: iterate over the
lines in the supplied input file and replace every line that isn’t Haskell
with an empty line! To detect lines that are Haskell, we look for the
```` ```haskell ```` directive and stop at the end of the code fence.
Simple enough! Annoyingly, Haskell requires that imports be declared at
the top of the file. This results in literate Haskell programs always
starting with a giant block of imports:

> -- So first we need to get some boilerplate and imports out of the way.

— Every literate programmer

Oh gee, if only we had a tool to put the important stuff first. Our
preprocessor will remedy this wart with the `haskell-top` directive, which
moves blocks to the top. With that out of the way, let’s move on to the
program itself!
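
As a rough illustration of the line-blanking idea described above, here is a minimal sketch that ignores `haskell-top` entirely. It is not the program built in the steps below: the names `isFence` and `blankProse` are hypothetical, and it assumes an opening fence is exactly the line ```` ```haskell ````.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical helper: any line that starts with three backticks.
isFence :: String -> Bool
isFence = ("```" `isPrefixOf`)

-- Hypothetical sketch: keep lines inside haskell fences, blank everything
-- else. Every input line produces exactly one output line.
blankProse :: [String] -> [String]
blankProse = step False
  where
    step _ [] = []
    step inCode (x : rest)
      | inCode && isFence x = "" : step False rest  -- closing fence
      | inCode              = x  : step True rest   -- code line, kept
      | x == "```haskell"   = "" : step True rest   -- opening fence
      | otherwise           = "" : step False rest  -- prose, blanked

main :: IO ()
main = print (blankProse ["= Title", "```haskell", "x = 1", "```", "prose"])
```

For the five sample lines, only `x = 1` survives and the other four positions become empty strings, so the code stays on the same line number it had in the document.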

### Step 1: The main course

I prefer starting with `main` but you do you. Any program that is passed
to `ghc -pgmL` has to accept exactly 4 arguments:

- `-h`: ignore this for now
- `<label>`: ignore this for now
- `<infile>`: the input literate Haskell source code
- `<outfile>`: the output Haskell source code

Invoke the runes to handle CLI arguments:

```
main = do
  args <- getArgs
  case args of
    ["-h", _label, infile, outfile] -> process infile outfile
    _ -> die "Usage: typst-unlit -h <label> <source> <destination>"
```

You will need these imports accordingly (notice how I am writing my
imports *after* the main function!):

```
import System.Environment (getArgs)
import System.Exit (die)
```
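
To see the four-argument contract in isolation, here is a hedged sketch of a pass-through preprocessor that simply copies the input to the output. The names `parseArgs` and `passthrough` are hypothetical and not part of this repository. Such a program would presumably only be useful with `-pgmL` when the `.lhs` file already contains plain Haskell.

```haskell
import System.Environment (getArgs)
import System.Exit (die)

-- Hypothetical helper: pick the input and output paths out of the
-- argument list, ignoring the "-h" flag and the label.
parseArgs :: [String] -> Maybe (FilePath, FilePath)
parseArgs ["-h", _label, infile, outfile] = Just (infile, outfile)
parseArgs _ = Nothing

main :: IO ()
main = do
  args <- getArgs
  case parseArgs args of
    -- Copy the file unchanged; a real preprocessor transforms it here.
    Just (infile, outfile) -> readFile infile >>= writeFile outfile
    Nothing -> die "Usage: passthrough -h <label> <source> <destination>"
```

Keeping the argument handling in a pure function like `parseArgs` is a design choice of this sketch only; the recipe itself matches on `args` directly inside `main`.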

Now, we move on to defining `process`:

### Step 2: The processor

`process` does a bit of IO to read from the input file, remove comments,
and write to the output file. `removeComments`, however, is a pure
function:

```
process :: FilePath -> FilePath -> IO ()
process infile outfile = do
  ls <- lines <$> readFile infile
  writeFile outfile $ unlines $ removeComments ls
```

### Step 3: Removing comments

We will be iterating over lines in the file, and wiping clean those
lines that are not Haskell. To do so, we must track some state as we
will be jumping in and out of code fences:

```
data State
  = OutsideCode
  | InHaskell
  | InHaskellTop
  deriving (Eq, Show)
```

To detect the code fences themselves, we can define a few matcher
functions. Here is one for the ```` ```haskell ```` pattern:

```
withTag :: (String -> Bool) -> String -> Bool
withTag pred line = length ticks > 2 && pred tag
  where (ticks, tag) = span (== '`') line

isHaskell :: String -> Bool
isHaskell = withTag (== "haskell")
```

You will notice that this will also match ````` ````haskell `````, and
this is intentional. If your text already contains 3 backticks inside
it, you will need 4 backticks in the code fence and so on.
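
To make the matching behaviour concrete, the sketch below repeats the two definitions from this step (with the predicate argument renamed to `p`, an arbitrary choice) and tries them on a few sample lines. The sample lines are illustrative and not taken from the repository.

```haskell
withTag :: (String -> Bool) -> String -> Bool
withTag p line = length ticks > 2 && p tag
  where (ticks, tag) = span (== '`') line

isHaskell :: String -> Bool
isHaskell = withTag (== "haskell")

main :: IO ()
main = mapM_ (print . isHaskell)
  [ "```haskell"      -- True: three backticks and the exact tag
  , "````haskell"     -- True: longer fences match as well
  , "``haskell"       -- False: only two backticks
  , "```haskell-top"  -- False: the tag must match exactly
  , "  ```haskell"    -- False: the line does not start with a backtick
  ]
```

The last case follows from `span` stopping at the first character that is not a backtick, so an indented fence is not recognised.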

We do the same exercise for `haskell-top`:

```
isHaskellTop = withTag (== "haskell-top")
```

And for the closing code fences:

```
isCodeEnd = withTag null
```

`removeComments` itself is just a filter that takes a list of lines
and removes comments from those lines:

```
removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []
```

Finally, `go` is a recursive function that starts with some `State`, a
list of input lines, and two more empty lists that are used to store the
lines of code that go at the top (using the `haskell-top` directive),
and the ones that go below (using the `haskell` directive):

```
go :: State -> [String] -> [String] -> [String] -> [String]
```

When there are no more input lines, we just combine the `top` and `bot`
stacks of lines to form the file:

```
go _ [] top bot = reverse top ++ reverse bot
```
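
The `reverse` calls are needed because `top` and `bot` are built by prepending, so they hold lines in reverse order. A tiny sketch of that accumulate-then-reverse pattern, using a hypothetical function `collect` that is not part of the preprocessor:

```haskell
-- Hypothetical illustration: walk a list, prepend each element to an
-- accumulator, and restore the original order at the end.
collect :: [a] -> [a]
collect = step []
  where
    step acc []       = reverse acc        -- undo the reversal once
    step acc (x : xs) = step (x : acc) xs  -- prepending is constant time

main :: IO ()
main = print (collect "abc")
```

Prepending and reversing once at the end avoids repeatedly appending to the end of a list, which would walk the whole list each time.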

Next, whenever we are `OutsideCode` and the current line contains a
directive, we must update the state to enter a code block:

```
go OutsideCode (x : rest) top bot
  | isHaskellTop x = go InHaskellTop rest top ("" : bot)
  | isHaskell x = go InHaskell rest top ("" : bot)
  | otherwise = go OutsideCode rest top ("" : bot)
```

When we are already inside a Haskell code block, encountering a
triple-tick should exit the code block, and any other line encountered
in the block is to be included in the final file, but below the imports:

```
go InHaskell (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskell rest top (x : bot)
```

And similarly, for blocks that start with the `haskell-top` directive,
lines encountered here go into the `top` stack:

```
go InHaskellTop (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskellTop rest (x : top) bot
```
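
Putting the pieces of this step together, the sketch below collects the definitions above into one file and runs `removeComments` on a small made-up document. The `sample` input is hypothetical, and the predicate argument of `withTag` is renamed to `p`; the rest follows the code in this step.

```haskell
data State = OutsideCode | InHaskell | InHaskellTop
  deriving (Eq, Show)

withTag :: (String -> Bool) -> String -> Bool
withTag p line = length ticks > 2 && p tag
  where (ticks, tag) = span (== '`') line

isHaskell, isHaskellTop, isCodeEnd :: String -> Bool
isHaskell    = withTag (== "haskell")
isHaskellTop = withTag (== "haskell-top")
isCodeEnd    = withTag null

removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []

go :: State -> [String] -> [String] -> [String] -> [String]
go _ [] top bot = reverse top ++ reverse bot
go OutsideCode (x : rest) top bot
  | isHaskellTop x = go InHaskellTop rest top ("" : bot)
  | isHaskell x    = go InHaskell rest top ("" : bot)
  | otherwise      = go OutsideCode rest top ("" : bot)
go InHaskell (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise   = go InHaskell rest top (x : bot)
go InHaskellTop (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise   = go InHaskellTop rest (x : top) bot

-- Hypothetical sample input: the import is written after main.
sample :: [String]
sample =
  [ "= Demo"
  , "```haskell"
  , "main = print (sort [2, 1])"
  , "```"
  , "```haskell-top"
  , "import Data.List (sort)"
  , "```"
  ]

main :: IO ()
main = mapM_ putStrLn (removeComments sample)
```

Tracing this by hand, the import is hoisted to the first line and `main` follows on line 4, with empty lines everywhere else. The output has the same seven lines as the input, but `main` has moved down from line 3 because the hoisted import is prepended; line numbers after a `haskell-top` block are therefore shifted by the number of hoisted lines.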

And that’s it! Gently tap the baking pan against the table and let your
code settle. Once it is set, you can compile the preprocessor like so:

```
ghc -o typst-unlit typst-unlit.hs
```

And now, we can execute our preprocessor on literate Haskell files!

## Serving

To test our preprocessor, first, write a literate Haskell file
containing your Typst code:

````
  = Quicksort in Haskell
  The first thing to know about Haskell's syntax is that parentheses
  are used for grouping, and not for function application.

  ```haskell
  quicksort :: Ord a => [a] -> [a]
  quicksort [] = []
  quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
      lesser = filter (< p) xs
      greater = filter (>= p) xs
  ```

  The parentheses indicate the grouping of operands on the
  right-hand side of equations.
````

Remember to save that as a `.lhs` file, say `quicksort.lhs`. Now you can
compile it with both `ghc` …

```
ghci -pgmL ./typst-unlit quicksort.lhs
GHCi, version 9.10.3: https://www.haskell.org/ghc/ :? for help
[1 of 2] Compiling Main ( quicksort.lhs, interpreted )
Ok, one module loaded.
ghci> quicksort [3,2,4,1,5,4]
[1,2,3,4,4,5]
```

… and `typst`:

```
typst compile quicksort.lhs
```

And there you have it! One file that can be interpreted by `ghc` and
rendered beautifully with `typst` simultaneously.

#### Notes

This entire document is just a bit of ceremony around writing
preprocessors; the Haskell code in this file can be summarized in this
shell script:

```
#!/usr/bin/env bash

# this does the same thing as typst-unlit.lhs, but depends on `typst` and `jq`

typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
```

[^1]: <https://gitlab.haskell.org/ghc/ghc/-/tree/master/utils/unlit>

[^2]: This document needs itself to compile itself! This is why a
    bootstrap program is included.
+244
typst-unlit.lhs
#set document(title: [Typst-Unlit])
#set par(justify: true)
#show raw.where(lang: "haskell"): set align(center)
#show raw.where(lang: "haskell-top"): set align(center)
#show title: set align(center)
#show <subtitle>: set align(center)

#title()

Write literate Haskell programs in Typst <subtitle>

_#link("https://tangled.org/@oppi.li/typst-unlit")[tangled.org/\@oppi.li/typst-unlit]_ <subtitle>

*Serves: 1 #h(20pt) Prep Time: 10min #h(20pt) Compile Time: 10ms* <subtitle>

A literate program is one where comments are first-class citizens and code is explicitly demarcated, as opposed to a regular program, where comments are explicitly marked and code is the first-class entity.

GHC supports literate programming out of the box, by using a preprocessor to extract code from documents. This preprocessor is known as _unlit_ #footnote[https://gitlab.haskell.org/ghc/ghc/-/tree/master/utils/unlit]. GHC also supports _custom_ preprocessors, which can be passed in via the `-pgmL` flag. The very document you are reading is one such preprocessor, one that allows embedding Haskell code inside Typst files #footnote[This document needs itself to compile itself! This is why a bootstrap program is included.].

This recipe not only gives you a fish (the typst-unlit preprocessor), but also teaches you how to fish (write your own preprocessors).

= Ingredients

#table(
  columns: (1fr, 1fr),
  gutter: 3pt,
  stroke: none,
  table.cell(inset: 10pt)[
    To write your own preprocessor:
    - GHC: the Glorious Haskell Compiler
    - Typst: to generate PDFs
    - And that's it! No stacking, shaking or caballing here.
  ],
  table.cell(inset: 10pt)[
    To compile this very document:
    - The bootstrap program
    - GHC: to produce an executable program
    - Typst: to produce a readable PDF
  ],
)

*Pro Tip:* If you're missing any ingredients, your local nixpkgs should stock them!

= Instructions

The idea behind the unlit program is super simple: iterate over the lines in the supplied input file and replace every line that isn't Haskell with an empty line! To detect lines that are Haskell, we look for the #raw("\u{0060}\u{0060}\u{0060}haskell") directive and stop at the end of the code fence. Simple enough! Annoyingly, Haskell requires that imports be declared at the top of the file. This results in literate Haskell programs always starting with a giant block of imports:

#set quote(block: true)
#quote(attribution: [Every literate programmer])[
```
 -- So first we need to get some boilerplate and imports out of the way.
```
]

Oh gee, if only we had a tool to put the important stuff first. Our preprocessor will remedy this wart with the `haskell-top` directive, which moves blocks to the top. With that out of the way, let's move on to the program itself!

#pagebreak()

== Step 1: The main course

I prefer starting with `main` but you do you. Any program that is passed to `ghc -pgmL` has to accept exactly 4 arguments:

- `-h`: ignore this for now
- `<label>`: ignore this for now
- `<infile>`: the input literate Haskell source code
- `<outfile>`: the output Haskell source code

Invoke the runes to handle CLI arguments:

```haskell
main = do
  args <- getArgs
  case args of
    ["-h", _label, infile, outfile] -> process infile outfile
    _ -> die "Usage: typst-unlit -h <label> <source> <destination>"
```

You will need these imports accordingly (notice how I am writing my imports _after_ the main function!):

```haskell-top
import System.Environment (getArgs)
import System.Exit (die)
```

Now, we move on to defining `process`:

== Step 2: The processor

`process` does a bit of IO to read from the input file, remove comments, and write to the output file. `removeComments`, however, is a pure function:

```haskell
process :: FilePath -> FilePath -> IO ()
process infile outfile = do
  ls <- lines <$> readFile infile
  writeFile outfile $ unlines $ removeComments ls
```

== Step 3: Removing comments

We will be iterating over lines in the file, and wiping clean those lines that are not Haskell. To do so, we must track some state as we will be jumping in and out of code fences:

```haskell
data State
  = OutsideCode
  | InHaskell
  | InHaskellTop
  deriving (Eq, Show)
```

To detect the code fences themselves, we can define a few matcher functions. Here is one for the #raw("\u{0060}\u{0060}\u{0060}haskell") pattern:

```haskell
withTag :: (String -> Bool) -> String -> Bool
withTag pred line = length ticks > 2 && pred tag
  where (ticks, tag) = span (== '`') line

isHaskell :: String -> Bool
isHaskell = withTag (== "haskell")
```

You will notice that this will also match #raw("\u{0060}\u{0060}\u{0060}\u{0060}haskell"), and this is intentional. If your text already contains 3 backticks inside it, you will need 4 backticks in the code fence and so on.

We do the same exercise for `haskell-top`:

```haskell
isHaskellTop = withTag (== "haskell-top")
```

And for the closing code fences:

```haskell
isCodeEnd = withTag null
```

`removeComments` itself is just a filter that takes a list of lines and removes comments from those lines:

```haskell
removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []
```

Finally, `go` is a recursive function that starts with some `State`, a list of input lines, and two more empty lists that are used to store the lines of code that go at the top (using the `haskell-top` directive), and the ones that go below (using the `haskell` directive):

```haskell
go :: State -> [String] -> [String] -> [String] -> [String]
```

When there are no more input lines, we just combine the `top` and `bot` stacks of lines to form the file:

```haskell
go _ [] top bot = reverse top ++ reverse bot
```

Next, whenever we are `OutsideCode` and the current line contains a directive, we must update the state to enter a code block:

```haskell
go OutsideCode (x : rest) top bot
  | isHaskellTop x = go InHaskellTop rest top ("" : bot)
  | isHaskell x = go InHaskell rest top ("" : bot)
  | otherwise = go OutsideCode rest top ("" : bot)
```

When we are already inside a Haskell code block, encountering a triple-tick should exit the code block, and any other line encountered in the block is to be included in the final file, but below the imports:

```haskell
go InHaskell (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskell rest top (x : bot)
```

And similarly, for blocks that start with the `haskell-top` directive, lines encountered here go into the `top` stack:

```haskell
go InHaskellTop (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskellTop rest (x : top) bot
```

And that's it! Gently tap the baking pan against the table and let your code settle. Once it is set, you can compile the preprocessor like so:

```bash
ghc -o typst-unlit typst-unlit.hs
```

And now, we can execute our preprocessor on literate Haskell files!

#pagebreak()

= Serving

To test our preprocessor, first, write a literate Haskell file containing your Typst code:

````typst
  = Quicksort in Haskell
  The first thing to know about Haskell's syntax is that parentheses
  are used for grouping, and not for function application.

  ```haskell
  quicksort :: Ord a => [a] -> [a]
  quicksort [] = []
  quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
      lesser = filter (< p) xs
      greater = filter (>= p) xs
  ```

  The parentheses indicate the grouping of operands on the
  right-hand side of equations.
````

Remember to save that as a `.lhs` file, say `quicksort.lhs`. Now you can compile it with both `ghc` ...

```bash
ghci -pgmL ./typst-unlit quicksort.lhs
GHCi, version 9.10.3: https://www.haskell.org/ghc/ :? for help
[1 of 2] Compiling Main ( quicksort.lhs, interpreted )
Ok, one module loaded.
ghci> quicksort [3,2,4,1,5,4]
[1,2,3,4,4,5]
```

... and `typst`:

```bash
typst compile quicksort.lhs
```

And there you have it! One file that can be interpreted by `ghc` and rendered beautifully with `typst` simultaneously.

=== Notes

This entire document is just a bit of ceremony around writing preprocessors; the Haskell code in this file can be summarized in this shell script:

```bash
#!/usr/bin/env bash

# this does the same thing as typst-unlit.lhs, but depends on typst and jq
# this script does clobber the line numbers, so users beware

typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
```

This document mentions the word "Haskell" 60 times.