write literate haskell programs in typst cdn.oppi.li/typst-unlit.pdf
haskell typst

init: typst-unlit

Signed-off-by: oppiliappan <me@oppi.li>

oppi.li d8aa995e

+633
+3
.gitignore
.direnv
.envrc
out
+7
bootstrap.sh
#!/usr/bin/env bash

# this does the same thing as typst-unlit.lhs, but depends on typst and jq
# this script does clobber the line numbers, so users beware

typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
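The "clobbers the line numbers" caveat can be illustrated with a small Haskell model of what the two pipelines emit. This is only a sketch under assumptions: `Block`, `bootstrapLike` and `demoBlocks` are hypothetical names, and the (language, text) pairs stand in for the raw elements that `typst query` reports; nothing here calls typst or jq.

```haskell
-- Hypothetical model of bootstrap.sh, not part of the commit.
-- A block is a (language, text) pair, listed in document order.
type Block = (String, String)

-- Emit every haskell-top block, then every haskell block, back to back.
-- No blank lines are kept for prose or for blocks in other languages.
bootstrapLike :: [Block] -> String
bootstrapLike blocks = unlines (textsOf "haskell-top" ++ textsOf "haskell")
  where
    textsOf lang = [text | (l, text) <- blocks, l == lang]

-- Made-up document: the import block appears after the code that needs it.
demoBlocks :: [Block]
demoBlocks =
  [ ("haskell", "main = print (sort xs)")
  , ("bash", "ghc -o demo demo.hs")
  , ("haskell-top", "import Data.List (sort)")
  , ("haskell", "xs = [3, 1, 2]")
  ]

main :: IO ()
main = putStr (bootstrapLike demoBlocks)
```

The result is three consecutive lines (the import, `main`, then `xs`) no matter how much prose separated the blocks in the document, which is why positions in GHC's error messages stop lining up with the source file.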
+16
examples/quicksort.lhs
= Quicksort in Haskell
The first thing to know about Haskell's syntax is that parentheses
are used for grouping, and not for function application.

```haskell
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
  where
    lesser = filter (< p) xs
    greater = filter (>= p) xs
```

The parentheses indicate the grouping of operands on the
right-hand side of equations.
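As a small illustration of the grouping point, here is a variant of the same function with the optional parentheses removed. It is a sketch rather than a replacement for the example: function application binds tighter than any infix operator, so `quicksort lesser ++ [p]` already groups as `(quicksort lesser) ++ [p]`. The `main` wrapper and the `Int` annotation are additions for the demo.

```haskell
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
-- No parentheses around the recursive calls: application binds tighter than (++).
quicksort (p : xs) = quicksort lesser ++ [p] ++ quicksort greater
  where
    lesser = filter (< p) xs
    greater = filter (>= p) xs

main :: IO ()
main = print (quicksort [3, 2, 4, 1, 5, 4 :: Int])
```

The pattern `(p : xs)` and the operator sections `(< p)` and `(>= p)` still need their parentheses, since there they really do group.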
+48
flake.lock
{
  "nodes": {
    "gitignore": {
      "inputs": {
        "nixpkgs": [
          "nixpkgs"
        ]
      },
      "locked": {
        "lastModified": 1709087332,
        "narHash": "sha256-HG2cCnktfHsKV0s4XW83gU3F57gaTljL9KNSuG6bnQs=",
        "owner": "hercules-ci",
        "repo": "gitignore.nix",
        "rev": "637db329424fd7e46cf4185293b9cc8c88c95394",
        "type": "github"
      },
      "original": {
        "owner": "hercules-ci",
        "repo": "gitignore.nix",
        "type": "github"
      }
    },
    "nixpkgs": {
      "locked": {
        "lastModified": 1762286042,
        "narHash": "sha256-OD5HsZ+sN7VvNucbrjiCz7CHF5zf9gP51YVJvPwYIH8=",
        "owner": "nixos",
        "repo": "nixpkgs",
        "rev": "12c1f0253aa9a54fdf8ec8aecaafada64a111e24",
        "type": "github"
      },
      "original": {
        "owner": "nixos",
        "ref": "nixpkgs-unstable",
        "repo": "nixpkgs",
        "type": "github"
      }
    },
    "root": {
      "inputs": {
        "gitignore": "gitignore",
        "nixpkgs": "nixpkgs"
      }
    }
  },
  "root": "root",
  "version": 7
}
+38
flake.nix
{
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixpkgs-unstable";
    gitignore = {
      url = "github:hercules-ci/gitignore.nix";
      inputs.nixpkgs.follows = "nixpkgs";
    };
  };

  outputs = {
    self,
    nixpkgs,
    gitignore,
  }: let
    inherit (gitignore.lib) gitignoreSource;
    supportedSystems = ["x86_64-linux" "aarch64-linux" "x86_64-darwin" "aarch64-darwin"];
    forAllSystems = nixpkgs.lib.genAttrs supportedSystems;
    nixpkgsFor =
      forAllSystems (system:
        import nixpkgs {
          config.allowUnfree = true;
          inherit system;
        });
  in {
    devShell = forAllSystems (system: let
      pkgs = nixpkgsFor.${system};
    in
      pkgs.mkShell {
        nativeBuildInputs = [
          pkgs.typst
          pkgs.ghc
          pkgs.pandoc
        ];
      });

    formatter = forAllSystems (system: nixpkgsFor.${system}.alejandra);
  };
}
+277
readme.md
# Typst-Unlit

*[tangled.org/@oppi.li/typst-unlit](https://tangled.org/@oppi.li/typst-unlit)*

*Serves: 1 Prep Time: 10min Compile Time: 10ms*

A literate program is one where comments are first-class citizens, and
code is explicitly demarcated, as opposed to a regular program, where
comments are explicitly marked, and code is a first-class entity.

GHC supports literate programming out of the box, by using a
preprocessor to extract code from documents. This preprocessor is known
as *unlit*[^1]. GHC also supports *custom* preprocessors, which can be
passed in via the `-pgmL` flag. This very document you are reading is
one such preprocessor that allows embedding Haskell code inside typst
files[^2].

This recipe not only gives you a fish (the typst-unlit preprocessor),
but also teaches you how to fish (write your own preprocessors).

## Ingredients

<table>
<colgroup>
<col style="width: 50%" />
<col style="width: 50%" />
</colgroup>
<tbody>
<tr>
<td><p>To write your own preprocessor:</p>
<ul>
<li>GHC: the Glorious Haskell Compiler</li>
<li>Typst: to generate PDFs</li>
<li>And that’s it! No stacking, shaking or caballing here.</li>
</ul></td>
<td><p>To compile this very document:</p>
<ul>
<li>The bootstrap program</li>
<li>GHC: to produce an executable program</li>
<li>Typst: to produce a readable PDF</li>
</ul></td>
</tr>
</tbody>
</table>

**Pro Tip:** If you’re missing any ingredients, your local nixpkgs
should stock them!

## Instructions

The idea behind the unlit program is super simple: iterate over the
lines in the supplied input file and replace lines that aren’t Haskell
with an empty line! To detect lines that are Haskell, we look for the
```` ```haskell ```` directive and stop at the end of the code fence.
Simple enough! Annoyingly, Haskell requires that imports be declared at
the top of the file. This results in literate haskell programs always
starting with a giant block of imports:

> -- So first we need to get some boilerplate and imports out of the way.

-- Every literate programmer

Oh gee, if only we had a tool to put the important stuff first. Our
preprocessor will remedy this wart, with the `haskell-top` directive to
move blocks to the top. With that out of the way, let’s move on to the
program itself!

### Step 1: The main course

I prefer starting with `main` but you do you. Any program that is passed
to `ghc -pgmL` has to accept exactly 4 arguments:

- `-h`: ignore this for now
- `<label>`: ignore this for now
- `<infile>`: the input lhaskell source code
- `<outfile>`: the output haskell source code

Invoke the runes to handle CLI arguments:

```
main = do
  args <- getArgs
  case args of
    ["-h", _label, infile, outfile] -> process infile outfile
    _ -> die "Usage: typst-unlit -h <label> <source> <destination>"
```

You will need these imports accordingly (notice how I am writing my
imports *after* the main function!):

```
import System.Environment (getArgs)
import System.Exit (die)
```

Now, we move on to defining `process`:

### Step 2: The processor

`process` does a bit of IO to read from the input file, remove comments,
and write to the output file; `removeComments`, however, is a pure
function:

```
process :: FilePath -> FilePath -> IO ()
process infile outfile = do
  ls <- lines <$> readFile infile
  writeFile outfile $ unlines $ removeComments ls
```

### Step 3: Removing comments

We will be iterating over lines in the file, and wiping clean those
lines that are not Haskell. To do so, we must track some state as we
will be jumping in and out of code fences:

```
data State
  = OutsideCode
  | InHaskell
  | InHaskellTop
  deriving (Eq, Show)
```

To detect the code fences themselves, we can define a few matcher
functions; here is one for the ```` ```haskell ```` pattern:

```
withTag :: (String -> Bool) -> String -> Bool
withTag pred line = length ticks > 2 && pred tag
  where (ticks, tag) = span (== '`') line

isHaskell :: String -> Bool
isHaskell = withTag (== "haskell")
```

You will notice that this will also match ````` ````haskell `````, and
this is intentional. If your text already contains 3 backticks inside
it, you will need 4 backticks in the code fence and so on.

We do the same exercise for `haskell-top`:

```
isHaskellTop = withTag (== "haskell-top")
```

And for the closing code fences:

```
isCodeEnd = withTag null
```

`removeComments` itself is just a filter that takes a list of lines
and removes comments from those lines:

```
removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []
```

Finally, `go` is a recursive function that starts with some `State`, a
list of input lines, and two more empty lists that are used to store the
lines of code that go at the top (using the `haskell-top` directive),
and the ones that go below, using the `haskell` directive:

```
go :: State -> [String] -> [String] -> [String] -> [String]
```

When the input lines run out, we just combine the `top` and `bot`
stacks of lines to form the file:

```
go _ [] top bot = reverse top ++ reverse bot
```

Next, whenever we are `OutsideCode` and the current line contains a
directive, we must update the state to enter a code block:

```
go OutsideCode (x : rest) top bot
  | isHaskellTop x = go InHaskellTop rest top ("" : bot)
  | isHaskell x = go InHaskell rest top ("" : bot)
  | otherwise = go OutsideCode rest top ("" : bot)
```

When we are already inside a Haskell code block, encountering a
triple-tick should exit the code block, and any other line encountered
in the block is to be included in the final file, but below the imports:

```
go InHaskell (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskell rest top (x : bot)
```

And similarly, for blocks that start with the `haskell-top` directive,
lines encountered here go into the `top` stack:

```
go InHaskellTop (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskellTop rest (x : top) bot
```

And that’s it! Gently tap the baking pan against the table and let your
code settle. Once it is set, you can compile the preprocessor like so:

```
ghc -o typst-unlit typst-unlit.hs
```

And now, we can execute our preprocessor on literate haskell files!

## Serving

To test our preprocessor, first, write a literate haskell file
containing your typst code:

````
= Quicksort in Haskell
The first thing to know about Haskell's syntax is that parentheses
are used for grouping, and not for function application.

```haskell
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
  where
    lesser = filter (< p) xs
    greater = filter (>= p) xs
```

The parentheses indicate the grouping of operands on the
right-hand side of equations.
````

Remember to save that as a `.lhs` file, say `quicksort.lhs`. Now you can
run it through both `ghci` …

```
ghci -pgmL ./typst-unlit quicksort.lhs
GHCi, version 9.10.3: https://www.haskell.org/ghc/ :? for help
[1 of 2] Compiling Main ( quicksort.lhs, interpreted )
Ok, one module loaded.
ghci> quicksort [3,2,4,1,5,4]
[1,2,3,4,4,5]
```

… and `typst`:

```
typst compile quicksort.lhs
```

And there you have it! One file that can be interpreted by `ghc` and
rendered beautifully with `typst` simultaneously.

#### Notes

This entire document is just a bit of ceremony around writing
preprocessors; the Haskell code in this file can be summarized in this
shell script:

```
#!/usr/bin/env bash

# this does the same thing as typst-unlit.lhs, but depends on `typst` and `jq`

typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
```

[^1]: <https://gitlab.haskell.org/ghc/ghc/-/tree/master/utils/unlit>

[^2]: This document needs itself to compile itself! This is why a
    bootstrap program is included.
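
To see what the recipe produces, the sketch below condenses the state machine from Step 3 and runs it over a small in-memory document. It is an illustration under assumptions, not the file shipped in this commit: `sample` is a made-up input, the fence matchers are inlined into the guards, and the predicate argument is named `p` rather than `pred`.

```haskell
-- Condensed copy of the readme's state machine, for demonstration only.
data State = OutsideCode | InHaskell | InHaskellTop

-- True when the line starts with three or more backticks and the
-- remainder of the line satisfies the predicate.
withTag :: (String -> Bool) -> String -> Bool
withTag p line = length ticks > 2 && p tag
  where (ticks, tag) = span (== '`') line

removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []

go :: State -> [String] -> [String] -> [String] -> [String]
go _ [] top bot = reverse top ++ reverse bot
go OutsideCode (x : rest) top bot
  | withTag (== "haskell-top") x = go InHaskellTop rest top ("" : bot)
  | withTag (== "haskell") x = go InHaskell rest top ("" : bot)
  | otherwise = go OutsideCode rest top ("" : bot)
go InHaskell (x : rest) top bot
  | withTag null x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskell rest top (x : bot)
go InHaskellTop (x : rest) top bot
  | withTag null x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskellTop rest (x : top) bot

-- Hypothetical ten-line document: prose, a code block, an import block
-- written after the code that needs it, and a second code block.
sample :: [String]
sample =
  [ "= Title"
  , "```haskell"
  , "main = print (sort xs)"
  , "```"
  , "```haskell-top"
  , "import Data.List (sort)"
  , "```"
  , "```haskell"
  , "xs = [3, 1, 2]"
  , "```"
  ]

main :: IO ()
main = mapM_ putStrLn (removeComments sample)
```

Running it prints ten lines: the hoisted import on line 1, `main = print (sort xs)` on line 4, `xs = [3, 1, 2]` on line 9, and empty lines everywhere else. Prose and fence lines are blanked, while the `haskell-top` line is moved rather than blanked, so in this sample the code above that block (`main`) lands one line lower than in the source and the code below it (`xs`) keeps its line number.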
+244
typst-unlit.lhs
#set document(title: [Typst-Unlit])
#set par(justify: true)
#show raw.where(lang: "haskell"): set align(center)
#show raw.where(lang: "haskell-top"): set align(center)
#show title: set align(center)
#show <subtitle>: set align(center)

#title()

Write literate Haskell programs in Typst <subtitle>

_#link("https://tangled.org/@oppi.li/typst-unlit")[tangled.org/\@oppi.li/typst-unlit]_ <subtitle>

*Serves: 1 #h(20pt) Prep Time: 10min #h(20pt) Compile Time: 10ms* <subtitle>

A literate program is one where comments are first-class citizens, and code is explicitly demarcated, as opposed to a regular program, where comments are explicitly marked, and code is a first-class entity.

GHC supports literate programming out of the box, by using a preprocessor to extract code from documents. This preprocessor is known as _unlit_ #footnote[https://gitlab.haskell.org/ghc/ghc/-/tree/master/utils/unlit]. GHC also supports _custom_ preprocessors, which can be passed in via the `-pgmL` flag. This very document you are reading is one such preprocessor that allows embedding Haskell code inside typst files #footnote[This document needs itself to compile itself! This is why a bootstrap program is included.].

This recipe not only gives you a fish (the typst-unlit preprocessor), but also teaches you how to fish (write your own preprocessors).

= Ingredients

#table(
  columns: (1fr, 1fr),
  gutter: 3pt,
  stroke: none,
  table.cell(inset: 10pt)[
    To write your own preprocessor:
    - GHC: the Glorious Haskell Compiler
    - Typst: to generate PDFs
    - And that's it! No stacking, shaking or caballing here.
  ],
  table.cell(inset: 10pt)[
    To compile this very document:
    - The bootstrap program
    - GHC: to produce an executable program
    - Typst: to produce a readable PDF
  ],
)

*Pro Tip:* If you're missing any ingredients, your local nixpkgs should stock them!

= Instructions

The idea behind the unlit program is super simple: iterate over the lines in the supplied input file and replace lines that aren't Haskell with an empty line! To detect lines that are Haskell, we look for the #raw("\u{0060}\u{0060}\u{0060}haskell") directive and stop at the end of the code fence. Simple enough! Annoyingly, Haskell requires that imports be declared at the top of the file. This results in literate Haskell programs always starting with a giant block of imports:

#set quote(block: true)
#quote(attribution: [Every literate programmer])[
  ```
  -- So first we need to get some boilerplate and imports out of the way.
  ```
]

Oh gee, if only we had a tool to put the important stuff first. Our preprocessor will remedy this wart, with the `haskell-top` directive to move blocks to the top. With that out of the way, let's move on to the program itself!

#pagebreak()

== Step 1: The main course

I prefer starting with `main` but you do you. Any program that is passed to `ghc -pgmL` has to accept exactly 4 arguments:

- `-h`: ignore this for now
- `<label>`: ignore this for now
- `<infile>`: the input lhaskell source code
- `<outfile>`: the output Haskell source code

Invoke the runes to handle CLI arguments:

```haskell
main = do
  args <- getArgs
  case args of
    ["-h", _label, infile, outfile] -> process infile outfile
    _ -> die "Usage: typst-unlit -h <label> <source> <destination>"
```

You will need these imports accordingly (notice how I am writing my imports _after_ the main function!):

```haskell-top
import System.Environment (getArgs)
import System.Exit (die)
```

Now, we move on to defining `process`:

== Step 2: The processor

`process` does a bit of IO to read from the input file, remove comments, and write to the output file; `removeComments`, however, is a pure function:

```haskell
process :: FilePath -> FilePath -> IO ()
process infile outfile = do
  ls <- lines <$> readFile infile
  writeFile outfile $ unlines $ removeComments ls
```

== Step 3: Removing comments

We will be iterating over lines in the file, and wiping clean those lines that are not Haskell. To do so, we must track some state as we will be jumping in and out of code fences:

```haskell
data State
  = OutsideCode
  | InHaskell
  | InHaskellTop
  deriving (Eq, Show)
```

To detect the code fences themselves, we can define a few matcher functions; here is one for the #raw("\u{0060}\u{0060}\u{0060}haskell") pattern:

```haskell
withTag :: (String -> Bool) -> String -> Bool
withTag pred line = length ticks > 2 && pred tag
  where (ticks, tag) = span (== '`') line

isHaskell :: String -> Bool
isHaskell = withTag (== "haskell")
```

You will notice that this will also match #raw("\u{0060}\u{0060}\u{0060}\u{0060}haskell"), and this is intentional. If your text already contains 3 backticks inside it, you will need 4 backticks in the code fence and so on.

We do the same exercise for `haskell-top`:

```haskell
isHaskellTop = withTag (== "haskell-top")
```

And for the closing code fences:

```haskell
isCodeEnd = withTag null
```

`removeComments` itself is just a filter that takes a list of lines and removes comments from those lines:

```haskell
removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []
```

Finally, `go` is a recursive function that starts with some `State`, a list of input lines, and two more empty lists that are used to store the lines of code that go at the top (using the `haskell-top` directive), and the ones that go below, using the `haskell` directive:

```haskell
go :: State -> [String] -> [String] -> [String] -> [String]
```

When the input lines run out, we just combine the `top` and `bot` stacks of lines to form the file:

```haskell
go _ [] top bot = reverse top ++ reverse bot
```

Next, whenever we are `OutsideCode` and the current line contains a directive, we must update the state to enter a code block:

```haskell
go OutsideCode (x : rest) top bot
  | isHaskellTop x = go InHaskellTop rest top ("" : bot)
  | isHaskell x = go InHaskell rest top ("" : bot)
  | otherwise = go OutsideCode rest top ("" : bot)
```

When we are already inside a Haskell code block, encountering a triple-tick should exit the code block, and any other line encountered in the block is to be included in the final file, but below the imports:

```haskell
go InHaskell (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskell rest top (x : bot)
```

And similarly, for blocks that start with the `haskell-top` directive, lines encountered here go into the `top` stack:

```haskell
go InHaskellTop (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskellTop rest (x : top) bot
```

And that's it! Gently tap the baking pan against the table and let your code settle. Once it is set, you can compile the preprocessor like so:

```bash
ghc -o typst-unlit typst-unlit.hs
```

And now, we can execute our preprocessor on literate Haskell files!

#pagebreak()

= Serving

To test our preprocessor, first, write a literate Haskell file containing your typst code:

````typst
= Quicksort in Haskell
The first thing to know about Haskell's syntax is that parentheses
are used for grouping, and not for function application.

```haskell
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
  where
    lesser = filter (< p) xs
    greater = filter (>= p) xs
```

The parentheses indicate the grouping of operands on the
right-hand side of equations.
````

Remember to save that as a `.lhs` file, say `quicksort.lhs`. Now you can run it through both `ghci` ...

```bash
ghci -pgmL ./typst-unlit quicksort.lhs
GHCi, version 9.10.3: https://www.haskell.org/ghc/ :? for help
[1 of 2] Compiling Main ( quicksort.lhs, interpreted )
Ok, one module loaded.
ghci> quicksort [3,2,4,1,5,4]
[1,2,3,4,4,5]
```

... and `typst`:

```bash
typst compile quicksort.lhs
```

And there you have it! One file that can be interpreted by `ghc` and rendered beautifully with `typst` simultaneously.

=== Notes

This entire document is just a bit of ceremony around writing preprocessors; the Haskell code in this file can be summarized in this shell script:

```bash
#!/usr/bin/env bash

# this does the same thing as typst-unlit.lhs, but depends on typst and jq
# this script does clobber the line numbers, so users beware

typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
```

This document mentions the word "Haskell" 60 times.
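
The fence matchers are easiest to understand by probing them with a few candidate lines. The following is a minimal sketch that copies the three matchers from Step 3 (with the predicate argument renamed to `p`) and adds two hypothetical helpers, `probes` and `classify`, that are not part of the program above.

```haskell
withTag :: (String -> Bool) -> String -> Bool
withTag p line = length ticks > 2 && p tag
  where (ticks, tag) = span (== '`') line

isHaskell, isHaskellTop, isCodeEnd :: String -> Bool
isHaskell = withTag (== "haskell")
isHaskellTop = withTag (== "haskell-top")
isCodeEnd = withTag null

-- Hypothetical probe lines, chosen to hit the edge cases.
probes :: [String]
probes =
  [ "```haskell"      -- plain opening fence
  , "````haskell"     -- longer fence, still accepted
  , "``haskell"       -- only two backticks
  , "```haskell "     -- trailing space after the tag
  , "  ```haskell"    -- indented fence
  , "```haskell-top"  -- the hoisting directive
  , "```"             -- closing fence
  , "```bash"         -- fence for another language
  ]

-- Hypothetical helper: which of the three matchers accept the line.
classify :: String -> (Bool, Bool, Bool)
classify l = (isHaskell l, isHaskellTop l, isCodeEnd l)

main :: IO ()
main = mapM_ (\l -> putStrLn (show l ++ " -> " ++ show (classify l))) probes
```

Only the first two probes satisfy `isHaskell`, the sixth satisfies `isHaskellTop`, and the bare fence satisfies `isCodeEnd`. Because `span` starts at the first character and the tag is compared exactly, an indented fence or one with a trailing space matches none of the three, and the state machine treats it as an ordinary line.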