# JSON Pointer Tutorial

This tutorial introduces JSON Pointer as defined in
[RFC 6901](https://www.rfc-editor.org/rfc/rfc6901), and demonstrates
the `jsont-pointer` OCaml library through interactive examples.

## What is JSON Pointer?

From RFC 6901, Section 1:

> JSON Pointer defines a string syntax for identifying a specific value
> within a JavaScript Object Notation (JSON) document.

In other words, JSON Pointer is an addressing scheme for locating values
inside a JSON structure. Think of it like a filesystem path, but for JSON
documents instead of files.

For example, given this JSON document:

```json
{
  "users": [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25}
  ]
}
```

The JSON Pointer `/users/0/name` refers to the string `"Alice"`.

## Syntax: Reference Tokens

RFC 6901, Section 3 defines the syntax:

> A JSON Pointer is a Unicode string containing a sequence of zero or more
> reference tokens, each prefixed by a '/' (%x2F) character.

The grammar is elegantly simple:

```
json-pointer    = *( "/" reference-token )
reference-token = *( unescaped / escaped )
```

This means:
- The empty string `""` is a valid pointer (it refers to the whole document)
- Every non-empty pointer starts with `/`
- Everything between `/` characters is a "reference token"

Let's see this in action. We can parse pointers and see their structure:

```sh
$ jsonpp parse ""
OK: []
```

The empty pointer has no reference tokens - it points to the root.

```sh
$ jsonpp parse "/foo"
OK: [Mem:foo]
```

The pointer `/foo` has one token: `foo`. Since it's not a number, it's
interpreted as an object member name (`Mem`).

```sh
$ jsonpp parse "/foo/0"
OK: [Mem:foo, Nth:0]
```

Here we have two tokens: `foo` (a member name) and `0` (interpreted as
an array index `Nth`).

```sh
$ jsonpp parse "/foo/bar/baz"
OK: [Mem:foo, Mem:bar, Mem:baz]
```

Multiple tokens navigate deeper into nested structures.

### Invalid Syntax

What happens if a pointer doesn't start with `/`?

```sh
$ jsonpp parse "foo"
ERROR: Invalid JSON Pointer: must be empty or start with '/': foo
```

The RFC is strict: non-empty pointers MUST start with `/`.
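
The splitting rule is small enough to sketch by hand. The Python below only illustrates the grammar; it is not how `jsont-pointer` is implemented, and `split_pointer` is a hypothetical name. The tokens it returns are still in their escaped form.

```python
from typing import List


def split_pointer(pointer: str) -> List[str]:
    """Split a JSON Pointer into its raw (still escaped) reference tokens."""
    if pointer == "":
        return []  # the empty pointer addresses the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"must be empty or start with '/': {pointer}")
    # Drop the leading '/'; every remaining '/' separates two tokens.
    return pointer[1:].split("/")


print(split_pointer("/foo/0"))
print(split_pointer("/"))
```

Note that `/` yields one token (the empty string), while `""` yields none.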

## Escaping Special Characters

RFC 6901, Section 3 explains the escaping rules:

> Because the characters '~' (%x7E) and '/' (%x2F) have special meanings
> in JSON Pointer, '~' needs to be encoded as '~0' and '/' needs to be
> encoded as '~1' when these characters appear in a reference token.

Why these specific characters?
- `/` separates tokens, so it must be escaped inside a token
- `~` is the escape character itself, so it must also be escaped

The escape sequences are:
- `~0` represents `~` (tilde)
- `~1` represents `/` (forward slash)

Let's see escaping in action:

```sh
$ jsonpp escape "hello"
hello
```

No special characters, no escaping needed.

```sh
$ jsonpp escape "a/b"
a~1b
```

The `/` becomes `~1`.

```sh
$ jsonpp escape "a~b"
a~0b
```

The `~` becomes `~0`.

```sh
$ jsonpp escape "~/"
~0~1
```

Both characters are escaped.
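
As a rough sketch of the same transformation in Python (the function name `escape` is chosen here for illustration and is not the library's API), escaping is two string replacements. The order matters: `~` has to be rewritten first, otherwise the tilde introduced by `~1` would be escaped a second time.

```python
def escape(token: str) -> str:
    """Escape one reference token for use inside a JSON Pointer."""
    # '~' first, then '/'. Reversing the order would turn '/' into '~1'
    # and then into '~01', which decodes to the wrong string.
    return token.replace("~", "~0").replace("/", "~1")


for token in ["hello", "a/b", "a~b", "~/"]:
    print(token, "->", escape(token))
```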

### Unescaping

And the reverse process:

```sh
$ jsonpp unescape "a~1b"
OK: a/b
```

```sh
$ jsonpp unescape "a~0b"
OK: a~b
```

### The Order Matters!

RFC 6901, Section 4 is careful to specify the unescaping order:

> Evaluation of each reference token begins by decoding any escaped
> character sequence. This is performed by first transforming any
> occurrence of the sequence '~1' to '/', and then transforming any
> occurrence of the sequence '~0' to '~'. By performing the substitutions
> in this order, an implementation avoids the error of turning '~01' first
> into '~1' and then into '/', which would be incorrect (the string '~01'
> correctly becomes '~1' after transformation).

Let's verify this tricky case:

```sh
$ jsonpp unescape "~01"
OK: ~1
```

If we unescaped `~0` first, `~01` would become `~1`, which would then become
`/`. But that's wrong! The sequence `~01` should become the literal string
`~1` (a tilde followed by the digit one).

Invalid escape sequences are rejected:

```sh
$ jsonpp unescape "~2"
ERROR: Invalid JSON Pointer: invalid escape sequence ~2
```

```sh
$ jsonpp unescape "hello~"
ERROR: Invalid JSON Pointer: incomplete escape sequence at end
```
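
The decoding order and the two rejection cases can be sketched in a few lines of Python. This illustrates the RFC's rule under the assumption that any `~` not followed by `0` or `1` is an error; the name `unescape` and the error wording are illustrative, not taken from `jsont-pointer`.

```python
import re

# A '~' that is not followed by '0' or '1' (including a trailing '~').
_BAD_ESCAPE = re.compile(r"~(?![01])")


def unescape(token: str) -> str:
    """Decode one reference token, applying '~1' before '~0'."""
    if _BAD_ESCAPE.search(token):
        raise ValueError(f"invalid or incomplete escape sequence in {token!r}")
    # '~1' first, then '~0', exactly as RFC 6901 Section 4 prescribes.
    return token.replace("~1", "/").replace("~0", "~")


print(unescape("a~1b"))
print(unescape("~01"))
```

Swapping the two `replace` calls reproduces the bug the RFC warns about: `~01` would come out as `/`.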

## Evaluation: Navigating JSON

Now we come to the heart of JSON Pointer: evaluation. RFC 6901, Section 4
describes how a pointer is resolved against a JSON document:

> Evaluation of a JSON Pointer begins with a reference to the root value
> of a JSON document and completes with a reference to some value within
> the document. Each reference token in the JSON Pointer is evaluated
> sequentially.

Let's use the example JSON document from RFC 6901, Section 5:

```sh
$ cat rfc6901_example.json
{
  "foo": ["bar", "baz"],
  "": 0,
  "a/b": 1,
  "c%d": 2,
  "e^f": 3,
  "g|h": 4,
  "i\\j": 5,
  "k\"l": 6,
  " ": 7,
  "m~n": 8
}
```

This document is carefully constructed to exercise various edge cases!

### The Root Pointer

```sh
$ jsonpp eval rfc6901_example.json ""
OK: {"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}
```

The empty pointer returns the whole document.

### Object Member Access

```sh
$ jsonpp eval rfc6901_example.json "/foo"
OK: ["bar","baz"]
```

`/foo` accesses the member named `foo`, which is an array.

### Array Index Access

```sh
$ jsonpp eval rfc6901_example.json "/foo/0"
OK: "bar"
```

`/foo/0` first goes to `foo`, then accesses index 0 of the array.

```sh
$ jsonpp eval rfc6901_example.json "/foo/1"
OK: "baz"
```

Index 1 gives us the second element.

### Empty String as Key

JSON allows empty strings as object keys:

```sh
$ jsonpp eval rfc6901_example.json "/"
OK: 0
```

The pointer `/` has one token: the empty string. This accesses the member
with an empty name.

### Keys with Special Characters

Now for the escape sequences:

```sh
$ jsonpp eval rfc6901_example.json "/a~1b"
OK: 1
```

The token `a~1b` unescapes to `a/b`, which is the key name.

```sh
$ jsonpp eval rfc6901_example.json "/m~0n"
OK: 8
```

The token `m~0n` unescapes to `m~n`.

### Other Special Characters (No Escaping Needed)

Most characters don't need escaping in JSON Pointer strings:

```sh
$ jsonpp eval rfc6901_example.json "/c%d"
OK: 2
```

```sh
$ jsonpp eval rfc6901_example.json "/e^f"
OK: 3
```

```sh
$ jsonpp eval rfc6901_example.json "/g|h"
OK: 4
```

```sh
$ jsonpp eval rfc6901_example.json "/ "
OK: 7
```

Even a space is a valid key character!

### Error Conditions

What happens when we try to access something that doesn't exist?

```sh
$ jsonpp eval rfc6901_example.json "/nonexistent"
ERROR: JSON Pointer: member 'nonexistent' not found
File "-":
```

Or an out-of-bounds array index:

```sh
$ jsonpp eval rfc6901_example.json "/foo/99"
ERROR: JSON Pointer: index 99 out of bounds (array has 2 elements)
File "-":
```

Or try to index into a non-container:

```sh
$ jsonpp eval rfc6901_example.json "/foo/0/invalid"
ERROR: JSON Pointer: cannot index into string with 'invalid'
File "-":
```
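
The whole evaluation loop, including the three error cases above, fits in a short Python sketch. It operates on plain `dict`/`list` values, skips escape validation for brevity, and uses Python exception types chosen here for illustration; none of this reflects the `jsont-pointer` API or its error messages.

```python
import re

# Array indices: "0", or digits with no leading zero.
_INDEX = re.compile(r"0|[1-9][0-9]*")


def evaluate(doc, pointer):
    """Resolve a JSON Pointer against a parsed JSON value."""
    if pointer and not pointer.startswith("/"):
        raise ValueError("pointer must be empty or start with '/'")
    value = doc
    # "".split("/")[1:] is empty, so the empty pointer returns the root.
    for raw in pointer.split("/")[1:]:
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(value, dict):
            if token not in value:
                raise KeyError(f"member {token!r} not found")
            value = value[token]
        elif isinstance(value, list):
            if not _INDEX.fullmatch(token):
                raise ValueError(f"{token!r} is not a valid array index")
            index = int(token)
            if index >= len(value):
                raise IndexError(f"index {index} out of bounds")
            value = value[index]
        else:
            kind = type(value).__name__
            raise TypeError(f"cannot index into {kind} with {token!r}")
    return value


doc = {"foo": ["bar", "baz"], "": 0, "a/b": 1, "c%d": 2, "e^f": 3,
       "g|h": 4, "i\\j": 5, "k\"l": 6, " ": 7, "m~n": 8}
print(evaluate(doc, "/foo/0"))
print(evaluate(doc, "/a~1b"))
```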

### Array Index Rules

RFC 6901 has specific rules for array indices. Section 4 states:

> characters comprised of digits [...] that represent an unsigned base-10
> integer value, making the new referenced value the array element with
> the zero-based index identified by the token

And importantly:

> note that leading zeros are not allowed

```sh
$ jsonpp parse "/foo/0"
OK: [Mem:foo, Nth:0]
```

Zero itself is fine.

```sh
$ jsonpp parse "/foo/01"
OK: [Mem:foo, Mem:01]
```

But `01` has a leading zero, so it's NOT treated as an array index - it
becomes a member name instead. This protects against accidental octal
interpretation.
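
One way to express the `Nth` versus `Mem` decision shown in the `jsonpp parse` output is a single regular expression. The sketch below is an assumption about how such a classifier could look, with `classify` as a hypothetical name; the library's actual parsing logic is not shown in this tutorial.

```python
import re

# "0" alone, or a non-zero digit followed by any digits: no leading zeros.
_INDEX = re.compile(r"0|[1-9][0-9]*")


def classify(token: str):
    """Label a token as an array index ("Nth") or a member name ("Mem")."""
    if _INDEX.fullmatch(token):
        return ("Nth", int(token))
    return ("Mem", token)


for token in ["0", "10", "01", "foo"]:
    print(token, "->", classify(token))
```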

## The End-of-Array Marker: `-`

RFC 6901, Section 4 introduces a special token:

> exactly the single character "-", making the new referenced value the
> (nonexistent) member after the last array element.

This is primarily useful for JSON Patch operations (RFC 6902). Let's see
how it parses:

```sh
$ jsonpp parse "/foo/-"
OK: [Mem:foo, End]
```

The `-` is recognized as a special `End` index.

However, you cannot evaluate a pointer containing `-` because it refers
to a position that doesn't exist:

```sh
$ jsonpp eval rfc6901_example.json "/foo/-"
ERROR: JSON Pointer: '-' (end marker) refers to nonexistent array element
File "-":
```

The RFC explains this:

> Note that the use of the "-" character to index an array will always
> result in such an error condition because by definition it refers to
> a nonexistent array element.

But we'll see later that `-` is very useful for mutation operations!

## URI Fragment Encoding

JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:

> A JSON Pointer can be represented in a URI fragment identifier by
> encoding it into octets using UTF-8, while percent-encoding those
> characters not allowed by the fragment rule in RFC 3986.

This adds percent-encoding on top of the `~0`/`~1` escaping:

```sh
$ jsonpp uri-fragment "/foo"
OK: /foo -> /foo
```

Simple pointers often don't need percent-encoding.

```sh
$ jsonpp uri-fragment "/a~1b"
OK: /a~1b -> /a~1b
```

The `~1` escape stays as-is (it's valid in URI fragments).

```sh
$ jsonpp uri-fragment "/c%d"
OK: /c%d -> /c%25d
```

The `%` character must be percent-encoded as `%25` in URIs!

```sh
$ jsonpp uri-fragment "/ "
OK: / -> /%20
```

Spaces become `%20`.

Here's the RFC example showing the URI fragment forms:

| JSON Pointer | URI Fragment | Value |
|-------------|-------------|-------|
| `""` | `#` | whole document |
| `"/foo"` | `#/foo` | `["bar", "baz"]` |
| `"/foo/0"` | `#/foo/0` | `"bar"` |
| `"/"` | `#/` | `0` |
| `"/a~1b"` | `#/a~1b` | `1` |
| `"/c%d"` | `#/c%25d` | `2` |
| `"/ "` | `#/%20` | `7` |
| `"/m~0n"` | `#/m~0n` | `8` |
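
The table can be reproduced with Python's standard `urllib.parse.quote`, given a `safe` set that matches the characters RFC 3986 permits unencoded in a fragment. This is a sketch under that reading of the RFC; whether `jsonpp uri-fragment` leaves exactly the same characters unencoded is an assumption, and both function names are hypothetical.

```python
from urllib.parse import quote, unquote

# Fragment characters allowed by RFC 3986 beyond letters, digits, and
# "-._~" (which quote() never encodes): sub-delims plus ":", "@", "/", "?".
_FRAGMENT_SAFE = "/?:@!$&'()*+,;=~"


def to_uri_fragment(pointer: str) -> str:
    """Encode a JSON Pointer (already ~-escaped) as a URI fragment."""
    return "#" + quote(pointer, safe=_FRAGMENT_SAFE)


def from_uri_fragment(fragment: str) -> str:
    """Recover the JSON Pointer string from a URI fragment."""
    if not fragment.startswith("#"):
        raise ValueError("a URI fragment must start with '#'")
    return unquote(fragment[1:])


for pointer in ["", "/foo", "/a~1b", "/c%d", "/ ", "/m~0n"]:
    print(repr(pointer), "->", to_uri_fragment(pointer))
```

Percent-decoding happens before the `~0`/`~1` unescaping, so the two layers never interfere with each other.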

## Mutation Operations

While RFC 6901 defines JSON Pointer for read-only access, RFC 6902
(JSON Patch) uses JSON Pointer for modifications. The `jsont-pointer`
library provides these operations.

### Add

The `add` operation inserts a value at a location:

```sh
$ jsonpp add '{"foo":"bar"}' '/baz' '"qux"'
{"foo":"bar","baz":"qux"}
```

For arrays, `add` inserts BEFORE the specified index:

```sh
$ jsonpp add '{"foo":["a","b"]}' '/foo/1' '"X"'
{"foo":["a","X","b"]}
```

This is where the `-` marker shines - it appends to the end:

```sh
$ jsonpp add '{"foo":["a","b"]}' '/foo/-' '"c"'
{"foo":["a","b","c"]}
```
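
To make the three behaviours concrete, here is a minimal Python sketch of `add` following the RFC 6902 semantics. It works on a deep copy of plain `dict`/`list` values, and every name in it is illustrative rather than part of `jsont-pointer`.

```python
import copy


def _decode(pointer):
    """Split a pointer into unescaped reference tokens."""
    if pointer and not pointer.startswith("/"):
        raise ValueError("pointer must be empty or start with '/'")
    return [t.replace("~1", "/").replace("~0", "~")
            for t in pointer.split("/")[1:]]


def _index(token):
    """Parse an array index: ASCII digits only, no leading zeros."""
    if not (token.isascii() and token.isdigit()) or (len(token) > 1 and token[0] == "0"):
        raise ValueError(f"{token!r} is not a valid array index")
    return int(token)


def _child(value, token):
    """Step from a container to the child named by one token."""
    if isinstance(value, dict):
        return value[token]
    if isinstance(value, list):
        return value[_index(token)]
    raise TypeError(f"cannot index into {type(value).__name__}")


def add(doc, pointer, new_value):
    """Return a copy of doc with new_value added at pointer."""
    tokens = _decode(pointer)
    if not tokens:
        return new_value  # adding at the root replaces the whole document
    result = copy.deepcopy(doc)
    parent = result
    for token in tokens[:-1]:
        parent = _child(parent, token)
    last = tokens[-1]
    if isinstance(parent, dict):
        parent[last] = new_value  # creates the member, or overwrites it
    elif isinstance(parent, list):
        index = len(parent) if last == "-" else _index(last)
        if index > len(parent):
            raise IndexError(f"index {index} out of bounds")
        parent.insert(index, new_value)  # later elements shift right
    else:
        raise TypeError(f"cannot add into {type(parent).__name__}")
    return result


print(add({"foo": ["a", "b"]}, "/foo/-", "c"))
```

Treating `-` as "index equal to the array length" means appending needs no special case beyond that one line.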

### Remove

The `remove` operation deletes a value:

```sh
$ jsonpp remove '{"foo":"bar","baz":"qux"}' '/baz'
{"foo":"bar"}
```

For arrays, it removes and shifts:

```sh
$ jsonpp remove '{"foo":["a","b","c"]}' '/foo/1'
{"foo":["a","c"]}
```

### Replace

The `replace` operation updates an existing value:

```sh
$ jsonpp replace '{"foo":"bar"}' '/foo' '"baz"'
{"foo":"baz"}
```

Unlike `add`, `replace` requires the target to already exist:

```sh
$ jsonpp replace '{"foo":"bar"}' '/nonexistent' '"value"'
ERROR: JSON Pointer: member 'nonexistent' not found
File "-":
```
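
Both `remove` and `replace` share one requirement: the target must already exist. A Python sketch can capture this with a single lookup helper that fails on missing members or out-of-range indices. The helper names are hypothetical, the root pointer is deliberately left unsupported, and this is not the library's code.

```python
import copy


def _key(container, token):
    """Turn a token into a dict key or list index, insisting it exists."""
    if isinstance(container, dict):
        if token not in container:
            raise KeyError(f"member {token!r} not found")
        return token
    if isinstance(container, list):
        if not (token.isascii() and token.isdigit()) or (len(token) > 1 and token[0] == "0"):
            raise ValueError(f"{token!r} is not a valid array index")
        index = int(token)
        if index >= len(container):
            raise IndexError(f"index {index} out of bounds")
        return index
    raise TypeError(f"cannot index into {type(container).__name__}")


def _locate(doc, pointer):
    """Return (parent container, key or index) of an existing location."""
    if not pointer.startswith("/"):
        raise ValueError("this sketch needs a non-empty pointer starting with '/'")
    tokens = [t.replace("~1", "/").replace("~0", "~")
              for t in pointer.split("/")[1:]]
    parent = doc
    for token in tokens[:-1]:
        parent = parent[_key(parent, token)]
    return parent, _key(parent, tokens[-1])


def remove(doc, pointer):
    """Return a copy of doc without the value at pointer."""
    result = copy.deepcopy(doc)
    parent, key = _locate(result, pointer)
    del parent[key]  # for lists, later elements shift left
    return result


def replace(doc, pointer, new_value):
    """Return a copy of doc with the existing value at pointer swapped."""
    result = copy.deepcopy(doc)
    parent, key = _locate(result, pointer)
    parent[key] = new_value
    return result


print(remove({"foo": ["a", "b", "c"]}, "/foo/1"))
print(replace({"foo": "bar"}, "/foo", "baz"))
```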

### Move

The `move` operation relocates a value:

```sh
$ jsonpp move '{"foo":{"bar":"baz"},"qux":{}}' '/foo/bar' '/qux/thud'
{"foo":{},"qux":{"thud":"baz"}}
```

### Copy

The `copy` operation duplicates a value:

```sh
$ jsonpp copy '{"foo":{"bar":"baz"}}' '/foo/bar' '/foo/qux'
{"foo":{"bar":"baz","qux":"baz"}}
```

### Test

The `test` operation verifies a value (useful in JSON Patch):

```sh
$ jsonpp test '{"foo":"bar"}' '/foo' '"bar"'
true
```

```sh
$ jsonpp test '{"foo":"bar"}' '/foo' '"baz"'
false
```

## Deeply Nested Structures

JSON Pointer handles arbitrarily deep nesting:

```sh
$ jsonpp eval rfc6901_example.json "/foo/0"
OK: "bar"
```

For deeper structures, just add more path segments. With nested objects:

```sh
$ jsonpp add '{"a":{"b":{"c":"d"}}}' '/a/b/x' '"y"'
{"a":{"b":{"c":"d","x":"y"}}}
```

With nested arrays:

```sh
$ jsonpp add '{"arr":[[1,2],[3,4]]}' '/arr/0/1' '99'
{"arr":[[1,99,2],[3,4]]}
```

## Summary

JSON Pointer (RFC 6901) provides a simple but powerful way to address
values within JSON documents:

1. **Syntax**: Pointers are strings of `/`-separated reference tokens
2. **Escaping**: Use `~0` for `~` and `~1` for `/` in tokens
3. **Evaluation**: Tokens navigate through objects (by key) and arrays (by index)
4. **URI Encoding**: Pointers can be percent-encoded for use in URIs
5. **Mutations**: Combined with JSON Patch (RFC 6902), pointers enable structured updates

The `jsont-pointer` library implements all of this with type-safe OCaml
interfaces, integration with the `jsont` codec system, and proper error
handling for malformed pointers and missing values.