RFC6901 JSON Pointer implementation in OCaml using jsont

odoc

+882 -882
+5
doc/dune
  (mdx
+  (files tutorial.mld)
   (libraries jsont jsont.bytesrw jsont_pointer jsont_pointer_top))
+
+ (documentation
+  (package jsont-pointer)
+  (mld_files index tutorial))
+8
doc/index.mld
+ {0 jsont-pointer}
+
+ {!modules: Jsont_pointer Jsont_pointer_top}
+
+ {1 Tutorial}
+
+ See the {!page-tutorial} for a comprehensive guide to using JSON Pointers
+ with this library.
-873
doc/tutorial.md
- # JSON Pointer Tutorial
-
- This tutorial introduces JSON Pointer as defined in
- [RFC 6901](https://www.rfc-editor.org/rfc/rfc6901), and demonstrates
- the `jsont-pointer` OCaml library through interactive examples.
-
- ## JSON Pointer vs JSON Path
-
- Before diving in, it's worth understanding the difference between JSON
- Pointer and JSON Path, as they serve different purposes:
-
- **JSON Pointer** (RFC 6901) is an *indicator syntax* that specifies a
- *single location* within JSON data. It always identifies at most one
- value.
-
- **JSON Path** is a *query syntax* that can *search* JSON data and return
- *multiple* values matching specified criteria.
-
- Use JSON Pointer when you need to address a single, specific location
- (like JSON Schema's `$ref`). Use JSON Path when you might need multiple
- results (like Kubernetes queries).
-
- The `jsont-pointer` library implements JSON Pointer and integrates with
- the `Jsont.Path` type for representing navigation indices.
-
- ## Setup
-
- First, let's set up our environment with helper functions:
-
- ```ocaml
- # open Jsont_pointer;;
- # #install_printer Jsont_pointer_top.nav_printer;;
- # #install_printer Jsont_pointer_top.append_printer;;
- # #install_printer Jsont_pointer_top.json_printer;;
- # #install_printer Jsont_pointer_top.error_printer;;
- # let parse_json s =
-   match Jsont_bytesrw.decode_string Jsont.json s with
-   | Ok json -> json
-   | Error e -> failwith e;;
- val parse_json : string -> Jsont.json = <fun>
- ```
-
- ## What is JSON Pointer?
-
- From RFC 6901, Section 1:
-
- > JSON Pointer defines a string syntax for identifying a specific value
- > within a JavaScript Object Notation (JSON) document.
-
- In other words, JSON Pointer is an addressing scheme for locating values
- inside a JSON structure. Think of it like a filesystem path, but for JSON
- documents instead of files.
-
- For example, given this JSON document:
-
- ```ocaml
- # let users_json = parse_json {|{
-     "users": [
-       {"name": "Alice", "age": 30},
-       {"name": "Bob", "age": 25}
-     ]
-   }|};;
- val users_json : Jsont.json =
-   {"users":[{"name":"Alice","age":30},{"name":"Bob","age":25}]}
- ```
-
- The JSON Pointer `/users/0/name` refers to the string `"Alice"`:
-
- ```ocaml
- # let ptr = of_string_nav "/users/0/name";;
- val ptr : nav t = [Mem "users"; Nth 0; Mem "name"]
- # get ptr users_json;;
- - : Jsont.json = "Alice"
- ```
-
- In OCaml, this is represented by the `'a Jsont_pointer.t` type - a sequence
- of navigation steps from the document root to a target value. The phantom
- type parameter `'a` encodes whether this is a navigation pointer or an
- append pointer (more on this later).
-
- ## Syntax: Reference Tokens
-
- RFC 6901, Section 3 defines the syntax:
-
- > A JSON Pointer is a Unicode string containing a sequence of zero or more
- > reference tokens, each prefixed by a '/' (%x2F) character.
-
- The grammar is elegantly simple:
-
- ```
- json-pointer = *( "/" reference-token )
- reference-token = *( unescaped / escaped )
- ```
-
- This means:
- - The empty string `""` is a valid pointer (it refers to the whole document)
- - Every non-empty pointer starts with `/`
- - Everything between `/` characters is a "reference token"
-
- Let's see this in action:
-
- ```ocaml
- # of_string_nav "";;
- - : nav t = []
- ```
-
- The empty pointer has no reference tokens - it points to the root.
-
- ```ocaml
- # of_string_nav "/foo";;
- - : nav t = [Mem "foo"]
- ```
-
- The pointer `/foo` has one token: `foo`. Since it's not a number, it's
- interpreted as an object member name (`Mem`).
-
- ```ocaml
- # of_string_nav "/foo/0";;
- - : nav t = [Mem "foo"; Nth 0]
- ```
-
- Here we have two tokens: `foo` (a member name) and `0` (interpreted as
- an array index `Nth`).
-
- ```ocaml
- # of_string_nav "/foo/bar/baz";;
- - : nav t = [Mem "foo"; Mem "bar"; Mem "baz"]
- ```
-
- Multiple tokens navigate deeper into nested structures.
-
- ### The Index Type
-
- Each reference token is represented using `Jsont.Path.index`:
-
- <!-- $MDX skip -->
- ```ocaml
- type index = Jsont.Path.index
- (* = Jsont.Path.Mem of string * Jsont.Meta.t
-    | Jsont.Path.Nth of int * Jsont.Meta.t *)
- ```
-
- The `Mem` constructor is for object member access, and `Nth` is for array
- index access. The member name is **unescaped** - you work with the actual
- key string (like `"a/b"`) and the library handles any escaping needed
- for the JSON Pointer string representation.
-
- ### Invalid Syntax
-
- What happens if a pointer doesn't start with `/`?
-
- ```ocaml
- # of_string_nav "foo";;
- Exception:
- Jsont.Error Invalid JSON Pointer: must be empty or start with '/': foo.
- ```
-
- The RFC is strict: non-empty pointers MUST start with `/`.
-
- For safer parsing, use `of_string_result`:
-
- ```ocaml
- # of_string_result "foo";;
- - : ([ `Append of append t | `Nav of nav t ], string) result =
- Error "Invalid JSON Pointer: must be empty or start with '/': foo"
- # of_string_result "/valid";;
- - : ([ `Append of append t | `Nav of nav t ], string) result =
- Ok (`Nav [Mem "valid"])
- ```
-
- ## Evaluation: Navigating JSON
-
- Now we come to the heart of JSON Pointer: evaluation. RFC 6901, Section 4
- describes how a pointer is resolved against a JSON document:
-
- > Evaluation of a JSON Pointer begins with a reference to the root value
- > of a JSON document and completes with a reference to some value within
- > the document. Each reference token in the JSON Pointer is evaluated
- > sequentially.
-
- Let's use the example JSON document from RFC 6901, Section 5:
-
- ```ocaml
- # let rfc_example = parse_json {|{
-     "foo": ["bar", "baz"],
-     "": 0,
-     "a/b": 1,
-     "c%d": 2,
-     "e^f": 3,
-     "g|h": 4,
-     "i\\j": 5,
-     "k\"l": 6,
-     " ": 7,
-     "m~n": 8
-   }|};;
- val rfc_example : Jsont.json =
-   {"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}
- ```
-
- This document is carefully constructed to exercise various edge cases!
-
- ### The Root Pointer
-
- ```ocaml
- # get root rfc_example ;;
- - : Jsont.json =
- {"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}
- ```
-
- The empty pointer (`root`) returns the whole document.
-
- ### Object Member Access
-
- ```ocaml
- # get (of_string_nav "/foo") rfc_example ;;
- - : Jsont.json = ["bar","baz"]
- ```
-
- `/foo` accesses the member named `foo`, which is an array.
-
- ### Array Index Access
-
- ```ocaml
- # get (of_string_nav "/foo/0") rfc_example ;;
- - : Jsont.json = "bar"
- # get (of_string_nav "/foo/1") rfc_example ;;
- - : Jsont.json = "baz"
- ```
-
- `/foo/0` first goes to `foo`, then accesses index 0 of the array.
-
- ### Empty String as Key
-
- JSON allows empty strings as object keys:
-
- ```ocaml
- # get (of_string_nav "/") rfc_example ;;
- - : Jsont.json = 0
- ```
-
- The pointer `/` has one token: the empty string. This accesses the member
- with an empty name.
-
- ### Keys with Special Characters
-
- The RFC example includes keys with `/` and `~` characters:
-
- ```ocaml
- # get (of_string_nav "/a~1b") rfc_example ;;
- - : Jsont.json = 1
- ```
-
- The token `a~1b` refers to the key `a/b`. We'll explain this escaping
- [below](#escaping-special-characters).
-
- ```ocaml
- # get (of_string_nav "/m~0n") rfc_example ;;
- - : Jsont.json = 8
- ```
-
- The token `m~0n` refers to the key `m~n`.
-
- **Important**: When using the OCaml library programmatically, you don't need
- to worry about escaping. The `Mem` variant holds the literal key name:
-
- ```ocaml
- # let slash_ptr = make [mem "a/b"];;
- val slash_ptr : nav t = [Mem "a/b"]
- # to_string slash_ptr;;
- - : string = "/a~1b"
- # get slash_ptr rfc_example ;;
- - : Jsont.json = 1
- ```
-
- The library escapes it when converting to string.
-
- ### Other Special Characters (No Escaping Needed)
-
- Most characters don't need escaping in JSON Pointer strings:
-
- ```ocaml
- # get (of_string_nav "/c%d") rfc_example ;;
- - : Jsont.json = 2
- # get (of_string_nav "/e^f") rfc_example ;;
- - : Jsont.json = 3
- # get (of_string_nav "/g|h") rfc_example ;;
- - : Jsont.json = 4
- # get (of_string_nav "/ ") rfc_example ;;
- - : Jsont.json = 7
- ```
-
- Even a space is a valid key character!
-
- ### Error Conditions
-
- What happens when we try to access something that doesn't exist?
-
- ```ocaml
- # get_result (of_string_nav "/nonexistent") rfc_example;;
- - : (Jsont.json, Jsont.Error.t) result =
- Error JSON Pointer: member 'nonexistent' not found
- File "-":
- # find (of_string_nav "/nonexistent") rfc_example;;
- - : Jsont.json option = None
- ```
-
- Or an out-of-bounds array index:
-
- ```ocaml
- # find (of_string_nav "/foo/99") rfc_example;;
- - : Jsont.json option = None
- ```
-
- Or try to index into a non-container:
-
- ```ocaml
- # find (of_string_nav "/foo/0/invalid") rfc_example;;
- - : Jsont.json option = None
- ```
-
- The library provides both exception-raising and result-returning variants:
-
- <!-- $MDX skip -->
- ```ocaml
- val get : nav t -> Jsont.json -> Jsont.json
- val get_result : nav t -> Jsont.json -> (Jsont.json, Jsont.Error.t) result
- val find : nav t -> Jsont.json -> Jsont.json option
- ```
-
- ### Array Index Rules
-
- RFC 6901 has specific rules for array indices. Section 4 states:
-
- > characters comprised of digits [...] that represent an unsigned base-10
- > integer value, making the new referenced value the array element with
- > the zero-based index identified by the token
-
- And importantly:
-
- > note that leading zeros are not allowed
-
- ```ocaml
- # of_string_nav "/foo/0";;
- - : nav t = [Mem "foo"; Nth 0]
- ```
-
- Zero itself is fine.
-
- ```ocaml
- # of_string_nav "/foo/01";;
- - : nav t = [Mem "foo"; Mem "01"]
- ```
-
- But `01` has a leading zero, so it's NOT treated as an array index - it
- becomes a member name instead. This protects against accidental octal
- interpretation.
-
- ## The End-of-Array Marker: `-` and Type Safety
-
- RFC 6901, Section 4 introduces a special token:
-
- > exactly the single character "-", making the new referenced value the
- > (nonexistent) member after the last array element.
-
- This `-` marker is unique to JSON Pointer (JSON Path has no equivalent).
- It's primarily useful for JSON Patch operations (RFC 6902) to append
- elements to arrays.
-
- ### Navigation vs Append Pointers
-
- The `jsont-pointer` library uses **phantom types** to encode the difference
- between pointers that can be used for navigation and pointers that target
- the "append position":
-
- <!-- $MDX skip -->
- ```ocaml
- type nav (* A pointer to an existing element *)
- type append (* A pointer ending with "-" (append position) *)
- type 'a t (* Pointer with phantom type parameter *)
- ```
-
- When you parse a pointer, you get either a `nav t` or an `append t`:
-
- ```ocaml
- # of_string "/foo/0";;
- - : [ `Append of append t | `Nav of nav t ] = `Nav [Mem "foo"; Nth 0]
- # of_string "/foo/-";;
- - : [ `Append of append t | `Nav of nav t ] = `Append [Mem "foo"] /-
- ```
-
- The `-` creates an `append` pointer. Note that in the internal
- representation, the append position is tracked separately (shown as `/-`).
-
- ### Why Phantom Types?
-
- The RFC explains that `-` refers to a *nonexistent* position:
-
- > Note that the use of the "-" character to index an array will always
- > result in such an error condition because by definition it refers to
- > a nonexistent array element.
-
- So you **cannot use `get` or `find`** with an append pointer - it makes
- no sense to retrieve a value from a position that doesn't exist! The
- library enforces this at compile time:
-
- ```ocaml
- # (* This won't compile: get requires nav t, not append t *)
-   (* get (match of_string "/foo/-" with `Append p -> p | _ -> assert false) rfc_example;; *)
- ```
-
- However, append pointers **are** valid for mutation operations like `add`:
-
- ```ocaml
- # let arr_obj = parse_json {|{"foo":["a","b"]}|};;
- val arr_obj : Jsont.json = {"foo":["a","b"]}
- # match of_string "/foo/-" with
-   | `Append p -> add p arr_obj ~value:(Jsont.Json.string "c")
-   | `Nav _ -> assert false ;;
- - : Jsont.json = {"foo":["a","b","c"]}
- ```
-
- For convenience, use `of_string_nav` when you know a pointer shouldn't
- contain `-`:
-
- ```ocaml
- # of_string_nav "/foo/0";;
- - : nav t = [Mem "foo"; Nth 0]
- # of_string_nav "/foo/-";;
- Exception:
- Jsont.Error Invalid JSON Pointer: '-' not allowed in navigation pointer.
- ```
-
- ### Creating Append Pointers Programmatically
-
- You can convert a navigation pointer to an append pointer using `at_end`:
-
- ```ocaml
- # let nav_ptr = of_string_nav "/foo";;
- val nav_ptr : nav t = [Mem "foo"]
- # let app_ptr = at_end nav_ptr;;
- val app_ptr : append t = [Mem "foo"] /-
- # to_string app_ptr;;
- - : string = "/foo/-"
- ```
-
- ## Mutation Operations
-
- While RFC 6901 defines JSON Pointer for read-only access, RFC 6902
- (JSON Patch) uses JSON Pointer for modifications. The `jsont-pointer`
- library provides these operations.
-
- ### Which Pointer Type for Which Operation?
-
- The phantom type system enforces correct usage:
-
- | Operation | Accepts | Because |
- |-----------|---------|---------|
- | `get`, `find` | `nav t` only | Can't retrieve from non-existent position |
- | `remove` | `nav t` only | Can't remove what doesn't exist |
- | `replace` | `nav t` only | Can't replace what doesn't exist |
- | `test` | `nav t` only | Can't test non-existent position |
- | `add` | `_ t` (both) | Can add at existing position OR append |
- | `set` | `_ t` (both) | Can set existing position OR append |
- | `move`, `copy` | `from:nav t`, `path:_ t` | Source must exist, dest can be append |
-
- ### Add
-
- The `add` operation inserts a value at a location:
-
- ```ocaml
- # let obj = parse_json {|{"foo":"bar"}|};;
- val obj : Jsont.json = {"foo":"bar"}
- # add (of_string_nav "/baz") obj ~value:(Jsont.Json.string "qux")
-   ;;
- - : Jsont.json = {"foo":"bar","baz":"qux"}
- ```
-
- For arrays, `add` inserts BEFORE the specified index:
-
- ```ocaml
- # let arr_obj = parse_json {|{"foo":["a","b"]}|};;
- val arr_obj : Jsont.json = {"foo":["a","b"]}
- # add (of_string_nav "/foo/1") arr_obj ~value:(Jsont.Json.string "X")
-   ;;
- - : Jsont.json = {"foo":["a","X","b"]}
- ```
-
- This is where the `-` marker and append pointers shine - they append to the end:
-
- ```ocaml
- # match of_string "/foo/-" with
-   | `Append p -> add p arr_obj ~value:(Jsont.Json.string "c")
-   | `Nav _ -> assert false ;;
- - : Jsont.json = {"foo":["a","b","c"]}
- ```
-
- Or more conveniently using `at_end`:
-
- ```ocaml
- # add (at_end (of_string_nav "/foo")) arr_obj ~value:(Jsont.Json.string "c")
-   ;;
- - : Jsont.json = {"foo":["a","b","c"]}
- ```
-
- ### Remove
-
- The `remove` operation deletes a value. It only accepts `nav t` because
- you can only remove something that exists:
-
- ```ocaml
- # let two_fields = parse_json {|{"foo":"bar","baz":"qux"}|};;
- val two_fields : Jsont.json = {"foo":"bar","baz":"qux"}
- # remove (of_string_nav "/baz") two_fields ;;
- - : Jsont.json = {"foo":"bar"}
- ```
-
- For arrays, it removes and shifts:
-
- ```ocaml
- # let three_elem = parse_json {|{"foo":["a","b","c"]}|};;
- val three_elem : Jsont.json = {"foo":["a","b","c"]}
- # remove (of_string_nav "/foo/1") three_elem ;;
- - : Jsont.json = {"foo":["a","c"]}
- ```
-
- ### Replace
-
- The `replace` operation updates an existing value:
-
- ```ocaml
- # replace (of_string_nav "/foo") obj ~value:(Jsont.Json.string "baz")
-   ;;
- - : Jsont.json = {"foo":"baz"}
- ```
-
- Unlike `add`, `replace` requires the target to already exist (hence `nav t`).
- Attempting to replace a nonexistent path raises an error.
-
- ### Move
-
- The `move` operation relocates a value. The source (`from`) must be a `nav t`
- (you can only move something that exists), but the destination (`path`) can
- be either:
-
- ```ocaml
- # let nested = parse_json {|{"foo":{"bar":"baz"},"qux":{}}|};;
- val nested : Jsont.json = {"foo":{"bar":"baz"},"qux":{}}
- # move ~from:(of_string_nav "/foo/bar") ~path:(of_string_nav "/qux/thud") nested
-   ;;
- - : Jsont.json = {"foo":{},"qux":{"thud":"baz"}}
- ```
-
- ### Copy
-
- The `copy` operation duplicates a value (same typing as `move`):
-
- ```ocaml
- # let to_copy = parse_json {|{"foo":{"bar":"baz"}}|};;
- val to_copy : Jsont.json = {"foo":{"bar":"baz"}}
- # copy ~from:(of_string_nav "/foo/bar") ~path:(of_string_nav "/foo/qux") to_copy
-   ;;
- - : Jsont.json = {"foo":{"bar":"baz","qux":"baz"}}
- ```
-
- ### Test
-
- The `test` operation verifies a value (useful in JSON Patch):
-
- ```ocaml
- # test (of_string_nav "/foo") obj ~expected:(Jsont.Json.string "bar");;
- - : bool = true
- # test (of_string_nav "/foo") obj ~expected:(Jsont.Json.string "wrong");;
- - : bool = false
- ```
-
- ## Escaping Special Characters
-
- RFC 6901, Section 3 explains the escaping rules:
-
- > Because the characters '\~' (%x7E) and '/' (%x2F) have special meanings
- > in JSON Pointer, '\~' needs to be encoded as '\~0' and '/' needs to be
- > encoded as '\~1' when these characters appear in a reference token.
-
- Why these specific characters?
- - `/` separates tokens, so it must be escaped inside a token
- - `~` is the escape character itself, so it must also be escaped
-
- The escape sequences are:
- - `~0` represents `~` (tilde)
- - `~1` represents `/` (forward slash)
-
- ### The Library Handles Escaping Automatically
-
- **Important**: When using `jsont-pointer` programmatically, you rarely need
- to think about escaping. The `Mem` variant stores unescaped strings,
- and escaping happens automatically during serialization:
-
- ```ocaml
- # let p = make [mem "a/b"];;
- val p : nav t = [Mem "a/b"]
- # to_string p;;
- - : string = "/a~1b"
- # of_string_nav "/a~1b";;
- - : nav t = [Mem "a/b"]
- ```
-
- ### Escaping in Action
-
- The `Token` module exposes the escaping functions:
-
- ```ocaml
- # Token.escape "hello";;
- - : string = "hello"
- # Token.escape "a/b";;
- - : string = "a~1b"
- # Token.escape "a~b";;
- - : string = "a~0b"
- # Token.escape "~/";;
- - : string = "~0~1"
- ```
-
- ### Unescaping
-
- And the reverse process:
-
- ```ocaml
- # Token.unescape "a~1b";;
- - : string = "a/b"
- # Token.unescape "a~0b";;
- - : string = "a~b"
- ```
-
- ### The Order Matters!
-
- RFC 6901, Section 4 is careful to specify the unescaping order:
-
- > Evaluation of each reference token begins by decoding any escaped
- > character sequence. This is performed by first transforming any
- > occurrence of the sequence '~1' to '/', and then transforming any
- > occurrence of the sequence '~0' to '~'. By performing the substitutions
- > in this order, an implementation avoids the error of turning '~01' first
- > into '~1' and then into '/', which would be incorrect (the string '~01'
- > correctly becomes '~1' after transformation).
-
- Let's verify this tricky case:
-
- ```ocaml
- # Token.unescape "~01";;
- - : string = "~1"
- ```
-
- If we unescaped `~0` first, `~01` would become `~1`, which would then become
- `/`. But that's wrong! The sequence `~01` should become the literal string
- `~1` (a tilde followed by the digit one).
-
- ## URI Fragment Encoding
-
- JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:
-
- > A JSON Pointer can be represented in a URI fragment identifier by
- > encoding it into octets using UTF-8, while percent-encoding those
- > characters not allowed by the fragment rule in RFC 3986.
-
- This adds percent-encoding on top of the `~0`/`~1` escaping:
-
- ```ocaml
- # to_uri_fragment (of_string_nav "/foo");;
- - : string = "/foo"
- # to_uri_fragment (of_string_nav "/a~1b");;
- - : string = "/a~1b"
- # to_uri_fragment (of_string_nav "/c%d");;
- - : string = "/c%25d"
- # to_uri_fragment (of_string_nav "/ ");;
- - : string = "/%20"
- ```
-
- The `%` character must be percent-encoded as `%25` in URIs, and
- spaces become `%20`.
-
- Here's the RFC example showing the URI fragment forms:
-
- | JSON Pointer | URI Fragment | Value |
- |-------------|-------------|-------|
- | `""` | `#` | whole document |
- | `"/foo"` | `#/foo` | `["bar", "baz"]` |
- | `"/foo/0"` | `#/foo/0` | `"bar"` |
- | `"/"` | `#/` | `0` |
- | `"/a~1b"` | `#/a~1b` | `1` |
- | `"/c%d"` | `#/c%25d` | `2` |
- | `"/ "` | `#/%20` | `7` |
- | `"/m~0n"` | `#/m~0n` | `8` |
-
- ## Building Pointers Programmatically
-
- Instead of parsing strings, you can build pointers from indices:
-
- ```ocaml
- # let port_ptr = make [mem "database"; mem "port"];;
- val port_ptr : nav t = [Mem "database"; Mem "port"]
- # to_string port_ptr;;
- - : string = "/database/port"
- ```
-
- For array access, use the `nth` helper:
-
- ```ocaml
- # let first_feature_ptr = make [mem "features"; nth 0];;
- val first_feature_ptr : nav t = [Mem "features"; Nth 0]
- # to_string first_feature_ptr;;
- - : string = "/features/0"
- ```
-
- ### Pointer Navigation
-
- You can build pointers incrementally using the `/` operator (or `append_index`):
-
- ```ocaml
- # let db_ptr = of_string_nav "/database";;
- val db_ptr : nav t = [Mem "database"]
- # let creds_ptr = db_ptr / mem "credentials";;
- val creds_ptr : nav t = [Mem "database"; Mem "credentials"]
- # let user_ptr = creds_ptr / mem "username";;
- val user_ptr : nav t = [Mem "database"; Mem "credentials"; Mem "username"]
- # to_string user_ptr;;
- - : string = "/database/credentials/username"
- ```
-
- Or concatenate two pointers:
-
- ```ocaml
- # let base = of_string_nav "/api/v1";;
- val base : nav t = [Mem "api"; Mem "v1"]
- # let endpoint = of_string_nav "/users/0";;
- val endpoint : nav t = [Mem "users"; Nth 0]
- # to_string (concat base endpoint);;
- - : string = "/api/v1/users/0"
- ```
-
- ## Jsont Integration
-
- The library integrates with the `Jsont` codec system, allowing you to
- combine JSON Pointer navigation with typed decoding. This is powerful
- because you can point to a location in a JSON document and decode it
- directly to an OCaml type.
-
- ```ocaml
- # let config_json = parse_json {|{
-     "database": {
-       "host": "localhost",
-       "port": 5432,
-       "credentials": {"username": "admin", "password": "secret"}
-     },
-     "features": ["auth", "logging", "metrics"]
-   }|};;
- val config_json : Jsont.json =
-   {"database":{"host":"localhost","port":5432,"credentials":{"username":"admin","password":"secret"}},"features":["auth","logging","metrics"]}
- ```
-
- ### Typed Access with `path`
-
- The `path` combinator combines pointer navigation with typed decoding:
-
- ```ocaml
- # let db_host =
-     Jsont.Json.decode
-       (path (of_string_nav "/database/host") Jsont.string)
-       config_json
-     |> Result.get_ok;;
- val db_host : string = "localhost"
- # let db_port =
-     Jsont.Json.decode
-       (path (of_string_nav "/database/port") Jsont.int)
-       config_json
-     |> Result.get_ok;;
- val db_port : int = 5432
- ```
-
- Extract a list of strings:
-
- ```ocaml
- # let features =
-     Jsont.Json.decode
-       (path (of_string_nav "/features") Jsont.(list string))
-       config_json
-     |> Result.get_ok;;
- val features : string list = ["auth"; "logging"; "metrics"]
- ```
-
- ### Default Values with `~absent`
-
- Use `~absent` to provide a default when a path doesn't exist:
-
- ```ocaml
- # let timeout =
-     Jsont.Json.decode
-       (path ~absent:30 (of_string_nav "/database/timeout") Jsont.int)
-       config_json
-     |> Result.get_ok;;
- val timeout : int = 30
- ```
-
- ### Nested Path Extraction
-
- You can extract values from deeply nested structures:
-
- ```ocaml
- # let org_json = parse_json {|{
-     "organization": {
-       "owner": {"name": "Alice", "email": "alice@example.com", "age": 35},
-       "members": [{"name": "Bob", "email": "bob@example.com", "age": 28}]
-     }
-   }|};;
- val org_json : Jsont.json =
-   {"organization":{"owner":{"name":"Alice","email":"alice@example.com","age":35},"members":[{"name":"Bob","email":"bob@example.com","age":28}]}}
- # Jsont.Json.decode
-     (path (of_string_nav "/organization/owner/name") Jsont.string)
-     org_json
-   |> Result.get_ok;;
- - : string = "Alice"
- # Jsont.Json.decode
-     (path (of_string_nav "/organization/members/0/age") Jsont.int)
-     org_json
-   |> Result.get_ok;;
- - : int = 28
- ```
-
- ### Comparison: Raw vs Typed Access
-
- **Raw access** requires pattern matching:
-
- ```ocaml
- # let raw_port =
-     match get (of_string_nav "/database/port") config_json with
-     | Jsont.Number (f, _) -> int_of_float f
-     | _ -> failwith "expected number";;
- val raw_port : int = 5432
- ```
-
- **Typed access** is cleaner and type-safe:
-
- ```ocaml
- # let typed_port =
-     Jsont.Json.decode
-       (path (of_string_nav "/database/port") Jsont.int)
-       config_json
-     |> Result.get_ok;;
- val typed_port : int = 5432
- ```
-
- The typed approach catches mismatches at decode time with clear errors.
-
- ## Summary
-
- JSON Pointer (RFC 6901) provides a simple but powerful way to address
- values within JSON documents:
-
- 1. **Syntax**: Pointers are strings of `/`-separated reference tokens
- 2. **Escaping**: Use `~0` for `~` and `~1` for `/` in tokens (handled automatically by the library)
- 3. **Evaluation**: Tokens navigate through objects (by key) and arrays (by index)
- 4. **URI Encoding**: Pointers can be percent-encoded for use in URIs
- 5. **Mutations**: Combined with JSON Patch (RFC 6902), pointers enable structured updates
- 6. **Type Safety**: Phantom types (`nav t` vs `append t`) prevent misuse of append pointers with retrieval operations
-
- The `jsont-pointer` library implements all of this with type-safe OCaml
- interfaces, integration with the `jsont` codec system, and proper error
- handling for malformed pointers and missing values.
-
- ### Key Points on JSON Pointer vs JSON Path
-
- - **JSON Pointer** addresses a *single* location (like a file path)
- - **JSON Path** queries for *multiple* values (like a search)
- - The `-` token is unique to JSON Pointer - it means "append position" for arrays
- - The library uses phantom types to enforce that `-` (append) pointers cannot be used with `get`/`find`
+846
doc/tutorial.mld
··· 1 + {0 JSON Pointer Tutorial} 2 + 3 + This tutorial introduces JSON Pointer as defined in 4 + {{:https://www.rfc-editor.org/rfc/rfc6901} RFC 6901}, and demonstrates 5 + the [jsont-pointer] OCaml library through interactive examples. 6 + 7 + {1 JSON Pointer vs JSON Path} 8 + 9 + Before diving in, it's worth understanding the difference between JSON 10 + Pointer and JSON Path, as they serve different purposes: 11 + 12 + {b JSON Pointer} (RFC 6901) is an {e indicator syntax} that specifies a 13 + {e single location} within JSON data. It always identifies at most one 14 + value. 15 + 16 + {b JSON Path} is a {e query syntax} that can {e search} JSON data and return 17 + {e multiple} values matching specified criteria. 18 + 19 + Use JSON Pointer when you need to address a single, specific location 20 + (like JSON Schema's [$ref]). Use JSON Path when you might need multiple 21 + results (like Kubernetes queries). 22 + 23 + The [jsont-pointer] library implements JSON Pointer and integrates with 24 + the {!Jsont.Path} type for representing navigation indices. 25 + 26 + {1 Setup} 27 + 28 + First, let's set up our environment. In the toplevel, you can load the 29 + library with [#require "jsont-pointer.top";;] which will automatically 30 + install pretty printers. 31 + 32 + {@ocaml[ 33 + # Jsont_pointer_top.install ();; 34 + - : unit = () 35 + # open Jsont_pointer;; 36 + # let parse_json s = 37 + match Jsont_bytesrw.decode_string Jsont.json s with 38 + | Ok json -> json 39 + | Error e -> failwith e;; 40 + val parse_json : string -> Jsont.json = <fun> 41 + ]} 42 + 43 + {1 What is JSON Pointer?} 44 + 45 + From RFC 6901, Section 1: 46 + 47 + {i JSON Pointer defines a string syntax for identifying a specific value 48 + within a JavaScript Object Notation (JSON) document.} 49 + 50 + In other words, JSON Pointer is an addressing scheme for locating values 51 + inside a JSON structure. Think of it like a filesystem path, but for JSON 52 + documents instead of files. 
53 + 54 + For example, given this JSON document: 55 + 56 + {x@ocaml[ 57 + # let users_json = 58 + parse_json "{\"users\":[{\"name\":\"Alice\",\"age\":30},{\"name\":\"Bob\",\"age\":25}]}";; 59 + val users_json : Jsont.json = 60 + {"users":[{"name":"Alice","age":30},{"name":"Bob","age":25}]} 61 + ]x} 62 + 63 + The JSON Pointer [/users/0/name] refers to the string ["Alice"]: 64 + 65 + {@ocaml[ 66 + # let ptr = of_string_nav "/users/0/name";; 67 + val ptr : nav t = [Mem "users"; Nth 0; Mem "name"] 68 + # get ptr users_json;; 69 + - : Jsont.json = "Alice" 70 + ]} 71 + 72 + In OCaml, this is represented by the ['a Jsont_pointer.t] type - a sequence 73 + of navigation steps from the document root to a target value. The phantom 74 + type parameter ['a] encodes whether this is a navigation pointer or an 75 + append pointer (more on this later). 76 + 77 + {1 Syntax: Reference Tokens} 78 + 79 + RFC 6901, Section 3 defines the syntax: 80 + 81 + {i A JSON Pointer is a Unicode string containing a sequence of zero or more 82 + reference tokens, each prefixed by a '/' (%x2F) character.} 83 + 84 + The grammar is elegantly simple: 85 + 86 + {v 87 + json-pointer = *( "/" reference-token ) 88 + reference-token = *( unescaped / escaped ) 89 + v} 90 + 91 + This means: 92 + - The empty string [""] is a valid pointer (it refers to the whole document) 93 + - Every non-empty pointer starts with [/] 94 + - Everything between [/] characters is a "reference token" 95 + 96 + Let's see this in action: 97 + 98 + {@ocaml[ 99 + # of_string_nav "";; 100 + - : nav t = [] 101 + ]} 102 + 103 + The empty pointer has no reference tokens - it points to the root. 104 + 105 + {@ocaml[ 106 + # of_string_nav "/foo";; 107 + - : nav t = [Mem "foo"] 108 + ]} 109 + 110 + The pointer [/foo] has one token: [foo]. Since it's not a number, it's 111 + interpreted as an object member name ([Mem]). 
112 + 113 + {@ocaml[ 114 + # of_string_nav "/foo/0";; 115 + - : nav t = [Mem "foo"; Nth 0] 116 + ]} 117 + 118 + Here we have two tokens: [foo] (a member name) and [0] (interpreted as 119 + an array index [Nth]). 120 + 121 + {@ocaml[ 122 + # of_string_nav "/foo/bar/baz";; 123 + - : nav t = [Mem "foo"; Mem "bar"; Mem "baz"] 124 + ]} 125 + 126 + Multiple tokens navigate deeper into nested structures. 127 + 128 + {2 The Index Type} 129 + 130 + Each reference token is represented using {!Jsont.Path.index}: 131 + 132 + {[ 133 + type index = Jsont.Path.index 134 + (* = Jsont.Path.Mem of string * Jsont.Meta.t 135 + | Jsont.Path.Nth of int * Jsont.Meta.t *) 136 + ]} 137 + 138 + The [Mem] constructor is for object member access, and [Nth] is for array 139 + index access. The member name is {b unescaped} - you work with the actual 140 + key string (like ["a/b"]) and the library handles any escaping needed 141 + for the JSON Pointer string representation. 142 + 143 + {2 Invalid Syntax} 144 + 145 + What happens if a pointer doesn't start with [/]? 146 + 147 + {@ocaml[ 148 + # of_string_nav "foo";; 149 + Exception: 150 + Jsont.Error Invalid JSON Pointer: must be empty or start with '/': foo. 151 + ]} 152 + 153 + The RFC is strict: non-empty pointers MUST start with [/]. 154 + 155 + For safer parsing, use [of_string_result]: 156 + 157 + {@ocaml[ 158 + # of_string_result "foo";; 159 + - : ([ `Append of append t | `Nav of nav t ], string) result = 160 + Error "Invalid JSON Pointer: must be empty or start with '/': foo" 161 + # of_string_result "/valid";; 162 + - : ([ `Append of append t | `Nav of nav t ], string) result = 163 + Ok (`Nav [Mem "valid"]) 164 + ]} 165 + 166 + {1 Evaluation: Navigating JSON} 167 + 168 + Now we come to the heart of JSON Pointer: evaluation. 
RFC 6901, Section 4 169 + describes how a pointer is resolved against a JSON document: 170 + 171 + {i Evaluation of a JSON Pointer begins with a reference to the root value 172 + of a JSON document and completes with a reference to some value within 173 + the document. Each reference token in the JSON Pointer is evaluated 174 + sequentially.} 175 + 176 + Let's use the example JSON document from RFC 6901, Section 5: 177 + 178 + {@ocaml[ 179 + # let rfc_example = parse_json "{\"foo\":[\"bar\",\"baz\"],\"\":0,\"a/b\":1,\"c%d\":2,\"e^f\":3,\"g|h\":4,\"i\\\\j\":5,\"k\\\"l\":6,\" \":7,\"m~n\":8}";; 180 + val rfc_example : Jsont.json = 181 + {"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8} 182 + ]} 183 + 184 + This document is carefully constructed to exercise various edge cases! 185 + 186 + {2 The Root Pointer} 187 + 188 + {@ocaml[ 189 + # get root rfc_example ;; 190 + - : Jsont.json = 191 + {"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8} 192 + ]} 193 + 194 + The empty pointer ({!root}) returns the whole document. 195 + 196 + {2 Object Member Access} 197 + 198 + {@ocaml[ 199 + # get (of_string_nav "/foo") rfc_example ;; 200 + - : Jsont.json = ["bar","baz"] 201 + ]} 202 + 203 + [/foo] accesses the member named [foo], which is an array. 204 + 205 + {2 Array Index Access} 206 + 207 + {@ocaml[ 208 + # get (of_string_nav "/foo/0") rfc_example ;; 209 + - : Jsont.json = "bar" 210 + # get (of_string_nav "/foo/1") rfc_example ;; 211 + - : Jsont.json = "baz" 212 + ]} 213 + 214 + [/foo/0] first goes to [foo], then accesses index 0 of the array. 215 + 216 + {2 Empty String as Key} 217 + 218 + JSON allows empty strings as object keys: 219 + 220 + {@ocaml[ 221 + # get (of_string_nav "/") rfc_example ;; 222 + - : Jsont.json = 0 223 + ]} 224 + 225 + The pointer [/] has one token: the empty string. This accesses the member 226 + with an empty name. 
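The evaluation steps shown so far can be sketched over a toy JSON type. This is an illustration only: the [json] and [step] types and the [find] function below are assumptions made for this sketch and are not the library's definitions.

```ocaml
(* Toy JSON type and navigation steps, defined here for illustration. *)
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list

type step = Mem of string | Nth of int

(* Evaluate the steps one at a time, starting from the root value.
   Returns None as soon as a step cannot be resolved. *)
let rec find (steps : step list) (j : json) : json option =
  match steps, j with
  | [], _ -> Some j
  | Mem k :: rest, Object fields ->
    Option.bind (List.assoc_opt k fields) (find rest)
  | Nth i :: rest, Array items ->
    if i < 0 then None
    else Option.bind (List.nth_opt items i) (find rest)
  | Nth i :: rest, Object fields ->
    (* Per RFC 6901, a numeric token applied to an object is a member name. *)
    Option.bind (List.assoc_opt (string_of_int i) fields) (find rest)
  | _ -> None
```

An empty step list returns the document itself, which mirrors the behaviour of the root pointer.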
227 + 228 + {2 Keys with Special Characters} 229 + 230 + The RFC example includes keys with [/] and [~] characters: 231 + 232 + {@ocaml[ 233 + # get (of_string_nav "/a~1b") rfc_example ;; 234 + - : Jsont.json = 1 235 + ]} 236 + 237 + The token [a~1b] refers to the key [a/b]. We'll explain this escaping 238 + {{:#escaping}below}. 239 + 240 + {@ocaml[ 241 + # get (of_string_nav "/m~0n") rfc_example ;; 242 + - : Jsont.json = 8 243 + ]} 244 + 245 + The token [m~0n] refers to the key [m~n]. 246 + 247 + {b Important}: When using the OCaml library programmatically, you don't need 248 + to worry about escaping. The [Mem] variant holds the literal key name: 249 + 250 + {@ocaml[ 251 + # let slash_ptr = make [mem "a/b"];; 252 + val slash_ptr : nav t = [Mem "a/b"] 253 + # to_string slash_ptr;; 254 + - : string = "/a~1b" 255 + # get slash_ptr rfc_example ;; 256 + - : Jsont.json = 1 257 + ]} 258 + 259 + The library escapes it when converting to string. 260 + 261 + {2 Other Special Characters (No Escaping Needed)} 262 + 263 + Most characters don't need escaping in JSON Pointer strings: 264 + 265 + {@ocaml[ 266 + # get (of_string_nav "/c%d") rfc_example ;; 267 + - : Jsont.json = 2 268 + # get (of_string_nav "/e^f") rfc_example ;; 269 + - : Jsont.json = 3 270 + # get (of_string_nav "/g|h") rfc_example ;; 271 + - : Jsont.json = 4 272 + # get (of_string_nav "/ ") rfc_example ;; 273 + - : Jsont.json = 7 274 + ]} 275 + 276 + Even a space is a valid key character! 277 + 278 + {2 Error Conditions} 279 + 280 + What happens when we try to access something that doesn't exist? 
281 + 282 + {@ocaml[ 283 + # get_result (of_string_nav "/nonexistent") rfc_example;; 284 + - : (Jsont.json, Jsont.Error.t) result = 285 + Error JSON Pointer: member 'nonexistent' not found 286 + File "-": 287 + # find (of_string_nav "/nonexistent") rfc_example;; 288 + - : Jsont.json option = None 289 + ]} 290 + 291 + Or an out-of-bounds array index: 292 + 293 + {@ocaml[ 294 + # find (of_string_nav "/foo/99") rfc_example;; 295 + - : Jsont.json option = None 296 + ]} 297 + 298 + Or an attempt to index into a non-container: 299 + 300 + {@ocaml[ 301 + # find (of_string_nav "/foo/0/invalid") rfc_example;; 302 + - : Jsont.json option = None 303 + ]} 304 + 305 + The library provides exception-raising, result-returning, and option-returning variants: 306 + 307 + {[ 308 + val get : nav t -> Jsont.json -> Jsont.json 309 + val get_result : nav t -> Jsont.json -> (Jsont.json, Jsont.Error.t) result 310 + val find : nav t -> Jsont.json -> Jsont.json option 311 + ]} 312 + 313 + {2 Array Index Rules} 314 + 315 + RFC 6901 has specific rules for array indices. Section 4 states: 316 + 317 + {i characters comprised of digits [...] that represent an unsigned base-10 318 + integer value, making the new referenced value the array element with 319 + the zero-based index identified by the token} 320 + 321 + And importantly: 322 + 323 + {i note that leading zeros are not allowed} 324 + 325 + {@ocaml[ 326 + # of_string_nav "/foo/0";; 327 + - : nav t = [Mem "foo"; Nth 0] 328 + ]} 329 + 330 + Zero itself is fine. 331 + 332 + {@ocaml[ 333 + # of_string_nav "/foo/01";; 334 + - : nav t = [Mem "foo"; Mem "01"] 335 + ]} 336 + 337 + But [01] has a leading zero, so it's NOT treated as an array index - it 338 + becomes a member name instead. This protects against accidental octal 339 + interpretation. 
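The index rule above can be sketched as a small classifier. This is a standalone illustration under stated assumptions: the [step] type and [classify] function are hypothetical and defined here, and the [-] token is deliberately not handled.

```ocaml
(* Illustration only: decide whether a reference token looks like an
   array index under RFC 6901 (digits only, no leading zeros). *)
type step = Mem of string | Nth of int

let is_digit c = c >= '0' && c <= '9'

let all_digits s =
  let rec go i = i >= String.length s || (is_digit s.[i] && go (i + 1)) in
  s <> "" && go 0

let classify (token : string) : step =
  let leading_zero = String.length token > 1 && token.[0] = '0' in
  if all_digits token && not leading_zero then
    match int_of_string_opt token with
    | Some i -> Nth i
    | None -> Mem token   (* digits that overflow int stay a member name *)
  else Mem token
```

Falling back to [Mem] rather than raising keeps tokens such as [01] usable as object keys, which matches the parsing behaviour shown above.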
340 + 341 + {1 The End-of-Array Marker: [-] and Type Safety} 342 + 343 + RFC 6901, Section 4 introduces a special token: 344 + 345 + {i exactly the single character "-", making the new referenced value the 346 + (nonexistent) member after the last array element.} 347 + 348 + This [-] marker is unique to JSON Pointer (JSON Path has no equivalent). 349 + It's primarily useful for JSON Patch operations (RFC 6902) to append 350 + elements to arrays. 351 + 352 + {2 Navigation vs Append Pointers} 353 + 354 + The [jsont-pointer] library uses {b phantom types} to encode the difference 355 + between pointers that can be used for navigation and pointers that target 356 + the "append position": 357 + 358 + {[ 359 + type nav (* A pointer to an existing element *) 360 + type append (* A pointer ending with "-" (append position) *) 361 + type 'a t (* Pointer with phantom type parameter *) 362 + ]} 363 + 364 + When you parse a pointer, you get either a [nav t] or an [append t]: 365 + 366 + {@ocaml[ 367 + # of_string "/foo/0";; 368 + - : [ `Append of Jsont_pointer.append Jsont_pointer.t 369 + | `Nav of Jsont_pointer.nav Jsont_pointer.t ] 370 + = `Nav [Mem "foo"; Nth 0] 371 + # of_string "/foo/-";; 372 + - : [ `Append of Jsont_pointer.append Jsont_pointer.t 373 + | `Nav of Jsont_pointer.nav Jsont_pointer.t ] 374 + = `Append [Mem "foo"] /- 375 + ]} 376 + 377 + The [-] creates an [append] pointer. Note that in the internal 378 + representation, the append position is tracked separately (shown as [/-]). 379 + 380 + {2 Why Phantom Types?} 381 + 382 + The RFC explains that [-] refers to a {e nonexistent} position: 383 + 384 + {i Note that the use of the "-" character to index an array will always 385 + result in such an error condition because by definition it refers to 386 + a nonexistent array element.} 387 + 388 + So you {b cannot use [get] or [find]} with an append pointer - it makes 389 + no sense to retrieve a value from a position that doesn't exist! 
The 390 + library enforces this at compile time. 391 + 392 + However, append pointers {b are} valid for mutation operations like {!add}: 393 + 394 + {x@ocaml[ 395 + # let arr_obj = parse_json "{\"foo\":[\"a\",\"b\"]}";; 396 + val arr_obj : Jsont.json = {"foo":["a","b"]} 397 + # (match of_string "/foo/-" with `Append p -> add p arr_obj ~value:(Jsont.Json.string "c") | `Nav _ -> assert false);; 398 + - : Jsont.json = {"foo":["a","b","c"]} 399 + ]x} 400 + 401 + For convenience, use {!of_string_nav} when you know a pointer shouldn't 402 + contain [-]: 403 + 404 + {@ocaml[ 405 + # of_string_nav "/foo/0";; 406 + - : Jsont_pointer.nav Jsont_pointer.t = [Mem "foo"; Nth 0] 407 + # of_string_nav "/foo/-";; 408 + Exception: 409 + Jsont.Error Invalid JSON Pointer: '-' not allowed in navigation pointer. 410 + ]} 411 + 412 + {2 Creating Append Pointers Programmatically} 413 + 414 + You can convert a navigation pointer to an append pointer using {!at_end}: 415 + 416 + {@ocaml[ 417 + # let nav_ptr = of_string_nav "/foo";; 418 + val nav_ptr : Jsont_pointer.nav Jsont_pointer.t = [Mem "foo"] 419 + # let app_ptr = at_end nav_ptr;; 420 + val app_ptr : Jsont_pointer.append Jsont_pointer.t = [Mem "foo"] /- 421 + # to_string app_ptr;; 422 + - : string = "/foo/-" 423 + ]} 424 + 425 + {1 Mutation Operations} 426 + 427 + While RFC 6901 defines JSON Pointer for read-only access, RFC 6902 428 + (JSON Patch) uses JSON Pointer for modifications. The [jsont-pointer] 429 + library provides these operations. 
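To make the phantom-type idea from the preceding sections concrete, here is a minimal standalone sketch. The [Ptr] module is hypothetical and much simpler than the library: tokens are plain strings, no escaping is performed, and [length] stands in for any operation that should accept navigation pointers only.

```ocaml
(* Illustration only: a toy pointer type with a phantom parameter. *)
module Ptr : sig
  type nav
  type append
  type 'a t
  val root : nav t
  val ( / ) : nav t -> string -> nav t
  val at_end : nav t -> append t
  val to_string : 'a t -> string    (* accepts both kinds *)
  val length : nav t -> int         (* navigation pointers only *)
end = struct
  type nav
  type append
  (* The parameter 'a never appears in the representation. *)
  type 'a t = { rev_tokens : string list; is_append : bool }
  let root = { rev_tokens = []; is_append = false }
  let ( / ) p tok = { p with rev_tokens = tok :: p.rev_tokens }
  let at_end p = { rev_tokens = p.rev_tokens; is_append = true }
  let to_string p =
    let toks = List.rev p.rev_tokens in
    let toks = if p.is_append then toks @ [ "-" ] else toks in
    String.concat "" (List.map (fun t -> "/" ^ t) toks)
  let length p = List.length p.rev_tokens
end

let nav_ptr = Ptr.(root / "foo")
let app_ptr = Ptr.at_end nav_ptr
(* Ptr.length app_ptr would be rejected by the type checker. *)
```

Because ['a t] is abstract in the signature, [nav t] and [append t] are distinct types to callers even though they share one runtime representation.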
430 + 431 + {2 Which Pointer Type for Which Operation?} 432 + 433 + The phantom type system enforces correct usage: 434 + 435 + {ul 436 + {- {!get}, {!find} - [nav t] only - Can't retrieve from non-existent position} 437 + {- {!remove} - [nav t] only - Can't remove what doesn't exist} 438 + {- {!replace} - [nav t] only - Can't replace what doesn't exist} 439 + {- {!test} - [nav t] only - Can't test non-existent position} 440 + {- {!add} - [_ t] (both) - Can add at existing position OR append} 441 + {- {!set} - [_ t] (both) - Can set existing position OR append} 442 + {- {!move}, {!copy} - [from:nav t], [path:_ t] - Source must exist, dest can be append} 443 + } 444 + 445 + {2 Add} 446 + 447 + The {!add} operation inserts a value at a location: 448 + 449 + {@ocaml[ 450 + # let obj = parse_json "{\"foo\":\"bar\"}";; 451 + val obj : Jsont.json = {"foo":"bar"} 452 + # add (of_string_nav "/baz") obj ~value:(Jsont.Json.string "qux") 453 + ;; 454 + - : Jsont.json = {"foo":"bar","baz":"qux"} 455 + ]} 456 + 457 + For arrays, {!add} inserts BEFORE the specified index: 458 + 459 + {x@ocaml[ 460 + # let arr_obj = parse_json "{\"foo\":[\"a\",\"b\"]}";; 461 + val arr_obj : Jsont.json = {"foo":["a","b"]} 462 + # add (of_string_nav "/foo/1") arr_obj ~value:(Jsont.Json.string "X") 463 + ;; 464 + - : Jsont.json = {"foo":["a","X","b"]} 465 + ]x} 466 + 467 + This is where the [-] marker and append pointers shine - they append to the end: 468 + 469 + {x@ocaml[ 470 + # (match of_string "/foo/-" with `Append p -> add p arr_obj ~value:(Jsont.Json.string "c") | `Nav _ -> assert false);; 471 + - : Jsont.json = {"foo":["a","b","c"]} 472 + ]x} 473 + 474 + Or more conveniently using {!at_end}: 475 + 476 + {x@ocaml[ 477 + # add (at_end (of_string_nav "/foo")) arr_obj ~value:(Jsont.Json.string "c") 478 + ;; 479 + - : Jsont.json = {"foo":["a","b","c"]} 480 + ]x} 481 + 482 + {2 Remove} 483 + 484 + The {!remove} operation deletes a value. 
It only accepts [nav t] because 485 + you can only remove something that exists: 486 + 487 + {@ocaml[ 488 + # let two_fields = parse_json "{\"foo\":\"bar\",\"baz\":\"qux\"}";; 489 + val two_fields : Jsont.json = {"foo":"bar","baz":"qux"} 490 + # remove (of_string_nav "/baz") two_fields ;; 491 + - : Jsont.json = {"foo":"bar"} 492 + ]} 493 + 494 + For arrays, it removes and shifts: 495 + 496 + {x@ocaml[ 497 + # let three_elem = parse_json "{\"foo\":[\"a\",\"b\",\"c\"]}";; 498 + val three_elem : Jsont.json = {"foo":["a","b","c"]} 499 + # remove (of_string_nav "/foo/1") three_elem ;; 500 + - : Jsont.json = {"foo":["a","c"]} 501 + ]x} 502 + 503 + {2 Replace} 504 + 505 + The {!replace} operation updates an existing value: 506 + 507 + {@ocaml[ 508 + # replace (of_string_nav "/foo") obj ~value:(Jsont.Json.string "baz") 509 + ;; 510 + - : Jsont.json = {"foo":"baz"} 511 + ]} 512 + 513 + Unlike {!add}, {!replace} requires the target to already exist (hence [nav t]). 514 + Attempting to replace a nonexistent path raises an error. 515 + 516 + {2 Move} 517 + 518 + The {!move} operation relocates a value. 
The source ([from]) must be a [nav t] 519 + (you can only move something that exists), but the destination ([path]) can 520 + be either: 521 + 522 + {@ocaml[ 523 + # let nested = parse_json "{\"foo\":{\"bar\":\"baz\"},\"qux\":{}}";; 524 + val nested : Jsont.json = {"foo":{"bar":"baz"},"qux":{}} 525 + # move ~from:(of_string_nav "/foo/bar") ~path:(of_string_nav "/qux/thud") nested 526 + ;; 527 + - : Jsont.json = {"foo":{},"qux":{"thud":"baz"}} 528 + ]} 529 + 530 + {2 Copy} 531 + 532 + The {!copy} operation duplicates a value (same typing as {!move}): 533 + 534 + {@ocaml[ 535 + # let to_copy = parse_json "{\"foo\":{\"bar\":\"baz\"}}";; 536 + val to_copy : Jsont.json = {"foo":{"bar":"baz"}} 537 + # copy ~from:(of_string_nav "/foo/bar") ~path:(of_string_nav "/foo/qux") to_copy 538 + ;; 539 + - : Jsont.json = {"foo":{"bar":"baz","qux":"baz"}} 540 + ]} 541 + 542 + {2 Test} 543 + 544 + The {!test} operation verifies a value (useful in JSON Patch): 545 + 546 + {@ocaml[ 547 + # test (of_string_nav "/foo") obj ~expected:(Jsont.Json.string "bar");; 548 + - : bool = true 549 + # test (of_string_nav "/foo") obj ~expected:(Jsont.Json.string "wrong");; 550 + - : bool = false 551 + ]} 552 + 553 + {1:escaping Escaping Special Characters} 554 + 555 + RFC 6901, Section 3 explains the escaping rules: 556 + 557 + {i Because the characters '~' (%x7E) and '/' (%x2F) have special meanings 558 + in JSON Pointer, '~' needs to be encoded as '~0' and '/' needs to be 559 + encoded as '~1' when these characters appear in a reference token.} 560 + 561 + Why these specific characters? 
562 + - [/] separates tokens, so it must be escaped inside a token 563 + - [~] is the escape character itself, so it must also be escaped 564 + 565 + The escape sequences are: 566 + - [~0] represents [~] (tilde) 567 + - [~1] represents [/] (forward slash) 568 + 569 + {2 The Library Handles Escaping Automatically} 570 + 571 + {b Important}: When using [jsont-pointer] programmatically, you rarely need 572 + to think about escaping. The [Mem] variant stores unescaped strings, 573 + and escaping happens automatically during serialization: 574 + 575 + {@ocaml[ 576 + # let p = make [mem "a/b"];; 577 + val p : Jsont_pointer.nav Jsont_pointer.t = [Mem "a/b"] 578 + # to_string p;; 579 + - : string = "/a~1b" 580 + # of_string_nav "/a~1b";; 581 + - : Jsont_pointer.nav Jsont_pointer.t = [Mem "a/b"] 582 + ]} 583 + 584 + {2 Escaping in Action} 585 + 586 + The {!Token} module exposes the escaping functions: 587 + 588 + {@ocaml[ 589 + # Token.escape "hello";; 590 + - : string = "hello" 591 + # Token.escape "a/b";; 592 + - : string = "a~1b" 593 + # Token.escape "a~b";; 594 + - : string = "a~0b" 595 + # Token.escape "~/";; 596 + - : string = "~0~1" 597 + ]} 598 + 599 + {2 Unescaping} 600 + 601 + And the reverse process: 602 + 603 + {@ocaml[ 604 + # Token.unescape "a~1b";; 605 + - : string = "a/b" 606 + # Token.unescape "a~0b";; 607 + - : string = "a~b" 608 + ]} 609 + 610 + {2 The Order Matters!} 611 + 612 + RFC 6901, Section 4 is careful to specify the unescaping order: 613 + 614 + {i Evaluation of each reference token begins by decoding any escaped 615 + character sequence. This is performed by first transforming any 616 + occurrence of the sequence '~1' to '/', and then transforming any 617 + occurrence of the sequence '~0' to '~'. 
By performing the substitutions 618 + in this order, an implementation avoids the error of turning '~01' first 619 + into '~1' and then into '/', which would be incorrect (the string '~01' 620 + correctly becomes '~1' after transformation).} 621 + 622 + Let's verify this tricky case: 623 + 624 + {@ocaml[ 625 + # Token.unescape "~01";; 626 + - : string = "~1" 627 + ]} 628 + 629 + If we unescaped [~0] first, [~01] would become [~1], which would then become 630 + [/]. But that's wrong! The sequence [~01] should become the literal string 631 + [~1] (a tilde followed by the digit one). 632 + 633 + {1 URI Fragment Encoding} 634 + 635 + JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains: 636 + 637 + {i A JSON Pointer can be represented in a URI fragment identifier by 638 + encoding it into octets using UTF-8, while percent-encoding those 639 + characters not allowed by the fragment rule in RFC 3986.} 640 + 641 + This adds percent-encoding on top of the [~0]/[~1] escaping: 642 + 643 + {@ocaml[ 644 + # to_uri_fragment (of_string_nav "/foo");; 645 + - : string = "/foo" 646 + # to_uri_fragment (of_string_nav "/a~1b");; 647 + - : string = "/a~1b" 648 + # to_uri_fragment (of_string_nav "/c%d");; 649 + - : string = "/c%25d" 650 + # to_uri_fragment (of_string_nav "/ ");; 651 + - : string = "/%20" 652 + ]} 653 + 654 + The [%] character must be percent-encoded as [%25] in URIs, and 655 + spaces become [%20]. 
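The percent-encoding step can be sketched as follows. This is a minimal illustration, assuming the set of characters left unencoded is exactly what the RFC 3986 fragment rule allows; [fragment_safe] and [percent_encode] are hypothetical names, and the library's own [to_uri_fragment] may differ in detail.

```ocaml
(* Characters allowed unencoded by RFC 3986:
   fragment = *( pchar / "/" / "?" ) *)
let fragment_safe = function
  | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' -> true            (* ALPHA / DIGIT *)
  | '-' | '.' | '_' | '~' -> true                           (* unreserved *)
  | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '=' ->
    true                                                    (* sub-delims *)
  | ':' | '@' | '/' | '?' -> true
  | _ -> false

(* Percent-encode every other byte of an already ~0/~1-escaped pointer.
   Working byte by byte also covers multi-byte UTF-8 sequences. *)
let percent_encode (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      if fragment_safe c then Buffer.add_char b c
      else Buffer.add_string b (Printf.sprintf "%%%02X" (Char.code c)))
    s;
  Buffer.contents b
```

The tilde is an unreserved character, so the [~0] and [~1] escapes pass through this step unchanged.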
656 + 657 + Here's the RFC example showing the URI fragment forms: 658 + 659 + {ul 660 + {- [""] → [#] → whole document} 661 + {- ["/foo"] → [#/foo] → [["bar", "baz"]]} 662 + {- ["/foo/0"] → [#/foo/0] → ["bar"]} 663 + {- ["/"] → [#/] → [0]} 664 + {- ["/a~1b"] → [#/a~1b] → [1]} 665 + {- ["/c%d"] → [#/c%25d] → [2]} 666 + {- ["/ "] → [#/%20] → [7]} 667 + {- ["/m~0n"] → [#/m~0n] → [8]} 668 + } 669 + 670 + {1 Building Pointers Programmatically} 671 + 672 + Instead of parsing strings, you can build pointers from indices: 673 + 674 + {@ocaml[ 675 + # let port_ptr = make [mem "database"; mem "port"];; 676 + val port_ptr : Jsont_pointer.nav Jsont_pointer.t = 677 + [Mem "database"; Mem "port"] 678 + # to_string port_ptr;; 679 + - : string = "/database/port" 680 + ]} 681 + 682 + For array access, use the {!nth} helper: 683 + 684 + {@ocaml[ 685 + # let first_feature_ptr = make [mem "features"; nth 0];; 686 + val first_feature_ptr : Jsont_pointer.nav Jsont_pointer.t = 687 + [Mem "features"; Nth 0] 688 + # to_string first_feature_ptr;; 689 + - : string = "/features/0" 690 + ]} 691 + 692 + {2 Pointer Navigation} 693 + 694 + You can build pointers incrementally using the [/] operator (or {!append_index}): 695 + 696 + {@ocaml[ 697 + # let db_ptr = of_string_nav "/database";; 698 + val db_ptr : Jsont_pointer.nav Jsont_pointer.t = [Mem "database"] 699 + # let creds_ptr = db_ptr / mem "credentials";; 700 + val creds_ptr : Jsont_pointer.nav Jsont_pointer.t = 701 + [Mem "database"; Mem "credentials"] 702 + # let user_ptr = creds_ptr / mem "username";; 703 + val user_ptr : Jsont_pointer.nav Jsont_pointer.t = 704 + [Mem "database"; Mem "credentials"; Mem "username"] 705 + # to_string user_ptr;; 706 + - : string = "/database/credentials/username" 707 + ]} 708 + 709 + Or concatenate two pointers: 710 + 711 + {@ocaml[ 712 + # let base = of_string_nav "/api/v1";; 713 + val base : Jsont_pointer.nav Jsont_pointer.t = [Mem "api"; Mem "v1"] 714 + # let endpoint = of_string_nav "/users/0";; 715 + 
val endpoint : Jsont_pointer.nav Jsont_pointer.t = [Mem "users"; Nth 0] 716 + # to_string (concat base endpoint);; 717 + - : string = "/api/v1/users/0" 718 + ]} 719 + 720 + {1 Jsont Integration} 721 + 722 + The library integrates with the {!Jsont} codec system, allowing you to 723 + combine JSON Pointer navigation with typed decoding. This is powerful 724 + because you can point to a location in a JSON document and decode it 725 + directly to an OCaml type. 726 + 727 + {x@ocaml[ 728 + # let config_json = parse_json "{\"database\":{\"host\":\"localhost\",\"port\":5432,\"credentials\":{\"username\":\"admin\",\"password\":\"secret\"}},\"features\":[\"auth\",\"logging\",\"metrics\"]}";; 729 + val config_json : Jsont.json = 730 + {"database":{"host":"localhost","port":5432,"credentials":{"username":"admin","password":"secret"}},"features":["auth","logging","metrics"]} 731 + ]x} 732 + 733 + {2 Typed Access with [path]} 734 + 735 + The {!path} combinator combines pointer navigation with typed decoding: 736 + 737 + {@ocaml[ 738 + # let db_host = 739 + Jsont.Json.decode 740 + (path (of_string_nav "/database/host") Jsont.string) 741 + config_json 742 + |> Result.get_ok;; 743 + val db_host : string = "localhost" 744 + # let db_port = 745 + Jsont.Json.decode 746 + (path (of_string_nav "/database/port") Jsont.int) 747 + config_json 748 + |> Result.get_ok;; 749 + val db_port : int = 5432 750 + ]} 751 + 752 + Extract a list of strings: 753 + 754 + {@ocaml[ 755 + # let features = 756 + Jsont.Json.decode 757 + (path (of_string_nav "/features") Jsont.(list string)) 758 + config_json 759 + |> Result.get_ok;; 760 + val features : string list = ["auth"; "logging"; "metrics"] 761 + ]} 762 + 763 + {2 Default Values with [~absent]} 764 + 765 + Use [~absent] to provide a default when a path doesn't exist: 766 + 767 + {@ocaml[ 768 + # let timeout = 769 + Jsont.Json.decode 770 + (path ~absent:30 (of_string_nav "/database/timeout") Jsont.int) 771 + config_json 772 + |> Result.get_ok;; 773 + 
val timeout : int = 30 774 + ]} 775 + 776 + {2 Nested Path Extraction} 777 + 778 + You can extract values from deeply nested structures: 779 + 780 + {x@ocaml[ 781 + # let org_json = parse_json "{\"organization\":{\"owner\":{\"name\":\"Alice\",\"email\":\"alice@example.com\",\"age\":35},\"members\":[{\"name\":\"Bob\",\"email\":\"bob@example.com\",\"age\":28}]}}";; 782 + val org_json : Jsont.json = 783 + {"organization":{"owner":{"name":"Alice","email":"alice@example.com","age":35},"members":[{"name":"Bob","email":"bob@example.com","age":28}]}} 784 + # Jsont.Json.decode 785 + (path (of_string_nav "/organization/owner/name") Jsont.string) 786 + org_json 787 + |> Result.get_ok;; 788 + - : string = "Alice" 789 + # Jsont.Json.decode 790 + (path (of_string_nav "/organization/members/0/age") Jsont.int) 791 + org_json 792 + |> Result.get_ok;; 793 + - : int = 28 794 + ]x} 795 + 796 + {2 Comparison: Raw vs Typed Access} 797 + 798 + {b Raw access} requires pattern matching: 799 + 800 + {@ocaml[ 801 + # let raw_port = 802 + match get (of_string_nav "/database/port") config_json with 803 + | Jsont.Number (f, _) -> int_of_float f 804 + | _ -> failwith "expected number";; 805 + val raw_port : int = 5432 806 + ]} 807 + 808 + {b Typed access} is cleaner and type-safe: 809 + 810 + {@ocaml[ 811 + # let typed_port = 812 + Jsont.Json.decode 813 + (path (of_string_nav "/database/port") Jsont.int) 814 + config_json 815 + |> Result.get_ok;; 816 + val typed_port : int = 5432 817 + ]} 818 + 819 + The typed approach catches mismatches at decode time with clear errors. 
820 + 821 + {1 Summary} 822 + 823 + JSON Pointer (RFC 6901) provides a simple but powerful way to address 824 + values within JSON documents: 825 + 826 + {ol 827 + {- {b Syntax}: Pointers are strings of [/]-separated reference tokens} 828 + {- {b Escaping}: Use [~0] for [~] and [~1] for [/] in tokens (handled automatically by the library)} 829 + {- {b Evaluation}: Tokens navigate through objects (by key) and arrays (by index)} 830 + {- {b URI Encoding}: Pointers can be percent-encoded for use in URIs} 831 + {- {b Mutations}: Combined with JSON Patch (RFC 6902), pointers enable structured updates} 832 + {- {b Type Safety}: Phantom types ([nav t] vs [append t]) prevent misuse of append pointers with retrieval operations} 833 + } 834 + 835 + The [jsont-pointer] library implements all of this with type-safe OCaml 836 + interfaces, integration with the [jsont] codec system, and proper error 837 + handling for malformed pointers and missing values. 838 + 839 + {2 Key Points on JSON Pointer vs JSON Path} 840 + 841 + {ul 842 + {- {b JSON Pointer} addresses a {e single} location (like a file path)} 843 + {- {b JSON Path} queries for {e multiple} values (like a search)} 844 + {- The [-] token is unique to JSON Pointer - it means "append position" for arrays} 845 + {- The library uses phantom types to enforce that [-] (append) pointers cannot be used with [get]/[find]} 846 + }
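As a closing illustration of the escaping rule summarised above, here is a standalone sketch of token escaping and unescaping. The functions are hypothetical stand-ins written for this example, not the library's [Token] module; in particular, a stray [~] that is not followed by [0] or [1] is passed through unchanged here, which may differ from the library.

```ocaml
(* Escape a raw member name for use as a reference token. *)
let escape (token : string) : string =
  let b = Buffer.create (String.length token) in
  String.iter
    (function
      | '~' -> Buffer.add_string b "~0"
      | '/' -> Buffer.add_string b "~1"
      | c -> Buffer.add_char b c)
    token;
  Buffer.contents b

(* Unescape in a single left-to-right pass. Each "~0" or "~1" is consumed
   exactly once, so "~01" becomes "~1" and never "/". This gives the same
   result as the two-pass order required by RFC 6901. *)
let unescape (token : string) : string =
  let n = String.length token in
  let b = Buffer.create n in
  let rec go i =
    if i >= n then ()
    else if token.[i] = '~' && i + 1 < n && token.[i + 1] = '0' then begin
      Buffer.add_char b '~'; go (i + 2)
    end else if token.[i] = '~' && i + 1 < n && token.[i + 1] = '1' then begin
      Buffer.add_char b '/'; go (i + 2)
    end else begin
      Buffer.add_char b token.[i]; go (i + 1)
    end
  in
  go 0;
  Buffer.contents b
```

The single-pass design avoids the ordering pitfall entirely, because output characters are never rescanned.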
+18 -9
src/top/jsont_pointer_top.ml
··· 28 28 "Jsont_pointer_top.json_printer"; 29 29 "Jsont_pointer_top.error_printer" ] 30 30 31 - let eval_string 32 - ?(print_outcome = false) ?(err_formatter = Format.err_formatter) str = 33 - let lexbuf = Lexing.from_string str in 34 - let phrase = !Toploop.parse_toplevel_phrase lexbuf in 35 - Toploop.execute_phrase print_outcome err_formatter phrase 31 + (* Suppress stderr during printer installation to avoid noise in MDX tests *) 32 + let null_formatter = Format.make_formatter (fun _ _ _ -> ()) (fun () -> ()) 33 + 34 + let eval_string_quiet str = 35 + try 36 + let lexbuf = Lexing.from_string str in 37 + let phrase = !Toploop.parse_toplevel_phrase lexbuf in 38 + Toploop.execute_phrase false null_formatter phrase 39 + with _ -> false 36 40 37 - let rec install_printers = function 41 + let rec do_install_printers = function 38 42 | [] -> true 39 43 | printer :: rest -> 40 44 let cmd = Printf.sprintf "#install_printer %s;;" printer in 41 - eval_string cmd && install_printers rest 45 + eval_string_quiet cmd && do_install_printers rest 46 + 47 + let install () = 48 + (* Silently ignore failures - this handles non-toplevel contexts like MDX *) 49 + ignore (do_install_printers printers) 42 50 51 + (* Only auto-install when OCAML_TOPLEVEL_NAME is set, indicating a real toplevel *) 43 52 let () = 44 - if not (install_printers printers) then 45 - Format.eprintf "Problem installing jsont-pointer printers@." 53 + if Sys.getenv_opt "OCAML_TOPLEVEL_NAME" <> None then 54 + install ()
+5
src/top/jsont_pointer_top.mli
··· 38 38 val error_printer : Format.formatter -> Jsont.Error.t -> unit 39 39 (** [error_printer] formats a {!Jsont.Error.t} as a human-readable 40 40 error message. Suitable for use with [#install_printer]. *) 41 + 42 + val install : unit -> unit 43 + (** [install ()] installs all printers. This is called automatically when 44 + the library is loaded in a toplevel (detected via [OCAML_TOPLEVEL_NAME]), and can be called explicitly elsewhere (e.g., in 45 + test environments where automatic initialization doesn't run). Failures are silently ignored. *)