# JSON Pointer Tutorial

This tutorial introduces JSON Pointer as defined in
[RFC 6901](https://www.rfc-editor.org/rfc/rfc6901), and demonstrates
the `jsont-pointer` OCaml library through interactive examples.

## What is JSON Pointer?

From RFC 6901, Section 1:

> JSON Pointer defines a string syntax for identifying a specific value
> within a JavaScript Object Notation (JSON) document.

In other words, JSON Pointer is an addressing scheme for locating values
inside a JSON structure. Think of it like a filesystem path, but for JSON
documents instead of files.

For example, given this JSON document:

```json
{
  "users": [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25}
  ]
}
```

The JSON Pointer `/users/0/name` refers to the string `"Alice"`.

In OCaml, this is represented by the `Jsont_pointer.t` type - a sequence
of navigation steps from the document root to a target value.

## Syntax: Reference Tokens

RFC 6901, Section 3 defines the syntax:

> A JSON Pointer is a Unicode string containing a sequence of zero or more
> reference tokens, each prefixed by a '/' (%x2F) character.

The grammar is elegantly simple:

```
json-pointer    = *( "/" reference-token )
reference-token = *( unescaped / escaped )
```

This means:
- The empty string `""` is a valid pointer (it refers to the whole document)
- Every non-empty pointer starts with `/`
- Everything between `/` characters is a "reference token"

Let's see this in action. We can parse pointers and see their structure:

```sh
$ jsonpp parse ""
OK: []
```

The empty pointer has no reference tokens - it points to the root.

```sh
$ jsonpp parse "/foo"
OK: [Mem:foo]
```

The pointer `/foo` has one token: `foo`. Since it's not a number, it's
interpreted as an object member name (`Mem`).

```sh
$ jsonpp parse "/foo/0"
OK: [Mem:foo, Nth:0]
```

Here we have two tokens: `foo` (a member name) and `0` (interpreted as
an array index `Nth`).
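The grammar above can be sketched with nothing but the standard library. This is a minimal illustration, not the `jsont-pointer` implementation: `split_tokens` is a hypothetical helper that only splits a pointer string into its raw (still escaped) reference tokens and rejects strings that are neither empty nor start with `/`.

```ocaml
(* Hypothetical helper, not part of jsont-pointer: split a JSON Pointer
   string into its raw reference tokens (escapes are left untouched). *)
let split_tokens (s : string) : (string list, string) result =
  if s = "" then Ok [] (* the empty pointer has no tokens *)
  else if s.[0] <> '/' then Error ("must be empty or start with '/': " ^ s)
  else
    (* Drop the leading '/', then split what remains on every '/'. *)
    Ok (String.split_on_char '/' (String.sub s 1 (String.length s - 1)))

let () =
  match split_tokens "/foo/0" with
  | Ok tokens -> List.iter print_endline tokens (* prints "foo" then "0" *)
  | Error msg -> prerr_endline msg
```

Because each token is *prefixed* by `/`, the number of tokens equals the number of `/` characters in the pointer; under this sketch a lone `/` therefore yields one empty token rather than none.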
```sh
$ jsonpp parse "/foo/bar/baz"
OK: [Mem:foo, Mem:bar, Mem:baz]
```

Multiple tokens navigate deeper into nested structures.

### The Index Type

Each reference token becomes an `Index.t` value in the library:

```ocaml
type t =
  | Mem of string   (* Object member access *)
  | Nth of int      (* Array index access *)
  | End             (* The special "-" marker for append operations *)
```

The `Mem` variant holds the **unescaped** member name - you work with the
actual key string (like `"a/b"`) and the library handles any escaping needed
for the JSON Pointer string representation.

### Invalid Syntax

What happens if a pointer doesn't start with `/`?

```sh
$ jsonpp parse "foo"
ERROR: Invalid JSON Pointer: must be empty or start with '/': foo
```

The RFC is strict: non-empty pointers MUST start with `/`.

## Evaluation: Navigating JSON

Now we come to the heart of JSON Pointer: evaluation. RFC 6901, Section 4
describes how a pointer is resolved against a JSON document:

> Evaluation of a JSON Pointer begins with a reference to the root value
> of a JSON document and completes with a reference to some value within
> the document. Each reference token in the JSON Pointer is evaluated
> sequentially.

In the library, this is the `Jsont_pointer.get` function:

```ocaml
val get : t -> Jsont.json -> Jsont.json
```

Let's use the example JSON document from RFC 6901, Section 5:

```sh
$ cat rfc6901_example.json
{
  "foo": ["bar", "baz"],
  "": 0,
  "a/b": 1,
  "c%d": 2,
  "e^f": 3,
  "g|h": 4,
  "i\\j": 5,
  "k\"l": 6,
  " ": 7,
  "m~n": 8
}
```

This document is carefully constructed to exercise various edge cases!

### The Root Pointer

```sh
$ jsonpp eval rfc6901_example.json ""
OK: {"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}
```

The empty pointer returns the whole document. In OCaml, this is
`Jsont_pointer.root`:

```ocaml
val root : t
(** The empty pointer that references the whole document. *)
```

### Object Member Access

```sh
$ jsonpp eval rfc6901_example.json "/foo"
OK: ["bar","baz"]
```

`/foo` accesses the member named `foo`, which is an array.

### Array Index Access

```sh
$ jsonpp eval rfc6901_example.json "/foo/0"
OK: "bar"
```

`/foo/0` first goes to `foo`, then accesses index 0 of the array.

```sh
$ jsonpp eval rfc6901_example.json "/foo/1"
OK: "baz"
```

Index 1 gives us the second element.

### Empty String as Key

JSON allows empty strings as object keys:

```sh
$ jsonpp eval rfc6901_example.json "/"
OK: 0
```

The pointer `/` has one token: the empty string. This accesses the member
with an empty name.

### Keys with Special Characters

The RFC example includes keys with `/` and `~` characters:

```sh
$ jsonpp eval rfc6901_example.json "/a~1b"
OK: 1
```

The token `a~1b` refers to the key `a/b`. We'll explain this escaping
[below](#escaping-special-characters).

```sh
$ jsonpp eval rfc6901_example.json "/m~0n"
OK: 8
```

The token `m~0n` refers to the key `m~n`.

**Important**: When using the OCaml library programmatically, you don't need
to worry about escaping. The `Index.Mem` variant holds the literal key name:

```ocaml
(* To access the key "a/b", just use the literal string *)
let pointer = Jsont_pointer.make [Mem "a/b"]

(* The library escapes it when converting to string *)
let s = Jsont_pointer.to_string pointer  (* "/a~1b" *)
```

### Other Special Characters (No Escaping Needed)

Most characters don't need escaping in JSON Pointer strings:

```sh
$ jsonpp eval rfc6901_example.json "/c%d"
OK: 2
```

```sh
$ jsonpp eval rfc6901_example.json "/e^f"
OK: 3
```

```sh
$ jsonpp eval rfc6901_example.json "/g|h"
OK: 4
```

```sh
$ jsonpp eval rfc6901_example.json "/ "
OK: 7
```

Even a space is a valid key character!

### Error Conditions

What happens when we try to access something that doesn't exist?
```sh
$ jsonpp eval rfc6901_example.json "/nonexistent"
ERROR: JSON Pointer: member 'nonexistent' not found
File "-":
```

Or an out-of-bounds array index:

```sh
$ jsonpp eval rfc6901_example.json "/foo/99"
ERROR: JSON Pointer: index 99 out of bounds (array has 2 elements)
File "-":
```

Or try to index into a non-container:

```sh
$ jsonpp eval rfc6901_example.json "/foo/0/invalid"
ERROR: JSON Pointer: cannot index into string with 'invalid'
File "-":
```

The library provides both exception-raising and result-returning variants:

```ocaml
val get : t -> Jsont.json -> Jsont.json
val get_result : t -> Jsont.json -> (Jsont.json, Jsont.Error.t) result
val find : t -> Jsont.json -> Jsont.json option
```

### Array Index Rules

RFC 6901 has specific rules for array indices. Section 4 states:

> characters comprised of digits [...] that represent an unsigned base-10
> integer value, making the new referenced value the array element with
> the zero-based index identified by the token

And importantly:

> note that leading zeros are not allowed

```sh
$ jsonpp parse "/foo/0"
OK: [Mem:foo, Nth:0]
```

Zero itself is fine.

```sh
$ jsonpp parse "/foo/01"
OK: [Mem:foo, Mem:01]
```

But `01` has a leading zero, so it's NOT treated as an array index - it
becomes a member name instead. This protects against accidental octal
interpretation.

## The End-of-Array Marker: `-`

RFC 6901, Section 4 introduces a special token:

> exactly the single character "-", making the new referenced value the
> (nonexistent) member after the last array element.

This is primarily useful for JSON Patch operations (RFC 6902). Let's see
how it parses:

```sh
$ jsonpp parse "/foo/-"
OK: [Mem:foo, End]
```

The `-` is recognized as a special `End` index.
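The token rules and the sequential evaluation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: the `json` type, `classify`, and `eval` are hypothetical stand-ins (the real library works on `Jsont.json`), and the error messages are invented for the example. One design assumption worth calling out: when the current value is an object, every token is treated as a member name, even if it looks like an index.

```ocaml
(* Hypothetical miniature JSON type; the real library uses Jsont.json. *)
type json =
  | Null
  | Bool of bool
  | Num of float
  | Str of string
  | Arr of json list
  | Obj of (string * json) list

type index = Mem of string | Nth of int | End

(* Classify an already-unescaped token: "-" is End, "0" or digits without
   a leading zero is Nth, anything else is a member name. *)
let classify (tok : string) : index =
  let n = String.length tok in
  let rec digits i =
    i = n || (tok.[i] >= '0' && tok.[i] <= '9' && digits (i + 1))
  in
  if tok = "-" then End
  else if n > 0 && digits 0 && (n = 1 || tok.[0] <> '0') then
    (match int_of_string_opt tok with
     | Some k -> Nth k
     | None -> Mem tok (* too large for int: fall back to a name *))
  else Mem tok

(* Evaluate each index in turn, starting from the document root. *)
let eval (ptr : index list) (doc : json) : (json, string) result =
  let step acc idx =
    match acc with
    | Error _ as e -> e
    | Ok (Obj members) ->
        (* Assumption: on objects, every token acts as a member name. *)
        let key =
          match idx with Mem k -> k | Nth k -> string_of_int k | End -> "-"
        in
        (match List.assoc_opt key members with
         | Some v -> Ok v
         | None -> Error (Printf.sprintf "member '%s' not found" key))
    | Ok (Arr items) ->
        (match idx with
         | Nth k when k >= 0 ->
             (match List.nth_opt items k with
              | Some v -> Ok v
              | None -> Error (Printf.sprintf "index %d out of bounds" k))
         | Nth k -> Error (Printf.sprintf "negative index %d" k)
         | End -> Error "'-' refers to a nonexistent array element"
         | Mem k -> Error (Printf.sprintf "invalid array index '%s'" k))
    | Ok _ -> Error "cannot index into a scalar value"
  in
  List.fold_left step (Ok doc) ptr
```

With this sketch, `classify "01"` gives `Mem "01"` and `classify "-"` gives `End`, matching the parse output shown above.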
However, you cannot evaluate a pointer containing `-` because it refers
to a position that doesn't exist:

```sh
$ jsonpp eval rfc6901_example.json "/foo/-"
ERROR: JSON Pointer: '-' (end marker) refers to nonexistent array element
File "-":
```

The RFC explains this:

> Note that the use of the "-" character to index an array will always
> result in such an error condition because by definition it refers to
> a nonexistent array element.

But we'll see later that `-` is very useful for mutation operations!

## Mutation Operations

While RFC 6901 defines JSON Pointer for read-only access, RFC 6902
(JSON Patch) uses JSON Pointer for modifications. The `jsont-pointer`
library provides these operations.

### Add

The `add` operation inserts a value at a location:

```sh
$ jsonpp add '{"foo":"bar"}' '/baz' '"qux"'
{"foo":"bar","baz":"qux"}
```

In OCaml:

```ocaml
val add : t -> Jsont.json -> value:Jsont.json -> Jsont.json
```

For arrays, `add` inserts BEFORE the specified index:

```sh
$ jsonpp add '{"foo":["a","b"]}' '/foo/1' '"X"'
{"foo":["a","X","b"]}
```

This is where the `-` marker shines - it appends to the end:

```sh
$ jsonpp add '{"foo":["a","b"]}' '/foo/-' '"c"'
{"foo":["a","b","c"]}
```

### Remove

The `remove` operation deletes a value:

```sh
$ jsonpp remove '{"foo":"bar","baz":"qux"}' '/baz'
{"foo":"bar"}
```

For arrays, it removes and shifts:

```sh
$ jsonpp remove '{"foo":["a","b","c"]}' '/foo/1'
{"foo":["a","c"]}
```

### Replace

The `replace` operation updates an existing value:

```sh
$ jsonpp replace '{"foo":"bar"}' '/foo' '"baz"'
{"foo":"baz"}
```

Unlike `add`, `replace` requires the target to already exist:

```sh
$ jsonpp replace '{"foo":"bar"}' '/nonexistent' '"value"'
ERROR: JSON Pointer: member 'nonexistent' not found
File "-":
```

### Move

The `move` operation relocates a value:

```sh
$ jsonpp move '{"foo":{"bar":"baz"},"qux":{}}' '/foo/bar' '/qux/thud'
{"foo":{},"qux":{"thud":"baz"}}
```

### Copy

The `copy` operation duplicates a value:

```sh
$ jsonpp copy '{"foo":{"bar":"baz"}}' '/foo/bar' '/foo/qux'
{"foo":{"bar":"baz","qux":"baz"}}
```

### Test

The `test` operation verifies a value (useful in JSON Patch):

```sh
$ jsonpp test '{"foo":"bar"}' '/foo' '"bar"'
true
```

```sh
$ jsonpp test '{"foo":"bar"}' '/foo' '"baz"'
false
```

## Escaping Special Characters

RFC 6901, Section 3 explains the escaping rules:

> Because the characters '\~' (%x7E) and '/' (%x2F) have special meanings
> in JSON Pointer, '\~' needs to be encoded as '\~0' and '/' needs to be
> encoded as '\~1' when these characters appear in a reference token.

Why these specific characters?
- `/` separates tokens, so it must be escaped inside a token
- `~` is the escape character itself, so it must also be escaped

The escape sequences are:
- `~0` represents `~` (tilde)
- `~1` represents `/` (forward slash)

### The Library Handles Escaping Automatically

**Important**: When using `jsont-pointer` programmatically, you rarely need
to think about escaping. The `Index.Mem` variant stores unescaped strings,
and escaping happens automatically during serialization:

```ocaml
(* Create a pointer to key "a/b" - no escaping needed *)
let p = Jsont_pointer.make [Mem "a/b"]

(* Serialize to string - escaping happens automatically *)
let s = Jsont_pointer.to_string p  (* Returns "/a~1b" *)

(* Parse from string - unescaping happens automatically *)
let p' = Jsont_pointer.of_string "/a~1b"
(* p' contains [Mem "a/b"] - the unescaped key *)
```

The `Token` module exposes the escaping functions if you need them:

```ocaml
module Token : sig
  val escape : string -> string    (* "a/b" -> "a~1b" *)
  val unescape : string -> string  (* "a~1b" -> "a/b" *)
end
```

### Escaping in Action

Let's see escaping with the CLI tool:

```sh
$ jsonpp escape "hello"
hello
```

No special characters, no escaping needed.

```sh
$ jsonpp escape "a/b"
a~1b
```

The `/` becomes `~1`.

```sh
$ jsonpp escape "a~b"
a~0b
```

The `~` becomes `~0`.

```sh
$ jsonpp escape "~/"
~0~1
```

Both characters are escaped.
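For readers curious what such an escaping pair might look like, here is a minimal standard-library sketch. It is an assumption-laden stand-in, not the `Token` module's source: the names `escape` and `unescape` mirror the signature shown above, but this `unescape` returns a `result` instead of a plain string, and its error messages are made up for the example.

```ocaml
(* Hypothetical stand-in for Token.escape: '~' -> "~0", '/' -> "~1". *)
let escape (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '~' -> Buffer.add_string b "~0"
      | '/' -> Buffer.add_string b "~1"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Hypothetical stand-in for Token.unescape. A single left-to-right pass
   decodes each "~0" or "~1" exactly once, so decoded output is never
   re-scanned for further escapes. *)
let unescape (s : string) : (string, string) result =
  let n = String.length s in
  let b = Buffer.create n in
  let rec go i =
    if i >= n then Ok (Buffer.contents b)
    else if s.[i] <> '~' then (Buffer.add_char b s.[i]; go (i + 1))
    else if i + 1 >= n then Error "incomplete escape sequence at end"
    else
      match s.[i + 1] with
      | '0' -> Buffer.add_char b '~'; go (i + 2)
      | '1' -> Buffer.add_char b '/'; go (i + 2)
      | c -> Error (Printf.sprintf "invalid escape sequence ~%c" c)
  in
  go 0

let () =
  print_endline (escape "~/"); (* prints ~0~1 *)
  match unescape "a~1b" with
  | Ok key -> print_endline key (* prints a/b *)
  | Error msg -> prerr_endline msg
```

Scanning once from left to right is one way to keep the two functions exact inverses: for any key `k`, `unescape (escape k)` gives back `Ok k`.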
### Unescaping

And the reverse process:

```sh
$ jsonpp unescape "a~1b"
OK: a/b
```

```sh
$ jsonpp unescape "a~0b"
OK: a~b
```

### The Order Matters!

RFC 6901, Section 4 is careful to specify the unescaping order:

> Evaluation of each reference token begins by decoding any escaped
> character sequence. This is performed by first transforming any
> occurrence of the sequence '~1' to '/', and then transforming any
> occurrence of the sequence '~0' to '~'. By performing the substitutions
> in this order, an implementation avoids the error of turning '~01' first
> into '~1' and then into '/', which would be incorrect (the string '~01'
> correctly becomes '~1' after transformation).

Let's verify this tricky case:

```sh
$ jsonpp unescape "~01"
OK: ~1
```

If we unescaped `~0` first, `~01` would become `~1`, which would then become
`/`. But that's wrong! The sequence `~01` should become the literal string
`~1` (a tilde followed by the digit one).

Invalid escape sequences are rejected:

```sh
$ jsonpp unescape "~2"
ERROR: Invalid JSON Pointer: invalid escape sequence ~2
```

```sh
$ jsonpp unescape "hello~"
ERROR: Invalid JSON Pointer: incomplete escape sequence at end
```

## URI Fragment Encoding

JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:

> A JSON Pointer can be represented in a URI fragment identifier by
> encoding it into octets using UTF-8, while percent-encoding those
> characters not allowed by the fragment rule in RFC 3986.

This adds percent-encoding on top of the `~0`/`~1` escaping:

```sh
$ jsonpp uri-fragment "/foo"
OK: /foo -> /foo
```

Simple pointers often don't need percent-encoding.

```sh
$ jsonpp uri-fragment "/a~1b"
OK: /a~1b -> /a~1b
```

The `~1` escape stays as-is (it's valid in URI fragments).

```sh
$ jsonpp uri-fragment "/c%d"
OK: /c%d -> /c%25d
```

The `%` character must be percent-encoded as `%25` in URIs!

```sh
$ jsonpp uri-fragment "/ "
OK: /  -> /%20
```

Spaces become `%20`.
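The percent-encoding step can be approximated with a short standard-library sketch. This is not the library's implementation: `percent_encode_fragment` is a hypothetical function that takes an already `~`-escaped pointer string (assumed to be UTF-8) and encodes every byte that falls outside the `fragment` rule of RFC 3986. The library may make different choices about which optional characters it encodes.

```ocaml
(* Bytes that RFC 3986 allows literally in a fragment:
   unreserved / sub-delims / ":" / "@" / "/" / "?". *)
let fragment_safe (c : char) : bool =
  match c with
  | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' -> true
  | '-' | '.' | '_' | '~' -> true (* rest of unreserved *)
  | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '=' ->
      true (* sub-delims *)
  | ':' | '@' | '/' | '?' -> true
  | _ -> false

(* Hypothetical helper: percent-encode every other byte as %XX. A literal
   '%' is not in the safe set, so it becomes %25. *)
let percent_encode_fragment (pointer : string) : string =
  let b = Buffer.create (String.length pointer) in
  String.iter
    (fun c ->
      if fragment_safe c then Buffer.add_char b c
      else Buffer.add_string b (Printf.sprintf "%%%02X" (Char.code c)))
    pointer;
  Buffer.contents b

let () =
  print_endline (percent_encode_fragment "/c%d"); (* prints /c%25d *)
  print_endline (percent_encode_fragment "/ ")    (* prints /%20 *)
```

Since `~`, `0`, and `1` are all in the safe set, the `~0` and `~1` escapes pass through unchanged, which is why the two layers of encoding compose cleanly.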
The library provides functions for URI fragment encoding:

```ocaml
val to_uri_fragment : t -> string
val of_uri_fragment : string -> t
val jsont_uri_fragment : t Jsont.t
```

Here's the RFC example showing the URI fragment forms:

| JSON Pointer | URI Fragment | Value |
|-------------|-------------|-------|
| `""` | `#` | whole document |
| `"/foo"` | `#/foo` | `["bar", "baz"]` |
| `"/foo/0"` | `#/foo/0` | `"bar"` |
| `"/"` | `#/` | `0` |
| `"/a~1b"` | `#/a~1b` | `1` |
| `"/c%d"` | `#/c%25d` | `2` |
| `"/ "` | `#/%20` | `7` |
| `"/m~0n"` | `#/m~0n` | `8` |

## Deeply Nested Structures

JSON Pointer handles arbitrarily deep nesting:

```sh
$ jsonpp eval rfc6901_example.json "/foo/0"
OK: "bar"
```

For deeper structures, just add more path segments. With nested objects:

```sh
$ jsonpp add '{"a":{"b":{"c":"d"}}}' '/a/b/x' '"y"'
{"a":{"b":{"c":"d","x":"y"}}}
```

With nested arrays:

```sh
$ jsonpp add '{"arr":[[1,2],[3,4]]}' '/arr/0/1' '99'
{"arr":[[1,99,2],[3,4]]}
```

## Jsont Integration

The library integrates with the `Jsont` codec system, allowing you to
combine JSON Pointer navigation with typed decoding. This is powerful
because you can point to a location in a JSON document and decode it
directly to an OCaml type.
Let's set up our OCaml environment and explore these features:

```ocaml
# open Jsont_pointer;;
# let parse_json s =
    match Jsont_bytesrw.decode_string Jsont.json s with
    | Ok json -> json
    | Error e -> failwith e;;
val parse_json : string -> Jsont.json = <fun>
# let json_to_string json =
    match Jsont_bytesrw.encode_string ~format:Jsont.Minify Jsont.json json with
    | Ok s -> s
    | Error e -> failwith e;;
val json_to_string : Jsont.json -> string = <fun>
```

### Working with JSON Values

Let's create a sample configuration document:

```ocaml
# let config_json = parse_json {|{
    "database": {
      "host": "localhost",
      "port": 5432,
      "credentials": {"username": "admin", "password": "secret"}
    },
    "features": ["auth", "logging", "metrics"]
  }|};;
val config_json : Jsont.json =
  Jsont.Object
   ([(("database", <abstr>),
      Jsont.Object
       ([(("host", <abstr>), Jsont.String ("localhost", <abstr>));
         (("port", <abstr>), Jsont.Number (5432., <abstr>));
         (("credentials", <abstr>),
          Jsont.Object
           ([(("username", <abstr>), Jsont.String ("admin", <abstr>));
             (("password", <abstr>), Jsont.String ("secret", <abstr>))],
           <abstr>))],
       <abstr>));
     (("features", <abstr>),
      Jsont.Array
       ([Jsont.String ("auth", <abstr>); Jsont.String ("logging", <abstr>);
         Jsont.String ("metrics", <abstr>)],
       <abstr>))],
   <abstr>)
```

### Creating and Using Pointers

Create a pointer and use it to extract values:

```ocaml
# let host_ptr = of_string "/database/host";;
val host_ptr : t = <abstr>
# let host_value = get host_ptr config_json;;
val host_value : Jsont.json = Jsont.String ("localhost", <abstr>)
# match host_value with
  | Jsont.String (s, _) -> s
  | _ -> failwith "expected string";;
- : string = "localhost"
```

### Building Pointers Programmatically

Instead of parsing strings, you can build pointers from indices:

```ocaml
# let port_ptr = make [Mem "database"; Mem "port"];;
val port_ptr : t = <abstr>
# to_string port_ptr;;
- : string = "/database/port"
# match get port_ptr config_json with
  | Jsont.Number (n, _) -> int_of_float n
  | _ -> failwith "expected number";;
- : int = 5432
```

For array access, use `Nth`:

```ocaml
# let first_feature_ptr = make [Mem "features"; Nth 0];;
val first_feature_ptr : t = <abstr>
# match get first_feature_ptr config_json with
  | Jsont.String (s, _) -> s
  | _ -> failwith "expected string";;
- : string = "auth"
```

### Pointer Navigation

You can build pointers incrementally using `append`:

```ocaml
# let db_ptr = of_string "/database";;
val db_ptr : t = <abstr>
# let creds_ptr = append db_ptr (Mem "credentials");;
val creds_ptr : t = <abstr>
# let user_ptr = append creds_ptr (Mem "username");;
val user_ptr : t = <abstr>
# to_string user_ptr;;
- : string = "/database/credentials/username"
# match get user_ptr config_json with
  | Jsont.String (s, _) -> s
  | _ -> failwith "expected string";;
- : string = "admin"
```

### Safe Access with `find`

Use `find` when you're not sure if a path exists:

```ocaml
# find (of_string "/database/timeout") config_json;;
- : Jsont.json option = None
# find (of_string "/database/host") config_json |> Option.is_some;;
- : bool = true
```

### Typed Access with `path`

The `path` combinator combines pointer navigation with typed decoding:

```ocaml
# let db_host =
    Jsont.Json.decode
      (path (of_string "/database/host") Jsont.string)
      config_json
    |> Result.get_ok;;
val db_host : string = "localhost"
# let db_port =
    Jsont.Json.decode
      (path (of_string "/database/port") Jsont.int)
      config_json
    |> Result.get_ok;;
val db_port : int = 5432
```

Extract a list of strings:

```ocaml
# let features =
    Jsont.Json.decode
      (path (of_string "/features") Jsont.(list string))
      config_json
    |> Result.get_ok;;
val features : string list = ["auth"; "logging"; "metrics"]
```

### Default Values with `~absent`

Use `~absent` to provide a default when a path doesn't exist:

```ocaml
# let timeout =
    Jsont.Json.decode
      (path ~absent:30 (of_string "/database/timeout") Jsont.int)
      config_json
    |> Result.get_ok;;
val timeout : int = 30
```

### Mutation Operations

The library provides mutation functions for modifying JSON:

```ocaml
# let sample = parse_json {|{"name": "Alice", "scores": [85, 92, 78]}|};;
val sample : Jsont.json =
  Jsont.Object
   ([(("name", <abstr>), Jsont.String ("Alice", <abstr>));
     (("scores", <abstr>),
      Jsont.Array
       ([Jsont.Number (85., <abstr>); Jsont.Number (92., <abstr>);
         Jsont.Number (78., <abstr>)],
       <abstr>))],
   <abstr>)
```

**Add** a new field:

```ocaml
# let with_email = add (of_string "/email") sample
    ~value:(Jsont.Json.string "alice@example.com");;
val with_email : Jsont.json =
  Jsont.Object
   ([(("name", <abstr>), Jsont.String ("Alice", <abstr>));
     (("scores", <abstr>),
      Jsont.Array
       ([Jsont.Number (85., <abstr>); Jsont.Number (92., <abstr>);
         Jsont.Number (78., <abstr>)],
       <abstr>));
     (("email", <abstr>), Jsont.String ("alice@example.com", <abstr>))],
   <abstr>)
# json_to_string with_email;;
- : string =
"{\"name\":\"Alice\",\"scores\":[85,92,78],\"email\":\"alice@example.com\"}"
```

**Add** to an array using `-` (append):

```ocaml
# let with_new_score = add (of_string "/scores/-") sample
    ~value:(Jsont.Json.number 95.);;
val with_new_score : Jsont.json =
  Jsont.Object
   ([(("name", <abstr>), Jsont.String ("Alice", <abstr>));
     (("scores", <abstr>),
      Jsont.Array
       ([Jsont.Number (85., <abstr>); Jsont.Number (92., <abstr>);
         Jsont.Number (78., <abstr>); Jsont.Number (95., <abstr>)],
       <abstr>))],
   <abstr>)
# json_to_string with_new_score;;
- : string = "{\"name\":\"Alice\",\"scores\":[85,92,78,95]}"
```

**Replace** an existing value:

```ocaml
# let renamed = replace (of_string "/name") sample
    ~value:(Jsont.Json.string "Bob");;
val renamed : Jsont.json =
  Jsont.Object
   ([(("name", <abstr>), Jsont.String ("Bob", <abstr>));
     (("scores", <abstr>),
      Jsont.Array
       ([Jsont.Number (85., <abstr>); Jsont.Number (92., <abstr>);
         Jsont.Number (78., <abstr>)],
       <abstr>))],
   <abstr>)
# json_to_string renamed;;
- : string = "{\"name\":\"Bob\",\"scores\":[85,92,78]}"
```

**Remove** a value:

```ocaml
# let without_first = remove (of_string "/scores/0") sample;;
val without_first : Jsont.json =
  Jsont.Object
   ([(("name", <abstr>), Jsont.String ("Alice", <abstr>));
     (("scores", <abstr>),
      Jsont.Array
       ([Jsont.Number (92., <abstr>); Jsont.Number (78., <abstr>)], <abstr>))],
   <abstr>)
# json_to_string without_first;;
- : string = "{\"name\":\"Alice\",\"scores\":[92,78]}"
```

### Nested Path Extraction

You can extract values from deeply nested structures:

```ocaml
# let org_json = parse_json {|{
    "organization": {
      "owner": {"name": "Alice", "email": "alice@example.com", "age": 35},
      "members": [{"name": "Bob", "email": "bob@example.com", "age": 28}]
    }
  }|};;
val org_json : Jsont.json =
  Jsont.Object
   ([(("organization", <abstr>),
      Jsont.Object
       ([(("owner", <abstr>),
          Jsont.Object
           ([(("name", <abstr>), Jsont.String ("Alice", <abstr>));
             (("email", <abstr>),
              Jsont.String ("alice@example.com", <abstr>));
             (("age", <abstr>), Jsont.Number (35., <abstr>))],
           <abstr>));
         (("members", <abstr>),
          Jsont.Array
           ([Jsont.Object
              ([(("name", <abstr>), Jsont.String ("Bob", <abstr>));
                (("email", <abstr>),
                 Jsont.String ("bob@example.com", <abstr>));
                (("age", <abstr>), Jsont.Number (28., <abstr>))],
              <abstr>)],
           <abstr>))],
       <abstr>))],
   <abstr>)
# Jsont.Json.decode
    (path (of_string "/organization/owner/name") Jsont.string)
    org_json
  |> Result.get_ok;;
- : string = "Alice"
# Jsont.Json.decode
    (path (of_string "/organization/members/0/age") Jsont.int)
    org_json
  |> Result.get_ok;;
- : int = 28
```

### Comparison: Raw vs Typed Access

**Raw access** requires pattern matching:

```ocaml
# let raw_port =
    match get (of_string "/database/port") config_json with
    | Jsont.Number (f, _) -> int_of_float f
    | _ -> failwith "expected number";;
val raw_port : int = 5432
```

**Typed access** is cleaner and type-safe:

```ocaml
# let typed_port =
    Jsont.Json.decode
      (path (of_string "/database/port") Jsont.int)
      config_json
    |> Result.get_ok;;
val typed_port : int = 5432
```

The typed approach catches mismatches at decode time with clear errors.

## Summary

JSON Pointer (RFC 6901) provides a simple but powerful way to address
values within JSON documents:

1. **Syntax**: Pointers are strings of `/`-separated reference tokens
2. **Escaping**: Use `~0` for `~` and `~1` for `/` in tokens (handled automatically by the library)
3. **Evaluation**: Tokens navigate through objects (by key) and arrays (by index)
4. **URI Encoding**: Pointers can be percent-encoded for use in URIs
5. **Mutations**: Combined with JSON Patch (RFC 6902), pointers enable structured updates

The `jsont-pointer` library implements all of this with type-safe OCaml
interfaces, integration with the `jsont` codec system, and proper error
handling for malformed pointers and missing values.