# JSON Pointer Tutorial

This tutorial introduces JSON Pointer as defined in [RFC 6901](https://www.rfc-editor.org/rfc/rfc6901), and demonstrates the `jsont-pointer` OCaml library through interactive examples.

## Setup

First, let's set up our environment with helper functions:

```ocaml
# open Jsont_pointer;;
# #install_printer Jsont_pointer_top.printer;;
# #install_printer Jsont_pointer_top.json_printer;;
# #install_printer Jsont_pointer_top.error_printer;;
# let parse_json s =
    match Jsont_bytesrw.decode_string Jsont.json s with
    | Ok json -> json
    | Error e -> failwith e;;
val parse_json : string -> Jsont.json = <fun>
```

## What is JSON Pointer?

From RFC 6901, Section 1:

> JSON Pointer defines a string syntax for identifying a specific value
> within a JavaScript Object Notation (JSON) document.

In other words, JSON Pointer is an addressing scheme for locating values inside a JSON structure. Think of it like a filesystem path, but for JSON documents instead of files.

For example, given this JSON document:

```ocaml
# let users_json = parse_json {|{
    "users": [
      {"name": "Alice", "age": 30},
      {"name": "Bob", "age": 25}
    ]
  }|};;
val users_json : Jsont.json =
  {"users":[{"name":"Alice","age":30},{"name":"Bob","age":25}]}
```

The JSON Pointer `/users/0/name` refers to the string `"Alice"`:

```ocaml
# let ptr = of_string "/users/0/name";;
val ptr : t = [`Mem "users"; `Nth 0; `Mem "name"]
# get ptr users_json;;
- : Jsont.json = "Alice"
```

In OCaml, this is represented by the `Jsont_pointer.t` type - a sequence of navigation steps from the document root to a target value.

## Syntax: Reference Tokens

RFC 6901, Section 3 defines the syntax:

> A JSON Pointer is a Unicode string containing a sequence of zero or more
> reference tokens, each prefixed by a '/' (%x2F) character.

The grammar is elegantly simple:

```
json-pointer    = *( "/" reference-token )
reference-token = *( unescaped / escaped )
```

This means:

- The empty string `""` is a valid pointer (it refers to the whole document)
- Every non-empty pointer starts with `/`
- Everything between `/` characters is a "reference token"

Let's see this in action:

```ocaml
# of_string "";;
- : t = []
```

The empty pointer has no reference tokens - it points to the root.

```ocaml
# of_string "/foo";;
- : t = [`Mem "foo"]
```

The pointer `/foo` has one token: `foo`. Since it's not a number, it's interpreted as an object member name (`Mem`).

```ocaml
# of_string "/foo/0";;
- : t = [`Mem "foo"; `Nth 0]
```

Here we have two tokens: `foo` (a member name) and `0` (interpreted as an array index `Nth`).

```ocaml
# of_string "/foo/bar/baz";;
- : t = [`Mem "foo"; `Mem "bar"; `Mem "baz"]
```

Multiple tokens navigate deeper into nested structures.

### The Index Type

Each reference token becomes an `Index.t` value in the library:

```ocaml
type t = [
  | `Mem of string  (* Object member access *)
  | `Nth of int     (* Array index access *)
  | `End            (* The special "-" marker for append operations *)
]
```

The `Mem` variant holds the **unescaped** member name - you work with the actual key string (like `"a/b"`) and the library handles any escaping needed for the JSON Pointer string representation.

### Invalid Syntax

What happens if a pointer doesn't start with `/`?

```ocaml
# of_string "foo";;
Exception: Jsont.Error Invalid JSON Pointer: must be empty or start with '/': foo.
```

The RFC is strict: non-empty pointers MUST start with `/`.

For safer parsing, use `of_string_result`:

```ocaml
# of_string_result "foo";;
- : (t, string) result =
Error "Invalid JSON Pointer: must be empty or start with '/': foo"
# of_string_result "/valid";;
- : (t, string) result = Ok [`Mem "valid"]
```

## Evaluation: Navigating JSON

Now we come to the heart of JSON Pointer: evaluation.
RFC 6901, Section 4 describes how a pointer is resolved against a JSON document:

> Evaluation of a JSON Pointer begins with a reference to the root value
> of a JSON document and completes with a reference to some value within
> the document. Each reference token in the JSON Pointer is evaluated
> sequentially.

Let's use the example JSON document from RFC 6901, Section 5:

```ocaml
# let rfc_example = parse_json {|{
    "foo": ["bar", "baz"],
    "": 0,
    "a/b": 1,
    "c%d": 2,
    "e^f": 3,
    "g|h": 4,
    "i\\j": 5,
    "k\"l": 6,
    " ": 7,
    "m~n": 8
  }|};;
val rfc_example : Jsont.json =
  {"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}
```

This document is carefully constructed to exercise various edge cases!

### The Root Pointer

```ocaml
# get root rfc_example ;;
- : Jsont.json =
{"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}
```

The empty pointer (`root`) returns the whole document.

### Object Member Access

```ocaml
# get (of_string "/foo") rfc_example ;;
- : Jsont.json = ["bar","baz"]
```

`/foo` accesses the member named `foo`, which is an array.

### Array Index Access

```ocaml
# get (of_string "/foo/0") rfc_example ;;
- : Jsont.json = "bar"
# get (of_string "/foo/1") rfc_example ;;
- : Jsont.json = "baz"
```

`/foo/0` first goes to `foo`, then accesses index 0 of the array.

### Empty String as Key

JSON allows empty strings as object keys:

```ocaml
# get (of_string "/") rfc_example ;;
- : Jsont.json = 0
```

The pointer `/` has one token: the empty string. This accesses the member with an empty name.

### Keys with Special Characters

The RFC example includes keys with `/` and `~` characters:

```ocaml
# get (of_string "/a~1b") rfc_example ;;
- : Jsont.json = 1
```

The token `a~1b` refers to the key `a/b`. We'll explain this escaping [below](#escaping-special-characters).

```ocaml
# get (of_string "/m~0n") rfc_example ;;
- : Jsont.json = 8
```

The token `m~0n` refers to the key `m~n`.

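To make the sequential evaluation described in this section concrete, here is a minimal, stdlib-only sketch. It is not how `jsont-pointer` is implemented: the `json` type and the names `unescape`, `tokens`, `is_index`, `step`, and `eval` are all hypothetical stand-ins chosen for illustration. Unlike the parsed `Mem`/`Nth` representation shown above, this sketch decides how to read a token from the kind of value it meets, which is how the RFC phrases it.

```ocaml
(* Hypothetical stand-in for a JSON value; the real library uses Jsont.json. *)
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list

(* Decode one token in a single left-to-right pass: "~0" -> "~", "~1" -> "/".
   A stray '~' is passed through unchanged (a strict parser would reject it). *)
let unescape token =
  let n = String.length token in
  let buf = Buffer.create n in
  let rec go i =
    if i < n then
      if token.[i] = '~' && i + 1 < n && token.[i + 1] = '0' then begin
        Buffer.add_char buf '~'; go (i + 2)
      end else if token.[i] = '~' && i + 1 < n && token.[i + 1] = '1' then begin
        Buffer.add_char buf '/'; go (i + 2)
      end else begin
        Buffer.add_char buf token.[i]; go (i + 1)
      end
  in
  go 0;
  Buffer.contents buf

(* Split a pointer string into unescaped tokens; None if it is malformed. *)
let tokens pointer =
  if pointer = "" then Some []
  else if pointer.[0] <> '/' then None
  else
    let rest = String.sub pointer 1 (String.length pointer - 1) in
    Some (List.map unescape (String.split_on_char '/' rest))

(* An array index is "0" or a digit string without a leading zero. *)
let is_index token =
  let n = String.length token in
  let digits = ref (n > 0) in
  String.iter (fun c -> if c < '0' || c > '9' then digits := false) token;
  !digits && (n = 1 || token.[0] <> '0')

(* Apply one token to the current value. *)
let step value token =
  match value with
  | Object members -> List.assoc_opt token members
  | Array items when is_index token ->
    Option.bind (int_of_string_opt token) (List.nth_opt items)
  | _ -> None

(* Start at the root and apply each token in order. *)
let eval pointer doc =
  match tokens pointer with
  | None -> None
  | Some toks ->
    List.fold_left
      (fun acc tok -> Option.bind acc (fun v -> step v tok))
      (Some doc) toks

(* A cut-down version of the RFC example document. *)
let doc =
  Object
    [ ("foo", Array [ String "bar"; String "baz" ]);
      ("", Number 0.);
      ("a/b", Number 1.);
      ("m~n", Number 8.) ]

let () = assert (eval "/foo/0" doc = Some (String "bar"))
let () = assert (eval "/a~1b" doc = Some (Number 1.))
```

Returning `option` keeps the sketch short; a real implementation would report which token failed and why.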
**Important**: When using the OCaml library programmatically, you don't need to worry about escaping. The `Mem` variant holds the literal key name:

```ocaml
# let slash_ptr = make [`Mem "a/b"];;
val slash_ptr : t = [`Mem "a/b"]
# to_string slash_ptr;;
- : string = "/a~1b"
# get slash_ptr rfc_example ;;
- : Jsont.json = 1
```

The library escapes it when converting to string.

### Other Special Characters (No Escaping Needed)

Most characters don't need escaping in JSON Pointer strings:

```ocaml
# get (of_string "/c%d") rfc_example ;;
- : Jsont.json = 2
# get (of_string "/e^f") rfc_example ;;
- : Jsont.json = 3
# get (of_string "/g|h") rfc_example ;;
- : Jsont.json = 4
# get (of_string "/ ") rfc_example ;;
- : Jsont.json = 7
```

Even a space is a valid key character!

### Error Conditions

What happens when we try to access something that doesn't exist?

```ocaml
# get_result (of_string "/nonexistent") rfc_example;;
- : (Jsont.json, Jsont.Error.t) result =
Error JSON Pointer: member 'nonexistent' not found
File "-":
# find (of_string "/nonexistent") rfc_example;;
- : Jsont.json option = None
```

Or an out-of-bounds array index:

```ocaml
# find (of_string "/foo/99") rfc_example;;
- : Jsont.json option = None
```

Or try to index into a non-container:

```ocaml
# find (of_string "/foo/0/invalid") rfc_example;;
- : Jsont.json option = None
```

The library provides exception-raising, result-returning, and option-returning variants:

```ocaml
val get : t -> Jsont.json -> Jsont.json
val get_result : t -> Jsont.json -> (Jsont.json, Jsont.Error.t) result
val find : t -> Jsont.json -> Jsont.json option
```

### Array Index Rules

RFC 6901 has specific rules for array indices. Section 4 states:

> characters comprised of digits [...] that represent an unsigned base-10
> integer value, making the new referenced value the array element with
> the zero-based index identified by the token

And importantly:

> note that leading zeros are not allowed

```ocaml
# of_string "/foo/0";;
- : t = [`Mem "foo"; `Nth 0]
```

Zero itself is fine.

```ocaml
# of_string "/foo/01";;
- : t = [`Mem "foo"; `Mem "01"]
```

But `01` has a leading zero, so it's NOT treated as an array index - it becomes a member name instead. This protects against accidental octal interpretation.

## The End-of-Array Marker: `-`

RFC 6901, Section 4 introduces a special token:

> exactly the single character "-", making the new referenced value the
> (nonexistent) member after the last array element.

This is primarily useful for JSON Patch operations (RFC 6902). Let's see how it parses:

```ocaml
# of_string "/foo/-";;
- : t = [`Mem "foo"; `End]
```

The `-` is recognized as a special `End` index.

However, you cannot evaluate a pointer containing `-` because it refers to a position that doesn't exist:

```ocaml
# find (of_string "/foo/-") rfc_example;;
- : Jsont.json option = None
```

The RFC explains this:

> Note that the use of the "-" character to index an array will always
> result in such an error condition because by definition it refers to
> a nonexistent array element.

But we'll see later that `-` is very useful for mutation operations!

## Mutation Operations

While RFC 6901 defines JSON Pointer for read-only access, RFC 6902 (JSON Patch) uses JSON Pointer for modifications. The `jsont-pointer` library provides these operations.
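Before moving on to the individual operations, it may help to see the token rules from the previous two sections in one place, since mutation relies on telling `Mem`, `Nth`, and `End` apart. The function below is a hypothetical, stdlib-only sketch of that classification, not the library's parser. The name `classify` and the fallback to `Mem` for digit strings too large for an OCaml `int` are assumptions made for this example.

```ocaml
(* Classify one unescaped reference token, following the rules above:
   "-" is the end marker, "0" or digits without a leading zero form an
   index, and everything else is a member name. *)
let classify token : [ `Mem of string | `Nth of int | `End ] =
  let n = String.length token in
  let digits = ref (n > 0) in
  String.iter (fun c -> if c < '0' || c > '9' then digits := false) token;
  if token = "-" then `End
  else if !digits && (n = 1 || token.[0] <> '0') then
    (* Assumption: an index that overflows int is treated as a member name. *)
    (match int_of_string_opt token with
     | Some i -> `Nth i
     | None -> `Mem token)
  else `Mem token

let () = assert (classify "0" = `Nth 0)
let () = assert (classify "01" = `Mem "01")
let () = assert (classify "-" = `End)
```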
### Add

The `add` operation inserts a value at a location:

```ocaml
# let obj = parse_json {|{"foo":"bar"}|};;
val obj : Jsont.json = {"foo":"bar"}
# add (of_string "/baz") obj ~value:(Jsont.Json.string "qux") ;;
- : Jsont.json = {"foo":"bar","baz":"qux"}
```

For arrays, `add` inserts BEFORE the specified index:

```ocaml
# let arr_obj = parse_json {|{"foo":["a","b"]}|};;
val arr_obj : Jsont.json = {"foo":["a","b"]}
# add (of_string "/foo/1") arr_obj ~value:(Jsont.Json.string "X") ;;
- : Jsont.json = {"foo":["a","X","b"]}
```

This is where the `-` marker shines - it appends to the end:

```ocaml
# add (of_string "/foo/-") arr_obj ~value:(Jsont.Json.string "c") ;;
- : Jsont.json = {"foo":["a","b","c"]}
```

### Remove

The `remove` operation deletes a value:

```ocaml
# let two_fields = parse_json {|{"foo":"bar","baz":"qux"}|};;
val two_fields : Jsont.json = {"foo":"bar","baz":"qux"}
# remove (of_string "/baz") two_fields ;;
- : Jsont.json = {"foo":"bar"}
```

For arrays, it removes and shifts:

```ocaml
# let three_elem = parse_json {|{"foo":["a","b","c"]}|};;
val three_elem : Jsont.json = {"foo":["a","b","c"]}
# remove (of_string "/foo/1") three_elem ;;
- : Jsont.json = {"foo":["a","c"]}
```

### Replace

The `replace` operation updates an existing value:

```ocaml
# replace (of_string "/foo") obj ~value:(Jsont.Json.string "baz") ;;
- : Jsont.json = {"foo":"baz"}
```

Unlike `add`, `replace` requires the target to already exist. Attempting to replace a nonexistent path raises an error.
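The array behaviour of `add`, `remove`, and `replace` can be modelled on plain OCaml lists. The sketch below is only an illustration of the semantics shown above, assuming hypothetical helper names (`insert_at`, `remove_at`, `replace_at`). It says nothing about how the library itself is written, and it returns `option` where the library raises or returns errors.

```ocaml
(* Insert [v] before position [i]. Position [List.length items] is allowed
   and appends, which is what the "-" token resolves to for arrays. *)
let rec insert_at i v items =
  match i, items with
  | 0, _ -> Some (v :: items)
  | _, [] -> None (* index past the end (or negative) *)
  | _, x :: rest -> Option.map (fun r -> x :: r) (insert_at (i - 1) v rest)

(* Remove the element at [i]; later elements shift down by one. *)
let rec remove_at i items =
  match i, items with
  | _, [] -> None
  | 0, _ :: rest -> Some rest
  | _, x :: rest -> Option.map (fun r -> x :: r) (remove_at (i - 1) rest)

(* Replace the element at [i]; unlike insertion, the slot must exist. *)
let rec replace_at i v items =
  match i, items with
  | _, [] -> None
  | 0, _ :: rest -> Some (v :: rest)
  | _, x :: rest -> Option.map (fun r -> x :: r) (replace_at (i - 1) v rest)

let () = assert (insert_at 1 "X" [ "a"; "b" ] = Some [ "a"; "X"; "b" ])
let () = assert (insert_at 2 "c" [ "a"; "b" ] = Some [ "a"; "b"; "c" ])
let () = assert (remove_at 1 [ "a"; "b"; "c" ] = Some [ "a"; "c" ])
```

The asymmetry between `insert_at` and `replace_at` at index `List.length items` is the list-level version of the difference between `add` and `replace`.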
### Move

The `move` operation relocates a value:

```ocaml
# let nested = parse_json {|{"foo":{"bar":"baz"},"qux":{}}|};;
val nested : Jsont.json = {"foo":{"bar":"baz"},"qux":{}}
# move ~from:(of_string "/foo/bar") ~path:(of_string "/qux/thud") nested ;;
- : Jsont.json = {"foo":{},"qux":{"thud":"baz"}}
```

### Copy

The `copy` operation duplicates a value:

```ocaml
# let to_copy = parse_json {|{"foo":{"bar":"baz"}}|};;
val to_copy : Jsont.json = {"foo":{"bar":"baz"}}
# copy ~from:(of_string "/foo/bar") ~path:(of_string "/foo/qux") to_copy ;;
- : Jsont.json = {"foo":{"bar":"baz","qux":"baz"}}
```

### Test

The `test` operation verifies a value (useful in JSON Patch):

```ocaml
# test (of_string "/foo") obj ~expected:(Jsont.Json.string "bar");;
- : bool = true
# test (of_string "/foo") obj ~expected:(Jsont.Json.string "wrong");;
- : bool = false
```

## Escaping Special Characters

RFC 6901, Section 3 explains the escaping rules:

> Because the characters '\~' (%x7E) and '/' (%x2F) have special meanings
> in JSON Pointer, '\~' needs to be encoded as '\~0' and '/' needs to be
> encoded as '\~1' when these characters appear in a reference token.

Why these specific characters?

- `/` separates tokens, so it must be escaped inside a token
- `~` is the escape character itself, so it must also be escaped

The escape sequences are:

- `~0` represents `~` (tilde)
- `~1` represents `/` (forward slash)

### The Library Handles Escaping Automatically

**Important**: When using `jsont-pointer` programmatically, you rarely need to think about escaping.
The `Mem` variant stores unescaped strings, and escaping happens automatically during serialization:

```ocaml
# let p = make [`Mem "a/b"];;
val p : t = [`Mem "a/b"]
# to_string p;;
- : string = "/a~1b"
# of_string "/a~1b";;
- : t = [`Mem "a/b"]
```

### Escaping in Action

The `Token` module exposes the escaping functions:

```ocaml
# Token.escape "hello";;
- : string = "hello"
# Token.escape "a/b";;
- : string = "a~1b"
# Token.escape "a~b";;
- : string = "a~0b"
# Token.escape "~/";;
- : string = "~0~1"
```

### Unescaping

And the reverse process:

```ocaml
# Token.unescape "a~1b";;
- : string = "a/b"
# Token.unescape "a~0b";;
- : string = "a~b"
```

### The Order Matters!

RFC 6901, Section 4 is careful to specify the unescaping order:

> Evaluation of each reference token begins by decoding any escaped
> character sequence. This is performed by first transforming any
> occurrence of the sequence '~1' to '/', and then transforming any
> occurrence of the sequence '~0' to '~'. By performing the substitutions
> in this order, an implementation avoids the error of turning '~01' first
> into '~1' and then into '/', which would be incorrect (the string '~01'
> correctly becomes '~1' after transformation).

Let's verify this tricky case:

```ocaml
# Token.unescape "~01";;
- : string = "~1"
```

If we unescaped `~0` first, `~01` would become `~1`, which would then become `/`. But that's wrong! The sequence `~01` should become the literal string `~1` (a tilde followed by the digit one).

## URI Fragment Encoding

JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:

> A JSON Pointer can be represented in a URI fragment identifier by
> encoding it into octets using UTF-8, while percent-encoding those
> characters not allowed by the fragment rule in RFC 3986.

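The encoding step quoted above can be sketched with the standard library alone. The code below is an illustration, not the library's `to_uri_fragment`: the names `allowed_in_fragment` and `to_fragment` are hypothetical, and the allowed set is my reading of the RFC 3986 `fragment` rule (unreserved characters, sub-delimiters, `:`, `@`, `/`, and `?`). It assumes the input is an already-escaped JSON Pointer held in a UTF-8 OCaml string, so working byte by byte percent-encodes every octet of a multi-byte character.

```ocaml
(* Characters that may appear literally in a URI fragment, per the
   RFC 3986 rule: fragment = *( pchar / "/" / "?" ). *)
let allowed_in_fragment c =
  match c with
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' -> true
  | '-' | '.' | '_' | '~' -> true (* unreserved *)
  | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '=' ->
    true (* sub-delims *)
  | ':' | '@' | '/' | '?' -> true
  | _ -> false

(* Percent-encode every byte that is not allowed in a fragment. *)
let to_fragment pointer =
  let buf = Buffer.create (String.length pointer) in
  String.iter
    (fun c ->
      if allowed_in_fragment c then Buffer.add_char buf c
      else Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    pointer;
  Buffer.contents buf

let () = assert (to_fragment "/c%d" = "/c%25d")
let () = assert (to_fragment "/ " = "/%20")
let () = assert (to_fragment "/a~1b" = "/a~1b")
```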
This adds percent-encoding on top of the `~0`/`~1` escaping:

```ocaml
# to_uri_fragment (of_string "/foo");;
- : string = "/foo"
# to_uri_fragment (of_string "/a~1b");;
- : string = "/a~1b"
# to_uri_fragment (of_string "/c%d");;
- : string = "/c%25d"
# to_uri_fragment (of_string "/ ");;
- : string = "/%20"
```

The `%` character must be percent-encoded as `%25` in URIs, and spaces become `%20`.

Here's the RFC example showing the URI fragment forms:

| JSON Pointer | URI Fragment | Value |
|-------------|-------------|-------|
| `""` | `#` | whole document |
| `"/foo"` | `#/foo` | `["bar", "baz"]` |
| `"/foo/0"` | `#/foo/0` | `"bar"` |
| `"/"` | `#/` | `0` |
| `"/a~1b"` | `#/a~1b` | `1` |
| `"/c%d"` | `#/c%25d` | `2` |
| `"/ "` | `#/%20` | `7` |
| `"/m~0n"` | `#/m~0n` | `8` |

## Building Pointers Programmatically

Instead of parsing strings, you can build pointers from indices:

```ocaml
# let port_ptr = make [`Mem "database"; `Mem "port"];;
val port_ptr : t = [`Mem "database"; `Mem "port"]
# to_string port_ptr;;
- : string = "/database/port"
```

For array access, use `Nth`:

```ocaml
# let first_feature_ptr = make [`Mem "features"; `Nth 0];;
val first_feature_ptr : t = [`Mem "features"; `Nth 0]
# to_string first_feature_ptr;;
- : string = "/features/0"
```

### Pointer Navigation

You can build pointers incrementally using `append`:

```ocaml
# let db_ptr = of_string "/database";;
val db_ptr : t = [`Mem "database"]
# let creds_ptr = append db_ptr (`Mem "credentials");;
val creds_ptr : t = [`Mem "database"; `Mem "credentials"]
# let user_ptr = append creds_ptr (`Mem "username");;
val user_ptr : t = [`Mem "database"; `Mem "credentials"; `Mem "username"]
# to_string user_ptr;;
- : string = "/database/credentials/username"
```

Or concatenate two pointers:

```ocaml
# let base = of_string "/api/v1";;
val base : t = [`Mem "api"; `Mem "v1"]
# let endpoint = of_string "/users/0";;
val endpoint : t = [`Mem "users"; `Nth 0]
# to_string (concat base endpoint);;
- : string = "/api/v1/users/0"
```

## Jsont Integration

The library integrates with the `Jsont` codec system, allowing you to combine JSON Pointer navigation with typed decoding. This is powerful because you can point to a location in a JSON document and decode it directly to an OCaml type.

```ocaml
# let config_json = parse_json {|{
    "database": {
      "host": "localhost",
      "port": 5432,
      "credentials": {"username": "admin", "password": "secret"}
    },
    "features": ["auth", "logging", "metrics"]
  }|};;
val config_json : Jsont.json =
  {"database":{"host":"localhost","port":5432,"credentials":{"username":"admin","password":"secret"}},"features":["auth","logging","metrics"]}
```

### Typed Access with `path`

The `path` combinator combines pointer navigation with typed decoding:

```ocaml
# let db_host =
    Jsont.Json.decode
      (path (of_string "/database/host") Jsont.string)
      config_json
    |> Result.get_ok;;
val db_host : string = "localhost"
# let db_port =
    Jsont.Json.decode
      (path (of_string "/database/port") Jsont.int)
      config_json
    |> Result.get_ok;;
val db_port : int = 5432
```

Extract a list of strings:

```ocaml
# let features =
    Jsont.Json.decode
      (path (of_string "/features") Jsont.(list string))
      config_json
    |> Result.get_ok;;
val features : string list = ["auth"; "logging"; "metrics"]
```

### Default Values with `~absent`

Use `~absent` to provide a default when a path doesn't exist:

```ocaml
# let timeout =
    Jsont.Json.decode
      (path ~absent:30 (of_string "/database/timeout") Jsont.int)
      config_json
    |> Result.get_ok;;
val timeout : int = 30
```

### Nested Path Extraction

You can extract values from deeply nested structures:

```ocaml
# let org_json = parse_json {|{
    "organization": {
      "owner": {"name": "Alice", "email": "alice@example.com", "age": 35},
      "members": [{"name": "Bob", "email": "bob@example.com", "age": 28}]
    }
  }|};;
val org_json : Jsont.json =
  {"organization":{"owner":{"name":"Alice","email":"alice@example.com","age":35},"members":[{"name":"Bob","email":"bob@example.com","age":28}]}}
# Jsont.Json.decode
    (path (of_string "/organization/owner/name") Jsont.string)
    org_json
  |> Result.get_ok;;
- : string = "Alice"
# Jsont.Json.decode
    (path (of_string "/organization/members/0/age") Jsont.int)
    org_json
  |> Result.get_ok;;
- : int = 28
```

### Comparison: Raw vs Typed Access

**Raw access** requires pattern matching:

```ocaml
# let raw_port =
    match get (of_string "/database/port") config_json with
    | Jsont.Number (f, _) -> int_of_float f
    | _ -> failwith "expected number";;
val raw_port : int = 5432
```

**Typed access** is cleaner and type-safe:

```ocaml
# let typed_port =
    Jsont.Json.decode
      (path (of_string "/database/port") Jsont.int)
      config_json
    |> Result.get_ok;;
val typed_port : int = 5432
```

The typed approach catches mismatches at decode time with clear errors.

## Summary

JSON Pointer (RFC 6901) provides a simple but powerful way to address values within JSON documents:

1. **Syntax**: Pointers are strings of `/`-separated reference tokens
2. **Escaping**: Use `~0` for `~` and `~1` for `/` in tokens (handled automatically by the library)
3. **Evaluation**: Tokens navigate through objects (by key) and arrays (by index)
4. **URI Encoding**: Pointers can be percent-encoded for use in URIs
5. **Mutations**: Combined with JSON Patch (RFC 6902), pointers enable structured updates

The `jsont-pointer` library implements all of this with type-safe OCaml interfaces, integration with the `jsont` codec system, and proper error handling for malformed pointers and missing values.