JSON Pointer Tutorial#
This tutorial introduces JSON Pointer as defined in
RFC 6901, and demonstrates
the jsont-pointer OCaml library through interactive examples.
Setup#
First, let's set up our environment with helper functions:
# open Jsont_pointer;;
# #install_printer Jsont_pointer_top.printer;;
# #install_printer Jsont_pointer_top.json_printer;;
# #install_printer Jsont_pointer_top.error_printer;;
# let parse_json s =
    match Jsont_bytesrw.decode_string Jsont.json s with
    | Ok json -> json
    | Error e -> failwith e;;
val parse_json : string -> Jsont.json = <fun>
What is JSON Pointer?#
From RFC 6901, Section 1:
JSON Pointer defines a string syntax for identifying a specific value within a JavaScript Object Notation (JSON) document.
In other words, JSON Pointer is an addressing scheme for locating values inside a JSON structure. Think of it like a filesystem path, but for JSON documents instead of files.
For example, given this JSON document:
# let users_json = parse_json {|{
    "users": [
      {"name": "Alice", "age": 30},
      {"name": "Bob", "age": 25}
    ]
  }|};;
val users_json : Jsont.json =
{"users":[{"name":"Alice","age":30},{"name":"Bob","age":25}]}
The JSON Pointer /users/0/name refers to the string "Alice":
# let ptr = of_string "/users/0/name";;
val ptr : t = [`Mem "users"; `Nth 0; `Mem "name"]
# get ptr users_json;;
- : Jsont.json = "Alice"
In OCaml, this is represented by the Jsont_pointer.t type - a sequence
of navigation steps from the document root to a target value.
Syntax: Reference Tokens#
RFC 6901, Section 3 defines the syntax:
A JSON Pointer is a Unicode string containing a sequence of zero or more reference tokens, each prefixed by a '/' (%x2F) character.
The grammar is elegantly simple:
json-pointer = *( "/" reference-token )
reference-token = *( unescaped / escaped )
This means:
- The empty string "" is a valid pointer (it refers to the whole document)
- Every non-empty pointer starts with /
- Everything between / characters is a "reference token"
Let's see this in action:
# of_string "";;
- : t = []
The empty pointer has no reference tokens - it points to the root.
# of_string "/foo";;
- : t = [`Mem "foo"]
The pointer /foo has one token: foo. Since it's not a number, it's
interpreted as an object member name (Mem).
# of_string "/foo/0";;
- : t = [`Mem "foo"; `Nth 0]
Here we have two tokens: foo (a member name) and 0 (interpreted as
an array index Nth).
# of_string "/foo/bar/baz";;
- : t = [`Mem "foo"; `Mem "bar"; `Mem "baz"]
Multiple tokens navigate deeper into nested structures.
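The splitting rule can be sketched in a few lines of plain OCaml. This is a minimal stdlib-only illustration, not the jsont-pointer implementation; split_tokens is a hypothetical name, and the tokens it returns are still in escaped form.

```ocaml
(* Split a JSON Pointer string into its raw (still escaped) reference
   tokens. Hypothetical helper, not part of jsont-pointer. *)
let split_tokens (s : string) : (string list, string) result =
  if s = "" then Ok [] (* the empty pointer has no tokens: it is the root *)
  else if s.[0] <> '/' then Error "must be empty or start with '/'"
  else
    (* Drop the leading '/', then split on the remaining separators. *)
    let rest = String.sub s 1 (String.length s - 1) in
    Ok (String.split_on_char '/' rest)
```

Note that "/" yields one token, the empty string, which is why the pointer / addresses a member whose name is empty.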
The Index Type#
Each reference token becomes an Index.t value in the library:
type t = [
  | `Mem of string (* Object member access *)
  | `Nth of int    (* Array index access *)
  | `End           (* The special "-" marker for append operations *)
]
The Mem variant holds the unescaped member name - you work with the
actual key string (like "a/b") and the library handles any escaping needed
for the JSON Pointer string representation.
Invalid Syntax#
What happens if a pointer doesn't start with /?
# of_string "foo";;
Exception:
Jsont.Error Invalid JSON Pointer: must be empty or start with '/': foo.
The RFC is strict: non-empty pointers MUST start with /.
For safer parsing, use of_string_result:
# of_string_result "foo";;
- : (t, string) result =
Error "Invalid JSON Pointer: must be empty or start with '/': foo"
# of_string_result "/valid";;
- : (t, string) result = Ok [`Mem "valid"]
Evaluation: Navigating JSON#
Now we come to the heart of JSON Pointer: evaluation. RFC 6901, Section 4 describes how a pointer is resolved against a JSON document:
Evaluation of a JSON Pointer begins with a reference to the root value of a JSON document and completes with a reference to some value within the document. Each reference token in the JSON Pointer is evaluated sequentially.
Let's use the example JSON document from RFC 6901, Section 5:
# let rfc_example = parse_json {|{
    "foo": ["bar", "baz"],
    "": 0,
    "a/b": 1,
    "c%d": 2,
    "e^f": 3,
    "g|h": 4,
    "i\\j": 5,
    "k\"l": 6,
    " ": 7,
    "m~n": 8
  }|};;
val rfc_example : Jsont.json =
{"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}
This document is carefully constructed to exercise various edge cases!
The Root Pointer#
# get root rfc_example ;;
- : Jsont.json =
{"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}
The empty pointer (root) returns the whole document.
Object Member Access#
# get (of_string "/foo") rfc_example ;;
- : Jsont.json = ["bar","baz"]
/foo accesses the member named foo, which is an array.
Array Index Access#
# get (of_string "/foo/0") rfc_example ;;
- : Jsont.json = "bar"
# get (of_string "/foo/1") rfc_example ;;
- : Jsont.json = "baz"
/foo/0 first goes to foo, then accesses index 0 of the array.
Empty String as Key#
JSON allows empty strings as object keys:
# get (of_string "/") rfc_example ;;
- : Jsont.json = 0
The pointer / has one token: the empty string. This accesses the member
with an empty name.
Keys with Special Characters#
The RFC example includes keys with / and ~ characters:
# get (of_string "/a~1b") rfc_example ;;
- : Jsont.json = 1
The token a~1b refers to the key a/b. We'll explain this escaping
below.
# get (of_string "/m~0n") rfc_example ;;
- : Jsont.json = 8
The token m~0n refers to the key m~n.
Important: When using the OCaml library programmatically, you don't need
to worry about escaping. The Mem variant holds the literal key name:
# let slash_ptr = make [`Mem "a/b"];;
val slash_ptr : t = [`Mem "a/b"]
# to_string slash_ptr;;
- : string = "/a~1b"
# get slash_ptr rfc_example ;;
- : Jsont.json = 1
The library escapes it when converting to string.
Other Special Characters (No Escaping Needed)#
Most characters don't need escaping in JSON Pointer strings:
# get (of_string "/c%d") rfc_example ;;
- : Jsont.json = 2
# get (of_string "/e^f") rfc_example ;;
- : Jsont.json = 3
# get (of_string "/g|h") rfc_example ;;
- : Jsont.json = 4
# get (of_string "/ ") rfc_example ;;
- : Jsont.json = 7
Even a space is a valid key character!
Error Conditions#
What happens when we try to access something that doesn't exist?
# get_result (of_string "/nonexistent") rfc_example;;
- : (Jsont.json, Jsont.Error.t) result =
Error JSON Pointer: member 'nonexistent' not found
File "-":
# find (of_string "/nonexistent") rfc_example;;
- : Jsont.json option = None
Or an out-of-bounds array index:
# find (of_string "/foo/99") rfc_example;;
- : Jsont.json option = None
Or try to index into a non-container:
# find (of_string "/foo/0/invalid") rfc_example;;
- : Jsont.json option = None
The library provides exception-raising, result-returning, and option-returning variants:
val get : t -> Jsont.json -> Jsont.json
val get_result : t -> Jsont.json -> (Jsont.json, Jsont.Error.t) result
val find : t -> Jsont.json -> Jsont.json option
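To make the evaluation rules concrete, here is a minimal sketch of an option-returning lookup over a toy JSON type. Everything in it (the json type, the index type, and this find) is defined locally for illustration and is an assumption, not the jsont or jsont-pointer API. It also assumes that a numeric token applied to an object is looked up as a member name, as RFC 6901 describes.

```ocaml
(* Toy JSON representation, for illustration only (not Jsont.json). *)
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list

type index = [ `Mem of string | `Nth of int | `End ]

(* Evaluate the tokens one by one, starting from the root value. *)
let rec find (ptr : index list) (doc : json) : json option =
  match ptr, doc with
  | [], v -> Some v (* no tokens left: this is the referenced value *)
  | `Mem k :: rest, Object fields ->
    Option.bind (List.assoc_opt k fields) (find rest)
  | `Nth i :: rest, Object fields ->
    (* Against an object, a numeric token is just a member name. *)
    Option.bind (List.assoc_opt (string_of_int i) fields) (find rest)
  | `Nth i :: rest, Array items when i >= 0 ->
    Option.bind (List.nth_opt items i) (find rest)
  | _ -> None (* missing member, bad index, "-", or a non-container *)
```

Each failure case from this section (missing member, out-of-bounds index, indexing into a string) falls through to the final None branch.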
Array Index Rules#
RFC 6901 has specific rules for array indices. Section 4 states:
characters comprised of digits [...] that represent an unsigned base-10 integer value, making the new referenced value the array element with the zero-based index identified by the token
And importantly:
note that leading zeros are not allowed
# of_string "/foo/0";;
- : t = [`Mem "foo"; `Nth 0]
Zero itself is fine.
# of_string "/foo/01";;
- : t = [`Mem "foo"; `Mem "01"]
But 01 has a leading zero, so it's NOT treated as an array index - it
becomes a member name instead. This protects against accidental octal
interpretation.
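The classification of a single token can be sketched as follows. This is a stdlib-only illustration under the rules quoted above; classify and all_digits are hypothetical names, and the library's own parser may differ in details such as how it treats indices too large for an OCaml int.

```ocaml
type index = [ `Mem of string | `Nth of int | `End ]

let is_digit c = c >= '0' && c <= '9'

(* True when [s] is non-empty and consists only of ASCII digits. *)
let all_digits s =
  let rec go i = i >= String.length s || (is_digit s.[i] && go (i + 1)) in
  s <> "" && go 0

(* Decide how one unescaped reference token should be interpreted. *)
let classify (token : string) : index =
  if token = "-" then `End
  else if not (all_digits token) then `Mem token
  else if String.length token > 1 && token.[0] = '0' then
    `Mem token (* leading zero: not a valid array index *)
  else
    match int_of_string_opt token with
    | Some i -> `Nth i
    | None -> `Mem token (* assumption: overflow falls back to a name *)
```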
The End-of-Array Marker: -#
RFC 6901, Section 4 introduces a special token:
exactly the single character "-", making the new referenced value the (nonexistent) member after the last array element.
This is primarily useful for JSON Patch operations (RFC 6902). Let's see how it parses:
# of_string "/foo/-";;
- : t = [`Mem "foo"; `End]
The - is recognized as a special End index.
However, you cannot evaluate a pointer containing - because it refers
to a position that doesn't exist:
# find (of_string "/foo/-") rfc_example;;
- : Jsont.json option = None
The RFC explains this:
Note that the use of the "-" character to index an array will always result in such an error condition because by definition it refers to a nonexistent array element.
But we'll see later that - is very useful for mutation operations!
Mutation Operations#
While RFC 6901 defines JSON Pointer for read-only access, RFC 6902
(JSON Patch) uses JSON Pointer for modifications. The jsont-pointer
library provides these operations.
Add#
The add operation inserts a value at a location:
# let obj = parse_json {|{"foo":"bar"}|};;
val obj : Jsont.json = {"foo":"bar"}
# add (of_string "/baz") obj ~value:(Jsont.Json.string "qux")
;;
- : Jsont.json = {"foo":"bar","baz":"qux"}
For arrays, add inserts BEFORE the specified index:
# let arr_obj = parse_json {|{"foo":["a","b"]}|};;
val arr_obj : Jsont.json = {"foo":["a","b"]}
# add (of_string "/foo/1") arr_obj ~value:(Jsont.Json.string "X")
;;
- : Jsont.json = {"foo":["a","X","b"]}
This is where the - marker shines - it appends to the end:
# add (of_string "/foo/-") arr_obj ~value:(Jsont.Json.string "c")
;;
- : Jsont.json = {"foo":["a","b","c"]}
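The array behaviour of add can be modelled on plain OCaml lists. This is a minimal sketch with hypothetical helper names (insert_at, add_to_array), not the library's code. It follows RFC 6902 in allowing an index equal to the array length and rejecting anything larger.

```ocaml
(* Insert [v] before position [i]; [i] equal to the length appends.
   Returns None when [i] is out of range. *)
let insert_at (i : int) (v : 'a) (items : 'a list) : 'a list option =
  let rec go i items =
    match i, items with
    | 0, rest -> Some (v :: rest)
    | _, [] -> None (* ran out of elements before reaching index i *)
    | n, x :: rest -> Option.map (fun tl -> x :: tl) (go (n - 1) rest)
  in
  if i < 0 then None else go i items

(* `Nth inserts before an index, `End appends after the last element. *)
let add_to_array (idx : [ `Nth of int | `End ]) v items =
  match idx with
  | `Nth i -> insert_at i v items
  | `End -> Some (items @ [ v ])
```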
Remove#
The remove operation deletes a value:
# let two_fields = parse_json {|{"foo":"bar","baz":"qux"}|};;
val two_fields : Jsont.json = {"foo":"bar","baz":"qux"}
# remove (of_string "/baz") two_fields ;;
- : Jsont.json = {"foo":"bar"}
For arrays, it removes and shifts:
# let three_elem = parse_json {|{"foo":["a","b","c"]}|};;
val three_elem : Jsont.json = {"foo":["a","b","c"]}
# remove (of_string "/foo/1") three_elem ;;
- : Jsont.json = {"foo":["a","c"]}
Replace#
The replace operation updates an existing value:
# replace (of_string "/foo") obj ~value:(Jsont.Json.string "baz")
;;
- : Jsont.json = {"foo":"baz"}
Unlike add, replace requires the target to already exist.
Attempting to replace a nonexistent path raises an error.
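The difference between add and replace on an object member can be sketched with an association list standing in for a JSON object. The helper names add_member and replace_member are hypothetical, and returning an option instead of raising is a simplification chosen for this sketch.

```ocaml
(* Overwrite the value stored under [k], keeping member order. *)
let set k v fields =
  List.map (fun (k', v') -> if k' = k then (k', v) else (k', v')) fields

(* add: create the member if missing, otherwise overwrite it. *)
let add_member k v fields =
  if List.mem_assoc k fields then set k v fields else fields @ [ (k, v) ]

(* replace: only succeeds when the member already exists. *)
let replace_member k v fields =
  if List.mem_assoc k fields then Some (set k v fields) else None
```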
Move#
The move operation relocates a value:
# let nested = parse_json {|{"foo":{"bar":"baz"},"qux":{}}|};;
val nested : Jsont.json = {"foo":{"bar":"baz"},"qux":{}}
# move ~from:(of_string "/foo/bar") ~path:(of_string "/qux/thud") nested
;;
- : Jsont.json = {"foo":{},"qux":{"thud":"baz"}}
Copy#
The copy operation duplicates a value:
# let to_copy = parse_json {|{"foo":{"bar":"baz"}}|};;
val to_copy : Jsont.json = {"foo":{"bar":"baz"}}
# copy ~from:(of_string "/foo/bar") ~path:(of_string "/foo/qux") to_copy
;;
- : Jsont.json = {"foo":{"bar":"baz","qux":"baz"}}
Test#
The test operation verifies a value (useful in JSON Patch):
# test (of_string "/foo") obj ~expected:(Jsont.Json.string "bar");;
- : bool = true
# test (of_string "/foo") obj ~expected:(Jsont.Json.string "wrong");;
- : bool = false
Escaping Special Characters#
RFC 6901, Section 3 explains the escaping rules:
Because the characters '~' (%x7E) and '/' (%x2F) have special meanings in JSON Pointer, '~' needs to be encoded as '~0' and '/' needs to be encoded as '~1' when these characters appear in a reference token.
Why these specific characters?
- / separates tokens, so it must be escaped inside a token
- ~ is the escape character itself, so it must also be escaped
The escape sequences are:
- ~0 represents ~ (tilde)
- ~1 represents / (forward slash)
The Library Handles Escaping Automatically#
Important: When using jsont-pointer programmatically, you rarely need
to think about escaping. The Mem variant stores unescaped strings,
and escaping happens automatically during serialization:
# let p = make [`Mem "a/b"];;
val p : t = [`Mem "a/b"]
# to_string p;;
- : string = "/a~1b"
# of_string "/a~1b";;
- : t = [`Mem "a/b"]
Escaping in Action#
The Token module exposes the escaping functions:
# Token.escape "hello";;
- : string = "hello"
# Token.escape "a/b";;
- : string = "a~1b"
# Token.escape "a~b";;
- : string = "a~0b"
# Token.escape "~/";;
- : string = "~0~1"
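For intuition, escaping can be written as a single pass over the characters of a token. The escape function below is a stand-alone sketch, not the source of Token.escape; it is only meant to produce the same results for the inputs shown above.

```ocaml
(* Escape one reference token: '~' becomes "~0" and '/' becomes "~1".
   Each input character is handled exactly once, so the two rules
   cannot interfere with each other. *)
let escape (token : string) : string =
  let buf = Buffer.create (String.length token) in
  String.iter
    (function
      | '~' -> Buffer.add_string buf "~0"
      | '/' -> Buffer.add_string buf "~1"
      | c -> Buffer.add_char buf c)
    token;
  Buffer.contents buf
```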
Unescaping#
And the reverse process:
# Token.unescape "a~1b";;
- : string = "a/b"
# Token.unescape "a~0b";;
- : string = "a~b"
The Order Matters!#
RFC 6901, Section 4 is careful to specify the unescaping order:
Evaluation of each reference token begins by decoding any escaped character sequence. This is performed by first transforming any occurrence of the sequence '~1' to '/', and then transforming any occurrence of the sequence '~0' to '~'. By performing the substitutions in this order, an implementation avoids the error of turning '~01' first into '~1' and then into '/', which would be incorrect (the string '~01' correctly becomes '~1' after transformation).
Let's verify this tricky case:
# Token.unescape "~01";;
- : string = "~1"
If we unescaped ~0 first, ~01 would become ~1, which would then become
/. But that's wrong! The sequence ~01 should become the literal string
~1 (a tilde followed by the digit one).
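The effect of the substitution order can be reproduced with two sequential replacements. This sketch is stdlib-only and independent of Token.unescape; replace_all, unescape_rfc, and unescape_wrong are hypothetical names introduced here to contrast the two orders.

```ocaml
(* Replace every occurrence of the non-empty string [pat] in [s] by [by]. *)
let replace_all ~pat ~by s =
  let buf = Buffer.create (String.length s) in
  let plen = String.length pat in
  let rec go i =
    if i >= String.length s then ()
    else if i + plen <= String.length s && String.sub s i plen = pat then begin
      Buffer.add_string buf by;
      go (i + plen)
    end
    else begin
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
  in
  go 0;
  Buffer.contents buf

(* Order from RFC 6901: "~1" first, then "~0". *)
let unescape_rfc s =
  s |> replace_all ~pat:"~1" ~by:"/" |> replace_all ~pat:"~0" ~by:"~"

(* Reversed order, shown only to demonstrate the bug. *)
let unescape_wrong s =
  s |> replace_all ~pat:"~0" ~by:"~" |> replace_all ~pat:"~1" ~by:"/"
```

With these definitions, unescape_rfc "~01" gives "~1", while unescape_wrong "~01" collapses all the way to "/".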
URI Fragment Encoding#
JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:
A JSON Pointer can be represented in a URI fragment identifier by encoding it into octets using UTF-8, while percent-encoding those characters not allowed by the fragment rule in RFC 3986.
This adds percent-encoding on top of the ~0/~1 escaping:
# to_uri_fragment (of_string "/foo");;
- : string = "/foo"
# to_uri_fragment (of_string "/a~1b");;
- : string = "/a~1b"
# to_uri_fragment (of_string "/c%d");;
- : string = "/c%25d"
# to_uri_fragment (of_string "/ ");;
- : string = "/%20"
The % character must be percent-encoded as %25 in URIs, and
spaces become %20.
Here's the RFC example showing the URI fragment forms:
| JSON Pointer | URI Fragment | Value |
|---|---|---|
| "" | # | whole document |
| "/foo" | #/foo | ["bar", "baz"] |
| "/foo/0" | #/foo/0 | "bar" |
| "/" | #/ | 0 |
| "/a~1b" | #/a~1b | 1 |
| "/c%d" | #/c%25d | 2 |
| "/ " | #/%20 | 7 |
| "/m~0n" | #/m~0n | 8 |
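The percent-encoding step can be sketched as a byte-wise pass over an already escaped pointer string. The set of characters left untouched below is taken from the fragment rule of RFC 3986 (unreserved, sub-delims, and ':', '@', '/', '?'). Whether to_uri_fragment uses exactly this set is an assumption, and the function names are hypothetical.

```ocaml
(* Characters that may appear literally in a URI fragment (RFC 3986). *)
let is_fragment_safe = function
  | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' -> true
  | '-' | '.' | '_' | '~' -> true (* unreserved *)
  | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '=' ->
    true (* sub-delims *)
  | ':' | '@' | '/' | '?' -> true
  | _ -> false

(* Percent-encode every other byte as %XX with uppercase hex digits.
   Working on bytes means UTF-8 input is encoded octet by octet. *)
let percent_encode_fragment (s : string) : string =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      if is_fragment_safe c then Buffer.add_char buf c
      else Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    s;
  Buffer.contents buf
```

Because ~ is an unreserved character, the ~0 and ~1 escapes pass through unchanged, which matches the rows for /a~1b and /m~0n in the table.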
Building Pointers Programmatically#
Instead of parsing strings, you can build pointers from indices:
# let port_ptr = make [`Mem "database"; `Mem "port"];;
val port_ptr : t = [`Mem "database"; `Mem "port"]
# to_string port_ptr;;
- : string = "/database/port"
For array access, use Nth:
# let first_feature_ptr = make [`Mem "features"; `Nth 0];;
val first_feature_ptr : t = [`Mem "features"; `Nth 0]
# to_string first_feature_ptr;;
- : string = "/features/0"
Pointer Navigation#
You can build pointers incrementally using append:
# let db_ptr = of_string "/database";;
val db_ptr : t = [`Mem "database"]
# let creds_ptr = append db_ptr (`Mem "credentials");;
val creds_ptr : t = [`Mem "database"; `Mem "credentials"]
# let user_ptr = append creds_ptr (`Mem "username");;
val user_ptr : t = [`Mem "database"; `Mem "credentials"; `Mem "username"]
# to_string user_ptr;;
- : string = "/database/credentials/username"
Or concatenate two pointers:
# let base = of_string "/api/v1";;
val base : t = [`Mem "api"; `Mem "v1"]
# let endpoint = of_string "/users/0";;
val endpoint : t = [`Mem "users"; `Nth 0]
# to_string (concat base endpoint);;
- : string = "/api/v1/users/0"
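Since a pointer is a sequence of indices, building and serializing one can be modelled with ordinary list operations. The sketch below treats a pointer as a plain index list, which is an assumption about the representation made for illustration. The functions are local stand-ins that mirror the names used above, not the library's definitions.

```ocaml
type index = [ `Mem of string | `Nth of int | `End ]

(* Escape '~' and '/' inside a member name. *)
let escape token =
  let buf = Buffer.create (String.length token) in
  String.iter
    (function
      | '~' -> Buffer.add_string buf "~0"
      | '/' -> Buffer.add_string buf "~1"
      | c -> Buffer.add_char buf c)
    token;
  Buffer.contents buf

let token_of_index : index -> string = function
  | `Mem k -> escape k
  | `Nth i -> string_of_int i
  | `End -> "-"

(* Each token is prefixed with '/'; the empty list gives "". *)
let to_string (ptr : index list) : string =
  String.concat "" (List.map (fun i -> "/" ^ token_of_index i) ptr)

let append ptr i = ptr @ [ i ] (* add one step at the end *)
let concat p q = p @ q         (* join two pointers *)
```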
Jsont Integration#
The library integrates with the Jsont codec system, allowing you to
combine JSON Pointer navigation with typed decoding. This is powerful
because you can point to a location in a JSON document and decode it
directly to an OCaml type.
# let config_json = parse_json {|{
    "database": {
      "host": "localhost",
      "port": 5432,
      "credentials": {"username": "admin", "password": "secret"}
    },
    "features": ["auth", "logging", "metrics"]
  }|};;
val config_json : Jsont.json =
{"database":{"host":"localhost","port":5432,"credentials":{"username":"admin","password":"secret"}},"features":["auth","logging","metrics"]}
Typed Access with path#
The path combinator combines pointer navigation with typed decoding:
# let db_host =
    Jsont.Json.decode
      (path (of_string "/database/host") Jsont.string)
      config_json
    |> Result.get_ok;;
val db_host : string = "localhost"
# let db_port =
    Jsont.Json.decode
      (path (of_string "/database/port") Jsont.int)
      config_json
    |> Result.get_ok;;
val db_port : int = 5432
Extract a list of strings:
# let features =
    Jsont.Json.decode
      (path (of_string "/features") Jsont.(list string))
      config_json
    |> Result.get_ok;;
val features : string list = ["auth"; "logging"; "metrics"]
Default Values with ~absent#
Use ~absent to provide a default when a path doesn't exist:
# let timeout =
    Jsont.Json.decode
      (path ~absent:30 (of_string "/database/timeout") Jsont.int)
      config_json
    |> Result.get_ok;;
val timeout : int = 30
Nested Path Extraction#
You can extract values from deeply nested structures:
# let org_json = parse_json {|{
    "organization": {
      "owner": {"name": "Alice", "email": "alice@example.com", "age": 35},
      "members": [{"name": "Bob", "email": "bob@example.com", "age": 28}]
    }
  }|};;
val org_json : Jsont.json =
{"organization":{"owner":{"name":"Alice","email":"alice@example.com","age":35},"members":[{"name":"Bob","email":"bob@example.com","age":28}]}}
# Jsont.Json.decode
    (path (of_string "/organization/owner/name") Jsont.string)
    org_json
  |> Result.get_ok;;
- : string = "Alice"
# Jsont.Json.decode
    (path (of_string "/organization/members/0/age") Jsont.int)
    org_json
  |> Result.get_ok;;
- : int = 28
Comparison: Raw vs Typed Access#
Raw access requires pattern matching:
# let raw_port =
    match get (of_string "/database/port") config_json with
    | Jsont.Number (f, _) -> int_of_float f
    | _ -> failwith "expected number";;
val raw_port : int = 5432
Typed access is cleaner and type-safe:
# let typed_port =
    Jsont.Json.decode
      (path (of_string "/database/port") Jsont.int)
      config_json
    |> Result.get_ok;;
val typed_port : int = 5432
The typed approach catches mismatches at decode time with clear errors.
Summary#
JSON Pointer (RFC 6901) provides a simple but powerful way to address values within JSON documents:
- Syntax: Pointers are strings of /-separated reference tokens
- Escaping: Use ~0 for ~ and ~1 for / in tokens (handled automatically by the library)
- Evaluation: Tokens navigate through objects (by key) and arrays (by index)
- URI Encoding: Pointers can be percent-encoded for use in URIs
- Mutations: Combined with JSON Patch (RFC 6902), pointers enable structured updates
The jsont-pointer library implements all of this with type-safe OCaml
interfaces, integration with the jsont codec system, and proper error
handling for malformed pointers and missing values.