RFC6901 JSON Pointer implementation in OCaml using jsont

JSON Pointer Tutorial#

This tutorial introduces JSON Pointer as defined in RFC 6901, and demonstrates the jsont-pointer OCaml library through interactive examples.

What is JSON Pointer?#

From RFC 6901, Section 1:

JSON Pointer defines a string syntax for identifying a specific value within a JavaScript Object Notation (JSON) document.

In other words, JSON Pointer is an addressing scheme for locating values inside a JSON structure. Think of it like a filesystem path, but for JSON documents instead of files.

For example, given this JSON document:

{
  "users": [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25}
  ]
}

The JSON Pointer /users/0/name refers to the string "Alice".

In OCaml, this is represented by the Jsont_pointer.t type - a sequence of navigation steps from the document root to a target value.

Syntax: Reference Tokens#

RFC 6901, Section 3 defines the syntax:

A JSON Pointer is a Unicode string containing a sequence of zero or more reference tokens, each prefixed by a '/' (%x2F) character.

The grammar is elegantly simple:

json-pointer    = *( "/" reference-token )
reference-token = *( unescaped / escaped )

This means:

  • The empty string "" is a valid pointer (it refers to the whole document)
  • Every non-empty pointer starts with /
  • Everything between / characters is a "reference token"

Let's see this in action. We can parse pointers and see their structure:

$ jsonpp parse ""
OK: []

The empty pointer has no reference tokens - it points to the root.

$ jsonpp parse "/foo"
OK: [Mem:foo]

The pointer /foo has one token: foo. Since it's not a number, it's interpreted as an object member name (Mem).

$ jsonpp parse "/foo/0"
OK: [Mem:foo, Nth:0]

Here we have two tokens: foo (a member name) and 0 (interpreted as an array index Nth).

$ jsonpp parse "/foo/bar/baz"
OK: [Mem:foo, Mem:bar, Mem:baz]

Multiple tokens navigate deeper into nested structures.

The Index Type#

Each reference token becomes an Index.t value in the library:

type t =
  | Mem of string   (* Object member access *)
  | Nth of int      (* Array index access *)
  | End             (* The special "-" marker for append operations *)

The Mem variant holds the unescaped member name - you work with the actual key string (like "a/b") and the library handles any escaping needed for the JSON Pointer string representation.

Invalid Syntax#

What happens if a pointer doesn't start with /?

$ jsonpp parse "foo"
ERROR: Invalid JSON Pointer: must be empty or start with '/': foo

The RFC is strict: non-empty pointers MUST start with /.
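
The syntax rules above can be sketched in a few lines of plain OCaml. This is an illustration that uses only the standard library, not the jsont-pointer implementation; split_tokens is a hypothetical helper, and the tokens it returns are still in escaped form.

```ocaml
(* Hypothetical helper (not part of jsont-pointer): split a JSON Pointer
   string into its raw, still escaped, reference tokens. *)
let split_tokens (s : string) : (string list, string) result =
  if s = "" then Ok [] (* the root pointer has no tokens *)
  else if s.[0] <> '/' then Error ("must be empty or start with '/': " ^ s)
  else
    (* The piece before the leading slash is always empty; drop it. *)
    match String.split_on_char '/' s with
    | _ :: tokens -> Ok tokens
    | [] -> Ok []
```

With this sketch, the pointer / splits into a single empty token, and /foo/0 splits into the tokens foo and 0.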

Evaluation: Navigating JSON#

Now we come to the heart of JSON Pointer: evaluation. RFC 6901, Section 4 describes how a pointer is resolved against a JSON document:

Evaluation of a JSON Pointer begins with a reference to the root value of a JSON document and completes with a reference to some value within the document. Each reference token in the JSON Pointer is evaluated sequentially.

In the library, this is the Jsont_pointer.get function:

val get : t -> Jsont.json -> Jsont.json

Let's use the example JSON document from RFC 6901, Section 5:

$ cat rfc6901_example.json
{
  "foo": ["bar", "baz"],
  "": 0,
  "a/b": 1,
  "c%d": 2,
  "e^f": 3,
  "g|h": 4,
  "i\\j": 5,
  "k\"l": 6,
  " ": 7,
  "m~n": 8
}

This document is carefully constructed to exercise various edge cases!

The Root Pointer#

$ jsonpp eval rfc6901_example.json ""
OK: {"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}

The empty pointer returns the whole document. In OCaml, this is Jsont_pointer.root:

val root : t
(** The empty pointer that references the whole document. *)

Object Member Access#

$ jsonpp eval rfc6901_example.json "/foo"
OK: ["bar","baz"]

/foo accesses the member named foo, which is an array.

Array Index Access#

$ jsonpp eval rfc6901_example.json "/foo/0"
OK: "bar"

/foo/0 first goes to foo, then accesses index 0 of the array.

$ jsonpp eval rfc6901_example.json "/foo/1"
OK: "baz"

Index 1 gives us the second element.

Empty String as Key#

JSON allows empty strings as object keys:

$ jsonpp eval rfc6901_example.json "/"
OK: 0

The pointer / has one token: the empty string. This accesses the member with an empty name.

Keys with Special Characters#

The RFC example includes keys with / and ~ characters:

$ jsonpp eval rfc6901_example.json "/a~1b"
OK: 1

The token a~1b refers to the key a/b. We'll explain this escaping below.

$ jsonpp eval rfc6901_example.json "/m~0n"
OK: 8

The token m~0n refers to the key m~n.

Important: When using the OCaml library programmatically, you don't need to worry about escaping. The Index.Mem variant holds the literal key name:

(* To access the key "a/b", just use the literal string *)
let pointer = Jsont_pointer.make [Mem "a/b"]

(* The library escapes it when converting to string *)
let s = Jsont_pointer.to_string pointer  (* "/a~1b" *)

Other Special Characters (No Escaping Needed)#

Most characters don't need escaping in JSON Pointer strings:

$ jsonpp eval rfc6901_example.json "/c%d"
OK: 2
$ jsonpp eval rfc6901_example.json "/e^f"
OK: 3
$ jsonpp eval rfc6901_example.json "/g|h"
OK: 4
$ jsonpp eval rfc6901_example.json "/ "
OK: 7

Even a space is a valid key character!

Error Conditions#

What happens when we try to access something that doesn't exist?

$ jsonpp eval rfc6901_example.json "/nonexistent"
ERROR: JSON Pointer: member 'nonexistent' not found
File "-":

Or an out-of-bounds array index:

$ jsonpp eval rfc6901_example.json "/foo/99"
ERROR: JSON Pointer: index 99 out of bounds (array has 2 elements)
File "-":

Or try to index into a non-container:

$ jsonpp eval rfc6901_example.json "/foo/0/invalid"
ERROR: JSON Pointer: cannot index into string with 'invalid'
File "-":

The library provides exception-raising (get), result-returning (get_result), and option-returning (find) variants:

val get : t -> Jsont.json -> Jsont.json
val get_result : t -> Jsont.json -> (Jsont.json, Jsont.Error.t) result
val find : t -> Jsont.json -> Jsont.json option
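
To make the sequential evaluation and its failure cases concrete, here is a minimal sketch over a tiny stand-in JSON type. Everything in it (the json type, index_of_token, step, and this get) is an assumption made for illustration; the library itself works on Jsont.json, and its error messages differ from the strings used here.

```ocaml
(* Minimal stand-in JSON type, for illustration only. *)
type json =
  | Null
  | Bool of bool
  | Num of float
  | Str of string
  | Arr of json list
  | Obj of (string * json) list

(* Accept only decimal digits without a leading zero. *)
let index_of_token (tok : string) : int option =
  let n = String.length tok in
  let rec digits i =
    i >= n || (tok.[i] >= '0' && tok.[i] <= '9' && digits (i + 1))
  in
  if n > 0 && digits 0 && (n = 1 || tok.[0] <> '0') then int_of_string_opt tok
  else None

(* Resolve one unescaped reference token against the current value. *)
let step (v : json) (tok : string) : (json, string) result =
  match v with
  | Obj members -> (
      match List.assoc_opt tok members with
      | Some v' -> Ok v'
      | None -> Error (Printf.sprintf "member '%s' not found" tok))
  | Arr items -> (
      match index_of_token tok with
      | Some i when i < List.length items -> Ok (List.nth items i)
      | Some i -> Error (Printf.sprintf "index %d out of bounds" i)
      | None -> Error (Printf.sprintf "'%s' is not a valid array index" tok))
  | Null | Bool _ | Num _ | Str _ ->
      Error (Printf.sprintf "cannot index into a scalar with '%s'" tok)

(* Evaluate the tokens left to right, stopping at the first error. *)
let get (tokens : string list) (doc : json) : (json, string) result =
  List.fold_left
    (fun acc tok -> Result.bind acc (fun v -> step v tok))
    (Ok doc) tokens
```

In this sketch each token is interpreted against the value reached so far, so the same token can name an object member in one document and an array index in another.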

Array Index Rules#

RFC 6901 has specific rules for array indices. Section 4 states:

characters comprised of digits [...] that represent an unsigned base-10 integer value, making the new referenced value the array element with the zero-based index identified by the token

And importantly:

note that leading zeros are not allowed

$ jsonpp parse "/foo/0"
OK: [Mem:foo, Nth:0]

Zero itself is fine.

$ jsonpp parse "/foo/01"
OK: [Mem:foo, Mem:01]

But 01 has a leading zero, so it's NOT treated as an array index - it becomes a member name instead. This protects against accidental octal interpretation.
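
The classification rule can be sketched as follows. The index type is a local copy of the Index.t variants shown earlier, and classify is a hypothetical name; how the library handles an index too large for int is not stated in this tutorial, so the fallback to Mem in that case is an assumption.

```ocaml
(* Local stand-in for the Index.t type shown earlier. *)
type index = Mem of string | Nth of int | End

(* Classify one unescaped reference token. *)
let classify (tok : string) : index =
  let n = String.length tok in
  let rec digits i =
    i >= n || (tok.[i] >= '0' && tok.[i] <= '9' && digits (i + 1))
  in
  let all_digits = n > 0 && digits 0 in
  if tok = "-" then End
  else if all_digits && (n = 1 || tok.[0] <> '0') then
    (* int_of_string_opt returns None when the value overflows int;
       this sketch then treats the token as a member name (assumption). *)
    match int_of_string_opt tok with
    | Some i -> Nth i
    | None -> Mem tok
  else Mem tok
```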

The End-of-Array Marker: -#

RFC 6901, Section 4 introduces a special token:

exactly the single character "-", making the new referenced value the (nonexistent) member after the last array element.

This is primarily useful for JSON Patch operations (RFC 6902). Let's see how it parses:

$ jsonpp parse "/foo/-"
OK: [Mem:foo, End]

The - is recognized as a special End index.

However, you cannot evaluate a pointer containing - because it refers to a position that doesn't exist:

$ jsonpp eval rfc6901_example.json "/foo/-"
ERROR: JSON Pointer: '-' (end marker) refers to nonexistent array element
File "-":

The RFC explains this:

Note that the use of the "-" character to index an array will always result in such an error condition because by definition it refers to a nonexistent array element.

But we'll see later that - is very useful for mutation operations!

Mutation Operations#

While RFC 6901 defines JSON Pointer for read-only access, RFC 6902 (JSON Patch) uses JSON Pointer for modifications. The jsont-pointer library provides these operations.

Add#

The add operation inserts a value at a location:

$ jsonpp add '{"foo":"bar"}' '/baz' '"qux"'
{"foo":"bar","baz":"qux"}

In OCaml:

val add : t -> Jsont.json -> value:Jsont.json -> Jsont.json

For arrays, add inserts BEFORE the specified index:

$ jsonpp add '{"foo":["a","b"]}' '/foo/1' '"X"'
{"foo":["a","X","b"]}

This is where the - marker shines - it appends to the end:

$ jsonpp add '{"foo":["a","b"]}' '/foo/-' '"c"'
{"foo":["a","b","c"]}
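
The array behaviour of add comes down to a list insertion. A minimal sketch on plain OCaml lists, with hypothetical helper names insert_at and append (the library's internals may differ):

```ocaml
(* Insert x before position i. i may equal the list length, which appends.
   Returns None when i is out of range. *)
let rec insert_at (i : int) (x : 'a) (l : 'a list) : 'a list option =
  if i = 0 then Some (x :: l)
  else
    match l with
    | [] -> None
    | y :: rest -> Option.map (fun r -> y :: r) (insert_at (i - 1) x rest)

(* The end marker always denotes the position just past the last element. *)
let append (x : 'a) (l : 'a list) : 'a list = l @ [ x ]
```

Under these definitions, inserting at index 1 of a two-element list places the new element between the two, and an index greater than the length is rejected rather than padded.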

Remove#

The remove operation deletes a value:

$ jsonpp remove '{"foo":"bar","baz":"qux"}' '/baz'
{"foo":"bar"}

For arrays, it removes and shifts:

$ jsonpp remove '{"foo":["a","b","c"]}' '/foo/1'
{"foo":["a","c"]}
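
Both forms of remove can be sketched on plain lists and association lists. The helper names remove_at and remove_member are hypothetical, and returning None for a missing target follows RFC 6902, which requires the target location to exist; how the library reports that case is not shown in this tutorial.

```ocaml
(* Remove the element at position i; later elements shift down by one. *)
let rec remove_at (i : int) (l : 'a list) : 'a list option =
  match l with
  | [] -> None
  | _ :: rest when i = 0 -> Some rest
  | y :: rest -> Option.map (fun r -> y :: r) (remove_at (i - 1) rest)

(* Remove an object member, failing when the key is absent. *)
let remove_member (key : string) (members : (string * 'v) list) :
    (string * 'v) list option =
  if List.mem_assoc key members then Some (List.remove_assoc key members)
  else None
```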

Replace#

The replace operation updates an existing value:

$ jsonpp replace '{"foo":"bar"}' '/foo' '"baz"'
{"foo":"baz"}

Unlike add, replace requires the target to already exist:

$ jsonpp replace '{"foo":"bar"}' '/nonexistent' '"value"'
ERROR: JSON Pointer: member 'nonexistent' not found
File "-":
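
The difference between add and replace on object members can be sketched with association lists. The names add_member and replace_member are hypothetical; the overwrite behaviour of add on an existing member follows RFC 6902 rather than anything shown in this tutorial.

```ocaml
(* add: set the member, overwriting the value if the key already exists. *)
let add_member (key : string) (v : 'v) (members : (string * 'v) list) :
    (string * 'v) list =
  if List.mem_assoc key members then
    List.map (fun (k, old) -> if k = key then (k, v) else (k, old)) members
  else members @ [ (key, v) ]

(* replace: succeed only when the member already exists. *)
let replace_member (key : string) (v : 'v) (members : (string * 'v) list) :
    (string * 'v) list option =
  if List.mem_assoc key members then Some (add_member key v members) else None
```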

Move#

The move operation relocates a value:

$ jsonpp move '{"foo":{"bar":"baz"},"qux":{}}' '/foo/bar' '/qux/thud'
{"foo":{},"qux":{"thud":"baz"}}

Copy#

The copy operation duplicates a value:

$ jsonpp copy '{"foo":{"bar":"baz"}}' '/foo/bar' '/foo/qux'
{"foo":{"bar":"baz","qux":"baz"}}

Test#

The test operation verifies a value (useful in JSON Patch):

$ jsonpp test '{"foo":"bar"}' '/foo' '"bar"'
true
$ jsonpp test '{"foo":"bar"}' '/foo' '"baz"'
false

Escaping Special Characters#

RFC 6901, Section 3 explains the escaping rules:

Because the characters '~' (%x7E) and '/' (%x2F) have special meanings in JSON Pointer, '~' needs to be encoded as '~0' and '/' needs to be encoded as '~1' when these characters appear in a reference token.

Why these specific characters?

  • / separates tokens, so it must be escaped inside a token
  • ~ is the escape character itself, so it must also be escaped

The escape sequences are:

  • ~0 represents ~ (tilde)
  • ~1 represents / (forward slash)

The Library Handles Escaping Automatically#

Important: When using jsont-pointer programmatically, you rarely need to think about escaping. The Index.Mem variant stores unescaped strings, and escaping happens automatically during serialization:

(* Create a pointer to key "a/b" - no escaping needed *)
let p = Jsont_pointer.make [Mem "a/b"]

(* Serialize to string - escaping happens automatically *)
let s = Jsont_pointer.to_string p  (* Returns "/a~1b" *)

(* Parse from string - unescaping happens automatically *)
let p' = Jsont_pointer.of_string "/a~1b"
(* p' contains [Mem "a/b"] - the unescaped key *)

The Token module exposes the escaping functions if you need them:

module Token : sig
  val escape : string -> string    (* "a/b" -> "a~1b" *)
  val unescape : string -> string  (* "a~1b" -> "a/b" *)
end

Escaping in Action#

Let's see escaping with the CLI tool:

$ jsonpp escape "hello"
hello

No special characters, no escaping needed.

$ jsonpp escape "a/b"
a~1b

The / becomes ~1.

$ jsonpp escape "a~b"
a~0b

The ~ becomes ~0.

$ jsonpp escape "~/"
~0~1

Both characters are escaped.
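
Escaping is a character-by-character substitution. A minimal stand-alone sketch follows; it is named escape for readability but is not the library's Token.escape, whose implementation is not shown here.

```ocaml
(* Escape a raw member name for use as a reference token. *)
let escape (tok : string) : string =
  let buf = Buffer.create (String.length tok) in
  String.iter
    (function
      | '~' -> Buffer.add_string buf "~0"
      | '/' -> Buffer.add_string buf "~1"
      | c -> Buffer.add_char buf c)
    tok;
  Buffer.contents buf
```

Because each input character is handled exactly once, the ~ characters introduced by the substitution are never escaped a second time.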

Unescaping#

And the reverse process:

$ jsonpp unescape "a~1b"
OK: a/b
$ jsonpp unescape "a~0b"
OK: a~b

The Order Matters!#

RFC 6901, Section 4 is careful to specify the unescaping order:

Evaluation of each reference token begins by decoding any escaped character sequence. This is performed by first transforming any occurrence of the sequence '~1' to '/', and then transforming any occurrence of the sequence '~0' to '~'. By performing the substitutions in this order, an implementation avoids the error of turning '~01' first into '~1' and then into '/', which would be incorrect (the string '~01' correctly becomes '~1' after transformation).

Let's verify this tricky case:

$ jsonpp unescape "~01"
OK: ~1

If we unescaped ~0 first, ~01 would become ~1, which would then become /. But that's wrong! The sequence ~01 should become the literal string ~1 (a tilde followed by the digit one).

Invalid escape sequences are rejected:

$ jsonpp unescape "~2"
ERROR: Invalid JSON Pointer: invalid escape sequence ~2
$ jsonpp unescape "hello~"
ERROR: Invalid JSON Pointer: incomplete escape sequence at end
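
One way to get the ordering right is to avoid two passes entirely and scan the token once from left to right, consuming each ~ together with the character that follows it. The sketch below takes that approach; it is an illustration with a stand-alone unescape function and simplified error strings, not the library's implementation.

```ocaml
(* Decode a reference token in a single left-to-right scan. *)
let unescape (tok : string) : (string, string) result =
  let n = String.length tok in
  let buf = Buffer.create n in
  let rec go i =
    if i >= n then Ok (Buffer.contents buf)
    else if tok.[i] <> '~' then (
      Buffer.add_char buf tok.[i];
      go (i + 1))
    else if i + 1 >= n then Error "incomplete escape sequence at end"
    else
      match tok.[i + 1] with
      | '0' ->
          Buffer.add_char buf '~';
          go (i + 2)
      | '1' ->
          Buffer.add_char buf '/';
          go (i + 2)
      | c -> Error (Printf.sprintf "invalid escape sequence ~%c" c)
  in
  go 0
```

On the input ~01 the scan decodes ~0 to a tilde and then copies the 1 unchanged, so the decoded tilde is never re-read as the start of a new escape.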

URI Fragment Encoding#

JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:

A JSON Pointer can be represented in a URI fragment identifier by encoding it into octets using UTF-8, while percent-encoding those characters not allowed by the fragment rule in RFC 3986.

This adds percent-encoding on top of the ~0/~1 escaping:

$ jsonpp uri-fragment "/foo"
OK: /foo -> /foo

Simple pointers often don't need percent-encoding.

$ jsonpp uri-fragment "/a~1b"
OK: /a~1b -> /a~1b

The ~1 escape stays as-is (it's valid in URI fragments).

$ jsonpp uri-fragment "/c%d"
OK: /c%d -> /c%25d

The % character must be percent-encoded as %25 in URIs!

$ jsonpp uri-fragment "/ "
OK: /  -> /%20

Spaces become %20.

The library provides functions for URI fragment encoding:

val to_uri_fragment : t -> string
val of_uri_fragment : string -> t
val jsont_uri_fragment : t Jsont.t
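
The percent-encoding step can be sketched independently of the library. The function below takes an already escaped pointer string (assumed to be UTF-8) and encodes every byte that the RFC 3986 fragment rule does not allow literally. The names allowed_in_fragment and to_fragment are hypothetical, and the exact set of characters the library leaves unencoded is an assumption.

```ocaml
(* Characters RFC 3986 allows unencoded in a fragment: unreserved,
   sub-delims, and colon, at sign, slash, question mark. *)
let allowed_in_fragment (c : char) : bool =
  match c with
  | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9'
  | '-' | '.' | '_' | '~'
  | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '='
  | ':' | '@' | '/' | '?' -> true
  | _ -> false

(* Percent-encode every other byte as %XX with uppercase hex digits. *)
let to_fragment (pointer : string) : string =
  let buf = Buffer.create (String.length pointer) in
  String.iter
    (fun c ->
      if allowed_in_fragment c then Buffer.add_char buf c
      else Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    pointer;
  Buffer.contents buf
```

The result does not include the leading # of the URI fragment, matching the CLI output above.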

Here's the RFC example showing the URI fragment forms:

JSON Pointer    URI Fragment    Value
""              #               whole document
"/foo"          #/foo           ["bar", "baz"]
"/foo/0"        #/foo/0         "bar"
"/"             #/              0
"/a~1b"         #/a~1b          1
"/c%d"          #/c%25d         2
"/ "            #/%20           7
"/m~0n"         #/m~0n          8

Deeply Nested Structures#

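JSON Pointer handles arbitrarily deep nesting: each additional reference token descends one more level. The two-token pointer below already crosses an object and an array: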

$ jsonpp eval rfc6901_example.json "/foo/0"
OK: "bar"

For deeper structures, just add more path segments. With nested objects:

$ jsonpp add '{"a":{"b":{"c":"d"}}}' '/a/b/x' '"y"'
{"a":{"b":{"c":"d","x":"y"}}}

With nested arrays:

$ jsonpp add '{"arr":[[1,2],[3,4]]}' '/arr/0/1' '99'
{"arr":[[1,99,2],[3,4]]}

Jsont Integration#

The library integrates with the Jsont codec system for typed access:

(* Codec for JSON Pointers as JSON strings *)
val jsont : t Jsont.t

(* Query combinators *)
val path : ?absent:'a -> t -> 'a Jsont.t -> 'a Jsont.t
val set_path : ?allow_absent:bool -> 'a Jsont.t -> t -> 'a -> Jsont.json Jsont.t
val update_path : ?absent:'a -> t -> 'a Jsont.t -> Jsont.json Jsont.t
val delete_path : ?allow_absent:bool -> t -> Jsont.json Jsont.t

These allow you to use JSON Pointers with typed codecs rather than raw Jsont.json values.

Summary#

JSON Pointer (RFC 6901) provides a simple but powerful way to address values within JSON documents:

  1. Syntax: Pointers are strings of /-separated reference tokens
  2. Escaping: Use ~0 for ~ and ~1 for / in tokens (handled automatically by the library)
  3. Evaluation: Tokens navigate through objects (by key) and arrays (by index)
  4. URI Encoding: Pointers can be percent-encoded for use in URI fragments
  5. Mutations: Combined with JSON Patch (RFC 6902), pointers enable structured updates

The jsont-pointer library implements all of this with type-safe OCaml interfaces, integration with the jsont codec system, and proper error handling for malformed pointers and missing values.