This tutorial introduces JSON Pointer as defined in RFC 6901, and demonstrates the json-pointer OCaml library through interactive examples.
Before diving in, it's worth understanding the difference between JSON Pointer and JSON Path, as they serve different purposes:
JSON Pointer (RFC 6901) is an indicator syntax that specifies a single location within JSON data. It always identifies at most one value.
JSON Path is a query syntax that can search JSON data and return multiple values matching specified criteria.
Use JSON Pointer when you need to address a single, specific location (like JSON Schema's $ref). Use JSON Path when you might need multiple results (like Kubernetes queries).
The json-pointer library implements JSON Pointer and integrates with the Jsont.Path type for representing navigation indices.
First, let's set up our environment. In the toplevel, you can load the library with #require "json-pointer.top";; which will automatically install pretty printers.
From RFC 6901, Section 1:
JSON Pointer defines a string syntax for identifying a specific value within a JavaScript Object Notation (JSON) document.
In other words, JSON Pointer is an addressing scheme for locating values inside a JSON structure. Think of it like a filesystem path, but for JSON documents instead of files.
For example, given this JSON document:
The JSON Pointer /users/0/name refers to the string "Alice":
In OCaml, this is represented by the 'a Json_pointer.t type - a sequence of navigation steps from the document root to a target value. The phantom type parameter 'a encodes whether this is a navigation pointer or an append pointer (more on this later).
RFC 6901, Section 3 defines the syntax:
A JSON Pointer is a Unicode string containing a sequence of zero or more reference tokens, each prefixed by a '/' (%x2F) character.
The grammar is elegantly simple:
json-pointer    = *( "/" reference-token )
reference-token = *( unescaped / escaped )
This means:
- "" is a valid pointer (it refers to the whole document)
- every non-empty pointer starts with /
- the text between / characters is a "reference token"
Let's see this in action:
The empty pointer has no reference tokens - it points to the root.
The pointer /foo has one token: foo. Since it's not a number, it's interpreted as an object member name (Mem).
Here we have two tokens: foo (a member name) and 0 (interpreted as an array index Nth).
Multiple tokens navigate deeper into nested structures.
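The splitting rule behind these examples can be sketched with the standard library alone. The helper below is hypothetical (`split_tokens` is not part of the json-pointer API) and it returns the raw, still-escaped tokens; it only illustrates how the grammar maps a string to a token list.

```ocaml
(* Hypothetical helper, not part of the json-pointer API: split a pointer
   string into its raw (still escaped) reference tokens. *)
let split_tokens (s : string) : (string list, string) result =
  if s = "" then Ok [] (* the empty pointer refers to the root *)
  else if s.[0] <> '/' then
    Error "a non-empty JSON Pointer must start with '/'"
  else
    (* Drop the leading '/'; every remaining '/' separates two tokens. *)
    Ok (String.split_on_char '/' (String.sub s 1 (String.length s - 1)))
```

Note that "/" yields one token, the empty string, which is why it can address a member whose name is empty.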
Each reference token is represented using Jsont.Path.index:
type index = Jsont.Path.index
(* = Jsont.Path.Mem of string * Jsont.Meta.t
   | Jsont.Path.Nth of int * Jsont.Meta.t *)
The Mem constructor is for object member access, and Nth is for array index access. The member name is unescaped - you work with the actual key string (like "a/b") and the library handles any escaping needed for the JSON Pointer string representation.
What happens if a pointer doesn't start with /?
The RFC is strict: non-empty pointers MUST start with /.
For safer parsing, use of_string_result:
Now we come to the heart of JSON Pointer: evaluation. RFC 6901, Section 4 describes how a pointer is resolved against a JSON document:
Evaluation of a JSON Pointer begins with a reference to the root value of a JSON document and completes with a reference to some value within the document. Each reference token in the JSON Pointer is evaluated sequentially.
Let's use the example JSON document from RFC 6901, Section 5:
This document is carefully constructed to exercise various edge cases!
The empty pointer (root) returns the whole document.
/foo accesses the member named foo, which is an array.
/foo/0 first goes to foo, then accesses index 0 of the array.
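To make the evaluation rule concrete, here is a minimal sketch that models the RFC 6901 Section 5 example document with a small hand-written JSON type and resolves a pointer one step at a time. The `json` and `step` types and the `eval` function are assumptions made for illustration, not the json-pointer or Jsont API, and the steps are given already decoded.

```ocaml
(* A tiny JSON type for illustration only; the library works on Jsont.json. *)
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list

(* A decoded reference token, mirroring the Mem / Nth distinction. *)
type step = Mem of string | Nth of int

(* Resolve the steps sequentially, starting from the root value. *)
let rec eval (steps : step list) (v : json) : json option =
  match steps, v with
  | [], _ -> Some v
  | Mem name :: rest, Object members ->
    Option.bind (List.assoc_opt name members) (eval rest)
  | Nth i :: rest, Array items ->
    if i < 0 then None
    else Option.bind (List.nth_opt items i) (eval rest)
  | Nth i :: rest, Object members ->
    (* Applied to an object, a numeric token is just a member name. *)
    Option.bind (List.assoc_opt (string_of_int i) members) (eval rest)
  | _ :: _, _ -> None (* wrong container kind, or a scalar *)

(* The example document from RFC 6901, Section 5. *)
let rfc_doc =
  Object
    [ "foo", Array [ String "bar"; String "baz" ];
      "", Number 0.;
      "a/b", Number 1.;
      "c%d", Number 2.;
      "e^f", Number 3.;
      "g|h", Number 4.;
      "i\\j", Number 5.;
      "k\"l", Number 6.;
      " ", Number 7.;
      "m~n", Number 8. ]

let whole = eval [] rfc_doc                      (* the whole document *)
let foo = eval [ Mem "foo" ] rfc_doc             (* the array *)
let first = eval [ Mem "foo"; Nth 0 ] rfc_doc    (* its first element *)
```

Returning an option keeps the sketch short; a missing member, an out-of-bounds index, and indexing into a scalar all collapse to `None` here.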
JSON allows empty strings as object keys:
The pointer / has one token: the empty string. This accesses the member with an empty name.
The RFC example includes keys with / and ~ characters:
The token a~1b refers to the key a/b. We'll explain this escaping below.
The token m~0n refers to the key m~n.
Important: When using the OCaml library programmatically, you don't need to worry about escaping. The Mem variant holds the literal key name:
The library escapes it when converting to string.
Most characters don't need escaping in JSON Pointer strings:
Even a space is a valid key character!
What happens when we try to access something that doesn't exist?
Or an out-of-bounds array index:
Or try to index into a non-container:
The library provides both exception-raising and result-returning variants:
val get : nav t -> Jsont.json -> Jsont.json
val get_result : nav t -> Jsont.json -> (Jsont.json, Jsont.Error.t) result
val find : nav t -> Jsont.json -> Jsont.json option
RFC 6901 has specific rules for array indices. Section 4 states:
characters comprised of digits ... that represent an unsigned base-10 integer value, making the new referenced value the array element with the zero-based index identified by the token
And importantly:
note that leading zeros are not allowed
Zero itself is fine.
But 01 has a leading zero, so it's NOT treated as an array index - it becomes a member name instead. This protects against accidental octal interpretation.
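The index rule is small enough to write out. The following is a sketch under the assumption that a token counts as an array index only when it is "0" or a run of digits with no leading zero; `array_index_of_token` is a hypothetical name, not a function of the library.

```ocaml
(* Hypothetical helper: interpret a reference token as an array index
   following the RFC 6901 rule (digits only, no leading zeros). *)
let array_index_of_token (tok : string) : int option =
  let n = String.length tok in
  let is_digit c = c >= '0' && c <= '9' in
  let rec all_digits i = i >= n || (is_digit tok.[i] && all_digits (i + 1)) in
  if n = 0 || not (all_digits 0) then None
  else if n > 1 && tok.[0] = '0' then None (* leading zero, e.g. "01" *)
  else int_of_string_opt tok (* None if the number overflows int *)
```

A caller can fall back to treating the token as a member name whenever this returns `None`, which matches the behaviour described for 01.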
The "-" Token and Type Safety
RFC 6901, Section 4 introduces a special token:
exactly the single character "-", making the new referenced value the (nonexistent) member after the last array element.
This - marker is unique to JSON Pointer (JSON Path has no equivalent). It's primarily useful for JSON Patch operations (RFC 6902) to append elements to arrays.
The json-pointer library uses phantom types to encode the difference between pointers that can be used for navigation and pointers that target the "append position":
type nav     (* A pointer to an existing element *)
type append  (* A pointer ending with "-" (append position) *)
type 'a t    (* Pointer with phantom type parameter *)
type any     (* Existential: wraps either nav or append *)
When you parse a pointer with of_string, you get an any pointer that can be used directly with mutation operations:
The - creates an append pointer. The any type wraps either kind, making it ergonomic to use with operations like set and add.
The RFC explains that - refers to a nonexistent position:
Note that the use of the "-" character to index an array will always result in such an error condition because by definition it refers to a nonexistent array element.
So you cannot use get or find with an append pointer - it makes no sense to retrieve a value from a position that doesn't exist! The library enforces this:
- Use of_string_nav when you need to call get or find
- Use of_string (returns any) for mutation operations
Mutation operations like add accept any directly:
For retrieval operations, use of_string_nav which ensures the pointer doesn't contain -:
You can convert a navigation pointer to an append pointer using at_end:
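The phantom-type idea can be shown in isolation. The module below is a toy written for this explanation: the names `root`, `mem`, and `depth` and the record representation are assumptions, and the library's real representation may differ. It also skips token escaping. What it demonstrates is that a function typed `nav t -> ...` cannot be applied to an `append t`.

```ocaml
module Ptr : sig
  type nav                               (* points at an existing element *)
  type append                            (* ends with "-" *)
  type 'a t                              (* 'a is a phantom parameter *)
  val root : nav t
  val mem : string -> nav t -> nav t     (* extend with a member step *)
  val at_end : nav t -> append t         (* add the trailing "-" *)
  val to_string : 'a t -> string         (* works for both kinds *)
  val depth : nav t -> int               (* stands in for get: nav only *)
end = struct
  type nav
  type append
  (* The parameter never appears in the fields; it exists only in the types. *)
  type 'a t = { tokens : string list; dash : bool } (* tokens are reversed *)
  let root = { tokens = []; dash = false }
  let mem name p = { tokens = name :: p.tokens; dash = p.dash }
  let at_end p = { tokens = p.tokens; dash = true }
  let to_string p =
    let toks = List.rev p.tokens @ (if p.dash then [ "-" ] else []) in
    String.concat "" (List.map (fun t -> "/" ^ t) toks)
  let depth p = List.length p.tokens
end

let foo = Ptr.mem "foo" Ptr.root   (* : Ptr.nav Ptr.t *)
let foo_end = Ptr.at_end foo       (* : Ptr.append Ptr.t *)

(* [Ptr.depth foo_end] does not type-check: append t is not nav t. *)
```

Because the signature keeps `'a t` abstract, the only way to obtain an `append t` is through `at_end`, so the distinction cannot be forged by client code.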
While RFC 6901 defines JSON Pointer for read-only access, RFC 6902 (JSON Patch) uses JSON Pointer for modifications. The json-pointer library provides these operations.
The add operation inserts a value at a location. It accepts any pointers, so you can use of_string directly:
For arrays, add inserts BEFORE the specified index:
This is where the - marker shines - it appends to the end:
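The array behaviour of add can be modelled on a plain OCaml list. This sketch is independent of the library: `position` and `insert` are hypothetical names, and the bounds rule follows RFC 6902, which allows an index equal to the array length but nothing larger.

```ocaml
(* Where to insert: before an existing index, or at the append position
   denoted by the "-" token. *)
type position = Before of int | End

let insert (pos : position) (x : 'a) (items : 'a list) : 'a list option =
  match pos with
  | End -> Some (items @ [ x ])
  | Before i ->
    let rec go i = function
      | rest when i = 0 -> Some (x :: rest) (* insert before this position *)
      | [] -> None                          (* index past the end *)
      | y :: rest -> Option.map (fun l -> y :: l) (go (i - 1) rest)
    in
    if i < 0 then None else go i items
```

For example, `insert (Before 1) "x" ["a"; "b"]` places "x" between the two elements, while `insert End "x" ["a"; "b"]` appends it.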
You can also use at_end to create an append pointer programmatically:
Since add, set, move, and copy accept any pointers, you can use of_string directly without any pattern matching. This makes JSON Patch implementations straightforward:
The same pointer works whether it targets an existing position or the append marker - no conditional logic needed.
The remove operation deletes a value. It only accepts nav t because you can only remove something that exists:
For arrays, it removes and shifts:
The replace operation updates an existing value:
Unlike add, replace requires the target to already exist (hence nav t). Attempting to replace a nonexistent path raises an error.
The move operation relocates a value. The source (from) must be a nav t (you can only move something that exists), but the destination (path) accepts any:
The copy operation duplicates a value (same typing as move):
The test operation verifies a value (useful in JSON Patch):
RFC 6901, Section 3 explains the escaping rules:
Because the characters '~' (%x7E) and '/' (%x2F) have special meanings in JSON Pointer, '~' needs to be encoded as '~0' and '/' needs to be encoded as '~1' when these characters appear in a reference token.
Why these specific characters?
- / separates tokens, so it must be escaped inside a token
- ~ is the escape character itself, so it must also be escaped
The escape sequences are:
- ~0 represents ~ (tilde)
- ~1 represents / (forward slash)
Important: When using json-pointer programmatically, you rarely need to think about escaping. The Mem variant stores unescaped strings, and escaping happens automatically during serialization:
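As a rough picture of what that serialization step involves, the sketch below escapes each token and joins the results. Both functions are hypothetical stand-ins written with the standard library; they are not the library's Token functions.

```ocaml
(* Hypothetical stand-in: encode one reference token for use inside a
   JSON Pointer string. *)
let escape (tok : string) : string =
  let b = Buffer.create (String.length tok) in
  String.iter
    (function
      | '~' -> Buffer.add_string b "~0"
      | '/' -> Buffer.add_string b "~1"
      | c -> Buffer.add_char b c)
    tok;
  Buffer.contents b

(* Serialize a list of unescaped tokens as a pointer string. *)
let to_pointer_string (tokens : string list) : string =
  String.concat "" (List.map (fun t -> "/" ^ escape t) tokens)
```

Escaping works character by character, so a key that already looks like an escape sequence, such as the two-character key ~1, is still encoded unambiguously.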
The Token module exposes the escaping functions:
And the reverse process:
RFC 6901, Section 4 is careful to specify the unescaping order:
Evaluation of each reference token begins by decoding any escaped character sequence. This is performed by first transforming any occurrence of the sequence '~1' to '/', and then transforming any occurrence of the sequence '~0' to '~'. By performing the substitutions in this order, an implementation avoids the error of turning '~01' first into '~1' and then into '/', which would be incorrect (the string '~01' correctly becomes '~1' after transformation).
Let's verify this tricky case:
If we unescaped ~0 first, ~01 would become ~1, which would then become /. But that's wrong! The sequence ~01 should become the literal string ~1 (a tilde followed by the digit one).
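The effect of the substitution order can be checked with a small sketch. `replace_all`, `unescape`, and `unescape_wrong` are hypothetical helpers written for this comparison, not library functions, and they do not reject invalid sequences such as ~2.

```ocaml
(* Replace every occurrence of the non-empty string [sub] in [s] by [by],
   scanning left to right. *)
let replace_all ~sub ~by s =
  let n = String.length s and m = String.length sub in
  let b = Buffer.create n in
  let rec go i =
    if i >= n then ()
    else if i + m <= n && String.sub s i m = sub then begin
      Buffer.add_string b by;
      go (i + m)
    end
    else begin
      Buffer.add_char b s.[i];
      go (i + 1)
    end
  in
  go 0;
  Buffer.contents b

(* RFC 6901 order: "~1" first, then "~0". *)
let unescape tok =
  tok |> replace_all ~sub:"~1" ~by:"/" |> replace_all ~sub:"~0" ~by:"~"

(* The reversed order, kept only to show the failure. *)
let unescape_wrong tok =
  tok |> replace_all ~sub:"~0" ~by:"~" |> replace_all ~sub:"~1" ~by:"/"
```

With these definitions, `unescape "~01"` yields the two-character string ~1, whereas `unescape_wrong "~01"` collapses to a single /.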
JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:
A JSON Pointer can be represented in a URI fragment identifier by encoding it into octets using UTF-8, while percent-encoding those characters not allowed by the fragment rule in RFC 3986.
This adds percent-encoding on top of the ~0/~1 escaping:
The % character must be percent-encoded as %25 in URIs, and spaces become %20.
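A minimal sketch of this second encoding layer is shown below. It assumes the input is an already-escaped pointer string held as UTF-8 bytes, and it percent-encodes every byte that the RFC 3986 fragment rule does not allow literally. `uri_fragment_of_pointer` is a hypothetical name, not the library's function.

```ocaml
(* Characters that the RFC 3986 "fragment" rule allows unencoded. *)
let allowed_in_fragment = function
  | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9'
  | '-' | '.' | '_' | '~' (* unreserved *)
  | '!' | '$' | '&' | '\'' | '(' | ')'
  | '*' | '+' | ',' | ';' | '=' (* sub-delims *)
  | ':' | '@' | '/' | '?' -> true
  | _ -> false

(* Hypothetical helper: render a pointer string as a URI fragment,
   including the leading '#'. *)
let uri_fragment_of_pointer (pointer : string) : string =
  let b = Buffer.create (String.length pointer + 1) in
  Buffer.add_char b '#';
  String.iter
    (fun c ->
      if allowed_in_fragment c then Buffer.add_char b c
      else Buffer.add_string b (Printf.sprintf "%%%02X" (Char.code c)))
    pointer;
  Buffer.contents b
```

Since ~ is an unreserved character, the ~0 and ~1 escapes pass through this step unchanged; only characters such as % and the space are rewritten.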
Here's the RFC example showing the URI fragment forms:
"" -> # -> whole document
"/foo" -> #/foo -> ["bar", "baz"]
"/foo/0" -> #/foo/0 -> "bar"
"/" -> #/ -> 0
"/a~1b" -> #/a~1b -> 1
"/c%d" -> #/c%25d -> 2
"/ " -> #/%20 -> 7
"/m~0n" -> #/m~0n -> 8
Instead of parsing strings, you can build pointers from indices:
For array access, use the nth helper:
You can build pointers incrementally using the / operator (or append_index):
Or concatenate two pointers:
The library integrates with the Jsont codec system, allowing you to combine JSON Pointer navigation with typed decoding. This is powerful because you can point to a location in a JSON document and decode it directly to an OCaml type.
The path combinator combines pointer navigation with typed decoding:
Extract a list of strings:
Use ~absent to provide a default when a path doesn't exist:
You can extract values from deeply nested structures:
Raw access requires pattern matching:
Typed access is cleaner and type-safe:
The typed approach catches mismatches at decode time with clear errors.
The set and add functions accept any pointers, which means you can use the result of of_string directly without pattern matching:
This is useful for implementing JSON Patch (RFC 6902) where operations like "add" can target either existing positions or the append marker. If you need to distinguish between pointer types at runtime, use of_string_kind which returns a polymorphic variant:
JSON Pointer (RFC 6901) provides a simple but powerful way to address values within JSON documents:
- Pointers are strings of /-separated reference tokens
- Escape with ~0 for ~ and ~1 for / in tokens (handled automatically by the library)
- The "-" token is unique to JSON Pointer - it means "append position" for arrays
- Phantom types (nav t vs append t) prevent misuse of append pointers with retrieval operations, while the any existential type allows ergonomic use with mutation operations
- "-" (append) pointers cannot be used with get/find
The json-pointer library implements all of this with type-safe OCaml interfaces, integration with the jsont codec system, and proper error handling for malformed pointers and missing values.