JSON Pointer Tutorial

This tutorial introduces JSON Pointer as defined in RFC 6901, and demonstrates the json-pointer OCaml library through interactive examples.

JSON Pointer vs JSON Path

Before diving in, it's worth understanding the difference between JSON Pointer and JSON Path, as they serve different purposes:

JSON Pointer (RFC 6901) is an indicator syntax that specifies a single location within JSON data. It always identifies at most one value.

JSON Path is a query syntax that can search JSON data and return multiple values matching specified criteria.

Use JSON Pointer when you need to address a single, specific location (like JSON Schema's $ref). Use JSON Path when you might need multiple results (like kubectl's -o jsonpath output queries).

The json-pointer library implements JSON Pointer and integrates with the Jsont.Path type for representing navigation indices.

Setup

First, let's set up our environment. In the toplevel, you can load the library with #require "json-pointer.top";; which will automatically install pretty printers.

Json_pointer_top.install ();;
open Json_pointer;;
let parse_json s =
  match Jsont_bytesrw.decode_string Jsont.json s with
  | Ok json -> json
  | Error e -> failwith e;;

What is JSON Pointer?

From RFC 6901, Section 1:

JSON Pointer defines a string syntax for identifying a specific value within a JavaScript Object Notation (JSON) document.

In other words, JSON Pointer is an addressing scheme for locating values inside a JSON structure. Think of it like a filesystem path, but for JSON documents instead of files.

For example, given this JSON document:

let users_json = parse_json {|{
  "users": [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25}
  ]
}|};;

The JSON Pointer /users/0/name refers to the string "Alice":

let ptr = of_string_nav "/users/0/name";; get ptr users_json;;

In OCaml, this is represented by the 'a Json_pointer.t type - a sequence of navigation steps from the document root to a target value. The phantom type parameter 'a encodes whether this is a navigation pointer or an append pointer (more on this later).

Syntax: Reference Tokens

RFC 6901, Section 3 defines the syntax:

A JSON Pointer is a Unicode string containing a sequence of zero or more reference tokens, each prefixed by a '/' (%x2F) character.

The grammar is elegantly simple:

json-pointer    = *( "/" reference-token )
reference-token = *( unescaped / escaped )

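This means that the empty string is a valid pointer (zero tokens), that every non-empty pointer starts with /, and that each reference token runs up to the next / or the end of the string.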

Let's see this in action:

of_string_nav "";;

The empty pointer has no reference tokens - it points to the root.

of_string_nav "/foo";;

The pointer /foo has one token: foo. Since it's not a number, it's interpreted as an object member name (Mem).

of_string_nav "/foo/0";;

Here we have two tokens: foo (a member name) and 0 (interpreted as an array index Nth).

of_string_nav "/foo/bar/baz";;

Multiple tokens navigate deeper into nested structures.

The Index Type

Each reference token is represented using Jsont.Path.index:

type index = Jsont.Path.index
(* = Jsont.Path.Mem of string * Jsont.Meta.t
   | Jsont.Path.Nth of int * Jsont.Meta.t *)

The Mem constructor is for object member access, and Nth is for array index access. The member name is unescaped - you work with the actual key string (like "a/b") and the library handles any escaping needed for the JSON Pointer string representation.

Invalid Syntax

What happens if a pointer doesn't start with /?

of_string_nav "foo";;

The RFC is strict: non-empty pointers MUST start with /.

For safer parsing, use of_string_result:

of_string_result "foo";; of_string_result "/valid";;
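To make the grammar and the leading-slash rule concrete, here is a minimal stdlib-only sketch of the first parsing step. The name split_tokens is hypothetical and is not part of json-pointer; it returns the raw, still escaped tokens and says nothing about how the library itself is implemented.

```ocaml
(* Hypothetical helper, not part of json-pointer: split a pointer string
   into its raw (still escaped) reference tokens. *)
let split_tokens (s : string) : (string list, string) result =
  if s = "" then Ok [] (* the root pointer has zero tokens *)
  else if s.[0] <> '/' then
    Error "a non-empty JSON Pointer must start with '/'"
  else
    (* Drop the leading '/', then cut at every remaining '/'. *)
    Ok (String.split_on_char '/' (String.sub s 1 (String.length s - 1)))
```

Under this sketch, "/" yields the single empty token, which is why it addresses the member with an empty name rather than the root.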

Evaluation: Navigating JSON

Now we come to the heart of JSON Pointer: evaluation. RFC 6901, Section 4 describes how a pointer is resolved against a JSON document:

Evaluation of a JSON Pointer begins with a reference to the root value of a JSON document and completes with a reference to some value within the document. Each reference token in the JSON Pointer is evaluated sequentially.

Let's use the example JSON document from RFC 6901, Section 5:

let rfc_example = parse_json {|{
  "foo": ["bar", "baz"],
  "": 0,
  "a/b": 1,
  "c%d": 2,
  "e^f": 3,
  "g|h": 4,
  "i\\j": 5,
  "k\"l": 6,
  " ": 7,
  "m~n": 8
}|};;

This document is carefully constructed to exercise various edge cases!

The Root Pointer

get root rfc_example ;;

The empty pointer (root) returns the whole document.

Object Member Access

get (of_string_nav "/foo") rfc_example ;;

/foo accesses the member named foo, which is an array.

Array Index Access

get (of_string_nav "/foo/0") rfc_example ;; get (of_string_nav "/foo/1") rfc_example ;;

/foo/0 first goes to foo, then accesses index 0 of the array.

Empty String as Key

JSON allows empty strings as object keys:

get (of_string_nav "/") rfc_example ;;

The pointer / has one token: the empty string. This accesses the member with an empty name.

Keys with Special Characters

The RFC example includes keys with / and ~ characters:

get (of_string_nav "/a~1b") rfc_example ;;

The token a~1b refers to the key a/b. We'll explain this escaping below.

get (of_string_nav "/m~0n") rfc_example ;;

The token m~0n refers to the key m~n.

Important: When using the OCaml library programmatically, you don't need to worry about escaping. The Mem variant holds the literal key name:

let slash_ptr = make [mem "a/b"];; to_string slash_ptr;; get slash_ptr rfc_example ;;

The library escapes it when converting to string.

Other Special Characters (No Escaping Needed)

Most characters don't need escaping in JSON Pointer strings:

get (of_string_nav "/c%d") rfc_example ;;
get (of_string_nav "/e^f") rfc_example ;;
get (of_string_nav "/g|h") rfc_example ;;
get (of_string_nav "/ ") rfc_example ;;

Even a space is a valid key character!

Error Conditions

What happens when we try to access something that doesn't exist?

get_result (of_string_nav "/nonexistent") rfc_example;; find (of_string_nav "/nonexistent") rfc_example;;

Or an out-of-bounds array index:

find (of_string_nav "/foo/99") rfc_example;;

Or try to index into a non-container:

find (of_string_nav "/foo/0/invalid") rfc_example;;

The library provides both exception-raising and result-returning variants:

val get : nav t -> Jsont.json -> Jsont.json
val get_result : nav t -> Jsont.json -> (Jsont.json, Jsont.Error.t) result
val find : nav t -> Jsont.json -> Jsont.json option
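The evaluation rules and the three failure cases above can be sketched on a tiny stand-in JSON type. Everything here (the json and index types, find, get) is a simplified assumption written for illustration; it is not Jsont's representation and not the library's code.

```ocaml
(* A minimal stand-in for a JSON value; Jsont's real type also carries metadata. *)
type json =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of json list
  | Object of (string * json) list

type index = Mem of string | Nth of int

(* Walk the document one index at a time; None means "no such location". *)
let rec find (ptr : index list) (doc : json) : json option =
  match ptr, doc with
  | [], v -> Some v (* no tokens left: this is the target *)
  | Mem k :: rest, Object fields ->
    Option.bind (List.assoc_opt k fields) (find rest)
  | Nth i :: rest, Array items when i >= 0 ->
    Option.bind (List.nth_opt items i) (find rest)
  | Nth i :: rest, Object fields ->
    (* Against an object, a numeric token is just a member name. *)
    Option.bind (List.assoc_opt (string_of_int i) fields) (find rest)
  | _ -> None (* missing member, bad index, or indexing into a scalar *)

(* Exception-raising variant built on top of find. *)
let get ptr doc =
  match find ptr doc with
  | Some v -> v
  | None -> failwith "JSON Pointer: no value at this location"
```

A missing member, an out-of-bounds index, and a step into a scalar all collapse to None here; a result-returning variant would instead report which of the three occurred.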

Array Index Rules

RFC 6901 has specific rules for array indices. Section 4 states:

characters comprised of digits ... that represent an unsigned base-10 integer value, making the new referenced value the array element with the zero-based index identified by the token

And importantly:

note that leading zeros are not allowed

of_string_nav "/foo/0";;

Zero itself is fine.

of_string_nav "/foo/01";;

But 01 has a leading zero, so it's NOT treated as an array index - it becomes a member name instead. This protects against accidental octal interpretation.
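The index rule can be sketched as a small classifier. The index type and the classify function below are illustrative assumptions, not the library's definitions, and the sketch deliberately ignores the "-" token.

```ocaml
type index = Mem of string | Nth of int

(* True when the token is non-empty and made only of ASCII digits. *)
let all_digits (tok : string) : bool =
  let ok = ref (tok <> "") in
  String.iter (fun c -> if c < '0' || c > '9' then ok := false) tok;
  !ok

(* Decide whether a token is an array index or a member name. *)
let classify (tok : string) : index =
  let leading_zero = String.length tok > 1 && tok.[0] = '0' in
  if all_digits tok && not leading_zero then
    match int_of_string_opt tok with
    | Some i -> Nth i
    | None -> Mem tok (* too large for an OCaml int: keep it as a name *)
  else Mem tok
```

With this sketch, classify "0" and classify "10" give Nth values, while classify "01" gives Mem "01".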

The End-of-Array Marker: - and Type Safety

RFC 6901, Section 4 introduces a special token:

exactly the single character "-", making the new referenced value the (nonexistent) member after the last array element.

This - marker is unique to JSON Pointer (JSON Path has no equivalent). It's primarily useful for JSON Patch operations (RFC 6902) to append elements to arrays.

The json-pointer library uses phantom types to encode the difference between pointers that can be used for navigation and pointers that target the "append position":

type nav     (* A pointer to an existing element *)
type append  (* A pointer ending with "-" (append position) *)
type 'a t    (* Pointer with phantom type parameter *)
type any     (* Existential: wraps either nav or append *)

When you parse a pointer with of_string, you get an any pointer that can be used directly with mutation operations:

of_string "/foo/0";; of_string "/foo/-";;

The - creates an append pointer. The any type wraps either kind, making it ergonomic to use with operations like set and add.

Why Two Pointer Types?

The RFC explains that - refers to a nonexistent position:

Note that the use of the "-" character to index an array will always result in such an error condition because by definition it refers to a nonexistent array element.

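So you cannot use get or find with an append pointer - it makes no sense to retrieve a value from a position that doesn't exist! The library enforces this in the types: get, get_result, and find all take a nav t, so passing an append t is a compile-time type error.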

Mutation operations like add accept any directly:

let arr_obj = parse_json {|{"foo":["a","b"]}|};; add (of_string "/foo/-") arr_obj ~value:(Jsont.Json.string "c");;

For retrieval operations, use of_string_nav which ensures the pointer doesn't contain -:

of_string_nav "/foo/0";; of_string_nav "/foo/-";;
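For readers new to phantom types, here is a minimal sketch of the idea. The Ptr module is hypothetical and far simpler than json-pointer: tokens are plain strings, escaping is omitted, and length stands in for a retrieval operation such as get.

```ocaml
module Ptr : sig
  type nav
  type append
  type 'a t
  val root : nav t
  val ( / ) : nav t -> string -> nav t
  val at_end : nav t -> append t
  val to_string : 'a t -> string
  val length : nav t -> int (* stand-in for a nav-only operation *)
end = struct
  type nav
  type append
  (* The parameter 'a never appears on the right-hand side: it is a phantom
     that exists only for the type checker. *)
  type 'a t = string list
  let root = []
  let ( / ) p tok = p @ [ tok ]
  let at_end p = p @ [ "-" ]
  let to_string p = String.concat "" (List.map (fun t -> "/" ^ t) p)
  let length = List.length
end

let foo_end = Ptr.(at_end (root / "foo"))
(* Ptr.length foo_end would be rejected: append t is not nav t. *)
```

Because the signature keeps 'a t abstract, the only way to obtain an append t is through at_end, and no nav-only function will accept it.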

Creating Append Pointers Programmatically

You can convert a navigation pointer to an append pointer using at_end:

let nav_ptr = of_string_nav "/foo";; let app_ptr = at_end nav_ptr;; to_string app_ptr;;

Mutation Operations

While RFC 6901 defines JSON Pointer for read-only access, RFC 6902 (JSON Patch) uses JSON Pointer for modifications. The json-pointer library provides these operations.

Add

The add operation inserts a value at a location. It accepts any pointers, so you can use of_string directly:

let obj = parse_json {|{"foo":"bar"}|};; add (of_string "/baz") obj ~value:(Jsont.Json.string "qux");;

For arrays, add inserts BEFORE the specified index:

let arr_obj = parse_json {|{"foo":["a","b"]}|};; add (of_string "/foo/1") arr_obj ~value:(Jsont.Json.string "X");;

This is where the - marker shines - it appends to the end:

add (of_string "/foo/-") arr_obj ~value:(Jsont.Json.string "c");;

You can also use at_end to create an append pointer programmatically:

add (any (at_end (of_string_nav "/foo"))) arr_obj ~value:(Jsont.Json.string "c");;
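The array behaviour of add can be sketched on plain OCaml lists. The function names insert_at and append are hypothetical; the bounds rule (an index may equal the array length but not exceed it) follows RFC 6902.

```ocaml
(* Insert x before position i. i may equal the list length (which appends);
   anything larger, or negative, is an error (None). *)
let insert_at (i : int) (x : 'a) (xs : 'a list) : 'a list option =
  let rec go i xs =
    match i, xs with
    | 0, _ -> Some (x :: xs)
    | _, [] -> None (* ran off the end before reaching position i *)
    | _, y :: ys -> Option.map (fun r -> y :: r) (go (i - 1) ys)
  in
  if i < 0 then None else go i xs

(* What the "-" token means for add: always place x after the last element. *)
let append (x : 'a) (xs : 'a list) : 'a list = xs @ [ x ]
```

The "-" token saves the caller from knowing the current length: append never fails, whereas insert_at with a stale index can.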

Ergonomic Mutation with any

Since add, set, move, and copy accept any pointers, you can use of_string directly without any pattern matching. This makes JSON Patch implementations straightforward:

let items = parse_json {|{"items":["x"]}|};; add (of_string "/items/0") items ~value:(Jsont.Json.string "y");; add (of_string "/items/-") items ~value:(Jsont.Json.string "z");;

The same pointer works whether it targets an existing position or the append marker - no conditional logic needed.

Remove

The remove operation deletes a value. It only accepts nav t because you can only remove something that exists:

let two_fields = parse_json {|{"foo":"bar","baz":"qux"}|};; remove (of_string_nav "/baz") two_fields ;;

For arrays, it removes and shifts:

let three_elem = parse_json {|{"foo":["a","b","c"]}|};; remove (of_string_nav "/foo/1") three_elem ;;

Replace

The replace operation updates an existing value:

replace (of_string_nav "/foo") obj ~value:(Jsont.Json.string "baz") ;;

Unlike add, replace requires the target to already exist (hence nav t). Attempting to replace a nonexistent path raises an error.

Move

The move operation relocates a value. The source (from) must be a nav t (you can only move something that exists), but the destination (path) accepts any:

let nested = parse_json {|{"foo":{"bar":"baz"},"qux":{}}|};; move ~from:(of_string_nav "/foo/bar") ~path:(of_string "/qux/thud") nested;;

Copy

The copy operation duplicates a value (same typing as move):

let to_copy = parse_json {|{"foo":{"bar":"baz"}}|};; copy ~from:(of_string_nav "/foo/bar") ~path:(of_string "/foo/qux") to_copy;;

Test

The test operation verifies a value (useful in JSON Patch):

test (of_string_nav "/foo") obj ~expected:(Jsont.Json.string "bar");; test (of_string_nav "/foo") obj ~expected:(Jsont.Json.string "wrong");;

Escaping Special Characters

RFC 6901, Section 3 explains the escaping rules:

Because the characters '~' (%x7E) and '/' (%x2F) have special meanings in JSON Pointer, '~' needs to be encoded as '~0' and '/' needs to be encoded as '~1' when these characters appear in a reference token.

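Why these specific characters? / separates reference tokens, and ~ introduces an escape sequence, so neither can appear literally inside a token.

The escape sequences are ~0 for a literal ~ and ~1 for a literal /.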

The Library Handles Escaping Automatically

Important: When using json-pointer programmatically, you rarely need to think about escaping. The Mem variant stores unescaped strings, and escaping happens automatically during serialization:

let p = make [mem "a/b"];; to_string p;; of_string_nav "/a~1b";;

Escaping in Action

The Token module exposes the escaping functions:

Token.escape "hello";; Token.escape "a/b";; Token.escape "a~b";; Token.escape "~/";;

Unescaping

And the reverse process:

Token.unescape "a~1b";; Token.unescape "a~0b";;

The Order Matters!

RFC 6901, Section 4 is careful to specify the unescaping order:

Evaluation of each reference token begins by decoding any escaped character sequence. This is performed by first transforming any occurrence of the sequence '~1' to '/', and then transforming any occurrence of the sequence '~0' to '~'. By performing the substitutions in this order, an implementation avoids the error of turning '~01' first into '~1' and then into '/', which would be incorrect (the string '~01' correctly becomes '~1' after transformation).

Let's verify this tricky case:

Token.unescape "~01";;

If we unescaped ~0 first, ~01 would become ~1, which would then become /. But that's wrong! The sequence ~01 should become the literal string ~1 (a tilde followed by the digit one).
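The effect of the substitution order can be reproduced with a stdlib-only sketch. The functions below (escape, replace_all, unescape, unescape_wrong) are written for this illustration and are not the library's Token module; the sketch also leaves invalid sequences such as ~2 untouched instead of rejecting them.

```ocaml
(* Escape one reference token: '~' becomes "~0" and '/' becomes "~1". *)
let escape (tok : string) : string =
  let b = Buffer.create (String.length tok) in
  String.iter
    (function
      | '~' -> Buffer.add_string b "~0"
      | '/' -> Buffer.add_string b "~1"
      | c -> Buffer.add_char b c)
    tok;
  Buffer.contents b

(* Replace every occurrence of sub (assumed non-empty) with by, left to right. *)
let replace_all ~(sub : string) ~(by : string) (s : string) : string =
  let n = String.length s and m = String.length sub in
  let b = Buffer.create n in
  let rec go i =
    if i >= n then ()
    else if i + m <= n && String.sub s i m = sub then begin
      Buffer.add_string b by;
      go (i + m)
    end
    else begin
      Buffer.add_char b s.[i];
      go (i + 1)
    end
  in
  go 0;
  Buffer.contents b

(* RFC order: "~1" first, then "~0". *)
let unescape tok =
  tok |> replace_all ~sub:"~1" ~by:"/" |> replace_all ~sub:"~0" ~by:"~"

(* Reversed order, shown only to demonstrate the bug the RFC warns about. *)
let unescape_wrong tok =
  tok |> replace_all ~sub:"~0" ~by:"~" |> replace_all ~sub:"~1" ~by:"/"
```

On the input "~01", unescape produces "~1" while unescape_wrong produces "/", which is exactly the mistake the RFC describes.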

URI Fragment Encoding

JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:

A JSON Pointer can be represented in a URI fragment identifier by encoding it into octets using UTF-8, while percent-encoding those characters not allowed by the fragment rule in RFC 3986.

This adds percent-encoding on top of the ~0/~1 escaping:

to_uri_fragment (of_string_nav "/foo");;
to_uri_fragment (of_string_nav "/a~1b");;
to_uri_fragment (of_string_nav "/c%d");;
to_uri_fragment (of_string_nav "/ ");;

The % character must be percent-encoded as %25 in URIs, and spaces become %20.
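As a rough illustration of the percent-encoding layer, the sketch below encodes an already escaped pointer string byte by byte, keeping only the characters that RFC 3986 allows in a fragment. The name fragment_encode is hypothetical, the leading # is left to the caller, and the library's to_uri_fragment may make different choices.

```ocaml
(* Characters allowed unencoded by the RFC 3986 fragment rule:
   unreserved, sub-delims, ':', '@', '/' and '?'. *)
let is_fragment_char (c : char) : bool =
  match c with
  | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' -> true
  | '-' | '.' | '_' | '~' -> true
  | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '=' -> true
  | ':' | '@' | '/' | '?' -> true
  | _ -> false

(* Percent-encode every other byte of the (UTF-8) pointer string. *)
let fragment_encode (ptr : string) : string =
  let b = Buffer.create (String.length ptr) in
  String.iter
    (fun c ->
      if is_fragment_char c then Buffer.add_char b c
      else Buffer.add_string b (Printf.sprintf "%%%02X" (Char.code c)))
    ptr;
  Buffer.contents b
```

Note that ~ is an unreserved character, so the ~0 and ~1 escapes pass through this layer unchanged.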

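Here's the RFC example (Section 6) showing the URI fragment forms and the values they evaluate to against the example document from Section 5:

#            the whole document
#/foo        ["bar", "baz"]
#/foo/0      "bar"
#/           0
#/a~1b       1
#/c%25d      2
#/e%5Ef      3
#/g%7Ch      4
#/i%5Cj      5
#/k%22l      6
#/%20        7
#/m~0n       8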

Building Pointers Programmatically

Instead of parsing strings, you can build pointers from indices:

let port_ptr = make [mem "database"; mem "port"];; to_string port_ptr;;

For array access, use the nth helper:

let first_feature_ptr = make [mem "features"; nth 0];; to_string first_feature_ptr;;

Pointer Navigation

You can build pointers incrementally using the / operator (or append_index):

let db_ptr = of_string_nav "/database";;
let creds_ptr = db_ptr / mem "credentials";;
let user_ptr = creds_ptr / mem "username";;
to_string user_ptr;;

Or concatenate two pointers:

let base = of_string_nav "/api/v1";; let endpoint = of_string_nav "/users/0";; to_string (concat base endpoint);;
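Conceptually, a pointer built this way is a list of indices that gets escaped only when it is printed. The sketch below models that with a plain list; the index type and the functions escape, to_string, append_index, and concat are simplified stand-ins that borrow the library's names but not its implementation.

```ocaml
type index = Mem of string | Nth of int

(* Escape '~' and '/' inside a member name. *)
let escape (tok : string) : string =
  String.to_seq tok
  |> Seq.map (function '~' -> "~0" | '/' -> "~1" | c -> String.make 1 c)
  |> List.of_seq
  |> String.concat ""

(* Serialise a pointer: each index becomes "/" followed by its token. *)
let to_string (ptr : index list) : string =
  let token = function Mem k -> escape k | Nth n -> string_of_int n in
  String.concat "" (List.map (fun i -> "/" ^ token i) ptr)

(* Extend a pointer by one step, or join two pointers end to end. *)
let append_index (ptr : index list) (i : index) : index list = ptr @ [ i ]
let concat (a : index list) (b : index list) : index list = a @ b
```

Since the empty list prints as the empty string, concatenating with the root pointer leaves the other pointer unchanged.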

Jsont Integration

The library integrates with the Jsont codec system, allowing you to combine JSON Pointer navigation with typed decoding. This is powerful because you can point to a location in a JSON document and decode it directly to an OCaml type.

let config_json = parse_json {|{
  "database": {
    "host": "localhost",
    "port": 5432,
    "credentials": {"username": "admin", "password": "secret"}
  },
  "features": ["auth", "logging", "metrics"]
}|};;

Typed Access with path

The path combinator combines pointer navigation with typed decoding:

let nav = of_string_nav "/database/host";;
let db_host =
  Jsont.Json.decode (path nav Jsont.string) config_json
  |> Result.get_ok;;
let db_port =
  Jsont.Json.decode (path (of_string_nav "/database/port") Jsont.int) config_json
  |> Result.get_ok;;

Extract a list of strings:

let features = Jsont.Json.decode (path (of_string_nav "/features") Jsont.(list string)) config_json |> Result.get_ok;;

Default Values with ~absent

Use ~absent to provide a default when a path doesn't exist:

let timeout = Jsont.Json.decode (path ~absent:30 (of_string_nav "/database/timeout") Jsont.int) config_json |> Result.get_ok;;
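The interplay between navigation, decoding, and ~absent can be sketched without Jsont. Everything below is a simplified assumption: a trimmed json type, pointers as lists of member names, decoders as plain functions, and a path function that only mirrors the shape of the library's combinator.

```ocaml
(* Trimmed stand-in for a JSON value (objects, numbers and strings only). *)
type json =
  | Number of float
  | String of string
  | Object of (string * json) list

(* Follow a list of member names; None if any step is missing. *)
let rec find (keys : string list) (doc : json) : json option =
  match keys, doc with
  | [], v -> Some v
  | k :: rest, Object fields -> Option.bind (List.assoc_opt k fields) (find rest)
  | _ -> None

(* Navigate, then decode. ~absent is used only when the location is missing,
   not when the value is present but has the wrong type. *)
let path ?absent keys (decode : json -> ('a, string) result) (doc : json) :
    ('a, string) result =
  match find keys doc, absent with
  | Some v, _ -> decode v
  | None, Some default -> Ok default
  | None, None -> Error ("no value at /" ^ String.concat "/" keys)

(* A hand-written decoder playing the role of an int codec. *)
let int_of_json = function
  | Number f when Float.is_integer f -> Ok (int_of_float f)
  | _ -> Error "expected an integer"
```

In this sketch a default never masks a type mismatch: a present value of the wrong type still yields an Error.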

Nested Path Extraction

You can extract values from deeply nested structures:

let org_json = parse_json {|{
  "organization": {
    "owner": {"name": "Alice", "email": "alice@example.com", "age": 35},
    "members": [{"name": "Bob", "email": "bob@example.com", "age": 28}]
  }
}|};;
Jsont.Json.decode (path (of_string_nav "/organization/owner/name") Jsont.string) org_json
|> Result.get_ok;;
Jsont.Json.decode (path (of_string_nav "/organization/members/0/age") Jsont.int) org_json
|> Result.get_ok;;

Comparison: Raw vs Typed Access

Raw access requires pattern matching:

let raw_port =
  match get (of_string_nav "/database/port") config_json with
  | Jsont.Number (f, _) -> int_of_float f
  | _ -> failwith "expected number";;

Typed access is cleaner and type-safe:

let typed_port = Jsont.Json.decode (path (of_string_nav "/database/port") Jsont.int) config_json |> Result.get_ok;;

The typed approach catches mismatches at decode time with clear errors.

Updates with Polymorphic Pointers

The set and add functions accept any pointers, which means you can use the result of of_string directly without pattern matching:

let tasks = parse_json {|{"tasks":["buy milk"]}|};; set (of_string "/tasks/0") tasks ~value:(Jsont.Json.string "buy eggs");; set (of_string "/tasks/-") tasks ~value:(Jsont.Json.string "call mom");;

This is useful for implementing JSON Patch (RFC 6902) where operations like "add" can target either existing positions or the append marker. If you need to distinguish between pointer types at runtime, use of_string_kind which returns a polymorphic variant:

of_string_kind "/tasks/0";; of_string_kind "/tasks/-";;

Summary

JSON Pointer (RFC 6901) provides a simple but powerful way to address values within JSON documents:

  1. Syntax: Pointers are strings of /-separated reference tokens
  2. Escaping: Use ~0 for ~ and ~1 for / in tokens (handled automatically by the library)
  3. Evaluation: Tokens navigate through objects (by key) and arrays (by index)
  4. URI Encoding: Pointers can be percent-encoded for use in URIs
  5. Mutations: Combined with JSON Patch (RFC 6902), pointers enable structured updates
  6. Type Safety: Phantom types (nav t vs append t) prevent misuse of append pointers with retrieval operations, while the any existential type allows ergonomic use with mutation operations

The json-pointer library implements all of this with type-safe OCaml interfaces, integration with the jsont codec system, and proper error handling for malformed pointers and missing values.

Key Points on JSON Pointer vs JSON Path