What is YAML?
YAML is a human-readable data serialization format. It's commonly used for configuration files, data exchange, and anywhere you need structured data that humans will read and edit.
YAML is designed to be more readable than JSON or XML:
- No curly braces or brackets required for simple structures
- Indentation defines structure (like Python)
- Comments are supported
- Multiple data types are recognized automatically
YAML vs JSON
YAML 1.2 is designed as a superset of JSON, so valid JSON is, with rare exceptions, also valid YAML. YAML also offers a lighter block syntax for the same data:
JSON:                    YAML:
{                        name: Alice
  "name": "Alice",       age: 30
  "age": 30,             active: true
  "active": true
}
The YAML version is cleaner for humans to read and write.
Setup
First, let's set up our environment. The library is loaded with:
open Yamlrw;;
Basic Parsing
The simplest way to parse YAML is with Yamlrw.of_string:
let simple = of_string "hello";;
YAML automatically recognizes different data types:
of_string "42";;
of_string "3.14";;
of_string "true";;
of_string "null";;
Note that integers are stored as floats in the JSON-compatible Yamlrw.value type, matching the behavior of JSON parsers.
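Because numbers arrive as `Float, code that needs an OCaml int has to convert explicitly. The following is a minimal sketch using only the standard library; int_of_value is a hypothetical helper written for illustration and is not part of Yamlrw (the tutorial's own Util.get_int, shown later, covers the same need).

```ocaml
(* Hypothetical helper, not part of Yamlrw: recover an int from a `Float
   node only when the float has no fractional part. *)
let int_of_value = function
  | `Float f when Float.is_integer f -> Some (int_of_float f)
  | _ -> None;;

let () =
  assert (int_of_value (`Float 42.) = Some 42);
  assert (int_of_value (`Float 3.14) = None);
  assert (int_of_value (`String "42") = None);;
```

Returning an option keeps the caller in charge of what to do with non-integral or non-numeric nodes.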
Boolean Values
Besides true and false, the YAML 1.1 spellings yes/no and on/off are also recognized as boolean values:
of_string "yes";;
of_string "no";;
of_string "on";;
of_string "off";;
Strings
Strings can be plain, single-quoted, or double-quoted:
of_string "plain text";;
of_string "'single quoted'";;
of_string {|"double quoted"|};;
Quoting is useful when your string looks like another type:
of_string "'123'";;
of_string "'true'";;
Mappings (Objects)
YAML mappings associate keys with values. In the JSON-compatible representation, these become association lists:
of_string "name: Alice\nage: 30";;
Keys and values are separated by a colon and space. Each key-value pair goes on its own line.
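Since a mapping is an association list under the `O constructor, ordinary List functions work on it. The sketch below builds by hand the value that the mapping above is assumed to produce (with the age stored as a float, as noted earlier) and looks up a key; find is a hypothetical helper, not a Yamlrw function.

```ocaml
(* Assumed shape of the parsed mapping "name: Alice\nage: 30". *)
let person = `O [ ("name", `String "Alice"); ("age", `Float 30.) ];;

(* Hypothetical helper, not part of Yamlrw: look up a key in a mapping. *)
let find key = function
  | `O fields -> List.assoc_opt key fields
  | _ -> None;;

let () =
  assert (find "name" person = Some (`String "Alice"));
  assert (find "age" person = Some (`Float 30.));
  assert (find "missing" person = None);;
```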
Nested Mappings
Indentation creates nested structures:
let nested = of_string {|
database:
  host: localhost
  port: 5432
  credentials:
    user: admin
    pass: secret
|};;
Accessing Values
Use the Yamlrw.Util module to navigate and extract values:
let db = Util.get "database" nested;;
Util.get_string (Util.get "host" db);;
Util.get_int (Util.get "port" db);;
For nested access, use Yamlrw.Util.get_path:
Util.get_path ["database"; "credentials"; "user"] nested;;
Util.get_path_exn ["database"; "port"] nested;;
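Conceptually, path access is repeated key lookup, one mapping level per key. The following sketch shows that idea on a hand-built value using only the standard library; lookup_path is a hypothetical name and this is not Yamlrw's actual implementation.

```ocaml
(* Hypothetical sketch, not Yamlrw's implementation: follow a list of keys
   through nested `O mappings, returning None as soon as a key is missing
   or a non-mapping node is reached before the path is exhausted. *)
let rec lookup_path keys v =
  match keys, v with
  | [], _ -> Some v
  | k :: rest, `O fields ->
    (match List.assoc_opt k fields with
     | Some child -> lookup_path rest child
     | None -> None)
  | _ :: _, _ -> None;;

(* Hand-built value mirroring part of the nested example above. *)
let sample =
  `O [ "database",
       `O [ "host", `String "localhost";
            "port", `Float 5432.;
            "credentials", `O [ "user", `String "admin" ] ] ];;

let () =
  assert (lookup_path ["database"; "credentials"; "user"] sample
          = Some (`String "admin"));
  assert (lookup_path ["database"; "missing"] sample = None);
  assert (lookup_path ["database"; "host"; "deeper"] sample = None);;
```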
Sequences (Arrays)
YAML sequences are written as bulleted lists:
of_string {|
- apple
- banana
- cherry
|};;
Or using flow style (like JSON arrays):
of_string "[1, 2, 3]";;
Sequences of Mappings
A common pattern is a list of objects:
let users = of_string {|
- name: Alice
  role: admin
- name: Bob
  role: user
|};;
Accessing Sequence Elements
Util.nth 0 users;;
match Util.nth 0 users with
| Some user -> Util.get_string (Util.get "name" user)
| None -> "not found";;
Serialization
Convert OCaml values back to YAML strings with Yamlrw.to_string:
let data = `O [
  ("name", `String "Bob");
  ("active", `Bool true);
  ("score", `Float 95.5)
];;
print_string (to_string data);;
Constructing Values
Use Yamlrw.Util constructors for cleaner code:
let config = Util.obj [
  "server", Util.obj [
    "host", Util.string "0.0.0.0";
    "port", Util.int 8080
  ];
  "debug", Util.bool true;
  "tags", Util.strings ["api"; "v2"]
];;
print_string (to_string config);;
Controlling Output Style
You can control the output format with style options:
print_string (to_string ~layout_style:`Flow config);;
Scalar styles control how strings are written:
print_string (to_string ~scalar_style:`Double_quoted (Util.string "hello"));;
print_string (to_string ~scalar_style:`Single_quoted (Util.string "hello"));;
Full YAML Representation
The Yamlrw.value type is convenient but loses some YAML-specific information. For full fidelity, use the Yamlrw.yaml type:
let full = yaml_of_string ~resolve_aliases:false "hello";;
The Yamlrw.yaml type preserves:
- Scalar styles (plain, quoted, literal, folded)
- Anchors and aliases
- Type tags
- Collection styles (block vs flow)
let s = yaml_of_string ~resolve_aliases:false "'quoted string'";;
match s with
| `Scalar sc -> Scalar.value sc, Scalar.style sc
| _ -> "", `Any;;
Anchors and Aliases
YAML supports node reuse through anchors (&name) and aliases (*name). This is powerful for avoiding repetition:
defaults: &defaults
  timeout: 30
  retries: 3
production:
  <<: *defaults
  host: prod.example.com
staging:
  <<: *defaults
  host: stage.example.com
Parsing with Aliases
By default, Yamlrw.of_string resolves aliases:
let yaml_with_alias = {|
base: &base
  x: 1
  y: 2
derived:
  <<: *base
  z: 3
|};;
of_string yaml_with_alias;;
Preserving Aliases
To preserve the alias structure, use Yamlrw.yaml_of_string with ~resolve_aliases:false:
let y = yaml_of_string ~resolve_aliases:false {|
item: &ref
  name: shared
copy: *ref
|};;
Multi-line Strings
YAML has special syntax for multi-line strings:
Literal Block Scalar
The | indicator preserves newlines exactly:
of_string {|
description: |
  This is a
  multi-line
  string.
|};;
Folded Block Scalar
The > indicator folds newlines into spaces:
of_string {|
description: >
  This is a
  single line
  when folded.
|};;
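To make the difference concrete, the sketch below writes out the scalar contents that the YAML specification prescribes for the two block scalars above, assuming the default "clip" chomping (a single trailing newline is kept). fold_lines is a hypothetical, deliberately simplified illustration of folding written with the standard library only; it is not how Yamlrw implements it.

```ocaml
(* Scalar contents per the YAML specification, assuming default "clip"
   chomping, which keeps exactly one trailing newline. *)
let literal = "This is a\nmulti-line\nstring.\n";;
let folded = "This is a single line when folded.\n";;

(* Hypothetical, simplified folding: join non-empty lines with single
   spaces. Real YAML folding also treats blank lines and more-indented
   lines specially, which this sketch ignores. *)
let fold_lines s =
  String.split_on_char '\n' s
  |> List.filter (fun line -> line <> "")
  |> String.concat " "
  |> fun joined -> joined ^ "\n";;

let () =
  assert (fold_lines "This is a\nsingle line\nwhen folded.\n" = folded);
  assert (fold_lines literal <> literal);;
```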
Multiple Documents
A YAML stream can contain multiple documents separated by ---:
let docs = documents_of_string {|
---
name: first
---
name: second
...
|};;
List.length docs;;
The --- marker starts a document, and ... optionally ends it.
Working with Documents
Each document has metadata and a root value:
List.map (fun d -> Document.root d) docs;;
Serializing Multiple Documents
let doc1 = Document.make (Some (of_json (Util.obj ["x", Util.int 1])));;
let doc2 = Document.make (Some (of_json (Util.obj ["x", Util.int 2])));;
print_string (documents_to_string [doc1; doc2]);;
Streaming API
For large files or fine-grained control, use the streaming API:
let parser = Stream.parser "key: value";;
Iterate over events:
Stream.iter (fun event _ _ ->
  Format.printf "%a@." Event.pp event
) parser;;
Building YAML with Events
You can also emit YAML by sending events:
let emitter = Stream.emitter ();;
Stream.stream_start emitter `Utf8;;
Stream.document_start emitter ();;
Stream.mapping_start emitter ();;
Stream.scalar emitter "greeting";;
Stream.scalar emitter "Hello, World!";;
Stream.mapping_end emitter;;
Stream.document_end emitter ();;
Stream.stream_end emitter;;
print_string (Stream.contents emitter);;
Error Handling
Parse errors raise Yamlrw.Yamlrw_error:
try
  ignore (of_string "key: [unclosed");
  "ok"
with Yamlrw_error e ->
  Error.to_string e;;
Type Errors
The Yamlrw.Util module raises Yamlrw.Util.Type_error for type mismatches:
try
  ignore (Util.get_string (`Float 42.));
  "ok"
with Util.Type_error (expected, actual) ->
  Printf.sprintf "expected %s, got %s" expected (Value.type_name actual);;
Common Patterns
Configuration Files
A typical configuration file pattern:
let config_yaml = {|
app:
  name: myapp
  version: 1.0.0
server:
  host: 0.0.0.0
  port: 8080
  ssl: true
database:
  url: postgres://localhost/mydb
  pool_size: 10
|};;
let config = of_string config_yaml;;
let server = Util.get "server" config;;
let host = Util.to_string ~default:"localhost" (Util.get "host" server);;
let port = Util.to_int ~default:80 (Util.get "port" server);;
Working with Lists
Processing lists of items:
let items_yaml = {|
items:
  - id: 1
    name: Widget
    price: 9.99
  - id: 2
    name: Gadget
    price: 19.99
  - id: 3
    name: Gizmo
    price: 29.99
|};;
let items = Util.get_list (Util.get "items" (of_string items_yaml));;
let names = List.map (fun item ->
  Util.get_string (Util.get "name" item)
) items;;
let total = List.fold_left (fun acc item ->
  acc +. Util.get_float (Util.get "price" item)
) 0. items;;
Modifying YAML structures:
let original = of_string "name: Alice\nstatus: active";;
let updated = Util.update "status" (Util.string "inactive") original;;
let with_timestamp = Util.update "updated_at" (Util.string "2024-01-01") updated;;
print_string (to_string with_timestamp);;
Summary
The yamlrw library provides:
- Simple parsing: Yamlrw.of_string for JSON-compatible values
- Full fidelity: Yamlrw.yaml_of_string preserves all YAML metadata
- Easy serialization: Yamlrw.to_string with style options
- Navigation: Yamlrw.Util module for accessing and modifying values
- Multi-document: Yamlrw.documents_of_string for YAML streams
- Streaming: Yamlrw.Stream module for event-based processing
Key types:
- Yamlrw.value - JSON-compatible representation (`Null, `Bool, `Float, `String, `A, `O)
- Yamlrw.yaml - Full YAML with scalars, anchors, aliases, and metadata
- Yamlrw.document - A complete document with directives
For more details, see the API reference.