Module Html5rw.Selector

CSS selector engine.

This module provides CSS selector support for querying the DOM tree. CSS selectors are patterns used to select HTML elements based on their tag names, attributes, classes, IDs, and position in the document.

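Example selectors, as used in the snippets on this page:

  p             (elements by tag name)
  .active       (elements by class)
  a.external    (tag name combined with a class)
  tr > td       (child combinator)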

CSS Selector Engine

This module provides CSS selector parsing and matching for querying the HTML5 DOM. It supports a subset of CSS3 selectors suitable for common web scraping and DOM manipulation tasks.

Supported Selectors

Simple Selectors

Attribute Selectors

Pseudo-classes

Combinators

Usage

  let doc = Html5rw.parse reader in

  (* Find all paragraphs *)
  let paragraphs = Html5rw.query doc "p" in

  (* Find links with specific class *)
  let links = Html5rw.query doc "a.external" in

  (* Find table cells in rows *)
  let cells = Html5rw.query doc "tr > td" in

  (* Check if a node matches *)
  let is_active = Html5rw.matches node ".active"

Exceptions

exception Selector_error of string

Raised when a selector string is malformed.

The string payload is an error message describing the parse error.

Sub-modules

module Ast : sig ... end

Abstract syntax tree for parsed selectors.

module Token : sig ... end

Token types for the selector lexer.

Functions

val parse : string -> Ast.selector

Parse a CSS selector string into an Ast.selector.
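
Callers that prefer a result value to an exception can wrap the call. The sketch below assumes that parse is the function that raises Selector_error on malformed input, which the exception's description suggests but this page does not state. So that it runs without the library, it defines a toy stand-in module named Selector. The helper name parse_result and the stand-in's body are hypothetical; only the exception and the general shape of parse follow the documentation.

```ocaml
(* Toy stand-in for Html5rw.Selector so the sketch runs on its own.
   ASSUMPTION: the real [parse] raises [Selector_error] on malformed input.
   The stand-in returns the string itself instead of an [Ast.selector]
   and rejects only blank input. *)
module Selector = struct
  exception Selector_error of string

  let parse (s : string) : string =
    if String.trim s = "" then raise (Selector_error "empty selector")
    else s
end

(* Hypothetical helper: turn the exception into a [result]. *)
let parse_result (s : string) : (string, string) result =
  match Selector.parse s with
  | ast -> Ok ast
  | exception Selector.Selector_error msg -> Error msg

let () =
  match parse_result "   " with
  | Ok _ -> print_endline "parsed"
  | Error msg -> print_endline ("invalid selector: " ^ msg)
```

With the real library, the body of parse_result would call Html5rw.Selector.parse and match on Html5rw.Selector.Selector_error directly, and the Ok case would carry an Ast.selector.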

val query : Dom.node -> string -> Dom.node list

Query the DOM tree with a CSS selector.

Returns all nodes matching the selector in document order.
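
Because results come back in document order, the head of the list is the earliest match in the document. A minimal sketch of a first-match helper follows. The name query_first is hypothetical, and the query argument stands in for Html5rw.Selector.query so that the block compiles without the library.

```ocaml
(* Hypothetical helper, not part of Html5rw.
   [query] is expected to have the shape of Html5rw.Selector.query:
   node -> string -> node list, with results in document order. *)
let query_first ~query root selector =
  match query root selector with
  | [] -> None                  (* nothing matched *)
  | first :: _ -> Some first    (* earliest match in document order *)
```

With the library available, a call would look like query_first ~query:Html5rw.Selector.query root_node "div.content > p".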

  let divs = query root_node "div.content > p"

val matches : Dom.node -> string -> bool

Check if a node matches a CSS selector.

  if matches node ".active" then
    (* node has class "active" *)
    print_endline "node is active"
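
Since matches tests a single node, it can also serve as a predicate over nodes collected by other means. The sketch below keeps the nodes of a list that match a selector. The name filter_matching is hypothetical, and the matches argument stands in for Html5rw.Selector.matches so that the block compiles without the library.

```ocaml
(* Hypothetical helper, not part of Html5rw.
   [matches] is expected to have the shape of Html5rw.Selector.matches:
   node -> string -> bool. *)
let filter_matching ~matches selector nodes =
  List.filter (fun node -> matches node selector) nodes
```

With the library available, a call would look like filter_matching ~matches:Html5rw.Selector.matches ".active" nodes.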