Module Base__Import.Lexing
The run-time library for lexers generated by ocamllex.
Positions
type position = {
  pos_fname : string;
  pos_lnum : int;
  pos_bol : int;
  pos_cnum : int;
}
A value of type position describes a point in a source file. pos_fname is the file name; pos_lnum is the line number; pos_bol is the offset of the beginning of the line (the number of characters between the beginning of the lexbuf and the beginning of the line); pos_cnum is the offset of the position (the number of characters between the beginning of the lexbuf and the position). The difference between pos_cnum and pos_bol is the character offset within the line (i.e. the column number, assuming each character is one column wide).
See the documentation of type lexbuf for information about how the lexing engine will manage positions.
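As a small illustration of how these fields combine, the sketch below derives the column from pos_cnum and pos_bol and formats a position as file:line:column. It assumes this module exposes the same interface as the standard library's Lexing module; the helper names column and to_string and the sample values are made up for the example and are not part of the library.

```ocaml
(* Hypothetical helpers, not part of Lexing. *)
let column (p : Lexing.position) : int =
  (* Character offset within the line, counted from 0. *)
  p.Lexing.pos_cnum - p.Lexing.pos_bol

let to_string (p : Lexing.position) : string =
  Printf.sprintf "%s:%d:%d" p.Lexing.pos_fname p.Lexing.pos_lnum (column p)

(* A hand-built position: line 3 starts at offset 20 and the point
   itself is at offset 25, so the column is 5. *)
let example : Lexing.position =
  { Lexing.pos_fname = "a.ml"; pos_lnum = 3; pos_bol = 20; pos_cnum = 25 }
```

Note that the column computed this way is zero-based; editors that count columns from 1 would need an offset of one added.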
val dummy_pos : position
A value of type position, guaranteed to be different from any valid position.
Lexer buffers
type lexbuf = {
  refill_buff : lexbuf -> unit;
  mutable lex_buffer : bytes;
  mutable lex_buffer_len : int;
  mutable lex_abs_pos : int;
  mutable lex_start_pos : int;
  mutable lex_curr_pos : int;
  mutable lex_last_pos : int;
  mutable lex_last_action : int;
  mutable lex_eof_reached : bool;
  mutable lex_mem : int array;
  mutable lex_start_p : position;
  mutable lex_curr_p : position;
}
The type of lexer buffers. A lexer buffer is the argument passed to the scanning functions defined by the generated scanners. The lexer buffer holds the current state of the scanner, plus a function to refill the buffer from the input.
Lexers can optionally maintain the lex_curr_p and lex_start_p position fields. This "position tracking" mode is the default, and it corresponds to passing ~with_positions:true to functions that create lexer buffers. In this mode, the lexing engine and lexer actions are co-responsible for properly updating the position fields, as described in the next paragraph. When the mode is explicitly disabled (with ~with_positions:false), the lexing engine will not touch the position fields and the lexer actions should be careful not to do it either; the lex_curr_p and lex_start_p fields will then always hold the dummy_pos invalid position. Not tracking positions avoids allocations and memory writes and can significantly improve the performance of the lexer in contexts where lex_start_p and lex_curr_p are not needed.
Position tracking mode works as follows. At each token, the lexing engine will copy lex_curr_p to lex_start_p, then change the pos_cnum field of lex_curr_p by updating it with the number of characters read since the start of the lexbuf. The other fields are left unchanged by the lexing engine. In order to keep them accurate, they must be initialised before the first use of the lexbuf, and updated by the relevant lexer actions (i.e. at each end of line; see also new_line).
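Since the engine only maintains pos_cnum, the remaining fields have to be set by the caller. A minimal sketch of initialising the file name before the first token is read is shown here; it assumes the standard library's Lexing interface, and the file name "input.ml" is invented for the example.

```ocaml
(* Create a buffer over a string; position tracking is on by default. *)
let lexbuf = Lexing.from_string "let x = 1\nlet y = 2\n"

(* Set the file name by hand before the first use of the lexbuf.
   "input.ml" is a made-up name used only for illustration. *)
let () =
  lexbuf.Lexing.lex_curr_p <-
    { lexbuf.Lexing.lex_curr_p with Lexing.pos_fname = "input.ml" }
```

Because the engine copies lex_curr_p to lex_start_p at each token, a file name set this way should carry over to every later position.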
val from_channel : ?with_positions:bool -> Stdlib.in_channel -> lexbuf
Create a lexer buffer on the given input channel.
Lexing.from_channel inchan returns a lexer buffer which reads from the input channel inchan, at the current reading position.
val from_string : ?with_positions:bool -> string -> lexbuf
Create a lexer buffer which reads from the given string. Reading starts from the first character in the string. An end-of-input condition is generated when the end of the string is reached.
val from_function : ?with_positions:bool -> (bytes -> int -> int) -> lexbuf
Create a lexer buffer with the given function as its reading method. When the scanner needs more characters, it will call the given function, giving it a byte sequence s and a byte count n. The function should put n bytes or fewer in s, starting at index 0, and return the number of bytes provided. A return value of 0 means end of input.
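The reading-function contract can be sketched with a reader that serves the contents of a string in chunks. The helper reader_of_string is hypothetical and exists only to illustrate the contract; it assumes the standard library's Lexing and Bytes modules.

```ocaml
(* Hypothetical helper: build a reading function that serves the
   contents of [src], at most [n] bytes per call. *)
let reader_of_string (src : string) : bytes -> int -> int =
  let pos = ref 0 in
  fun buf n ->
    (* Never provide more than n bytes, nor more than what is left. *)
    let len = min n (String.length src - !pos) in
    Bytes.blit_string src !pos buf 0 len;
    pos := !pos + len;
    (* The result is 0 once the input is exhausted: end of input. *)
    len

let lexbuf = Lexing.from_function (reader_of_string "1 + 2")
```

The closure keeps its own cursor in pos, so each call resumes where the previous one stopped.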
val with_positions : lexbuf -> bool
Tell whether the lexer buffer keeps track of the position fields lex_curr_p / lex_start_p, as determined by the corresponding optional argument for functions that create lexer buffers (whose default value is true).
When with_positions is false, lexer actions should not modify position fields. Doing so nevertheless could re-enable the with_positions mode and degrade performance.
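A short sketch of how the optional argument and this query fit together, assuming the standard library's Lexing interface; the names tracked and untracked are chosen for the example.

```ocaml
(* Tracking is the default; it has to be disabled explicitly. *)
let tracked = Lexing.from_string "abc"
let untracked = Lexing.from_string ~with_positions:false "abc"

let () =
  assert (Lexing.with_positions tracked);
  assert (not (Lexing.with_positions untracked));
  (* With tracking disabled, the position accessors return dummy_pos. *)
  assert (Lexing.lexeme_start_p untracked = Lexing.dummy_pos);
  assert (Lexing.lexeme_end_p untracked = Lexing.dummy_pos)
```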
Functions for lexer semantic actions
val lexeme : lexbuf -> string
Lexing.lexeme lexbuf returns the string matched by the regular expression.
val lexeme_char : lexbuf -> int -> char
Lexing.lexeme_char lexbuf i returns character number i in the matched string.
val lexeme_start : lexbuf -> int
Lexing.lexeme_start lexbuf returns the offset in the input stream of the first character of the matched string. The first character of the stream has offset 0.
val lexeme_end : lexbuf -> int
Lexing.lexeme_end lexbuf returns the offset in the input stream of the character following the last character of the matched string. The first character of the stream has offset 0.
val lexeme_start_p : lexbuf -> position
Like lexeme_start, but return a complete position instead of an offset. When position tracking is disabled, the function returns dummy_pos.
val lexeme_end_p : lexbuf -> position
Like lexeme_end, but return a complete position instead of an offset. When position tracking is disabled, the function returns dummy_pos.
val new_line : lexbuf -> unit
Update the lex_curr_p field of the lexbuf to reflect the start of a new line. You can call this function in the semantic action of the rule that matches the end-of-line character. The function does nothing when position tracking is disabled.
Since: 3.11.0
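The effect of new_line can be sketched without a generated lexer by setting pos_cnum by hand, standing in for what the lexing engine would do after reading the first line. This assumes the standard library's Lexing interface, in which a fresh buffer starts on line 1; the names p and untracked are chosen for the example.

```ocaml
let lexbuf = Lexing.from_string "a\nb"

(* Stand in for the engine: pretend "a\n" (2 characters) has been read. *)
let () =
  lexbuf.Lexing.lex_curr_p <-
    { lexbuf.Lexing.lex_curr_p with Lexing.pos_cnum = 2 }

(* What a lexer action for the end-of-line rule would call. *)
let () = Lexing.new_line lexbuf

(* Now on line 2, with the line starting at offset 2. *)
let p = lexbuf.Lexing.lex_curr_p

(* With tracking disabled, new_line leaves the position untouched. *)
let untracked = Lexing.from_string ~with_positions:false "a\nb"
let () = Lexing.new_line untracked
```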
Miscellaneous functions
val flush_input : lexbuf -> unit
Discard the contents of the buffer and reset the current position to 0. The next use of the lexbuf will trigger a refill.
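A minimal sketch of the observable effect on the buffer fields, assuming the standard library's Lexing interface. One plausible use is in an interactive loop, to drop the rest of a line after a syntax error; the string below is only a stand-in for such pending input.

```ocaml
let lexbuf = Lexing.from_string "stale input"

(* Drop whatever is buffered and rewind the current position. *)
let () = Lexing.flush_input lexbuf

let () =
  assert (lexbuf.Lexing.lex_curr_pos = 0);
  assert (lexbuf.Lexing.lex_buffer_len = 0)
```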