Compiling regular expressions to RTL is completely straightforward (part of a typical proof that for every RE there is an FSM).
By converting a path expression to a state expression we can generate an RTL checker for use in dynamic validation.
(Details not examinable.)
fun gen_pattern_matcher g (seres_statexp e) = gen_and2(g, gen_boolean e)
| gen_pattern_matcher g (seres_diop(diop_seres_alternation, l, r)) =
let val l' = gen_pattern_matcher g l
val r' = gen_pattern_matcher g r
in gen_or2(l', r') end
| gen_pattern_matcher g (seres_diop(diop_seres_catenation, l, r)) =
let val l' = gen_dff(gen_pattern_matcher g l)
val r' = gen_pattern_matcher l' r
in r' end
| gen_pattern_matcher g (seres_diop(diop_seres_fusion, l, r)) =
let val l' = gen_pattern_matcher g l
val r' = gen_pattern_matcher l' r
in r' end
| gen_pattern_matcher g (seres_monop(mono_arb_repetition, l)) =
let val nn = newnet()
val l' = gen_pattern_matcher nn l
val r = gen_or2(l', g)
val _ = gen_buffer(nn, r)
in r end
| gen_pattern_matcher g (seres_diop(diop_n_times_repetition, l,
seres_statexp(x_num n))) =
let fun f (g, k) = if k=0 then g else
gen_pattern_matcher (f(g, k-1)) l
in f (g, n) end
This generates a simple one-hot automaton and there are far more efficient procedures used in practice and given in the literature.
A harder operator to compile is the length-matching conjunction (introduced shortly), since care is needed when each side contains arbitrary repetition and can declare success or failure at a number of possible times.
| 16: (C) 2008-16, DJ Greaves, University of Cambridge, Computer Laboratory. |