Compiling regular expressions to RTL is completely straightforward (part of a typical proof that for every RE there is an FSM).
By converting a path expression to a state expression we can generate an RTL checker for use in dynamic validation.
(Details not examinable.)
fun gen_pattern_matcher g (seres_statexp e) = gen_and2(g, gen_boolean e) | gen_pattern_matcher g (seres_diop(diop_seres_alternation, l, r)) = let val l' = gen_pattern_matcher g l val r' = gen_pattern_matcher g r in gen_or2(l', r') end | gen_pattern_matcher g (seres_diop(diop_seres_catenation, l, r)) = let val l' = gen_dff(gen_pattern_matcher g l) val r' = gen_pattern_matcher l' r in r' end | gen_pattern_matcher g (seres_diop(diop_seres_fusion, l, r)) = let val l' = gen_pattern_matcher g l val r' = gen_pattern_matcher l' r in r' end | gen_pattern_matcher g (seres_monop(mono_arb_repetition, l)) = let val nn = newnet() val l' = gen_pattern_matcher nn l val r = gen_or2(l', g) val _ = gen_buffer(nn, r) in r end | gen_pattern_matcher g (seres_diop(diop_n_times_repetition, l, seres_statexp(x_num n))) = let fun f (g, k) = if k=0 then g else gen_pattern_matcher (f(g, k-1)) l in f (g, n) end
This generates a simple one-hot automaton and there are far more efficient procedures used in practice and given in the literature.
A harder operator to compile is the length-matching conjunction (introduced shortly), since care is needed when each side contains arbitrary repetition and can declare success or failure at a number of possible times.
16: (C) 2008-16, DJ Greaves, University of Cambridge, Computer Laboratory. |