Documentation

Regex.Backtrack

BoundedBacktracker #

An NFA backed bounded backtracker for executing regex searches with capturing groups (BoundedBacktracker.slots).

This regex engine only implements leftmost-first match semantics and only supports leftmost searches. By design, this backtracking regex engine is bounded. This bound is implemented by not visiting any combination of NFA state ID and position in a haystack more than once.

All searches execute in linear time with respect to the size of the regular expression and search text.

def BoundedBacktracker.Array.Ref.mkRef {α β : Type} [Inhabited β] (arr : Array α) :
ST.Ref β (Array α)

make reference of array

Equations
Instances For

    get array of reference

    Equations
    Instances For
      def BoundedBacktracker.Array.Ref.modifyRefValue {α β : Type} [Inhabited β] [DecidableEq β] (ref : ST.Ref β (Array α)) (index : Nat) (value : α) :
      ST.Ref β (Array α)

      modify array, try to perform the update destructively

      Equations
      • One or more equations did not get rendered due to their size.
      Instances For

        Char position in a substring

        Instances For
          Equations
          • One or more equations did not get rendered due to their size.
          Equations
          • One or more equations did not get rendered due to their size.
          Equations
          Instances For
            Equations
            Instances For

              create a CharPos from s and position «at»

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For

                to prev position of cp

                Equations
                • One or more equations did not get rendered due to their size.
                Instances For
                  @[irreducible]
                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For

                    to next position of cp

                    Equations
                    • One or more equations did not get rendered due to their size.
                    Instances For

                      is CharPos at start position

                      Equations
                      Instances For

                        is CharPos at stop position

                        Equations
                        Instances For

                          Represents a stack frame on the heap while doing backtracking.

                          Instances For
                            Equations
                            • One or more equations did not get rendered due to their size.

                            A representation of "half" of a match reported by a DFA.

                            Instances For
                              Equations
                              • One or more equations did not get rendered due to their size.
                              @[reducible, inline]

                              The stack of frames

                              Equations
                              Instances For
                                @[inline]
                                def BoundedBacktracker.Stack.push {n : Nat} {s : Substring} (stack : Stack n s) (v : Frame n s) :
                                Stack n s

                                Push frame to stack

                                Equations
                                Instances For
                                  @[inline]
                                  def BoundedBacktracker.Stack.pop? {n : Nat} {s : Substring} (stack : Stack n s) :
                                  Option (Frame n s × Stack n s)

                                  pop head frame from stack

                                  Equations
                                  Instances For
                                    theorem BoundedBacktracker.Stack.pop?_some_lt {n : Nat} {s : Substring} {a : Frame n s} {s1 : Stack n s} (s✝ : Stack n s) (h : s✝.pop? = some (a, s1)) :
                                    @[inline]
                                    def BoundedBacktracker.Stack.append {n : Nat} {s : Substring} (stack1 stack2 : Stack n s) :
                                    Stack n s

                                    append stacks

                                    Equations
                                    Instances For
                                      @[inline]
                                      def BoundedBacktracker.Stack.toStack {n : Nat} {s : Substring} (arr : Array (Frame n s)) :
                                      Stack n s
                                      Equations
                                      Instances For
                                        @[reducible, inline]

                                        The encoded set of (Fin n, HaystackOffset) pairs that have been visited by the backtracker within a single search. Optimization: encode as bits

                                        Equations
                                        Instances For
                                          @[reducible, inline]

                                          SlotEntry consists of slot, group and char pos in s

                                          Equations
                                          Instances For
                                            Equations
                                            • One or more equations did not get rendered due to their size.
                                            Instances For

                                              State of the backtracking search

                                              • stack : Stack n input

                                                Stack used on the heap for doing backtracking

                                              • visited : ST.Ref Nat Visited

                                                The encoded set of (Fin n, HaystackOffset) pairs that have been visited by the backtracker within a single search.

                                              • countVisited : Nat

                                                count of pairs that have been visited

                                              • sid : Fin n

                                                current state

                                              • at : CharPos input

                                                position in input string

                                              • slots : Array (SlotEntry input)

                                                slots, positions of captures in input

                                              • slotsValid : Slots.Valid self.slots = true

                                                slots are valid

                                              • recentCaptures : Array (Option (String.Pos × String.Pos))

                                                recent captures of capture groups

                                              • logEnabled : Bool

                                                is logging enabled

                                              • msgs : Array String

                                                log msgs

                                              • halfMatch : Option HalfMatch

                                                HalfMatch

                                              • backtracks : Nat

                                                count of backtracks

                                              Instances For
                                                theorem BoundedBacktracker.SearchState.ext {n : Nat} {input : Substring} {x y : SearchState n input} (stack : x.stack = y.stack) (visited : x.visited = y.visited) (countVisited : x.countVisited = y.countVisited) (sid : x.sid = y.sid) («at» : x.at = y.at) (slots : x.slots = y.slots) (recentCaptures : x.recentCaptures = y.recentCaptures) (logEnabled : x.logEnabled = y.logEnabled) (msgs : x.msgs = y.msgs) (halfMatch : x.halfMatch = y.halfMatch) (backtracks : x.backtracks = y.backtracks) :
                                                x = y
                                                def BoundedBacktracker.SearchState.fromNfa (nfa : NFA.Checked.NFA) (input : Regex.ValidSubstring) («at» : Regex.ValidPos input.val.str) (logEnabled : Bool) (h : 0 < nfa.n input.val.startPos «at».val «at».val input.val.stopPos) :
                                                SearchState nfa.n input.val

                                                create the SearchState from an NFA

                                                Equations
                                                • One or more equations did not get rendered due to their size.
                                                Instances For
                                                  theorem BoundedBacktracker.SearchState.slots_valid_elem_eq_pair {s : Substring} (slots : Array (SlotEntry s)) (h : Slots.Valid slots = true) (i : Fin slots.toList.length) :
                                                  slots.toList[i].fst = i slots.toList[i].snd.fst = (↑i).div 2
                                                  theorem BoundedBacktracker.SearchState.slots_of_map_valid {s : Substring} (slots : Array (SlotEntry s)) (h : Slots.Valid slots = true) (f : SlotEntry sSlotEntry s) (hf : ∀ (x : SlotEntry s), (f x).fst = x.fst (f x).snd.fst = x.snd.fst) :
                                                  def BoundedBacktracker.Visited.index {n : Nat} {s : Substring} (sid : Fin n) (cp : CharPos s) :

                                                  get encoded index of Fin n and CharPos in visited array

                                                  Equations
                                                  Instances For
                                                    theorem BoundedBacktracker.Visited.eq_of_linear_eq {a1 b1 a2 b2 x : Nat} (hb1 : b1 < x) (hb2 : b2 < x) (h : a1 * x + b1 = a2 * x + b2) :
                                                    a1 = a2 b1 = b2
                                                    theorem BoundedBacktracker.Visited.index_injective {n : Nat} {s : Substring} (sid1 sid2 : Fin n) (cp1 cp2 : CharPos s) (heq : index sid1 cp1 = index sid2 cp2) :
                                                    sid1 = sid2 cp1.pos.byteIdx = cp2.pos.byteIdx

                                                    index is injective.

                                                    Check if current Fin n and CharPos in SearchState is already visited. If not visited mark pair as visited.

                                                    Equations
                                                    • One or more equations did not get rendered due to their size.
                                                    Instances For

                                                      Check if current Fin n and CharPos in SearchState is already visited. If not visited mark pair as visited.

                                                      Equations
                                                      • One or more equations did not get rendered due to their size.
                                                      Instances For
                                                        theorem BoundedBacktracker.capture_groups_eq_of_consecutive_slots {s : Substring} {i s0 g0 : Nat} {v0 : CharPos.PosInRange s} {s1 g1 : Nat} {v1 : CharPos.PosInRange s} (slots : Array (SlotEntry s)) (valid : SearchState.Slots.Valid slots = true) (h : ¬i % 2 = 0) (h0 : slots[i - 1]? = some (s0, g0, some v0)) (h1 : slots[i]? = some (s1, g1, some v1)) :
                                                        g0 = g1

                                                        capture groups of consecutive slots are equal

                                                        execute next steps in NFA.

                                                        Equations
                                                        Instances For
                                                          @[irreducible]
                                                          Equations
                                                          • One or more equations did not get rendered due to their size.
                                                          Instances For
                                                            @[irreducible]
                                                            theorem BoundedBacktracker.steps_loop_le {input : Substring} (nfa : NFA.Checked.NFA) (s s1 : SearchState nfa.n input) (h : steps.loop nfa s = s1) :
                                                            theorem BoundedBacktracker.toNextFrameRestoreCapture_true_lt_or_eq_lt {s : Substring} {n : Nat} {s1 : SearchState n s} (slot : Nat) (offset : Option (CharPos.PosInRange s)) (stack : Stack n s) (s✝ : SearchState n s) (h : BoundedBacktracker.toNextFrameRestoreCapture✝ slot offset stack s✝ = (true, s1)) :

                                                            BoundedBacktrack search

                                                            Equations
                                                            • One or more equations did not get rendered due to their size.
                                                            Instances For
                                                              @[irreducible]
                                                              Equations
                                                              • One or more equations did not get rendered due to their size.
                                                              Instances For

                                                                Search for the first match of this regex in the haystack given and return log msgs and the matches of each capture group.

                                                                Equations
                                                                • One or more equations did not get rendered due to their size.
                                                                Instances For