Documentation

Regex.Nfa

NFA #

A byte oriented Thompson non-deterministic finite automaton (Checked.NFA), see also Tagged NFA.

@[reducible, inline]

The identifier of a regex pattern, represented by a Nat

Equations
Instances For
    Instances For
      inductive NFA.Look :

      The high-level intermediate representation for a look-around assertion.

      • Start : Look

        Match the beginning of text.

      • End : Look

        Match the end of text.

      • EndWithOptionalLF : Look

        Match the end of text (before optional newline).

      • StartLF : Look

        Match the beginning of a line or the beginning of text.

      • EndLF : Look

        Match of a line or the end of text.

      • StartCRLF : Look

        Match the beginning of a line or the beginning of text.

      • EndCRLF : Look

        Match of a line or the end of text.

      • WordUnicode : Look

        Match a Unicode-aware word boundary.

      • WordUnicodeNegate : Look

        Match a Unicode-aware negation of a word boundary.

      • WordStartUnicode : Look

        Match the start of a Unicode word boundary.

      • WordEndUnicode : Look

        Match the end of a Unicode word boundary.

      • WordStartHalfUnicode : Look

        Match the start half of a Unicode word boundary.

      • WordEndHalfUnicode : Look

        Match the end half of a Unicode word boundary.

      • PreviousMatch : Look

        Match the end of the previous match or start of string

      • ClearMatches : Look

        Clear matches

      Instances For
        @[reducible, inline]

        The identifier of a finite automaton state, represented by a Nat. The identifier may refer to a non existing state.

        Equations
        Instances For
          Instances For

            A single transition to another state. This transition may only be followed if the current char in the haystack falls in the inclusive range of chars specified.

            • start : UInt32

              The inclusive start of the char range.

            • end : UInt32

              The inclusive end of the char range.

            • next : StateID

              The identifier of the state to transition to.

            Instances For

              check if c falls in the inclusive range of chars specified in Transition trans

              Equations
              Instances For

                A state in an NFA.

                • Empty (next : StateID) : State

                  An empty state whose only purpose is to forward the automaton to another state via an unconditional epsilon transition.

                • NextChar (offset : Nat) (next : StateID) : State

                  use next char

                • Fail : State

                  Fail, force backtrack

                • Eat (mode : EatMode) (next : StateID) : State

                  remove Frame.Step in stack

                • ChangeFrameStep («from» to : StateID) : State

                  change Frame.Step in stack and force backtrack

                • RemoveFrameStep (sid : StateID) : State

                  remove Frame.Step in stack and force backtrack

                • BackRef (n : Nat) (case_insensitive : Bool) (sid : StateID) : State

                  A state with a single transition may only be followed if the currunt chars match the backrefence to the capturung group.

                • ByteRange (trans : Transition) : State

                  A state with a single transition that can only be taken if the current input symbol is in a particular range of chars.

                • SparseTransitions (transitions : Array Transition) : State

                  A sequence of transitions used to represent a sparse state.

                • Look (look : NFA.Look) (next : StateID) : State

                  A conditional epsilon transition satisfied via some sort of look-around.

                • Union (alternates : Array StateID) : State

                  An alternation such that there exists an epsilon transition to all states in alternates, where matches found via earlier transitions are preferred over later transitions.

                • UnionReverse (alternates : Array StateID) : State

                  An alternation such that there exists an epsilon transition to all states in alternates, where matches found via later transitions are preferred over earlier transitions.

                • BinaryUnion (alt1 alt2 : StateID) : State

                  An alternation such that there exists precisely two unconditional epsilon transitions

                • Capture (role : Capture.Role) (next : StateID) (pattern_id : PatternID) (group_index slot : Nat) : State

                  An empty state that records a capture location. slot in this context refers to the specific capture group slot offset that is being recorded. Each capturing group has two slots corresponding to the start and end of the matching portion of that group.

                • Match (pattern_id : PatternID) : State

                  A match state. There is at least one such occurrence of this state for each regex that can match that is in this NFA.

                Instances For
                  Equations
                  Instances For

                    Array of (slot, group) pairs of captures.

                    Equations
                    • One or more equations did not get rendered due to their size.
                    Instances For

                      A char oriented Thompson non-deterministic finite automaton (NFA).

                      Instances For
                        Equations
                        • One or more equations did not get rendered due to their size.
                        Instances For
                          inductive NFA.Checked.EatMode (n : Nat) :
                          • Until {n : Nat} : Fin nEatMode n

                            Eat frames until State found

                          • ToLast {n : Nat} : Fin nEatMode n

                            Eat frames inclusive last occurunce of State.

                          Instances For

                            A single transition to another state. This transition may only be followed if the current char in the haystack falls in the inclusive range of chars specified.

                            • start : Nat

                              The inclusive start of the char range.

                            • end : Nat

                              The inclusive end of the char range.

                            • next : Fin n

                              The identifier of the state to transition to.

                            Instances For

                              check if c falls in the inclusive range of chars specified in Transition trans

                              Equations
                              Instances For
                                inductive NFA.Checked.State (n : Nat) :

                                A state in an NFA, the elements of Fin n refer to existing states in the NFA.

                                • Empty {n : Nat} (next : Fin n) : State n

                                  An empty state whose only purpose is to forward the automaton to another state via an unconditional epsilon transition.

                                • NextChar {n : Nat} (offset : Nat) (next : Fin n) : State n

                                  use netx char

                                • Fail {n : Nat} : State n

                                  Fail, force backtracking

                                • Eat {n : Nat} (mode : EatMode n) (next : Fin n) : State n

                                  remove Frame.Step in stack

                                • ChangeFrameStep {n : Nat} (fr to : Fin n) : State n

                                  change Frame.Step in stack and force backtracking

                                • RemoveFrameStep {n : Nat} (sid : Fin n) : State n

                                  remove Frame.Step in stack and force backtrack

                                • BackRef {n : Nat} (b : Nat) (case_insensitive : Bool) (sid : Fin n) : State n

                                  A state with a single transition may only be followed if the current chars match the backrefence to the capturung group.

                                • ByteRange {n : Nat} (trans : Transition n) : State n

                                  A state with a single transition that can only be taken if the current input symbol is in a particular range of chars.

                                • SparseTransitions {n : Nat} (transitions : Array (Transition n)) : State n

                                  A sequence of transitions used to represent a sparse state.

                                • Look {n : Nat} (look : NFA.Look) (next : Fin n) : State n

                                  A conditional epsilon transition satisfied via some sort of look-around.

                                • Union {n : Nat} (alternates : Array (Fin n)) : State n

                                  An alternation such that there exists an epsilon transition to all states in alternates, where matches found via earlier transitions are preferred over later transitions.

                                • UnionReverse {n : Nat} (alternates : Array (Fin n)) : State n

                                  An alternation such that there exists an epsilon transition to all states in alternates, where matches found via later transitions are preferred over earlier transitions.

                                • BinaryUnion {n : Nat} (alt1 alt2 : Fin n) : State n

                                  An alternation such that there exists precisely two unconditional epsilon transitions

                                • Capture {n : Nat} (role : Capture.Role) (next : Fin n) (pattern_id : PatternID) (group_index slot : Nat) : State n

                                  An empty state that records a capture location. slot in this context refers to the specific capture group slot offset that is being recorded. Each capturing group has two slots corresponding to the start and end of the matching portion of that group.

                                • Match {n : Nat} (pattern_id : PatternID) : State n

                                  A match state. There is at least one such occurrence of this state for each regex that can match that is in this NFA.

                                Instances For
                                  Equations
                                  Instances For

                                    Valid slot arrays have even size and consist of groups with two slots at consecutive positions.

                                    Equations
                                    Instances For

                                      A char oriented Thompson non-deterministic finite automaton (NFA) with n states

                                      • n : Nat

                                        number of states

                                      • states : Array (State self.n)

                                        states of the NFA

                                      • groups : Array Nat

                                        capture groups which may match with a empty string

                                      • slots : Array (Nat × Nat)

                                        Array of (slot, group) pairs of captures

                                      • unanchored_prefix_in_backtrack : Bool

                                        simulate a unanchored prefix by the backtracker

                                      • isEq : self.n = self.states.size

                                        assertion that n equals size of states

                                      • slotsValid : Slots.Valid self.slots = true

                                        slots are valid

                                      Instances For
                                        Equations
                                        • One or more equations did not get rendered due to their size.
                                        Equations
                                        Equations
                                        • One or more equations did not get rendered due to their size.
                                        Instances For
                                          def NFA.toCkecked (nfa : Unchecked.NFA) (groups : Array Nat) (hall : ∀ (i : Nat) (h : i < nfa.states.size), nfa.states[i].nextOf < nfa.states.size) :

                                          transform Unchecked.NFA to Checked.NFA

                                          Equations
                                          • One or more equations did not get rendered due to their size.
                                          Instances For
                                            structure Regex :

                                            A compiled regular expression for searching Unicode haystacks.

                                            Instances For
                                              Equations