Documentation

Regex.Nfa

NFA #

A byte oriented Thompson non-deterministic finite automaton (Checked.NFA), see also Tagged NFA.

@[reducible, inline]

The identifier of a regex pattern, represented by a Nat

Equations
Instances For
    Instances For
      inductive NFA.Look :

      The high-level intermediate representation for a look-around assertion.

      • Start : Look

        Match the beginning of text.

      • End : Look

        Match the end of text.

      • EndWithOptionalLF : Look

        Match the end of text (before optional newline).

      • StartLF : Look

        Match the beginning of a line or the beginning of text.

      • EndLF : Look

        Match of a line or the end of text.

      • StartCRLF : Look

        Match the beginning of a line or the beginning of text.

      • EndCRLF : Look

        Match of a line or the end of text.

      • WordUnicode : Look

        Match a Unicode-aware word boundary.

      • WordUnicodeNegate : Look

        Match a Unicode-aware negation of a word boundary.

      • WordStartUnicode : Look

        Match the start of a Unicode word boundary.

      • WordEndUnicode : Look

        Match the end of a Unicode word boundary.

      • WordStartHalfUnicode : Look

        Match the start half of a Unicode word boundary.

      • WordEndHalfUnicode : Look

        Match the end half of a Unicode word boundary.

      • PreviousMatch : Look

        Match the end of the previous match or start of string

      • ClearMatches : Look

        Clear matches

      Instances For
        Equations
        Instances For
          @[reducible, inline]

          The identifier of a finite automaton state, represented by a Nat. The identifier may refer to a non existing state.

          Equations
          Instances For
            Instances For
              Equations
              Instances For

                A single transition to another state. This transition may only be followed if the current char in the haystack falls in the inclusive range of chars specified.

                • start : UInt32

                  The inclusive start of the char range.

                • end : UInt32

                  The inclusive end of the char range.

                • next : StateID

                  The identifier of the state to transition to.

                Instances For

                  check if c falls in the inclusive range of chars specified in Transition trans

                  Equations
                  Instances For

                    A state in an NFA.

                    • Empty (next : StateID) : State

                      An empty state whose only purpose is to forward the automaton to another state via an unconditional epsilon transition.

                    • NextChar (offset : Nat) (next : StateID) : State

                      use next char

                    • Fail : State

                      Fail, force backtrack

                    • Eat (mode : EatMode) (next : StateID) : State

                      remove Frame.Step in stack

                    • ChangeFrameStep («from» to : StateID) : State

                      change Frame.Step in stack and force backtrack

                    • RemoveFrameStep (sid : StateID) : State

                      remove Frame.Step in stack and force backtrack

                    • BackRef (n : Nat) (case_insensitive : Bool) (sid : StateID) : State

                      A state with a single transition may only be followed if the currunt chars match the backrefence to the capturung group.

                    • ByteRange (trans : Transition) : State

                      A state with a single transition that can only be taken if the current input symbol is in a particular range of chars.

                    • SparseTransitions (transitions : Array Transition) : State

                      A sequence of transitions used to represent a sparse state.

                    • Look (look : NFA.Look) (next : StateID) : State

                      A conditional epsilon transition satisfied via some sort of look-around.

                    • Union (alternates : Array StateID) : State

                      An alternation such that there exists an epsilon transition to all states in alternates, where matches found via earlier transitions are preferred over later transitions.

                    • UnionReverse (alternates : Array StateID) : State

                      An alternation such that there exists an epsilon transition to all states in alternates, where matches found via later transitions are preferred over earlier transitions.

                    • BinaryUnion (alt1 alt2 : StateID) : State

                      An alternation such that there exists precisely two unconditional epsilon transitions

                    • Capture (role : Capture.Role) (next : StateID) (pattern_id : PatternID) (group_index slot : Nat) : State

                      An empty state that records a capture location. slot in this context refers to the specific capture group slot offset that is being recorded. Each capturing group has two slots corresponding to the start and end of the matching portion of that group.

                    • Match (pattern_id : PatternID) : State

                      A match state. There is at least one such occurrence of this state for each regex that can match that is in this NFA.

                    Instances For
                      Equations
                      Instances For

                        Array of (slot, group) pairs of captures.

                        Equations
                        • One or more equations did not get rendered due to their size.
                        Instances For

                          A char oriented Thompson non-deterministic finite automaton (NFA).

                          Instances For
                            Equations
                            • One or more equations did not get rendered due to their size.
                            Instances For
                              inductive NFA.Checked.EatMode (n : Nat) :
                              • Until {n : Nat} : Fin nEatMode n

                                Eat frames until State found

                              • ToLast {n : Nat} : Fin nEatMode n

                                Eat frames inclusive last occurunce of State.

                              Instances For
                                Equations
                                Instances For

                                  A single transition to another state. This transition may only be followed if the current char in the haystack falls in the inclusive range of chars specified.

                                  • start : Nat

                                    The inclusive start of the char range.

                                  • end : Nat

                                    The inclusive end of the char range.

                                  • next : Fin n

                                    The identifier of the state to transition to.

                                  Instances For

                                    check if c falls in the inclusive range of chars specified in Transition trans

                                    Equations
                                    Instances For
                                      inductive NFA.Checked.State (n : Nat) :

                                      A state in an NFA, the elements of Fin n refer to existing states in the NFA.

                                      • Empty {n : Nat} (next : Fin n) : State n

                                        An empty state whose only purpose is to forward the automaton to another state via an unconditional epsilon transition.

                                      • NextChar {n : Nat} (offset : Nat) (next : Fin n) : State n

                                        use netx char

                                      • Fail {n : Nat} : State n

                                        Fail, force backtracking

                                      • Eat {n : Nat} (mode : EatMode n) (next : Fin n) : State n

                                        remove Frame.Step in stack

                                      • ChangeFrameStep {n : Nat} (fr to : Fin n) : State n

                                        change Frame.Step in stack and force backtracking

                                      • RemoveFrameStep {n : Nat} (sid : Fin n) : State n

                                        remove Frame.Step in stack and force backtrack

                                      • BackRef {n : Nat} (b : Nat) (case_insensitive : Bool) (sid : Fin n) : State n

                                        A state with a single transition may only be followed if the current chars match the backrefence to the capturung group.

                                      • ByteRange {n : Nat} (trans : Transition n) : State n

                                        A state with a single transition that can only be taken if the current input symbol is in a particular range of chars.

                                      • SparseTransitions {n : Nat} (transitions : Array (Transition n)) : State n

                                        A sequence of transitions used to represent a sparse state.

                                      • Look {n : Nat} (look : NFA.Look) (next : Fin n) : State n

                                        A conditional epsilon transition satisfied via some sort of look-around.

                                      • Union {n : Nat} (alternates : Array (Fin n)) : State n

                                        An alternation such that there exists an epsilon transition to all states in alternates, where matches found via earlier transitions are preferred over later transitions.

                                      • UnionReverse {n : Nat} (alternates : Array (Fin n)) : State n

                                        An alternation such that there exists an epsilon transition to all states in alternates, where matches found via later transitions are preferred over earlier transitions.

                                      • BinaryUnion {n : Nat} (alt1 alt2 : Fin n) : State n

                                        An alternation such that there exists precisely two unconditional epsilon transitions

                                      • Capture {n : Nat} (role : Capture.Role) (next : Fin n) (pattern_id : PatternID) (group_index slot : Nat) : State n

                                        An empty state that records a capture location. slot in this context refers to the specific capture group slot offset that is being recorded. Each capturing group has two slots corresponding to the start and end of the matching portion of that group.

                                      • Match {n : Nat} (pattern_id : PatternID) : State n

                                        A match state. There is at least one such occurrence of this state for each regex that can match that is in this NFA.

                                      Instances For
                                        Equations
                                        Instances For
                                          Equations

                                          Valid slot arrays have even size and consist of groups with two slots at consecutive positions.

                                          Equations
                                          Instances For

                                            A char oriented Thompson non-deterministic finite automaton (NFA) with n states

                                            • n : Nat

                                              number of states

                                            • states : Array (State self.n)

                                              states of the NFA

                                            • groups : Array Nat

                                              capture groups which may match with a empty string

                                            • slots : Array (Nat × Nat)

                                              Array of (slot, group) pairs of captures

                                            • unanchored_prefix_in_backtrack : Bool

                                              simulate a unanchored prefix by the backtracker

                                            • isEq : self.n = self.states.size

                                              assertion that n equals size of states

                                            • slotsValid : Slots.Valid self.slots = true

                                              slots are valid

                                            Instances For
                                              Equations
                                              • One or more equations did not get rendered due to their size.
                                              instance NFA.Checked.instCoeFinNSizeStateStates {nfa : NFA} :
                                              Coe (Fin nfa.n) (Fin nfa.states.size)
                                              Equations
                                              Equations
                                              • One or more equations did not get rendered due to their size.
                                              Instances For

                                                transform Unchecked.NFA to Checked.NFA

                                                Equations
                                                • One or more equations did not get rendered due to their size.
                                                Instances For
                                                  structure Regex :

                                                  A compiled regular expression for searching Unicode haystacks.

                                                  Instances For
                                                    Equations