This post is auto-generated from RSS feed The Rust Programming Language Forum - Latest topics. Source: Determining if a string can be continued while still matching a regex
I have a regular expression, starting out in text form, and an input string which may or may not match the regular expression. What I want to do is efficiently determine if any extension of the string (over a limited alphabet) will match the regular expression, i.e. if the input string I have is a prefix of some string that will match the regex. (For bonus points, I'd like to extract which patterns are still matchable when I have a regex that combines several, but a single one will do.)
As an example, if I have the regex `ab*`, I want to be able to distinguish that `a` can be extended into a matching string, whereas `c` cannot be. In practice, my regexes will be partial matches on the tail of the string, but I don't think that actually changes the problem.
My thinking is that the `regex-automata` crate will let me explicitly "step" the finite state machine, so when I get to the end of the string, I can then check whether any accepting state is reachable from the state I end up in. Looking at the `Automaton` trait for the state machines, which is implemented for DFAs, I see how to step through the matching process, but the only API I can see for accessing the state transitions is on the Thompson NFA. Is the implication that I have to compile two different objects essentially from scratch, or have I misunderstood? Or is there some more direct way to answer this that I've overlooked?
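To make the idea concrete, here is a minimal sketch of the "step, then check reachability" approach on a hand-built DFA for `ab*` over the alphabet `{a, b, c}`. It deliberately does not use `regex-automata`: the `Dfa` struct, `live_states`, `can_extend`, and `ab_star` are hypothetical names invented for illustration, and the transition table is written out by hand rather than compiled from the regex. It only shows the shape of the computation I have in mind: precompute which states can still reach an accepting state, then walk the input and see whether I end up in one of them.

```rust
/// A toy DFA over a small alphabet. Hypothetical type for illustration only;
/// this is not the regex-automata representation.
struct Dfa {
    alphabet: Vec<u8>,
    /// transitions[state][symbol_index] = next state
    transitions: Vec<Vec<usize>>,
    accepting: Vec<bool>,
    start: usize,
}

impl Dfa {
    /// Mark every state from which some accepting state is reachable
    /// (a simple fixpoint iteration; fine for small automata).
    fn live_states(&self) -> Vec<bool> {
        let mut live = self.accepting.clone();
        let mut changed = true;
        while changed {
            changed = false;
            for (state, row) in self.transitions.iter().enumerate() {
                if !live[state] && row.iter().any(|&next| live[next]) {
                    live[state] = true;
                    changed = true;
                }
            }
        }
        live
    }

    /// Returns true if `input` is a prefix of at least one accepted string
    /// (including the case where `input` itself is accepted).
    fn can_extend(&self, live: &[bool], input: &[u8]) -> bool {
        let mut state = self.start;
        for byte in input {
            // A byte outside the limited alphabet can never lead to a match.
            let Some(sym) = self.alphabet.iter().position(|b| b == byte) else {
                return false;
            };
            state = self.transitions[state][sym];
            // Once no accepting state is reachable, we can stop early.
            if !live[state] {
                return false;
            }
        }
        live[state]
    }
}

/// Hand-written DFA for the anchored regex `ab*` over {a, b, c}.
fn ab_star() -> Dfa {
    Dfa {
        alphabet: vec![b'a', b'b', b'c'],
        transitions: vec![
            vec![1, 2, 2], // state 0: start, expects 'a'
            vec![2, 1, 2], // state 1: accepting, loops on 'b'
            vec![2, 2, 2], // state 2: trap, nothing can match from here
        ],
        accepting: vec![false, true, false],
        start: 0,
    }
}

fn main() {
    let dfa = ab_star();
    let live = dfa.live_states();
    for input in ["", "a", "abb", "c", "aba"] {
        println!("{:?}: {}", input, dfa.can_extend(&live, input.as_bytes()));
    }
}
```

With this table, `a` and `abb` are reported as extendable and `c` and `aba` are not. The liveness set only needs to be computed once per automaton, so each query costs one pass over the input. What I am unsure about is how to get the equivalent of `live_states` out of a compiled DFA in `regex-automata` without access to its transitions.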