rmuir commented on PR #15232:
URL: https://github.com/apache/lucene/pull/15232#issuecomment-3335658607

   Some of these regexps may now come out deterministic, or even 
minimal and deterministic, straight from the parsing phase (this is a good 
thing!), so we may have to get more creative with the regexps to force 
`determinize()` to do actual work.
   
   In such a case, `Operations.determinize()` is a no-op, which is why I 
think you see some of the crazy-fast numbers here.
   
   The best way to check the regexes is to write little throwaway 
unit tests similar to:
   
https://github.com/apache/lucene/blob/002094613418c4bc6a7e335a8edca82fd26ac03d/lucene/core/src/test/org/apache/lucene/util/automaton/TestRegExpParsing.java#L527-L533
   
   Basically, if the result from `toAutomaton()` passes `assertCleanDFA()`, 
then you know it is already a DFA and `determinize()` won't do anything. See the 
assertions here:
   
   
https://github.com/apache/lucene/blob/de1ed71261d579fdd3cf71b0734f30ea799c4b1f/lucene/test-framework/src/java/org/apache/lucene/tests/util/automaton/AutomatonTestUtil.java#L394-L413
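
   As a rough standalone illustration of what "already a DFA" means, here is 
a minimal sketch in plain Java. It is not Lucene code and does not reproduce 
`assertCleanDFA()`: the `DeterminismCheck` class, its `Transition` type, and 
the two hand-built automata for `ab|ac` are all hypothetical, made up for 
this example. The only idea it shows is that an automaton is deterministic 
when no state has two outgoing transitions whose character ranges overlap.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, NOT Lucene's Automaton or AutomatonTestUtil.
final class DeterminismCheck {

    /** One transition from state `source` to state `dest` on any char in [min, max]. */
    static final class Transition {
        final int source, min, max, dest;

        Transition(int source, int min, int max, int dest) {
            this.source = source;
            this.min = min;
            this.max = max;
            this.dest = dest;
        }
    }

    /**
     * Returns true if no state has two outgoing transitions with overlapping
     * ranges. For simplicity, any overlap counts as nondeterministic, even if
     * both transitions lead to the same destination.
     */
    static boolean isDeterministic(int numStates, List<Transition> transitions) {
        for (int state = 0; state < numStates; state++) {
            List<Transition> outgoing = new ArrayList<>();
            for (Transition t : transitions) {
                if (t.source == state) {
                    outgoing.add(t);
                }
            }
            // After sorting by range start, any overlap shows up between neighbors.
            outgoing.sort(Comparator.comparingInt((Transition t) -> t.min));
            for (int i = 1; i < outgoing.size(); i++) {
                if (outgoing.get(i).min <= outgoing.get(i - 1).max) {
                    return false;
                }
            }
        }
        return true;
    }

    /** "ab|ac" built naively: state 0 has two transitions on 'a'. */
    static List<Transition> unfactored() {
        return List.of(
            new Transition(0, 'a', 'a', 1),
            new Transition(0, 'a', 'a', 2),
            new Transition(1, 'b', 'b', 3),
            new Transition(2, 'c', 'c', 3));
    }

    /** The same language with the common prefix shared: 'a' then [b-c]. */
    static List<Transition> factored() {
        return List.of(
            new Transition(0, 'a', 'a', 1),
            new Transition(1, 'b', 'c', 2));
    }

    public static void main(String[] args) {
        System.out.println("unfactored deterministic? " + isDeterministic(4, unfactored()));
        System.out.println("factored deterministic?   " + isDeterministic(3, factored()));
    }
}
```

   The first automaton is the kind of input that would give a determinization 
step real work; the second already passes the check, so benchmarking 
determinization on it would mostly measure a no-op.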
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

