rmuir opened a new pull request, #14381:
URL: https://github.com/apache/lucene/pull/14381

   Add an optional flag to support case-insensitive ranges. A minimal DFA is 
always created. This works with Unicode, but may have a performance cost.
   
   Each codepoint in the range must be visited, and any case alternatives are 
added to a set. This set can be large if the range spans much of Unicode.
   
   The CPU and memory costs are contained within a single function that only 
runs when the optional flag is enabled. For example, when matching a caseless 
`/[a-z]/`, 56 codepoints are accumulated into an `int[]`, which is then 
compressed to 5 ranges before being added to the parse tree.
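
   The accumulate-then-compress step described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the class `CaselessRangeSketch` and the methods `expandRange` and `compress` are hypothetical names, and the sketch uses the JDK's simple `Character.toUpperCase`/`Character.toLowerCase` mappings as a stand-in for the case-folding alternatives the PR actually consults, so it finds fewer codepoints than the 56 quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch: expand a codepoint range with case variants, then
// merge the sorted result into contiguous ranges.
class CaselessRangeSketch {

  /**
   * Visits every codepoint in [start, end] and collects it together with its
   * JDK simple upper/lower case variants. Returns a sorted, duplicate-free array.
   * Assumption: JDK simple case mappings stand in for real case-folding data.
   */
  static int[] expandRange(int start, int end) {
    TreeSet<Integer> set = new TreeSet<>();
    for (int cp = start; cp <= end; cp++) {
      set.add(cp);
      set.add(Character.toUpperCase(cp));
      set.add(Character.toLowerCase(cp));
    }
    return set.stream().mapToInt(Integer::intValue).toArray();
  }

  /** Merges a sorted, duplicate-free codepoint array into inclusive {start, end} ranges. */
  static List<int[]> compress(int[] sorted) {
    List<int[]> ranges = new ArrayList<>();
    for (int cp : sorted) {
      int[] last = ranges.isEmpty() ? null : ranges.get(ranges.size() - 1);
      if (last != null && last[1] + 1 == cp) {
        last[1] = cp; // extends the current range by one codepoint
      } else {
        ranges.add(new int[] {cp, cp}); // starts a new range
      }
    }
    return ranges;
  }

  public static void main(String[] args) {
    int[] cps = expandRange('a', 'z');
    List<int[]> ranges = compress(cps);
    System.out.println(cps.length + " codepoints, " + ranges.size() + " ranges");
    for (int[] r : ranges) {
      System.out.println(
          new String(Character.toChars(r[0])) + "-" + new String(Character.toChars(r[1])));
    }
  }
}
```

   With the JDK mappings alone, `[a-z]` expands to 52 codepoints in 2 ranges (`A-Z` and `a-z`). The cost profile is the same as described above: work and memory are proportional to the size of the input range, and only the compressed ranges need to reach the parse tree.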
   
   Closes #14378
   
   Here's what the resulting `/[a-z]/` automaton looks like, in case you are curious:
   ![graphviz (5)](https://github.com/user-attachments/assets/dfbc25cd-4a32-4ffc-aee3-ab8dd43a63ec)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to