rmuir opened a new pull request, #14381: URL: https://github.com/apache/lucene/pull/14381
Add an optional flag to support case-insensitive ranges. A minimal DFA is always created.

This works with Unicode but may have a performance cost: each codepoint in the range must be iterated, and any case alternatives added to a set. That set can be large if the range spans much of Unicode. The CPU and memory costs are confined to a single function that is only used when the optional flag is set.

For example, when matching a caseless `/[a-z]/`, 56 codepoints are accumulated into an `int[]`, which is then compressed to 5 ranges before being added to the parse tree.

Closes #14378

Here's what the resulting `/[a-z]/` automaton looks like, in case you are curious:

[image not included]
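The expand-then-compress step described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the class `CaselessRangeSketch` and the method `expand` are hypothetical names, and the sketch assumes the JDK's simple `Character.toLowerCase`/`Character.toUpperCase` mappings as the source of case alternatives, which is likely narrower than whatever table the PR uses.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical helper for illustration; not the implementation in this PR.
final class CaselessRangeSketch {

  /**
   * Expands the inclusive codepoint range [start, end] with simple JDK case
   * mappings (an assumption of this sketch) and returns the result as
   * flattened inclusive ranges: {start0, end0, start1, end1, ...}.
   */
  static int[] expand(int start, int end) {
    // Step 1: visit every codepoint in the range and collect it together
    // with its lower- and upper-case variants. The set removes duplicates
    // and keeps the codepoints sorted.
    TreeSet<Integer> alternatives = new TreeSet<>();
    for (int cp = start; cp <= end; cp++) {
      alternatives.add(cp);
      alternatives.add(Character.toLowerCase(cp));
      alternatives.add(Character.toUpperCase(cp));
    }

    // Step 2: flatten the set into a sorted int[].
    int[] cps = alternatives.stream().mapToInt(Integer::intValue).toArray();

    // Step 3: compress runs of consecutive codepoints into ranges.
    int[] ranges = new int[cps.length * 2];
    int n = 0;
    int i = 0;
    while (i < cps.length) {
      int j = i;
      while (j + 1 < cps.length && cps[j + 1] == cps[j] + 1) {
        j++;
      }
      ranges[n++] = cps[i]; // range start
      ranges[n++] = cps[j]; // range end (inclusive)
      i = j + 1;
    }
    return Arrays.copyOf(ranges, n);
  }
}
```

With these simple JDK mappings, `CaselessRangeSketch.expand('a', 'z')` yields two ranges, `A-Z` and `a-z`. The PR reports 56 codepoints and 5 ranges for the same input, so its notion of case alternatives evidently covers more than this sketch does. The cost profile is the same either way: work and memory grow with the number of codepoints in the input range, and only when the caseless path is taken.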