Re: [PR] RegExp: add CASE_INSENSITIVE_RANGE support [lucene]

2025-04-05 Thread via GitHub
rmuir commented on code in PR #14381:
URL: https://github.com/apache/lucene/pull/14381#discussion_r2007499003

##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -778,6 +786,53 @@ private int[] toCaseInsensitiveChar(int codepoint) {
 }
 }
+ /**
+

2025-04-05 Thread via GitHub
rmuir commented on PR #14381: URL: https://github.com/apache/lucene/pull/14381#issuecomment-2743822277 @dweiss thanks for the suggestion there, gazillions of array creations avoided. so now this thing will only spike cpu during parsing at worst. I honestly forget you can pass functions to f

2025-03-21 Thread via GitHub
rmuir merged PR #14381: URL: https://github.com/apache/lucene/pull/14381

2025-03-21 Thread via GitHub
rmuir commented on PR #14381: URL: https://github.com/apache/lucene/pull/14381#issuecomment-2743798573 After fixing the Turkish case, here's the (correct) automaton for `/[a-z]/`. The only special cases are long-s and the Kelvin sign, as you'd expect: ![graphviz (6)](https://github.com/user-attac
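
The long-s and Kelvin sign cases can be reproduced outside Lucene with a rough, JDK-only sketch. It is not the code from the PR: the class name `AsciiCaseVariants` and its helpers are hypothetical, and it relies on `Character.toLowerCase`/`Character.toUpperCase` rather than whatever case-folding data `RegExp` uses. Skipping the Turkish dotted/dotless i pair is an assumption made here to mirror the "after fixing the Turkish" remark, since the JDK's simple mappings would otherwise tie both code points to ASCII `i`/`I`.

```java
import java.util.ArrayList;
import java.util.List;

public class AsciiCaseVariants {
  // Turkish/Azeri special cases, skipped by assumption (see the note above).
  private static final int DOTTED_CAPITAL_I = 0x0130;
  private static final int DOTLESS_SMALL_I = 0x0131;

  static boolean isAsciiLetter(int cp) {
    return (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z');
  }

  // Collects non-ASCII code points whose simple lower- or uppercase
  // mapping in the JDK lands on an ASCII letter.
  static List<Integer> nonAsciiVariants() {
    List<Integer> out = new ArrayList<>();
    for (int cp = 0x80; cp <= Character.MAX_CODE_POINT; cp++) {
      if (cp == DOTTED_CAPITAL_I || cp == DOTLESS_SMALL_I) {
        continue;
      }
      if (isAsciiLetter(Character.toLowerCase(cp))
          || isAsciiLetter(Character.toUpperCase(cp))) {
        out.add(cp);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    for (int cp : nonAsciiVariants()) {
      System.out.printf("U+%04X %s%n", cp, Character.getName(cp));
    }
  }
}
```

The resulting list should include U+017F (LATIN SMALL LETTER LONG S, which uppercases to `S`) and U+212A (KELVIN SIGN, which lowercases to `k`), the two extra code points a case-insensitive `[a-z]` has to account for.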

2025-03-21 Thread via GitHub
rmuir commented on code in PR #14381:
URL: https://github.com/apache/lucene/pull/14381#discussion_r2007006500

##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -778,6 +786,53 @@ private int[] toCaseInsensitiveChar(int codepoint) {
 }
 }
+ /**
+

2025-03-20 Thread via GitHub
dweiss commented on code in PR #14381:
URL: https://github.com/apache/lucene/pull/14381#discussion_r2006930601

##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -778,6 +786,53 @@ private int[] toCaseInsensitiveChar(int codepoint) {
 }
 }
+ /**