[GitHub] [lucene] LuXugang commented on a diff in pull request #12405: Skip docs with Docvalues in NumericLeafComparator

2023-07-18 Thread via GitHub
LuXugang commented on code in PR #12405: URL: https://github.com/apache/lucene/pull/12405#discussion_r1266458966 ## lucene/core/src/java/org/apache/lucene/search/comparators/DoubleComparator.java: ## @@ -61,8 +63,12 @@ public LeafFieldComparator getLeafComparator(LeafReaderCont

[GitHub] [lucene] LuXugang commented on a diff in pull request #12405: Skip docs with Docvalues in NumericLeafComparator

2023-07-18 Thread via GitHub
LuXugang commented on code in PR #12405: URL: https://github.com/apache/lucene/pull/12405#discussion_r1266460078 ## lucene/core/src/java/org/apache/lucene/search/comparators/IntComparator.java: ## @@ -98,19 +99,30 @@ public void copy(int slot, int doc) throws IOException {

[GitHub] [lucene] LuXugang commented on a diff in pull request #12405: Skip docs with Docvalues in NumericLeafComparator

2023-07-18 Thread via GitHub
LuXugang commented on code in PR #12405: URL: https://github.com/apache/lucene/pull/12405#discussion_r1266459560 ## lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java: ## @@ -309,34 +323,100 @@ private void updateSkipInterval(boolean success) {

[GitHub] [lucene] donnerpeter opened a new pull request, #12447: hunspell: speed up the dictionary enumeration

2023-07-18 Thread via GitHub
donnerpeter opened a new pull request, #12447: URL: https://github.com/apache/lucene/pull/12447 * cache each word's case and the lowercase form * group the words by lengths to avoid even visiting entries with unneeded lengths -- This is an automated message from the Apache Git Service. To
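The two ideas named in this pull request description (caching each word's case and lowercase form, and grouping words by length) can be illustrated with a minimal sketch. This is not the code from PR #12447: the class `LengthBucketsSketch`, the `Entry` holder, and the methods `index` and `candidates` are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Lucene's hunspell implementation.
class LengthBucketsSketch {
  // Caches the lowercase form and a simple case flag so they are computed once per word.
  static final class Entry {
    final String word;
    final String lower;
    final boolean capitalized;

    Entry(String word) {
      this.word = word;
      this.lower = word.toLowerCase(Locale.ROOT);
      this.capitalized = !word.isEmpty() && Character.isUpperCase(word.charAt(0));
    }
  }

  // Groups dictionary words into buckets keyed by word length.
  static Map<Integer, List<Entry>> index(List<String> words) {
    Map<Integer, List<Entry>> byLength = new TreeMap<>();
    for (String w : words) {
      byLength.computeIfAbsent(w.length(), n -> new ArrayList<>()).add(new Entry(w));
    }
    return byLength;
  }

  // Visits only the buckets whose length lies in [minLen, maxLen]; other entries are never touched.
  static List<String> candidates(Map<Integer, List<Entry>> byLength, int minLen, int maxLen) {
    List<String> out = new ArrayList<>();
    for (int len = minLen; len <= maxLen; len++) {
      for (Entry e : byLength.getOrDefault(len, List.of())) {
        out.add(e.lower); // reuse the cached lowercase form
      }
    }
    return out;
  }

  public static void main(String[] args) {
    Map<Integer, List<Entry>> idx = index(List.of("Haus", "haus", "a", "Berlin"));
    System.out.println(candidates(idx, 4, 4));
  }
}
```

The design point is that a lookup restricted to a length range skips whole buckets rather than filtering every entry, and the per-word lowercasing is paid once at indexing time.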

[GitHub] [lucene] donnerpeter commented on pull request #12447: hunspell: speed up the dictionary enumeration

2023-07-18 Thread via GitHub
donnerpeter commented on PR #12447: URL: https://github.com/apache/lucene/pull/12447#issuecomment-1639903635 This improves suggestion performance for de/en/ru/uk by 15-30%

[GitHub] [lucene] gashutos opened a new issue, #12448: [Performance] sort query improvement for sequential ordered data [e.g. timestamp field sort in log data]

2023-07-18 Thread via GitHub
gashutos opened a new issue, #12448: URL: https://github.com/apache/lucene/issues/12448 ### Description ## Problem statement Currently in `TopFieldCollector`, we have a PriorityQueue (binary min-heap implementation) to find the top `K` elements in `asc` or `desc` order. Whenever we find
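For readers unfamiliar with the technique the issue refers to, here is a minimal sketch of top-`K` selection with a bounded min-heap. It uses `java.util.PriorityQueue` rather than Lucene's own `PriorityQueue`, and the class and method names (`TopKSketch`, `topK`) are hypothetical; it is not the `TopFieldCollector` code.

```java
import java.util.PriorityQueue;

// Hypothetical sketch of bounded min-heap top-K selection; not Lucene's TopFieldCollector.
class TopKSketch {
  // Returns the k largest values in descending order.
  static long[] topK(long[] values, int k) {
    // Min-heap: the head is the smallest of the values currently kept.
    PriorityQueue<Long> heap = new PriorityQueue<>();
    for (long v : values) {
      if (heap.size() < k) {
        heap.add(v); // heap not full yet: always keep the value
      } else if (k > 0 && v > heap.peek()) {
        heap.poll(); // evict the current smallest kept value
        heap.add(v); // insert the competitive value, O(log k)
      }
    }
    // poll() yields ascending order, so fill the result from the back.
    long[] out = new long[heap.size()];
    for (int i = out.length - 1; i >= 0; i--) {
      out[i] = heap.poll();
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(topK(new long[] {5, 1, 9, 3, 7}, 3)));
  }
}
```

With input that is already sorted in the direction of the sort (for example, increasing timestamps with a descending sort), every new value beats the heap head, so each one triggers an eviction and an insertion.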

[GitHub] [lucene] MartinDemberger commented on a diff in pull request #12437: LUCENE-8183: Added the abbility to get noSubMatches and noOverlapping Matches

2023-07-18 Thread via GitHub
MartinDemberger commented on code in PR #12437: URL: https://github.com/apache/lucene/pull/12437#discussion_r1267179992 ## lucene/analysis/common/src/test/org/apache/lucene/analysis/compound/TestHyphenationCompoundWordTokenFilterFactory.java: ## @@ -47,6 +47,33 @@ public void te

[GitHub] [lucene] donnerpeter merged pull request #12447: hunspell: speed up the dictionary enumeration

2023-07-18 Thread via GitHub
donnerpeter merged PR #12447: URL: https://github.com/apache/lucene/pull/12447

[GitHub] [lucene] MartinDemberger commented on a diff in pull request #12437: LUCENE-8183: Added the abbility to get noSubMatches and noOverlapping Matches

2023-07-18 Thread via GitHub
MartinDemberger commented on code in PR #12437: URL: https://github.com/apache/lucene/pull/12437#discussion_r1267286254 ## lucene/CHANGES.txt: ## @@ -65,6 +65,8 @@ New Features * LUCENE-10626 Hunspell: add tools to aid dictionary editing: analysis introspection, stem expansi

[GitHub] [lucene] MartinDemberger commented on pull request #12437: LUCENE-8183: Added the abbility to get noSubMatches and noOverlapping Matches

2023-07-18 Thread via GitHub
MartinDemberger commented on PR #12437: URL: https://github.com/apache/lucene/pull/12437#issuecomment-1640968390 This is my first PR for lucene. Thank you for your patience and help. If I can improve something please let me know.

[GitHub] [lucene] MartinDemberger commented on issue #9231: HyphenationCompoundWordTokenFilter creates overlapping tokens with onlyLongestMatch enabled [LUCENE-8183]

2023-07-18 Thread via GitHub
MartinDemberger commented on issue #9231: URL: https://github.com/apache/lucene/issues/9231#issuecomment-1640974145 > Does this also fix #4096 ? I'm sorry, but no. #4096 concerns DictionaryCompoundWordTokenFilter, but this one is about HyphenationCompoundWordTokenFilter. Maybe the cha

[GitHub] [lucene] uschindler commented on a diff in pull request #12437: LUCENE-8183: Added the abbility to get noSubMatches and noOverlapping Matches

2023-07-18 Thread via GitHub
uschindler commented on code in PR #12437: URL: https://github.com/apache/lucene/pull/12437#discussion_r1267297167 ## lucene/analysis/common/src/test/org/apache/lucene/analysis/compound/TestHyphenationCompoundWordTokenFilterFactory.java: ## @@ -47,6 +47,33 @@ public void testHyp

[GitHub] [lucene] uschindler commented on a diff in pull request #12437: LUCENE-8183: Added the abbility to get noSubMatches and noOverlapping Matches

2023-07-18 Thread via GitHub
uschindler commented on code in PR #12437: URL: https://github.com/apache/lucene/pull/12437#discussion_r1267299166 ## lucene/CHANGES.txt: ## @@ -65,6 +65,8 @@ New Features * LUCENE-10626 Hunspell: add tools to aid dictionary editing: analysis introspection, stem expansion an

[GitHub] [lucene] asubbu90 opened a new issue, #12449: byte to int in TruncateTokenFilterFactory to TruncateTokenFilter

2023-07-18 Thread via GitHub
asubbu90 opened a new issue, #12449: URL: https://github.com/apache/lucene/issues/12449 ### Description The TruncateTokenFilterFactory class parses the PREFIX_LENGTH_KEY value as a Byte, which goes up to 127, and then stores it in the prefixLength attribute. The TruncateTokenFilter class expects the argu
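The 127 limit mentioned in this issue can be reproduced with plain JDK parsing. The sketch below assumes the factory's parsing behaves like `Byte.parseByte`, which the excerpt does not confirm; the class and helper names (`PrefixLengthParseSketch`, `fitsInByte`) are hypothetical.

```java
// Hypothetical sketch of the byte-range limit; it does not call any Lucene class.
class PrefixLengthParseSketch {
  // Returns true if the string parses as a Java byte (range -128..127).
  static boolean fitsInByte(String value) {
    try {
      Byte.parseByte(value);
      return true;
    } catch (NumberFormatException e) {
      return false; // out of byte range or not a number
    }
  }

  public static void main(String[] args) {
    System.out.println("127 fits in byte: " + fitsInByte("127"));
    System.out.println("128 fits in byte: " + fitsInByte("128"));
    System.out.println("128 parsed as int: " + Integer.parseInt("128"));
  }
}
```

Parsing the same string with `Integer.parseInt` accepts any prefix length within the `int` range, which is the change the issue title ("byte to int") points toward.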