[PR] Fix test failure in TestReqOptSumScorer.testFilterRandomRareOpt [lucene]

2024-02-21 Thread via GitHub
easyice opened a new pull request, #13122: URL: https://github.com/apache/lucene/pull/13122 closes: https://github.com/apache/lucene/issues/13120 This issue is similar to https://github.com/apache/lucene/pull/13069. The exception was not thrown in the previous PR because `s1` mi

Re: [PR] FieldInfosFormat translation should be independent of VectorSimilarityFunction enum [lucene]

2024-02-21 Thread via GitHub
ChrisHegarty commented on code in PR #13119: URL: https://github.com/apache/lucene/pull/13119#discussion_r1497177281 ## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java: ## @@ -171,15 +172,25 @@ private void validateFieldEntry(FieldInfo info,

Re: [PR] FieldInfosFormat translation should be independent of VectorSimilarityFunction enum [lucene]

2024-02-21 Thread via GitHub
benwtrent commented on code in PR #13119: URL: https://github.com/apache/lucene/pull/13119#discussion_r1497494311 ## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java: ## @@ -171,15 +172,25 @@ private void validateFieldEntry(FieldInfo info, F

[PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

2024-02-21 Thread via GitHub
benwtrent opened a new pull request, #13124: URL: https://github.com/apache/lucene/pull/13124 Opening this PR for discussion. I took a stab at https://github.com/apache/lucene/issues/12740 The idea is this: - Add new task executor to IWC & num parallel worker actions - Ext

[PR] Make BP work on indexes that have blocks. [lucene]

2024-02-21 Thread via GitHub
jpountz opened a new pull request, #13125: URL: https://github.com/apache/lucene/pull/13125 This is similar to the work we did on supporting index sorting on indexes that have blocks, but for index reordering this time. -- This is an automated message from the Apache Git Service. To re

Re: [PR] Make BP work on indexes that have blocks. [lucene]

2024-02-21 Thread via GitHub
jpountz commented on PR #13125: URL: https://github.com/apache/lucene/pull/13125#issuecomment-1957338146 Draft for now because it builds on #13125 which has not been merged yet. When we have support for BP + blocks, then we can enable BP on githubsearch cc @mikemccand.

Re: [PR] Make BP work on indexes that have blocks. [lucene]

2024-02-21 Thread via GitHub
mikemccand commented on PR #13125: URL: https://github.com/apache/lucene/pull/13125#issuecomment-1957352374 Yay, I confirmed that githubsearch [picked up that mention of me](https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=status%3AOpen&dd=mentioned_users%3Amikemccan

Re: [PR] Make BP work on indexes that have blocks. [lucene]

2024-02-21 Thread via GitHub
mikemccand commented on PR #13125: URL: https://github.com/apache/lucene/pull/13125#issuecomment-1957356548 https://github.com/mikemccand/luceneserver/issues/28

[PR] factor out (public static) MultiCollector.scoreMode(Collector[]) method [lucene]

2024-02-21 Thread via GitHub
cpoerschke opened a new pull request, #13126: URL: https://github.com/apache/lucene/pull/13126 To allow re-use of the logic rather than duplicating it, e.g. see the https://github.com/apache/solr/pull/2248/commits/3cee3e591995520cdfed41a5228a2e01c3b5cc0f#r1485486388 thread.

[I] TestIDVersionPostingsFormat failure [lucene]

2024-02-21 Thread via GitHub
jpountz opened a new issue, #13127: URL: https://github.com/apache/lucene/issues/13127 ### Description This failure is not reproducible, which is maybe not too surprising given that the test involves concurrency. A few things are interesting: - This is the same failure I had got o

Re: [PR] factor out (public static) MultiCollector.scoreMode(Collector[]) method [lucene]

2024-02-21 Thread via GitHub
jpountz commented on PR #13126: URL: https://github.com/apache/lucene/pull/13126#issuecomment-1957548055 I'm not too familiar with the Solr code but I see that it duplicates logic in `setScorer` as well, could it reuse `MultiCollector` directly instead of exposing the method that combines s

Re: [PR] factor out (public static) MultiCollector.scoreMode(Collector[]) method [lucene]

2024-02-21 Thread via GitHub
cpoerschke commented on PR #13126: URL: https://github.com/apache/lucene/pull/13126#issuecomment-1957580240 > ... could it reuse `MultiCollector` directly instead of exposing the method that combines score modes? I'd wondered the same-ish i.e. `MultiCollectorManager` re-use -- https:

Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

2024-02-21 Thread via GitHub
jpountz commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1957581351 Thinking out loud: since merge schedulers already have the ability to merge concurrently (across multiple merges rather than within a merge though), it would be nice to fully encapsulate

Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

2024-02-21 Thread via GitHub
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1957696114 > it would be nice to fully encapsulate the merging concurrency there instead of having two sources of merging concurrency that are not aware of one another. To take advantage

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2024-02-21 Thread via GitHub
benwtrent commented on PR #12820: URL: https://github.com/apache/lucene/pull/12820#issuecomment-1957923215 I have done some more benchmarking and there isn't really a significant improvement. This is over 500k, 1024 vectors. Getting the nearest 500 neighbors. Baseline ``` late

Re: [PR] Avoid allocating redundant Strings [lucene]

2024-02-21 Thread via GitHub
github-actions[bot] commented on PR #13085: URL: https://github.com/apache/lucene/pull/13085#issuecomment-1958434623 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Adding binary Hamming distance as similarity option for byte vectors [lucene]

2024-02-21 Thread via GitHub
github-actions[bot] commented on PR #13076: URL: https://github.com/apache/lucene/pull/13076#issuecomment-1958434666 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

2024-02-21 Thread via GitHub
zhaih commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1958716375 +1 to move executor away from the Codec API (altho it's me who placed them there LOL) > it would be nice to fully encapsulate the merging concurrency there instead of having two sou

Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

2024-02-21 Thread via GitHub
zhaih commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1498656246 ## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsFormat.java: ## @@ -152,7 +153,25 @@ public Lucene99HnswVectorsFormat() { * @param beamW

Re: [PR] Fix test failure in TestReqOptSumScorer.testFilterRandomRareOpt [lucene]

2024-02-21 Thread via GitHub
easyice merged PR #13122: URL: https://github.com/apache/lucene/pull/13122

Re: [I] org.apache.lucene.search.TestReqOptSumScorer.testFilterRandomRareOpt fails intermittently [lucene]

2024-02-21 Thread via GitHub
easyice closed issue #13120: org.apache.lucene.search.TestReqOptSumScorer.testFilterRandomRareOpt fails intermittently URL: https://github.com/apache/lucene/issues/13120

Re: [PR] Use group-varint encoding for the tail of postings [lucene]

2024-02-21 Thread via GitHub
wjp719 commented on PR #12782: URL: https://github.com/apache/lucene/pull/12782#issuecomment-1958873159 @easyice Hi, I suspect that data encoded with group-varint differs from data encoded the old way, so is this change compatible with the old index format?