Re: [PR] [WIP] Multi-Vector support for HNSW search [lucene]

2024-07-22 Thread via GitHub
vigyasharma commented on PR #13525: URL: https://github.com/apache/lucene/pull/13525#issuecomment-2244027209 > Cohere's wikipedia embeddings all indicate their parent page. So, I wonder how this would work on finding the nearest page given the `maxsim(passage)` vs. using the Lucene join log

Re: [PR] Refactor FST.saveMetadata() to FSTMetadata.save() [lucene]

2024-07-22 Thread via GitHub
dungba88 commented on PR #13549: URL: https://github.com/apache/lucene/pull/13549#issuecomment-2243996773 Thank you for merging @mikemccand ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

Re: [PR] Move synonym map off-heap for SynonymGraphFilter [lucene]

2024-07-22 Thread via GitHub
dungba88 commented on PR #13054: URL: https://github.com/apache/lucene/pull/13054#issuecomment-2243997751 Note: The above PR has been merged

[PR] Delegating the matches in PointRangeQuery weight to relate method [lucene]

2024-07-22 Thread via GitHub
jainankitk opened a new pull request, #13599: URL: https://github.com/apache/lucene/pull/13599 ### Description Delegating the matches in PointRangeQuery weight to relate method

[I] Remove redundant code in PointRangeQuery Weight [lucene]

2024-07-22 Thread via GitHub
jainankitk opened a new issue, #13598: URL: https://github.com/apache/lucene/issues/13598 ### Description The implementation of matches function within PointRangeQuery Weight should delegate to the relate method given the min and max are both inclusive.

Re: [PR] Compute facets while collecting [lucene]

2024-07-22 Thread via GitHub
gsmiller commented on PR #13568: URL: https://github.com/apache/lucene/pull/13568#issuecomment-2243764545 Wow, thanks for this new faceting proposal! It's really exciting to see. I'm working my way through this PR and will hopefully be able to share some thoughts in the next couple of days.

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1687038516 ## lucene/test-framework/src/java/org/apache/lucene/tests/search/ScorerIndexSearcher.java: ## @@ -76,4 +77,14 @@ protected void search(List leaves, Weight weight, Col

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-22 Thread via GitHub
magibney commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1687033792 ## lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Feature/vector io prefetch [lucene]

2024-07-22 Thread via GitHub
benwtrent commented on PR #13586: URL: https://github.com/apache/lucene/pull/13586#issuecomment-2243643190 I ran a bunch more benchmarks, the numbers really jump around for prefetch, but are way more consistent for baseline. Here are two JFRs, but they aren't very enlightening. [

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
javanna commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2243620253 I did some work on this draft PR and made the fixes around hits counting early termination and caching more future proof. They are now contained to `TotalHitCountCollectorManager`, which

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-22 Thread via GitHub
dsmiley commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1686995275 ## lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1686998135 ## lucene/core/src/test/org/apache/lucene/search/TestSynonymQuery.java: ## @@ -211,7 +211,7 @@ private void doTestScores(int totalHitsThreshold) throws IOException {

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1686994627 ## lucene/core/src/test/org/apache/lucene/index/TestForTooMuchCloning.java: ## @@ -80,7 +80,7 @@ public void test() throws Exception { // System.out.println("quer

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1686989956 ## lucene/core/src/test/org/apache/lucene/search/TestSynonymQuery.java: ## @@ -198,7 +198,17 @@ private void doTestScores(int totalHitsThreshold) throws IOException {

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1686922601 ## lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java: ## @@ -28,17 +31,77 @@ */ public class TotalHitCountCollectorManager imple

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-22 Thread via GitHub
magibney commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1686805197 ## lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Refactor FST.saveMetadata() to FSTMetadata.save() [lucene]

2024-07-22 Thread via GitHub
mikemccand merged PR #13549: URL: https://github.com/apache/lucene/pull/13549

Re: [PR] Add simple tool to diff entries in lucene's CHANGES.txt that should be identical [lucene]

2024-07-22 Thread via GitHub
mikemccand commented on PR #12860: URL: https://github.com/apache/lucene/pull/12860#issuecomment-2243259976 And thank you @ChrisHegarty for removing leftover debug code above ^^.

Re: [PR] Add simple tool to diff entries in lucene's CHANGES.txt that should be identical [lucene]

2024-07-22 Thread via GitHub
mikemccand merged PR #12860: URL: https://github.com/apache/lucene/pull/12860

Re: [PR] Add simple tool to diff entries in lucene's CHANGES.txt that should be identical [lucene]

2024-07-22 Thread via GitHub
mikemccand commented on PR #12860: URL: https://github.com/apache/lucene/pull/12860#issuecomment-2243212502 Whoa, I forgot I had not merged this tool! To confirm it still works, I ran it to check whether `branch_9x`'s 9.10.0 entry matches `main` and it found differences! ``` rapt

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
jpountz commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1686649591 ## lucene/core/src/test/org/apache/lucene/search/TestSynonymQuery.java: ## @@ -198,7 +198,17 @@ private void doTestScores(int totalHitsThreshold) throws IOException {

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
jpountz commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1686626731 ## lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java: ## @@ -28,17 +31,77 @@ */ public class TotalHitCountCollectorManager imple

Re: [PR] Use Max WAND optimizations with ToParentBlockJoinQuery when using ScoreMode.Max [lucene]

2024-07-22 Thread via GitHub
jpountz commented on code in PR #13587: URL: https://github.com/apache/lucene/pull/13587#discussion_r1684532729 ## lucene/join/src/test/org/apache/lucene/search/join/TestBlockJoinScorer.java: ## @@ -16,23 +16,16 @@ */ package org.apache.lucene.search.join; +import static or

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1686523892 ## lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java: ## @@ -28,17 +31,77 @@ */ public class TotalHitCountCollectorManager imple

Re: [I] Examine performance of individual data accessor methods of MemorySegmentIndexInput when IndexInputs are closed in other threads (deoptimizations,...) [lucene]

2024-07-22 Thread via GitHub
dsmiley commented on issue #13325: URL: https://github.com/apache/lucene/issues/13325#issuecomment-2242901661 I filed an issue in Solr: https://issues.apache.org/jira/browse/SOLR-17375

Re: [PR] Align doc value skipper interval boundaries when an interval contains a constant value [lucene]

2024-07-22 Thread via GitHub
iverase merged PR #13597: URL: https://github.com/apache/lucene/pull/13597

Re: [PR] Align doc value skipper interval boundaries when an interval contains a constant value [lucene]

2024-07-22 Thread via GitHub
iverase commented on code in PR #13597: URL: https://github.com/apache/lucene/pull/13597#discussion_r1686485234 ## lucene/core/src/test/org/apache/lucene/codecs/lucene90/TestLucene90DocValuesFormatVariableSkipInterval.java: ## @@ -36,4 +49,159 @@ public void testSkipIndexInterva

Re: [PR] Align doc value skipper interval boundaries when an interval contains a constant value [lucene]

2024-07-22 Thread via GitHub
jpountz commented on code in PR #13597: URL: https://github.com/apache/lucene/pull/13597#discussion_r1686443455 ## lucene/core/src/test/org/apache/lucene/codecs/lucene90/TestLucene90DocValuesFormatVariableSkipInterval.java: ## @@ -36,4 +49,159 @@ public void testSkipIndexInterva

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1686447283 ## lucene/core/src/test/org/apache/lucene/search/TestSynonymQuery.java: ## @@ -198,7 +198,17 @@ private void doTestScores(int totalHitsThreshold) throws IOException {

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-22 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1686444817 ## lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java: ## @@ -28,17 +31,77 @@ */ public class TotalHitCountCollectorManager imple