Re: [I] Move vector search from IndexInput to RandomAccessInput [lucene]

2024-11-13 Thread via GitHub
dungba88 commented on issue #13938: URL: https://github.com/apache/lucene/issues/13938#issuecomment-2475621958 @jpountz it's only a draft (I need to add tests), but can you give some feedback on https://github.com/apache/lucene/pull/13981. I'm not sure if I have fully captured the intentio

Re: [PR] Use Arrays.mismatch in FSTCompiler#add. [lucene]

2024-11-13 Thread via GitHub
vsop-479 commented on PR #13924: URL: https://github.com/apache/lucene/pull/13924#issuecomment-2475534092 Thanks for your review, @dungba88 . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

Re: [PR] Add a Multi-Vector Similarity Function [lucene]

2024-11-13 Thread via GitHub
vigyasharma commented on code in PR #13991: URL: https://github.com/apache/lucene/pull/13991#discussion_r1841544154 ## lucene/core/src/java/org/apache/lucene/index/MultiVectorSimilarityFunction.java: ## @@ -0,0 +1,202 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] [DRAFT] Change vector input from IndexInput to RandomAccessInput [lucene]

2024-11-13 Thread via GitHub
dungba88 commented on code in PR #13981: URL: https://github.com/apache/lucene/pull/13981#discussion_r1841491566 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentByteVectorScorer.java: ## @@ -40,8 +41,14 @@ abstract sealed class Lucene99Mem

Re: [PR] [DRAFT] Change vector input from IndexInput to RandomAccessInput [lucene]

2024-11-13 Thread via GitHub
dungba88 commented on code in PR #13981: URL: https://github.com/apache/lucene/pull/13981#discussion_r1841492890 ## lucene/core/src/java/org/apache/lucene/store/RandomAccessInput.java: ## @@ -77,4 +85,6 @@ default void readBytes(long pos, byte[] bytes, int offset, int length) t

Re: [PR] Break the loop when segment is fully deleted by prior delTerms or delQueries [lucene]

2024-11-13 Thread via GitHub
vsop-479 commented on PR #13398: URL: https://github.com/apache/lucene/pull/13398#issuecomment-2475304269 Hi @mikemccand , I think I addressed all your comments, Please take a look when you get a chance.

Re: [PR] [KNN] Add comment and remove duplicate code [lucene]

2024-11-13 Thread via GitHub
dungba88 commented on PR #13594: URL: https://github.com/apache/lucene/pull/13594#issuecomment-2475301572 Thanks Kaival for reviewing and approving. Could someone from Lucene committers help review and merge this PR if it looks good?

Re: [PR] Add a Multi-Vector Similarity Function [lucene]

2024-11-13 Thread via GitHub
dungba88 commented on code in PR #13991: URL: https://github.com/apache/lucene/pull/13991#discussion_r1841484667 ## lucene/core/src/java/org/apache/lucene/index/MultiVectorSimilarityFunction.java: ## @@ -0,0 +1,202 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Parse escaped brackets and spaces in range queries [lucene]

2024-11-13 Thread via GitHub
github-actions[bot] commented on PR #13887: URL: https://github.com/apache/lucene/pull/13887#issuecomment-2475092304 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]

2024-11-13 Thread via GitHub
msfroh commented on code in PR #13987: URL: https://github.com/apache/lucene/pull/13987#discussion_r1841125795 ## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java: ## @@ -157,4 +157,40 @@ public void testGetScores() throws Exception { ir.clo

Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]

2024-11-13 Thread via GitHub
benwtrent commented on PR #13990: URL: https://github.com/apache/lucene/pull/13990#issuecomment-2474732022 > Unrelated: I noticed that the DiversifyingChildren* queries don't use the filter to evaluate the query, is this a bug? I am unsure what you mean by this? The filter is utilized

Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]

2024-11-13 Thread via GitHub
benwtrent commented on code in PR #13990: URL: https://github.com/apache/lucene/pull/13990#discussion_r1841121411 ## lucene/join/src/java/org/apache/lucene/search/join/DiversifyingChildrenByteKnnVectorQuery.java: ## @@ -154,7 +154,14 @@ protected TopDocs approximateSearch(

Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]

2024-11-13 Thread via GitHub
msfroh commented on code in PR #13987: URL: https://github.com/apache/lucene/pull/13987#discussion_r1841119813 ## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java: ## @@ -157,4 +157,40 @@ public void testGetScores() throws Exception { ir.clo

[PR] lucene-monitor: make abstract DocumentBatch public [lucene]

2024-11-13 Thread via GitHub
cpoerschke opened a new pull request, #13993: URL: https://github.com/apache/lucene/pull/13993 ### Description The static `DocumentBatch.of` methods are already public; if the class itself were public too, that would allow applications -- e.g. see @kotman12's https://github.com/apache/s

Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]

2024-11-13 Thread via GitHub
mikemccand commented on code in PR #13987: URL: https://github.com/apache/lucene/pull/13987#discussion_r1840911855 ## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java: ## @@ -157,4 +157,40 @@ public void testGetScores() throws Exception { ir

Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]

2024-11-13 Thread via GitHub
jpountz commented on PR #13990: URL: https://github.com/apache/lucene/pull/13990#issuecomment-2474335025 > Could you update the byte knn query & DiversifyingChildren* knn queries as well? Unrelated: I noticed that the `DiversifyingChildren*` queries don't use the filter to evaluate t

Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]

2024-11-13 Thread via GitHub
jpountz commented on code in PR #13990: URL: https://github.com/apache/lucene/pull/13990#discussion_r1840915954 ## lucene/join/src/java/org/apache/lucene/search/join/DiversifyingChildrenByteKnnVectorQuery.java: ## @@ -154,7 +154,14 @@ protected TopDocs approximateSearch( @O

Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]

2024-11-13 Thread via GitHub
jpountz commented on code in PR #13987: URL: https://github.com/apache/lucene/pull/13987#discussion_r1840904596 ## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java: ## @@ -157,4 +157,40 @@ public void testGetScores() throws Exception { ir.cl

Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]

2024-11-13 Thread via GitHub
msfroh commented on code in PR #13987: URL: https://github.com/apache/lucene/pull/13987#discussion_r1840841444 ## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java: ## @@ -157,4 +157,40 @@ public void testGetScores() throws Exception { ir.clo

Re: [PR] Add some basic HNSW graph checks to CheckIndex [lucene]

2024-11-13 Thread via GitHub
msokolov commented on PR #13984: URL: https://github.com/apache/lucene/pull/13984#issuecomment-2474129035 > Doesn't (neighbors are in order) and (neighbors are not repeated) verify uniqueness? duh, yes :) > Also, why wouldn't we assert neighbors are on this leve

Re: [PR] Introduces IndexInput#updateReadAdvice to change the readadvice while [lucene]

2024-11-13 Thread via GitHub
shatejas commented on PR #13985: URL: https://github.com/apache/lucene/pull/13985#issuecomment-2474147590 > I'm curious how much this actually helps, and I know that you said that benchmark results would be posted. @ChrisHegarty Preliminary results showed approximately 40mins (~13%)

Re: [PR] Improve checksum calculations [lucene]

2024-11-13 Thread via GitHub
jfboeuf commented on code in PR #13989: URL: https://github.com/apache/lucene/pull/13989#discussion_r1840804892 ## lucene/core/src/java/org/apache/lucene/store/BufferedChecksum.java: ## @@ -60,6 +64,37 @@ public void update(byte[] b, int off, int len) { } } + void upd

Re: [PR] Add some basic HNSW graph checks to CheckIndex [lucene]

2024-11-13 Thread via GitHub
benchaplin commented on PR #13984: URL: https://github.com/apache/lucene/pull/13984#issuecomment-2474165424 Gotcha, thanks for the comments @msokolov!

[I] Strange javac "automatic module" warning for `benchmark-jmh` [lucene]

2024-11-13 Thread via GitHub
mikemccand opened a new issue, #13992: URL: https://github.com/apache/lucene/issues/13992 ### Description I suddenly noticed this warning when running top-level `./gradlew jar` in Lucene `main`: ``` /s1/l/trunk/lucene/benchmark-jmh/src/java/module-info.java:20: warning: req

Re: [PR] DocValuesSkipper implementation in IndexSortSorted [lucene]

2024-11-13 Thread via GitHub
gsmiller commented on code in PR #13886: URL: https://github.com/apache/lucene/pull/13886#discussion_r1840714602 ## lucene/core/src/java/org/apache/lucene/search/IndexSortSortedNumericDocValuesRangeQuery.java: ## @@ -397,106 +413,80 @@ private boolean matchAll(PointValues points

Re: [PR] Allow easier verification of the Panama Vectorization provider with newer Java versions [lucene]

2024-11-13 Thread via GitHub
ChrisHegarty commented on PR #13986: URL: https://github.com/apache/lucene/pull/13986#issuecomment-2473764740 > > Personally I would prefer a less if/else/default handling using Optional like done in the previous sysprops. > > I'll make that change before merging. Well... I tri

Re: [PR] Add some basic HNSW graph checks to CheckIndex [lucene]

2024-11-13 Thread via GitHub
benchaplin commented on code in PR #13984: URL: https://github.com/apache/lucene/pull/13984#discussion_r1840492668 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -2746,6 +2781,106 @@ public static Status.VectorValuesStatus testVectors( return status;

Re: [PR] Allow easier verification of the Panama Vectorization provider with newer Java versions [lucene]

2024-11-13 Thread via GitHub
uschindler commented on PR #13986: URL: https://github.com/apache/lucene/pull/13986#issuecomment-2473792705 > > > Personally I would prefer a less if/else/default handling using Optional like done in the previous sysprops. > > > > > > I'll make that change before merging. >

Re: [PR] Add some basic HNSW graph checks to CheckIndex [lucene]

2024-11-13 Thread via GitHub
msokolov commented on code in PR #13984: URL: https://github.com/apache/lucene/pull/13984#discussion_r1840202966 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -2746,6 +2781,106 @@ public static Status.VectorValuesStatus testVectors( return status;

Re: [PR] Multireader Support in Searcher Manager [lucene]

2024-11-13 Thread via GitHub
jpountz commented on PR #13976: URL: https://github.com/apache/lucene/pull/13976#issuecomment-2472800472 To add to @vigyasharma, I have been wondering if we should remove `SearcherManager` and encourage users to use `IndexReaderManager`. `IndexSearcher` is cheap to create and there are som

Re: [PR] Add new Directory implementation for AWS S3 [lucene]

2024-11-13 Thread via GitHub
jpountz commented on PR #13949: URL: https://github.com/apache/lucene/pull/13949#issuecomment-2472787711 Since we have an S3 Directory implemented already, let's run a comparison with the fuse mount approach?