Re: [PR] Add some basic HNSW graph checks to CheckIndex [lucene]
msokolov commented on code in PR #13984:
URL: https://github.com/apache/lucene/pull/13984#discussion_r1840202966

## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:

@@ -2746,6 +2781,106 @@ public static Status.VectorValuesStatus testVectors(
     return status;
   }

+  /** Test the HNSW graph. */
+  public static Status.HnswGraphsStatus testHnswGraphs(
+      CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException {
+    if (infoStream != null) {
+      infoStream.print("test: hnsw graphs.");
+    }
+    long startNS = System.nanoTime();
+    Status.HnswGraphsStatus status = new Status.HnswGraphsStatus();
+    KnnVectorsReader vectorsReader = reader.getVectorReader();
+    FieldInfos fieldInfos = reader.getFieldInfos();
+
+    try {
+      if (fieldInfos.hasVectorValues()) {
+        for (FieldInfo fieldInfo : fieldInfos) {
+          if (fieldInfo.hasVectorValues()) {
+            if (vectorsReader instanceof PerFieldKnnVectorsFormat.FieldsReader) {
+              KnnVectorsReader fieldReader =
+                  ((PerFieldKnnVectorsFormat.FieldsReader) vectorsReader)
+                      .getFieldReader(fieldInfo.name);
+              if (fieldReader instanceof HnswGraphProvider) {
+                HnswGraph hnswGraph = ((HnswGraphProvider) fieldReader).getGraph(fieldInfo.name);
+                testHnswGraph(hnswGraph, fieldInfo.name, status);
+              }
+            }
+          }
+        }
+      }
+      StringJoiner hnswGraphResultJoiner = new StringJoiner(", ");
+      for (Map.Entry hnswGraphStatus : status.hnswGraphsStatusByField.entrySet()) {
+        hnswGraphResultJoiner.add(
+            String.format(
+                Locale.ROOT,
+                "(field name: %s, levels: %d, total nodes: %d)",
+                hnswGraphStatus.getKey(),
+                hnswGraphStatus.getValue().numLevels,
+                hnswGraphStatus.getValue().totalNumNodes));
+      }
+      msg(
+          infoStream,
+          String.format(
+              Locale.ROOT,
+              "OK [%d fields%s] [took %.3f sec]",
+              status.hnswGraphsStatusByField.size(),
+              hnswGraphResultJoiner.toString().isEmpty() ? "" : ": " + hnswGraphResultJoiner,
+              nsToSec(System.nanoTime() - startNS)));
+    } catch (Throwable e) {
+      if (failFast) {
+        throw IOUtils.rethrowAlways(e);
+      }
+      msg(infoStream, "ERROR: " + e);
+      status.error = e;
+      if (infoStream != null) {
+        e.printStackTrace(infoStream);
+      }
+    }
+
+    return status;
+  }
+
+  private static void testHnswGraph(
+      HnswGraph hnswGraph, String fieldName, Status.HnswGraphsStatus status)
+      throws IOException, CheckIndexException {
+    if (hnswGraph != null) {
+      status.hnswGraphsStatusByField.put(fieldName, new Status.HnswGraphStatus());
+      final int numLevels = hnswGraph.numLevels();
+      // Perform tests on each level of the HNSW graph
+      for (int level = numLevels - 1; level >= 0; level--) {
+        HnswGraph.NodesIterator nodesIterator = hnswGraph.getNodesOnLevel(level);
+        while (nodesIterator.hasNext()) {
+          int node = nodesIterator.nextInt();
+          hnswGraph.seek(level, node);
+          int nbr, lastNeighbor = -1, firstNeighbor = -1;
+          while ((nbr = hnswGraph.nextNeighbor()) != NO_MORE_DOCS) {
+            if (firstNeighbor == -1) {

Review Comment:
   perhaps at some point we could also check for uniqueness of the neighbors, and also check that the neighbors are in range [0, numGraphNodes], and finally on levels > 0, we would want to assert that the neighbors are on this level. But this can all be done separately; maybe we could add a comment here?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add new Directory implementation for AWS S3 [lucene]
jpountz commented on PR #13949:
URL: https://github.com/apache/lucene/pull/13949#issuecomment-2472787711

Since we have an S3 Directory implemented already, let's run a comparison with the fuse mount approach?
Re: [PR] Allow easier verification of the Panama Vectorization provider with newer Java versions [lucene]
uschindler commented on PR #13986:
URL: https://github.com/apache/lucene/pull/13986#issuecomment-2473792705

> > > Personally I would prefer a less if/else/default handling using Optional like done in the previous sysprops.
> >
> > I'll make that change before merging.
>
> Well... I tried, and it was awful! The reason being the need to parseInt the value and catch the exception, which, while possible, just led to more convoluted code. Easy to refactor later, if at all.

I would have let the code bail out if the sysprop is empty or not a valid integer, so no exception is caught at all. Anyway, it's good as is.
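For reference, the "bail out instead of catching the exception" approach discussed above can be sketched with only the JDK. This is an illustrative sketch, not the PR's actual code; the class name, property name, and the digit-limit regex (which keeps `parseInt` from overflowing) are assumptions:

```java
import java.util.Optional;

public class SyspropParse {
  /**
   * Parses an integer system property into an Optional without catching
   * NumberFormatException: a missing, blank, or non-numeric value yields empty.
   * The regex caps the value at 9 digits so Integer.parseInt cannot overflow.
   */
  static Optional<Integer> intSysprop(String name) {
    String v = System.getProperty(name);
    if (v == null || v.isBlank() || !v.strip().matches("-?\\d{1,9}")) {
      return Optional.empty(); // bail out instead of catching the exception
    }
    return Optional.of(Integer.parseInt(v.strip()));
  }

  public static void main(String[] args) {
    System.setProperty("demo.maxVersion", "23");
    System.out.println(intSysprop("demo.maxVersion").orElse(-1)); // 23
    System.setProperty("demo.maxVersion", "not-a-number");
    System.out.println(intSysprop("demo.maxVersion").orElse(-1)); // -1
  }
}
```

The trade-off is exactly the one debated in the thread: the regex check avoids exception-driven control flow at the cost of duplicating part of what `parseInt` already validates.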
Re: [PR] DocValuesSkipper implementation in IndexSortSorted [lucene]
gsmiller commented on code in PR #13886:
URL: https://github.com/apache/lucene/pull/13886#discussion_r1840714602

## lucene/core/src/java/org/apache/lucene/search/IndexSortSortedNumericDocValuesRangeQuery.java:

@@ -397,106 +413,80 @@ private boolean matchAll(PointValues points, byte[] queryLowerPoint, byte[] quer
   }

   private IteratorAndCount getDocIdSetIteratorOrNullFromBkd(
-      LeafReaderContext context, DocIdSetIterator delegate) throws IOException {
-    Sort indexSort = context.reader().getMetaData().sort();
+      LeafReader reader, NumericDocValues numericDocValues, DocValuesSkipper skipper)
+      throws IOException {
+    if (skipper.docCount() != reader.maxDoc()) {
+      return null;
+    }
+    final Sort indexSort = reader.getMetaData().sort();
     if (indexSort == null
         || indexSort.getSort().length == 0
         || indexSort.getSort()[0].getField().equals(field) == false) {
       return null;
     }
+    final int minDocID;
+    final int maxDocID;
     final boolean reverse = indexSort.getSort()[0].getReverse();
-
-    PointValues points = context.reader().getPointValues(field);

Review Comment:
   @iverase hmm, I thought I'd responded a few days back but looks like it didn't save for some reason (probably user error). Anyway, I think your take on this is very reasonable, but it would be nice to measure if possible to quantify this in some way. But yeah, I'm fine with either order as a starting point.
[I] Strange javac "automatic module" warning for `benchmark-jmh` [lucene]
mikemccand opened a new issue, #13992:
URL: https://github.com/apache/lucene/issues/13992

### Description

I suddenly noticed this warning when running top-level `./gradlew jar` in Lucene `main`:

```
/s1/l/trunk/lucene/benchmark-jmh/src/java/module-info.java:20: warning: requires directive for an automatic module
  requires jmh.core;
           ^
1 warning
```

I think it is related to Java modules but I'm not sure how to fix/suppress it :)
Re: [PR] Add some basic HNSW graph checks to CheckIndex [lucene]
benchaplin commented on PR #13984:
URL: https://github.com/apache/lucene/pull/13984#issuecomment-2474165424

Gotcha, thanks for the comments @msokolov!
Re: [PR] Improve checksum calculations [lucene]
jfboeuf commented on code in PR #13989:
URL: https://github.com/apache/lucene/pull/13989#discussion_r1840804892

## lucene/core/src/java/org/apache/lucene/store/BufferedChecksum.java:

@@ -60,6 +64,37 @@ public void update(byte[] b, int off, int len) {
     }
   }

+  void updateShort(short val) {
+    if (upto + Short.BYTES > buffer.length) flush();
+    BitUtil.VH_LE_SHORT.set(buffer, upto, val);
+    upto += Short.BYTES;
+  }
+
+  void updateInt(int val) {
+    if (upto + Integer.BYTES > buffer.length) flush();
+    BitUtil.VH_LE_INT.set(buffer, upto, val);
+    upto += Integer.BYTES;
+  }
+
+  void updateLong(long val) {
+    if (upto + Long.BYTES > buffer.length) flush();
+    BitUtil.VH_LE_LONG.set(buffer, upto, val);
+    upto += Long.BYTES;
+  }
+
+  void updateLongs(long[] vals, int offset, int len) {
+    flush();

Review Comment:
   I pushed the modification to limit the flushing. Not only is the buffer no longer systematically flushed on entering `updateLongs(long[], int, int)`, but the last chunk is also not flushed, which is beneficial in the likely case that there is remaining data (the codec footer) after the `long[]` that could fit in the buffer.
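The buffering idea behind these methods (pack multi-byte values little-endian into a small buffer, flush to the wrapped checksum in chunks) can be sketched with only the JDK. This is a toy stand-in, not Lucene's `BufferedChecksum`; the class name, buffer size, and the choice of `CRC32` are assumptions. It verifies that chunked buffered updates produce the same checksum as feeding the bytes directly:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class BufferedChecksumSketch {
  // Little-endian view over a byte[], analogous to Lucene's BitUtil.VH_LE_LONG.
  private static final VarHandle LONG_LE =
      MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

  private final Checksum in = new CRC32();
  private final byte[] buffer = new byte[64];
  private int upto;

  void updateLong(long val) {
    // Flush only when the next value would not fit, never eagerly.
    if (upto + Long.BYTES > buffer.length) flush();
    LONG_LE.set(buffer, upto, val);
    upto += Long.BYTES;
  }

  void flush() {
    if (upto > 0) in.update(buffer, 0, upto);
    upto = 0;
  }

  long getValue() {
    flush();
    return in.getValue();
  }

  public static void main(String[] args) {
    long[] vals = {1L, -2L, 1234567890123L};
    BufferedChecksumSketch buffered = new BufferedChecksumSketch();
    for (long v : vals) buffered.updateLong(v);

    // Reference: write the same little-endian bytes in one direct update.
    CRC32 direct = new CRC32();
    byte[] bytes = new byte[vals.length * Long.BYTES];
    for (int i = 0; i < vals.length; i++) {
      LONG_LE.set(bytes, i * Long.BYTES, vals[i]);
    }
    direct.update(bytes, 0, bytes.length);

    System.out.println(buffered.getValue() == direct.getValue()); // true
  }
}
```

The point of the review comment carries over directly: deferring `flush()` lets trailing small writes (like a codec footer) share the buffer with the tail of a bulk write.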
Re: [PR] Introduces IndexInput#updateReadAdvice to change the readadvice while [lucene]
shatejas commented on PR #13985:
URL: https://github.com/apache/lucene/pull/13985#issuecomment-2474147590

> I'm curious how much this actually helps, and I know that you said that benchmark results would be posted.

@ChrisHegarty Preliminary results showed an approximately 40-minute (~13%) reduction in total force-merge time for the 10M dataset. You should be able to find details here: https://github.com/opensearch-project/k-NN/issues/2134#issuecomment-2420793541. Though the setup does not use Lucene HNSW, it does use Lucene99FlatVectorsReader.

> What is unclear is how much the sequential advise here helps over something like a load (similar to MemorySegment::load), that touches every page.

I did try preload, which does `MemorySegment::load`, and there was an improvement in force-merge time. The advantage I see with updateAdvice is being able to delay the loading of vectors until needed.
Re: [PR] Add some basic HNSW graph checks to CheckIndex [lucene]
msokolov commented on PR #13984:
URL: https://github.com/apache/lucene/pull/13984#issuecomment-2474129035

> Doesn't (neighbors are in order) and (neighbors are not repeated) verify uniqueness?

duh, yes :)

> Also, why wouldn't we assert neighbors are on this level for level 0?

yes, we should; it's just that for level 0 this is the same as node in [0, size), whereas other levels are sparse: they have the same max = size - 1, but not every node is included
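The extra checks discussed in this thread (neighbors in range, strictly increasing order, and on-level membership for sparse levels) can be expressed compactly. A minimal stdlib-only sketch, not CheckIndex code; the helper name and the `int[]`/`Set` inputs are assumptions for illustration:

```java
import java.util.Set;

public class HnswNeighborChecks {
  /**
   * Validates one node's neighbor list: every neighbor must be in [0, graphSize),
   * the list must be strictly increasing (which implies both sortedness and
   * uniqueness), and on levels > 0 each neighbor must exist on that level.
   */
  static void checkNeighbors(int[] neighbors, int graphSize, int level, Set<Integer> nodesOnLevel) {
    int last = -1;
    for (int nbr : neighbors) {
      if (nbr < 0 || nbr >= graphSize) {
        throw new IllegalStateException(
            "neighbor " + nbr + " out of range [0, " + graphSize + ")");
      }
      // Strictly increasing covers both "in order" and "not repeated".
      if (nbr <= last) {
        throw new IllegalStateException(
            "neighbors out of order or repeated: " + nbr + " after " + last);
      }
      // Level 0 is dense, so the range check above already covers membership;
      // higher levels are sparse and need the explicit on-level check.
      if (level > 0 && nodesOnLevel.contains(nbr) == false) {
        throw new IllegalStateException("neighbor " + nbr + " is not on level " + level);
      }
      last = nbr;
    }
  }

  public static void main(String[] args) {
    checkNeighbors(new int[] {1, 3, 7}, 10, 1, Set.of(1, 3, 5, 7)); // passes
    try {
      checkNeighbors(new int[] {3, 3}, 10, 0, Set.of());
    } catch (IllegalStateException e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```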
Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]
msfroh commented on code in PR #13987:
URL: https://github.com/apache/lucene/pull/13987#discussion_r1840841444

## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java:

@@ -157,4 +157,40 @@ public void testGetScores() throws Exception {
     ir.close();
     directory.close();
   }
+
+  private static class CountingScorable extends FilterScorable {
+    int count = 0;
+
+    public CountingScorable(Scorable in) {
+      super(in);
+    }
+
+    @Override
+    public float score() throws IOException {
+      count++;
+      return in.score();
+    }
+  }
+
+  public void testRepeatedCollectReusesScore() throws Exception {
+    Scorer s = new SimpleScorer();
+    CountingScorable countingScorable = new CountingScorable(s);
+    ScoreCachingCollector scc = new ScoreCachingCollector(scores.length * 2);
+    LeafCollector lc = scc.getLeafCollector(null);
+    lc.setScorer(countingScorable);
+
+    // We need to iterate on the scorer so that its doc() advances.
+    int doc;
+    while ((doc = s.iterator().nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
+      lc.collect(doc);
+      lc.collect(doc);
+    }
+
+    for (int i = 0; i < scores.length; i++) {
+      assertEquals(scores[i], scc.mscores[i * 2], 0f);
+      assertEquals(scores[i], scc.mscores[i * 2 + 1], 0f);
+    }
+
+    assertEquals(scores.length, countingScorable.count);

Review Comment:
   This is similar to the other test method (`testGetScores`), but we assert that each score is collected twice (since we collect each document twice), while the `score()` method is only called once per document.
Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]
jpountz commented on code in PR #13987:
URL: https://github.com/apache/lucene/pull/13987#discussion_r1840904596

## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java:
(same hunk as quoted above; the comment targets these lines)

    while ((doc = s.iterator().nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      lc.collect(doc);
      lc.collect(doc);

Review Comment:
   It's illegal to call collect() multiple times on the same doc; this shouldn't be necessary, as `collect()` already calls `Scorable#score()` multiple times?
Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]
jpountz commented on code in PR #13990:
URL: https://github.com/apache/lucene/pull/13990#discussion_r1840915954

## lucene/join/src/java/org/apache/lucene/search/join/DiversifyingChildrenByteKnnVectorQuery.java:

@@ -154,7 +154,14 @@ protected TopDocs approximateSearch(

   @Override
   public String toString(String field) {
-    return getClass().getSimpleName() + ":" + this.field + "[" + query[0] + ",...][" + k + "]";

Review Comment:
   It's not obvious to me; `parentsFilter` is more a property of the index than of the query. You could have multiple levels of parents, which would make this property relevant to print, but at the same time I don't think they have proper toString() implementations. Let's leave it out for now.
Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]
jpountz commented on PR #13990:
URL: https://github.com/apache/lucene/pull/13990#issuecomment-2474335025

> Could you update the byte knn query & DiversifyingChildern* knn queries as well?

Unrelated: I noticed that the `DiversifyingChildren*` queries don't use the filter to evaluate the query; is this a bug?
Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]
mikemccand commented on code in PR #13987:
URL: https://github.com/apache/lucene/pull/13987#discussion_r1840911855

## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java:
(same hunk as quoted above; this comment targets these lines)

    while ((doc = s.iterator().nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      lc.collect(doc);
      lc.collect(doc);

Review Comment:
   Hmm, why are we collecting the same `doc` twice? (I thought this is smelly -- we fixed the other two tests to stop doing that?). Oh I see, is it so `score()` happens more than once while on the same doc? Actually, since `ScoreCachingCollector`'s `LeafCollector` is calling `.score()` three times for each `.collect()`, you should be able to do only one `lc.collect(doc)` here? Does this new test case fail if you revert the bug fix?

## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java:
(same hunk; this comment targets these lines)

    for (int i = 0; i < scores.length; i++) {
      assertEquals(scores[i], scc.mscores[i * 2], 0f);
      assertEquals(scores[i], scc.mscores[i * 2 + 1], 0f);
    }

    assertEquals(scores.length, countingScorable.count);

Review Comment:
   Ahh I see -- you want to assert that indeed `score()` was called (up on top) multiple times per hit, but then down below in the "true" scorer, it was dedup'd to just once per hit. So that's why you `.collect(doc)` twice, OK ...

## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java:
(same hunk; this comment targets this line)

    int count = 0;

Review Comment:
   We don't need the `= 0` since it's Java's default already?

## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java:
(same hunk; this comment targets this line)

    ScoreCachingCollector scc = new ScoreCachingCollector(scores.length * 2);

Review Comment:
   Maybe add a comment about why we need `* 2` here?
Re: [PR] Multireader Support in Searcher Manager [lucene]
jpountz commented on PR #13976:
URL: https://github.com/apache/lucene/pull/13976#issuecomment-2472800472

To add to @vigyasharma, I have been wondering if we should remove `SearcherManager` and encourage users to use `ReaderManager`. `IndexSearcher` is cheap to create, and there are some reasons why it may be interesting to create a different `IndexSearcher` per query already, e.g. to tune parallelism based on current load, or to configure a timeout on the `IndexSearcher`. Maybe `MultiReader` falls in the same bucket. Presumably, the `MultiReader` is made from multiple `DirectoryReader`s, so maybe the application code should not create a `SearcherManager` that works with a `MultiReader`, but instead manage two (or more, one per Directory) `ReaderManager`s and dynamically create a `MultiReader`, and then an `IndexSearcher` on top of this `MultiReader`, on every search?
[PR] lucene-monitor: make abstract DocumentBatch public [lucene]
cpoerschke opened a new pull request, #13993:
URL: https://github.com/apache/lucene/pull/13993

### Description

The static `DocumentBatch.of` methods are already public; if the class itself were public too, that would allow applications (e.g. see @kotman12's https://github.com/apache/solr/pull/2382) to refer to the class, e.g. in a visitor.
Re: [PR] Allow easier verification of the Panama Vectorization provider with newer Java versions [lucene]
ChrisHegarty commented on PR #13986:
URL: https://github.com/apache/lucene/pull/13986#issuecomment-2473764740

> > Personally I would prefer a less if/else/default handling using Optional like done in the previous sysprops.
>
> I'll make that change before merging.

Well... I tried, and it was awful! The reason being the need to parseInt the value and catch the exception, which, while possible, just led to more convoluted code. Easy to refactor later, if at all.
Re: [PR] Add some basic HNSW graph checks to CheckIndex [lucene]
benchaplin commented on code in PR #13984:
URL: https://github.com/apache/lucene/pull/13984#discussion_r1840492668

## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
(same hunk as quoted above; the comment targets these lines)

    int nbr, lastNeighbor = -1, firstNeighbor = -1;
    while ((nbr = hnswGraph.nextNeighbor()) != NO_MORE_DOCS) {
      if (firstNeighbor == -1) {

Review Comment:
   Sounds good, I'll add a TODO (and add those checks in a follow-up PR). A couple of questions first:
   - Doesn't (neighbors are in order) and (neighbors are not repeated) verify uniqueness?
   - Also, why wouldn't we assert neighbors are on this level for level 0?
Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]
msfroh commented on code in PR #13987: URL: https://github.com/apache/lucene/pull/13987#discussion_r1841119813 ## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java: ##

```
@@ -157,4 +157,40 @@ public void testGetScores() throws Exception {
     ir.close();
     directory.close();
   }
+
+  private static class CountingScorable extends FilterScorable {
+    int count = 0;
+
+    public CountingScorable(Scorable in) {
+      super(in);
+    }
+
+    @Override
+    public float score() throws IOException {
+      count++;
+      return in.score();
+    }
+  }
+
+  public void testRepeatedCollectReusesScore() throws Exception {
+    Scorer s = new SimpleScorer();
+    CountingScorable countingScorable = new CountingScorable(s);
+    ScoreCachingCollector scc = new ScoreCachingCollector(scores.length * 2);
+    LeafCollector lc = scc.getLeafCollector(null);
+    lc.setScorer(countingScorable);
+
+    // We need to iterate on the scorer so that its doc() advances.
+    int doc;
+    while ((doc = s.iterator().nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
+      lc.collect(doc);
+      lc.collect(doc);
```

Review Comment: The new test does fail if I revert the fix. As I mentioned in https://github.com/apache/lucene/pull/13987#issuecomment-2472044035, the condition for this being an actual bug is probably not really valid (i.e. you need to call `collect()` on the `ScoreCachingWrappingLeafCollector` multiple times for the same document). The caching *does* work if the wrapped `LeafCollector` calls `score()` multiple times, which can happen in `MultiCollector` (for example). The layering is roughly:

```
ScoreCachingWrappingLeafCollector
  wraps Some other LeafCollector, whose scorer is
    ScoreCachingWrappingScorer, which wraps
      The real Scorable
```

The multiple calls to `score()` from `Some other LeafCollector` work just fine. Again, it's the call to `collect()` on `ScoreCachingWrappingLeafCollector` that was invalidating the cache. Those other two tests broke because my "fix" causes the caching to work (but they didn't expect it).
Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]
benwtrent commented on code in PR #13990: URL: https://github.com/apache/lucene/pull/13990#discussion_r1841121411 ## lucene/join/src/java/org/apache/lucene/search/join/DiversifyingChildrenByteKnnVectorQuery.java: ##

```
@@ -154,7 +154,14 @@ protected TopDocs approximateSearch(
   @Override
   public String toString(String field) {
-    return getClass().getSimpleName() + ":" + this.field + "[" + query[0] + ",...][" + k + "]";
```

Review Comment: I agree with @jpountz; I am less convinced that, as it stands right now, printing the parentsFilter is useful.
Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]
benwtrent commented on PR #13990: URL: https://github.com/apache/lucene/pull/13990#issuecomment-2474732022

> Unrelated: I noticed that the DiversifyingChildren* queries don't use the filter to evaluate the query, is this a bug?

I am unsure what you mean by this? The filter is utilized when building the accepted docs set in the `super()` classes.
Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]
msfroh commented on code in PR #13987: URL: https://github.com/apache/lucene/pull/13987#discussion_r1841125795 ## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java: ## (quoting the same `CountingScorable`/`testRepeatedCollectReusesScore` hunk as the previous comment)

Review Comment: Yeah -- that's why I'd hesitate to really call this a "bug fix". I'm pretty sure you meant to write `lastDoc = currDoc` in https://github.com/apache/lucene/pull/12407, but functionally `currDoc = lastDoc` works fine as long as you don't call `collect()` multiple times on the same doc. I only noticed this because I was [copy/pasting the logic](https://github.com/opensearch-project/OpenSearch/pull/16366/files#diff-45fac6161bed1775163b916fba9ea296fbe9a8adf931a48593672c5c558a114b) to adapt it for `LeafBucketCollector` in OpenSearch, and IntelliJ warned me that `lastDoc` could be `final`.
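The one-character bug under discussion (`currDoc = lastDoc` instead of `lastDoc = currDoc`) can be distilled into a toy cache. This is a hedged sketch with hypothetical names, not the actual `ScoreCachingWrappingScorer` code; the score computation is a stand-in for delegating to the real `Scorable`.

```java
public class CachingSketch {
  /** Toy score cache illustrating the lastDoc bookkeeping discussed above. */
  static class ScoreCache {
    private int lastDoc = -1;
    private float cachedScore;
    int computeCount = 0;

    float score(int currDoc) {
      if (currDoc != lastDoc) {
        cachedScore = currDoc * 0.5f; // stand-in for the real Scorable.score()
        computeCount++;
        // The fix: remember which doc was cached. With the inverted
        // assignment `currDoc = lastDoc`, lastDoc stays -1, so a second
        // call for the same doc recomputes instead of reusing the cache.
        lastDoc = currDoc;
      }
      return cachedScore;
    }
  }
}
```

Calling `score(5)` twice should hit the delegate once; advancing to a new doc recomputes.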
Re: [PR] Parse escaped brackets and spaces in range queries [lucene]
github-actions[bot] commented on PR #13887: URL: https://github.com/apache/lucene/pull/13887#issuecomment-2475092304 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution!
Re: [PR] Add a Multi-Vector Similarity Function [lucene]
vigyasharma commented on code in PR #13991: URL: https://github.com/apache/lucene/pull/13991#discussion_r1841544154 ## lucene/core/src/java/org/apache/lucene/index/MultiVectorSimilarityFunction.java: ##

```
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.lucene.index;

import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.util.ArrayUtil;

/**
 * Computes similarity between two multi-vectors.
 *
 * <p>A multi-vector is a collection of multiple vectors that represent a single document or query.
 * MultiVectorSimilarityFunction is used to determine nearest neighbors during indexing and search
 * on multi-vectors.
 */
public class MultiVectorSimilarityFunction {

  /** Aggregation function to combine similarity across multiple vector values */
  public enum Aggregation {

    /**
     * Sum_Max Similarity between two multi-vectors. Computes the sum of maximum similarity found
     * for each vector in the first multi-vector against all vectors in the second multi-vector.
     */
    SUM_MAX {
      @Override
      public float aggregate(
          float[] outer,
          float[] inner,
          VectorSimilarityFunction vectorSimilarityFunction,
          int dimension) {
        if (outer.length % dimension != 0 || inner.length % dimension != 0) {
          throw new IllegalArgumentException("Multi vectors do not match provided dimension value");
        }

        // TODO: can we avoid making vector copies?
        List<float[]> outerList = new ArrayList<>();
        List<float[]> innerList = new ArrayList<>();
        for (int i = 0; i < outer.length; i += dimension) {
          outerList.add(ArrayUtil.copyOfSubArray(outer, i, i + dimension));
        }
        for (int i = 0; i < inner.length; i += dimension) {
          innerList.add(ArrayUtil.copyOfSubArray(inner, i, i + dimension));
        }

        float result = 0f;
        for (float[] o : outerList) {
          float maxSim = Float.MIN_VALUE;
          for (float[] i : innerList) {
            maxSim = Float.max(maxSim, vectorSimilarityFunction.compare(o, i));
```

Review Comment: Right, but it needs to go all the way down to `VectorUtilSupport`, which I think should be a PR of its own.
Re: [PR] Add a Multi-Vector Similarity Function [lucene]
dungba88 commented on code in PR #13991: URL: https://github.com/apache/lucene/pull/13991#discussion_r1841484667 ## lucene/core/src/java/org/apache/lucene/index/MultiVectorSimilarityFunction.java: ## (quoting the same `SUM_MAX` hunk shown in the previous comment)

Review Comment: If we add another `compare` method with start and end indexes for both inner and outer, I guess we won't need to copy the array?
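The copy-free variant suggested in this thread could look roughly like the sketch below: a similarity overload that takes offsets into the flat multi-vector arrays, so SUM_MAX never materializes sub-vector copies. This is illustrative only; `dot` stands in for the offset-based `compare` that would have to reach down to `VectorUtilSupport`, and none of these names are Lucene API.

```java
public class SumMaxSketch {
  /** Offset-based dot product, the kind of overload suggested above (illustrative). */
  static float dot(float[] a, int aOff, float[] b, int bOff, int dim) {
    float sum = 0f;
    for (int i = 0; i < dim; i++) {
      sum += a[aOff + i] * b[bOff + i];
    }
    return sum;
  }

  /**
   * Copy-free SUM_MAX: for each sub-vector of outer, take the maximum
   * similarity over all sub-vectors of inner, then sum those maxima.
   */
  static float sumMax(float[] outer, float[] inner, int dim) {
    if (outer.length % dim != 0 || inner.length % dim != 0) {
      throw new IllegalArgumentException("Multi vectors do not match provided dimension value");
    }
    float result = 0f;
    for (int o = 0; o < outer.length; o += dim) {
      float maxSim = Float.NEGATIVE_INFINITY;
      for (int i = 0; i < inner.length; i += dim) {
        maxSim = Math.max(maxSim, dot(outer, o, inner, i, dim));
      }
      result += maxSim;
    }
    return result;
  }
}
```

For outer `{(1,0), (0,1)}` against inner `{(1,0)}` with dot-product similarity, the per-sub-vector maxima are 1 and 0, so the aggregate is 1.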
Re: [PR] [KNN] Add comment and remove duplicate code [lucene]
dungba88 commented on PR #13594: URL: https://github.com/apache/lucene/pull/13594#issuecomment-2475301572 Thanks Kaival for reviewing and approving. Could one of the Lucene committers help review and merge this PR if it looks good?
Re: [PR] Break the loop when segment is fully deleted by prior delTerms or delQueries [lucene]
vsop-479 commented on PR #13398: URL: https://github.com/apache/lucene/pull/13398#issuecomment-2475304269 Hi @mikemccand, I think I addressed all your comments. Please take a look when you get a chance.
Re: [PR] [DRAFT] Change vector input from IndexInput to RandomAccessInput [lucene]
dungba88 commented on code in PR #13981: URL: https://github.com/apache/lucene/pull/13981#discussion_r1841492890 ## lucene/core/src/java/org/apache/lucene/store/RandomAccessInput.java: ##

```
@@ -77,4 +85,6 @@ default void readBytes(long pos, byte[] bytes, int offset, int length) throws IO
    * @see IndexInput#prefetch
    */
   default void prefetch(long offset, long length) throws IOException {}
+
+  Object clone();
```

Review Comment: This baffles me a lot! I can't let the return type be `RandomAccessInput` because it will conflict with `DataInput.clone()`, which returns `DataInput`. Adding the clone method seems a bit weird to me, but I'm not sure if there is a better way to go around it.
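The return-type clash described here is a general Java constraint: a class cannot inherit two `clone()` declarations with unrelated covariant return types, which is presumably why the interface falls back to `Object clone()`. A minimal sketch with hypothetical types (not the Lucene classes): `Input` plays the role of `DataInput` with a covariant `clone()`, and the `Object`-returning interface method is satisfied by it.

```java
public class CloneSketch {
  abstract static class Input implements Cloneable {
    @Override
    public Input clone() { // covariant override of Object.clone()
      try {
        return (Input) super.clone();
      } catch (CloneNotSupportedException e) {
        throw new AssertionError(e);
      }
    }
  }

  interface RandomAccess {
    // Declaring `RandomAccess clone()` here would clash for any Input
    // subclass, because Input.clone() already returns the unrelated type
    // Input. Declaring `Object clone()` works: Input's covariant override
    // satisfies it.
    Object clone();
  }

  static class SliceInput extends Input implements RandomAccess {}
}
```

Callers holding the interface get an `Object` back and must cast, which matches the slight awkwardness the comment describes.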
Re: [PR] [DRAFT] Change vector input from IndexInput to RandomAccessInput [lucene]
dungba88 commented on code in PR #13981: URL: https://github.com/apache/lucene/pull/13981#discussion_r1841491566 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentByteVectorScorer.java: ##

```
@@ -40,8 +41,14 @@ abstract sealed class Lucene99MemorySegmentByteVectorScorer
    * returned.
    */
   public static Optional<Lucene99MemorySegmentByteVectorScorer> create(
-      VectorSimilarityFunction type, IndexInput input, KnnVectorValues values, byte[] queryVector) {
+      VectorSimilarityFunction type,
+      RandomAccessInput slice,
+      KnnVectorValues values,
+      byte[] queryVector) {
     assert values instanceof ByteVectorValues;
+    if (!(slice instanceof IndexInput input)) {
```

Review Comment: It's used in the next block though:

```
input = FilterIndexInput.unwrapOnlyTest(input);
```
Re: [PR] Use Arrays.mismatch in FSTCompiler#add. [lucene]
vsop-479 commented on PR #13924: URL: https://github.com/apache/lucene/pull/13924#issuecomment-2475534092 Thanks for your review, @dungba88.
Re: [I] Move vector search from IndexInput to RandomAccessInput [lucene]
dungba88 commented on issue #13938: URL: https://github.com/apache/lucene/issues/13938#issuecomment-2475621958 @jpountz it's only a draft (I need to add tests), but can you give some feedback on https://github.com/apache/lucene/pull/13981? I'm not sure if I have fully captured the intention of this change.