vigyasharma commented on PR #13525:
URL: https://github.com/apache/lucene/pull/13525#issuecomment-2439876776

   Thanks @benwtrent. I've been working on getting a multi-vector benchmark 
running to wire this end to end. Found some pesky bugs and oversights. I'm 
planning to split this feature into multiple smaller PRs. This PR was mainly to 
get input on the approach; it's too big to test and review as is. I'll share a 
plan for the split PRs soon.
   
   re: the multi-vector benchmark for the passage search use case, I've been 
stuck on a bug where I run into an `EOFException` when reading the last 
multi-vector document through `DenseOffHeapMultiVectorValues`. I could 
definitely use some help here. If you plan to take a look, you can use the code 
in this PR (I'll push my fixes) and the multi-vector benchmark code from 
[here](https://github.com/vigyasharma/luceneutil/tree/multivec).
   
   ```java
   Exception in thread "main" java.lang.RuntimeException: java.io.EOFException: read past EOF: MemorySegmentIndexInput(path="/Users/vigyas/forks/bench/util/knnIndices/cohere-wikipedia-docs-768d.vec-32-50-multiVector.index/_0_Lucene99HnswMultiVectorsFormat_0.vecmv") [slice=multi-vector-data]
           at knn.KnnGraphTester$ComputeBaselineNNFloatTask.call(KnnGraphTester.java:1115)
           at knn.KnnGraphTester.computeNN(KnnGraphTester.java:967)
           at knn.KnnGraphTester.getNN(KnnGraphTester.java:812)
           at knn.KnnGraphTester.run(KnnGraphTester.java:438)
           at knn.KnnGraphTester.runWithCleanUp(KnnGraphTester.java:177)
           at knn.KnnGraphTester.main(KnnGraphTester.java:172)
   Caused by: java.io.EOFException: read past EOF: MemorySegmentIndexInput(path="/Users/vigyas/forks/bench/util/knnIndices/cohere-wikipedia-docs-768d.vec-32-50-multiVector.index/_0_Lucene99HnswMultiVectorsFormat_0.vecmv") [slice=multi-vector-data]
           at org.apache.lucene.store.MemorySegmentIndexInput.readByte(MemorySegmentIndexInput.java:146)
           at org.apache.lucene.store.DataInput.readInt(DataInput.java:95)
           at org.apache.lucene.store.MemorySegmentIndexInput.readInt(MemorySegmentIndexInput.java:261)
           at org.apache.lucene.store.DataInput.readFloats(DataInput.java:202)
           at org.apache.lucene.store.MemorySegmentIndexInput.readFloats(MemorySegmentIndexInput.java:231)
           at org.apache.lucene.codecs.lucene99.OffHeapFloatMultiVectorValues.vectorValue(OffHeapFloatMultiVectorValues.java:111)
           at org.apache.lucene.codecs.lucene99.OffHeapFloatMultiVectorValues.vectorValue(OffHeapFloatMultiVectorValues.java:130)
           at org.apache.lucene.codecs.hnsw.DefaultFlatMultiVectorScorer$FloatMultiVectorScorer.score(DefaultFlatMultiVectorScorer.java:185)
           at org.apache.lucene.codecs.lucene99.OffHeapFloatMultiVectorValues$DenseOffHeapMultiVectorValues$1.score(OffHeapFloatMultiVectorValues.java:248)
           at org.apache.lucene.search.AbstractKnnVectorQuery.exactSearch(AbstractKnnVectorQuery.java:220)
           at knn.KnnFloatVectorBenchmarkQuery.exactSearch(KnnFloatVectorBenchmarkQuery.java:33)
           at knn.KnnFloatVectorBenchmarkQuery.runExactSearch(KnnFloatVectorBenchmarkQuery.java:50)
           at knn.KnnGraphTester$ComputeBaselineNNFloatTask.call(KnnGraphTester.java:1111)
           ... 5 more
   ```
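
   For anyone trying to narrow this down: the trace shows `readFloats` running 
past the end of the `multi-vector-data` slice, so one generic check is whether 
every document's byte range actually fits inside the data length. The sketch 
below is only an illustration under assumptions. The class 
`MultiVectorBoundsCheck`, the method `firstOutOfBounds`, and the layout (an 
offsets array with one entry per document plus a trailing end offset) are 
hypothetical and are not taken from this PR's on-disk format.

   ```java
   // Hypothetical helper, not part of Lucene or of this PR.
   final class MultiVectorBoundsCheck {

     /**
      * Assumed layout: offsets[i] is the start byte of document i, and
      * offsets[i + 1] is its end byte (so offsets has numDocs + 1 entries).
      * Returns the first document ordinal whose byte range is malformed or
      * extends past dataLength, or -1 if every range fits.
      */
     static int firstOutOfBounds(long[] offsets, long dataLength) {
       for (int ord = 0; ord + 1 < offsets.length; ord++) {
         long start = offsets[ord];
         long end = offsets[ord + 1];
         if (start < 0 || end < start || end > dataLength) {
           return ord;
         }
       }
       return -1;
     }

     public static void main(String[] args) {
       // 768 dimensions as in the index name above; float32 vectors assumed.
       long bytesPerVector = 768L * Float.BYTES;
       // Three made-up documents holding 2, 1 and 3 vectors respectively.
       long[] offsets = {0, 2 * bytesPerVector, 3 * bytesPerVector, 6 * bytesPerVector};
       // Simulate a data slice that is one vector shorter than the offsets claim.
       long dataLength = 5 * bytesPerVector;
       System.out.println("first out-of-bounds ordinal: " + firstOutOfBounds(offsets, dataLength));
     }
   }
   ```

   If a check like this flags only the last ordinal, the mismatch is more 
likely in how the final document's length or end offset is written or read 
than in the vector data itself.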


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


