[ 
https://issues.apache.org/jira/browse/LUCENE-9715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17275996#comment-17275996
 ] 

Anton Hägerstrand edited comment on LUCENE-9715 at 1/31/21, 9:58 PM:
---------------------------------------------------------------------

I am/was unfortunately not on the latest {{luceneutil}} tip - this was done on 
b3b50065940152b2b666046e1791cc4e4d5646c9, from December 23rd. I could try to 
run with trunk to see if it makes a difference, but it will take a little while. 
The script I ran was one where I tried to replicate the nightly benchmarks for 
the [https://blunders.io/lucene-bench] setup. I'll attach the Python script 
with the upload parts redacted. I did this a bit hastily, so it might not run 
100%, but it should at least give an idea of how it runs.

Edit: I should say that benchrun.py is just my attempt to boil down 
nightlyBench.py into something more usable in my environment, nothing more 
original than that.

[^benchrun.py]


> EOF error in VectorValues in Lucene nightly benchmarks
> ------------------------------------------------------
>
>                 Key: LUCENE-9715
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9715
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: trunk
>         Environment: OS: Arch Linux
> Java versions: 
> {code:java}
> openjdk version "11.0.10" 2021-01-19
> OpenJDK Runtime Environment (build 11.0.10+8)
> OpenJDK 64-Bit Server VM (build 11.0.10+8, mixed mode){code}
>  
>            Reporter: Anton Hägerstrand
>            Priority: Minor
>         Attachments: benchrun.py
>
>
> Hi! When running the nightly benchmarks, I can consistently reproduce an EOF 
> exception in the VectorValues code:
> {code:java}
> TASK LEN=150000
> Task repeat count 1000
> Tasks file /home/anton/dev/lucene-bench-home/util/tasks/wikinightly.tasks
> Num task per cat 5
> EXC: <vector:knn:<golf>[-0.07267512,...]>
> Exception in thread "Thread-2" java.lang.RuntimeException: java.lang.RuntimeException: java.io.EOFException: seek past EOF: MMapIndexInput(path="/home/anton/dev/lucene-bench-home/indices/lucene_bench_2021-01-25_f670131_medium_1thread/index/_32.vec") [slice=vector-data]
>       at perf.TaskThreads$TaskThread.run(TaskThreads.java:105)
> Caused by: java.lang.RuntimeException: java.io.EOFException: seek past EOF: MMapIndexInput(path="/home/anton/dev/lucene-bench-home/indices/lucene_bench_2021-01-25_f670131_medium_1thread/index/_32.vec") [slice=vector-data]
>       at perf.SearchTask.go(SearchTask.java:322)
>       at perf.TaskThreads$TaskThread.run(TaskThreads.java:91)
> Caused by: java.io.EOFException: seek past EOF: MMapIndexInput(path="/home/anton/dev/lucene-bench-home/indices/lucene_bench_2021-01-25_f670131_medium_1thread/index/_32.vec") [slice=vector-data]
>       at org.apache.lucene.store.ByteBufferIndexInput.seek(ByteBufferIndexInput.java:255)
>       at org.apache.lucene.store.ByteBufferIndexInput$MultiBufferImpl.seek(ByteBufferIndexInput.java:575)
>       at org.apache.lucene.codecs.lucene90.Lucene90VectorReader$OffHeapVectorValues.vectorValue(Lucene90VectorReader.java:432)
>       at org.apache.lucene.util.hnsw.HnswGraph.search(HnswGraph.java:118)
>       at org.apache.lucene.codecs.lucene90.Lucene90VectorReader$OffHeapVectorValues.search(Lucene90VectorReader.java:409)
>       at perf.KnnQuery$KnnWeight.scorer(KnnQuery.java:88)
>       at org.apache.lucene.search.Weight.bulkScorer(Weight.java:166)
>       at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:743)
>       at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
>       at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:664)
>       at org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:510)
>       at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:520)
>       at perf.SearchTask.go(SearchTask.java:263)
>       ... 1 more
> EXC: <vector:knn:<many geografia>[0.02625591,...]>{code}
> I have tried this both on eb24e95731b9f865b95b821c1745264fdc58119, which was 
> the head of master/trunk about a week ago, and on 
> f670131cbccf42fdde378ee47f9b01977ebbd147 from 
> [https://github.com/apache/lucene-solr/pull/2239].
> The command, in the Lucene benchmark setup, was:
> {code:java}
>       run: java -server -Xms8g -Xmx8g -XX:+FlightRecorder 
> -XX:StartFlightRecording=name=Default,filename=/home/anton/dev/lucene-bench-home/jfr/lucene_bench_2021-01-31_f670131_medium_1thread_search.jfr,settings=profile
>  -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -classpath 
> /home/anton/dev/lucene-bench-home/trunk/lucene/core/build/libs/lucene-core-9.0.0-SNAPSHOT.jar:/home/anton/dev/lucene-bench-home/trunk/lucene/core/build/classes/java/test:/home/anton/dev/lucene-bench-home/trunk/lucene/sandbox/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/misc/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/facet/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/analysis/common/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/analysis/icu/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/queryparser/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/grouping/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/suggest/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/highlighter/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/codecs/build/classes/java/main:/home/anton/dev/lucene-bench-home/trunk/lucene/queries/build/classes/java/main:/home/anton/.gradle/caches/modules-2/files-2.1/com.carrotsearch/hppc/0.8.2/ccb3ef933ead6b5d766fa571582ddb9b447e48c4/hppc-0.8.2.jar:/home/anton/dev/lucene-bench-home/util/lib/HdrHistogram.jar:/home/anton/dev/lucene-bench-home/util/build
>  perf.SearchPerfTest -dirImpl MMapDirectory -indexPath 
> /home/anton/dev/lucene-bench-home/indices/lucene_bench_2021-01-31_f670131_medium_1thread
>  -facets taxonomy:Date;Date -facets taxonomy:Month;Month -facets 
> taxonomy:DayOfYear;DayOfYear -facets sortedset:Month;Month -facets 
> sortedset:DayOfYear;DayOfYear -analyzer StandardAnalyzerNoStopWords 
> -taskSource /home/anton/dev/lucene-bench-home/util/tasks/wikinightly.tasks 
> -searchThreadCount 2 -taskRepeatCount 1000 -field body -tasksPerCat 5 
> -staticSeed -8035476 -seed -8826252 -similarity BM25Similarity -commit multi 
> -hiliteImpl FastVectorHighlighter -log 
> /home/anton/dev/lucene-bench-home/logs/jfrtest.jfrtest.0 -topN 10 -printHeap 
> -pk -vectorDict /home/anton/dev/lucene-bench-home/data/glove.6B.100d.txt
> {code}
> This should be the same command that the nightly benchmarks run.


