msokolov commented on PR #14874: URL: https://github.com/apache/lucene/pull/14874#issuecomment-3053634936
My weird results (w/768d cohere vectors). In sum, it looks to me as if we have some noise immediately after reindexing. Either this is a measurement artifact in `luceneutil` or it is a startup transient related to page caching / preloading / madvise / something. But after things settle down this is clearly a big improvement and we should merge, while also trying to understand this measurement problem -- which is pre-existing and should not block this change.

# run 1

## baseline

The first line does a reindex -- and then latency and netCPU *increase*? This has nothing to do with this PR.

```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
 0.987       17.861  17.850        0.999  500000   100      50       32        250         no    25663
 0.987       25.272  25.256        0.999  500000   100      50       32        250         no    25663
 0.987       24.355  24.340        0.999  500000   100      50       32        250         no    25663
```

## candidate

```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
 0.987       13.261  13.251        0.999  500000   100      50       32        250         no    25663
 0.987       13.366  13.356        0.999  500000   100      50       32        250         no    25663
```

# run 2

## candidate (reindex each time)

```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
 0.983       57.972  57.959        1.000  500000   100      50       32        250         no    25648
 0.980       55.219  55.205        1.000  500000   100      50       32        250         no    25802
 0.986       93.766  93.752        1.000  500000   100      50       32        250         no    25630
```

### no reindex

```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
 0.986       13.170  13.159        0.999  500000   100      50       32        250         no    25630
 0.986       13.411  13.400        0.999  500000   100      50       32        250         no    25630
 0.986       13.535  13.523        0.999  500000   100      50       32        250         no    25630
```

## baseline (no reindex)

```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited
 0.986       26.048  26.033        0.999  500000   100      50       32        250         no    25630
 0.986       25.954  25.938        0.999  500000   100      50       32        250         no    25630
```
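For a rough sense of the size of the improvement, here is a quick sanity check on the run 2 no-reindex rows only, since those look like the settled numbers. This is just a sketch: the `speedup` helper is a made-up name for illustration, not part of `luceneutil`, and with two or three samples per side it says nothing about variance.

```python
from statistics import mean

# Latencies (ms) copied from the run 2 no-reindex tables in this comment.
candidate_ms = [13.170, 13.411, 13.535]
baseline_ms = [26.048, 25.954]


def speedup(baseline, candidate):
    """Ratio of mean baseline latency to mean candidate latency (hypothetical helper)."""
    return mean(baseline) / mean(candidate)


ratio = speedup(baseline_ms, candidate_ms)
print(f"baseline mean:  {mean(baseline_ms):.3f} ms")
print(f"candidate mean: {mean(candidate_ms):.3f} ms")
print(f"speedup:        {ratio:.2f}x")
```

On these rows the candidate comes out a bit under 2x faster than baseline at the same recall (0.986) and visited count (25630).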