kaivalnp commented on PR #14874: URL: https://github.com/apache/lucene/pull/14874#issuecomment-3064104650
> is the weird "search after indexing" regression specific only to this PR?

It's slightly different here -- I tried the following on `main`.

When a fresh index is created (run 1):

```
recall  latency(ms)  netCPU  avgCpuCount  nDoc    topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
0.962   2.387        2.386   1.000        100000  100   50      64       250        7 bits     11.68     8563.11       52.88           1             77.44           146.866       73.624       HNSW
0.961   2.392        2.391   1.000        100000  100   50      64       250        4 bits     11.62     8604.37       51.28           1             77.46           110.245       37.003       HNSW
```

Then a run without `-reindex` (run 2, same index as run 1):

```
recall  latency(ms)  netCPU  avgCpuCount  nDoc    topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
0.962   2.550        2.549   0.999        100000  100   50      64       250        7 bits     0.00      Infinity      0.12            1             77.44           146.866       73.624       HNSW
0.961   2.522        2.521   0.999        100000  100   50      64       250        4 bits     0.00      Infinity      0.13            1             77.46           110.245       37.003       HNSW
```

Then a run with `-reindex` (run 3, keeping index from run 1 around):

```
recall  latency(ms)  netCPU  avgCpuCount  nDoc    topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
0.961   1.821        1.820   1.000        100000  100   50      64       250        7 bits     11.41     8765.01       53.00           1             77.45           146.866       73.624       HNSW
0.962   1.851        1.850   1.000        100000  100   50      64       250        4 bits     11.57     8643.04       51.66           1             77.49           110.245       37.003       HNSW
```

And finally a run without `-reindex` (run 4, same index as run 3):

```
recall  latency(ms)  netCPU  avgCpuCount  nDoc    topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
0.961   2.525        2.524   0.999        100000  100   50      64       250        7 bits     0.00      Infinity      0.12            1             77.45           146.866       73.624       HNSW
0.962   2.517        2.516   0.999        100000  100   50      64       250        4 bits     0.00      Infinity      0.13            1             77.49           110.245       37.003       HNSW
```

This behavior of seeing a performance improvement after
re-indexing (but not in a fresh index) has been consistent for me (see [this previous result](https://github.com/apache/lucene/pull/14874#issuecomment-3033793168)).