mccullocht commented on PR #14980: URL: https://github.com/apache/lucene/pull/14980#issuecomment-3155740647
Hey Chris, I think when I tried my hand at a similar approach with Panama I ran into similar (neutral) results on Graviton2, and really the only thing that helped there was prefetching into CPU cache. I totally believe the results are better elsewhere; it's just been a bit of a struggle to stand up the tests on other machines given the resources I have at hand.

On Tue, Aug 5, 2025 at 7:38 AM Chris Hegarty ***@***.***> wrote:

> *ChrisHegarty* left a comment (apache/lucene#14980)
> <https://github.com/apache/lucene/pull/14980#issuecomment-3155502316>
>
> Thanks @mccullocht <https://github.com/mccullocht> - not sure why you're
> not seeing improvement on Graviton2, but I'll post some more results that I
> see when testing across different platforms.
>
> Search latencies have improved by ~33%, and merge time 40-50%. I have some
> ideas about how to further improve indexing, but they can be done
> separately.
>
> 1M cohere 768d
>
> linux-x64 (m6i.2xlarge, x64, Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz, AVX 512)
>
>            recall  latency(ms)  netCPU  avgCpuCount     nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
> baseline    0.943        3.445   3.440        0.999  1000000   100      50       64        250         no   1303.77        767.00          582.92             1         3022.19      2929.688     2929.688       HNSW
> candidate   0.942        2.492   2.477        0.994  1000000   100      50       64        250         no    982.26       1018.06          325.12             1         3020.78      2929.688     2929.688       HNSW
>
> linux-amd64 (m6a.4xlarge, AMD EPYC 7R13 Processor, AVX2)
>
>            recall  latency(ms)  netCPU  avgCpuCount     nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
> baseline    0.943        3.276   3.256        0.994  1000000   100      50       64        250         no   1246.36        802.34          670.54             1         3022.99      2929.688     2929.688       HNSW
> candidate   0.944        2.003   1.989        0.993  1000000   100      50       64        250         no   1489.27        671.47          352.79             1         3023.21      2929.688     2929.688       HNSW
>
> linux-arm (c6g.8xlarge, aarch64, Neoverse-N1)
>
>            recall  latency(ms)  netCPU  avgCpuCount     nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
> baseline    0.927        4.503   4.493        0.998  1000000   100      50       32        250         no    881.79       1134.06          766.93             1         3014.53      2929.688     2929.688       HNSW
> candidate   0.927        3.015   2.999        0.995  1000000   100      50       32        250         no    873.39       1144.96          322.33             1         3013.33      2929.688     2929.688       HNSW

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org