jpountz commented on PR #13958: URL: https://github.com/apache/lucene/pull/13958#issuecomment-2440178064
I did more digging: vectorization actually worked on my Mac! So my best guess is that I got a ~20% slowdown because I only have 2 lanes on it, so the `trueCount != LONG_SPECIES.length()` condition is much less likely to be true on my Mac than on my Linux desktop, which has 4 lanes, and the vectorized scan hurts more than it helps compared to the naive linear scan. (See https://github.com/apache/lucene/pull/13692#issuecomment-2324658146 for some stats about how far `advance()` needs to go within a block.) For now I have disabled the optimization on machines that have fewer than 4 lanes. I'll try to run benchmarks on more CPUs to confirm that it's not only helpful on my desktop CPU (AMD Ryzen 9 3900X).
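
To illustrate the reasoning in this comment, here is a minimal scalar sketch of a lane-wise scan. It is not the code from the PR: it emulates the vector comparison with a plain inner loop so that it runs without the incubating Vector API. The class and method names (`LaneScanSketch`, `linearScan`, `chunkedScan`, `advance`) and the `lanes` parameter are hypothetical. Only the `trueCount != lanes` exit test and the 4-lane threshold come from the comment.

```java
final class LaneScanSketch {
  // Threshold from the comment: the chunked scan is only used with 4 or more lanes.
  static final int MIN_LANES = 4;

  /** Naive linear scan: index of the first value >= target, or docs.length if none. */
  static int linearScan(long[] docs, long target) {
    for (int i = 0; i < docs.length; i++) {
      if (docs[i] >= target) {
        return i;
      }
    }
    return docs.length;
  }

  /**
   * Scalar emulation of a lane-wise scan over a sorted array. Each outer
   * iteration compares `lanes` values against the target, as one vector
   * compare would (an assumption about the shape of the real loop).
   */
  static int chunkedScan(long[] docs, long target, int lanes) {
    int i = 0;
    for (; i + lanes <= docs.length; i += lanes) {
      int trueCount = 0; // number of lanes whose value is < target
      for (int j = 0; j < lanes; j++) {
        if (docs[i + j] < target) {
          trueCount++;
        }
      }
      // Because the array is sorted, a partially true chunk contains the answer.
      if (trueCount != lanes) {
        return i + trueCount;
      }
    }
    // Scalar tail for the values that do not fill a whole chunk.
    for (; i < docs.length; i++) {
      if (docs[i] >= target) {
        return i;
      }
    }
    return docs.length;
  }

  /** Uses the chunked scan only when enough lanes are available. */
  static int advance(long[] docs, long target, int lanes) {
    return lanes >= MIN_LANES ? chunkedScan(docs, target, lanes) : linearScan(docs, target);
  }

  public static void main(String[] args) {
    long[] docs = {1, 3, 5, 7, 9, 11, 13, 15};
    System.out.println(advance(docs, 9, 4)); // chunked path
    System.out.println(advance(docs, 9, 2)); // falls back to the linear scan
  }
}
```

In this sketch, finding target 9 in the 8-value array takes two chunk comparisons with 4 lanes but three with 2 lanes. This is the effect described above: with fewer lanes, each chunk covers fewer values, so the exit condition is less likely to be met on any given iteration.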