gf2121 commented on PR #14203: URL: https://github.com/apache/lucene/pull/14203#issuecomment-2723963174
@jpountz Hi, do you have any thoughts on how we should move forward with this optimization? A few ideas:

* We could add another step32 to the hybrid-step decoding. This makes the code even more complex, but it resolves the concern that we might decrease the BKD leaf size in the future.
* If the hybrid-step inner loop is too complex and single-step has a performance issue, should we reconsider the original Vector API approach?

BTW, I got the previous AVX512 results on an `Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz` chip. I see a similar regression when running with either `-XX:UseAVX=2` or `-XX:UseAVX=3`. I also tried some other machines with Intel chips and saw the same result, so it does not seem to be a corner case.