HUSTERGS commented on PR #14896: URL: https://github.com/apache/lucene/pull/14896#issuecomment-3040499654
> I get the following results on an Apple M3 (ARM). The vectorized implementation is 10x slower than the scalar impls.

I got similar results on my Mac (also ARM). Based on the results shown above, I'm wondering if we can add a flag, just like `ENABLE_FIND_NEXT_GEQ_VECTOR_OPTO`, so we can fall back to the default 'slower' implementation. It seems the boundary is also 256 bits (so a double vector has 4 lanes), the same as for `findNextGEQ`.
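To make the suggestion concrete, here is a minimal sketch of such a flag. It is not Lucene's actual code: the class `VectorFlagSketch`, the constant `MIN_VECTOR_BITS`, and all method names are hypothetical, the vector width is passed in as a plain `int` rather than queried from the Vector API, and the summation is only a stand-in for whatever operation the PR vectorizes. The only values taken from the discussion are the 256-bit boundary and the resulting 4 double lanes.

```java
// Hypothetical sketch: none of these names exist in Lucene.
final class VectorFlagSketch {
  // Assumed threshold, taken from the 256-bit boundary mentioned in the comment.
  static final int MIN_VECTOR_BITS = 256;

  // Number of 64-bit double lanes that fit in a vector of the given width.
  static int doubleLanes(int vectorBitSize) {
    return vectorBitSize / Double.SIZE;
  }

  // The flag: use the wide path only when the vector is at least 256 bits.
  static boolean enableVectorOpto(int vectorBitSize) {
    return vectorBitSize >= MIN_VECTOR_BITS;
  }

  // Dispatches between the stand-in wide path and the scalar fallback.
  static double sum(double[] values, int vectorBitSize) {
    if (enableVectorOpto(vectorBitSize)) {
      return sumLanes(values, doubleLanes(vectorBitSize));
    }
    return sumScalar(values);
  }

  // Default scalar implementation.
  static double sumScalar(double[] values) {
    double total = 0;
    for (double v : values) {
      total += v;
    }
    return total;
  }

  // Stand-in for a vectorized loop: one accumulator per lane, reduced at the end.
  static double sumLanes(double[] values, int lanes) {
    double[] acc = new double[lanes];
    for (int i = 0; i < values.length; i++) {
      acc[i % lanes] += values[i];
    }
    double total = 0;
    for (double a : acc) {
      total += a;
    }
    return total;
  }
}
```

With this shape, a 128-bit width yields 2 double lanes and takes the scalar path, while a 256-bit width yields 4 lanes and takes the wide path.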