jpountz commented on PR #14203:
URL: https://github.com/apache/lucene/pull/14203#issuecomment-2652267868

   > #current bpv=24 gets vectorized on the shift loop, but not for the remainder loop.
   
   This is an interesting observation. I wonder if a small refactoring could 
help it get auto-vectorized? E.g. what if we applied the `0xFF` mask to 
`scratch` in the shift loop rather than in the remainder loop? Or if we split 
the remainder loop into 3 loops, one for each group of 8 bits that contributes 
to the value?
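
To make the two ideas concrete, here is a rough sketch. It is not the code from the PR: the class name, the method names, the parameter `k` and the packing layout (`4 * k` values of 24 bits stored in `3 * k` ints, with the first `3 * k` values in the upper 24 bits of each int and the last `k` values assembled from the low bytes) are all assumptions made for illustration. The three methods produce the same output; whether C2 actually auto-vectorizes either variant would need to be checked with a benchmark or by inspecting the generated assembly.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of a bpv=24 decoder; layout and names are assumptions, not the PR's code. */
final class Decode24Sketch {

  /** Baseline: a shift loop, then a remainder loop that masks and combines three bytes. */
  static void decodeBaseline(int[] scratch, int k, int[] out) {
    for (int i = 0; i < 3 * k; i++) {
      out[i] = scratch[i] >>> 8; // shift loop: upper 24 bits
    }
    for (int i = 0; i < k; i++) { // remainder loop: low bytes of three ints
      out[3 * k + i] =
          ((scratch[i] & 0xFF) << 16)
              | ((scratch[k + i] & 0xFF) << 8)
              | (scratch[2 * k + i] & 0xFF);
    }
  }

  /** Idea 1: apply the 0xFF mask in the shift loop. Note that this overwrites scratch. */
  static void decodeMaskInShiftLoop(int[] scratch, int k, int[] out) {
    for (int i = 0; i < 3 * k; i++) {
      int v = scratch[i];
      out[i] = v >>> 8;
      scratch[i] = v & 0xFF; // keep only the low byte for the remainder loop
    }
    for (int i = 0; i < k; i++) {
      out[3 * k + i] = (scratch[i] << 16) | (scratch[k + i] << 8) | scratch[2 * k + i];
    }
  }

  /** Idea 2: split the remainder loop into three loops, one per contributed byte. */
  static void decodeSplitRemainder(int[] scratch, int k, int[] out) {
    for (int i = 0; i < 3 * k; i++) {
      out[i] = scratch[i] >>> 8;
    }
    int base = 3 * k;
    for (int i = 0; i < k; i++) {
      out[base + i] = (scratch[i] & 0xFF) << 16;
    }
    for (int i = 0; i < k; i++) {
      out[base + i] |= (scratch[k + i] & 0xFF) << 8;
    }
    for (int i = 0; i < k; i++) {
      out[base + i] |= scratch[2 * k + i] & 0xFF;
    }
  }

  public static void main(String[] args) {
    int k = 32; // assumed block size: 128 values in 96 ints
    int[] scratch = new int[3 * k];
    Random random = new Random(0);
    for (int i = 0; i < scratch.length; i++) {
      scratch[i] = random.nextInt();
    }
    int[] a = new int[4 * k];
    int[] b = new int[4 * k];
    int[] c = new int[4 * k];
    decodeBaseline(scratch.clone(), k, a);
    decodeMaskInShiftLoop(scratch.clone(), k, b); // clone: this variant mutates its input
    decodeSplitRemainder(scratch.clone(), k, c);
    if (!Arrays.equals(a, b) || !Arrays.equals(a, c)) {
      throw new AssertionError("variants disagree");
    }
  }
}
```

The first variant trades an extra store per element in the shift loop for a remainder loop with no masking; the second keeps each loop body to a single load, mask, shift and store, which may be an easier shape for the auto-vectorizer to recognize.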
   
   Sorry for pushing, but if we could get auto-vectorization to do the right 
thing, then this would automatically benefit all users, not only those who 
enable the vector module.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

