rmuir commented on issue #12621:
URL: https://github.com/apache/lucene/issues/12621#issuecomment-1747044386

   As far as ARM goes, the fact that NEON has only 128-bit SIMD is the 
limiting factor.
   
   With 256-bit AVX, for example, we use a 64-bit vector of 8 byte values -> 
128-bit vector of 8 short values -> 256-bit vector of 8 int values.
   
   For ARM/NEON with only 128-bit, we can't do this as we don't have 256-bit 
vectors. So instead we use a 64-bit vector of 8 byte values -> 128-bit vector 
of 8 short values -> two 128-bit vectors of 4 int values each. This requires 
splitting the vector in half, but it is all we can do. 
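
   A rough scalar sketch of the 128-bit path described above, for 
illustration only. It is not the actual Lucene code and does not use the 
Vector API; arrays stand in for SIMD registers. The class name 
`WideningSketch`, the helper names, and the choice of a byte dot product as 
the operation are all assumptions made for the example.

```java
// Hypothetical illustration: arrays model SIMD registers, one element per lane.
final class WideningSketch {

    // How many lanes of the given element size fit in one register.
    static int lanes(int registerBits, int elementBits) {
        return registerBits / elementBits;
    }

    // Models one step on a 128-bit-only machine for 8 bytes starting at offset:
    // 8 bytes (64 bits) -> 8 shorts (128 bits) -> two halves of 4 ints (128 bits each).
    // The multiply is an assumed operation, used to give the widening a purpose.
    static int dotProduct8(byte[] a, byte[] b, int offset) {
        // Widen to short and multiply. |byte * byte| is at most 128 * 128 = 16384,
        // so each product fits in a short.
        short[] prod = new short[8];
        for (int i = 0; i < 8; i++) {
            prod[i] = (short) (a[offset + i] * b[offset + i]);
        }

        // 8 ints would need 256 bits, so split into a low and a high half,
        // each holding 4 ints (4 * 32 = 128 bits).
        int[] lo = new int[4];
        int[] hi = new int[4];
        for (int i = 0; i < 4; i++) {
            lo[i] = prod[i];
            hi[i] = prod[i + 4];
        }

        // Accumulate both halves in int precision.
        int sum = 0;
        for (int i = 0; i < 4; i++) {
            sum += lo[i] + hi[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("int lanes in 256 bits: " + lanes(256, 32));
        System.out.println("int lanes in 128 bits: " + lanes(128, 32));
        byte[] a = {1, 2, 3, 4, 5, 6, 7, 8};
        byte[] b = {8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println("dot product: " + dotProduct8(a, b, 0));
    }
}
```

   With 256-bit registers the `lo`/`hi` split disappears, because all 8 int 
lanes fit in a single register.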
   
   If you want it to be faster, get an ARM CPU with SVE SIMD, which can have 
bigger vectors than NEON.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

