rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2230061569

   I haven't benchmarked this; it just seems that `SDOT` is the instruction to 
optimize for, and GCC can both recognize the code shape and autovectorize to it 
without hassle. 
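
For illustration, the "code shape" in question is roughly a plain signed-byte dot product loop like the sketch below. The function name `dot8s` is hypothetical, not something from the PR, and whether a given GCC version actually emits `SDOT` for it is an assumption that depends on the flags used (something like `-O3 -march=armv8.2-a+dotprod`), so check the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scalar dot product of two int8 vectors.
   Each pair is widened to int32 before the multiply, and the products
   are accumulated in an int32. This is the pattern an autovectorizer
   may map to SDOT when the target enables the dot product extension
   (assumed flags: -O3 -march=armv8.2-a+dotprod). */
int32_t dot8s(const int8_t *a, const int8_t *b, size_t n) {
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

Compiling with `-S` and searching the output for `sdot` is a quick way to confirm what the compiler did with it.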
   
   My cheap 2021 phone lists the `asimddp` feature in /proc/cpuinfo, so dot 
product support seems widespread.
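
A minimal sketch of the same check done programmatically, assuming a Linux system where /proc/cpuinfo has a `Features` line (as on arm64). The helper names `has_cpu_flag` and `cpuinfo_has_asimddp` are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `flag` appears in `line` as a whole
   whitespace-delimited token, so "asimd" does not match "asimddp". */
int has_cpu_flag(const char *line, const char *flag) {
    size_t len = strlen(flag);
    if (len == 0) return 0;
    const char *p = line;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        char end = p[len];
        int ends = (end == '\0' || end == ' ' || end == '\t' || end == '\n');
        if (starts && ends) return 1;
        p += len;
    }
    return 0;
}

/* Hypothetical helper: scans /proc/cpuinfo for a "Features" line that
   lists asimddp. Returns 0 if the file is missing or the flag is absent. */
int cpuinfo_has_asimddp(void) {
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (!f) return 0;
    char line[4096];
    int found = 0;
    while (!found && fgets(line, sizeof line, f)) {
        if (strncmp(line, "Features", 8) == 0 && has_cpu_flag(line, "asimddp")) {
            found = 1;
        }
    }
    fclose(f);
    return found;
}
```

On arm64 Linux the kernel also reports this capability through the ELF hwcaps (`HWCAP_ASIMDDP`), which avoids text parsing; the file scan above just mirrors the manual check.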
   
   You can also use it directly via an intrinsic, with no need to combine 
separate add/multiply intrinsics: 
https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#dot-product
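
As a rough sketch of what direct use could look like, assuming an AArch64 target built with the dot product extension enabled (so `__ARM_FEATURE_DOTPROD` is defined). The function name `dot8s_sdot` is hypothetical; the intrinsics (`vdotq_s32`, `vld1q_s8`, `vdupq_n_s32`, `vaddvq_s32`) are from `arm_neon.h`. On other targets the block falls back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

/* Hypothetical helper: int8 dot product using the SDOT intrinsic when
   the target supports it, with a scalar loop for the tail (and for
   targets without the extension). */
int32_t dot8s_sdot(const int8_t *a, const int8_t *b, size_t n) {
    int32_t sum = 0;
    size_t i = 0;
#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
    /* Four int32 lanes; each vdotq_s32 adds four 4-way byte dot products. */
    int32x4_t acc = vdupq_n_s32(0);
    for (; i + 16 <= n; i += 16) {
        acc = vdotq_s32(acc, vld1q_s8(a + i), vld1q_s8(b + i));
    }
    /* Horizontal add of the four lanes. */
    sum = vaddvq_s32(acc);
#endif
    /* Scalar tail, or the whole input when SDOT is unavailable. */
    for (; i < n; i++) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

This is only worth keeping if it measurably beats what the compiler produces from the plain loop.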
   
   But unless it is really faster than what GCC does with simple C, there is no 
need.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

