Re: [PR] New JMH benchmark method - vdot8s that implement int8 dotProduct in C… [lucene]

via GitHub Thu, 18 Jul 2024 06:30:35 -0700


ChrisHegarty commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236530280


   > > With the updated compile flags, the performance of auto-vectorized code 
is slightly better than explicitly vectorized code (see results). Interesting 
thing to note is that both C-based implementations have `10X` better throughput 
compared to the Panama API based Java implementation (unless I am not doing an 
apples-to-apples comparison).
   > 
   > This seems correct to me.
   
   Same. The performance benefit is large here. We do something similar in 
Elasticsearch, with implementations for both AArch64 and x64. I remember @rmuir 
making a similar comment before about auto-vectorization, which is correct. I 
avoided it at the time given the toolchain that we were using, but it's a good 
option which I'll reevaluate.
   
   
https://github.com/elastic/elasticsearch/blob/main/libs/simdvec/native/src/vec/c/aarch64/vec.c
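
For illustration only, here is a minimal sketch of the kind of plain C loop that a compiler may auto-vectorize for a signed int8 dot product. The function name `dot8s_scalar` is hypothetical; this is not the code from the PR or from the Elasticsearch file linked above. Whether the compiler actually emits `SDOT` (or any SIMD code at all) depends on the compiler, its version, and the flags used, for example `-O3` together with a target option such as GCC's `-march=armv8.2-a+dotprod`. The flags used in the PR are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scalar signed int8 dot product (hypothetical name, illustration only).
 * Each product is widened to 32 bits before accumulation, so a single
 * product (at most 128 * 128 in magnitude) cannot overflow. The sum can
 * still overflow int32_t for very long vectors.
 * `restrict` tells the compiler the inputs do not alias, which can make
 * the loop easier to vectorize.
 */
int32_t dot8s_scalar(const int8_t *restrict a,
                     const int8_t *restrict b,
                     size_t n) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

A loop of this shape leaves instruction selection to the compiler, which is the trade-off discussed above: less control than explicit intrinsics, but no per-architecture source code.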
   
   > The Java Vector API is not performant for the integer case. HotSpot 
doesn't much understand ARM and definitely doesn't have instructions such as 
`SDOT` in its vocabulary.
   
   ++ Yes. This is not a good state of affairs. I'll make sure to get an issue 
filed with OpenJDK for it.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


