Re: [PR] New JMH benchmark method - vdot8s that implement int8 dotProduct in C… [lucene]

via GitHub Fri, 19 Jul 2024 19:23:22 -0700


rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1685214866



##########
lucene/core/build.gradle:
##########
@@ -14,12 +14,59 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
+plugins {
+  id "c"
+}
 
 apply plugin: 'java-library'
+apply plugin: 'c'
 
 description = 'Lucene core library'
+model {
+  toolChains {
+    gcc(Gcc) {
+      target("linux_aarch64"){
+        path '/usr/bin/'
+        cCompiler.executable 'gcc10-cc'
+        cCompiler.withArguments { args ->
+          args << "--shared"
+               << "-O3"
+               << "-march=armv8.2-a+dotprod"

Review Comment:
   Oh, the other likely explanation for the performance is that the integer dot 
product in Java is not AS HORRIBLE on 256-bit SVE as it is on 128-bit NEON. 
It more closely resembles how it behaves on AVX-256: two vectors of 8 x 8-bit 
integers ("64-bit vectors") are multiplied into an intermediate 8 x 16-bit 
result (128-bit vector), which is then added to an 8 x 32-bit accumulator 
(256-bit vector). Of course, it does not use the SDOT instruction, which is 
sad, as that is the CPU instruction intended precisely for this purpose.
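
   To make the lane arithmetic above concrete, here is a minimal scalar sketch 
of the widening scheme (8 x 8-bit inputs, 8 x 16-bit products, 8 x 32-bit 
accumulators). It is only a model of the data flow: it does not use the Vector 
API, it is not what the JIT emits, and the class and method names are 
hypothetical.

```java
class WideningDotSketch {
    // One step: multiply 8 int8 lanes into 16-bit products,
    // then widen each product to 32 bits and accumulate.
    static void step(byte[] a, byte[] b, int offset, int[] acc) {
        for (int lane = 0; lane < 8; lane++) {
            // int8 * int8 lies in [-16256, 16384], so it always fits in 16 bits.
            short product = (short) (a[offset + lane] * b[offset + lane]);
            acc[lane] += product;
        }
    }

    // Full dot product: 8 lanes at a time, then a scalar tail.
    static int dot(byte[] a, byte[] b) {
        int[] acc = new int[8];
        int i = 0;
        for (; i + 8 <= a.length; i += 8) {
            step(a, b, i, acc);
        }
        int sum = 0;
        for (int v : acc) {
            sum += v; // horizontal reduction of the 8 accumulators
        }
        for (; i < a.length; i++) {
            sum += a[i] * b[i]; // leftover elements
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        byte[] b = {1, 1, 1, 1, 1, 1, 1, 1, 2};
        System.out.println(dot(a, b));
    }
}
```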
   
   On 128-bit NEON there is no way with Java's Vector API to process 4 x 8-bit 
integers ("32-bit vectors") the way the SDOT instruction does: 
https://developer.arm.com/documentation/102651/a/What-are-dot-product-intructions-
   Nor is it even performant to take a 64-bit vector and process "part 0" then 
"part 1". The situation is really sad, and the performance reflects that.

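   For comparison, a scalar sketch of what SDOT does on 128-bit registers, as 
I understand the Arm description linked above: each of the four 32-bit 
accumulator lanes receives the sum of four int8 x int8 products. This only 
models the arithmetic in plain Java; the class and method names are 
hypothetical and nothing here is an actual intrinsic or Vector API call.

```java
class SdotLaneSketch {
    // Models one SDOT on 128-bit operands:
    // a and b each hold 16 int8 values, acc holds 4 int32 lanes.
    static void sdot(int[] acc, byte[] a, byte[] b) {
        for (int lane = 0; lane < 4; lane++) {
            int sum = 0;
            for (int k = 0; k < 4; k++) {
                // Four adjacent 8-bit pairs feed one 32-bit lane.
                sum += a[4 * lane + k] * b[4 * lane + k];
            }
            acc[lane] += sum;
        }
    }

    public static void main(String[] args) {
        byte[] a = new byte[16];
        byte[] b = new byte[16];
        for (int i = 0; i < 16; i++) {
            a[i] = (byte) (i + 1);
            b[i] = 1;
        }
        int[] acc = new int[4];
        sdot(acc, a, b);
        System.out.println(java.util.Arrays.toString(acc));
    }
}
```

   The contrast with the widening scheme is that no 16-bit intermediate vector 
is ever materialized: the 8-bit inputs go straight to 32-bit sums.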


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


