[ 
https://issues.apache.org/jira/browse/LUCENE-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302157#comment-17302157
 ] 

Ankur commented on LUCENE-9838:
-------------------------------

This is cool - [~rcmuir]. 

I played with this a little on my MacBook Pro (2019, *Memory*: 32 GB 2667 MHz 
DDR4; *Processor*:  2.6 GHz 6-Core Intel Core i7) after downloading [OpenJDK 
build 
16+36-2231|https://download.java.net/java/GA/jdk16/7863447f0ab643c585b9bdebf67c69db/36/GPL/openjdk-16_osx-x64_bin.tar.gz]
 and setting up a standalone [JMH benchmark|https://github.com/openjdk/jmh] 
project.

I copied the old dotProduct implementation and the new one from your patch 
into _MyBenchmark.java_ in the JMH project. Here are the results I got:
{code:java}
Benchmark                  (size)   Mode  Cnt    Score   Error   Units
MyBenchmark.dotProductOld      16  thrpt    5   90.896 ± 5.302  ops/us
MyBenchmark.dotProductNew      16  thrpt    5  100.901 ± 5.105  ops/us

MyBenchmark.dotProductOld      32  thrpt    5   53.563 ± 2.378  ops/us
MyBenchmark.dotProductNew      32  thrpt    5   97.610 ± 5.393  ops/us

MyBenchmark.dotProductOld      64  thrpt    5   29.792 ± 1.246  ops/us
MyBenchmark.dotProductNew      64  thrpt    5   73.499 ± 3.640  ops/us

MyBenchmark.dotProductOld     128  thrpt    5   16.906 ± 0.751  ops/us
MyBenchmark.dotProductNew     128  thrpt    5   65.068 ± 3.986  ops/us

MyBenchmark.dotProductOld     256  thrpt    5    8.360 ± 0.125  ops/us
MyBenchmark.dotProductNew     256  thrpt    5   42.595 ± 2.958  ops/us

MyBenchmark.dotProductOld     512  thrpt    5    4.231 ± 0.158  ops/us
MyBenchmark.dotProductNew     512  thrpt    5   26.283 ± 0.640  ops/us

MyBenchmark.dotProductOld    1024  thrpt    5    2.104 ± 0.093  ops/us
MyBenchmark.dotProductNew    1024  thrpt    5   14.389 ± 0.720  ops/us

{code}
 

These benchmarks were run after adding annotations to disable TieredCompilation 
and the vector bounds checks. For small vectors (*16 elements*) we see roughly 
*10%* improvement, and the error bars of the two runs nearly overlap there. For 
large vectors (*128 or more* elements) the improvement is *_about 4X or 
higher_*: roughly 3.8X at 128 elements, rising to about 6.8X at 1024.
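
To make the ratios easy to re-check, here is a minimal sketch that divides the new score by the old score for each size. The throughput values are copied from the JMH output above. The class name _SpeedupTable_ and the method _speedup_ are hypothetical names made up for this illustration; they are not part of the patch or of JMH.

```java
import java.util.Locale;

public class SpeedupTable {
    // Vector sizes and throughput scores (ops/us) copied from the JMH output above.
    public static final int[] SIZES = {16, 32, 64, 128, 256, 512, 1024};
    public static final double[] OLD = {90.896, 53.563, 29.792, 16.906, 8.360, 4.231, 2.104};
    public static final double[] NEW = {100.901, 97.610, 73.499, 65.068, 42.595, 26.283, 14.389};

    /** Ratio of new to old throughput; a value above 1.0 means the new code is faster. */
    public static double speedup(double oldScore, double newScore) {
        return newScore / oldScore;
    }

    public static void main(String[] args) {
        // Print one line per vector size with the throughput ratio.
        for (int i = 0; i < SIZES.length; i++) {
            System.out.printf(Locale.ROOT, "size=%4d  speedup=%.2fx%n",
                    SIZES[i], speedup(OLD[i], NEW[i]));
        }
    }
}
```

This only compares the mean scores and ignores the reported error, so the ratio for 16 elements in particular should be read loosely.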

> simd version of VectorUtil.dotProduct
> -------------------------------------
>
>                 Key: LUCENE-9838
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9838
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-9838.patch
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Followup to LUCENE-9837
> Let's explore using JDK 16 vector API to speed this up more. It might be a 
> hassle to try to MR-JAR/package up for users (adding commandline flags and 
> stuff), but it gives good performance.


