uschindler commented on PR #12667:
URL: https://github.com/apache/lucene/pull/12667#issuecomment-1761940947

   I ran all benchmarks in module mode (second line of assemble output) on my 256-bit AVX2 laptop:
   
   Processor:   Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz, 1992 MHz, 4 core(s), 8 logical processor(s)
   
   ```
   C:\Users\Uwe Schindler\Projects\lucene\lucene>"C:\Program Files\Java\jdk-21\bin\java" --module-path lucene\benchmark-jmh\build\benchmarks --module org.apache.lucene.benchmark.jmh
   [...]
   # Benchmark: org.apache.lucene.benchmark.jmh.VectorUtilBenchmark.floatSquareScalar
   # Parameters: (size = 1024)
   
   # Run progress: 90,63% complete, ETA 00:03:42
   # Fork: 1 of 1
   # Warmup Iteration   1: Oct. 13, 2023 6:31:40 PM org.apache.lucene.internal.vectorization.VectorizationProvider lookup
   WARNING: Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.
   [...]
   # Benchmark: org.apache.lucene.benchmark.jmh.VectorUtilBenchmark.floatSquareVector
   # Parameters: (size = 1024)
   
   # Run progress: 98,96% complete, ETA 00:00:24
   # Fork: 1 of 1
   WARNING: Using incubator modules: jdk.incubator.vector
   # Warmup Iteration   1: Oct. 13, 2023 6:34:58 PM org.apache.lucene.internal.vectorization.PanamaVectorizationProvider <init>
   INFO: Java vector incubator API enabled; uses preferredBitSize=256
   [...]
   
   Benchmark                                   (size)   Mode  Cnt    Score    Error   Units
   VectorUtilBenchmark.binaryCosineScalar           1  thrpt    5  111,963 ± 55,067  ops/us
   VectorUtilBenchmark.binaryCosineScalar         128  thrpt    5    6,607 ±  0,530  ops/us
   VectorUtilBenchmark.binaryCosineScalar         207  thrpt    5    4,297 ±  0,268  ops/us
   VectorUtilBenchmark.binaryCosineScalar         256  thrpt    5    3,591 ±  0,098  ops/us
   VectorUtilBenchmark.binaryCosineScalar         300  thrpt    5    2,831 ±  0,761  ops/us
   VectorUtilBenchmark.binaryCosineScalar         512  thrpt    5    1,749 ±  0,261  ops/us
   VectorUtilBenchmark.binaryCosineScalar         702  thrpt    5    1,272 ±  0,439  ops/us
   VectorUtilBenchmark.binaryCosineScalar        1024  thrpt    5    0,846 ±  0,212  ops/us
   VectorUtilBenchmark.binaryCosineVector           1  thrpt    5  116,594 ± 19,379  ops/us
   VectorUtilBenchmark.binaryCosineVector         128  thrpt    5   23,696 ±  0,971  ops/us
   VectorUtilBenchmark.binaryCosineVector         207  thrpt    5   15,562 ±  1,261  ops/us
   VectorUtilBenchmark.binaryCosineVector         256  thrpt    5   15,580 ±  0,818  ops/us
   VectorUtilBenchmark.binaryCosineVector         300  thrpt    5   10,589 ±  9,402  ops/us
   VectorUtilBenchmark.binaryCosineVector         512  thrpt    5    8,864 ±  1,360  ops/us
   VectorUtilBenchmark.binaryCosineVector         702  thrpt    5    5,632 ±  0,152  ops/us
   VectorUtilBenchmark.binaryCosineVector        1024  thrpt    5    4,033 ±  0,966  ops/us
   VectorUtilBenchmark.binaryDotProductScalar       1  thrpt    5  276,530 ± 38,515  ops/us
   VectorUtilBenchmark.binaryDotProductScalar     128  thrpt    5   13,190 ±  0,303  ops/us
   VectorUtilBenchmark.binaryDotProductScalar     207  thrpt    5    8,590 ±  0,332  ops/us
   VectorUtilBenchmark.binaryDotProductScalar     256  thrpt    5    6,982 ±  0,256  ops/us
   VectorUtilBenchmark.binaryDotProductScalar     300  thrpt    5    6,007 ±  0,243  ops/us
   VectorUtilBenchmark.binaryDotProductScalar     512  thrpt    5    3,463 ±  0,433  ops/us
   VectorUtilBenchmark.binaryDotProductScalar     702  thrpt    5    2,602 ±  0,063  ops/us
   VectorUtilBenchmark.binaryDotProductScalar    1024  thrpt    5    1,755 ±  0,073  ops/us
   VectorUtilBenchmark.binaryDotProductVector       1  thrpt    5  154,801 ± 45,755  ops/us
   VectorUtilBenchmark.binaryDotProductVector     128  thrpt    5   50,450 ± 10,559  ops/us
   VectorUtilBenchmark.binaryDotProductVector     207  thrpt    5   30,656 ±  1,151  ops/us
   VectorUtilBenchmark.binaryDotProductVector     256  thrpt    5   30,256 ±  1,618  ops/us
   VectorUtilBenchmark.binaryDotProductVector     300  thrpt    5   23,890 ±  6,478  ops/us
   VectorUtilBenchmark.binaryDotProductVector     512  thrpt    5   16,696 ±  0,571  ops/us
   VectorUtilBenchmark.binaryDotProductVector     702  thrpt    5   11,718 ±  0,265  ops/us
   VectorUtilBenchmark.binaryDotProductVector    1024  thrpt    5    8,760 ±  0,194  ops/us
   VectorUtilBenchmark.binarySquareScalar           1  thrpt    5  251,177 ± 83,185  ops/us
   VectorUtilBenchmark.binarySquareScalar         128  thrpt    5   11,902 ±  1,279  ops/us
   VectorUtilBenchmark.binarySquareScalar         207  thrpt    5    7,244 ±  2,344  ops/us
   VectorUtilBenchmark.binarySquareScalar         256  thrpt    5    5,975 ±  1,489  ops/us
   VectorUtilBenchmark.binarySquareScalar         300  thrpt    5    5,089 ±  0,309  ops/us
   VectorUtilBenchmark.binarySquareScalar         512  thrpt    5    3,139 ±  0,205  ops/us
   VectorUtilBenchmark.binarySquareScalar         702  thrpt    5    2,325 ±  0,200  ops/us
   VectorUtilBenchmark.binarySquareScalar        1024  thrpt    5    1,586 ±  0,032  ops/us
   VectorUtilBenchmark.binarySquareVector           1  thrpt    5  179,243 ± 12,767  ops/us
   VectorUtilBenchmark.binarySquareVector         128  thrpt    5   41,748 ±  1,302  ops/us
   VectorUtilBenchmark.binarySquareVector         207  thrpt    5   25,865 ±  0,939  ops/us
   VectorUtilBenchmark.binarySquareVector         256  thrpt    5   25,354 ±  1,070  ops/us
   VectorUtilBenchmark.binarySquareVector         300  thrpt    5   20,371 ±  0,653  ops/us
   VectorUtilBenchmark.binarySquareVector         512  thrpt    5   14,283 ±  0,631  ops/us
   VectorUtilBenchmark.binarySquareVector         702  thrpt    5    9,980 ±  0,344  ops/us
   VectorUtilBenchmark.binarySquareVector        1024  thrpt    5    6,684 ±  3,338  ops/us
   VectorUtilBenchmark.floatCosineScalar            1  thrpt    5  190,660 ±  5,937  ops/us
   VectorUtilBenchmark.floatCosineScalar          128  thrpt    5    7,029 ±  0,202  ops/us
   VectorUtilBenchmark.floatCosineScalar          207  thrpt    5    4,424 ±  0,116  ops/us
   VectorUtilBenchmark.floatCosineScalar          256  thrpt    5    3,473 ±  1,401  ops/us
   VectorUtilBenchmark.floatCosineScalar          300  thrpt    5    3,144 ±  0,048  ops/us
   VectorUtilBenchmark.floatCosineScalar          512  thrpt    5    1,653 ±  0,030  ops/us
   VectorUtilBenchmark.floatCosineScalar          702  thrpt    5    1,210 ±  0,037  ops/us
   VectorUtilBenchmark.floatCosineScalar         1024  thrpt    5    0,795 ±  0,049  ops/us
   VectorUtilBenchmark.floatCosineVector            1  thrpt    5  132,462 ±  7,447  ops/us
   VectorUtilBenchmark.floatCosineVector          128  thrpt    5   26,174 ±  0,621  ops/us
   VectorUtilBenchmark.floatCosineVector          207  thrpt    5   15,948 ±  2,942  ops/us
   VectorUtilBenchmark.floatCosineVector          256  thrpt    5   17,445 ±  2,915  ops/us
   VectorUtilBenchmark.floatCosineVector          300  thrpt    5   14,293 ±  1,994  ops/us
   VectorUtilBenchmark.floatCosineVector          512  thrpt    5   11,711 ±  0,523  ops/us
   VectorUtilBenchmark.floatCosineVector          702  thrpt    5    8,415 ±  0,228  ops/us
   VectorUtilBenchmark.floatCosineVector         1024  thrpt    5    6,859 ±  0,244  ops/us
   VectorUtilBenchmark.floatDotProductScalar        1  thrpt    5  211,648 ±  9,064  ops/us
   VectorUtilBenchmark.floatDotProductScalar      128  thrpt    5   16,845 ±  4,619  ops/us
   VectorUtilBenchmark.floatDotProductScalar      207  thrpt    5   11,864 ±  0,254  ops/us
   VectorUtilBenchmark.floatDotProductScalar      256  thrpt    5    9,471 ±  0,368  ops/us
   VectorUtilBenchmark.floatDotProductScalar      300  thrpt    5    8,323 ±  0,262  ops/us
   VectorUtilBenchmark.floatDotProductScalar      512  thrpt    5    4,850 ±  0,147  ops/us
   VectorUtilBenchmark.floatDotProductScalar      702  thrpt    5    3,622 ±  0,182  ops/us
   VectorUtilBenchmark.floatDotProductScalar     1024  thrpt    5    2,412 ±  0,188  ops/us
   VectorUtilBenchmark.floatDotProductVector        1  thrpt    5  186,701 ±  9,767  ops/us
   VectorUtilBenchmark.floatDotProductVector      128  thrpt    5   53,626 ± 12,555  ops/us
   VectorUtilBenchmark.floatDotProductVector      207  thrpt    5   31,024 ±  0,692  ops/us
   VectorUtilBenchmark.floatDotProductVector      256  thrpt    5   38,246 ±  0,693  ops/us
   VectorUtilBenchmark.floatDotProductVector      300  thrpt    5   26,674 ±  0,843  ops/us
   VectorUtilBenchmark.floatDotProductVector      512  thrpt    5   22,742 ±  0,608  ops/us
   VectorUtilBenchmark.floatDotProductVector      702  thrpt    5   15,759 ±  0,346  ops/us
   VectorUtilBenchmark.floatDotProductVector     1024  thrpt    5   13,512 ±  0,549  ops/us
   VectorUtilBenchmark.floatSquareScalar            1  thrpt    5  306,048 ±  3,798  ops/us
   VectorUtilBenchmark.floatSquareScalar          128  thrpt    5   13,377 ±  0,446  ops/us
   VectorUtilBenchmark.floatSquareScalar          207  thrpt    5    7,939 ±  0,447  ops/us
   VectorUtilBenchmark.floatSquareScalar          256  thrpt    5    6,760 ±  0,396  ops/us
   VectorUtilBenchmark.floatSquareScalar          300  thrpt    5    5,415 ±  0,220  ops/us
   VectorUtilBenchmark.floatSquareScalar          512  thrpt    5    3,261 ±  0,956  ops/us
   VectorUtilBenchmark.floatSquareScalar          702  thrpt    5    2,394 ±  0,072  ops/us
   VectorUtilBenchmark.floatSquareScalar         1024  thrpt    5    1,686 ±  0,043  ops/us
   VectorUtilBenchmark.floatSquareVector            1  thrpt    5  182,811 ± 27,177  ops/us
   VectorUtilBenchmark.floatSquareVector          128  thrpt    5   53,693 ±  2,607  ops/us
   VectorUtilBenchmark.floatSquareVector          207  thrpt    5   26,490 ±  0,620  ops/us
   VectorUtilBenchmark.floatSquareVector          256  thrpt    5   29,780 ±  2,791  ops/us
   VectorUtilBenchmark.floatSquareVector          300  thrpt    5   22,809 ±  0,417  ops/us
   VectorUtilBenchmark.floatSquareVector          512  thrpt    5   18,750 ±  1,414  ops/us
   VectorUtilBenchmark.floatSquareVector          702  thrpt    5   13,374 ±  0,513  ops/us
   VectorUtilBenchmark.floatSquareVector         1024  thrpt    5   11,284 ±  0,261  ops/us
   ```
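
To read the scalar/vector pairs side by side, here is a minimal sketch that divides the `*Vector` score by the matching `*Scalar` score for size = 1024. The class name `SpeedupSummary` and the helper `speedup` are hypothetical, made up for this illustration and not part of Lucene or JMH. The scores are copied from the table above, with the German decimal commas written as decimal points (`0,846` is `0.846`). Error margins are ignored, so treat the ratios as rough indications only.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene or JMH.
class SpeedupSummary {

    /** Ratio of vectorized to scalar throughput; both inputs are in ops/us. */
    static double speedup(double scalarOpsPerUs, double vectorOpsPerUs) {
        return vectorOpsPerUs / scalarOpsPerUs;
    }

    public static void main(String[] args) {
        // {scalar score, vector score} for size = 1024, copied from the table above.
        Map<String, double[]> scores = new LinkedHashMap<>();
        scores.put("binaryCosine", new double[] {0.846, 4.033});
        scores.put("binaryDotProduct", new double[] {1.755, 8.760});
        scores.put("binarySquare", new double[] {1.586, 6.684});
        scores.put("floatCosine", new double[] {0.795, 6.859});
        scores.put("floatDotProduct", new double[] {2.412, 13.512});
        scores.put("floatSquare", new double[] {1.686, 11.284});

        for (Map.Entry<String, double[]> e : scores.entrySet()) {
            double[] s = e.getValue();
            // Locale.ROOT keeps the output on decimal points regardless of system locale.
            System.out.printf(Locale.ROOT, "%-18s %.1fx%n", e.getKey(), speedup(s[0], s[1]));
        }
    }
}
```

With these inputs every ratio at size = 1024 falls between roughly 4x and 9x in favor of the vectorized variants. At size = 1 the picture is different: most `*Vector` scores in the table are below their `*Scalar` counterparts.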


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

