Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]

via GitHub Mon, 09 Oct 2023 03:11:09 -0700


ChrisHegarty commented on PR #12632:
URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752723993


   @rmuir 
   
   Building on your idea, and focusing again on the x64 case, I get a bit of a 
boost by converting the bytes directly to int (rather than widening through 
short first).
   
   On my Rocket Lake machine (AVX-512), I get the following results:
   
   ```
   Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
   BinaryDotProductBenchmark.dotProductNew       1024  thrpt    5  20.675 ± 0.051  ops/us
   BinaryDotProductBenchmark.dotProductNewNew    1024  thrpt    5  22.705 ± 0.015  ops/us
   BinaryDotProductBenchmark.dotProductOld       1024  thrpt    5   3.174 ± 0.113  ops/us
   ```
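   
   Read as ratios of the scores above (throughput, so higher is better), this 
works out roughly as follows. The rounding is mine; the inputs are only the 
three Score values in the table:
   
   ```
   \frac{\text{NewNew}}{\text{New}} = \frac{22.705}{20.675} \approx 1.10
   \qquad
   \frac{\text{NewNew}}{\text{Old}} = \frac{22.705}{3.174} \approx 7.15
   \qquad
   \frac{\text{New}}{\text{Old}} = \frac{20.675}{3.174} \approx 6.51
   ```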
   
    From ...
   ```
   @Benchmark
   public int dotProductNewNew() {
     ..
     if (vectorSize >= 256) {
       // optimized 256/512 bit implementation, processes 8/16 bytes at a time
       // by converting from 8/16 bytes to 8/16 ints
       int upperBound = PREFERRED_BYTE_SPECIES.loopBound(a.length);
       IntVector acc = IntVector.zero(IntVector.SPECIES_PREFERRED);
       for (; i < upperBound; i += PREFERRED_BYTE_SPECIES.length()) {
         ByteVector va8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i);
         ByteVector vb8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i);
         Vector<Integer> va32 =
             va8.convertShape(VectorOperators.B2I, IntVector.SPECIES_PREFERRED, 0);
         Vector<Integer> vb32 =
             vb8.convertShape(VectorOperators.B2I, IntVector.SPECIES_PREFERRED, 0);
         Vector<Integer> prod32 = va32.mul(vb32);
         acc = acc.add(prod32);
       }
       // reduce
       res += acc.reduceLanes(VectorOperators.ADD);
     } else { ..
   ```
   
   I also tried a hand-unrolled version (unrolled 4x) and saw no performance 
benefit: the numbers remain the same. So I left it out, for the sake of 
simplicity.
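   
   For sanity-checking a loop like the one above, here is a minimal scalar 
sketch of the same shape: per-lane int accumulators, a final reduction, then a 
scalar tail. It is not the benchmark code. The class name, method names, and 
the fixed `LANES = 16` are hypothetical, and it assumes the byte species has the 
same lane count as the int species (as the "8/16 bytes to 8/16 ints" comment 
suggests). It uses no Vector API calls, so it runs without the incubator module.
   
   ```java
   import java.util.Random;

   public final class ByteDotProductSketch {
     // Hypothetical lane count: 16 ints fit in one 512-bit vector.
     static final int LANES = 16;

     /** Plain scalar reference; Java promotes the byte operands to int before multiplying. */
     public static int dotProductScalar(byte[] a, byte[] b) {
       int res = 0;
       for (int i = 0; i < a.length; i++) {
         res += a[i] * b[i];
       }
       return res;
     }

     /** Scalar emulation of the lane-wise loop: accumulate per lane, reduce, then handle the tail. */
     public static int dotProductLanes(byte[] a, byte[] b) {
       // Largest multiple of LANES that is <= a.length (the role loopBound plays in the vector loop).
       int upperBound = a.length - (a.length % LANES);
       int[] acc = new int[LANES];
       int i = 0;
       for (; i < upperBound; i += LANES) {
         for (int lane = 0; lane < LANES; lane++) {
           acc[lane] += a[i + lane] * b[i + lane]; // widen to int, multiply, accumulate
         }
       }
       int res = 0;
       for (int v : acc) {
         res += v; // reduce the lanes to a single int
       }
       for (; i < a.length; i++) {
         res += a[i] * b[i]; // scalar tail for the remaining elements
       }
       return res;
     }

     public static void main(String[] args) {
       byte[] a = new byte[1027]; // not a multiple of LANES, so the tail loop runs
       byte[] b = new byte[1027];
       Random random = new Random(42);
       random.nextBytes(a);
       random.nextBytes(b);
       System.out.println(dotProductScalar(a, b) == dotProductLanes(a, b)); // prints "true"
     }
   }
   ```
   
   Both methods add up the same int products, only in a different order, so 
they agree for any input, which makes the scalar version a convenient oracle 
when testing a vectorized variant.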
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

