ChrisHegarty commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752723993
@rmuir Building on your idea, and focusing again on the x64 case, I get a bit of a boost by just converting directly to int (rather than the short dance). On my Rocket Lake, AVX 512, I get the following results:

```
Benchmark                                   (size)   Mode  Cnt   Score   Error   Units
BinaryDotProductBenchmark.dotProductNew       1024  thrpt    5  20.675 ± 0.051  ops/us
BinaryDotProductBenchmark.dotProductNewNew    1024  thrpt    5  22.705 ± 0.015  ops/us
BinaryDotProductBenchmark.dotProductOld       1024  thrpt    5   3.174 ± 0.113  ops/us
```

From ...

```
  @Benchmark
  public int dotProductNewNew() {
    ..
    if (vectorSize >= 256) {
      // optimized 256/512 bit implementation, processes 8/16 bytes at a time
      // by converting from 8/16 bytes to 8/16 ints
      int upperBound = PREFERRED_BYTE_SPECIES.loopBound(a.length);
      IntVector acc = IntVector.zero(IntVector.SPECIES_PREFERRED);
      for (; i < upperBound; i += PREFERRED_BYTE_SPECIES.length()) {
        ByteVector va8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i);
        ByteVector vb8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i);
        Vector<Integer> va32 = va8.convertShape(VectorOperators.B2I, IntVector.SPECIES_PREFERRED, 0);
        Vector<Integer> vb32 = vb8.convertShape(VectorOperators.B2I, IntVector.SPECIES_PREFERRED, 0);
        Vector<Integer> prod32 = va32.mul(vb32);
        acc = acc.add(prod32);
      }
      // reduce
      res += acc.reduceLanes(VectorOperators.ADD);
    } else {
    ..
```

Trying a hand-unrolled version, unrolling 4x, I see no perf benefit: the numbers remain the same. So I just left it out, for the sake of simplicity.
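For anyone wanting to sanity-check a variant like this, here is a minimal plain-Java sketch of the same loop shape, with no Vector API involved. It is only a model, not the benchmark code: the class name `DotProductModel`, both method names, and the `lanes` parameter are made up for illustration. `lanes` stands in for the number of int lanes (8 for 256-bit, 16 for 512-bit), and `upperBound` is computed the way `loopBound` rounds down to a multiple of the lane count.

```java
final class DotProductModel {

    private DotProductModel() {}

    /** Plain scalar reference: Java promotes each byte to int before multiplying. */
    static int dotProductScalar(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        int res = 0;
        for (int i = 0; i < a.length; i++) {
            res += a[i] * b[i]; // byte * byte is at most 128 * 128, which fits easily in an int
        }
        return res;
    }

    /**
     * Scalar model of the vector loop: one int accumulator per lane,
     * a main loop over whole blocks of `lanes` bytes, a final reduction
     * across the accumulators, and a scalar tail for the leftover bytes.
     */
    static int dotProductLanes(byte[] a, byte[] b, int lanes) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        if (lanes <= 0) {
            throw new IllegalArgumentException("lanes must be positive");
        }
        int[] acc = new int[lanes];
        // Largest multiple of `lanes` that is <= a.length.
        int upperBound = a.length - (a.length % lanes);
        int i = 0;
        for (; i < upperBound; i += lanes) {
            for (int lane = 0; lane < lanes; lane++) {
                // Widen both bytes to int, multiply, accumulate per lane.
                acc[lane] += a[i + lane] * b[i + lane];
            }
        }
        // Reduce: sum the per-lane accumulators.
        int res = 0;
        for (int lane = 0; lane < lanes; lane++) {
            res += acc[lane];
        }
        // Scalar tail for the bytes the blocked loop did not cover.
        for (; i < a.length; i++) {
            res += a[i] * b[i];
        }
        return res;
    }
}
```

Calling `DotProductModel.dotProductLanes(a, b, 16)` should always agree with `DotProductModel.dotProductScalar(a, b)`, including for lengths that are not a multiple of the lane count. That makes it a cheap cross-check to run before looking at throughput numbers.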