Inspired by Ard Biesheuvel's RFC patches [1] for accelerating
carry-less multiply under emulation.
This is less polished than the AES patch set:
(1) Should I split HAVE_CLMUL_ACCEL into per-width HAVE_CLMUL{N}_ACCEL?
The "_generic" and "_accel" split is different from aes-round.h
because of the difference in support for different widths, and it
means that each host accel has more boilerplate.
(2) Should I bother trying to accelerate anything other than 64x64->128?
That seems to be the one that GCM really wants anyway. I'd keep all
of the sizes implemented generically, since that centralizes the 3
target implementations.
(3) The use of Int128 isn't fantastic -- better would be a vector type,
though that has its own special problems for ppc64le (see the
endianness hoops within aes-round.h). Perhaps leave things in
env memory, like I was mostly able to do with AES?
(4) No guest test case(s).
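
For reference on (1) and (2), a minimal sketch of what a generic
64x64->128 carry-less multiply and the "_generic"/"_accel" dispatch
could look like. Everything suffixed "_sketch" is hypothetical and is
not the code in this series: the result struct stands in for Int128,
and the accel routine is a placeholder that just forwards to the
generic one.

```c
#include <stdint.h>

/* Hypothetical 128-bit result type; stands in for QEMU's Int128. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} clmul128_sketch;

/*
 * Generic carry-less (polynomial over GF(2)) multiply, 64x64->128.
 * For every set bit i of m, XOR (n << i) into a 128-bit accumulator.
 */
static clmul128_sketch clmul_64_gen_sketch(uint64_t n, uint64_t m)
{
    clmul128_sketch r = { 0, 0 };

    for (int i = 0; i < 64; i++) {
        if ((m >> i) & 1) {
            r.lo ^= n << i;
            /* Bits shifted out of the low half; skip i == 0, since a
             * shift by 64 is undefined behaviour in C. */
            if (i != 0) {
                r.hi ^= n >> (64 - i);
            }
        }
    }
    return r;
}

/* Assumption: a host header would define this to a cpuinfo test. */
#ifndef HAVE_CLMUL_ACCEL_SKETCH
#define HAVE_CLMUL_ACCEL_SKETCH 0
#endif

/* Placeholder for a host-specific routine (e.g. PCLMULQDQ or PMULL). */
static clmul128_sketch clmul_64_accel_sketch(uint64_t n, uint64_t m)
{
    return clmul_64_gen_sketch(n, m);
}

/* Dispatch: a single flag here; per-width flags would repeat this. */
static inline clmul128_sketch clmul_64_sketch(uint64_t n, uint64_t m)
{
    if (HAVE_CLMUL_ACCEL_SKETCH) {
        return clmul_64_accel_sketch(n, m);
    }
    return clmul_64_gen_sketch(n, m);
}
```

With per-width HAVE_CLMUL{N}_ACCEL flags, each width would get its own
copy of the dispatch wrapper, which is the boilerplate cost mentioned
in (1).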
r~
[1] https://patchew.org/QEMU/[email protected]/

Richard Henderson (18):
  crypto: Add generic 8-bit carry-less multiply routines
  target/arm: Use clmul_8* routines
  target/s390x: Use clmul_8* routines
  target/ppc: Use clmul_8* routines
  crypto: Add generic 16-bit carry-less multiply routines
  target/arm: Use clmul_16* routines
  target/s390x: Use clmul_16* routines
  target/ppc: Use clmul_16* routines
  crypto: Add generic 32-bit carry-less multiply routines
  target/arm: Use clmul_32* routines
  target/s390x: Use clmul_32* routines
  target/ppc: Use clmul_32* routines
  crypto: Add generic 64-bit carry-less multiply routine
  target/arm: Use clmul_64
  target/s390x: Use clmul_64
  target/ppc: Use clmul_64
  host/include/i386: Implement clmul.h
  host/include/aarch64: Implement clmul.h

 host/include/aarch64/host/cpuinfo.h      |   1 +
 host/include/aarch64/host/crypto/clmul.h | 230 +++++++++++++++++++++++
 host/include/generic/host/crypto/clmul.h |  28 +++
 host/include/i386/host/cpuinfo.h         |   1 +
 host/include/i386/host/crypto/clmul.h    | 187 ++++++++++++++++++
 host/include/x86_64/host/crypto/clmul.h  |   1 +
 include/crypto/clmul.h                   | 123 ++++++++++++
 target/arm/tcg/vec_internal.h            |  11 --
 crypto/clmul.c                           | 163 ++++++++++++++++
 target/arm/tcg/mve_helper.c              |  16 +-
 target/arm/tcg/vec_helper.c              | 112 ++---------
 target/ppc/int_helper.c                  |  63 +++----
 target/s390x/tcg/vec_int_helper.c        | 175 +++++++----------
 util/cpuinfo-aarch64.c                   |   4 +-
 util/cpuinfo-i386.c                      |   1 +
 crypto/meson.build                       |   9 +-
 16 files changed, 865 insertions(+), 260 deletions(-)
 create mode 100644 host/include/aarch64/host/crypto/clmul.h
 create mode 100644 host/include/generic/host/crypto/clmul.h
 create mode 100644 host/include/i386/host/crypto/clmul.h
 create mode 100644 host/include/x86_64/host/crypto/clmul.h
 create mode 100644 include/crypto/clmul.h
 create mode 100644 crypto/clmul.c

--
2.34.1