On Mon, Jan 11, 2021 at 05:52:30PM +0100, Ard Biesheuvel wrote: > CRC-T10DIF is a very poor match for the crypto API: > - every user in the kernel calls it via a library wrapper around the > shash API, so all callers share a single instance of the transform > - each architecture provides at most a single optimized implementation, > based on SIMD instructions for carryless multiplication > > In other words, none of the flexibility it provides is put to good use, > and so it is better to get rid of this complexity, and expose the optimized > implementations via static calls instead. This removes a substantial chunk > of code, and also gets rid of any indirect calls on architectures that > obsess about those (x86) > > If this approach is deemed suitable, there are other places where we might > consider adopting it: CRC32 and CRC32(C). > > Patch #1 does some preparatory refactoring and removes the library wrapper > around the shash transform. > > Patch #2 introduces the actual static calls, along with the registration > routines to update the crc-t10dif implementation at runtime. > > Patch #3 updates the generic CRC-T10DIF shash driver so it distinguishes > between the optimized library code and the generic library code. > > Patches #4 to #7 update the various arch implementations to switch over to > the new method. > > Special request to Peter to take a look at patch #2, and in particular, > whether synchronize_rcu_tasks() is sufficient to ensure that a module > providing the target of a static call can be unloaded safely. > > Cc: "Martin K. Petersen" <martin.peter...@oracle.com> > Cc: Eric Biggers <ebigg...@google.com> > Cc: Peter Zijlstra <pet...@infradead.org> > > Ard Biesheuvel (7): > crypto: crc-t10dif - turn library wrapper for shash into generic > library > crypto: lib/crc-t10dif - add static call support for optimized > versions > crypto: generic/crc-t10dif - expose both arch and generic shashes > crypto: x86/crc-t10dif - convert to static call library API > crypto: arm/crc-t10dif - convert to static call library API > crypto: arm64/crc-t10dif - convert to static call API > crypto: powerpc/crc-t10dif - convert to static call API > > arch/arm/crypto/Kconfig | 2 +- > arch/arm/crypto/crct10dif-ce-glue.c | 58 ++------ > arch/arm64/crypto/Kconfig | 3 +- > arch/arm64/crypto/crct10dif-ce-glue.c | 85 ++--------- > arch/powerpc/crypto/crct10dif-vpmsum_glue.c | 51 +------ > arch/x86/crypto/crct10dif-pclmul_glue.c | 90 ++---------- > crypto/Kconfig | 7 +- > crypto/Makefile | 2 +- > crypto/crct10dif_common.c | 82 ----------- > crypto/crct10dif_generic.c | 100 +++++++++---- > include/linux/crc-t10dif.h | 21 ++- > lib/Kconfig | 2 - > lib/crc-t10dif.c | 152 +++++++++----------- > 13 files changed, 204 insertions(+), 451 deletions(-) > delete mode 100644 crypto/crct10dif_common.c
There is already a library API for two other hash functions, BLAKE2s and Poly1305, which takes advantage of architecture-specific implementations without using static calls. Also, those algorithms are likewise also exposed through the shash API, but in a different way from what this patchset proposes. Is there a reason not to do things in the same way? What are the advantages of the new approach that you're proposing? - Eric