Re: [PATCH v2 1/4] Wrap crc_t10dif function all to use crypto transform framework

2013-05-01 Thread Tim Chen
On Tue, 2013-04-30 at 11:27 +0800, Herbert Xu wrote:
> On Mon, Apr 29, 2013 at 01:40:30PM -0700, Tim Chen wrote:
> >
> > If I allocate the transform under the mod init instead, how can I make
> > sure that the fast version is already registered if I have it compiled
> > in? It is not clear to me h

[PATCH v3 2/4] Accelerated CRC T10 DIF computation with PCLMULQDQ instruction

2013-05-01 Thread Tim Chen
This is the x86_64 CRC T10 DIF transform accelerated with the PCLMULQDQ instructions. Details discussing the implementation can be found in the paper: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction" http://www.intel.com/content/dam/www/public/us/en/documents/white-paper

[PATCH v3 4/4] Simple correctness and speed test for CRCT10DIF hash

2013-05-01 Thread Tim Chen
These are simple tests to sanity-check the CRC T10 DIF hash.
The correctness of the transform can be checked with the command
    modprobe tcrypt mode=47
The speed of the transform can be evaluated with the command
    modprobe tcrypt mode=320
Set the cpu frequency to constant and turn

[PATCH v3 3/4] Glue code to cast accelerated CRCT10DIF assembly as a crypto transform

2013-05-01 Thread Tim Chen
Glue code that plugs the PCLMULQDQ accelerated CRC T10 DIF hash into the crypto framework. The config CRYPTO_CRCT10DIF_PCLMUL should be turned on to enable the feature. The crc_t10dif crypto library function will use this faster algorithm when crct10dif_pclmul module is loaded. Signed-off-by: Ti

[PATCH v3 1/4] Wrap crc_t10dif function all to use crypto transform framework

2013-05-01 Thread Tim Chen
When CRC T10 DIF is calculated using the crypto transform framework, we wrap the crc_t10dif function call to utilize it. This allows us to take advantage of any accelerated CRC T10 DIF transform that is plugged into the crypto framework. Signed-off-by: Tim Chen --- crypto/Kconfig |

[PATCH v3 0/4] Patchset to use PCLMULQDQ to accelerate CRC-T10DIF checksum computation

2013-05-01 Thread Tim Chen
Currently the CRC-T10DIF checksum is computed using a generic table lookup algorithm. By switching the checksum to PCLMULQDQ based computation, we can speed up the computation by 8x for checksumming 512 bytes and even more for larger buffer sizes. This will improve performance of SCSI drivers turni
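For readers unfamiliar with the "generic table lookup algorithm" mentioned in the cover letter, here is a minimal user-space sketch of a byte-at-a-time table-driven CRC-T10DIF, next to a bit-at-a-time reference. It is not the kernel's code: the function and table names are made up for illustration, and it assumes the standard CRC-16/T10-DIF parameters (polynomial 0x8BB7, initial value 0, no bit reflection, no final XOR).

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/T10-DIF generator polynomial (assumed: 0x8BB7, MSB-first). */
#define T10DIF_POLY 0x8BB7u

/* Hypothetical lookup table; entry i holds the CRC of the single byte i. */
static uint16_t t10dif_table[256];

/* Fill the table once before calling t10dif_table_lookup(). */
static void t10dif_init_table(void)
{
    for (unsigned int i = 0; i < 256; i++) {
        uint16_t crc = (uint16_t)(i << 8);

        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ T10DIF_POLY);
            else
                crc = (uint16_t)(crc << 1);
        }
        t10dif_table[i] = crc;
    }
}

/* Table-driven version: one table lookup per input byte. */
static uint16_t t10dif_table_lookup(const unsigned char *buf, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++)
        crc = (uint16_t)((crc << 8) ^
                         t10dif_table[((crc >> 8) ^ buf[i]) & 0xFFu]);
    return crc;
}

/* Bit-at-a-time reference: slower, but needs no table. */
static uint16_t t10dif_bitwise(const unsigned char *buf, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ T10DIF_POLY);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

Both functions should return the same value for any buffer. The table version performs one dependent lookup per byte, which is the serial cost that a carry-less-multiply (PCLMULQDQ) approach avoids by folding many bytes at a time.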

Crypto Update for 3.10

2013-05-01 Thread Herbert Xu
Hi Linus:
Here is the crypto update for 3.10:
* XTS mode optimisation for twofish/cast6/camellia/aes on x86.
* AVX2/x86_64 implementation for blowfish/twofish/serpent/camellia.
* SSSE3/AVX/AVX2 optimisations for sha256/sha512.
* Added driver for SAHARA2 crypto accelerator.
* Fix for GMAC when use