Re: [PATCH 01/11] Added macro to check for AVX2 feature.

2013-03-22 Thread H. Peter Anvin
Just syntactic overhead. We should probably discuss it among ourselves first, though.

Tim Chen wrote:
> On Fri, 2013-03-22 at 17:21 -0700, H. Peter Anvin wrote:
> > I really, really hate these macros... Not sure they are worth the extra noise.
>
> I can do without the macro and I'll remove it.

Re: [PATCH 01/11] Added macro to check for AVX2 feature.

2013-03-22 Thread Tim Chen
On Fri, 2013-03-22 at 17:21 -0700, H. Peter Anvin wrote:
> I really, really hate these macros... Not sure they are worth the extra noise.

I can do without the macro and I'll remove it. I do wonder, though, why such a macro should be avoided?

Tim

Re: [PATCH 01/11] Added macro to check for AVX2 feature.

2013-03-22 Thread H. Peter Anvin
I really, really hate these macros... Not sure they are worth the extra noise.

Tim Chen wrote:
> Macro to facilitate checking the availability of the AVX2 feature.
>
> Signed-off-by: Tim Chen
> ---
>  arch/x86/include/asm/cpufeature.h | 1 +
>  1 file changed, 1 insertion(+)

[PATCH 04/11] Optimized sha256 x86_64 assembly routine with AVX instructions.

2013-03-22 Thread Tim Chen
Provides a SHA256 x86_64 assembly routine optimized with SSE and AVX instructions. A speedup of 60% or more has been measured over the generic implementation.

Signed-off-by: Tim Chen
---
 arch/x86/crypto/sha256-avx-asm.S | 493 +++
 1 file changed, 493 insertions(+)

[PATCH 06/11] Create module providing optimized SHA256 routines using SSSE3, AVX or AVX2 instructions.

2013-03-22 Thread Tim Chen
We added glue code and config options to create a crypto module that uses the SSE/AVX/AVX2-optimized SHA256 x86_64 assembly routines.

Signed-off-by: Tim Chen
---
 arch/x86/crypto/Makefile            | 2 +
 arch/x86/crypto/sha256_ssse3_glue.c | 269
 crypto/Kconfig

[PATCH 05/11] Optimized sha256 x86_64 routine using AVX2's RORX instructions

2013-03-22 Thread Tim Chen
Provides a SHA256 x86_64 assembly routine optimized with SSE, AVX and AVX2's RORX instructions. A speedup of 70% or more has been measured over the generic implementation.

Signed-off-by: Tim Chen
---
 arch/x86/crypto/sha256-avx2-asm.S | 769 ++
 1 file changed, 769 insertions(+)

[PATCH 03/11] Optimized sha256 x86_64 assembly routine using Supplemental SSE3 instructions.

2013-03-22 Thread Tim Chen
Provides a SHA256 x86_64 assembly routine optimized with SSSE3 instructions. A speedup of 40% or more has been measured over the generic implementation.

Signed-off-by: Tim Chen
---
 arch/x86/crypto/sha256-ssse3-asm.S | 504 +
 1 file changed, 504 insertions(+)

[PATCH 02/11] Expose SHA256 generic routine to be callable externally.

2013-03-22 Thread Tim Chen
Other SHA256 routines may need to use the generic routine when the FPU is not available.

Signed-off-by: Tim Chen
---
 crypto/sha256_generic.c | 11 ++-
 include/crypto/sha.h    |  2 ++
 2 files changed, 8 insertions(+), 5 deletions(-)

[PATCH 07/11] Expose generic sha512 routine to be callable from other modules

2013-03-22 Thread Tim Chen
Other SHA512 routines may need to use the generic routine when the FPU is not available.

Signed-off-by: Tim Chen
---
 crypto/sha512_generic.c | 13 +++--
 include/crypto/sha.h    |  3 +++
 2 files changed, 10 insertions(+), 6 deletions(-)

[PATCH 10/11] Optimized SHA512 x86_64 assembly routine using AVX2 RORX instruction.

2013-03-22 Thread Tim Chen
Provides a SHA512 x86_64 assembly routine optimized with SSE, AVX and AVX2's RORX instructions. A speedup of 70% or more has been measured over the generic implementation.

Signed-off-by: Tim Chen
---
 arch/x86/crypto/sha512-avx2-asm.S | 741 ++
 1 file changed, 741 insertions(+)

[PATCH 11/11] Create module providing optimized SHA512 routines using SSSE3, AVX or AVX2 instructions.

2013-03-22 Thread Tim Chen
We added glue code and config options to create a crypto module that uses the SSE/AVX/AVX2-optimized SHA512 x86_64 assembly routines.

Signed-off-by: Tim Chen
---
 arch/x86/crypto/Makefile            | 2 +
 arch/x86/crypto/sha512_ssse3_glue.c | 276
 crypto/Kconfig

[PATCH 08/11] Optimized SHA512 x86_64 assembly routine using Supplemental SSE3 instructions.

2013-03-22 Thread Tim Chen
Provides a SHA512 x86_64 assembly routine optimized with SSSE3 instructions. A speedup of 40% or more has been measured over the generic implementation.

Signed-off-by: Tim Chen
---
 arch/x86/crypto/sha512-ssse3-asm.S | 419 +
 1 file changed, 419 insertions(+)

[PATCH 09/11] Optimized SHA512 x86_64 assembly routine using AVX instructions.

2013-03-22 Thread Tim Chen
Provides a SHA512 x86_64 assembly routine optimized with SSE and AVX instructions. A speedup of 60% or more has been measured over the generic implementation.

Signed-off-by: Tim Chen
---
 arch/x86/crypto/sha512-avx-asm.S | 420 +++
 1 file changed, 420 insertions(+)

[PATCH 01/11] Added macro to check for AVX2 feature.

2013-03-22 Thread Tim Chen
Macro to facilitate checking the availability of the AVX2 feature.

Signed-off-by: Tim Chen
---
 arch/x86/include/asm/cpufeature.h | 1 +
 1 file changed, 1 insertion(+)

[PATCH 00/11] Optimize SHA256 and SHA512 for Intel x86_64 with SSSE3, AVX or AVX2 instructions

2013-03-22 Thread Tim Chen
Herbert,

The following patch series provides optimized SHA256 and SHA512 routines using the SSSE3, AVX or AVX2 instructions on x86_64 for Intel CPUs. Depending on CPU capabilities, a speedup of 40% to 70% or more can be achieved over the generic SHA256 and SHA512 routines.

Tim Chen (11):

[PATCH -next] crypto: ux500 - fix error return code in hash_dma_final()

2013-03-22 Thread Wei Yongjun
From: Wei Yongjun

Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun
---
 drivers/crypto/ux500/hash/hash_core.c | 2 ++
 1 file changed, 2 insertions(+)