On Thursday, March 20, 2014 at 06:45:17 PM, Chandramouli Narayanan wrote:
> This patch adds an x86_64 AVX2 optimization of the SHA1 transform to
> the crypto subsystem. The patch has been tested with the 3.14.0-rc1
> kernel.
> 
> On a Haswell desktop, with turbo disabled and all CPUs running at
> maximum frequency, tcrypt shows an AVX2 performance improvement over
> the AVX implementation ranging from 3% for 256-byte updates to 16%
> for 1024-byte updates.
> 
> This patch adds sha1_avx2_transform() along with the glue, build, and
> configuration changes needed to support the AVX2 optimization of the
> SHA1 transform in the crypto subsystem.
> 
> sha1-ssse3 is the single module that provides the optimized
> implementations (SSSE3/AVX/AVX2) of the low-level SHA1 transform
> function. When better optimization support is available, the
> transform function is overridden accordingly. In the AVX2 case,
> because relative performance varies with data block size, either the
> AVX or the AVX2 transform function is chosen at run time, whichever
> suits best (see the sketch after this paragraph). The Makefile change
> therefore appends the necessary objects to the link. The patch thus
> merely adds the AVX2 transform to the existing build mix and Kconfig
> support, leaving the rest of the build configuration as is.
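
For reference, the run-time selection described above boils down to
something like this minimal C sketch (the function names and the cutover
threshold here are made up for illustration; the actual glue code in
sha1_ssse3_glue.c is organized differently):

	#include <stdint.h>

	/* Hypothetical prototypes for the AVX and AVX2 assembly transforms. */
	void sha1_transform_avx(uint32_t *digest, const char *data,
				unsigned int blocks);
	void sha1_transform_avx2(uint32_t *digest, const char *data,
				 unsigned int blocks);

	/* Assumed cutover point in 64-byte blocks; the real threshold would
	 * be picked from tcrypt measurements across update sizes. */
	#define SHA1_AVX2_MIN_BLOCKS	4

	static void sha1_apply_transform(uint32_t *digest, const char *data,
					 unsigned int blocks)
	{
		if (blocks >= SHA1_AVX2_MIN_BLOCKS)
			sha1_transform_avx2(digest, data, blocks);
		else
			sha1_transform_avx(digest, data, blocks);
	}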
> 
> Signed-off-by: Chandramouli Narayanan <mo...@linux.intel.com>
> ---
>  arch/x86/crypto/Makefile               |   3 +
>  arch/x86/crypto/sha1_avx2_x86_64_asm.S | 702 +++++++++++++++++++++++++++++++++
>  arch/x86/crypto/sha1_ssse3_glue.c      |  50 ++-
>  crypto/Kconfig                         |   4 +-
>  4 files changed, 750 insertions(+), 9 deletions(-)
>  create mode 100644 arch/x86/crypto/sha1_avx2_x86_64_asm.S

The changelog is missing completely now ;-)
[...]

> +#include <linux/linkage.h>
> +
> +#define CTX  %rdi    /* arg1 */
> +#define BUF  %rsi    /* arg2 */
> +#define CNT  %rdx    /* arg3 */
> +
> +#define REG_A        %ecx
> +#define REG_B        %esi
> +#define REG_C        %edi
> +#define REG_D        %eax
> +#define REG_E        %edx
> +#define REG_TB  %ebx
> +#define REG_TA  %r12d
> +#define REG_RA  %rcx
> +#define REG_RB  %rsi
> +#define REG_RC  %rdi
> +#define REG_RD  %rax
> +#define REG_RE  %rdx
> +#define REG_RTA %r12
> +#define REG_RTB %rbx
> +#define REG_T1  %ebp

You're still mixing spaces and tabs here ...

[...]

> +     /* Align stack */
> +        mov     %rsp, %rbx
> +        and     $(0x1000-1), %rbx
> +        sub     $(8+32), %rbx
> +        sub     %rbx, %rsp
> +        push    %rbx
> +        sub     $RESERVE_STACK, %rsp
> +
> +        avx2_zeroupper
> +
> +     lea     K_XMM_AR(%rip), K_BASE

The indentation here is really all over the place ;-)

Why don't you just check for "^ \+" and replace the matched spaces with
tabs? That'd solve your indentation problem rather quickly. Moreover, you
can just use:

[TAB]<insn>[TAB]arg1, arg2...

This would also solve the problem where your instruction arguments are not
well indented.
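
Something like this one-liner would handle the leading spaces (GNU sed,
where "\+" matches one or more; untested, so eyeball the result):

	sed -i 's/^ \+/\t/' arch/x86/crypto/sha1_avx2_x86_64_asm.S

A second substitution in the same spirit could then put the tab between
each mnemonic and its operands.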

Uh guys, Peter or Herbert, please stop me if I'm pushing too much.

[...]