Re: [PATCH v2] crypto: arm/chacha-neon - optimize for non-block size multiples

2020-12-12 Thread Eric Biggers
On Sat, Dec 12, 2020 at 08:24:24AM +0100, Ard Biesheuvel wrote:
> On Sat, 12 Dec 2020 at 07:43, Eric Biggers wrote:
> >
> > Hi Ard,
> >
> > On Tue, Nov 03, 2020 at 05:28:09PM +0100, Ard Biesheuvel wrote:
> > > @@ -42,24 +42,24 @@ static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,

Re: [PATCH v2] crypto: arm/chacha-neon - optimize for non-block size multiples

2020-12-12 Thread Ard Biesheuvel
On Sat, 12 Dec 2020 at 07:43, Eric Biggers wrote:
>
> Hi Ard,
>
> On Tue, Nov 03, 2020 at 05:28:09PM +0100, Ard Biesheuvel wrote:
> > @@ -42,24 +42,24 @@ static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,
> >  {
> >      u8 buf[CHACHA_BLOCK_SIZE];
> >
> > -    while (bytes >= CHA

Re: [PATCH v2] crypto: arm/chacha-neon - optimize for non-block size multiples

2020-12-11 Thread Eric Biggers
Hi Ard,

On Tue, Nov 03, 2020 at 05:28:09PM +0100, Ard Biesheuvel wrote:
> @@ -42,24 +42,24 @@ static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,
>  {
>      u8 buf[CHACHA_BLOCK_SIZE];
>
> -    while (bytes >= CHACHA_BLOCK_SIZE * 4) {
> -        chacha_4block_xor_neon(state,

Re: [PATCH v2] crypto: arm/chacha-neon - optimize for non-block size multiples

2020-11-12 Thread Herbert Xu
On Tue, Nov 03, 2020 at 05:28:09PM +0100, Ard Biesheuvel wrote:
> The current NEON based ChaCha implementation for ARM is optimized for
> multiples of 4x the ChaCha block size (64 bytes). This makes sense for
> block encryption, but given that ChaCha is also often used in the
> context of networkin

[PATCH v2] crypto: arm/chacha-neon - optimize for non-block size multiples

2020-11-03 Thread Ard Biesheuvel
The current NEON based ChaCha implementation for ARM is optimized for multiples of 4x the ChaCha block size (64 bytes). This makes sense for block encryption, but given that ChaCha is also often used in the context of networking, it makes sense to consider arbitrary length inputs as well. For exam
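
To illustrate why arbitrary lengths matter for a routine tuned to 4-block multiples, here is a minimal sketch that tallies how an input length would break down into 4-block chunks, single blocks, and a partial tail. It is not code from the patch: `struct chacha_split` and `chacha_split_len()` are hypothetical names, and the four-block / one-block / tail ordering is an assumption based on the `while (bytes >= CHACHA_BLOCK_SIZE * 4)` loop visible in the quoted diff.

```c
#include <stddef.h>

/* ChaCha block size in bytes, as stated in the patch description. */
#define CHACHA_BLOCK_SIZE 64

/* Hypothetical helper type, not part of the patch. */
struct chacha_split {
    size_t four_block_calls; /* full 4-block (256-byte) chunks */
    size_t one_block_calls;  /* remaining full 64-byte blocks */
    size_t tail_bytes;       /* leftover bytes, less than one block */
};

/*
 * Assumed scheme: consume 4-block chunks first, then single blocks,
 * then whatever partial block remains.
 */
static struct chacha_split chacha_split_len(size_t bytes)
{
    struct chacha_split s = {0, 0, 0};

    s.four_block_calls = bytes / (CHACHA_BLOCK_SIZE * 4);
    bytes %= CHACHA_BLOCK_SIZE * 4;

    s.one_block_calls = bytes / CHACHA_BLOCK_SIZE;
    s.tail_bytes = bytes % CHACHA_BLOCK_SIZE;

    return s;
}
```

Under these assumptions, a 1420-byte input splits into five 4-block chunks, two single blocks, and a 12-byte tail, so only part of it goes through the 4-block path.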