On Sat, Dec 12, 2020 at 09:32:43AM +0100, Ard Biesheuvel wrote:
> Commit 86cd97ec4b943af3 ("crypto: arm/chacha-neon - optimize for non-block
> size multiples") refactored the chacha block handling in the glue code in
> a way that may result in the counter increment to be omitted when calling
> chacha_block_xor_neon() to process a full block. This violates the API,
> which requires that the output IV is suitable for handling more input as
> long as the preceding input has been presented in round multiples of the
> block size.

It appears that the library API actually requires that the counter be
incremented on partial blocks too.  See __chacha20poly1305_encrypt().

I guess the missing increment in chacha_doneon() just wasn't noticed before
because chacha20poly1305 only needs this behavior on 32-byte inputs, and
chacha_doneon() is only executed when the length is over 64 bytes.
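
To make the invariant concrete, here is a minimal Python model of the counter bookkeeping in chacha_doneon() (a sketch only: it tracks state[12], the block counter, and generates no keystream; the helper name counter_delta and the fixed= flag are hypothetical, and the loop shape mirrors the post-refactor C code quoted below):

```python
CHACHA_BLOCK_SIZE = 64

def counter_delta(nbytes, fixed=True):
    """How far the block counter (state[12]) advances for an nbytes request."""
    delta = 0
    # NEON fast path: up to 4 blocks per iteration while more than
    # one block of input remains (mirrors the while loop in the C code).
    while nbytes > CHACHA_BLOCK_SIZE:
        l = min(nbytes, CHACHA_BLOCK_SIZE * 4)
        delta += -(-l // CHACHA_BLOCK_SIZE)  # DIV_ROUND_UP(l, 64)
        nbytes -= l
    # Tail: one final (full or partial) block via chacha_block_xor_neon().
    # Before the fix, this block was not counted at all.
    if nbytes and fixed:
        delta += 1
    return delta

# A 64-byte tail must still bump the counter, or a follow-up call
# reuses the same keystream block:
assert counter_delta(64, fixed=False) == 0  # the bug
assert counter_delta(64) == 1               # with the one-line fix
assert counter_delta(320) == 5              # 256 via fast path + 1 tail block
```

The model shows why the bug is invisible in most tests: any length that leaves no tail (or never reaches chacha_doneon() at all) advances the counter correctly, so only a caller that continues streaming after a block-multiple request observes the stuck counter.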

> 
> So increment the counter after calling chacha_block_xor_neon().
> 
> Fixes: 86cd97ec4b943af3 ("crypto: arm/chacha-neon - optimize for non-block size multiples")
> Reported-by: Eric Biggers <ebigg...@kernel.org>
> Signed-off-by: Ard Biesheuvel <a...@kernel.org>
> ---
>  arch/arm/crypto/chacha-glue.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/crypto/chacha-glue.c b/arch/arm/crypto/chacha-glue.c
> index 7b5cf8430c6d..f19e6da8cdd0 100644
> --- a/arch/arm/crypto/chacha-glue.c
> +++ b/arch/arm/crypto/chacha-glue.c
> @@ -60,6 +60,7 @@ static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,
>               chacha_block_xor_neon(state, d, s, nrounds);
>               if (d != dst)
>                       memcpy(dst, buf, bytes);
> +             state[12] += 1;
>       }

Maybe write this as:

        state[12]++;
