On Sat, Dec 12, 2020 at 09:32:43AM +0100, Ard Biesheuvel wrote:
> Commit 86cd97ec4b943af3 ("crypto: arm/chacha-neon - optimize for non-block
> size multiples") refactored the chacha block handling in the glue code in
> a way that may result in the counter increment being omitted when calling
> chacha_block_xor_neon() to process a full block. This violates the API,
> which requires that the output IV is suitable for handling more input as
> long as the preceding input has been presented in round multiples of the
> block size.

It appears that the library API actually requires that the counter be
incremented on partial blocks too. See __chacha20poly1305_encrypt().
I guess the missing increment in chacha_doneon() just wasn't noticed before
because chacha20poly1305 only needs this behavior on 32-byte inputs, and
chacha_doneon() is only executed when the length is over 64 bytes.

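
To make the expected behavior concrete, here is a minimal userspace sketch
(not kernel code) of a glue loop that advances the counter once per block
touched, including a trailing partial block. toy_block_xor(), toy_crypt() and
TOY_BLOCK_SIZE are hypothetical names made up for illustration, and the
"keystream" is a placeholder rather than ChaCha; the only assumption carried
over from the patch is that the block function leaves state[12] alone.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BLOCK_SIZE 64

/*
 * Hypothetical stand-in for a single-block function: XORs up to one block
 * of placeholder keystream into dst and does not modify the counter in
 * state[12]. This is NOT the ChaCha permutation.
 */
static void toy_block_xor(const uint32_t *state, uint8_t *dst,
			  const uint8_t *src, size_t bytes)
{
	for (size_t i = 0; i < bytes; i++)
		dst[i] = src[i] ^ (uint8_t)(state[12] * 31 + i);
}

/* Glue loop: one counter increment per block touched, full or partial. */
static void toy_crypt(uint32_t *state, uint8_t *dst, const uint8_t *src,
		      size_t bytes)
{
	while (bytes) {
		size_t n = bytes < TOY_BLOCK_SIZE ? bytes : TOY_BLOCK_SIZE;

		toy_block_xor(state, dst, src, n);
		state[12]++;	/* also after the final partial block */
		src += n;
		dst += n;
		bytes -= n;
	}
}
```

With this shape, a 32-byte call leaves state[12] advanced by one, so a
following call starts on a fresh block instead of reusing the same keystream.
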
>
> So increment the counter after calling chacha_block_xor_neon().
>
> Fixes: 86cd97ec4b943af3 ("crypto: arm/chacha-neon - optimize for non-block size multiples")
> Reported-by: Eric Biggers <[email protected]>
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> arch/arm/crypto/chacha-glue.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm/crypto/chacha-glue.c b/arch/arm/crypto/chacha-glue.c
> index 7b5cf8430c6d..f19e6da8cdd0 100644
> --- a/arch/arm/crypto/chacha-glue.c
> +++ b/arch/arm/crypto/chacha-glue.c
> @@ -60,6 +60,7 @@ static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,
>  		chacha_block_xor_neon(state, d, s, nrounds);
>  		if (d != dst)
>  			memcpy(dst, buf, bytes);
> +		state[12] += 1;
>  	}

Maybe write this as:

	state[12]++;