On Wed, Nov 25, 2020 at 08:22:16AM +0100, Ard Biesheuvel wrote:
> ARM Cortex-A72 cores running in 32-bit mode are affected by a silicon
> erratum (1655431: ELR recorded incorrectly on interrupt taken between
> cryptographic instructions in a sequence [0]) where the second instruction
> of an AES instruction pair may execute twice if an interrupt is taken right
> after the first instruction consumes an input register of which a single
> 32-bit lane has been updated the last time it was modified.
> 
> This is not such a rare occurrence as it may seem: in counter mode, only
> the least significant 32-bit word is incremented in the absence of a
> carry, which makes our counter mode implementation susceptible to the
> erratum.
> 
> So let's shuffle the counter assignments around a bit so that the most
> recent updates when the AES instruction pair executes are 128-bit wide.
> 
> [0] ARM-EPM-012079 v11.0 Cortex-A72 MPCore Software Developers Errata Notice
> 
> Cc: <sta...@vger.kernel.org> # v5.4+
> Signed-off-by: Ard Biesheuvel <a...@kernel.org>
> ---
>  arch/arm/crypto/aes-ce-core.S | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S
> index 4d1707388d94..c0ef9680d90b 100644
> --- a/arch/arm/crypto/aes-ce-core.S
> +++ b/arch/arm/crypto/aes-ce-core.S
> @@ -386,20 +386,20 @@ ENTRY(ce_aes_ctr_encrypt)
>  .Lctrloop4x:
>       subs            r4, r4, #4
>       bmi             .Lctr1x
> -     add             r6, r6, #1
> +     add             ip, r6, #1
>       vmov            q0, q7
> +     rev             ip, ip
> +     add             lr, r6, #2
> +     vmov            s31, ip
> +     add             ip, r6, #3
> +     rev             lr, lr
>       vmov            q1, q7
> -     rev             ip, r6
> -     add             r6, r6, #1
> +     vmov            s31, lr
> +     rev             ip, ip
>       vmov            q2, q7
> -     vmov            s7, ip
> -     rev             ip, r6
> -     add             r6, r6, #1
> +     vmov            s31, ip
> +     add             r6, r6, #4
>       vmov            q3, q7
> -     vmov            s11, ip
> -     rev             ip, r6
> -     add             r6, r6, #1
> -     vmov            s15, ip
>       vld1.8          {q4-q5}, [r1]!
>       vld1.8          {q6}, [r1]!
>       vld1.8          {q15}, [r1]!

Seems like this could use a comment that explains that things need to be done in
a certain way to avoid an erratum.
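
To illustrate what such a comment would need to capture, here is a rough C
model of the reordered sequence as I read it from the diff. It is only a
sketch: the q128 type, rev32() and prepare_counters() are made-up names, and
it assumes that lane 3 of q7 (s31) holds the byte-swapped value of r6 on entry
to the loop. The point is that only q7 ever has a single lane written, while
q0-q3 are always produced by full 128-bit copies.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit NEON register as four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } q128;

/* Models the ARM 'rev' instruction: byte-swap a 32-bit word. */
static uint32_t rev32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/*
 * Models the patched counter setup at .Lctrloop4x.
 * q7 is the working counter block, r6 the host-endian counter word,
 * out[0..3] stand in for q0..q3.
 */
static void prepare_counters(q128 *q7, uint32_t *r6, q128 out[4])
{
    uint32_t ip, lr;

    ip = *r6 + 1;           /* add  ip, r6, #1 */
    out[0] = *q7;           /* vmov q0, q7   (full 128-bit copy) */
    ip = rev32(ip);         /* rev  ip, ip */
    lr = *r6 + 2;           /* add  lr, r6, #2 */
    q7->lane[3] = ip;       /* vmov s31, ip  (single-lane write, q7 only) */
    ip = *r6 + 3;           /* add  ip, r6, #3 */
    lr = rev32(lr);         /* rev  lr, lr */
    out[1] = *q7;           /* vmov q1, q7 */
    q7->lane[3] = lr;       /* vmov s31, lr */
    ip = rev32(ip);         /* rev  ip, ip */
    out[2] = *q7;           /* vmov q2, q7 */
    q7->lane[3] = ip;       /* vmov s31, ip */
    *r6 += 4;               /* add  r6, r6, #4 */
    out[3] = *q7;           /* vmov q3, q7 */
}
```

In this model the last modification of each of out[0..3] before the AES
rounds is a whole-register copy, which is the property the commit message
describes.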

- Eric
