[Bug target/90582] AArch64 stack-protector wastes an instruction on address-generation

pinskia at gcc dot gnu.org via Gcc-bugs Thu, 25 Jan 2024 00:48:35 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90582


Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> > I assume EOR / CBNZ is as at least as efficient as SUBS / BNE on
> > all/most AArch64 microarchitectures, but someone should check.
> 
> It is similar as x86 with that respect on some cores (Marvell's cores
> mostly).
> That is ThunderX, ThunderX 2 and OcteonTX and OcteonTX2 all have the ability
> to do macro-combining of the two instructions into one micro-op.

Even on most non-Marvell cores now, subs/bne is better than eor/cbnz.


Anyway, starting with GCC 10.3/9.4 we get:
        ldr     x2, [x0]
        subs    x1, x1, x2
        mov     x2, 0
        bne     .L5

Which we can't fuse anyway, since the mov sits between the subs and the bne.
I wonder if we should clobber x1 too.


Note that the -fomit-frame-pointer issue is not really an issue: only
-momit-leaf-frame-pointer is turned on by default, and the function is NOT a
leaf function any more because of the call to __stack_chk_fail.
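
To make the leaf/non-leaf point concrete, here is a minimal sketch of the kind
of function involved. It is a hypothetical example, not the test case from this
bug: the name bounded_len and the volatile buffer are assumptions, and the
volatile is only there so the array is not optimized away. The source makes no
calls in the normal path, yet when built with something like
"gcc -O2 -fstack-protector-strong -S" for AArch64 the compiler is expected to
add the canary check and its failure path, and that path is what calls
__stack_chk_fail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not taken from the bug report.
   The local array is what should make -fstack-protector-strong
   place a canary in this frame. */
size_t bounded_len(const char *src)
{
    /* volatile is an assumption of this sketch: it keeps the
       buffer (and so the canary) from being optimized out. */
    volatile char buf[32];
    size_t n = 0;

    assert(src != NULL);

    /* Copy at most sizeof buf - 1 bytes, then terminate. */
    while (n < sizeof buf - 1 && src[n] != '\0') {
        buf[n] = src[n];
        n++;
    }
    buf[n] = '\0';

    /* The canary check is emitted by the compiler before the
       return; nothing in the source asks for it. */
    return n;
}
```

Calling bounded_len("abc") returns 3, and any input longer than 31 characters
is truncated to 31. The interesting part is the generated epilogue, not the
return value.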

>        mov     x1,0                            # and destroy the reg
>        mov     w1, 3                           # right before it's already destroyed

This is by design. GCC does not go back and work out whether the zeroing could
be removed, because deleting it by accident might introduce a "security hole".
Always emitting it ensures that can NOT happen.


As for the other issue, the address formation, it is a small missed
optimization, and fixing it might not help in general or at all.
