https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119178
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507
           Keywords|                            |code-size
          Component|tree-optimization           |rtl-optimization

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Well, the subtract case can be done even better for aarch64 than what you
recommend. That is:

        sub     w0, w0, #1365
        cbz     w0, .L29

is better if written as:

        subs    w0, w0, #1365
        b.eq    .L29

Note that this is already recorded as PR 3507.

NOTE: I suspect this optimization is worse in general when it comes to speed
but better for size. For example, on RISC-V we have:

        li      a5,1365
        beq     a0,a5,.L9
        xori    a0,a0,1365
        ret

vs

        xori    a0,a0,1365
        beq     a0,zero,.L15
        ret

In the first case the li might be "free" and the beq/xori could be issued
together, while in the second case you have an extra dependency because the
beq has to wait for the result of the xori. This might be decent for -Os
though.
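
For reference, the two RISC-V sequences above correspond roughly to the two source-level forms sketched below. This is only an illustration, not the testcase from the report: the function names, the `hit` out-parameter (standing in for the branch being taken), and the `check_forms` helper are all hypothetical. A compiler may well emit the same code for both functions, so the sketch shows the equivalence the transformation relies on rather than what GCC generates.

```c
#include <assert.h>

#define K 1365u  /* constant taken from the assembly in the comment */

/* Form 1: compare against K, then xor.
   The compare and the xor both read x and do not depend on each other. */
unsigned cmp_then_xor(unsigned x, int *hit)
{
    *hit = (x == K);
    return x ^ K;
}

/* Form 2: xor first, then test the result against zero.
   The test reads y, so it has to wait for the xor. */
unsigned xor_then_test(unsigned x, int *hit)
{
    unsigned y = x ^ K;
    *hit = (y == 0u);
    return y;
}

/* Hypothetical self-check: both forms agree on a range of inputs around K. */
void check_forms(void)
{
    for (unsigned x = 0; x < 4096u; x++) {
        int h1, h2;
        unsigned r1 = cmp_then_xor(x, &h1);
        unsigned r2 = xor_then_test(x, &h2);
        assert(r1 == r2);
        assert(h1 == h2);
    }
}
```

The second form saves materializing the constant for the compare, which is where the size win comes from, at the cost of serializing the test behind the xor.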