https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119178
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also| |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507
Keywords| |code-size
Component|tree-optimization |rtl-optimization
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Well, the subtract case can be done even better for aarch64 than what you
recommend.
That is:
sub w0, w0, #1365
cbz w0, .L29
is better if written as:
subs w0, w0, #1365
b.eq .L29
Note that this is already recorded as PR 3507.
Note: I suspect this optimization is generally worse for speed but better for
size.
For example, on RISC-V we have:
li a5,1365
beq a0,a5,.L9
xori a0,a0,1365
ret
vs
xori a0,a0,1365
beq a0,zero,.L15
ret
In the first case the li might be "free" and the beq/xori could be issued
together, while in the second case there is an extra dependency, because the
beq has to wait for the result of the xori.
This might be decent for -Os though.
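
For reference, here is a rough C-level sketch of the two forms compared above. It is not the reporter's test case; the function names and the taken_path() helper are made up for illustration, and a compiler is free to emit either instruction sequence for either function. It only shows that the two forms are equivalent, since a == 1365 holds exactly when (a ^ 1365) == 0.

```c
/* Hypothetical stand-in for whatever code lives at the branch target. */
static int taken_path(void)
{
    return -1;
}

/* Form 1: compare against the constant first, then XOR on the
   fall-through path.  The compare does not depend on the XOR. */
int cmp_then_xor(int a)
{
    if (a == 1365)
        return taken_path();
    return a ^ 1365;
}

/* Form 2: XOR first, then test the result against zero.  No constant
   needs to be loaded for the compare, but the branch now depends on
   the XOR result. */
int xor_then_test(int a)
{
    int x = a ^ 1365;
    if (x == 0)
        return taken_path();
    return x;
}
```

Comparing the output of -O2 and -Os on functions like these should show which form a given target prefers.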