https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119178

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=3507
           Keywords|                            |code-size
          Component|tree-optimization           |rtl-optimization

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Well, the subtract case can be done even better on aarch64 than what you
recommend.

That is:
        sub     w0, w0, #1365
        cbz     w0, .L29

is better if written as:
        subs    w0, w0, #1365
        b.eq    .L29

Note that this is already recorded as PR 3507.
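
For illustration, C source of roughly the following shape could give rise to
the sub/cbz pair above. This is only a sketch: the function names and the
on_match() helper are hypothetical and not taken from the bug report, and
whether GCC emits exactly that code depends on the version and flags used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for whatever the code at .L29 does. */
static uint32_t on_match(void) { return 0xffffffffu; }

/* Compare against the constant, then subtract on the fall-through path. */
static uint32_t sub_after_compare(uint32_t x)
{
    if (x == 1365u)
        return on_match();
    return x - 1365u;
}

/* Subtract first and test the difference against zero. This is the shape
   that a flag-setting subtract (subs) followed by a conditional branch
   could implement without a separate compare. */
static uint32_t sub_then_test(uint32_t x)
{
    uint32_t d = x - 1365u;   /* wraps modulo 2^32, so d == 0 iff x == 1365 */
    if (d == 0)
        return on_match();
    return d;
}

/* Spot-check that both forms agree, including values that wrap around. */
static void check_sub_forms(void)
{
    const uint32_t samples[] = { 0u, 1u, 1364u, 1365u, 1366u, 0xffffffffu };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(sub_after_compare(samples[i]) == sub_then_test(samples[i]));
}
```

The two functions are equivalent because unsigned subtraction is modular, so
the difference is zero exactly when x equals the constant.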

Note: I suspect this optimization is worse in general when it comes to speed
but better for size.

For example, on RISC-V we have:
        li      a5,1365
        beq     a0,a5,.L9
        xori    a0,a0,1365
        ret

vs
        xori    a0,a0,1365
        beq     a0,zero,.L15
        ret

In the first case the li might be "free" and the beq/xori could be issued
together, while in the second case you have an extra dependency because the
xori has to complete before the beq.

This might be decent for -Os though.
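
The two RISC-V sequences above correspond to two ways of writing the same
function. The sketch below is an assumption about the source shape, not the
test case from the report: the function names and on_match() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the code at the branch target (.L9 / .L15). */
static uint32_t on_match(void) { return 0xffffffffu; }

/* First sequence: materialize the constant (li), compare (beq), and only
   then xor. The compare does not depend on the xor result. */
static uint32_t xor_after_compare(uint32_t x)
{
    if (x == 1365u)
        return on_match();
    return x ^ 1365u;
}

/* Second sequence: xor first, then branch on the result being zero.
   One instruction fewer, but the branch now depends on the xor. */
static uint32_t xor_then_test(uint32_t x)
{
    uint32_t t = x ^ 1365u;   /* t == 0 iff x == 1365 */
    if (t == 0)
        return on_match();
    return t;
}

/* Spot-check that both forms agree on a few inputs. */
static void check_xor_forms(void)
{
    const uint32_t samples[] = { 0u, 1u, 1364u, 1365u, 1366u, 0xffffffffu };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(xor_after_compare(samples[i]) == xor_then_test(samples[i]));
}
```

The second form trades the independent compare for a shorter instruction
stream, which is the size-versus-speed trade-off described above.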
