https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121778

--- Comment #2 from Dusan Stojkovic <[email protected]> ---
Created attachment 62402
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62402&action=edit
A patch that adds a pattern to improve rotation detection for rv32gc_zbb

This patch helps rotate detection on rv32 because the combine pass already
tries this special case, which currently fails to match:

...

Trying 8, 10, 9 -> 11:
    8: r141:SI=r138:SI 0>>0x1f
   10: r143:SI=r138:SI<<0x1
      REG_DEAD r138:SI
    9: r142:SI=r141:SI^0x1
      REG_DEAD r141:SI
   11: r140:SI=r142:SI|r143:SI
      REG_DEAD r143:SI
      REG_DEAD r142:SI
Failed to match this instruction:
(set (reg:SI 140 [ _5 ])
    (ior:SI (ashift:SI (reg/v:SI 138 [ aD.2872 ])
            (const_int 1 [0x1]))
        (ge:SI (reg/v:SI 138 [ aD.2872 ])
            (const_int 0 [0]))))

...

With this patch applied, the same combine attempt succeeds:

...

Trying 6, 8, 7 -> 9:
    6: r140:SI=r138:SI 0>>0x1f
    8: r142:SI=r138:SI<<0x1
      REG_DEAD r138:SI
    7: r141:SI=r140:SI^0x1
      REG_DEAD r140:SI
    9: r139:SI=r141:SI|r142:SI
      REG_DEAD r142:SI
      REG_DEAD r141:SI
Successfully matched this instruction:
(set (reg:SI 139 [ _5 ])
    (ior:SI (ashift:SI (reg/v:SI 138 [ a ])
            (const_int 1 [0x1]))
        (ge:SI (reg/v:SI 138 [ a ])
            (const_int 0 [0]))))
allowing combination of insns 6, 7, 8 and 9
original costs 4 + 4 + 4 + 4 = 16
replacement cost 16
deferring deletion of insn with uid = 8.
deferring deletion of insn with uid = 7.
deferring deletion of insn with uid = 6.
modifying insn i3     9: r139:SI=r138:SI<<0x1|r138:SI>=0
      REG_DEAD r138:SI
deferring rescan insn with uid = 9.

...

And so the output from GCC is:

test_011:
        rori    a0,a0,31
        xori    a0,a0,1
        ret

Compiled with rv32gc_zbb -O2.
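
As a sanity check on the generated code, here is a minimal C sketch showing
that the rori/xori pair computes the same value as the matched RTL. The
function names are hypothetical, and the first function is only a source-level
reading of the RTL above, not the test case from the bug report (which is not
shown in this comment):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical source-level reading of the matched RTL:
   (ior (ashift a 1) (ge a 0)).  The cast relies on GCC's documented
   modulo conversion from uint32_t to int32_t.  */
static uint32_t shift_or_ge(uint32_t a)
{
    return (a << 1) | (uint32_t)((int32_t)a >= 0);
}

/* Models the emitted pair: rori a0,a0,31 followed by xori a0,a0,1.  */
static uint32_t rori31_xori1(uint32_t a)
{
    /* Rotating right by 31 is the same as rotating left by 1.  */
    uint32_t rot = (a >> 31) | (a << 1);
    return rot ^ 1u;
}

/* Returns the number of sample inputs on which the two forms disagree.  */
static int count_mismatches(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 2u, 0x7fffffffu, 0x80000000u,
        0x80000001u, 0xdeadbeefu, 0xffffffffu
    };
    int bad = 0;
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (shift_or_ge(samples[i]) != rori31_xori1(samples[i]))
            bad++;
    return bad;
}
```

The two agree because (a << 1) always has bit 0 clear, so OR-ing in the
inverted sign bit is the same as rotating the sign bit into bit 0 and then
flipping it.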

Some notes regarding the pattern chosen:
* The combine attempt which the pattern matches only appears because
(xor (lshiftrt A 31) 1)  ==  (A >= 0 ? 1 : 0) holds logically, so the xor
shows up in the equivalent canonical form (ge A 0).
* The rv64 dump doesn't show this combine attempt, so one direction might be
to extend simplify-rtx so that the combination is tried on rv64 as well, by
transforming:

(xor:DI
  (zero_extend:DI
    (lshiftrt:SI (subreg/s/u:SI (reg/v:DI 138 [ aD.2452 ]) 0)
                 (const_int 31)))
  (const_int 1))

into:

(zero_extend:DI
  (ge:SI (subreg:SI (reg/v:DI 138 [ aD.2452 ]) 0)
         (const_int 0)))

or something similar.
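
Both equivalences in the notes above can be checked with a small C sketch.
This is only an illustration under stated assumptions: the function names are
hypothetical, the subreg at offset 0 is modelled as the low 32 bits
(little-endian), and (ge ...) is assumed to yield 1 or 0:

```c
#include <assert.h>
#include <stdint.h>

/* (xor:SI (lshiftrt:SI a 31) 1) */
static uint32_t xor_of_sign_bit(uint32_t a)
{
    return (a >> 31) ^ 1u;
}

/* (ge:SI a 0), assuming the comparison yields 1 or 0.  The cast relies on
   GCC's documented modulo conversion from uint32_t to int32_t.  */
static uint32_t ge_zero(uint32_t a)
{
    return (uint32_t)((int32_t)a >= 0);
}

/* rv64 "before": (xor:DI (zero_extend:DI (lshiftrt:SI lo 31)) 1),
   where lo stands for the SI subreg of the DI register.  */
static uint64_t rv64_before(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    return (uint64_t)(lo >> 31) ^ 1u;
}

/* rv64 "after": (zero_extend:DI (ge:SI lo 0)).  */
static uint64_t rv64_after(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    return (uint64_t)ge_zero(lo);
}
```

Only bit 31 of the low word matters in both rv64 forms, so the upper 32 bits
of the DI register do not affect the result.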
