https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100694

HaoChen Gui <guihaoc at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guihaoc at gcc dot gnu.org

--- Comment #5 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #4)
> On aarch64 we have (in expand):
>
> ;; i_4 = i_3 << 64;
>
> (insn 10 9 11 (set (subreg:DI (reg/v:TI 94 [ i ]) 8)
>         (subreg:DI (reg/v:TI 93 [ i ]) 0)) "100694.c":4:6 -1
>      (nil))
>
> (insn 11 10 0 (set (subreg:DI (reg/v:TI 94 [ i ]) 0)
>         (const_int 0 [0])) "100694.c":4:6 -1
>      (nil))
>
> But on rs6000 we get:
>
> ;; i_4 = i_3 << 64;
>
> (insn 10 9 11 (set (subreg:DI (reg/v:TI 119 [ i ]) 0)
>         (ashift:DI (subreg:DI (reg/v:TI 118 [ i ]) 8)
>             (const_int 0 [0]))) "100694.c":4:6 -1
>      (nil))
>
> (insn 11 10 0 (set (subreg:DI (reg/v:TI 119 [ i ]) 8)
>         (const_int 0 [0])) "100694.c":4:6 -1
>      (nil))
>
> What the what.

On rs6000, insn 10 is optimized in the forward propagation pass.

test.c.261r.fwprop1:

(insn 10 5 11 2 (set (subreg:DI (reg/v:TI 119 [ i ]) 8)
        (reg/v:DI 122 [ hi ])) "test.c":4:6 670 {*movdi_internal64}
     (expr_list:REG_DEAD (reg:DI 126 [ i ])

It seems aarch64 optimizes it in the expand pass.

Now the problem is that the "ior" operation is done in TImode on rs6000, while
it is done on two subreg:DI on aarch64. The subreg pass can decompose a
register that is always used through subregs. If the ior were done on two
subreg:DI on rs6000, it could be optimized by the subreg pass.

on rs6000:

(insn 14 13 15 2 (set (reg:TI 125 [ i ])
        (ior:TI (reg:TI 124 [ lo ])
            (reg/v:TI 119 [ i ]))) "test.c":5:6 494 {*boolti3_internal}

on aarch64:

(insn 21 20 22 2 (set (reg:DI 100)
        (ior:DI (subreg:DI (reg:TI 99) 0)
            (subreg:DI (reg/v:TI 94 [ i ]) 0))) "/app/example.c":5:6 521 {iordi3}

(insn 23 22 24 2 (set (reg:DI 101)
        (ior:DI (subreg:DI (reg:TI 99) 8)
            (subreg:DI (reg/v:TI 94 [ i ]) 8))) "/app/example.c":5:6 521 {iordi3}
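
For readers without the testcase at hand, the difference between the two ior forms can be sketched at source level. This is only an illustration, not the actual 100694.c (which is not shown in this comment): the function names combine_wide and combine_halves are hypothetical, and the code assumes a 64-bit target where the GCC extension unsigned __int128 is available. The first function does the shift and the OR on the full 128-bit value, roughly matching the ior:TI on rs6000; the second does the OR per 64-bit half, roughly matching the two ior:DI insns on aarch64.

```c
#include <stdint.h>

/* Assumption: GCC or Clang on a 64-bit target, where __int128 exists. */
typedef unsigned __int128 u128;

/* Full-width form: shift and OR are both done on the 128-bit value,
   analogous to the single ior:TI in the rs6000 dump. */
u128 combine_wide(uint64_t hi, uint64_t lo)
{
    u128 i = hi;
    i <<= 64;          /* high half = hi, low half = 0 */
    i |= lo;           /* lo is zero-extended to 128 bits first */
    return i;
}

/* Split form: each 64-bit half is ORed on its own, analogous to the
   two ior:DI insns on subreg:DI in the aarch64 dump. */
u128 combine_halves(uint64_t hi, uint64_t lo)
{
    uint64_t low_half  = 0 | lo;   /* low half of (i << 64) is zero */
    uint64_t high_half = hi | 0;   /* high half of zero-extended lo is zero */
    return ((u128)high_half << 64) | low_half;
}
```

Both functions return the same value for every input. In the split form each half is an independent 64-bit operation, which is the shape the subreg pass can decompose.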