On Jun 25, 2020, at 6:37 PM, Jeff Law <l...@redhat.com> wrote:
> On Thu, 2020-06-25 at 15:46 -0400, Alan Lehotsky wrote:
>> I’m working on a GCC 8.3 port to a load/store architecture with a 32-bit data path between registers and memory. Looking at the gcc.dg/loop-9.c test, my port fails it because I have split the move of a double constant to memory into multiple moves (4 in fact, because I only have a 16-bit immediate mode).
>>
>> The (define_insn_and_split “movdf” …) is conditioned on “reload_completed”.
>>
>> Is there some other trick I need to get the constant hoisted? I have already set the rtx cost of the CONST_DOUBLE ridiculously high (like 10 insns).
>
> Hi Alan, it's been a long time...
>
> We'd probably need to see the RTL. A variety of things can get in the way of LICM. For example, I'd expect subregs to be problematical because they can look like RMW operations.
>
> jeff

Hello to you too, Jeff…. I’ve been lurking for the last decade or so; the last port I actually did was GCC 4 based, so there is lots of new stuff to try and wrap my head around.

I certainly am grateful to anybody with suggestions as to how to track down this problem. (I’m not terribly eager to step through an x86 gcc in parallel with my port to see where they diverge in the loop-invariant recognition.)

Although in crafting this expanded email, I see that the x86 has already decided to store the constant 18.4242 in the .rodata section by the start of loop-invariance, so there’s a

    (set (reg:DF…. ) (mem:DF (symbol_ref ….)))

and I bet that’s far easier to move out of the loop than it would be to split the original

    (set (mem:DF…) (const_double:DF ….))

-- Al

==========
Source code is

void f (double *a)
{
  int i;
  for (i = 0; i < 100; i++)
    a[i] = 18.4242;
}

==========
Here’s the dump from loop-9.c.252r.loop2-invariant (compiled -O1)

;; Function f (f, funcdef_no=0, decl_uid=1458, cgraph_uid=0, symbol_order=0)

*****starting processing of loop 1 ******
starting the processing of deferred insns
ending the processing of deferred insns
setting blocks to analyze 3, 5
starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 6 count 2 ( 0.33)
df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 6 count 2 ( 0.33)
df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 6 count 3 ( 0.5)

starting region dump

f

Dataflow summary:
def_info->table_size = 3, use_info->table_size = 23
;; invalidated by call 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7 [d7] 8 [d8] 9 [d9] 14 [d14] 15 [d15] 16 [a0] 19 [a3] 20 [a4] 24 [acc0_hi] 25 [acc0_lo] 26 [acc1_hi] 27 [acc1_lo] 28 [source3] 30 [cc] 31 [int_set0] 32 [int_set1] 33 [int_clr0] 34 [int_clr1] 35 [scratchpad0] 36 [scratchpad1] 37 [scratchpad2] 38 [scratchpad3]
;; hardware regs used 23 [sp] 29 [arg] 39 [sfp]
;; regular block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
;; eh block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
;; entry block defs 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7 [d7] 8 [d8] 9 [d9] 21 [a5] 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
;; exit block uses 22 [a6] 23 [sp] 39 [sfp]
;; regs ever live 0 [d0] 30 [cc]
;; ref usage r0={1d,1u} r1={1d} r2={1d} r3={1d} r4={1d} r5={1d} r6={1d} r7={1d} r8={1d} r9={1d} r21={1d} r22={1d,5u} r23={1d,5u} r29={1d,4u} r30={3d,1u} r39={1d,5u} r46={2d,4u} r48={1d,1u}
;; total ref usage 47{21d,26u,0e} in 6{6 regular + 0 call} insns.
;; Reaching defs:
;; sparse invalidated
;; dense invalidated 0, 1
;; reg->defs[] map: 30[0,1] 46[2,2]

;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u7(22){ }u8(23){ }u9(29){ }u10(39){ }}
;; lr in 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
;; lr use 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
;; lr def 30 [cc] 46
;; live in 46
;; live gen 30 [cc] 46
;; live kill 30 [cc]
;; rd in (1) 46[2]
;; rd gen (2) 30[1],46[2]
;; rd kill (3) 30[0,1],46[2]
;; UD chains for artificial uses at top
(code_label 11 7 8 3 2 (nil) [0 uses])
(note 8 11 9 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
;; UD chains for insn luid 0 uid 9
;; reg 46 { d2(bb 3 insn 10) }
(insn 9 8 10 3 (set (mem:DF (reg:SI 46 [ ivtmp___6 ]) [0 MEM[base: _15, offset: 0B]+0 S8 A32])
        (const_double:DF 1.84241999999999990222931955941021442413330078125e+1 [0x0.9364c2f837b4ap+5])) "loop-9.c":9 19 {movdf}
     (nil))
;; UD chains for insn luid 1 uid 10
;; reg 46 { d2(bb 3 insn 10) }
(insn 10 9 12 3 (parallel [
            (set (reg:SI 46 [ ivtmp___6 ])
                (plus:SI (reg:SI 46 [ ivtmp___6 ])
                    (const_int 8 [0x8])))
            (clobber (reg:CC 30 cc))
        ]) 81 {addsi3_1v5}
     (expr_list:REG_UNUSED (reg:CC 30 cc)
        (nil)))
;; UD chains for insn luid 2 uid 12
;; reg 46 { d2(bb 3 insn 10) }
;; reg 48 { }
(insn 12 10 13 3 (set (reg:CCWZ 30 cc)
        (compare:CCWZ (reg:SI 46 [ ivtmp___6 ])
            (reg:SI 48 [ _17 ]))) "loop-9.c":8 57 {cmpsi_sub4}
     (nil))
;; UD chains for insn luid 3 uid 13
;; reg 30 { d1(bb 3 insn 12) }
(jump_insn 13 12 18 3 (set (pc)
        (if_then_else (ne:CCWZ (reg:CCWZ 30 cc)
                (const_int 0 [0]))
            (label_ref:SI 18)
            (pc))) "loop-9.c":8 177 {jcc}
     (expr_list:REG_DEAD (reg:CCWZ 30 cc)
        (int_list:REG_BR_PROB 1063004412 (nil)))
 -> 18)
;; lr out 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
;; live out 46
;; rd out (1) 46[2]
;; UD chains for artificial uses at bottom
;; reg 22 { }
;; reg 23 { }
;; reg 29 { }
;; reg 39 { }

;; bb 5 artificial_defs: { }
;; bb 5 artificial_uses: { u-1(22){ }u-1(23){ }u-1(29){ }u-1(39){ }}
;; lr in 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
;; lr use 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
;; lr def
;; live in 46
;; live gen
;; live kill
;; rd in (2) 30[1],46[2]
;; rd gen (0)
;; rd kill (0)
;; UD chains for artificial uses at top
(code_label 18 13 17 5 3 (nil) [1 uses])
(note 17 18 14 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
;; lr out 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
;; live out 46
;; rd out (1) 46[2]
;; UD chains for artificial uses at bottom
;; reg 22 { }
;; reg 23 { }
;; reg 29 { }
;; reg 39 { }

*****ending processing of loop 1 ******
starting the processing of deferred insns
ending the processing of deferred insns

f

Dataflow summary:
;; invalidated by call 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7 [d7] 8 [d8] 9 [d9] 14 [d14] 15 [d15] 16 [a0] 19 [a3] 20 [a4] 24 [acc0_hi] 25 [acc0_lo] 26 [acc1_hi] 27 [acc1_lo] 28 [source3] 30 [cc] 31 [int_set0] 32 [int_set1] 33 [int_clr0] 34 [int_clr1] 35 [scratchpad0] 36 [scratchpad1] 37 [scratchpad2] 38 [scratchpad3]
;; hardware regs used 23 [sp] 29 [arg] 39 [sfp]
;; regular block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
;; eh block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
;; entry block defs 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7 [d7] 8 [d8] 9 [d9] 21 [a5] 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
;; exit block uses 22 [a6] 23 [sp] 39 [sfp]
;; regs ever live 0 [d0] 30 [cc]
;; ref usage r0={1d,1u} r1={1d} r2={1d} r3={1d} r4={1d} r5={1d} r6={1d} r7={1d} r8={1d} r9={1d} r21={1d} r22={1d,5u} r23={1d,5u} r29={1d,4u} r30={3d,1u} r39={1d,5u} r46={2d,4u} r48={1d,1u}
;; total ref usage 47{21d,26u,0e} in 6{6 regular + 0 call} insns.
(note 4 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 4 3 2 (set (reg:SI 46 [ ivtmp___6 ])
        (reg:SI 0 d0 [ a ])) "loop-9.c":6 7 {movsi_internal}
     (expr_list:REG_DEAD (reg:SI 0 d0 [ a ])
        (nil)))
(note 3 2 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 3 11 2 (parallel [
            (set (reg:SI 48 [ _17 ])
                (plus:SI (reg:SI 46 [ ivtmp___6 ])
                    (const_int 800 [0x320])))
            (clobber (reg:CC 30 cc))
        ]) 81 {addsi3_1v5}
     (expr_list:REG_UNUSED (reg:CC 30 cc)
        (nil)))
(code_label 11 7 8 3 2 (nil) [0 uses])
(note 8 11 9 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 9 8 10 3 (set (mem:DF (reg:SI 46 [ ivtmp___6 ]) [0 MEM[base: _15, offset: 0B]+0 S8 A32])
        (const_double:DF 1.84241999999999990222931955941021442413330078125e+1 [0x0.9364c2f837b4ap+5])) "loop-9.c":9 19 {movdf}
     (nil))
(insn 10 9 12 3 (parallel [
            (set (reg:SI 46 [ ivtmp___6 ])
                (plus:SI (reg:SI 46 [ ivtmp___6 ])
                    (const_int 8 [0x8])))
            (clobber (reg:CC 30 cc))
        ]) 81 {addsi3_1v5}
     (expr_list:REG_UNUSED (reg:CC 30 cc)
        (nil)))
(insn 12 10 13 3 (set (reg:CCWZ 30 cc)
        (compare:CCWZ (reg:SI 46 [ ivtmp___6 ])
            (reg:SI 48 [ _17 ]))) "loop-9.c":8 57 {cmpsi_sub4}
     (nil))
(jump_insn 13 12 18 3 (set (pc)
        (if_then_else (ne:CCWZ (reg:CCWZ 30 cc)
                (const_int 0 [0]))
            (label_ref:SI 18)
            (pc))) "loop-9.c":8 177 {jcc}
     (expr_list:REG_DEAD (reg:CCWZ 30 cc)
        (int_list:REG_BR_PROB 1063004412 (nil)))
 -> 18)
(code_label 18 13 17 5 3 (nil) [1 uses])
(note 17 18 14 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(note 14 17 0 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
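
==========
For anyone following along, here is a rough source-level sketch (not taken from the port or from GCC itself) of the two things discussed in this message: how a DF constant breaks into four 16-bit immediates, and what the loop looks like once the constant is materialized ahead of it, which is the shape the test expects invariant motion to reach. The names split_double_imm16 and f_hoisted are hypothetical and exist only for illustration, and the code assumes the host represents double as IEEE 754 binary64. It says nothing about the order in which the port's movdf splitter emits the pieces (that depends on the target's endianness) or about how loop2_invariant decides what to move.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bit image of VALUE and cut it into four
   16-bit pieces, most significant piece first.  Assumes double is
   IEEE 754 binary64 on the host.  */
void split_double_imm16(double value, uint16_t piece[4])
{
    uint64_t bits;

    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);   /* well-defined type punning */

    for (int i = 0; i < 4; i++)
        piece[i] = (uint16_t)(bits >> (48 - 16 * i));
}

/* The loop from loop-9.c as written: the constant is stored directly
   to memory on every iteration.  */
void f(double *a)
{
    int i;
    for (i = 0; i < 100; i++)
        a[i] = 18.4242;
}

/* Hand-hoisted equivalent (illustration only): the constant is
   materialized once before the loop, so the loop body is a plain
   store of an already-computed value.  */
void f_hoisted(double *a)
{
    const double c = 18.4242;
    int i;
    for (i = 0; i < 100; i++)
        a[i] = c;
}
```

On an IEEE 754 host, split_double_imm16(18.4242, p) gives 0x4032, 0x6C98, 0x5F06, 0xF694, which agrees with the [0x0.9364c2f837b4ap+5] significand printed for insn 9 in the dump. f and f_hoisted fill the array with identical values; the only difference is where the constant gets built.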