All,

When implementing ARM/Thumb support for -freorder-blocks-and-partition I
encountered the following silent code generation fault.

Given the following CFG:

     |              |
    93             97
     |              |
 (FALLTHRU)    (CROSSING)
     \             /
      \----\  /---/
            \/
            94
             |

Basic Block 94 has the following insn in it which we want to lift into
blocks 93 and 97:

(insn/v 62 767 63 94 (set (reg:SI 3 r3 [orig:1468 ivtmp.85 ] [1468])
        (mem/c:SI (plus:SI (reg/f:SI 13 sp)
                (const_int 20 [0x14])) [7 %sfp+-52 S4 A32]))

For block 93 this becomes a move from r0 to r3 - and everything is OK.

For block 97 there is no appropriate move so the compiler tries
to copy the load, and insert it on the edge from 97 to 94.  This edge is
a crossing edge, and so is implemented by an indirect jump:

(insn 2795 2590 3940 97 (set (reg:SI 3 r3 [2464])
        (mem/u/c:SI (symbol_ref/u:SI ("*.LC19") [flags 0x2]) [2 S4
A32])) 634 {*arm_movsi_vfp}
     (insn_list:REG_LABEL_OPERAND 887 (expr_list:REG_EQUIV (label_ref:SI 887)
            (nil))))
(jump_insn 2796 3940 2593 97 (set (pc)
        (reg:SI 3 r3 [2464])) 264 {*arm_indirect_jump}
     (expr_list:REG_CROSSING_JUMP (nil)
        (nil)))

The compiler tries to insert the copy of insn 62 (in this case it
becomes insn 3940) immediately before the jump_insn - which because this
is a crossing edge is implemented as an indirect jump using a register
in the ARM backend:

(insn 2795 2590 3940 97 (set (reg:SI 3 r3 [2464])
        (mem/u/c:SI (symbol_ref/u:SI ("*.LC19") [flags 0x2]) [2 S4
A32])) 634 {*arm_movsi_vfp}
     (insn_list:REG_LABEL_OPERAND 887 (expr_list:REG_EQUIV (label_ref:SI 887)
            (nil))))
(insn 3940 2795 2796 97 (set (reg:SI 3 r3 [orig:1468 ivtmp.85 ] [1468])
        (mem/c:SI (plus:SI (reg/f:SI 13 sp)
                (const_int 20 [0x14])) [7 %sfp+-52 S4 A32])) -1
     (nil))
(jump_insn 2796 3940 2593 97 (set (pc)
        (reg:SI 3 r3 [2464])) 264 {*arm_indirect_jump}
     (expr_list:REG_CROSSING_JUMP (nil)
        (nil)))

However, this is incorrect as insn 3940 sets r3, and the jump_insn
2796 wants to use r3 (as set by 2795).

The patch fixes this by checking that the register set by the load is
not used by the jump before allowing the load to be lifted.

Whilst this fix works for this particular case I am not sure it is the
best fix for the general issue, and so if others have a better idea how
to fix this I would be very happy.

In particular I wonder whether we should be defining
TARGET_CANNOT_MODIFY_JUMPS_P for the ARM backend as indirect jumps use
registers in a similar way to the SH backend.  Not that this would have helped 
in this particular instance.

Tested cross arm-linux-none-gnueabi with in progress ARM -freorder-blocks-and-
partition enabling patch.

OK?

Thanks,

Matt

gcc/ChangeLog:

2012-09-05  Matthew Gretton-Dann  <matthew.gretton-d...@linaro.org>

        * postreload-gcse.c (eliminate_partially_redundant_load): Ensure
        that loads are not lifted over branches which use the register
        loaded.


diff --git a/gcc/postreload-gcse.c b/gcc/postreload-gcse.c
index b464d1f..85fb9b3 100644
--- a/gcc/postreload-gcse.c
+++ b/gcc/postreload-gcse.c
@@ -1048,6 +1048,13 @@ eliminate_partially_redundant_load (basic_block bb, rtx
          /* Adding a load on a critical edge will cause a split.  */
          if (EDGE_CRITICAL_P (pred))
            critical_edge_split = true;
+
+         /* If the destination register is used at the BB end we can not
+            insert the load.  */
+         if (reg_used_between_p (dest, PREV_INSN (BB_END (pred_bb)),
+                                 next_pred_bb_end))
+           goto cleanup;
+
          not_ok_count += pred->count;
          unoccr = (struct unoccr *) obstack_alloc (&unoccr_obstack,
                                                    sizeof (struct unoccr));

-- 
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-d...@linaro.org

Reply via email to