Re: [RFA] Attach MEM_EXPR information when flushing BLKmode args to the stack - V2

Jeff Law Thu, 08 Jul 2021 19:39:27 -0700


On 7/2/2021 10:13 AM, Jeff Law wrote:

This is a minor missed optimization we found with our internal port.

Given this code:

typedef struct {short a; short b;} T;

extern void g1();

void f(T s)
{
        if (s.a < 0)
                g1();
}
"s" is passed in a register, but it's still a BLKmode object becausethe alignment of T is smaller than the alignment that an integer ofthe same size would have (16 bits vs 32 bits).
Because "s" is BLKmode function.c is going to store it into a stackslot and we'll load it from the that slot for each reference. So onthe v850 (just to pick a port that likely has the same behavior we'reseeing) we have this RTL from CSE2:
(insn 2 4 3 2 (set (mem/c:SI (plus:SI (reg/f:SI 34 .fp)
                (const_int -4 [0xfffffffffffffffc])) [2 S4 A32])
        (reg:SI 6 r6)) "j.c":6:1 7 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 6 r6)
        (nil)))
(note 3 2 8 2 NOTE_INSN_FUNCTION_BEG)
(insn 8 3 9 2 (set (reg:HI 44 [ s.a ])
        (mem/c:HI (plus:SI (reg/f:SI 34 .fp)
(const_int -4 [0xfffffffffffffffc])) [1 s.a+0 S2A32])) "j.c":7:5 3 {*movhi_internal}
     (nil))
(insn 9 8 10 2 (parallel [
            (set (reg:SI 45)
                (ashift:SI (subreg:SI (reg:HI 44 [ s.a ]) 0)
                    (const_int 16 [0x10])))
            (clobber (reg:CC 32 psw))
        ]) "j.c":7:5 94 {ashlsi3_clobber_flags}
     (expr_list:REG_DEAD (reg:HI 44 [ s.a ])
        (expr_list:REG_UNUSED (reg:CC 32 psw)
            (nil))))
(insn 10 9 11 2 (parallel [
            (set (reg:SI 43)
                (ashiftrt:SI (reg:SI 45)
                    (const_int 16 [0x10])))
            (clobber (reg:CC 32 psw))
        ]) "j.c":7:5 104 {ashrsi3_clobber_flags}
     (expr_list:REG_DEAD (reg:SI 45)
        (expr_list:REG_UNUSED (reg:CC 32 psw)
            (nil))))
Insn 2 is the store into the stack. insn 8 is the load for s.a in theconditional. DSE1 replaces the MEM in insn 8 with (reg 6) since (reg6) has the value we want. After that the store at insn 2 is dead. Sadly DSE never removes the store.
The problem is RTL DSE considers a store with no MEM_EXPR as escaping,which keeps the MEM live. The lack of a MEM_EXPR is due to call tochange_address to twiddle the mode on the MEM for the store at insn2. It should be safe to copy the MEM_EXPR (which should always be aPARM_DECL) from the original memory to the memory returned bychange_address. Doing so results in DSE1 removing the store at insn 2.
It would be nice to remove the stack setup/teardown. I'm not offhandaware of mechanisms to remove the setup/teardown after we've alreadyallocated a slot, even if the slot is no longer used.
Bootstrapped and regression tested on x86, though I don't think that'sa particularly useful test. So I also ran it through my tester acrossthose pesky embedded targets without regressions as well.
I didn't include a test simply because I didn't want to have an insanetarget selector. I guess if we really wanted a test we could lookafter DSE1 is done and verify there aren't any MEMs left at all. Willing to try that if the consensus is we want this tested.
OK for the trunk?

So Richi questioned if using adjust_address rather than change_addresswas sufficient to solve the problem. It is. I've bootstrapped &regression tested on x86_64 and regression tested this against the usualset of crosses in the tester.


OK for the trunk now?

JEff

        * function.c (assign_parm_setup_block): Use adjust_address instead
        of change_address to preserve MEM_EXPR and friends.

diff --git a/gcc/function.c b/gcc/function.c
index 00b2fe70c7d..af3d57b32a3 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -3036,7 +3036,15 @@ assign_parm_setup_block (struct assign_parm_data_all 
*all,
                  reg = gen_rtx_REG (word_mode, REGNO (entry_parm));
                  reg = convert_to_mode (mode, copy_to_reg (reg), 1);
                }
-             emit_move_insn (change_address (mem, mode, 0), reg);
+
+             /* We use adjust_address to get a new MEM with the mode
+                changed.  adjust_address is better than change_address
+                for this purpose because adjust_address does not lose
+                the MEM_EXPR associated with the MEM.
+
+                If the MEM_EXPR is lost, then optimizations like DSE
+                assume the MEM escapes and thus is not subject to DSE.  */
+             emit_move_insn (adjust_address (mem, mode, 0), reg);
            }
 
 #ifdef BLOCK_REG_PADDING

Re: [RFA] Attach MEM_EXPR information when flushing BLKmode args to the stack - V2

Reply via email to