Re: [PATCH PR69052]Check if loop inv can be propagated into mem ref with additional addr expr canonicalization

Jeff Law Wed, 10 Feb 2016 23:14:47 -0800

On 02/09/2016 04:08 AM, Bin Cheng wrote:

Hi,
When counting cost for loop inv, GCC checks if a loop inv can be propagated 
into its use site (a memory reference).  If it cannot be propagated, we 
increase its cost so that it's expensive enough to be hoisted out of loop.  
Currently we simply replace loop inv register in the use site with its 
definition expression, then call validate_changes to check if the result insn 
is valid.  This is weak because validate_changes doesn't take canonicalization 
into consideration.  Given below example:


   Loop inv def:
    69: r149:SI=r87:SI+const(unspec[`xxxx'] 1)
       REG_DEAD r87:SI
   Loop inv use:
    70: r150:SI=[r90:SI*0x4+r149:SI]
       REG_DEAD r149:SI

The address expression after propagation is "r90 * 0x4 + (r87 + const(unspec[`xxxx']))".  
Function validate_changes simply returns false to it.  As a matter of fact, the propagation is 
feasible if we canonicalize address expression into the form like "(r90 * 0x4 + r87) + 
const(unspec[`xxxx'])".

This patch fixes the problem by canonicalizing address expression and verifying 
if the new addr is valid.  The canonicalization follows GCC insn 
canonicalization rules.  The test case from bugzilla PR is also included.
As for the canonicalize_address interface, there is another 
canonicalize_address in fwprop.c which only changes shift into mult.  I think 
it would be good to factor out a common RTL interface in GCC, but that's stage1 
work.

Also note there's bits in combine that will canonicalize appropriateshifts into mults. Clearly there's a need for some generalized routinesto take a fairly generic address and perform canonicalizations andsimplifications on it.

Bootstrap and test on x86_64 and AArch64.  Is it OK?

Thanks,
bin

2016-02-09  Bin Cheng<bin.ch...@arm.com>

        PR tree-optimization/69052
        * loop-invariant.c (canonicalize_address): New function.
        (inv_can_prop_to_addr_use): Check validity of address expression
        which is canonicalized by above function.

gcc/testsuite/ChangeLog
2016-02-09  Bin Cheng<bin.ch...@arm.com>

        PR tree-optimization/69052
        * gcc.target/i386/pr69052.c: New test.


pr69052-20160204.txt


diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index 707f044..157e273 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -754,6 +754,74 @@ create_new_invariant (struct def *def, rtx_insn *insn, 
bitmap depends_on,
    return inv;
  }

+/* Returns a canonical version address for X.  It identifies
+   addr expr in the form of A + B + C.  Following instruction
+   canonicalization rules, MULT operand is moved to the front,
+   CONST operand is moved to the end; also PLUS operators are
+   chained to the left.  */
+
+static rtx
+canonicalize_address (rtx x)
+{
+  rtx op0, op1, op2;
+  machine_mode mode = GET_MODE (x);
+  enum rtx_code code = GET_CODE (x);
+
+  if (code != PLUS)
+    return x;
+
+  /* Extract operands from A + B (+ C).  */
+  if (GET_CODE (XEXP (x, 0)) == PLUS)
+    {
+      op0 = XEXP (XEXP (x, 0), 0);
+      op1 = XEXP (XEXP (x, 0), 1);
+      op2 = XEXP (x, 1);
+    }
+  else if (GET_CODE (XEXP (x, 1)) == PLUS)
+    {
+      op0 = XEXP (x, 0);
+      op1 = XEXP (XEXP (x, 1), 0);
+      op2 = XEXP (XEXP (x, 1), 1);
+    }
+  else
+    {
+      op0 = XEXP (x, 0);
+      op1 = XEXP (x, 1);
+      op2 = NULL_RTX;
+    }
+
+  /* Move MULT operand to the front.  */
+  if (!REG_P (op1) && !CONST_INT_P (op1))
+    std::swap (op0, op1);

This feels a bit hack-ish in the sense that you already know the form ofthe RTL you're expecting and just assume that you'll be given somethingof that form, but no more complex.

ISTM you're better off walking the whole rtx, recording the tidbits asyou go into a vec. If you see something unexpected during that walk,you punt canonicalization of the whole expression.

You then sort the vec. You want to move things like MULT to the startand all the constants to the end I think.

You then do simplifications, particularly on the constants, but theremay be something useful to do with MULT terms as well. You could alsoarrange to rewrite ASHIFTs into MULTs at this stage.

Then you generate a new equivalent expression from the simplifiedoperands in the vec.


You might look at tree-ssa-reassoc for ideas on implementation details.

Initially just use it in the LICM code, but I think given that kind ofstructure it'd be generally useful to replace bits of combine and fwprop

If your contention is that only a few forms really matter, then I'd liketo see those forms spelled out better in the comment and some kind ofchecking that we have reasonable incoming RTL.

+
+  /* Move CONST operand to the end.  */
+  if (CONST_INT_P (op0))
+    std::swap (op0, op1);

You might want to check CONSTANT_P here. Maybe it doesn't matter inpractice, but things like (plus (plus (symbol-ref) (const_int) const_int))

That also gives you a fighting chance at extending this to handleHIGH/LO_SUM which are going to appear on the RISCy targets.

So I think the concept of making sure we're passing canonical RTL to theverification step is good, I'm a bit concerned about the implementationof the canonicalization step.


Jeff

Re: [PATCH PR69052]Check if loop inv can be propagated into mem ref with additional addr expr canonicalization

Reply via email to