[Bug target/77881] [5/6/7 Regression] Non-optimal signed comparison on x86_64 since r146817

matz at gcc dot gnu.org Thu, 06 Oct 2016 08:34:36 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77881


--- Comment #4 from Michael Matz <matz at gcc dot gnu.org> ---
Actually, it's merely a deficiency in current combine not simplifying
intermediate expressions enough.  One of the things that need to happen is 
the following transformation:

(compare:CCZ (subreg:QI (lshiftrt:DI (reg:DI 95)
            (const_int 63 [0x3f])) 0)
    (const_int 0 [0]))

--> (remove subreg)

(compare:CCZ (lshiftrt:DI (reg:DI 95)
            (const_int 63 [0x3f]))
    (const_int 0 [0]))

--> (recognize that it's a sign bit extract)

(ge (reg:DI 95) (const_int 0))

(or 'lt'; which of the two depends on the outer code by which the compare:CCZ
result itself is compared).
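
The equivalence behind this transformation can be checked outside the
compiler.  The following is a minimal sketch in plain C, not combine code;
the function names are made up for illustration.  It assumes the QImode
lowpart subreg can be modelled as a truncation to 8 bits and lshiftrt as an
unsigned shift.

```c
#include <stdint.h>

/* Models (subreg:QI (lshiftrt:DI x 63) 0) == 0: shift logically,
   keep only the low 8 bits, compare against zero.  */
static int cmp_with_subreg (int64_t x)
{
  uint8_t low = (uint8_t) ((uint64_t) x >> 63);
  return low == 0;
}

/* Models (lshiftrt:DI x 63) == 0: the same compare without truncation.  */
static int cmp_without_subreg (int64_t x)
{
  return ((uint64_t) x >> 63) == 0;
}

/* Models (ge x 0): the plain sign test.  */
static int cmp_sign (int64_t x)
{
  return x >= 0;
}
```

The shift result is either 0 or 1, so dropping the upper 56 bits cannot
change the outcome of the comparison with zero.  That is why the subreg is
irrelevant in this specific context.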

In current combine.c this requires two steps: the removal of the irrelevant
subreg (irrelevant in this specific context), and then the recognition of
the sign bit extraction.  In the first function, combine gets two attempts
at simplification because we have a sequence of three instructions to
start with:

  t1 = ~a
  t2 = 255 & (t1 >> 63)
  flags = t2 != 0

The intermediate expression is combine_simplify_rtx'ed twice, so both steps
above happen, and we get good code.  In the second function we only have
two instructions to start with (the not is missing):

  t2 = 255 & (a >> 63)
  flags = t2 == 0

Only the first step above happens; we're left with the (lshiftrt(subreg)) and
nothing simplifies this further before combine tries to recognize the insn
(which ultimately fails).
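
For reference, the two instruction sequences can be written out as C.  This
is a literal transcription only, assuming `>>` stands for the logical shift
(lshiftrt) from the RTL above; the function names are hypothetical.  Read
this way, both flag values are true exactly when a >= 0, so in either case
the ideal result is a single sign test.

```c
#include <stdint.h>

/* First function: t1 = ~a; t2 = 255 & (t1 >> 63); flags = t2 != 0.  */
static int seq_three_insns (int64_t a)
{
  uint64_t t1 = ~(uint64_t) a;
  uint64_t t2 = 255 & (t1 >> 63);
  return t2 != 0;
}

/* Second function: t2 = 255 & (a >> 63); flags = t2 == 0.  */
static int seq_two_insns (int64_t a)
{
  uint64_t t2 = 255 & ((uint64_t) a >> 63);
  return t2 == 0;
}
```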

The same can also be seen when artificially forcing the expression to
be a tad more complicated:

void baz (void);

int bar2 (long long int a, long long int a2, int b) {
  if ((a+a2) < 0 || b)
    baz();
  return 0;
}

Here, we also get good code again, simply because combine can have two
attempts at the intermediate expression.

After some amount of tracing combine I've come up with the below patch.
The simplify_comparison function already contains loops that effectively
retry simplification after a change occurred.  But the code that actually
removes a useless subreg is after that loop.  Putting it into the loop as
well fixes the problem.  It can't be removed from the old place because,
between the loop and the subreg removal, make_compound_operation might
change the expression further (though I don't know if that really creates
subregs often).

Combine's normal facility of not doing combinations when intermediate results
are really used outside will take care of not creating useless code.
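
Why the placement of the subreg removal matters can be shown with a toy
model.  This is only a sketch of the control flow, not combine.c: the
struct, the helper names and the two "simplify" functions are invented for
illustration, and real rtx handling is far more involved.

```c
#include <stdbool.h>

/* Toy expression state; this is not GCC's rtx.  */
struct toy_expr
{
  bool has_subreg;    /* a useless lowpart subreg still wraps the operand */
  bool is_sign_test;  /* already recognized as a sign bit test */
};

/* Recognize the sign bit extract; only matches once the subreg is gone.  */
static bool try_sign_test (struct toy_expr *e)
{
  if (e->has_subreg || e->is_sign_test)
    return false;
  e->is_sign_test = true;
  return true;
}

/* Strip the useless subreg, if there is one.  */
static bool try_strip_subreg (struct toy_expr *e)
{
  if (!e->has_subreg)
    return false;
  e->has_subreg = false;
  return true;
}

/* Old shape: retry loop first, subreg removal only once afterwards.  */
static struct toy_expr simplify_old (struct toy_expr e)
{
  while (try_sign_test (&e))
    continue;
  try_strip_subreg (&e);
  return e;
}

/* Patched shape: subreg removal also inside the loop, so the loop retries.  */
static struct toy_expr simplify_new (struct toy_expr e)
{
  for (;;)
    {
      if (try_sign_test (&e))
        continue;
      if (try_strip_subreg (&e))
        continue;
      break;
    }
  try_strip_subreg (&e);  /* still kept after the loop as well */
  return e;
}
```

Starting from an expression that still has the subreg, simplify_old strips
it but never gets back to the sign test, while simplify_new reaches the
fully simplified form in one call.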

Index: combine.c
===================================================================
--- combine.c   (revision 235171)
+++ combine.c   (working copy)
@@ -11891,6 +11891,27 @@ simplify_comparison (enum rtx_code code,
          if (subreg_lowpart_p (op0)
              && GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0))) < mode_width)
            /* Fall through */ ;
+         else if (subreg_lowpart_p (op0)
+                  && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
+                  && GET_MODE_CLASS (GET_MODE (SUBREG_REG (op0))) == MODE_INT
+                  && (code == NE || code == EQ)
+                  && (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0)))
+                      <= HOST_BITS_PER_WIDE_INT)
+                  && !paradoxical_subreg_p (op0)
+                  && (nonzero_bits (SUBREG_REG (op0),
+                                    GET_MODE (SUBREG_REG (op0)))
+                      & ~GET_MODE_MASK (GET_MODE (op0))) == 0)
+           {
+             tem = gen_lowpart (GET_MODE (SUBREG_REG (op0)), op1);
+
+             if ((nonzero_bits (tem, GET_MODE (SUBREG_REG (op0)))
+                  & ~GET_MODE_MASK (GET_MODE (op0))) == 0)
+               {
+                 op0 = SUBREG_REG (op0), op1 = tem;
+                 continue;
+               }
+             break;
+           }
          else
            break;
