[Bug rtl-optimization/98212] [10/11 Regression] X86 unoptimal code for float equallity comparison followed by jump

jakub at gcc dot gnu.org via Gcc-bugs Wed, 09 Dec 2020 11:10:54 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98212


Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, for the corge case, do_compare_rtx_and_jump is called with EQ,
if_false_label non-NULL, if_true_label NULL (i.e. fall through) and prob of
10%.
The target doesn't support it and so x == y is being split into
x ord y && x u== y.
first_prob is computed as 10.9% and prob as 91.74%, and it is expanded as:
if (x unord y) goto if_false_label; // prob of goto 89.1%
if (x u== y) goto dummy; // prob of goto 91.74%
goto if_false_label;
dummy:;
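
The split of x == y into x ord y && x u== y can be checked outside the compiler. The following is a minimal sketch, assuming plain float operands; the helper names ord, uneq and eq_via_split are hypothetical and only loosely model the ORDERED and UNEQ comparison codes.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers, not GCC functions.
// ORDERED: neither operand is a NaN.
inline bool ord(float x, float y) { return !std::isunordered(x, y); }

// UNEQ: the operands are unordered or compare equal.
inline bool uneq(float x, float y) { return std::isunordered(x, y) || x == y; }

// x == y rewritten as the conjunction of the two simpler tests.
inline bool eq_via_split(float x, float y) { return ord(x, y) && uneq(x, y); }

// Asserts that the rewritten form agrees with the built-in comparison.
inline void check_split_matches_eq(float x, float y) {
  assert(eq_via_split(x, y) == (x == y));
}
```

Calling check_split_matches_eq with a NaN operand exercises the first test: ord is false, so the conjunction is false, just like x == y.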

  profile_probability cprob
    = profile_probability::guessed_always ();
  if (first_code == UNORDERED)
    cprob = cprob.apply_scale (1, 100);
  else if (first_code == ORDERED)
    cprob = cprob.apply_scale (99, 100);
  else
    cprob = profile_probability::even ();
  /* We want to split:
     if (x) goto t; // prob;
     into
     if (a) goto t; // first_prob;
     if (b) goto t; // prob;
     such that the overall probability of jumping to t
     remains the same and first_prob is prob * cprob.  */
  if (and_them)
...
      prob = prob.invert ();
      profile_probability first_prob = prob.split (cprob).invert ();
      prob = prob.invert ();


The comment only describes what the !and_them case looks like; for and_them
it is actually:
if (x) goto t; // prob;
goto f;
into:
if (a) goto f; // 1 - first_prob;
if (b) goto t; // prob;
goto f;
(at least for the case where we have both f and t).
prob.split is documented as:
     Split *THIS (ORIG) probability into 2 probabilities, such that
     the returned one (FIRST) is *THIS * CPROB and *THIS is
     adjusted (SECOND) so that FIRST + FIRST.invert () * SECOND
     == ORIG.  This is useful e.g. when splitting a conditional
     branch like:
     if (cond)
       goto lab; // ORIG probability
     into
     if (cond1)
       goto lab; // FIRST = ORIG * CPROB probability
     if (cond2)
       goto lab; // SECOND probability
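
The documented invariant can be sketched with plain doubles. This is only a model, assuming probabilities in [0, 1]; SplitResult, split_prob and recombine are hypothetical names, and the real profile_probability class is not a double.

```cpp
#include <cassert>

// Hypothetical stand-in for the result of a split, using doubles.
struct SplitResult {
  double first;   // FIRST = ORIG * CPROB
  double second;  // SECOND, chosen so FIRST + (1 - FIRST) * SECOND == ORIG
};

inline SplitResult split_prob(double orig, double cprob) {
  assert(orig >= 0.0 && orig <= 1.0 && cprob >= 0.0 && cprob <= 1.0);
  const double first = orig * cprob;
  // Solve first + (1 - first) * second == orig for second;
  // when first is 1 the second branch is never reached, so any value works.
  const double second = first < 1.0 ? (orig - first) / (1.0 - first) : 1.0;
  return {first, second};
}

// Probability of reaching lab through either of the two split branches.
inline double recombine(const SplitResult &s) {
  return s.first + (1.0 - s.first) * s.second;
}
```

For example, split_prob (0.9, 0.99) yields FIRST = 0.891, and recombine returns the original 0.9 up to rounding.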
So, taking baz above, where the probability of jumping to t is 90%, the
computed first_prob is 1-((1-90%)*99%), so 90.1%, which means
if (op0 unord op1) goto f; with 9.9% probability.
For corge prob of jumping to t is 10%, the computed first_prob is
1-((1-10%)*99%), so 10.9% which means
if (op0 unord op1) goto f; // with 89.1% probability.  NaNs are never that
common!
if (op0 uneq op1) goto t; // with 91.74% probability.
I think the right thing for these cases with initial prob < 50% and ORDERED
first_code would be to use something like:
profile_probability first_prob = prob.split (cprob).invert ();
without the two prob = prob.invert (); around it.
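
The arithmetic for both variants can be reproduced with a small model. This is a sketch under the assumption that probabilities behave like doubles; Branches, split_in_place, and_them_current and and_them_no_invert are hypothetical names, not GCC functions.

```cpp
#include <cassert>

// Hypothetical double-based model of the two emitted branches.
struct Branches {
  double goto_f_first;   // probability of "if (a) goto f;", i.e. 1 - first_prob
  double goto_t_second;  // probability attached to "if (b) goto t;"
};

// Models prob.split (cprob): returns FIRST and updates prob to SECOND.
inline double split_in_place(double &prob, double cprob) {
  const double first = prob * cprob;
  if (prob != 1.0)
    prob = (prob - first) / (1.0 - first);
  return first;
}

// The code as it stands: invert, split, invert.
inline Branches and_them_current(double prob, double cprob) {
  assert(prob >= 0.0 && prob <= 1.0);
  prob = 1.0 - prob;
  const double first_prob = 1.0 - split_in_place(prob, cprob);
  prob = 1.0 - prob;
  return {1.0 - first_prob, prob};
}

// The suggested variant: split without the surrounding inversions.
inline Branches and_them_no_invert(double prob, double cprob) {
  assert(prob >= 0.0 && prob <= 1.0);
  const double first_prob = 1.0 - split_in_place(prob, cprob);
  return {1.0 - first_prob, prob};
}
```

With prob = 0.1 and cprob = 0.99 (the corge case), and_them_current gives roughly 0.891 for the unordered branch and 0.9174 for the second branch, matching the figures above, while and_them_no_invert gives roughly 0.099 for the unordered branch.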

With:
--- gcc/dojump.c.jj     2020-12-09 15:11:17.042888002 +0100
+++ gcc/dojump.c        2020-12-09 20:05:59.535234206 +0100
@@ -1148,9 +1148,15 @@ do_compare_rtx_and_jump (rtx op0, rtx op
              if (and_them)
                {
                  rtx_code_label *dest_label;
-                 prob = prob.invert ();
-                 profile_probability first_prob = prob.split (cprob).invert ();
-                 prob = prob.invert ();
+                 profile_probability first_prob;
+                 if (prob < profile_probability::even ())
+                   first_prob = prob.split (cprob).invert ();
+                 else
+                   {
+                     prob = prob.invert ();
+                     first_prob = prob.split (cprob).invert ();
+                     prob = prob.invert ();
+                   }
                  /* If we only jump if true, just bypass the second jump.  */
                  if (! if_false_label)
                    {
I get the output I'm looking for, so bar/baz/qux stay the same and corge
becomes:
corge:  ucomiss %xmm1, %xmm0
        jp      .L14
        je      .L18
.L14:   ret
.L18:   jmp     foo

Honza, what do you think?

[Bug rtl-optimization/98212] [10/11 Regression] X86 unoptimal code for float equallity comparison followed by jump
