On Mon, 20 Mar 2023, Richard Biener wrote:

> On Fri, 17 Mar 2023, Jakub Jelinek wrote:
> 
> > On Fri, Mar 17, 2023 at 02:18:52PM +0000, Richard Biener wrote:
> > > > And as you show on the testcases, it probably isn't a good idea for
> > > > BUILT_IN_EXPECT* either.
> > > > 
> > > > So, perhaps use op_cfn_pass_through_arg1 for the ERF_RETURNS_ARG 
> > > > functions
> > > > and BUILT_IN_EXPECT* ?
> > > 
> > > But that already causes the problems (I didn't yet finish testing
> > > adding RET1 to BUILT_IN_EXPECT*).  Note FRE currently does not use
> > > returns_arg but I have old patches that do - but those replace
> > > uses after a RET1 function with the return value to also reduce
> > > spilling around a call (they of course CSE equal calls).
> > 
> > I meant in your patch drop the builtins.cc hunk and add from your
> > other patch
> > > > +    case CFN_BUILT_IN_EXPECT:
> > > > +    case CFN_BUILT_IN_EXPECT_WITH_PROBABILITY:
> > > > +      m_valid = true;
> > > > +      m_op1 = gimple_call_arg (call, 0);
> > > > +      m_int = &op_cfn_pass_through_arg1;
> > > > +      break;
> > 
> > hunk to gimple_range_op_handler::maybe_builtin_call.
> > Does that already cause the problems?
> 
> Yes, that's basically what the first variant of the patch did and
> the FAILs quoted are from testing that variant.
> 
> There are no extra fails from the second more generic patch also
> touching builtins.cc
> 
> > I mean, if we e.g. see that a range of the argument is singleton,
> > then it is fine to optimize the __builtin_expect away.
> 
> Yes, it's somewhat odd that we handle the - case but not the +:
> 
> @@ -204,6 +206,7 @@
>    _2 = b.0_1 < 0;
>    # RANGE [irange] long int [0, 1] NONZERO 0x1
>    _3 = (long int) _2;
> +  # RANGE [irange] long int [0, 1] NONZERO 0x1
>    _4 = __builtin_expect (_3, 0);
>    if (_4 != 0)
>      goto <bb 5>; [10.00%]
> ...
> @@ -224,13 +228,14 @@
>    goto <bb 3>; [100.00%]
>  
>    <bb 5> [local count: 977105059]:
> -  # _9 = PHI <_4(3), 0(4)>
> +  # RANGE [irange] long int [0, 1] NONZERO 0x1
> +  # _9 = PHI <1(3), 0(4)>
>    if (_9 != 0)
> -    goto <bb 6>; [10.00%]
> +    goto <bb 6>; [50.00%]
>    else
> -    goto <bb 7>; [90.00%]
> +    goto <bb 7>; [50.00%]
> 
> 
>  Predictions for bb 5
> -  first match heuristics: 10.00%
> -  combined heuristics: 10.00%
> -  __builtin_expect heuristics of edge 5->6: 10.00%
> +  no prediction heuristics: 50.00%
> +  combined heuristics: 50.00%
> 
> the dumps don't get any hints on where the first matchor
> __builtin_expect heuristic came from, but it looks like we
> run expr_expected_value_1 on _9 != 0 which recurses for _9
> and runs into the PHI handling which then looks for
> a common value into the PHI.  In this case _4 is said
> to be zero by PRED_BUILTIN_EXPECT and probability 9000.
> But that seems wrong - it looks simply at the __builtin_expect
> def here, not taking into account the edge this flows through
> (the unlikely edge).
> 
> If we look at the recorded edge predictions we instead see
> 
> $28 = {ep_next = 0x43e4c80, ep_edge = <edge 0x7ffff67e0f00 (3 -> 5)>, 
>   ep_predictor = PRED_BUILTIN_EXPECT, ep_probability = 1000}
> 
> so that's the same edge but unlikely now.  Of course there's
> no value recorded for this.  For the other edge into the PHI
> we get
> 
> $31 = {ep_next = 0x43c1e90, ep_edge = <edge 0x7ffff67e0f60 (4 -> 5)>, 
>   ep_predictor = PRED_BUILTIN_EXPECT, ep_probability = 9000}
> 
> so to me a more reasonable approach would be to record '0' from
> the 2nd edge with a probability of 9000 (or for the '+' case IL
> '1' with a probability of 1000).  There's possibly a way to
> combine predictor + value (here we'd simply take the more
> probable value, or the constant for a PHI).  I also see that
> we handle any PHI this way, not just PHIs defined in the same
> BB as the condition which op we're ultimatively interested in.
> 
> So my conclusion is that currently it's pure luck the testcase
> "works", and thus adjusting it and opening a bug with the
> above findings would be appropriate?
> 
> Honza?

I filed PR109210 and updated the patch with adjustment of the
two failing testcases.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

>From aac0ed731c49f74d7f363b80745150943b75bd4c Mon Sep 17 00:00:00 2001
From: Richard Biener <rguent...@suse.de>
Date: Fri, 17 Mar 2023 13:14:49 +0100
Subject: [PATCH] tree-optimization/109170 - bogus use-after-free with
 __builtin_expect
To: gcc-patches@gcc.gnu.org

The following adds a missing range-op for __builtin_expect which
helps -Wuse-after-free to detect the case a realloc original
pointer is used when the result was NULL.  The implementation
should handle all argument one pass-through builtins we handle
in the fnspec machinery.

The gcc.dg/tree-ssa/ssa-lim-21.c testcase needs adjustment because

   for (int j = 0; j < m; j++)
     if (__builtin_expect (m, 0))
       for (int i = 0; i < m; i++)

is now correctly optimized to a unconditional jump by EVRP - m
cannot be zero when the outer loop is entered.  I've adjusted
the outer loop to iterate 'n' times which makes us apply store-motion
to 'count' and 'q->data1' but only out of the inner loop and
as expected not apply store motion to 'q->data' at all.

The gcc.dg/predict-20.c testcase relies on broken behavior of
profile estimation when trying to handle __builtin_expect values
flowing into PHI nodes.  I have opened PR109210 and removed
the expected matching from the testcase.

        PR tree-optimization/109170
        * gimple-range-op.cc (cfn_pass_through_arg1): New.
        (gimple_range_op_handler::maybe_builtin_call): Handle
        __builtin_expect and similar via cfn_pass_through_arg1
        and inspecting the calls fnspec.
        * builtins.cc (builtin_fnspec): Handle BUILT_IN_EXPECT
        and BUILT_IN_EXPECT_WITH_PROBABILITY.

        * gcc.dg/Wuse-after-free-pr109170.c: New testcase.
        * gcc.dg/tree-ssa/ssa-lim-21.c: Adjust.
        * gcc.dg/predict-20.c: Likewise.
---
 gcc/builtins.cc                               |  2 ++
 gcc/gimple-range-op.cc                        | 32 ++++++++++++++++++-
 .../gcc.dg/Wuse-after-free-pr109170.c         | 15 +++++++++
 gcc/testsuite/gcc.dg/predict-20.c             |  3 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c    |  7 ++--
 5 files changed, 54 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/Wuse-after-free-pr109170.c

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 90246e214d6..56545027297 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -11715,6 +11715,8 @@ builtin_fnspec (tree callee)
       case BUILT_IN_RETURN_ADDRESS:
        return ".c";
       case BUILT_IN_ASSUME_ALIGNED:
+      case BUILT_IN_EXPECT:
+      case BUILT_IN_EXPECT_WITH_PROBABILITY:
        return "1cX ";
       /* But posix_memalign stores a pointer into the memory pointed to
         by its first argument.  */
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index a5d625387e7..1a00f1690e5 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "range.h"
 #include "value-query.h"
 #include "gimple-range.h"
+#include "attr-fnspec.h"
 
 // Given stmt S, fill VEC, up to VEC_SIZE elements, with relevant ssa-names
 // on the statement.  For efficiency, it is an error to not pass in enough
@@ -309,6 +310,26 @@ public:
   }
 } op_cfn_constant_p;
 
+// Implement range operator for integral/pointer functions returning
+// the first argument.
+class cfn_pass_through_arg1 : public range_operator
+{
+public:
+  using range_operator::fold_range;
+  virtual bool fold_range (irange &r, tree, const irange &lh,
+                          const irange &, relation_trio) const
+  {
+    r = lh;
+    return true;
+  }
+  virtual bool op1_range (irange &r, tree, const irange &lhs,
+                         const irange &, relation_trio) const
+  {
+    r = lhs;
+    return true;
+  }
+} op_cfn_pass_through_arg1;
+
 // Implement range operator for CFN_BUILT_IN_SIGNBIT.
 class cfn_signbit : public range_operator_float
 {
@@ -967,6 +988,15 @@ gimple_range_op_handler::maybe_builtin_call ()
       break;
 
     default:
-      break;
+      {
+       unsigned arg;
+       if (gimple_call_fnspec (call).returns_arg (&arg) && arg == 0)
+         {
+           m_valid = true;
+           m_op1 = gimple_call_arg (call, 0);
+           m_int = &op_cfn_pass_through_arg1;
+         }
+       break;
+      }
     }
 }
diff --git a/gcc/testsuite/gcc.dg/Wuse-after-free-pr109170.c 
b/gcc/testsuite/gcc.dg/Wuse-after-free-pr109170.c
new file mode 100644
index 00000000000..fa7dc66d66c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wuse-after-free-pr109170.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wuse-after-free" } */
+
+unsigned long bufmax = 0;
+unsigned long __open_catalog_bufmax;
+void *realloc(void *, __SIZE_TYPE__);
+void free(void *);
+
+void __open_catalog(char *buf)
+{
+  char *old_buf = buf;
+  buf = realloc (buf, bufmax);
+  if (__builtin_expect ((buf == ((void *)0)), 0))
+    free (old_buf); /* { dg-bogus "used after" } */
+}
diff --git a/gcc/testsuite/gcc.dg/predict-20.c 
b/gcc/testsuite/gcc.dg/predict-20.c
index 31d01835b80..7bb0d411f88 100644
--- a/gcc/testsuite/gcc.dg/predict-20.c
+++ b/gcc/testsuite/gcc.dg/predict-20.c
@@ -16,8 +16,9 @@ c ()
        break;
     }
   int d = b < 0;
+  /* We fail to apply __builtin_expect heuristics here.  Se PR109210.  */
   if (__builtin_expect (d, 0))
     asm("");
 }
 
-/* { dg-final { scan-tree-dump-times "__builtin_expect heuristics of edge" 3 
"profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "__builtin_expect heuristics of edge" 2 
"profile_estimate" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c
index ffe6f8f699d..fe29e841f28 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c
@@ -17,7 +17,7 @@ void
 func (int m, int n, int k, struct obj *a)
 {
   struct obj *q = a;
-  for (int j = 0; j < m; j++)
+  for (int j = 0; j < n; j++)
     if (__builtin_expect (m, 0))
       for (int i = 0; i < m; i++)
        {
@@ -31,5 +31,6 @@ func (int m, int n, int k, struct obj *a)
        }
 }
 
-/* { dg-final { scan-tree-dump-not "Executing store motion of" "lim2"  }  } */
-
+/* { dg-final { scan-tree-dump "Executing store motion of count from loop 2" 
"lim2"  }  } */
+/* { dg-final { scan-tree-dump "Executing store motion of \[^ \]*data1 from 
loop 2" "lim2"  }  } */
+/* { dg-final { scan-tree-dump-times "Executing store motion of" 2 "lim2"  }  
} */
-- 
2.35.3

Reply via email to