> 
> On Sat, 2 Nov 2024, Sam James wrote:
> 
> > Some references to feed discussion which I had in my logs from
> > discussing it with someone the other week after that interaction we
> > had with alanc:
> > * https://github.com/llvm/llvm-project/issues/28790 (not very
> > insightful, other than to show it has a history of confusing people, but
> > one might argue that for other (definitely-valid) optimisations)
> > * https://github.com/llvm/llvm-project/issues/49826
> > * https://github.com/llvm/llvm-project/issues/51476
> 
> Thank you.
Indeed
> 
> I believe in this situation the onus is on us to prove that such transforms
> are correct.
> 
> Exhibit A:
> 
> POSIX semantics for malloc involve errno.

So if I can check errno to see whether malloc failed, then I guess even
our current behaviour of optimizing away paired malloc+free calls whose
return value is unused is problematic under POSIX, in the same way as
the proposed patch.
> 
> Exhibit B:
> 
> malloc is supposed to give unique pointers (without any intervening free's).
> However, the proposed patch should be powerful enough to transform
> 
> void f()
> {
>     void *p = __builtin_malloc(1);
>     if (p) {
>         asm("");
>         f();
>     }
>     __builtin_free(p);
> }
> 
> to an infinite loop, but an infinite number of unique pointers is impossible.
> 
> In a similar vein, five mallocs, each allocating a quarter of address space,
> also seems impossible.
> 
> I am reminded that we take a pretty strict stance against enabling
> -fno-math-errno -fno-trapping-math, even though they gate a lot of useful
> vectorization. So it's a bit weird we're going so gung-ho with this malloc
> elimination.

The attached patch adds code to track the size of the allocated block
and to disable the transformation when ranger cannot prove the block is
smaller than half of the address space.  We can add the runtime check
discussed earlier on top of that.

I have bootstrap & regtest running (it will still need testsuite
compensation).  However, I am not convinced this is a good direction:
the patch will still break the example where one allocates five blocks
of a quarter of the address space each, leaves them unused, and verifies
that at least one of the allocations fails.

Reading the discussions and thinking of practical use cases, it still
seems to me that portable C code cannot do anything useful with such
probing for malloc failure.  One cannot rely on an allocation succeeding
later just because it succeeded once, since the OS may give the memory
to something else.

For things like the glibc testsuite there is -fno-builtin-malloc, and I
think it should be used (also for other functions we recognize as
builtins), since the tests very likely do not expect the compiler to
change function calls into different ones, such as
printf ("Hello %i", 1) into puts ("Hello 1").

I can add -fno-malloc-dce, but I still think it would make sense to do
such a transformation by default even for C code.

If we decide that malloc DCE is not safe by default, perhaps we want
something like __builtin_optimizable_malloc (to pair with
__builtin_operator_new) to be used in cases where we do not care whether
malloc can fail after allocating, but never using, all of the address
space.  Non-C languages (such as Fortran) may also use it by default.

In the longer term I think we may want to do other optimizations of heap
allocations, such as merging multiple malloc blocks, moving small
allocations to the stack, or increasing alignment (turning malloc into
aligned allocations when it helps vectorization, just as we do with
static variables).

Honza

gcc/ChangeLog:

        * tree-ssa-dce.cc (is_removable_allocation_p): Break out from ...;
        also determine size of block.
        (mark_stmt_if_obviously_necessary): ... here
        (checks_return_value_of_removable_allocation_p): New function.
        (mark_all_reaching_defs_necessary_1): Do not propagate through
        null pointer checks of allocations.
        (propagate_necessity): Likewise.
        (eliminate_unnecessary_stmts): Fold away null pointer checks of
        removed allocations.

diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
index c44eff35b84..82a5b665ce9 100644
--- a/gcc/tree-ssa-dce.cc
+++ b/gcc/tree-ssa-dce.cc
@@ -240,6 +240,114 @@ mark_operand_necessary (tree op)
   worklist.safe_push (stmt);
 }
 
+/* Return true if STMT is a call to an allocation function that can be
+   optimized out if the memory block is never used and only freed.
+
+   If true is returned and size1/size2 are non-NULL, then they are
+   initialized to expressions computing the allocated block size,
+   which is later compared with half of the address space.  */
+
+static bool
+is_removable_allocation_p (gcall *stmt, tree *size1 = NULL, tree *size2 = NULL)
+{
+  tree callee = gimple_call_fndecl (stmt);
+  if (callee != NULL_TREE
+      && fndecl_built_in_p (callee, BUILT_IN_NORMAL))
+    switch (DECL_FUNCTION_CODE (callee))
+      {
+      case BUILT_IN_MALLOC:
+      case BUILT_IN_ALIGNED_ALLOC:
+      CASE_BUILT_IN_ALLOCA:
+      case BUILT_IN_GOMP_ALLOC:
+       if (size1)
+         {
+           *size1 = gimple_call_arg (stmt, 0);
+           *size2 = NULL;
+         }
+       return true;
+      case BUILT_IN_CALLOC:
+       if (size1)
+         {
+           *size1 = gimple_call_arg (stmt, 0);
+           *size2 = gimple_call_arg (stmt, 1);
+         }
+       return true;
+      case BUILT_IN_STRDUP:
+      case BUILT_IN_STRNDUP:
+       if (size1)
+         *size1 = *size2 = NULL_TREE;
+       return true;
+
+      default:;
+      }
+
+  if (callee != NULL_TREE
+      && flag_allocation_dce
+      && gimple_call_from_new_or_delete (stmt)
+      && DECL_IS_REPLACEABLE_OPERATOR_NEW_P (callee))
+    {
+      /* C++ standard allows removal of paired new/delete operators.  */
+      *size1 = *size2 = NULL_TREE;
+      return true;
+    }
+  return false;
+}
+
+/* Return true if STMT is a conditional
+     if (ptr != NULL)
+   where ptr was returned by a removable allocation function
+   and this check can be safely assumed to be true.  */
+
+static bool
+checks_return_value_of_removable_allocation_p (gimple *stmt)
+{
+  gcall *def_stmt;
+  tree size1, size2;
+  if (gimple_code (stmt) == GIMPLE_COND
+      && (gimple_cond_code (stmt) == EQ_EXPR
+         || gimple_cond_code (stmt) == NE_EXPR)
+      && integer_zerop (gimple_cond_rhs (stmt))
+      && TREE_CODE (gimple_cond_lhs (stmt)) == SSA_NAME
+      && (def_stmt = dyn_cast <gcall *>
+                     (SSA_NAME_DEF_STMT (gimple_cond_lhs (stmt))))
+      && is_removable_allocation_p (def_stmt, &size1, &size2))
+   {
+     if (size1 && integer_onep (size1))
+       {
+        size1 = size2;
+        size2 = NULL_TREE;
+       }
+     if (size2 && integer_onep (size2))
+       size2 = NULL_TREE;
+     if (!size1 && !size2)
+       return true;
+     int_range_max vr;
+     if (!get_range_query (cfun)->range_of_expr (vr, size1, stmt)
+        || vr.undefined_p ()
+        || vr.varying_p ())
+       return false;
+     if (!size2)
+       return wi::leu_p (vr.upper_bound (),
+                         wi::to_wide (TYPE_MAX_VALUE (ptrdiff_type_node)));
+     else
+       {
+        int_range_max vr2;
+        if (!get_range_query (cfun)->range_of_expr (vr2, size2, stmt)
+            || vr2.undefined_p ()
+            || vr2.varying_p ())
+          return false;
+       wi::overflow_type overflow;
+       wide_int size = wi::umul (vr.upper_bound (), vr2.upper_bound (), &overflow);
+       if (overflow)
+         return false;
+       return wi::leu_p (size,
+                         wi::to_wide (TYPE_MAX_VALUE (ptrdiff_type_node)));
+       }
+     return true;
+   }
+  return false;
+}
+
 
 /* Mark STMT as necessary if it obviously is.  Add it to the worklist if
    it can make other statements necessary.
@@ -271,38 +379,23 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
 
     case GIMPLE_CALL:
       {
+       gcall *call = as_a <gcall *> (stmt);
+
        /* Never elide a noreturn call we pruned control-flow for.  */
-       if ((gimple_call_flags (stmt) & ECF_NORETURN)
-           && gimple_call_ctrl_altering_p (stmt))
+       if ((gimple_call_flags (call) & ECF_NORETURN)
+           && gimple_call_ctrl_altering_p (call))
          {
-           mark_stmt_necessary (stmt, true);
+           mark_stmt_necessary (call, true);
            return;
          }
 
-       tree callee = gimple_call_fndecl (stmt);
-       if (callee != NULL_TREE
-           && fndecl_built_in_p (callee, BUILT_IN_NORMAL))
-         switch (DECL_FUNCTION_CODE (callee))
-           {
-           case BUILT_IN_MALLOC:
-           case BUILT_IN_ALIGNED_ALLOC:
-           case BUILT_IN_CALLOC:
-           CASE_BUILT_IN_ALLOCA:
-           case BUILT_IN_STRDUP:
-           case BUILT_IN_STRNDUP:
-           case BUILT_IN_GOMP_ALLOC:
-             return;
-
-           default:;
-           }
 
-       if (callee != NULL_TREE
-           && flag_allocation_dce
-           && DECL_IS_REPLACEABLE_OPERATOR_NEW_P (callee))
+       if (is_removable_allocation_p (call))
          return;
 
+
        /* For __cxa_atexit calls, don't mark as necessary right away. */
-       if (is_removable_cxa_atexit_call (stmt))
+       if (is_removable_cxa_atexit_call (call))
          return;
 
        /* IFN_GOACC_LOOP calls are necessary in that they are used to
@@ -311,9 +404,9 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
           survive from aggressive loop removal for it has loop exit and
           is assumed to be finite.  Therefore, we need to explicitly mark
           these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
-       if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
+       if (gimple_call_internal_p (call, IFN_GOACC_LOOP))
          {
-           mark_stmt_necessary (stmt, true);
+           mark_stmt_necessary (call, true);
            return;
          }
        break;
@@ -667,6 +760,8 @@ mark_all_reaching_defs_necessary_1 (ao_ref *ref ATTRIBUTE_UNUSED,
          case BUILT_IN_ALIGNED_ALLOC:
          case BUILT_IN_CALLOC:
          CASE_BUILT_IN_ALLOCA:
+         case BUILT_IN_STRDUP:
+         case BUILT_IN_STRNDUP:
          case BUILT_IN_FREE:
          case BUILT_IN_GOMP_ALLOC:
          case BUILT_IN_GOMP_FREE:
@@ -891,19 +986,11 @@ propagate_necessity (bool aggressive)
            {
              tree ptr = gimple_call_arg (stmt, 0);
              gcall *def_stmt;
-             tree def_callee;
              /* If the pointer we free is defined by an allocation
                 function do not add the call to the worklist.  */
              if (TREE_CODE (ptr) == SSA_NAME
                  && (def_stmt = dyn_cast <gcall *> (SSA_NAME_DEF_STMT (ptr)))
-                 && (def_callee = gimple_call_fndecl (def_stmt))
-                 && ((DECL_BUILT_IN_CLASS (def_callee) == BUILT_IN_NORMAL
-                      && (DECL_FUNCTION_CODE (def_callee) == BUILT_IN_ALIGNED_ALLOC
-                          || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_MALLOC
-                          || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_CALLOC
-                          || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_GOMP_ALLOC))
-                     || (DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee)
-                         && gimple_call_from_new_or_delete (def_stmt))))
+                 && is_removable_allocation_p (def_stmt))
                {
                  if (is_delete_operator
                      && !valid_new_delete_pair_p (def_stmt, stmt))
@@ -925,6 +1012,9 @@ propagate_necessity (bool aggressive)
                }
            }
 
+         if (checks_return_value_of_removable_allocation_p (stmt))
+           continue;
+
          FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
            mark_operand_necessary (use);
 
@@ -1379,7 +1469,6 @@ eliminate_unnecessary_stmts (bool aggressive)
   basic_block bb;
   gimple_stmt_iterator gsi, psi;
   gimple *stmt;
-  tree call;
   auto_vec<edge> to_remove_edges;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -1448,6 +1537,23 @@ eliminate_unnecessary_stmts (bool aggressive)
                    gimple_set_plf (stmt, STMT_NECESSARY, false);
                }
            }
+         /* A conditional checking that the return value of an allocation
+            is non-NULL can be turned into a constant if the allocation
+            itself is unnecessary.  */
+         if (gimple_plf (stmt, STMT_NECESSARY)
+             && gimple_code (stmt) == GIMPLE_COND
+             && TREE_CODE (gimple_cond_lhs (stmt)) == SSA_NAME)
+           {
+             gimple *def_stmt = SSA_NAME_DEF_STMT (gimple_cond_lhs (stmt));
+             if (!gimple_nop_p (def_stmt)
+                 && !gimple_plf (def_stmt, STMT_NECESSARY))
+               {
+                 gcc_checking_assert
+                       (checks_return_value_of_removable_allocation_p (stmt));
+                 gimple_cond_set_lhs (as_a <gcond *>(stmt), integer_one_node);
+                 update_stmt (stmt);
+               }
+           }
 
          /* If GSI is not necessary then remove it.  */
          if (!gimple_plf (stmt, STMT_NECESSARY))
@@ -1482,11 +1588,11 @@ eliminate_unnecessary_stmts (bool aggressive)
              remove_dead_stmt (&gsi, bb, to_remove_edges);
              continue;
            }
-         else if (is_gimple_call (stmt))
+         else if (gcall *call_stmt = dyn_cast <gcall *> (stmt))
            {
-             tree name = gimple_call_lhs (stmt);
+             tree name = gimple_call_lhs (call_stmt);
 
-             notice_special_calls (as_a <gcall *> (stmt));
+             notice_special_calls (call_stmt);
 
              /* When LHS of var = call (); is dead, simplify it into
                 call (); saving one operand.  */
@@ -1496,36 +1602,30 @@ eliminate_unnecessary_stmts (bool aggressive)
                  /* Avoid doing so for allocation calls which we
                     did not mark as necessary, it will confuse the
                     special logic we apply to malloc/free pair removal.  */
-                 && (!(call = gimple_call_fndecl (stmt))
-                     || ((DECL_BUILT_IN_CLASS (call) != BUILT_IN_NORMAL
-                          || (DECL_FUNCTION_CODE (call) != BUILT_IN_ALIGNED_ALLOC
-                              && DECL_FUNCTION_CODE (call) != BUILT_IN_MALLOC
-                              && DECL_FUNCTION_CODE (call) != BUILT_IN_CALLOC
-                              && !ALLOCA_FUNCTION_CODE_P
-                              (DECL_FUNCTION_CODE (call))))
-                         && !DECL_IS_REPLACEABLE_OPERATOR_NEW_P (call))))
+                 && !is_removable_allocation_p (call_stmt))
                {
                  something_changed = true;
                  if (dump_file && (dump_flags & TDF_DETAILS))
                    {
                      fprintf (dump_file, "Deleting LHS of call: ");
-                     print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+                     print_gimple_stmt (dump_file, call_stmt, 0, TDF_SLIM);
                      fprintf (dump_file, "\n");
                    }
 
-                 gimple_call_set_lhs (stmt, NULL_TREE);
-                 maybe_clean_or_replace_eh_stmt (stmt, stmt);
-                 update_stmt (stmt);
+                 gimple_call_set_lhs (call_stmt, NULL_TREE);
+                 maybe_clean_or_replace_eh_stmt (call_stmt, call_stmt);
+                 update_stmt (call_stmt);
                  release_ssa_name (name);
 
                  /* GOMP_SIMD_LANE (unless three argument) or ASAN_POISON
                     without lhs is not needed.  */
-                 if (gimple_call_internal_p (stmt))
-                   switch (gimple_call_internal_fn (stmt))
+                 if (gimple_call_internal_p (call_stmt))
+                   switch (gimple_call_internal_fn (call_stmt))
                      {
                      case IFN_GOMP_SIMD_LANE:
-                       if (gimple_call_num_args (stmt) >= 3
-                           && !integer_nonzerop (gimple_call_arg (stmt, 2)))
+                       if (gimple_call_num_args (call_stmt) >= 3
+                           && !integer_nonzerop
+                                       (gimple_call_arg (call_stmt, 2)))
                          break;
                        /* FALLTHRU */
                      case IFN_ASAN_POISON:
@@ -1535,8 +1635,8 @@ eliminate_unnecessary_stmts (bool aggressive)
                        break;
                      }
                }
-             else if (gimple_call_internal_p (stmt))
-               switch (gimple_call_internal_fn (stmt))
+             else if (gimple_call_internal_p (call_stmt))
+               switch (gimple_call_internal_fn (call_stmt))
                  {
                  case IFN_ADD_OVERFLOW:
                    maybe_optimize_arith_overflow (&gsi, PLUS_EXPR);
@@ -1548,11 +1648,11 @@ eliminate_unnecessary_stmts (bool aggressive)
                    maybe_optimize_arith_overflow (&gsi, MULT_EXPR);
                    break;
                  case IFN_UADDC:
-                   if (integer_zerop (gimple_call_arg (stmt, 2)))
+                   if (integer_zerop (gimple_call_arg (call_stmt, 2)))
                      maybe_optimize_arith_overflow (&gsi, PLUS_EXPR);
                    break;
                  case IFN_USUBC:
-                   if (integer_zerop (gimple_call_arg (stmt, 2)))
+                   if (integer_zerop (gimple_call_arg (call_stmt, 2)))
                      maybe_optimize_arith_overflow (&gsi, MINUS_EXPR);
                    break;
                  default:
