On Mon, Oct 21, 2024 at 6:04 PM Andrew Pinski <pins...@gmail.com> wrote:
>
> On Mon, Oct 21, 2024 at 3:41 AM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Thu, Oct 17, 2024 at 4:43 AM Andrew Pinski <quic_apin...@quicinc.com> 
> > wrote:
> > >
> > > After fixing loop-im to do the correct overflow rewriting
> > > for pointer types too. We end up with code like:
> > > ```
> > >   _9 = (unsigned long) &g;
> > >   _84 = _9 + 18446744073709551615;
> > >   _11 = _42 + _84;
> > >   _44 = (signed char *) _11;
> > > ...
> > >   *_44 = 10;
> > >   g ={v} {CLOBBER(eos)};
> > > ...
> > >   n[0] = &f;
> > >   *_44 = 8;
> > >   g ={v} {CLOBBER(eos)};
> > > ```
> > > Which was not being recongized by the scope conflicts code.
> > > This was because it only handled one level walk backs rather than 
> > > multiple ones.
> > > This fixes it by using a work_list to avoid huge recursion and a visited 
> > > bitmap to avoid
> > > going into an infinite loops when dealing with loops.
> > > Adds a cache for the addr_expr's that are associated with each ssa name. 
> > > I found 2 element
> > > cache was the decent trade off for size and speed.  Most ssa names will 
> > > have only
> > > one address associated with it but there are times (phis) where 2 or more 
> > > will
> > > be there. But 2 is the common case for most if statements.
> >
> > Hmm.  I was hoping that, since we walk BBs in RPO order in 
> > add_scope_conflicts,
> > we can simply populate the cache in the first forward walk of a BBs
> > stmts without
> > needing to care for cycles or using of a worklist.  We even get the
> > set of ADDR_EXPRs
> > in the stmt we visit for free, we just have to associate that set with
> > the SSA def(s).
> > I'd have used a SSA name -> bitmap mapping for simplicity.  For
> > backedges the iteration
> > should get things corrected, so we should be able to use the "cache"
> > as conservative
> > answer even for those.
>
> The problem is if we had a phi of say t_3 = <&a, &b, &c> .
> At this point the cache is a fixed size cache, and only holds 2
> elements (maybe changing it to be variable would be better) so
> sometimes the cache is invalidated due to needing to hold more than 2
> elements.

As said I'd use a bitmap of the ADDR_EXPR uids in the cache.  The
bitmaps can be shared (just wipe their obstack at the end).

> And you would need the worklist and the bitmap visited for those
> cases. This should not happen that often but could with switches.
> If we change to use a variable sized cache, then it shouldn't happen
> due to the walk as you described. Let me change it to a variable sized
> cache.
>
> Thanks,
> Andrew
>
> >
> > Richard.
> >
> > > gcc/ChangeLog:
> > >
> > >         PR middle-end/111422
> > >         * cfgexpand.cc: Define INCLUDE_STRING if ADDR_WALKER_STATS
> > >         is defined.
> > >         (class addr_ssa_walker): New class.
> > >         (add_scope_conflicts_2): Rename to ...
> > >         (addr_ssa_walker::operator()): This and rewrite to be a full walk
> > >         of all operands and their uses and use a cache.
> > >         (add_scope_conflicts_1): Add walker new argument for the addr 
> > > cache.
> > >         Just walk the phi result since that will include all addr_exprs.
> > >         Change call to add_scope_conflicts_2 to walker.
> > >         (add_scope_conflicts): Add walker variable and update call to
> > >         add_scope_conflicts_1.
> > >
> > > Signed-off-by: Andrew Pinski <quic_apin...@quicinc.com>
> > > ---
> > >  gcc/cfgexpand.cc | 207 ++++++++++++++++++++++++++++++++++++++++-------
> > >  1 file changed, 176 insertions(+), 31 deletions(-)
> > >
> > > diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> > > index 6c1096363af..74f4cfc0f22 100644
> > > --- a/gcc/cfgexpand.cc
> > > +++ b/gcc/cfgexpand.cc
> > > @@ -17,6 +17,9 @@ You should have received a copy of the GNU General 
> > > Public License
> > >  along with GCC; see the file COPYING3.  If not see
> > >  <http://www.gnu.org/licenses/>.  */
> > >
> > > +#ifdef ADDR_WALKER_STATS
> > > +#define INCLUDE_STRING
> > > +#endif
> > >  #include "config.h"
> > >  #include "system.h"
> > >  #include "coretypes.h"
> > > @@ -571,35 +574,175 @@ visit_conflict (gimple *, tree op, tree, void 
> > > *data)
> > >    return false;
> > >  }
> > >
> > > -/* Helper function for add_scope_conflicts_1.  For USE on
> > > -   a stmt, if it is a SSA_NAME and in its SSA_NAME_DEF_STMT is known to 
> > > be
> > > -   based on some ADDR_EXPR, invoke VISIT on that ADDR_EXPR.  */
> > > +namespace {
> > >
> > > -static inline void
> > > -add_scope_conflicts_2 (tree use, bitmap work,
> > > -                      walk_stmt_load_store_addr_fn visit)
> > > +class addr_ssa_walker
> > > +{
> > > +private:
> > > +  struct addr_cache
> > > +  {
> > > +  private:
> > > +    unsigned elems = 0;
> > > +    static constexpr unsigned maxelements = 2;
> > > +    bool visited = false;
> > > +    tree cached[maxelements] = {};
> > > +  public:
> > > +    /* Returns true if the cache is valid. */
> > > +    operator bool ()
> > > +    {
> > > +      return visited && elems <= maxelements;
> > > +    }
> > > +    /* Mark as visited. The cache might be invalidated
> > > +       by adding too many elements though. */
> > > +    void visit () { visited = true; }
> > > +    /* Iterator over the cached values. */
> > > +    tree *begin () { return &cached[0]; }
> > > +    tree *end ()
> > > +    {
> > > +      /* If there was too many elements, then there are
> > > +        nothing to vist in the cache. */
> > > +      if (elems > maxelements)
> > > +       return &cached[0];
> > > +      return &cached[elems];
> > > +    }
> > > +    /* Add ADDR_EXPR to the cache if it is not there already. */
> > > +    void add (tree addr)
> > > +    {
> > > +      if (elems > maxelements)
> > > +       {
> > > +         statistics_counter_event (cfun, "addr_walker already overflow", 
> > > 1);
> > > +         return;
> > > +       }
> > > +      /* Skip if the cache already contains the addr_expr. */
> > > +      for(tree other : *this)
> > > +       if (operand_equal_p (other, addr))
> > > +         return;
> > > +      elems++;
> > > +      /* Record that the cache overflowed. */
> > > +      if (elems > maxelements)
> > > +       {
> > > +         statistics_counter_event (cfun, "addr_walker overflow", 1);
> > > +         return;
> > > +       }
> > > +      cached[elems - 1] = addr;
> > > +    }
> > > +  };
> > > +public:
> > > +  addr_ssa_walker () : cache (new addr_cache[num_ssa_names]{}) { }
> > > +  ~addr_ssa_walker (){ delete[] cache; }
> > > +
> > > +  /* Walk the name and its defining statement,
> > > +     call func with for addr_expr's. */
> > > +  template<typename T>
> > > +  void operator ()(tree name, T func);
> > > +
> > > +private:
> > > +
> > > +  /* Cannot create a copy. */
> > > +  addr_ssa_walker (const addr_ssa_walker &) = delete;
> > > +  addr_ssa_walker (addr_ssa_walker &&) = delete;
> > > +  /* Return the cache entries for a SSA NAME. */
> > > +  addr_cache &operator[] (tree name)
> > > +  {
> > > +    return cache[SSA_NAME_VERSION (name)];
> > > +  }
> > > +
> > > +  addr_cache *cache;
> > > +};
> > > +
> > > +/* Walk backwards on the defining statements of NAME
> > > +   and call FUNC on the addr_expr. Use the cache for
> > > +   the SSA name if possible.  */
> > > +
> > > +template<typename T>
> > > +void
> > > +addr_ssa_walker::operator() (tree name, T func)
> > >  {
> > > -  if (TREE_CODE (use) == SSA_NAME
> > > -      && (POINTER_TYPE_P (TREE_TYPE (use))
> > > -         || INTEGRAL_TYPE_P (TREE_TYPE (use))))
> > > +  gcc_assert (TREE_CODE (name) == SSA_NAME);
> > > +  auto_vec<std::pair<tree,tree>, 4> work_list;
> > > +  auto_bitmap visited_ssa_names;
> > > +  work_list.safe_push (std::make_pair (name, name));
> > > +
> > > +#ifdef ADDR_WALKER_STATS
> > > +  unsigned process_list = 0;
> > > +#endif
> > > +
> > > +  while (!work_list.is_empty())
> > >      {
> > > +      auto work = work_list.pop();
> > > +      tree use = work.first;
> > > +      tree old_name = work.second;
> > > +#ifdef ADDR_WALKER_STATS
> > > +      process_list++;
> > > +#endif
> > > +
> > > +      if (!use)
> > > +       continue;
> > > +      /* For addr_expr's call the function and update the cache. */
> > > +      if (TREE_CODE (use) == ADDR_EXPR)
> > > +       {
> > > +         func (use);
> > > +         (*this)[old_name].add (use);
> > > +         continue;
> > > +       }
> > > +      /* Ignore all non SSA names. */
> > > +      if (TREE_CODE (use) != SSA_NAME)
> > > +       continue;
> > > +
> > > +      /* Only Pointers and integral types are used to track addresses.  
> > > */
> > > +      if (!POINTER_TYPE_P (TREE_TYPE (use))
> > > +         && !INTEGRAL_TYPE_P (TREE_TYPE (use)))
> > > +       continue;
> > > +
> > > +      /* Check the cache, if there is a hit use it.  */
> > > +      if ((*this)[use])
> > > +       {
> > > +         statistics_counter_event (cfun, "addr_walker cache hit", 1);
> > > +         /* Call the function and update the cache.  */
> > > +         for (tree naddr : (*this)[use])
> > > +           {
> > > +             (*this)[old_name].add (naddr);
> > > +             func (naddr);
> > > +           }
> > > +         continue;
> > > +       }
> > >        gimple *g = SSA_NAME_DEF_STMT (use);
> > > +      /* Mark the use as being visited so even if the cache is empty,
> > > +        there is no reason to walk the names backwards. */
> > > +      (*this)[use].visit();
> > > +      /* Skip the name if already visted this time.  */
> > > +      if (!bitmap_set_bit (visited_ssa_names, SSA_NAME_VERSION (use)))
> > > +       continue;
> > > +      /* Need to update the old_name afterwards add it to the work list,
> > > +        this will either hit the cache or the check bitmap will skip it
> > > +        if there was too many names associated with the cache. */
> > > +      work_list.safe_push (work);
> > > +
> > > +      /* For assign statements, add each operand to the work list.
> > > +        Note operand 0 is the same as the use here so there is nothing
> > > +        to be done.  */
> > >        if (gassign *a = dyn_cast <gassign *> (g))
> > >         {
> > > -         if (tree op = gimple_assign_rhs1 (a))
> > > -           if (TREE_CODE (op) == ADDR_EXPR)
> > > -             visit (a, TREE_OPERAND (op, 0), op, work);
> > > +         for (unsigned i = 1; i < gimple_num_ops (g); i++)
> > > +           work_list.safe_push (std::make_pair (gimple_op (a, i), use));
> > >         }
> > > +      /* PHI nodes, add each argument to the work list.  */
> > >        else if (gphi *p = dyn_cast <gphi *> (g))
> > >         for (unsigned i = 0; i < gimple_phi_num_args (p); ++i)
> > > -         if (TREE_CODE (use = gimple_phi_arg_def (p, i)) == SSA_NAME)
> > > -           if (gassign *a = dyn_cast <gassign *> (SSA_NAME_DEF_STMT 
> > > (use)))
> > > -             {
> > > -               if (tree op = gimple_assign_rhs1 (a))
> > > -                 if (TREE_CODE (op) == ADDR_EXPR)
> > > -                   visit (a, TREE_OPERAND (op, 0), op, work);
> > > -             }
> > > -    }
> > > +         work_list.safe_push (std::make_pair (gimple_phi_arg_def (p, i), 
> > > use));
> > > +      /* All other kind of statements assume they can't contribute to an 
> > > address
> > > +        being alive.  */
> > > +    }
> > > +  /* This stat is here to see how long is the longest walk,
> > > +     it is not useful stat except when tuning
> > > +     addr_ssa_walker::addr_cache::maxelements.  */
> > > +#ifdef ADDR_WALKER_STATS
> > > +  statistics_counter_event (cfun,
> > > +                           ("addr_walker process " + std::to_string 
> > > (process_list)).c_str (),
> > > +                           1);
> > > +#endif
> > > +}
> > > +
> > >  }
> > >
> > >  /* Helper routine for add_scope_conflicts, calculating the active 
> > > partitions
> > > @@ -608,7 +751,8 @@ add_scope_conflicts_2 (tree use, bitmap work,
> > >     liveness.  */
> > >
> > >  static void
> > > -add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict)
> > > +add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict,
> > > +                      addr_ssa_walker &walker)
> > >  {
> > >    edge e;
> > >    edge_iterator ei;
> > > @@ -623,14 +767,14 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, 
> > > bool for_conflict)
> > >
> > >    visit = visit_op;
> > >
> > > +  auto addrvisitor = [&visit,&work](tree addr) {
> > > +                  gcc_assert (TREE_CODE (addr) == ADDR_EXPR);
> > > +                  visit (nullptr, TREE_OPERAND (addr, 0), addr, work);
> > > +                };
> > > +
> > > +  /* Walk over the phi for the incoming addresses to be alive.  */
> > >    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > > -    {
> > > -      gimple *stmt = gsi_stmt (gsi);
> > > -      gphi *phi = as_a <gphi *> (stmt);
> > > -      walk_stmt_load_store_addr_ops (stmt, work, NULL, NULL, visit);
> > > -      FOR_EACH_PHI_ARG (use_p, phi, iter, SSA_OP_USE)
> > > -       add_scope_conflicts_2 (USE_FROM_PTR (use_p), work, visit);
> > > -    }
> > > +    walker (gimple_phi_result (gsi_stmt (gsi)), addrvisitor);
> > >    for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > >      {
> > >        gimple *stmt = gsi_stmt (gsi);
> > > @@ -676,7 +820,7 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, 
> > > bool for_conflict)
> > >             }
> > >           walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
> > >           FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
> > > -           add_scope_conflicts_2 (USE_FROM_PTR (use_p), work, visit);
> > > +           walker (USE_FROM_PTR (use_p), addrvisitor);
> > >         }
> > >      }
> > >
> > > @@ -707,6 +851,7 @@ add_scope_conflicts (void)
> > >    bitmap work = BITMAP_ALLOC (NULL);
> > >    int *rpo;
> > >    int n_bbs;
> > > +  addr_ssa_walker walker;
> > >
> > >    /* We approximate the live range of a stack variable by taking the 
> > > first
> > >       mention of its name as starting point(s), and by the end-of-scope
> > > @@ -734,14 +879,14 @@ add_scope_conflicts (void)
> > >           bitmap active;
> > >           bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]);
> > >           active = (bitmap)bb->aux;
> > > -         add_scope_conflicts_1 (bb, work, false);
> > > +         add_scope_conflicts_1 (bb, work, false, walker);
> > >           if (bitmap_ior_into (active, work))
> > >             changed = true;
> > >         }
> > >      }
> > >
> > >    FOR_EACH_BB_FN (bb, cfun)
> > > -    add_scope_conflicts_1 (bb, work, true);
> > > +    add_scope_conflicts_1 (bb, work, true, walker);
> > >
> > >    free (rpo);
> > >    BITMAP_FREE (work);
> > > --
> > > 2.34.1
> > >

Reply via email to