104288 - Update non-null interface to provide better precision.

Richard Biener via Gcc-patches Tue, 08 Feb 2022 00:25:50 -0800

On Mon, Feb 7, 2022 at 3:33 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> This changes the non-null interface to provide more granularity and
> facilitate more precise queries.
>
> Previously, we could only ask if NAME was nonnull anywhere in the block,
> and it applied to the entire block.  now we can ask:
>    bool always_nonnull_p (tree name, basic_block bb);    // all uses are
> nonnull in the block
>    bool contains_nonnull_p (tree name, basic_block bb); // there is a
> nonnull use somewhere in the block.


How is the latter useful?  To me it seems that

  bool nonnull_at_p (tree name, gimple *stmt);
// 'name' is nonnull at the execution of STMT

would be (or nonnull_from_p with starting with STMT), nonnull_from_p could
be overloaded with basic_block even.

I see the patch uses contains_nonnull_p for range_on_exit.  In the previous
patch I commented that for

  <bb5>:
     *p = foo (*q);  // fallthru + EH

on the fallthru edge we know p and q are not NULL but on the EH edge
we only know q is not NULL.  [with -fnon-call-exceptions we have a
representational issue since the *q load might trap but we cannot currenty
separate that EH on GIMPLE]

So given that wouldn't we instead need range_on_edge () instead of
range_on_exit?  At least contains_nonnull_p cannot be used for
range_on_exit in case we derive nonnull from *p?

Btw, for -fnon-call-exceptions all ranges derived from the possibly throwing
stmt are not realized on the exceptional path.  Like for

 <bb6>:
    _1 = _3 / _5;  // fallthru + EH

on the fallthru edge we have _5 ~[0,0] but on the EH edge we do not
(in fact there we have _5 equals to [0,0] in this specific case).

Short-term it might be neccessary to not derive ranges from a
stmt that throws internally but it would be nice to sort those issues
out?

> Ranger was using this query for on-entry and range_of_expr within
> block.  Range-on-entry was simply wrong.. it should have been asking if
> non-null was true in the dominator of this block.   And this PR
> demonstrate the issue with range_of_expr.
>
> This patch changes all the places which checked the state of non-null.
> An Audit of every use now:
>
> ranger_cache::exit_range - Applies non-null if contains_nonnull (bb) is true
> ranger_cache::range_of_expr -  Applies non-null only if always_nonnull (bb).
> ranger_cache::fill_block_cache - marks a block to have on-entry
> calculated if predecessor contains_nonnull.
> ranger_cache::range_from_dom - Applies non-null if contains_nonnull() is
> true in a visited dominator block.
>
> gimple_ranger::range_of_expr - Applies non-null only if always_nonnull
> in the block.
> gimple_ranger::range_on_entry - Checks if any dominator block
> contains_nonnull.
> gimple_range::range_on_exit - Calls one of the above 2 routines, then
> checks if contains_nonull in this block
>
> gimple_range_path has 2 uses in range_defined_in_block() and  and
> adjust_for_non_null_uses.  Aldy audited both, and as the values are used
> at the end of the path only, so both are correct in using
> contains_nonull.  So there should be no impact to threading from this.
>
> This patch allows us to properly optimize:
>
> void h(int result) __attribute__((noipa));
> void h(int result)
> {
>      if (result)
>          __builtin_exit(0);
> }
>
> void foo (void *p) __attribute__((nonnull(1)));
>
> void bar (void *p)
> {
>    h (p == 0);
>    foo (p);
>    if (!p)
>      __builtin_abort ();
> }
>
> to
>
>    _1 = p_3(D) == 0B;
>    _2 = (int) _1;
>    h (_2);
>    foo (p_3(D));
>    return;
>
>
>  From a risk perspective, I don't think this patch can make us more
> incorrect than we were before.
>
> Oh, and compilation speed. All the patches combined with checking a
> little less than 1% regression overall in EVRP/VRP.  The audit exposed
> that the threader was invoking an unnecessary DOM lookup version of the
> query, and as a result, we see a 1.3% reduction in the time spent in the
> threader.  the overall compile time impact turned out to be a slight
> improvement of 0.02% overall in 380 GCC files (according to valgrind)
> so its basically all good.
>
> Bootstraps with no regressions.  OK for trunk?
>
> Andrew

Re: [PATCH 3/4] tree-optimization/104288 - Update non-null interface to provide better precision.

Reply via email to