On Wed, Feb 15, 2023 at 6:07 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> This patch implements the suggestion that we have an alternative
> ssa-cache which does not zero memory, and instead uses a bitmap to track
> whether a value is currently set or not.  It roughly mimics what
> path_range_query was doing internally.
>
> For sparsely used cases, expecially in large programs, this is more
> efficient.  I changed path_range_query to use this, and removed it old
> bitmap (and a hack or two around PHI calculations), and also utilized
> this is the assume_query class.
>
> Performance wise, the patch doesn't affect VRP (since that still uses
> the original version).  Switching to the lazy version caused a slowdown
> of 2.5% across VRP.
>
> There was a noticeable improvement elsewhere.,  across 230 GCC source
> files, threading ran over 12% faster!.  Overall compilation improved by
> 0.3%  Not sure it makes much difference in compiler.i, but it shouldn't
> hurt.
>
> bootstraps on x86_64-pc-linux-gnu with no regressions.   OK for trunk?
> or do you want to wait for the next release...

I see

@@ -365,16 +335,8 @@ path_range_query::compute_ranges_in_phis (basic_block bb)

       Value_Range r (TREE_TYPE (name));
       if (range_defined_in_block (r, name, bb))
-       {
-         unsigned v = SSA_NAME_VERSION (name);
-         set_cache (r, name);
-         bitmap_set_bit (phi_set, v);
-         // Pretend we don't have a cache entry for this name until
-         // we're done with all PHIs.
-         bitmap_clear_bit (m_has_cache_entry, v);
-       }
+       m_cache.set_global_range (name, r);
     }
-  bitmap_ior_into (m_has_cache_entry, phi_set);
 }

 // Return TRUE if relations may be invalidated after crossing edge E.

which I think is not correct - if we have

 # _1 = PHI <..., _2>
 # _2 = PHI <..., _1>

then their effects are supposed to be executed in parallel, that is,
both PHI argument _2 and _1 are supposed to see the "old" version.
The previous code tried to make sure the range of the new _1 doesn't
get seen when processing the argument _1 in the definition of _2.

The new version drops this, possibly resulting in wrong-code.

While I think it's appropriate to sort out compile-time issues like this
during stage4 at least the above makes me think it should be defered
to next stage1.

Richard.

>
> Andrew

Reply via email to