On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote:
> The following avoids repeatedly turning VALUE RTXen into
> sth useful and re-applying a constant offset through get_addr
> via DSE check_mem_read_rtx.  Instead perform this once for
> all stores to be visited in check_mem_read_rtx.  This avoids
> allocating 1.6GB of garbage PLUS RTXen on the PR80960
> testcase, fixing the memory usage regression from old GCC.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
> 
> Thanks,
> Richard.
> 
> 2021-01-27  Richard Biener  <rguent...@suse.de>
> 
>       PR rtl-optimization/80960
>       * dse.c (check_mem_read_rtx): Call get_addr on the
>       offsetted address.
> ---
>  gcc/dse.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/gcc/dse.c b/gcc/dse.c
> index c88587e7d94..da0df54a2dd 100644
> --- a/gcc/dse.c
> +++ b/gcc/dse.c
> @@ -2219,6 +2219,11 @@ check_mem_read_rtx (rtx *loc, bb_info_t bb_info)
>      }
>    if (maybe_ne (offset, 0))
>      mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
> +  /* Avoid passing VALUE RTXen as mem_addr to canon_true_dependence
> +     which will over and over re-create proper RTL and re-apply the
> +     offset above.  See PR80960 where we almost allocate 1.6GB of PLUS
> +     RTXen that way.  */
> +  mem_addr = get_addr (mem_addr);
>  
>    if (group_id >= 0)
>      {

Does that result in any change in how much DSE optimizes?
I mean, if you do 2 bootstraps/regtests, one with this patch and another one
without it, and at the end of rest_of_handle_dse dump locally_deleted and
globally_deleted for each CU/function, do you get the same counts, except
perhaps for dse.c?

        Jakub
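
One way to carry out the comparison suggested above is to sum the deletion
counters from the dumps of each build and check that the totals agree.  The
sketch below is only an illustration: it assumes each function's summary is
printed on one line of the form
"dse: local deletions = N, global deletions = M" (an assumed format, adjust
the pattern to whatever the dump really prints), and the names dse_totals,
dse_add_line, dse_sum_file and dse_totals_equal are hypothetical helpers,
not part of GCC.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Totals accumulated over every summary line found in one dump.  */
struct dse_totals
{
  long functions;  /* Number of summary lines seen.  */
  long local;      /* Sum of the local deletion counts.  */
  long global;     /* Sum of the global deletion counts.  */
};

/* If LINE is a summary line in the ASSUMED format, add its counts to T
   and return 1; otherwise leave T unchanged and return 0.  */
int
dse_add_line (const char *line, struct dse_totals *t)
{
  int local, global;

  assert (line != NULL && t != NULL);
  if (sscanf (line, "dse: local deletions = %d, global deletions = %d",
              &local, &global) != 2)
    return 0;
  t->functions++;
  t->local += local;
  t->global += global;
  return 1;
}

/* Accumulate every summary line of the dump file PATH into T.
   Return 0 on success, -1 if the file cannot be opened.  */
int
dse_sum_file (const char *path, struct dse_totals *t)
{
  char buf[512];
  FILE *f = fopen (path, "r");

  if (f == NULL)
    return -1;
  while (fgets (buf, sizeof buf, f) != NULL)
    dse_add_line (buf, t);
  fclose (f);
  return 0;
}

/* Return nonzero if two dumps saw the same number of summary lines
   and the same total deletion counts.  */
int
dse_totals_equal (const struct dse_totals *a, const struct dse_totals *b)
{
  return (a->functions == b->functions
          && a->local == b->local
          && a->global == b->global);
}
```

Calling dse_sum_file on the matching dump file from each build, starting from
a zero-initialized dse_totals, and then dse_totals_equal on the two results
shows whether that CU differs.  Equal totals do not prove that every function
matches, so a per-function diff of the dumps is still the stricter check.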
