On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote:
> The following avoids repeatedly turning VALUE RTXen into
> sth useful and re-applying a constant offset through get_addr
> via DSE check_mem_read_rtx.  Instead perform this once for
> all stores to be visited in check_mem_read_rtx.  This avoids
> allocating 1.6GB of garbage PLUS RTXen on the PR80960
> testcase, fixing the memory usage regression from old GCC.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
> 2021-01-27  Richard Biener  <rguent...@suse.de>
>
>       PR rtl-optimization/80960
>       * dse.c (check_mem_read_rtx): Call get_addr on the
>       offsetted address.
> ---
>  gcc/dse.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/gcc/dse.c b/gcc/dse.c
> index c88587e7d94..da0df54a2dd 100644
> --- a/gcc/dse.c
> +++ b/gcc/dse.c
> @@ -2219,6 +2219,11 @@ check_mem_read_rtx (rtx *loc, bb_info_t bb_info)
>      }
>    if (maybe_ne (offset, 0))
>      mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
> +  /* Avoid passing VALUE RTXen as mem_addr to canon_true_dependence
> +     which will over and over re-create proper RTL and re-apply the
> +     offset above.  See PR80960 where we almost allocate 1.6GB of PLUS
> +     RTXen that way.  */
> +  mem_addr = get_addr (mem_addr);
>
>    if (group_id >= 0)
>      {

Does that result in any changes in how much DSE optimizes?
I mean, if you do 2 bootstraps/regtests, one with this patch and
another one without it, and at the end of rest_of_handle_dse dump
locally_deleted and globally_deleted for each CU/function, do you get
the same counts, except perhaps for dse.c?

        Jakub
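
One possible way to do the comparison described above is to have each
build write one line per function to a log and then compare the two
logs. The sketch below is only an illustration under stated
assumptions: the line format "<function> <locally_deleted>
<globally_deleted>" is hypothetical and would have to be produced by
hand-added instrumentation in rest_of_handle_dse, and the helper names
(parse_record, records_match, compare_logs) are made up here. It also
assumes both logs list the functions in the same order.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One record per function, in the hypothetical log format
   "<function-name> <locally_deleted> <globally_deleted>".  */
struct dse_record
{
  char name[128];
  int locally_deleted;
  int globally_deleted;
};

/* Parse one log line into REC.  Return 1 on success, 0 otherwise.  */
int
parse_record (const char *line, struct dse_record *rec)
{
  assert (line != NULL && rec != NULL);
  return sscanf (line, "%127s %d %d", rec->name,
                 &rec->locally_deleted, &rec->globally_deleted) == 3;
}

/* Return 1 if A and B name the same function and have equal counts.  */
int
records_match (const struct dse_record *a, const struct dse_record *b)
{
  return (strcmp (a->name, b->name) == 0
          && a->locally_deleted == b->locally_deleted
          && a->globally_deleted == b->globally_deleted);
}

/* Compare two logs line by line (assumption: same function order in
   both).  Print each differing pair and return the number of
   mismatching or unparsable pairs, counting a length difference as
   one further mismatch.  */
int
compare_logs (FILE *before, FILE *after)
{
  char la[256], lb[256];
  struct dse_record ra, rb;
  int mismatches = 0;

  for (;;)
    {
      char *pa = fgets (la, sizeof la, before);
      char *pb = fgets (lb, sizeof lb, after);

      if (!pa && !pb)
        break;                  /* both logs exhausted */
      if (!pa || !pb)
        {
          mismatches++;         /* one log is longer than the other */
          break;
        }
      if (!parse_record (la, &ra) || !parse_record (lb, &rb)
          || !records_match (&ra, &rb))
        {
          mismatches++;
          printf ("differs:\n  without patch: %s  with patch:    %s",
                  la, lb);
        }
    }
  return mismatches;
}
```

Calling compare_logs on the log from the unpatched build and the log
from the patched build and getting 0 back would mean the per-function
counts are identical under the assumptions above.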