On Wed, 27 Jan 2021, Jakub Jelinek wrote: > On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote: > > The following avoids repeatedly turning VALUE RTXen into > > sth useful and re-applying a constant offset through get_addr > > via DSE check_mem_read_rtx. Instead perform this once for > > all stores to be visited in check_mem_read_rtx. This avoids > > allocating 1.6GB of garbage PLUS RTXen on the PR80960 > > testcase, fixing the memory usage regression from old GCC. > > > > Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK? > > > > Thanks, > > Richard. > > > > 2021-01-27 Richard Biener <rguent...@suse.de> > > > > PR rtl-optimization/80960 > > * dse.c (check_mem_read_rtx): Call get_addr on the > > offsetted address. > > --- > > gcc/dse.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/gcc/dse.c b/gcc/dse.c > > index c88587e7d94..da0df54a2dd 100644 > > --- a/gcc/dse.c > > +++ b/gcc/dse.c > > @@ -2219,6 +2219,11 @@ check_mem_read_rtx (rtx *loc, bb_info_t bb_info) > > } > > if (maybe_ne (offset, 0)) > > mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset); > > + /* Avoid passing VALUE RTXen as mem_addr to canon_true_dependence > > + which will over and over re-create proper RTL and re-apply the > > + offset above. See PR80960 where we almost allocate 1.6GB of PLUS > > + RTXen that way. */ > > + mem_addr = get_addr (mem_addr); > > > > if (group_id >= 0) > > { > > Does that result in any changes on how much does DSE optimize? > I mean, if you do 2 bootstraps/regtests, one with this patch and another one > without it, and at the end of rest_of_handle_dse dump > locally_deleted, globally_deleted > for each CU/function, do you get the same counts except perhaps for dse.c?
So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the patch but counts are _extremely_ small. Statistics: 70148 dse: local deletions = 0, global deletions = 0 32 dse: local deletions = 0, global deletions = 1 9 dse: local deletions = 0, global deletions = 2 7 dse: local deletions = 0, global deletions = 3 2 dse: local deletions = 0, global deletions = 4 2 dse: local deletions = 0, global deletions = 5 3 dse: local deletions = 0, global deletions = 7 67 dse: local deletions = 1, global deletions = 0 1 dse: local deletions = 1, global deletions = 2 12 dse: local deletions = 2, global deletions = 0 1 dse: local deletions = 24, global deletions = 1 2 dse: local deletions = 3, global deletions = 0 4 dse: local deletions = 4, global deletions = 0 4 dse: local deletions = 6, global deletions = 0 1 dse: local deletions = 7, global deletions = 0 1 dse: local deletions = 8, global deletions = 0 so not sure how much confidence this brings over the analytical reasoning that it shouldn't make a difference ... stats on just dse2 are even more depressing (given it's cost) 35123 dse: local deletions = 0, global deletions = 0 2 dse: local deletions = 0, global deletions = 1 20 dse: local deletions = 1, global deletions = 0 1 dse: local deletions = 2, global deletions = 0 1 dse: local deletions = 3, global deletions = 0 1 dse: local deletions = 4, global deletions = 0 Richard.