https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474

--- Comment #94 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 50165
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50165&action=edit
patch experiment

(In reply to Jakub Jelinek from comment #93)
> I think I'd go for more chains by default, at least 64 or even 256, with a
> param and tracking on how many we have in a counter.  The class has a
> ctor/dtor, so the increment/decrement of the counter can be done there.
> And I think it is doubly-linked, so tail should be prev on the head.

So the list is not circular but adding a counter is easy and avoids the
linked list walk in the usual cases when we do not hit the limit.

Note that the number of alias queries we do is
(param_max_store_chains_to_track * param_max_stores_to_merge) squared
with a default of 64 that's already 64 ^ 4.  So a less restrictive option
might be to limit the product of both numbers.

Now, I was thinking that we eventually can get rid of most alias queries
by instead of ambiguating against all stores in each chain only
ambiguate against the whole base of each chain, effectively making it
param_max_store_chains_to_track squared then.  That might lose some odd
cases.  The profile also shows that the caching of ao_refs (and thus
get_ref_base_and_extent calls) is imperfect - for the store chains
we could record ao_refs for example (it effectively already records
all necessary info anyway).

A simple patch caching the ao_ref in store_immediate_info improves compile
time to

 store merging                      : 265.07 ( 83%)   0.00 (  0%) 265.14 ( 82%)
 3858k (  1%)
 TOTAL                              : 321.25          0.53        321.88       
  557M

shaving off some 50s and thus possibly worth the extra memory cost here.

Btw, I've done some statistics and for this testcase we indeed mostly
have chains with a single store (the clobber) and thus the idea of
reducing the number of alias queries by globbing all stores of a chain
into a single effective ao_ref wouldn't help too much
(but it would save us from caching the ao_ref in each store to a single
ao_ref per chain):

  36422 Terminating chain with 1 stores
  10889 Terminating chain with 3 stores

and the largest number of active chains is 32661.

The idea of limiting the number of overall tracked stores instead of the
number of chains would be similarly simple and put a limit on the overall
alias queries (but then we have the limit on the number of stores in a
single chain, possibly because of non-linearities in chain processing).

Still limiting the number of tracked chain is more obvious in behavior
and thus IMHO superior for users (even if it means we have to put a
more conservative default on that).

Statistics on the param defaults:

 64    1.31 (  2%) 
128    2.64 (  4%)
256    4.86 (  8%)

but given the above statistics the testcase isn't a good example to tune
these with just either one or three stores on each chain.  Enabling
the ao_ref caching only marginally improves the testcase with the
limiting in place:

256    4.28 (  7%)

so any comments?

Reply via email to