https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93199
--- Comment #18 from Richard Biener <rguenth at gcc dot gnu.org> ---
At -O2 I see, with just E(1),
 expand vars        :  61.55 ( 23%)   0.01 (  3%)  61.56 ( 23%)   1267 kB (  1%)
 store merging      : 185.44 ( 69%)   0.00 (  0%) 185.44 ( 69%)    625 kB (  1%)
The time is spent in terminate_all_aliasing_chains, where the
m_stores_head chain seems to be quite long. With only D(1) D(2) we have 4000
calls to this function, but the inner loop then iterates 8 million times. I
guess we are missing a limit there; a testcase might be just a large BB with
many (non-aliasing) stores.
Micro-optimizing the function is also possible (testing patch for that).