https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88767
--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 9 Jan 2019, wschmidt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88767
>
> --- Comment #7 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
> (In reply to Michael Matz from comment #3)
> > I don't see anything to improve either (as far as unroll-and-jam is
> > concerned).
> > It's quite possible that cunrolli is harming more than helping in this case,
> > but with it disabled it seems the code is as it should be.
> >
> > So, please state what you want to see changed: unroll-and-jam or cunrolli?
>
> The question in my mind is what to do about the phase interaction between the
> two.  Classical optimizations of loop nests for HPC code optimize memory
> access patterns, and cunrolli takes some of the options off the table before
> unroll-and-jam (in this case) can analyze the loop.

An improvement of the heuristics could be to turn down
--param max-completely-peel-times and friends for cunrolli.

cunrolli is important to remove abstraction in C++ since none
of the scalar optimization passes knows to unroll loops "virtually"
(it's on my list to experiment with such an idea for value-numbering).
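
To make the phase interaction concrete, here is a minimal sketch. It is not the testcase attached to this PR: the sizes N and M, the function names and the loop body are all made up for illustration. It shows one small nest written three ways by hand: as-is, with the outer loop unrolled by two and jammed, and with the inner loop completely unrolled first (roughly the shape an early complete-unrolling pass could leave behind, after which there is no inner loop left to jam).

```c
#include <stddef.h>

/* Hypothetical sizes: M is a small constant, so the inner loop is a
   candidate for complete unrolling; N is assumed to be even. */
enum { N = 8, M = 4 };

/* Original nest: y[i] += sum over j of a[i][j] * x[j]. */
void nest_ref(double *y, const double *a, const double *x)
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < M; j++)
            y[i] += a[i * M + j] * x[j];
}

/* Outer loop unrolled by 2 and the two inner-loop copies fused
   ("jammed"): each x[j] is now used for two rows per iteration. */
void nest_unroll_and_jam(double *y, const double *a, const double *x)
{
    for (size_t i = 0; i < N; i += 2)
        for (size_t j = 0; j < M; j++) {
            y[i]     += a[i * M + j] * x[j];
            y[i + 1] += a[(i + 1) * M + j] * x[j];
        }
}

/* Inner loop completely unrolled by hand (hard-coded for M == 4).
   Only a single loop remains, so there is no nest left to jam. */
void nest_inner_unrolled(double *y, const double *a, const double *x)
{
    for (size_t i = 0; i < N; i++) {
        y[i] += a[i * M + 0] * x[0];
        y[i] += a[i * M + 1] * x[1];
        y[i] += a[i * M + 2] * x[2];
        y[i] += a[i * M + 3] * x[3];
    }
}
```

All three variants compute the same values; only the loop structure left for later passes differs. One way to see what the compiler itself does with the first form would be to compile it with -O3 -floop-unroll-and-jam, vary --param max-completely-peel-times, and compare the -fdump-tree-cunrolli dumps. Whether a given setting actually changes the outcome depends on the GCC version and on the other unrolling limits.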