https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81558
--- Comment #3 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 27 Jul 2017, kugan at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81558
>
> --- Comment #2 from kugan at gcc dot gnu.org ---
>
> > Does LLVM do a runtime alias check here?  For foo1 GCC adds a runtime
> > alias check (BB vectorization cannot version for aliasing).
>
> Yes. LLVM does not seem to be unrolling the inner loop. As you said, when
> disabling cunrolli it works. cunroll pass will unroll after loop
> vectorisation.
> Can anything be done with the heuristics for this case? Thanks.

cunrolli sees

Loop 2 iterates 16 times.
...
 size:   1 imgY_org.6_2 = imgY_org;
 size:   0 _3 = (long unsigned int) y_15;
 size:   1 _4 = _3 * 8;
 size:   1 _5 = imgY_org.6_2 + _4;
 size:   1 _6 = *_5;
 size:   0 _7 = (long unsigned int) x_14;
 size:   1 _8 = _7 * 2;
 size:   1 _9 = _6 + _8;
 size:   1 orgptr_24 = orgptr_16 + 2;
 size:   1 _10 = *_9;
 size:   1 *orgptr_16 = _10;
 size:   1 x_26 = x_14 + 1;

A quick shot at a heuristic would see that we'd vectorize this with
V8HI/V16HImode, and that with a statically determined 16 iterations
this should be profitable.

So yes, a heuristic is possible, but it would only be a heuristic,
which means there's likely a testcase that will regress in one way or
another (like missing simplifications exposed by unrolling).

Another thing is that IMHO cunrolli has too high a limit on the maximum
number of iterations it'll unroll.  Adding another param might help
here, or making it less aggressive.  Of course calculix relies heavily
on cunrolli aggressively unrolling ...
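
For readers without the testcase at hand, the statements in the dump above correspond roughly to an inner loop of the following shape. This is only a sketch reconstructed from the GIMPLE: the element type (16-bit, from the `* 2` and `+ 2` strides), the row-pointer layout (from the `* 8` stride), and the names `copy_row` and `ROW_LEN` are assumptions, not code taken from the bug report.

```c
#include <assert.h>
#include <stddef.h>

/* Inner trip count that cunrolli reports ("Loop 2 iterates 16 times"). */
#define ROW_LEN 16

/* Assumed declaration: a global table of row pointers to 16-bit pixels. */
unsigned short **imgY_org;

/* Hypothetical helper: copy ROW_LEN pixels of row y, starting at column
   x0, to orgptr and return the advanced output pointer.  Each iteration
   loads imgY_org[y][x] and stores it through orgptr, which matches the
   load/store pair in the dump.  */
unsigned short *copy_row(unsigned short *orgptr, int y, int x0)
{
    assert(imgY_org != NULL);
    for (int x = x0; x < x0 + ROW_LEN; x++)
        *orgptr++ = imgY_org[y][x];
    return orgptr;
}
```

With 16-bit elements, V8HImode holds 8 of them and V16HImode holds 16, so a statically known 16-iteration loop maps onto two or one vector copies respectively. If cunrolli fully unrolls the loop first, the loop vectorizer no longer sees a loop and, as noted in the quoted comment, BB vectorization cannot version for aliasing. Comparing against a build with `-fdisable-tree-cunrolli` should reproduce the "works when disabling cunrolli" observation, assuming the real testcase has this shape.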