On Thu, Nov 17, 2016 at 11:26 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Thu, Nov 17, 2016 at 8:32 AM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Wed, Nov 16, 2016 at 6:20 PM, Bin Cheng <bin.ch...@arm.com> wrote:
>>> Hi,
>>> Currently test gfortran.dg/vect/fast-math-mgrid-resid.f checks all
>>> predictive commoning opportunities for all possible loops. This makes it
>>> fragile because vectorizer may peel the loop differently, as well as may
>>> choose different vector factors. For example, on x86-solaris, vectorizer
>>> doesn't peel for prologue loop; for -march=haswell, the case is long time
>>> failed because vector factor is 4, while iteration distance of predictive
>>> commoning opportunity is smaller than 4. This patch refines it by only
>>> checking if predictive commoning variable is created when vector factor is
>>> 2; or vectorization variable is created when factor is 4. This works since
>>> we have only one main loop, and only one vector factor can be used.
>>> Test result checked for various x64 targets. Is it OK?
>>
>> I think that as you write the test is somewhat fragile. But rather
>> than adjusting the scanning like you do
>> I'd add --param vect-max-peeling-for-alignment=0 and -mprefer-avx128
> In this way, is it better to add "--param
> vect-max-peeling-for-alignment=0" for all targets? Otherwise we still
> need to differentiate test string to handle different targets. But I
> have another question here: what if a target can't handle unaligned
> access and vectorizer have to peel for alignment for it?

You'd get versioning for alignment instead.

> Also do you think it's ok to check predictive commoning PHI node as below?
> # vectp_u.122__lsm0.158_94 = PHI <vectp_u.122__lsm0.158_95(8), _96(6)>
> In this way, we don't need to take possible prologue/epilogue loops
> into consideration.

I hoped w/o peeling we can simply scan for "Executing predictive commoning".
But with versioning for alignment you'd still get two loops. So maybe
checking for both "Executing predictive commoning" and looking for a
vect_lsm PHI node is ok...

>> as additional option on x86_64-*-* i?86-*-*.
>>
>> Your new pattern would fail with avx512 if vector (8) real would be used.
>>
>> What's the actual change that made the testcase fail btw?
> There are two cases.
> A) After vect_do_peeling change, vectorizer may only peel one
> iteration for prologue loop (if vf == 2), below test string was added
> for this reason:
> ! { dg-final { scan-tree-dump-times "Loop iterates only 1 time,
> nothing to do" 1 "pcom" } }
> This fails on x86_64 solaris because prologue loop is not peeled at all.
> B) Depending on ilp, I think below test strings fail for long time with
> haswell:
> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
> without unrolling" 1 "pcom" { target lp64 } } }
> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
> without unrolling" 2 "pcom" { target ia32 } } }
> Because vectorizer choose vf==4 in this case, and there is no
> predictive commoning opportunities at all.

Yes. I suggest -mprefer-avx128 for that.

> Also the newly added test string fails in this case too because the
> prolog peeled iterates more than 1 times.
>
> Thanks,
> bin
>>
>> Richard.
>>
>>> Thanks,
>>> bin
>>>
>>> gcc/testsuite/ChangeLog
>>> 2016-11-16  Bin Cheng  <bin.ch...@arm.com>
>>>
>>>         PR testsuite/78114
>>>         * gfortran.dg/vect/fast-math-mgrid-resid.f: Refine test by
>>>         checking predictive commining variables in vectorized loop
>>>         wrto vector factor.
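
[Editor's note: for reference, a minimal sketch of the direction suggested above, assuming the testcase's existing dg-options already enable -fpredictive-commoning and -fdump-tree-pcom-details; the scan string is the one Richard proposes, and the exact selectors and wording are illustrative, not the committed fix:

! Disable peeling for alignment and force 128-bit vectors (vf == 2) on x86,
! so the pcom dump scan below does not depend on prologue peeling or on the
! 256-bit (vf == 4) vectorization path.
! { dg-additional-options "--param vect-max-peeling-for-alignment=0 -mprefer-avx128" { target { i?86-*-* x86_64-*-* } } }
! { dg-final { scan-tree-dump "Executing predictive commoning" "pcom" } }

Restricting -mprefer-avx128 to x86 keeps the vector factor at 2 there, so the scan no longer breaks when -march=haswell would otherwise pick vf == 4.]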