On Thu, Nov 17, 2016 at 8:32 AM, Richard Biener
<[email protected]> wrote:
> On Wed, Nov 16, 2016 at 6:20 PM, Bin Cheng <[email protected]> wrote:
>> Hi,
>> Currently the test gfortran.dg/vect/fast-math-mgrid-resid.f checks all
>> predictive commoning opportunities for all possible loops. This makes it
>> fragile because the vectorizer may peel the loop differently and may
>> choose different vector factors. For example, on x86-solaris the
>> vectorizer doesn't peel a prologue loop; with -march=haswell the test has
>> been failing for a long time because the vector factor is 4, while the
>> iteration distance of the predictive commoning opportunity is smaller
>> than 4. This patch refines the test by only checking whether a predictive
>> commoning variable is created when the vector factor is 2, or a
>> vectorization variable is created when the factor is 4. This works since
>> we have only one main loop, and only one vector factor can be used.
>> Test results checked on various x64 targets. Is it OK?
>
> I think that, as you write, the test is somewhat fragile. But rather
> than adjusting the scanning like you do
> I'd add --param vect-max-peeling-for-alignment=0 and -mprefer-avx128
If we go this way, is it better to add "--param
vect-max-peeling-for-alignment=0" for all targets? Otherwise we still
need different test strings to handle different targets. But I
have another question here: what if a target can't handle unaligned
access and the vectorizer has to peel for alignment on it?
Also, do you think it's OK to check for the predictive commoning PHI
node, as below?
# vectp_u.122__lsm0.158_94 = PHI <vectp_u.122__lsm0.158_95(8), _96(6)>
That way, we don't need to take possible prologue/epilogue loops
into consideration.
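To make the idea concrete, here is a rough sketch (in Python, outside the
testsuite) of the kind of match I have in mind. The regex and the helper
name count_pcom_phis are only my assumptions for illustration; the real
check would be a dg-final scan of the "pcom" dump, and the exact pattern
would need tuning.

```python
import re

# Hypothetical pattern (an assumption, not taken from the patch): a PHI whose
# result is a predictive-commoning "lsm" temporary built on a vectorizer
# pointer, e.g. vectp_u.122__lsm0.158_94.
PCOM_VECT_PHI = re.compile(
    r"^\s*#\s*vectp_\w+\.\d+__lsm\d+\.\d+_\d+ = PHI <", re.MULTILINE
)

def count_pcom_phis(dump_text):
    """Count PHI nodes that define a vectp_*__lsm* temporary in a dump."""
    return len(PCOM_VECT_PHI.findall(dump_text))

# Two PHI lines; only the first defines a vectp_*__lsm* temporary.
sample = (
    "  # vectp_u.122__lsm0.158_94 = PHI <vectp_u.122__lsm0.158_95(8), _96(6)>\n"
    "  # ivtmp_12 = PHI <ivtmp_13(8), 0(6)>\n"
)
print(count_pcom_phis(sample))
```

Since the match keys on the vectp_ prefix, it should only fire in the
vectorized main loop, regardless of how prologue/epilogue loops are peeled.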
> as additional option on x86_64-*-* i?86-*-*.
>
> Your new pattern would fail with avx512 if vector(8) real were used.
>
> What's the actual change that made the testcase fail btw?
There are two cases.
A) After the vect_do_peeling change, the vectorizer may peel only one
iteration for the prologue loop (if vf == 2); the test string below was
added for this reason:
! { dg-final { scan-tree-dump-times "Loop iterates only 1 time, nothing to do" 1 "pcom" } }
This fails on x86_64 solaris because the prologue loop is not peeled at all.
B) Depending on ilp, I think the test strings below have been failing for
a long time with haswell:
! { dg-final { scan-tree-dump-times "Executing predictive commoning without unrolling" 1 "pcom" { target lp64 } } }
! { dg-final { scan-tree-dump-times "Executing predictive commoning without unrolling" 2 "pcom" { target ia32 } } }
This is because the vectorizer chooses vf == 4 in this case, so there are
no predictive commoning opportunities at all.
The newly added test string fails in this case too, because the peeled
prologue loop iterates more than once.
Thanks,
bin
>
> Richard.
>
>> Thanks,
>> bin
>>
>> gcc/testsuite/ChangeLog
>> 2016-11-16 Bin Cheng <[email protected]>
>>
>> PR testsuite/78114
>> * gfortran.dg/vect/fast-math-mgrid-resid.f: Refine test by
>> checking predictive commoning variables in vectorized loop
>> with respect to vector factor.