On Thu, Nov 17, 2016 at 11:26 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Thu, Nov 17, 2016 at 8:32 AM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Wed, Nov 16, 2016 at 6:20 PM, Bin Cheng <bin.ch...@arm.com> wrote:
>>> Hi,
>>> Currently the test gfortran.dg/vect/fast-math-mgrid-resid.f checks all
>>> predictive commoning opportunities for all possible loops.  This makes it
>>> fragile because the vectorizer may peel the loop differently and may
>>> choose different vector factors.  For example, on x86-solaris the
>>> vectorizer doesn't peel a prologue loop, and with -march=haswell the test
>>> has been failing for a long time because the vector factor is 4 while the
>>> iteration distance of the predictive commoning opportunity is smaller
>>> than 4.  This patch refines the test to only check that a predictive
>>> commoning variable is created when the vector factor is 2, or that a
>>> vectorization variable is created when the factor is 4.  This works since
>>> we have only one main loop, and only one vector factor can be used.
>>> Test results were checked for various x64 targets.  Is it OK?
>>
>> I think that, as you write, the test is somewhat fragile.  But rather
>> than adjusting the scanning as you do,
>> I'd add --param vect-max-peeling-for-alignment=0 and -mprefer-avx128
> In that case, is it better to add "--param
> vect-max-peeling-for-alignment=0" for all targets?  Otherwise we still
> need different test strings for different targets.  But I have another
> question here: what if a target can't handle unaligned accesses and the
> vectorizer has to peel for alignment for it?

You'd get versioning for alignment instead.

> Also, do you think it's OK to check the predictive commoning PHI node as below?
> # vectp_u.122__lsm0.158_94 = PHI <vectp_u.122__lsm0.158_95(8), _96(6)>
> In this way, we don't need to take possible prologue/epilogue loops
> into consideration.

I hoped that without peeling we could simply scan for "Executing predictive commoning".
But with versioning for alignment you'd still get two loops.

So maybe checking for both "Executing predictive commoning" and looking
for a vect_lsm PHI node is ok...
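
For concreteness, the combined check Richard describes might look something like the sketch below (dg-final and scan-tree-dump are the real DejaGnu directives; the PHI-name regex is an assumption modeled on the PHI node quoted earlier in this thread and would need verifying against an actual pcom dump):

```fortran
! Sketch only: check that predictive commoning ran at all, and that a
! load/store-motion PHI was created for the vectorized loop.  The
! "__lsm" name pattern below is an assumption, not a verified dump line.
! { dg-final { scan-tree-dump "Executing predictive commoning" "pcom" } }
! { dg-final { scan-tree-dump "vectp_u.*__lsm.* = PHI" "pcom" } }
```

This would be insensitive to how many prologue/epilogue loops exist, since it only keys on the main vectorized loop's PHI.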

>> as additional option on x86_64-*-* i?86-*-*.
>>
>> Your new pattern would fail with avx512 if vector (8) real would be used.
>>
>> What's the actual change that made the testcase fail btw?
> There are two cases.
> A) After the vect_do_peeling change, the vectorizer may peel only one
> iteration for the prologue loop (if vf == 2); the test string below was
> added for that reason:
> ! { dg-final { scan-tree-dump-times "Loop iterates only 1 time,
> nothing to do" 1 "pcom" } }
> This fails on x86_64 Solaris because the prologue loop is not peeled at all.
> B) Depending on the ILP mode, I think the test strings below have been
> failing for a long time with haswell:
> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
> without unrolling" 1 "pcom" { target lp64 } } }
> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
> without unrolling" 2 "pcom" { target ia32 } } }
> The vectorizer chooses vf == 4 in this case, so there are no
> predictive commoning opportunities at all.

Yes.  I suggest -mprefer-avx128 for that.
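
If the options route is taken, the testcase header might gain a line along these lines (a sketch; dg-additional-options, -mprefer-avx128, and the --param are real, but the exact target selector is an assumption and would need to match how the file already handles x86):

```fortran
! Sketch: pin the vector factor to 2 on x86 and disable peeling for
! alignment, so the pcom dump is stable across subtargets.
! { dg-additional-options "--param vect-max-peeling-for-alignment=0 -mprefer-avx128" { target { i?86-*-* x86_64-*-* } } }
```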

> The newly added test string also fails in this case, because the
> peeled prologue iterates more than once.
>
> Thanks,
> bin
>>
>> Richard.
>>
>>> Thanks,
>>> bin
>>>
>>> gcc/testsuite/ChangeLog
>>> 2016-11-16  Bin Cheng  <bin.ch...@arm.com>
>>>
>>>         PR testsuite/78114
>>>         * gfortran.dg/vect/fast-math-mgrid-resid.f: Refine test by
>>>         checking predictive commoning variables in the vectorized loop
>>>         wrt the vector factor.
