On Fri, Nov 18, 2016 at 4:52 PM, Michael Matz <m...@suse.de> wrote:
> Hi,
>
> On Thu, 17 Nov 2016, Bin.Cheng wrote:
>
>> B) Depending on ilp, I think the test strings below have been failing
>> for a long time on haswell:
>> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
>> without unrolling" 1 "pcom" { target lp64 } } }
>> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
>> without unrolling" 2 "pcom" { target ia32 } } }
>> Because the vectorizer chooses vf==4 in this case, and there are no
>> predictive commoning opportunities at all.
>> Also, the newly added test string fails in this case too, because the
>> peeled prolog iterates more than once.
>
> Btw, this probably means that on haswell (or other archs with vf==4) mgrid
> is slower than necessary.  On mgrid you really really want predictive
> commoning to happen.  Vectorization isn't that interesting there.
Interesting, I will check whether there is a difference between vf==2
and vf==4.  We do have cases where a smaller vf is better and should be
chosen, though for different reasons.
Thanks,
bin
>
>
> Ciao,
> Michael.
