Re: Masked vector deficiencies

Richard Sandiford Tue, 03 Mar 2020 07:57:44 -0800

Andrew Stubbs <a...@codesourcery.com> writes:
> Hi all,
>
> Up until now the AMD GCN port has been using exclusively 64-lane vectors 
> with masking for smaller sizes.
>
> This works quite well, where it works, but there remain many test cases 
> (and no doubt some real code) that refuse to vectorize because the 
> number of iterations (or SLP equivalent) is smaller than the 
> vectorization factor.
>
> My question is: are there any plans to fill in these missing cases? Or, 
> is relying on masking alone just not feasible?


This is supported for loop vectorisation.  E.g.:

  void f (short *x) { for (int i = 0; i < 7; ++i) x[i] += 1; }

generates:

        ptrue   p0.h, vl7
        ld1h    z0.h, p0/z, [x0]
        add     z0.h, z0.h, #1
        st1h    z0.h, p0, [x0]
        ret

for SVE.  BB SLP is on the wish-list for GCC 11, but no promises. :-)
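
To make the predication explicit, here is a rough scalar model of that
sequence in C.  It is only a sketch: LANES and masked_add_one are names
made up for illustration, and LANES is fixed at 8 (the number of .h lanes
in a 128-bit vector), whereas real SVE code is vector-length agnostic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lane count, for illustration only: SVE vectors are a
   multiple of 128 bits, so there are always at least 8 .h lanes.  */
#define LANES 8

/* Scalar model of one predicated vector iteration of "x[i] += 1".
   Only the first ACTIVE lanes touch memory.  */
void
masked_add_one (short *x, size_t active)
{
  assert (active <= LANES);

  bool p[LANES];                  /* models p0.h */
  short z[LANES] = { 0 };         /* models z0.h */

  /* ptrue p0.h, vl7: enable the first ACTIVE lanes.  */
  for (size_t lane = 0; lane < LANES; ++lane)
    p[lane] = lane < active;

  /* ld1h z0.h, p0/z, [x0]: inactive lanes are zeroed and not read.  */
  for (size_t lane = 0; lane < LANES; ++lane)
    z[lane] = p[lane] ? x[lane] : 0;

  /* add z0.h, z0.h, #1: unpredicated, operates on every lane.  */
  for (size_t lane = 0; lane < LANES; ++lane)
    z[lane] = (short) (z[lane] + 1);

  /* st1h z0.h, p0, [x0]: only active lanes are written back.  */
  for (size_t lane = 0; lane < LANES; ++lane)
    if (p[lane])
      x[lane] = z[lane];
}
```

Calling masked_add_one (x, 7) updates x[0] to x[6] and leaves x[7]
untouched, which is what lets a 7-iteration loop use a wider vector.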

Early peeling/complete unrolling can cause loops to be straight-line
code by the time the vectoriser sees them.  E.g. the loop above doesn't
use masked SVE if the bound is changed to "i < 3".
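
As a sketch of what that means, after complete unrolling the three-iteration
loop is roughly equivalent to the straight-line function below, so there is
no loop left for the loop vectoriser to mask.  The names f3_loop and
f3_unrolled are made up for illustration, and whether unrolling actually
happens depends on the GCC version and options.

```c
/* The loop as written, with a trip count of 3.  */
void
f3_loop (short *x)
{
  for (int i = 0; i < 3; ++i)
    x[i] += 1;
}

/* Roughly what is left once the loop has been completely unrolled:
   three independent scalar statements and no loop.  */
void
f3_unrolled (short *x)
{
  x[0] += 1;
  x[1] += 1;
  x[2] += 1;
}
```

Vectorising the second form is a job for BB SLP rather than the loop
vectoriser.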

Which kind of cases fail for GCN?

Thanks,
Richard

>
> I've dabbled in the vectorizer code, of course, but I can't claim to 
> have much of a feel for it as a whole. I may be able to help with the 
> effort in future, but for now I'm struggling to judge what's even needed.
>
> For GCN the vectorization is quite important as scalar code is slow, and 
> adding vectorization is usually cheap. The architecture can do any 
> vector size between 1 and 64 lanes (not just powers of two), so being 
> smaller than the vectorization factor really ought not be a problem.
>
> To fix this, I've been considering adding extra vector sizes (probably 
> 2, 4, 8, 16, 32) where the backend would take care of the masking. 
> Aside from reductions and permutations the changes would be somewhat 
> trivial, but the explosion in the number of generated patterns would be 
> enormous, and it still won't allow arbitrary size vectors.
>
> Thank you for your time; I'm trying to decide where my efforts should lie.
>
> Andrew
