https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121925

--- Comment #2 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> The other lane appears after lowering load permutations, but I did not want
> to mess with SLP pattern recog, so did not place it before.  Possibly load
> permute lowering will also break some cases.
> 

We've had similar issues when trying to see if we could implement Top/Bottom
support inside SLP pattern recog. 

e.g.

#include <stdint.h>

void store_perm(uint32_t *__restrict res0, uint32_t *__restrict res1,
                uint16_t *__restrict a, uint16_t *__restrict b, int n)
{
    for (int i = 0; i < n; i += 8)
    {
        // even-numbered input lanes: use uaddlb
        res0[i]   = a[i]   + b[i];
        res0[i+1] = a[i+2] + b[i+2];
        res0[i+2] = a[i+4] + b[i+4];
        res0[i+3] = a[i+6] + b[i+6];

        // odd-numbered input lanes: use uaddlt
        res1[i]   = a[i+1] + b[i+1];
        res1[i+1] = a[i+3] + b[i+3];
        res1[i+2] = a[i+5] + b[i+5];
        res1[i+3] = a[i+7] + b[i+7];
    }
}

and for similar reasons we might have to do it during vectorizable_operation,
but I'm not quite happy with that, since it isn't as clean as if we were able
to do it during SLP patterns.
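
To make the top/bottom split concrete, here is a rough scalar model of what the
two halves of the loop body compute, assuming the SVE2 meaning of UADDLB/UADDLT
(widening add of the even-numbered, respectively odd-numbered, elements).  The
helper names model_uaddlb and model_uaddlt are made up for illustration; they
are not intrinsics or GCC functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a widening add on the even-numbered ("bottom") lanes.
   Hypothetical helper, for illustration only.  */
void model_uaddlb(uint32_t *res, const uint16_t *a, const uint16_t *b,
                  size_t nres)
{
    for (size_t i = 0; i < nres; i++)
        res[i] = (uint32_t)a[2 * i] + (uint32_t)b[2 * i];
}

/* Scalar model of a widening add on the odd-numbered ("top") lanes.
   Hypothetical helper, for illustration only.  */
void model_uaddlt(uint32_t *res, const uint16_t *a, const uint16_t *b,
                  size_t nres)
{
    for (size_t i = 0; i < nres; i++)
        res[i] = (uint32_t)a[2 * i + 1] + (uint32_t)b[2 * i + 1];
}
```

With nres = 4, one call to each helper on the same 8 input elements matches the
even-lane and odd-lane groups of four adds in the loop body.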

> This is unfortunately still a quite ugly area.
> 
> One possibility is to simply look at the DR group instead of just the
> SLP load, but you won't get to see the SLP load for the other part this way
> (or without SLP "backedges" even with lowered permutes).
> 

I think that's OK if I can insert a VEC_PERM_EXPR node here which will
hopefully only be merged during permute lowering?

If this works, would that also solve the above issue with the top/bottom design?
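
For reference, a minimal scalar sketch of the kind of lane selection such a
permute node would express, assuming the usual two-input permute semantics
where each selector entry indexes into the concatenation of both inputs.
model_perm is a made-up name for illustration, not a GCC function, and this
says nothing about how the SLP node itself would be built.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a two-input permute: lane i of the result
   is taken from the concatenation {v0, v1} at index sel[i], with the
   index wrapped modulo twice the lane count.  */
void model_perm(uint16_t *out, const uint16_t *v0, const uint16_t *v1,
                const size_t *sel, size_t nlanes)
{
    for (size_t i = 0; i < nlanes; i++) {
        size_t idx = sel[i] % (2 * nlanes);
        out[i] = idx < nlanes ? v0[idx] : v1[idx - nlanes];
    }
}
```

A selector of { 0, 2, 4, 6, 8, 10, 12, 14 } on two 8-lane inputs picks the
even-numbered lanes, and { 1, 3, 5, 7, 9, 11, 13, 15 } picks the odd-numbered
ones.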

> What are the existing optab names for the cases we handle?

Currently we have

DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary)
DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)

But I'm not sure the new ones should have an explicit relationship with
"complex numbers".
