On Thu, Sep 24, 2020 at 10:21 AM xionghu luo <[email protected]> wrote:
>
> Hi Segher,
>
> The attached two patches are updated and split from
> "[PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple
> [PR79251]"
> as your comments.
>
>
> [PATCH v3 2/3] rs6000: Fix lvsl&lvsr mode and change rs6000_expand_vector_set
> param
>
> This one is preparation work of fix lvsl&lvsr arg mode and
> rs6000_expand_vector_set
> parameter support for both constant and variable index input.
>
>
> [PATCH v3 2/3] rs6000: Support variable insert and Expand vec_insert in
> expander [PR79251]
>
> This one is Building VIEW_CONVERT_EXPR and expand the IFN VEC_SET to fast.
I'll just comment that
xxperm 34,34,33
xxinsertw 34,0,12
xxperm 34,34,32
doesn't look like a variable-position insert instruction; rather,
it is a variable whole-vector rotate, then an insert at index zero,
followed by another variable whole-vector rotate. I'm not fluent in
ppc assembly but
rlwinm 6,6,2,28,29
mtvsrwz 0,5
lvsr 1,0,6
lvsl 0,0,6
possibly computes the shift masks for r33/r32? though
I do not see those registers mentioned...
This might be a viable generic expansion strategy, btw,
which is why I asked before whether the CPU supports
inserts at a variable position ... the building blocks are
already there with vec_set at constant zero position
plus vec_perm_const for the rotates.
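
To make the rotate / insert-at-zero / rotate-back idea concrete, here is a
minimal scalar sketch in C. It only models the data movement on a
four-element vector; the names v4si, rotate_left, insert_at_zero and
insert_variable are made up for illustration and are not GCC or rs6000
functions, and the element numbering is an assumption (endianness is ignored).

```c
#include <stdint.h>

#define NELT 4

/* Hypothetical stand-in for a 4 x 32-bit vector register. */
typedef struct { uint32_t e[NELT]; } v4si;

/* Whole-vector rotate left by n elements: out[i] = in[(i + n) % NELT].
   Models the role the variable permute plays in the sequence above. */
static v4si rotate_left(v4si v, unsigned n)
{
    v4si r;
    for (unsigned i = 0; i < NELT; i++)
        r.e[i] = v.e[(i + n) % NELT];
    return r;
}

/* Insert at the constant position zero, the operation a fixed-index
   insert instruction provides. */
static v4si insert_at_zero(v4si v, uint32_t x)
{
    v.e[0] = x;
    return v;
}

/* Variable-index insert built only from the two primitives above. */
static v4si insert_variable(v4si v, uint32_t x, unsigned idx)
{
    idx %= NELT;                      /* keep the index in range */
    v = rotate_left(v, idx);          /* element idx moves to position 0 */
    v = insert_at_zero(v, x);         /* constant-position insert */
    return rotate_left(v, (NELT - idx) % NELT);  /* undo the rotate */
}
```

With v = {10, 20, 30, 40}, insert_variable(v, 99, 2) replaces only element 2
and leaves the other elements where they were.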
But well, I did ask this question. Multiple times.
ppc does _not_ have a VSX instruction
like xxinsertw r34, r8, r12 where r8 denotes
the vector element (or byte position or whatever).
So I don't think vec_set with a variable index is the
best approach.
Xionghu - you said even without the patch the stack
storage is eventually elided but
addi 9,1,-16
rldic 6,6,2,60
stxv 34,-16(1)
stwx 5,9,6
lxv 34,-16(1)
still shows stack(?) store/load with a bad STLF penalty.
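
For reference, my reading of that sequence written as C, as a rough sketch
only. The type v4si_mem and the function insert_via_stack are hypothetical
names, and the mapping of each statement to an instruction is my assumption
from the listing above, not something taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 4 x 32-bit vector value. */
typedef struct { uint32_t e[4]; } v4si_mem;

/* Variable-index insert done through a memory round trip. */
static v4si_mem insert_via_stack(v4si_mem v, uint32_t x, unsigned idx)
{
    uint32_t slot[4];                 /* 16-byte stack slot (addi 9,1,-16) */

    memcpy(slot, v.e, sizeof slot);   /* full 16-byte vector store (stxv) */
    slot[idx & 3] = x;                /* 4-byte store at byte offset
                                         (idx & 3) * 4 (rldic + stwx) */
    memcpy(v.e, slot, sizeof slot);   /* full 16-byte reload (lxv) */
    return v;
}
```

The reload is wider than the scalar store it depends on, which is the
pattern that store-to-load forwarding typically cannot satisfy.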
Richard.
>
> Thanks,
> Xionghu