> Note for blendv, it checks the significant bit of the mask, not simple
> if_then_else
> mask
> if_true
> if_false
>
> It should be
> if_then_else
> ashiftrt mask 31
> if_true
> if_false
I think the canonical form (as produced by combine) would be
if_then_else
ge mask 0
if_false
if_true
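
To make the difference concrete, here is a minimal C sketch of one 32-bit
lane. The function names are made up for illustration and are not anything
in GCC; the point is only that the "ge mask 0" form and the "top bit" form
agree, while a plain test of the mask against zero does not.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit lane of a blendv-style select, written as the
   "mask >= 0" comparison: only the sign bit of MASK matters.  */
static int32_t
blend_lane_ge (int32_t mask, int32_t if_true, int32_t if_false)
{
  return mask >= 0 ? if_false : if_true;
}

/* The same lane, written by testing the top bit directly.  */
static int32_t
blend_lane_msb (int32_t mask, int32_t if_true, int32_t if_false)
{
  return ((uint32_t) mask >> 31) ? if_true : if_false;
}

/* A plain if_then_else on MASK tests "mask != 0" instead.  */
static int32_t
select_nonzero (int32_t mask, int32_t if_true, int32_t if_false)
{
  return mask != 0 ? if_true : if_false;
}

/* Hypothetical self-check over a few interesting mask values.  */
static void
check_blend (void)
{
  static const int32_t masks[] = { 0, 1, -1, INT32_MAX, INT32_MIN };
  for (unsigned i = 0; i < sizeof masks / sizeof masks[0]; i++)
    assert (blend_lane_ge (masks[i], 2, 3)
            == blend_lane_msb (masks[i], 2, 3));

  /* Mask 1 is nonzero but has a clear sign bit, so the two
     readings of the mask pick different operands.  */
  assert (blend_lane_ge (1, 2, 3) == 3);
  assert (select_nonzero (1, 2, 3) == 2);
}
```

The two readings only coincide when every lane of the mask is known to be
0 or -1, which is what a vector compare produces.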
>
> Maybe not very useful in practice, just like why there's UNSPEC_FMADDSUB
>
> ;; It would be possible to represent these without the UNSPEC as
> ;;
> ;; (vec_merge
> ;;   (fma op1 op2 op3)
> ;;   (fma op1 op2 (neg op3))
> ;;   (merge-const))
> ;;
> ;; But this doesn't seem useful in practice.
I am not so sure about this when it comes to relatively common
instructions. Hiding things in an unspec prevents combine and other RTL
passes from doing their job. I would say that it only makes sense in
situations where the RTL equivalent is very inconvenient.
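
For reference, the vec_merge form quoted above can be modelled per lane
roughly as follows. This is only a sketch under assumptions: four float
lanes, a merge-const of 0xa (odd lanes taken from the first operand), and a
plain multiply-add standing in for the fused operation. The helper names
are hypothetical.

```c
#include <assert.h>

#define NLANES 4

/* Model of (vec_merge A B MASK): lane i comes from A when bit i of
   MASK is set, otherwise from B.  */
static void
vec_merge4 (float *dst, const float *a, const float *b, unsigned mask)
{
  for (int i = 0; i < NLANES; i++)
    dst[i] = ((mask >> i) & 1) ? a[i] : b[i];
}

/* fmaddsub modelled as a merge of the add and subtract results.
   The multiply-add here is not fused; it only stands in for fma.  */
static void
fmaddsub4 (float *dst, const float *op1, const float *op2,
           const float *op3)
{
  float add[NLANES], sub[NLANES];
  for (int i = 0; i < NLANES; i++)
    {
      add[i] = op1[i] * op2[i] + op3[i];  /* (fma op1 op2 op3) */
      sub[i] = op1[i] * op2[i] - op3[i];  /* (fma op1 op2 (neg op3)) */
    }
  /* Assumed merge-const: 0xa takes odd lanes from ADD, even from SUB.  */
  vec_merge4 (dst, add, sub, 0xa);
}

/* Hypothetical self-check with exactly representable values.  */
static void
check_fmaddsub (void)
{
  const float a[NLANES] = { 1, 2, 3, 4 };
  const float b[NLANES] = { 2, 2, 2, 2 };
  const float c[NLANES] = { 1, 1, 1, 1 };
  float r[NLANES];

  fmaddsub4 (r, a, b, c);
  assert (r[0] == 1.0f && r[1] == 5.0f);
  assert (r[2] == 5.0f && r[3] == 9.0f);
}
```

Written this way, both halves are ordinary operations that a generic pass
can look into, which is the point of avoiding the unspec.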
I noticed that we miss other optimizations of conditional moves. For
example:
int a[1000];
int b[1000];
void test()
{
  for (int i = 0; i < 1000; i++)
    a[i] = b[i] > 10 ? 2 : 3;
}
is compiled by clang to:
pcmpgtd %xmm0, %xmm2
paddd %xmm1, %xmm2
while we do
pcmpgtd %xmm4, %xmm0
pand %xmm0, %xmm1
pandn %xmm2, %xmm0
por %xmm1, %xmm0
Here I guess combine is out of luck. It fails to simplify the three
logical operations:
Trying 17, 16 -> 18:
17: r112:V4SI=~r110:V4SI&r104:V4SI
REG_DEAD r110:V4SI
16: r111:V4SI=r107:V4SI&r110:V4SI
18: r100:V4SI=r112:V4SI|r111:V4SI
REG_DEAD r112:V4SI
REG_DEAD r111:V4SI
Failed to match this instruction:
(set (reg:V4SI 100 [ vect_iftmp.10 ])
(ior:V4SI (and:V4SI (not:V4SI (reg:V4SI 110))
(reg:V4SI 104))
(and:V4SI (reg:V4SI 107)
(reg:V4SI 110))))
Here reg 110 is set by compare:
(insn 15 14 16 3 (set (reg:V4SI 110)
(gt:V4SI (reg:V4SI 109 [ MEM <vector(4) int> [(int *)&b +
ivtmp.17_13 * 1] ])
(reg:V4SI 104))) 6997 {*sse2_gtv4si3}
(expr_list:REG_DEAD (reg:V4SI 109 [ MEM <vector(4) int> [(int *)&b
+ ivtmp.17_13 * 1] ])
(nil)))
and I think it misses the fact that the mask is either all 0s or all 1s
in each lane (which is value-range info that it does not track).
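
A scalar C sketch of one lane shows why that range information matters.
The helper names are hypothetical; the logic form corresponds to the
pand/pandn/por sequence and the add form to the pcmpgtd/paddd sequence
from clang above.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane result of a vector compare: all ones when true, zero
   when false.  */
static int32_t
cmp_gt_mask (int32_t x, int32_t y)
{
  return x > y ? -1 : 0;
}

/* The pand / pandn / por form of the select.  */
static int32_t
select_logic (int32_t mask, int32_t if_true, int32_t if_false)
{
  return (mask & if_true) | (~mask & if_false);
}

/* The compare + add form.  Only valid when MASK is 0 or -1 and the
   "true" value is exactly one less than the "false" value.  */
static int32_t
select_add (int32_t mask, int32_t if_false)
{
  return if_false + mask;
}

/* Hypothetical self-check for b[i] > 10 ? 2 : 3 on one lane.  */
static void
check_select (void)
{
  for (int32_t b = 0; b <= 20; b++)
    {
      int32_t mask = cmp_gt_mask (b, 10);
      int32_t expect = b > 10 ? 2 : 3;
      assert (select_logic (mask, 2, 3) == expect);
      assert (select_add (mask, 3) == expect);
    }
}
```

For a mask that is not known to be 0 or -1 (say 1) the two forms give
different results, so the rewrite is only safe once the compare that
defines the mask is taken into account.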
Similarly, one can simplify e.g.
  a[i] = b[i] > 10 ? 2 : 5;
into an and and an or...
Honza