https://bugs.kde.org/show_bug.cgi?id=385334

--- Comment #2 from Julian Seward <jsew...@acm.org> ---
I looked at the IR generation for vperm
(case 0x2B: { // vperm (Permute, AV p218), etc)
and it looks correct to me.  Which test case is failing, and
for which target (32-be, 64-be, 64-le)?

The code already uses 5 bits per lane to select.  It takes 4 bits
from here:

      assign( vC_andF,
              binop(Iop_AndV128, mkexpr(vC),
                                 unop(Iop_Dup8x16, mkU8(0xF))) );

and one from here:

      // mask[i8] = (vC[i8]_4 == 1) ? 0xFF : 0x0
      assign( mask, binop(Iop_SarN8x16,
                          binop(Iop_ShlN8x16, mkexpr(vC), mkU8(3)),
                          mkU8(7)) );

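by computing, for each 8-bit lane, (lane << 3) >>signed 7, where
>>signed is an arithmetic right shift.  This copies bit[3] in the IBM
encoding (bits numbered from the most significant end, so the bit with
value 0x10) into all 8 bits of the lane, and hence makes it into a
usable mask.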
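
To make the two steps above concrete, here is a minimal scalar sketch of
what happens to a single 8-bit selector lane.  It is not the VEX code:
the function names are hypothetical, and select_byte in particular
assumes that the mask is used to choose between two 16-entry lookups,
which the snippets quoted above do not show.  Lanes are indexed as plain
array elements, so any endianness effect is ignored here.

```c
#include <stdint.h>

/* Low 4 bits of the selector: an index in 0..15, the range that
   fits a 16-entry permute. */
static uint8_t lane_index(uint8_t sel)
{
    return sel & 0x0F;
}

/* Model of (lane << 3) >>signed 7 on one 8-bit lane.  The left shift
   moves the 0x10 bit into the sign position; an arithmetic right shift
   by 7 then replicates that sign bit into every bit of the lane.  The
   replication is written as a conditional so the result does not depend
   on how the C compiler shifts negative values. */
static uint8_t lane_mask(uint8_t sel)
{
    uint8_t shifted = (uint8_t)(sel << 3);
    return (shifted & 0x80) ? 0xFF : 0x00;
}

/* ASSUMPTION (not shown in the quoted code): the mask picks between a
   lookup in table a (mask == 0x00) and a lookup in table b
   (mask == 0xFF), both indexed by the low 4 bits. */
static uint8_t select_byte(const uint8_t a[16], const uint8_t b[16],
                           uint8_t sel)
{
    uint8_t idx  = lane_index(sel);
    uint8_t mask = lane_mask(sel);
    return (uint8_t)((a[idx] & (uint8_t)~mask) | (b[idx] & mask));
}
```

With this model, a selector of 0x13 has index 3 and mask 0xFF, while
0x03 has index 3 and mask 0x00; the top three bits of the selector are
ignored in both steps.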

I am wondering now if the problem is one of endianness.

The IR definition of Iop_Perm8x16 only allows values 0 .. 15 in the
selector fields, which the patch ends up violating.  Given the above
I suspect there's some other way to fix this.
