On Fri, Jun 16, 2023 at 4:12 AM liuhongt wrote:
>
> The packing in vpacksswb/vpackssdw is not a simple concat, it's an
> interweave from src1 and src2 for every 128 bit(or 64-bit for the
> ss_truncate result).
>
> .i.e.
>
> dst[192-255] = ss_truncate (src2[128-255])
> dst[128-191] = ss_truncate (s
The packing in vpacksswb/vpackssdw is not a simple concat, it's an
interweave from src1 and src2 for every 128 bit(or 64-bit for the
ss_truncate result).
.i.e.
dst[192-255] = ss_truncate (src2[128-255])
dst[128-191] = ss_truncate (src1[128-255])
dst[64-127] = ss_truncate (src2[0-127])
dst[0-63] =