On 11/03/2017 10:19 AM, Richard Sandiford wrote:
SLP load permutation fails if any individual permutation requires more
than two vector inputs. For 128-bit vectors, it's possible to permute
3 contiguous loads of 32-bit and 8-bit elements, but not 16-bit elements
or 64-bit elements. The results are reversed for 256-bit vectors,
and so on for wid