On Fri, Dec 09, 2011 at 06:02:21PM +0000, Ramana Radhakrishnan wrote:
> On 8 December 2011 21:06, Richard Henderson <r...@redhat.com> wrote:
> > ---
> >  gcc/config/arm/arm-protos.h           |    3 +
> >  gcc/config/arm/arm.c                  |  527 ++++++++++++++++++++++++++++++++-
> >  gcc/config/arm/neon.md                |   59 ++++
> >  gcc/config/arm/vec-common.md          |   26 ++
> >  gcc/testsuite/lib/target-supports.exp |    9 +-
> >  5 files changed, 620 insertions(+), 4 deletions(-)
>
> I haven't been following the vector permute work in great detail and
> I must say I haven't read this patch series in great detail yet.
>
> For Neon a further optimization to consider might be to use the vext
> instruction which could achieve permute masks that are monotonically
> increasing constants ? While I expect the latency for a vext or vtbl
> instruction to be about the same (your mileage might vary depending on
> the core), using vext gives us the freedom of not needing a register
> for the permute mask -
>
> a = vec_shuffle (b, c, mask) where mask is { n + 7, n + 6, n + 5, n +
> 4, n + 3, n + 2, n + 1, n } could just be vext.8 A, B, C, #n
>
> If the mask being provided is a reverse of the mask above, it's
> probably not worth it.
>
>
> Additionally , can we also detect rotate rights ? unless ofcourse
> there's a different interface -
>
> a = vec_shuffle (vec, {0, 7, 6, 5, 4, 3, 2, 1}) => vext.8 a, vec, vec, #1
>
>
> Masks doing rotate lefts are probably not worth it in this

Richard and I were discussing this last night on IRC, and it is certainly
possible. Somebody would just have to write a predicate to recognize the
case. We do wonder how frequently it will occur, and whether people doing
this would just use the whole vector shift instead of shuffle.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com     fax +1 (978) 399-6899
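
For illustration only, a predicate of the kind mentioned above might look
roughly like the sketch below. This is not code from the patch series or from
GCC. The name vext_mask_p, its parameters, and the selector layout are all
assumptions: sel[i] is taken to be the source index for result element i in
memory order (so the masks quoted above, which appear to be written highest
element first, would be reversed), with indices 0..nelt-1 naming the first
operand and nelt..2*nelt-1 the second.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not GCC's actual code.  Return true if the
   selector SEL (NELT entries, memory order) picks NELT consecutive
   elements starting at SEL[0], which is what a vext-style extract
   does.  With ONE_OPERAND set, both inputs are the same vector, so
   indices wrap modulo NELT (a rotate).  On success, store the
   starting index in *SHIFT; it would become the #n immediate.  */
static bool
vext_mask_p (const unsigned char *sel, size_t nelt, bool one_operand,
             unsigned int *shift)
{
  if (nelt == 0)
    return false;

  size_t modulus = one_operand ? nelt : 2 * nelt;
  unsigned int first = sel[0];

  /* The run has to begin inside the first operand.  */
  if (first >= nelt)
    return false;

  /* Every later entry must be exactly one past its predecessor.  */
  for (size_t i = 1; i < nelt; i++)
    if (sel[i] != (first + i) % modulus)
      return false;

  *shift = first;
  return true;
}
```

Under those assumptions, the two-operand selector {3, 4, ..., 10} would be
accepted with a shift of 3, and the one-operand selector {1, 2, ..., 7, 0}
with a shift of 1. A shift of 0 is the identity permutation, which a real
implementation would presumably handle before ever reaching a check like this.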