On 23/04/14 16:20, Alan Lawrence wrote:
> This patch is a small tidy of a more-complicated expression that just flips a
> single bit and can thus be a simple XOR.
>
> No regressions on aarch64-none-elf or aarch64_be-none-elf. (I've verified
> code is indeed exercised by dg-torture.exp vshuf-v*.c).
>
> Also ok after applying TBL and testsuite patches in
> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01309.html and
> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html.
>
> gcc/ChangeLog:
> 2014-04-23  Alan Lawrence  <alan.lawre...@arm.com>
>
>	* config/aarch64/aarch64.c (aarch64_expand_vec_perm_1): tidy bit-flip
>	expression.
>
s/tidy/Tidy/

It's not obvious from your description (or from the code, for that matter)
that for this to be valid nelt must be a power of 2.  I suggest that, above
the loop, you put

  gcc_assert (nelt == (nelt & -nelt));

OK with those changes.

R.

>
> xor_tidy.patch
>
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index a3147ee..b879754 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -8124,7 +8124,7 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
>        rtx x;
>
>        for (i = 0; i < nelt; ++i)
> -	d->perm[i] = (d->perm[i] + nelt) & (2 * nelt - 1);
> +	d->perm[i] ^= nelt;  /* Keep the same index, but in the other vector.  */
>
>        x = d->op0;
>        d->op0 = d->op1;
>
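For what it's worth, here is a small standalone sketch of the equivalence the
XOR relies on, together with the suggested power-of-two check.  This is an
illustrative example only, not GCC code; the values and the printing are
placeholders:

/* For a power-of-two NELT, XOR-ing an index in [0, 2*NELT) with NELT is the
   same as adding NELT modulo 2*NELT: it just flips the bit that selects
   which of the two input vectors the element comes from.  */
#include <assert.h>
#include <stdio.h>

int
main (void)
{
  const unsigned nelt = 4;            /* elements per vector (power of 2)  */
  assert (nelt == (nelt & -nelt));    /* the suggested power-of-two assert  */

  for (unsigned i = 0; i < 2 * nelt; ++i)
    {
      unsigned old_form = (i + nelt) & (2 * nelt - 1);  /* add-and-mask  */
      unsigned new_form = i ^ nelt;                      /* single XOR    */
      assert (old_form == new_form);
      printf ("%u -> %u\n", i, new_form);
    }
  return 0;
}

With nelt = 4 both forms map 0..3 to 4..7 and 4..7 back to 0..3, i.e. the same
index in the other vector; if nelt were not a power of 2 the XOR would not
implement the wrap-around and the assert above would be the thing that catches
it.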