On Wed, Jul 30, 2014 at 12:10 PM, James Greenhalgh <james.greenha...@arm.com> wrote:
>
> Hi,
>
> A vec_select mask exists in GCC's world-view of lane ordering. The
> "low-half" of the vector { a, b, c, d } is { a, b }, which on big-endian
> will be in the high bits of the architectural register. On little-endian,
> these lanes will be in the low bits of the architectural register.
> We therefore need different masks depending on our target endian-ness.
> The diagram below may help.
>
> We must draw the distinction when building masks which select one half of the
> vector. An instruction selecting architectural low-lanes for a big-endian
> target, must be described using a mask selecting GCC high-lanes.
>
>                  Big-Endian             Little-Endian
>
> GCC             0   1   2   3           3   2   1   0
>               | x | x | x | x |       | x | x | x | x |
> Architecture    3   2   1   0           3   2   1   0
>
> Low Mask:         { 2, 3 }                { 0, 1 }
> High Mask:        { 0, 1 }                { 2, 3 }
>
> The way we implement this requires some "there is no spoon" thinking to avoid
> pattern duplication. We define a vec_par_cnst_lo_half mask to always
> refer to the low architectural lanes. I gave some thought to renaming this
> vec_par_cnst_arch_lo_half, but it didn't add much meaning. I'm happy to
> take bike-shedding towards a more self-documenting naming scheme.
>
> No regressions spotted on aarch64_be-none-elf or aarch64-none-elf.
>
> OK for trunk?

Please make sure the above is still correct if you rip out all
if (BYTES_BIG_ENDIAN) cases from tree-vect*.c.

Richard.

> Thanks,
> James
>
> ---
> gcc/
>
> 2014-07-30  James Greenhalgh  <james.greenha...@arm.com>
>
>         * config/aarch64/aarch64.c (aarch64_simd_vect_par_cnst_half): Vary
>         the generated mask based on BYTES_BIG_ENDIAN.
>         (aarch64_simd_check_vect_par_cnst_half): New.
>         * config/aarch64/aarch64-protos.h
>         (aarch64_simd_check_vect_par_cnst_half): New.
>         * config/aarch64/predicates.md (vect_par_cnst_hi_half): Refactor
>         the check out to aarch64_simd_check_vect_par_cnst_half.
>         (vect_par_cnst_lo_half): Likewise.
>         * config/aarch64/aarch64-simd.md
>         (aarch64_simd_move_hi_quad_<mode>): Always use vec_par_cnst_lo_half.
>         (move_hi_quad_<mode>): Always generate a low mask.
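
For readers following the thread, the Low Mask / High Mask table in the quoted message can be sketched as a small standalone C function. This is only an illustration of the lane arithmetic, not the code from the patch: the name arch_half_mask and its parameters are hypothetical, and the big_endian flag stands in for GCC's BYTES_BIG_ENDIAN so that both cases can be exercised on any host.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not part of the patch): fill OUT with the GCC lane
   indices that make up one architectural half of an NUNITS-lane vector.
   ARCH_HIGH selects the architectural high half; BIG_ENDIAN stands in
   for BYTES_BIG_ENDIAN.  OUT must have room for NUNITS / 2 entries.  */
void
arch_half_mask (int nunits, bool arch_high, bool big_endian, int *out)
{
  assert (nunits > 0 && nunits % 2 == 0);

  int half = nunits / 2;

  /* On little-endian, GCC and architectural lane numbers agree, so the
     architectural high half starts at GCC lane HALF.  On big-endian the
     numbering is reversed, so the two halves swap.  */
  int base = (arch_high != big_endian) ? half : 0;

  for (int i = 0; i < half; i++)
    out[i] = base + i;
}
```

With four lanes, arch_half_mask (4, false, true, out) fills out with { 2, 3 } and arch_half_mask (4, false, false, out) fills it with { 0, 1 }, matching the Low Mask row of the diagram; passing true for arch_high gives the High Mask row.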