Re: [4.8, PATCH, rs6000] (Re: [PATCH, rs6000] More efficient vector permute for little endian)

2014-04-07 Thread Bill Schmidt
Hi,

I'm withdrawing this request as I just discovered it will sometimes be advantageous to use vnand rather than vnor; will rework this and get back to you.

Thanks,
Bill

On Fri, 2014-04-04 at 15:45 -0500, Bill Schmidt wrote:
> On Thu, 2014-03-20 at 20:38 -0500, Bill Schmidt wrote:
> > The origin

Re: [4.8, PATCH, rs6000] (Re: [PATCH, rs6000] More efficient vector permute for little endian)

2014-04-04 Thread Richard Henderson
On 04/04/2014 01:45 PM, Bill Schmidt wrote:
> Per Richard Henderson's previous comment, I have changed the
> patch slightly to avoid the use of emit_move_insn.

Thanks.

r~

[4.8, PATCH, rs6000] (Re: [PATCH, rs6000] More efficient vector permute for little endian)

2014-04-04 Thread Bill Schmidt
On Thu, 2014-03-20 at 20:38 -0500, Bill Schmidt wrote:
> The original workaround for vector permute on a little endian platform
> includes subtracting each element of the permute control vector from 31.
> Because the upper 3 bits of each element are unimportant, this was
> implemented as subtractin

Re: [PATCH, rs6000] More efficient vector permute for little endian

2014-03-21 Thread Richard Henderson
On 03/20/2014 06:38 PM, Bill Schmidt wrote:
> -  rtx splat = gen_rtx_VEC_DUPLICATE (V16QImode,
> -                                     gen_rtx_CONST_INT (QImode, -1));
> +  rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
> +  rtx andx = gen_rtx_AND (V16QImode, notx, notx)

Re: [PATCH, rs6000] More efficient vector permute for little endian

2014-03-21 Thread David Edelsohn
On Thu, Mar 20, 2014 at 9:38 PM, Bill Schmidt wrote:
> Hi,
>
> The original workaround for vector permute on a little endian platform
> includes subtracting each element of the permute control vector from 31.
> Because the upper 3 bits of each element are unimportant, this was
> implemented as sub

[PATCH, rs6000] More efficient vector permute for little endian

2014-03-20 Thread Bill Schmidt
Hi,

The original workaround for vector permute on a little endian platform includes subtracting each element of the permute control vector from 31. Because the upper 3 bits of each element are unimportant, this was implemented as subtracting the whole vector from a splat of -1. On reflection this
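
For readers following the arithmetic in this thread, here is a minimal scalar sketch of why a bitwise complement can stand in for the subtraction. It is not taken from the patch: the helper names (sel_sub31, sel_sub_splat, sel_not, vnor_byte, vnand_byte, count_mismatches) are hypothetical, and each function models a single byte of the permute control vector rather than the real V16QImode operation. It assumes only what the messages state, namely that the upper 3 bits of each element are ignored, so results are compared on the low 5 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: each models ONE byte of the permute control
   vector, not the actual vector instruction or the patch itself.  */

/* Original workaround: subtract the selector from 31.  */
static uint8_t sel_sub31 (uint8_t x)     { return (uint8_t) (31 - x) & 0x1f; }

/* Same thing done by subtracting from a splat of -1 (0xff per byte).  */
static uint8_t sel_sub_splat (uint8_t x) { return (uint8_t) (0xff - x) & 0x1f; }

/* Bitwise complement: ~x == -1 - x in two's complement.  */
static uint8_t sel_not (uint8_t x)       { return (uint8_t) ~x & 0x1f; }

/* Byte-wise NOR and NAND; with both inputs equal, each reduces to ~x.  */
static uint8_t vnor_byte (uint8_t a, uint8_t b)  { return (uint8_t) ~(a | b); }
static uint8_t vnand_byte (uint8_t a, uint8_t b) { return (uint8_t) ~(a & b); }

/* Count selector values for which the formulations disagree on the
   low 5 bits; the expectation is that there are none.  */
static int count_mismatches (void)
{
  int bad = 0;
  for (int i = 0; i < 256; i++)
    {
      uint8_t x = (uint8_t) i;
      uint8_t want = sel_sub31 (x);
      if (sel_sub_splat (x) != want
          || sel_not (x) != want
          || (vnor_byte (x, x) & 0x1f) != want
          || (vnand_byte (x, x) & 0x1f) != want)
        bad++;
    }
  return bad;
}
```

The check works because 31 and -1 are congruent modulo 32, so 31 - x and -1 - x agree on the low 5 bits, and -1 - x is exactly ~x. Whether vnor or vnand is the better instruction choice in a given context is a separate question that this sketch does not address.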