http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

Marc Glisse <marc.glisse at normalesup dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #26938|0                           |1
        is obsolete|                            |

--- Comment #18 from Marc Glisse <marc.glisse at normalesup dot org> 2012-03-25 
13:52:09 UTC ---
Created attachment 26979
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26979
default case

An updated version of this simple, generic-case shuffle (do note that I didn't
run the generated code, just checked that it compiled and the instructions
generated looked roughly ok). With the patch, we have (concerning v4df and
v8sf):

- no single-vector shuffle takes more than 4 insn,
- no 2-vector shuffle takes more than 9 insn (or 3 (+ 2 movs for constants...)
with AVX2).
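
For readers unsure what "single-vector" and "2-vector" mean here, below is a
rough scalar model of the two kinds of shuffle on a 4-lane vector such as v4df.
It is only a sketch, not code from the patch: the function names and the
constant LANES are made up for illustration, and the modulo reduction of the
indexes follows the documented behaviour of GCC's __builtin_shuffle.

```c
#include <assert.h>
#include <stddef.h>

enum { LANES = 4 };  /* illustrative: number of lanes in a v4df */

/* Hypothetical scalar model of a single-vector shuffle:
   out[i] = a[mask[i] mod LANES].  out must not alias a. */
static void shuffle1_model(const double *a, const unsigned *mask, double *out)
{
    assert(out != a);  /* this model does not support in-place use */
    for (size_t i = 0; i < LANES; i++)
        out[i] = a[mask[i] % LANES];
}

/* Hypothetical scalar model of a 2-vector shuffle: indexes are reduced
   modulo 2*LANES; values below LANES select from a, the rest from b.
   out must not alias a or b. */
static void shuffle2_model(const double *a, const double *b,
                           const unsigned *mask, double *out)
{
    assert(out != a && out != b);
    for (size_t i = 0; i < LANES; i++) {
        unsigned idx = mask[i] % (2 * LANES);
        out[i] = idx < LANES ? a[idx] : b[idx - LANES];
    }
}
```

For instance, with a = {0,1,2,3} and b = {4,5,6,7}, the mask {0,4,1,5}
interleaves the low halves of both inputs.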

I think the current code already guarantees that anything that can be done in a
single instruction is done that way.

Some possible goals (making everything optimal may be a bit hard) would be:

- everything that can be done in 2 insn is,
- no single-vector v4df takes more than 3 insn,
- one or two extra optimizations, if they are generic enough.

I do wonder occasionally about allowing wild indexes (jokers, places where you
can put anything) in shuffles, whether it is exposed to users or just an
internal tool.
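
To make the joker idea a bit more concrete, here is a minimal sketch of how a
mask with "don't care" lanes could be matched against the fixed permutation
some instruction implements. Nothing like this exists in the patch: JOKER and
mask_matches are hypothetical names, and using -1 as the joker value is just an
assumption for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define JOKER (-1)  /* hypothetical marker: any source element is acceptable */

/* Return true if the concrete permutation `candidate` satisfies the
   request `wanted`, where JOKER lanes in `wanted` accept anything. */
static bool mask_matches(const int *wanted, const int *candidate, size_t n)
{
    assert(wanted != NULL && candidate != NULL);
    for (size_t i = 0; i < n; i++)
        if (wanted[i] != JOKER && wanted[i] != candidate[i])
            return false;
    return true;
}
```

A request such as {0, JOKER, 2, JOKER} is then satisfied by the concrete
permutation {0, 0, 2, 2} as well as by the identity {0, 1, 2, 3}, so the
expander would be free to pick whichever is cheapest.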
