https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
luoxhu at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |luoxhu at gcc dot gnu.org
--- Comment #2 from luoxhu at gcc dot gnu.org ---
But this only works for V8HImode. Is there no better code generation for other
modes like V4SI/V2DI/V1TI, so that they can also do a byte swap with only two
instructions, as vspltish+vrlh does for V8HI?
unsigned int swap1[16] = {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0};
unsigned int swap2[16] = {7,6,5,4,3,2,1,0,15,14,13,12,11,10,9,8};
unsigned int swap4[16] = {3,2,1,0,7,6,5,4,11,10,9,8,15,14,13,12};
unsigned int swap8[16] = {1,0,3,2,5,4,7,6,9,8,11,10,13,12,15,14};
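To make the selectors above concrete, here is a rough scalar sketch in plain C (no AltiVec intrinsics). It assumes each selector is a plain byte index list, so that out[i] = in[sel[i]], and it ignores the element-numbering and endianness details of the real vperm. The helpers permute_bytes, reverse_elements, selector_matches and check_selectors are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Selectors copied from the comment above. */
static const unsigned int swap1[16] = {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0};
static const unsigned int swap2[16] = {7,6,5,4,3,2,1,0,15,14,13,12,11,10,9,8};
static const unsigned int swap4[16] = {3,2,1,0,7,6,5,4,11,10,9,8,15,14,13,12};
static const unsigned int swap8[16] = {1,0,3,2,5,4,7,6,9,8,11,10,13,12,15,14};

/* Hypothetical scalar model of a one-input byte permute: out[i] = in[sel[i]]. */
static void permute_bytes(const unsigned char in[16], const unsigned int sel[16],
                          unsigned char out[16])
{
  for (int i = 0; i < 16; i++)
    out[i] = in[sel[i]];
}

/* Reference: reverse the bytes inside each element of `width` bytes. */
static void reverse_elements(const unsigned char in[16], int width,
                             unsigned char out[16])
{
  for (int i = 0; i < 16; i++)
    out[i] = in[(i / width) * width + (width - 1 - i % width)];
}

/* Returns 1 if `sel` byte-swaps every `width`-byte element of a 16-byte vector. */
static int selector_matches(const unsigned int sel[16], int width)
{
  unsigned char in[16], permuted[16], expected[16];
  for (int i = 0; i < 16; i++)
    in[i] = (unsigned char)(0xA0 + i);   /* 16 distinct byte values */
  permute_bytes(in, sel, permuted);
  reverse_elements(in, width, expected);
  return memcmp(permuted, expected, 16) == 0;
}

/* Checks the element width each selector corresponds to under this model. */
static void check_selectors(void)
{
  assert(selector_matches(swap1, 16));  /* one 16-byte element (V1TI)  */
  assert(selector_matches(swap2, 8));   /* two 8-byte elements (V2DI)  */
  assert(selector_matches(swap4, 4));   /* four 4-byte elements (V4SI) */
  assert(selector_matches(swap8, 2));   /* eight 2-byte elements (V8HI) */
}
```

Under this model the number in each name is the element count, not the element size: swap8 is the halfword (V8HI) case and swap1 the full 16-byte (V1TI) case.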
For example, V4SI would need to swap within each short first and then swap
within each word, which seems less straightforward than vperm?
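As a scalar illustration of that two-step idea for a single 32-bit element, the sketch below first swaps the bytes inside each halfword (a rotate left by 8, which is what vspltish+vrlh does per halfword) and then swaps the two halfwords (a rotate left by 16). Using a word rotate for the second step is my assumption, not something stated above, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value left by 8: swaps its two bytes. */
static uint16_t rotl16_by8(uint16_t x)
{
  return (uint16_t)((x << 8) | (x >> 8));
}

/* Rotate a 32-bit value left by 16: swaps its two halfwords. */
static uint32_t rotl32_by16(uint32_t x)
{
  return (x << 16) | (x >> 16);
}

/* Two-step byte swap: bytes within each halfword, then halfwords within the word. */
static uint32_t bswap32_two_step(uint32_t x)
{
  uint16_t hi = rotl16_by8((uint16_t)(x >> 16));
  uint16_t lo = rotl16_by8((uint16_t)(x & 0xFFFFu));
  uint32_t bytes_swapped = ((uint32_t)hi << 16) | lo;
  return rotl32_by16(bytes_swapped);
}

/* Reference: direct byte reversal of a 32-bit value, the result one permute gives. */
static uint32_t bswap32_ref(uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000FF00u)
       | ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Checks that the two-step sequence matches the direct reversal. */
static void check_two_step(void)
{
  assert(bswap32_two_step(0x11223344u) == 0x44332211u);
  assert(bswap32_two_step(0x11223344u) == bswap32_ref(0x11223344u));
}
```

In vector terms this would be two rotates plus the splats that materialise the rotate counts, versus a single vperm plus loading its selector, which is the trade-off the question is about.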