https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86723

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
It boils down to an even simpler testcase:
int
bar (unsigned long long value)
{
  return ((value & 0x000000ff00000000ull) >> 8)
         | ((value & 0x0000ff0000000000ull) >> 24)
         | ((value & 0x00ff000000000000ull) >> 40)
         | ((value & 0xff00000000000000ull) >> 56);
}
which is what you get from #c2 after optimizations.
The bswap pass ATM tries to recognize just nops (the 0x0807060504030201
permutation marker) and full bswaps (the 0x0102030405060708 permutation),
where in those permutations the marker bytes mean:
   0       - target byte has the value 0
   FF      - target byte has an unknown value (e.g. due to sign extension)
   1..size - marker value is the byte index in the source (1 for the lsb).
But we could very well also handle masked bswaps, either just the ones one can
get from zero extensions, so 0x0000000005060708 or 0x0000000000000708 etc., or
generally permutations with arbitrary bytes cleared, say 0x0100030400060700,
by emitting __builtin_bswap64 (arg) & 0xff00ffff00ffff00ULL
etc.
Then this optimization would fall out from that, because we'd do
(int) (__builtin_bswap64 (arg) & 0xffffffffULL) and further opts would optimize
away the masking.