https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82498
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Two further cases:

unsigned
f10 (unsigned x, unsigned char y)
{
  y %= __CHAR_BIT__ * __SIZEOF_INT__;
  return (x << y) | (x >> (-y & ((__CHAR_BIT__ * __SIZEOF_INT__) - 1)));
}

unsigned
f11 (unsigned x, unsigned short y)
{
  y %= __CHAR_BIT__ * __SIZEOF_INT__;
  return (x << y) | (x >> (-y & ((__CHAR_BIT__ * __SIZEOF_INT__) - 1)));
}

On f11 GCC also generates efficient code; on f10 it emits a useless &.
I guess the f10 case would be improved by adding a
*<rotate_insn><mode>3_mask_1 define_insn_and_split (and similarly the
inefficient/nonportable f1 code would be slightly improved).

Looking at LLVM, f1/f3/f5 are worse in LLVM than in GCC, and in all those
cases it uses branching instead of cmov; f7/f8/f9/f10/f11 all generate
efficient code though, i.e. the same as GCC in the case of f8 and f11.
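For readers unfamiliar with the idiom, here is a minimal standalone sketch
(not part of the bug report; the name rotl32 and the test values are purely
illustrative) of the same portable rotate pattern that f10/f11 exercise.
The "-y & 31" mask keeps the right-shift count in the 0..31 range even for
y == 0 (where a plain "32 - y" shift count would be undefined behavior),
which is what allows the compiler to recognize the whole expression as a
single rotate instruction on targets that have one:

#include <assert.h>

static unsigned
rotl32 (unsigned x, unsigned char y)
{
  /* Reduce the rotate count to 0..31, as in f10.  */
  y %= __CHAR_BIT__ * __SIZEOF_INT__;
  /* -y & 31 yields 32 - y for y in 1..31, and 0 for y == 0,
     so both shift counts stay in range.  */
  return (x << y) | (x >> (-y & ((__CHAR_BIT__ * __SIZEOF_INT__) - 1)));
}

int
main (void)
{
  assert (rotl32 (0x12345678u, 0) == 0x12345678u);  /* y == 0 edge case */
  assert (rotl32 (0x80000001u, 1) == 0x00000003u);  /* top bit wraps around */
  assert (rotl32 (0x12345678u, 4) == 0x23456781u);
  return 0;
}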