On Thu, 2012-09-20 at 09:12 +0200, Eric Botcazou wrote:
> > The attached patch catches C constructs:
> > (A << 8) | (A >> 8)
> > where A is unsigned 16 bits
> > and maps them to builtin_bswap16(A) which can provide more efficient
> > implementations on some targets.
> 
> This belongs in tree-ssa-math-opts.c:execute_optimize_bswap instead.
> 
> When I implemented __builtin_bswap16, I didn't add this because I thought this
> would be overkill since the RTL combiner should be able to catch the pattern.
> Have you investigated on this front?  But I don't have a strong opinion.
> 

A while ago I tried doing that for SH (implementing bswap16 with RTL
combine).  It led to an explosion of patterns, because combine would
try out many different combinations depending on the code surrounding
the actual bswap16.  In the end I decided to drop that stuff for the
most part.
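
For reference, the source-level pattern from the quoted patch description
can be put side by side with the built-in as below.  This is only a
minimal sketch: the function names swap_idiom and swap_builtin are made up
for illustration, and it assumes a compiler that provides
__builtin_bswap16.

```c
#include <stdint.h>

/* The open-coded idiom: A is an unsigned 16 bit value.  The operands are
   promoted to int, so the result is truncated back to 16 bits.  */
uint16_t swap_idiom (uint16_t a)
{
  return (uint16_t) ((a << 8) | (a >> 8));
}

/* What the patch wants the idiom to be mapped to.  */
uint16_t swap_builtin (uint16_t a)
{
  return __builtin_bswap16 (a);
}
```

Both functions should return the same value for every 16 bit input, so
comparing the code generated for the two shows whether the idiom is being
recognized on a given target.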

BTW, the built-in documentation says:

Built-in Function: int16_t __builtin_bswap16 (int16_t x)
Built-in Function: int32_t __builtin_bswap32 (int32_t x)

However, it seems the result is always unsigned for those.
At least on SH I get the following:

int test (short x)
{
  return __builtin_bswap16 (x);
}

swap.b  r4,r4   ! 8     *rotlhi3_8
rts             ! 24    *return_i
extu.w  r4,r0   ! 9     *zero_extendhisi2_compact

... and similarly for int32.
Can anyone else confirm this?
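
One way to check this without reading the assembly is to compare the
widened result against both possible extensions.  The following is a
rough sketch, not a proper testsuite entry; the helper names
bswap16_widened, expect_unsigned and expect_signed are hypothetical, and
the int16_t conversion relies on GCC's modulo behavior for out-of-range
values.

```c
#include <stdint.h>

/* Same shape as the test function above: signed short in, int out.  */
int bswap16_widened (short x)
{
  return __builtin_bswap16 (x);
}

/* Widening expected if the built-in returns an unsigned 16 bit value
   (zero extension).  */
int expect_unsigned (short x)
{
  uint16_t u = (uint16_t) x;
  return (uint16_t) ((u << 8) | (u >> 8));
}

/* Widening expected if the built-in returns a signed 16 bit value
   (sign extension).  */
int expect_signed (short x)
{
  uint16_t u = (uint16_t) x;
  return (int16_t) ((u << 8) | (u >> 8));
}
```

An input such as 0x0080 is a useful probe, because the swapped value
0x8000 has the top bit set: zero extension gives 32768, sign extension
gives -32768.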

Cheers,
Oleg
