https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98908

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
   Target Milestone|---                         |9.0
         Resolution|---                         |FIXED

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Gabriel Ravier from comment #6)
> Also the second example wasn't misoptimized, on the contrary it was the most
> reasonable portable function I could write that would work equivalently to
> the first *and* that GCC would optimize ideally.

GCC 7.1.0 produces:
f(reg):
.LFB0:
        .cfi_startproc
        mov     edx, edi
        xor     eax, eax
        mov     ecx, edi
        and     edx, -2
        mov     al, dl
        movzx   edx, ch
        and     edx, -128
        mov     ah, dl
        ret
f1(reg):
.LFB1:
        .cfi_startproc
        and     di, -32514
        xor     eax, eax
        movzx   edx, di
        mov     al, dil
        sar     edx, 8
        mov     ah, dl
        ret

f is your first example and f1 is the second.
As you can see, GCC before GCC 8 optimized neither of them.
In GCC 8, the second function produces:
  _1 = x.l;
  _2 = (signed short) _1;
  _3 = x.h;
  _4 = (int) _3;
  _5 = _4 << 8;
  _6 = (signed short) _5;
  _7 = _2 | _6;
  _8 = (short unsigned int) _7;
  tmp_14 = _8 & 33022;
  MEM[(unsigned char *)&D.2500] = tmp_14;

And it is only optimized in GCC 9, with bswap producing what I mentioned before.

So fixed for GCC 9.
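
For reference, the constants in the dumps above all describe the same mask: -2 and -128 as bytes are 0xFE and 0x80, and both -32514 (as a 16-bit value) and 33022 are 0x80FE. Below is a minimal C sketch of the two shapes, assuming a two-byte struct whose members are named l and h as in the GIMPLE dump. The names mask_bytes and mask_combined are hypothetical, and this is not the testcase from the bug, only an illustration of why the two forms are equivalent.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: two unsigned bytes, named after x.l / x.h in the dump. */
struct reg {
    uint8_t l;
    uint8_t h;
};

/* Per-byte form: mirrors the "and -2" / "and -128" pair in the GCC 7 output for f. */
static struct reg mask_bytes(struct reg x)
{
    struct reg r;
    r.l = x.l & 0xFE; /* (uint8_t)-2   == 0xFE */
    r.h = x.h & 0x80; /* (uint8_t)-128 == 0x80 */
    return r;
}

/* Combined form: mirrors the GCC 8 dump (l | h << 8, then & 33022).
   The dump stores the result through MEM[]; here it is split back with
   shifts so that the sketch does not depend on byte order. */
static struct reg mask_combined(struct reg x)
{
    uint16_t tmp = (uint16_t)(x.l | (x.h << 8));
    struct reg r;
    tmp &= 33022; /* 0x80FE */
    r.l = (uint8_t)(tmp & 0xFF);
    r.h = (uint8_t)(tmp >> 8);
    return r;
}
```

Both functions return the same value for every input, so the difference between releases is only in how well each form is recognized and folded.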
