https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ubizjak at gmail dot com

--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
Some more examples:

--cut here--
struct S1
{
  char val1;
  char val2;
  short pad2;
};

extern struct S1 s;

void test (struct S1 a)
{
  s.val1 = a.val1;
  s.val2 = a.val2;
}
--cut here--

results in (x86_64, -O2):

test:
        movl    %edi, %eax
        movb    %dil, s(%rip)
        movb    %ah, s+1(%rip)
        ret

the code above is equivalent to:

        movw    %di, s(%rip)
        ret

--cut here--
struct S1
{
  short val1;
  short val2;
};

extern struct S1 s;

void test (struct S1 a)
{
  s.val1 = a.val1;
  s.val2 = a.val2;
}
--cut here--

results in (x86_64, -O2):

test:
        movw    %di, s(%rip)
        sarl    $16, %edi
        movw    %di, s+2(%rip)
        ret

the code above is equivalent to:

        movl    %edi, s(%rip)
        ret
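
The equivalence claimed for both test cases can be checked at the C level. The following is only a minimal sketch: the names S1_bytes, S1_shorts and all helper functions are hypothetical (the report calls both structs S1), and memcpy stands in for the single wider store. It shows that the field-by-field stores and the merged copy leave the same bytes in memory; it does not claim that any particular compiler emits one store for either form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names: the report calls both structs S1. */
struct S1_bytes  { char val1; char val2; short pad2; };
struct S1_shorts { short val1; short val2; };

/* Field-by-field stores, as written in the first test case. */
static void store_bytes_fieldwise (struct S1_bytes *dst, struct S1_bytes a)
{
  dst->val1 = a.val1;
  dst->val2 = a.val2;
}

/* Hand-merged form: one 2-byte copy covering val1 and val2; pad2 is
   left untouched.  Assumes val2 sits directly after val1.  */
static void store_bytes_merged (struct S1_bytes *dst, struct S1_bytes a)
{
  assert (offsetof (struct S1_bytes, val2) == 1);
  memcpy (dst, &a, 2);
}

/* Field-by-field stores, as written in the second test case. */
static void store_shorts_fieldwise (struct S1_shorts *dst, struct S1_shorts a)
{
  dst->val1 = a.val1;
  dst->val2 = a.val2;
}

/* Hand-merged form: one copy covering both adjacent shorts. */
static void store_shorts_merged (struct S1_shorts *dst, struct S1_shorts a)
{
  assert (offsetof (struct S1_shorts, val2) == sizeof (short));
  memcpy (dst, &a, 2 * sizeof (short));
}

/* Return 1 if both forms leave identical bytes in the destination. */
static int bytes_forms_agree (char v1, char v2)
{
  struct S1_bytes a = { v1, v2, 0x1234 };
  struct S1_bytes x, y;
  memset (&x, 0x55, sizeof x);
  memset (&y, 0x55, sizeof y);
  store_bytes_fieldwise (&x, a);
  store_bytes_merged (&y, a);
  return memcmp (&x, &y, sizeof x) == 0;
}

static int shorts_forms_agree (short v1, short v2)
{
  struct S1_shorts a = { v1, v2 };
  struct S1_shorts x, y;
  memset (&x, 0x55, sizeof x);
  memset (&y, 0x55, sizeof y);
  store_shorts_fieldwise (&x, a);
  store_shorts_merged (&y, a);
  return memcmp (&x, &y, sizeof x) == 0;
}
```

Pre-filling both destinations with the same byte pattern makes the comparison also cover pad2, which neither form is supposed to modify.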

The first pattern occurs many times in libstdc++. Looking at
src/c++11/cxx11-shim_facets.o for x86_64, there are several examples of:

 19d:   88 43 5c                mov    %al,0x5c(%rbx)
 1a0:   88 63 5d                mov    %ah,0x5d(%rbx)

and in cc1 itself there are even gems like:

 13ee3c6:       48 8b 53 10             mov    0x10(%rbx),%rdx
 13ee3ca:       48 8b 43 28             mov    0x28(%rbx),%rax
 13ee3ce:       44 89 f9                mov    %r15d,%ecx
 13ee3d1:       44 88 7c 02 fc          mov    %r15b,0xfffffffffffffffc(%rdx,%rax,1)
 13ee3d6:       48 8b 53 10             mov    0x10(%rbx),%rdx
 13ee3da:       48 8b 43 28             mov    0x28(%rbx),%rax
 13ee3de:       88 6c 02 fd             mov    %ch,0xfffffffffffffffd(%rdx,%rax,1)

An additional problem is that a move from a high-byte register (%ch) can use
only %ah, %bh, %ch or %dh as its source, and the address can't use
REX-prefixed registers in this case, while a move from a 16-bit register
avoids both limitations.

So, merging would benefit x86 targets considerably.
