On Tue, Oct 4, 2011 at 8:37 PM, H.J. Lu <hjl.to...@gmail.com> wrote:

>>>> OTOH, x86_64 and i686 targets can also benefit from this change. If
>>>> combine can't create a more complex address (covered by lea), then it
>>>> will simply propagate the memory operand back into the add insn. It
>>>> looks to me that we can't lose here, so:
>>>>
>>>>  /* Improve address combine.  */
>>>>  if (code == PLUS && MEM_P (src2))
>>>>    src2 = force_reg (mode, src2);
>>>>
>>>> Any opinions?
>>>>
>>>
>>> It doesn't work with 64bit libstdc++:
>>
>> Yeah, yeah. ix86_output_mi_thunk has some ...  issues.
>>
>> Please try attached patch that introduces ix86_emit_binop and uses it
>> in a bunch of places.

> I tried it on GCC.  There are no regressions.  The bugs are fixed for x32.
> Here is a size comparison of the GCC runtime libraries on ia32, x32 and
> x86-64:

>    text    data     bss     dec     hex filename
>  884093   18600   27064  929757   e2fdd old libstdc++.so
>  884189   18600   27064  929853   e303d new libs/libstdc++.so
>
> The new code is
>
> mov    0xc(%edi),%eax
> mov    %eax,0x8(%esi)
> mov    -0xc(%eax),%eax
> mov    0x10(%edi),%edx
> lea    0x8(%esi,%eax,1),%eax
>
> The old one is
>
> mov    0xc(%edi),%edx
> lea    0x8(%esi),%eax
> mov    %edx,0x8(%esi)
> add    -0xc(%edx),%eax
> mov    0x10(%edi),%edx

The new code merged lea+add into one lea, so it looks quite OK to me.
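
To make the equivalence concrete, here is a minimal C sketch that models the
final address computation of both sequences. The helper names (lea, old_seq,
new_seq) are hypothetical and exist only for this illustration; registers are
modeled as 32-bit unsigned values, and "mem" stands for the word loaded from
-0xc(%edx) in the old code or -0xc(%eax) in the new code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "lea disp(base,index,scale)": computes
   base + index * scale + disp, wrapping at 32 bits.  */
static uint32_t
lea (uint32_t base, uint32_t index, uint32_t scale, int32_t disp)
{
  return base + index * scale + (uint32_t) disp;
}

/* Old sequence:  lea 0x8(%esi),%eax ; add -0xc(%edx),%eax  */
static uint32_t
old_seq (uint32_t esi, uint32_t mem)
{
  uint32_t eax = lea (esi, 0, 1, 8);
  eax += mem;                   /* add with a memory operand */
  return eax;
}

/* New sequence:  mov -0xc(%eax),%eax ; lea 0x8(%esi,%eax,1),%eax  */
static uint32_t
new_seq (uint32_t esi, uint32_t mem)
{
  uint32_t eax = mem;           /* memory operand loaded into a register */
  return lea (esi, eax, 1, 8);
}

/* Check that both sequences agree, including on wrap-around.  */
static void
check_equivalence (void)
{
  assert (old_seq (0x1000, 0x20) == new_seq (0x1000, 0x20));
  assert (old_seq (0xfffffff0u, 0x40) == new_seq (0xfffffff0u, 0x40));
}
```

Both functions return esi + mem + 8, so the sketch only shows that the result
is unchanged; it says nothing about the relative speed of the two sequences.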

Do you have some performance numbers?

Thanks,
Uros.
