On Fri, Jan 18, 2013 at 9:32 AM, Konstantin Vladimirov
<konstantin.vladimi...@gmail.com> wrote:
> Hi,
>
> I ran into this problem in a private backend, but it can easily be
> reproduced with x86 GCC:
>
> Sample code (test.c):
>
> int a;
>
> int foo(int *x, int y)
> {
>   a = x[(y << 1)];
>   x[(y << 1)] = y;
>   return 0;
> }
>
> Compile with gcc-4.7.2:
>
> $ gcc --version
> gcc (GCC) 4.7.2
> Copyright (C) 2012 Free Software Foundation, Inc.
>
> Command line is:
>
> $ gcc -O2 -S -m32 -dp test.c
>
> Yields code:
>
> foo:
> .LFB0:
>   .cfi_startproc
>   movl  8(%esp), %edx # 3 *movsi_internal/1 [length = 4]
>   leal  0(,%edx,8), %eax  # 25  *leasi  [length = 7]
>   addl  4(%esp), %eax # 9 *addsi_1/1  [length = 4]

I think you'd first need to combine the lea and the add into the
intermediate form

     movl 4(%esp), %ecx
     leal 0(%ecx,%edx,8), %eax

only then fwprop may consider combining this with the load/stores.
Note that combine does not apply because %eax is used multiple
times.  This also means that for code-size the combining is not a good
idea.
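
As a rough illustration of the code-size remark, the sketch below counts
bytes only and is not GCC's cost model. The helper name and its parameters
are hypothetical. The "+1 byte per use" figure comes from the [length = N]
annotations in the quoted listings (2 bytes for an access through (%eax),
3 bytes for an access through (%edx,%eax,8)). The 3-byte length of a
base-plus-index lea (opcode, ModRM, SIB) is an assumption that does not
appear in the thread.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of GCC.
 * Returns the change in code size, in bytes, when an address computation
 * is folded into `uses` memory operands instead of being computed once
 * into a register. A negative result means the folded form is smaller.
 *
 * addr_calc_bytes:     bytes no longer needed once the address is folded
 * extra_bytes_per_use: bytes each memory operand grows by (the SIB byte)
 * uses:                number of loads/stores that use the address
 */
static int fold_delta_bytes(int addr_calc_bytes, int extra_bytes_per_use,
                            int uses)
{
    assert(uses >= 0);
    return uses * extra_bytes_per_use - addr_calc_bytes;
}
```

For the listing GCC emits, folding removes the leal (7) and the addl (4) but
adds a 4-byte load of x, so addr_calc_bytes is 7 and
fold_delta_bytes(7, 1, 2) gives the -5 reported in the quoted message. With
the intermediate lea form and the assumed 3-byte lea,
fold_delta_bytes(3, 1, 2) is only -1, and the result turns positive from
four uses on, so the benefit shrinks with every additional use of %eax.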

Richard.

>   movl  (%eax), %ecx  # 10  *movsi_internal/1 [length = 2]
>   movl  %ecx, a # 11  *movsi_internal/2 [length = 6]
>   movl  %edx, (%eax)  # 12  *movsi_internal/2 [length = 2]
>   xorl  %eax, %eax  # 29  *movsi_xor  [length = 2]
>   ret # 28  simple_return_internal  [length = 1]
>
> Obviously, this can be rewritten much better without the preceding leal:
>
> foo:
> .LFB0:
>   .cfi_startproc
>   movl  8(%esp), %eax # 3 *movsi_internal/1 [length = 4]
>   movl  4(%esp), %edx # 19  *movsi_internal/1 [length = 4]
>   movl  (%edx,%eax,8), %ecx  # 8 *movsi_internal/1 [length = 3]
>   movl  %ecx, a # 11  *movsi_internal/2 [length = 6]
>   movl  %eax, (%edx,%eax,8)  # 8 *movsi_internal/2 [length = 3]
>   xorl  %eax, %eax  # 29  *movsi_xor  [length = 2]
>   ret # 28  simple_return_internal  [length = 1]
>
> (This assembly is handwritten; total instruction count is -1, total
> length is -5 bytes.)
>
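The -1 and -5 figures can be checked by adding up the [length = N]
annotations from the two listings. This is a minimal sketch; the array and
function names are hypothetical and only the numbers come from the listings.

```c
#include <assert.h>
#include <stddef.h>

/* [length = N] values copied from the -dp output of gcc-4.7.2. */
static const int gcc_lengths[] = { 4, 7, 4, 2, 6, 2, 2, 1 };

/* [length = N] values copied from the handwritten listing. */
static const int hand_lengths[] = { 4, 4, 3, 6, 3, 2, 1 };

/* Hypothetical helper: total encoded size of a listing, in bytes. */
static int total_length(const int *lengths, size_t count)
{
    int total = 0;

    assert(lengths != NULL);
    for (size_t i = 0; i < count; i++)
        total += lengths[i];
    return total;
}
```

The compiler output has 8 instructions totalling 28 bytes, and the
handwritten version has 7 instructions totalling 23 bytes, which matches
the stated differences.
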
> The key idea is that the common address computed by the leal is more
> profitable when folded into the address operands of the load and the
> store than when calculated separately.
>
> When there is only a store or only a load, this job is done by the
> combine pass.
>
> But it seems that the combine pass doesn't even try the store+load case.
>
> Am I missing something?
>
> P.S. I posted a similar question to gcc-help and got no response, so I
> am reposting it here (sorry if this is off-topic).
>
> ---
> With best regards, Konstantin
