On Fri, Jan 18, 2013 at 9:32 AM, Konstantin Vladimirov
<[email protected]> wrote:
> Hi,
>
> I ran into this problem in a private backend, but it is easily
> reproduced with x86 GCC:
>
> Sample code (test.c):
>
> int a;
>
> int foo(int *x, int y)
> {
>   a = x[(y << 1)];
>   x[(y << 1)] = y;
>   return 0;
> }
>
> Compile with gcc-4.7.2:
>
> $ gcc --version
> gcc (GCC) 4.7.2
> Copyright (C) 2012 Free Software Foundation, Inc.
>
> Command line is:
>
> $ gcc -O2 -S -m32 -dp test.c
>
> Yields code:
>
> foo:
> .LFB0:
> .cfi_startproc
> movl 8(%esp), %edx # 3 *movsi_internal/1 [length = 4]
> leal 0(,%edx,8), %eax # 25 *leasi [length = 7]
> addl 4(%esp), %eax # 9 *addsi_1/1 [length = 4]
I think you'd first need an intermediate step that combines the lea and
the add into

  movl 4(%esp), %ecx
  leal 0(%ecx,%edx,8), %eax

Only then might fwprop consider combining this with the load/stores.
Note that combine does not apply because %eax is used multiple
times. This also means that for code size the combining is not a good
idea.
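
To make the shared address concrete at the source level, here is a rough C
sketch of the two shapes under discussion. The names foo_shared and
foo_indexed are hypothetical, used only for illustration, and which assembly
either one produces depends on the compiler version and options. The sketch
only shows that the two forms are semantically equivalent.

```c
int a;

/* Shape 1: the address x + 2*y is computed once and reused by both
   the load and the store (one address value with multiple uses). */
int foo_shared(int *x, int y)
{
    int *p = &x[y << 1];
    a = *p;
    *p = y;
    return 0;
}

/* Shape 2: the load and the store each spell out base + index
   themselves, as in the original test.c. */
int foo_indexed(int *x, int y)
{
    a = x[y << 1];
    x[y << 1] = y;
    return 0;
}
```

Calling both on identical buffers with the same y leaves the same value in a
and the same contents in the buffer.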
Richard.
> movl (%eax), %ecx # 10 *movsi_internal/1 [length = 2]
> movl %ecx, a # 11 *movsi_internal/2 [length = 6]
> movl %edx, (%eax) # 12 *movsi_internal/2 [length = 2]
> xorl %eax, %eax # 29 *movsi_xor [length = 2]
> ret # 28 simple_return_internal [length = 1]
>
> Obviously, this can be rewritten much better without the preceding leal:
>
> foo:
> .LFB0:
> .cfi_startproc
> movl 8(%esp), %eax # 3 *movsi_internal/1 [length = 4]
> movl 4(%esp), %edx # 19 *movsi_internal/1 [length = 4]
> movl (%edx,%eax,8), %ecx # 8 *movsi_internal/1 [length = 3]
> movl %ecx, a # 11 *movsi_internal/2 [length = 6]
> movl %eax, (%edx,%eax,8) # 8 *movsi_internal/2 [length = 3]
> xorl %eax, %eax # 29 *movsi_xor [length = 2]
> ret # 28 simple_return_internal [length = 1]
>
> (This assembly is handwritten; total instruction count is 1 lower, total
> length is 5 bytes shorter.)
>
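
The size figures can be checked by summing the [length = N] annotations in
the two listings. A minimal C sketch follows; the array and function names
are made up for illustration, and the values are copied by hand from the
listings in this message.

```c
#include <stddef.h>

/* Per-instruction lengths in bytes, from the gcc -O2 -dp output. */
const int gcc_lengths[] = { 4, 7, 4, 2, 6, 2, 2, 1 };

/* Per-instruction lengths in bytes, from the handwritten version. */
const int hand_lengths[] = { 4, 4, 3, 6, 3, 2, 1 };

/* Sum n entries of v. */
int total_length(const int *v, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}
```

With these inputs the totals are 28 and 23 bytes over 8 and 7 instructions,
which matches the stated difference of one instruction and five bytes.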
> The key idea is that the common address computed by the leal is better
> not calculated separately, but folded into the address operands of both
> the load and the store.
>
> When there is only a store or only a load, combine does this job.
>
> But it seems that the combine pass doesn't even try the store+load case.
>
> Am I missing something?
>
> P.S. I posted a similar question to gcc-help and got no response, so I am
> reposting here (sorry if this is off-topic).
>
> ---
> With best regards, Konstantin