On Fri, Jan 18, 2013 at 9:32 AM, Konstantin Vladimirov <konstantin.vladimi...@gmail.com> wrote:
> Hi,
>
> Faced this problem in private backend, but it can be easily reproduced
> on x86 GCC:
>
> Sample code (test.c):
>
> int a;
>
> int foo(int *x, int y)
> {
>   a = x[(y << 1)];
>   x[(y << 1)] = y;
>   return 0;
> }
>
> Compile with gcc-4.7.2:
>
> $ gcc --version
> gcc (GCC) 4.7.2
> Copyright (C) 2012 Free Software Foundation, Inc.
>
> Command line is:
>
> $ gcc -O2 -S -m32 -dp test.c
>
> Yields code:
>
> foo:
> .LFB0:
>   .cfi_startproc
>   movl 8(%esp), %edx # 3 *movsi_internal/1 [length = 4]
>   leal 0(,%edx,8), %eax # 25 *leasi [length = 7]
>   addl 4(%esp), %eax # 9 *addsi_1/1 [length = 4]

I think you'd first need an intermediate step that combines the lea and
the add into

  movl 4(%esp), %ecx
  leal 0(%ecx,%edx,8), %eax

Only then may fwprop consider combining this with the load/stores.
Note that combine does not apply because %eax is used multiple times.
This also means that for code-size the combining is not a good idea.

Richard.

>   movl (%eax), %ecx # 10 *movsi_internal/1 [length = 2]
>   movl %ecx, a # 11 *movsi_internal/2 [length = 6]
>   movl %edx, (%eax) # 12 *movsi_internal/2 [length = 2]
>   xorl %eax, %eax # 29 *movsi_xor [length = 2]
>   ret # 28 simple_return_internal [length = 1]
>
> It is obvious, that it can be rewritten without prior leal much better:
>
> foo:
> .LFB0:
>   .cfi_startproc
>   movl 8(%esp), %eax # 3 *movsi_internal/1 [length = 4]
>   movl 4(%esp), %edx # 19 *movsi_internal/1 [length = 4]
>   movl (%edx,%eax,8), %ecx # 8 *movsi_internal/1 [length = 3]
>   movl %ecx, a # 11 *movsi_internal/2 [length = 6]
>   movl %eax, (%edx,%eax,8) # 8 *movsi_internal/2 [length = 3]
>   xorl %eax, %eax # 29 *movsi_xor [length = 2]
>   ret # 28 simple_return_internal [length = 1]
>
> (this assembler is handwritten, summary instruction count -1, summery length
> -5)
>
> Key Idea is that common address here, that forms leal is profitable to
> be not calculated standalone, but moved into address operands in store
> and in load.
>
> When we have only store or only load, this job is done by combining.
>
> But it seems, that combine pass even don't try store+load.
>
> Am I missing something?
>
> P.S. Posted similar one to gcc-help, got no response, trying to repost
> here (sorry for possible offtopic).
>
> ---
> With best regards, Konstantin
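
For reference, the quoted totals ("instruction count -1, length -5") can be checked by summing the [length = N] annotations in the two listings. The sketch below only does that arithmetic. The array and function names are made up for illustration, and the byte counts are copied from the -dp output and the handwritten listing in the quoted message, not measured independently.

```c
#include <stddef.h>

/* Byte lengths copied from the "[length = N]" annotations of the
 * gcc 4.7.2 output quoted in this thread (8 instructions). */
static const int gcc_lengths[] = { 4, 7, 4, 2, 6, 2, 2, 1 };

/* Byte lengths copied from the handwritten listing (7 instructions). */
static const int hand_lengths[] = { 4, 4, 3, 6, 3, 2, 1 };

/* Hypothetical helper: sum the encoded lengths of a listing. */
static int total_length(const int *lengths, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += lengths[i];
    return total;
}
```

Calling total_length() on each array gives the two byte totals whose difference is the -5 quoted above. The element counts, 8 and 7, give the -1 instruction.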