------- Additional Comments From steven at gcc dot gnu dot org  2005-01-23 19:23 -------
For x86 I get this: 
g: 
        movl    r+8, %edx 
        movl    r, %eax 
        addl    %edx, %eax 
        movl    %eax, r 
        addl    r+4, %eax 
        movl    %eax, r+4 
        addl    %edx, %eax 
        movl    %eax, r+8 
        ret 
 
That is pretty much the best you can get, as far as I can tell. 
 
For AMD64 it's similar: 
 
g: 
.LFB2: 
        movl    r+8(%rip), %edx 
        movl    r(%rip), %eax 
        addl    %edx, %eax 
        movl    %eax, r(%rip) 
        addl    r+4(%rip), %eax 
        movl    %eax, r+4(%rip) 
        addl    %edx, %eax 
        movl    %eax, r+8(%rip) 
        ret 
.LFE2: 
 
I'm not sure what the missed optimization is supposed to be here.  You will 
have to show what you want at the assembly level, and explain why you think 
this is a coalescing problem.  So far, I don't see a missed optimization. 
 
What is worse is that we fail to perform store motion when such a block is 
placed inside a loop, e.g. 
 
int r[6]; 
void g (int n) 
{ 
  while (--n) 
    { 
      r[0] += r[1]; 
      r[1] += r[2]; 
      r[2] += r[0]; 
    } 
} 
 
which is the issue discussed in PR19581. 

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19580
