http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
Richard Guenther changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #11 from Richard Guenther 2011-01-20
10:36:32 UTC ---
Author: rguenth
Date: Thu Jan 20 10:36:29 2011
New Revision: 169051
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169051
Log:
2011-01-20 Richard Guenther
PR tree-o
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #10 from Richard Guenther 2011-01-20
10:33:18 UTC ---
Author: rguenth
Date: Thu Jan 20 10:33:15 2011
New Revision: 169050
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169050
Log:
2011-01-20 Richard Guenther
PR tree-o
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #9 from Martin Reinecke 2011-01-19
17:26:31 UTC ---
(In reply to comment #8)
> Can you check if the following patch solves your problem?
Yes, this patch gets performance back to normal on the 4.5 branch and on trunk.
Great!
> The di
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
Richard Guenther changed:
What|Removed |Added
CC||rguenth at gcc dot gnu.org
--- Comment
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #7 from Martin Reinecke 2011-01-19
14:16:18 UTC ---
OK, I located the problematic commit, at least on the 4.5 branch: it's revision
number 167492 (fix for PR tree-optimization/46806).
Between revisions 167491 and 167492 the CPU time
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #6 from Uros Bizjak 2011-01-06 07:38:11
UTC ---
(In reply to comment #5)
> Some loop performance is very sensitive to code sizes. This change
>
> -mulpd %xmm10, %xmm2
> +mulpd %xmm0, %xmm2
>
> will impact loop size d
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #5 from H.J. Lu 2011-01-05 20:09:11
UTC ---
(In reply to comment #3)
> > this could be the reason for slowdown.
>
>
> $ gcc -lm testcase2.s
> $ time ./a.out
>
> real 0m4.239s
> user 0m4.234s
> sys 0m0.001s
>
> The im
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #4 from Uros Bizjak 2011-01-05 19:48:58
UTC ---
Applying the same medicine to original test gets us from:
wall time for map2alm: 6.908527s
to
wall time for map2alm: 6.703142s
where 4.5.1 wins with:
wall time for map2alm: 6.651740
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #3 from Uros Bizjak 2011-01-05 19:30:49
UTC ---
> this could be the reason for slowdown.
Hm, not really.
But, by changing the generated assembly around loop entry:
$ diff -u testcase2.s testcase2_.s
--- testcase2.s 2011-01-05 20
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #2 from Uros Bizjak 2011-01-05 17:31:20
UTC ---
The only difference in the hot loop is the usage of two regs in the address:
4.5.1:
.L142:
movapd %xmm0, (%rcx)
mulpd %xmm6, %xmm2
addq $32, %rbx
movapd %xm
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47167
--- Comment #1 from Martin Reinecke 2011-01-05
14:42:20 UTC ---
Created attachment 22904
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22904
shorter test case
More compact test case; the hot spot is marked with "CRITICAL LOOP".
Compile wit