--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com 2009-06-29
02:25 ---
Fixed
--
luisgpm at linux dot vnet dot ibm dot com changed:
What|Removed |Added
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2009-06-29
02:24 ---
Already committed on 4.5. Closing...
--
luisgpm at linux dot vnet dot ibm dot com changed:
What|Removed |Added
--- Comment #19 from luisgpm at linux dot vnet dot ibm dot com 2009-06-03
03:01 ---
A little bit of information about the problem.
On 32-bit code, the loads are being pushed up, from a different BB to the BB
where we have the stfd instruction, during global scheduling. I suspect the
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2009-05-29
19:52 ---
From predictive commoning we gain a bit more performance, probably due to the
bigger unrolling factor.
Any chance of the unrolling taking place while still using PRE?
Thanks,
Luis
--
h
--- Comment #18 from luisgpm at linux dot vnet dot ibm dot com 2009-05-15
02:19 ---
64-bit with -mcpu=power6
.L93:
fmul 20,11,13
fmul 19,11,0
addis 12,11,0xffe5
lfd 3,0(11)
addi 5,11,8
lfd 2,9472(12)
addis 14,5,0xffe5
--- Comment #17 from luisgpm at linux dot vnet dot ibm dot com 2009-05-15
02:16 ---
Actually, 64-bit is affected too, but not with the "power6x" tuning I was
using. With "-mcpu=power6" I can reproduce the problem.
The problem seems to be a couple load instruc
--- Comment #16 from luisgpm at linux dot vnet dot ibm dot com 2009-05-14
04:12 ---
Just for the record... The 64-bit case is fixed. There are still performance
issues in the 32-bit case.
I'll attach more information soon.
Luis
--
http://gcc.gnu.org/bugzilla/show_bug.c
--- Comment #11 from luisgpm at linux dot vnet dot ibm dot com 2009-05-12
12:55 ---
Any updates?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com 2009-05-11
18:04 ---
Good asm code for a hot loop in swim's "calc1" function
10001e10: lfd f12,-10672(r11)
10001e14: lfd f9,-10672(r9)
10001e18: addi r21,r21,16
10001e1c: lf
merge (gcc
r145494)
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: luisgpm at linux dot vne
--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com 2009-05-04
15:50 ---
The configure options follow, although I think you'll be able to reproduce it
with the flags I've mentioned.
/gcc/HEAD/configure --target=powerpc64-linux --host=powerpc64-linux
--build=power
--- Comment #8 from luisgpm at linux dot vnet dot ibm dot com 2009-05-04
15:41 ---
Oops... forgot about that bit, sorry.
Compile flags: -m32 -O3 -mcpu=power[4|5|6] -ffast-math -ftree-loop-linear
-funroll-loops -fpeel-loops
This reproduces on both 32-bit and 64-bit.
Luis
--
http
--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com 2009-05-04
13:50 ---
Just for the sake of information, -fselective-scheduling doesn't help.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
--- Comment #2 from luisgpm at linux dot vnet dot ibm dot com 2009-04-30
19:38 ---
Created an attachment (id=17786)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17786&action=view)
Last tree pass for the bad code
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com 2009-04-30
19:29 ---
ASM code for the bad loop
.L145:
fmul 10,8,13
fmul 5,8,0
addis 3,4,0xffe5
lfd 22,8(7)
addi 7,4,8
lfd 6,9472(3)
fmadd 10,9,0,10
fmsub
--- Comment #3 from luisgpm at linux dot vnet dot ibm dot com 2009-04-30
16:33 ---
This is already in 4.4, but we would like to add additional checks on 4.5 that
would be risky to have on 4.4 (since it was about to be released). I have the
additional patch and will attach it soon
Severity: normal
Priority: P3
Component: regression
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: luisgpm at linux dot vnet dot ibm dot com
GCC build triplet: powerpc*-*-*
GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*
http://gcc.
--- Comment #1 from luisgpm at linux dot vnet dot ibm dot com 2009-01-09
18:00 ---
Created an attachment (id=17065)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17065&action=view)
Second part of the combined patch
Additional check to avoid returning a NULL base. Th
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: luisgpm at linux dot vnet dot ibm dot com
GCC build triplet: powerpc*-*-*
GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*
http://
--- Comment #21 from luisgpm at linux dot vnet dot ibm dot com 2008-10-03
20:59 ---
It fixes the bzip2 ICE.
Thanks,
Luis
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2008-10-02
01:43 ---
This problem also showed up as a CPU2000 regression in the Sixtrack benchmark
for PPC64, causing problems in the ordering of ld/st instructions.
A GCC patched with Richard's fix produced the right
--- Comment #7 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01
17:44 ---
I still can ICE it with the same flags in a native system. Any other info you'd
like to have available?
I have a more reduced source, will post it soon.
--
http://gcc.gnu.org/bugzilla/show_bu
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01
13:19 ---
I'm still trying to minimize the source even further. Will attach when I have
something better.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
--- Comment #4 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01
13:13 ---
Created an attachment (id=16441)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16441&action=view)
Reduced source for bzip2.c
Indented reduced source.
--
luisgpm at linux dot vnet dot
--- Comment #3 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01
13:10 ---
Created an attachment (id=16440)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16440&action=view)
Reduced source for bzip2.c
Source reduced with delta
--
http://gcc.gnu.org/b
--- Comment #2 from luisgpm at linux dot vnet dot ibm dot com 2008-10-01
13:10 ---
Created an attachment (id=16439)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16439&action=view)
Preprocessed source for reduced bzip2.c
Preprocessed source.
--
http://gcc.gnu.org/b
ty: P3
Component: regression
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: luisgpm at linux dot vnet dot ibm dot com
GCC build triplet: powerpc*-*-*
GCC host triplet: powerpc*-*-*
GCC target triplet: powerpc*-*-*
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37686
--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com 2008-09-17
15:22 ---
Gathered some PPC 32/64 performance numbers with the patch (based on 140409).
No noticeable performance regressions were found. 32-bit swim and 64-bit art
had a little boost on speed (7.8% and 3.4
--- Comment #12 from luisgpm at linux dot vnet dot ibm dot com 2008-09-11
05:18 ---
This patch (revision 140068) breaks SPEC2000's 200.sixtrack benchmark for
POWER6 due to miscompares. Reverting this patch solves the problem. Not sure
what specific problem was introduced.
--- Comment #31 from luisgpm at linux dot vnet dot ibm dot com 2008-09-09
13:51 ---
I have the fix for PPC. Any special reason why this doesn't get reproduced
there? Would it still be worthwhile to include the rs6000-specific fix for this
bug ticket?
Thanks,
Luis
--
--- Comment #9 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20
16:09 ---
With revision 139317, the numbers for 197.parser are back to normal and the
generated ASM code carries only a single call to __ctype_toupper_loc.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171
--- Comment #7 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20
14:21 ---
The preprocessed sources for strncasecmp.c are exactly the same for both cases.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171
--- Comment #6 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20
14:07 ---
Created an attachment (id=16112)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16112&action=view)
Generated ASM code for the good case
The __ctype_toupper_loc function, differently than
--- Comment #5 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20
14:06 ---
Created an attachment (id=16111)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16111&action=view)
Generated ASM code for the bad case
Notice that __ctype_toupper_loc is called 6 times in th
--- Comment #4 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20
14:05 ---
Created an attachment (id=16110)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16110&action=view)
Preprocessed source for the good case
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171
--- Comment #3 from luisgpm at linux dot vnet dot ibm dot com 2008-08-20
14:04 ---
Created an attachment (id=16109)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16109&action=view)
Preprocessed source for the bad case
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37171