[Bug target/38201] -mfma/-mavx and -msse5/-msse4a don't work together

2008-11-21 Thread Joey dot ye at intel dot com
--- Comment #8 from Joey dot ye at intel dot com 2008-11-21 12:00 --- In short, set A={-mavx, -mfma}, set B={-m3dnow, -m3dnowa, -msse4a, -msse5}. Any combination that mixes options from both sets should be prohibited. Please add more options to these sets in case I missed any. -- http
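
The rule in the comment above can be sketched as a set-intersection check. This is only an illustration in Python, not GCC's option handling: it assumes the `-m` spellings from the bug title, and `find_conflicts` is a hypothetical helper.

```python
# Hypothetical sketch of the mutual-exclusion rule; this is not GCC code.
# Option spellings assume the -m forms used in the bug title.
SET_A = {"-mavx", "-mfma"}
SET_B = {"-m3dnow", "-m3dnowa", "-msse4a", "-msse5"}


def find_conflicts(options):
    """Return (from_a, from_b) if options mix both sets, else None."""
    chosen = set(options)
    from_a = sorted(chosen & SET_A)
    from_b = sorted(chosen & SET_B)
    if from_a and from_b:
        return from_a, from_b
    return None


if __name__ == "__main__":
    # Mixes set A and set B, so a conflict is reported.
    print(find_conflicts(["-O2", "-mavx", "-msse5"]))
    # Both options come from set A, so there is no conflict.
    print(find_conflicts(["-O2", "-mavx", "-mfma"]))
```

Options from a single set never conflict under this rule; only a mix of A and B is rejected.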

[Bug rtl-optimization/38280] [4.4 regression] Revision 142207 breaks 416.gamess/481.wrf in SPEC CPU 2006

2008-11-27 Thread Joey dot ye at intel dot com
--- Comment #4 from Joey dot ye at intel dot com 2008-11-28 03:39 --- 142250 doesn't fix this regression. 416.gamess and 481.wrf still fail. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38280

[Bug rtl-optimization/38280] [4.4 regression] Revision 142207 breaks 416.gamess/481.wrf in SPEC CPU 2006

2008-11-28 Thread Joey dot ye at intel dot com
--- Comment #6 from Joey dot ye at intel dot com 2008-11-28 15:11 --- Patch at http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01428.html fixed this regression. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38280

[Bug rtl-optimization/38280] [4.4 regression] Revision 142207 breaks 416.gamess/481.wrf in SPEC CPU 2006

2008-11-30 Thread Joey dot ye at intel dot com
--- Comment #8 from Joey dot ye at intel dot com 2008-12-01 02:18 --- Yes. It fixes 416/481 on 32 bits and 481 on 64 bits. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38280

[Bug rtl-optimization/37948] [4.4 Regression] IRA generates slower code

2008-12-09 Thread Joey dot ye at intel dot com
--- Comment #12 from Joey dot ye at intel dot com 2008-12-10 03:01 --- Fixed at trunk 142631 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37948

[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-12-29 Thread Joey dot ye at intel dot com
--- Comment #45 from Joey dot ye at intel dot com 2008-12-30 01:49 --- (In reply to comment #44) > Does anyone have new numbers? Fixed on both i386/x86_64: x86_64: 4.4 (trunk 142847): 5.4s 4.3.2 release: 5.4s 4.2.4 release: 5.4s i386: 4.4 (trunk 142847): 2.7s 4.3.2 rele

[Bug rtl-optimization/37397] [4.4 Regression] IRA performance impact on SPEC CPU 2K/2006

2008-12-29 Thread Joey dot ye at intel dot com
--- Comment #6 from Joey dot ye at intel dot com 2008-12-30 02:50 --- (In reply to comment #4) > Revision 141860 caused 30% slowdown on 454.calculix in SPEC CPU 2006 > with -O2 -ffast-math on Linux/Intel64. This regression has been fixed in some revision between 142187 and

[Bug target/38736] [4.4 Regression] -mavx can change the ABI via BIGGEST_ALIGNMENT

2009-01-06 Thread Joey dot ye at intel dot com
--- Comment #5 from Joey dot ye at intel dot com 2009-01-07 02:45 --- More places with BIGGEST_ALIGN: $ grep -r "(aligned)" .|grep attribute|grep -v testsuite|grep -v texi ./libstdc++-v3/libsupc++/eh_alloc.cc:typedef char one_buffer[EMERGENCY_OBJ_SIZE] __attribute_

[Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01

2009-01-14 Thread Joey dot ye at intel dot com
--- Comment #7 from Joey dot ye at intel dot com 2009-01-14 10:08 --- (In reply to comment #5) > Joern, re. comment #4, Richi refers to my patch to enable PRE at -Os, see > [1]. > An extension to this patch that we tested on x86 machines, is to disable PRE > for sc

[Bug target/38899] pessimizes function without SSE intrinsics

2009-01-20 Thread Joey dot ye at intel dot com
--- Comment #2 from Joey dot ye at intel dot com 2009-01-21 02:40 --- The following case isn't vectorized with -O3 on x86_64 either, although the arrays are aligned: #include float __attribute__((aligned(16))) in1[] = { 1.2, 3.5, 1.7, 2.8 }; float __attribute__((align

[Bug target/38952] [4.4 Regression] EH does not work.

2009-01-26 Thread Joey dot ye at intel dot com
--- Comment #20 from Joey dot ye at intel dot com 2009-01-26 11:49 --- (In reply to comment #10) > This is caused by stack alignment change, revision 138335. Joey and > Xuepeng will look into it after holiday, Feb. 1. This must be stack alignment change. Looks we didn't h

[Bug libmudflap/33119] New: Missing mf-runtime.h after make -j2 install

2007-08-20 Thread Joey dot ye at intel dot com
Summary: Missing mf-runtime.h after make -j2 install Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: libmudflap AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: Joey d

[Bug libmudflap/33119] Missing mf-runtime.h after make -j2 install

2007-08-20 Thread Joey dot ye at intel dot com
--- Comment #2 from Joey dot ye at intel dot com 2007-08-20 08:53 --- (In reply to comment #1) > Nobody does "make install" with -j. I guess so, that's why I set it "minor". But does that mean error is expected with -j? My script had -j by accident and it cos

[Bug target/37010] -mno-accumulate-outgoing-args doesn't work with stack alignment

2008-08-04 Thread Joey dot ye at intel dot com
--- Comment #15 from Joey dot ye at intel dot com 2008-08-05 01:01 --- (In reply to comment #12) > I think the problem is in > /* Set offset to aligned because the realigned frame starts from here. */ > if (stack_realign_fp) > offset = (offset + stack_alignme

[Bug middle-end/34921] Misalign stack variable referenced by nested function

2008-08-06 Thread Joey dot ye at intel dot com
--- Comment #9 from Joey dot ye at intel dot com 2008-08-06 08:05 --- Fixed -- Joey dot ye at intel dot com changed: What|Removed |Added Status|NEW

[Bug middle-end/36983] Trunk 138207 miscompiles 172.mgrid on x86-64

2008-08-07 Thread Joey dot ye at intel dot com
--- Comment #3 from Joey dot ye at intel dot com 2008-08-07 07:55 --- Although 138318 fixes the compiler ICE, it miscompiles with -O3 -ffast-math on x86-64: Running 172.mgrid ref base o3 default *** Miscompare of mgrid.out, see /home/jye2/cpu2000/benchspec/CFP2000/172.mgrid/run

[Bug middle-end/36983] Trunk 138207 miscompiles 172.mgrid on x86-64

2008-08-10 Thread Joey dot ye at intel dot com
--- Comment #6 from Joey dot ye at intel dot com 2008-08-11 05:52 --- (In reply to comment #4) > If you remove -ffast-math, does it miscompare? Passes without -ffast-math. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36983

[Bug rtl-optimization/37124] New: ICE with attribute(option("no-mmx"))

2008-08-14 Thread Joey dot ye at intel dot com
o: unassigned at gcc dot gnu dot org ReportedBy: Joey dot ye at intel dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37124

[Bug target/37158] Wrong insn for _mm_comieq_sd

2008-08-19 Thread Joey dot ye at intel dot com
--- Comment #1 from Joey dot ye at intel dot com 2008-08-19 08:19 --- Check out such code in i386.c: /* Figure out whether to use ordered or unordered fp comparisons. Return the appropriate mode to use. */ enum machine_mode ix86_fp_compare_mode (enum rtx_code code ATTRIBUTE_UNUSED

[Bug middle-end/37243] [4.4 Regression] Revision 139590 caused many regressions

2008-08-27 Thread Joey dot ye at intel dot com
--- Comment #7 from Joey dot ye at intel dot com 2008-08-27 08:07 --- Created an attachment (id=16155) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16155&action=view) Test case from 2006.434.zeusmp Though I failed to extract a smaller case, hopefully it is helpful. Compile with g

[Bug middle-end/37243] [4.4 Regression] Revision 139590 caused many regressions

2008-08-27 Thread Joey dot ye at intel dot com
--- Comment #8 from Joey dot ye at intel dot com 2008-08-27 08:11 --- GDB output: (gdb) b tranx1_ Breakpoint 1 at 0x43a670 (gdb) r Breakpoint 1, 0x0043a670 in tranx1_ () (gdb) b *0x43accd Breakpoint 2 at 0x43accd (gdb) b *0x43acf4 Breakpoint 3 at 0x43acf4 (gdb) b *0x43ad2f

[Bug middle-end/37243] [4.4 Regression] Revision 139590 caused many regressions

2008-08-27 Thread Joey dot ye at intel dot com
--- Comment #11 from Joey dot ye at intel dot com 2008-08-28 06:14 --- (In reply to comment #4) > We got > Running 416.gamess ref base lnx32-gcc default > 416.gamess: copy #0 non-zero return code (rc=0, signal=11) > 416.gamess: copy #0 non-zero return code (rc=0, sign

[Bug rtl-optimization/37571] New: Performance regression with -mtune=core2

2008-09-18 Thread Joey dot ye at intel dot com
riority: P3 Component: rtl-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: Joey dot ye at intel dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37571

[Bug rtl-optimization/37571] Performance regression with -mtune=core2

2008-09-18 Thread Joey dot ye at intel dot com
--- Comment #1 from Joey dot ye at intel dot com 2008-09-18 16:01 --- The root cause is that the instruction length of fused jcc is set to 16, which prevents the block from merging and copying. For some reason Core2 runs poorly with an unmerged branch block under certain circumstances. Following

[Bug target/37364] [4.4 Regression] IRA generates inefficient code due to missing regmove pass

2008-10-23 Thread Joey dot ye at intel dot com
--- Comment #17 from Joey dot ye at intel dot com 2008-10-23 08:42 --- CPU2006/454.calculix has about 10% regression with IRA + core2 + fpmath=sse on Core2 ix86:
             IRA   IRA_core2  NO_IRA_core2
454.calculix 1.00  0.90       1.01
Revision: trunk 140514 Options in

[Bug target/37364] [4.4 Regression] IRA generates inefficient code due to missing regmove pass

2008-10-24 Thread Joey dot ye at intel dot com
--- Comment #18 from Joey dot ye at intel dot com 2008-10-24 08:36 --- Created an attachment (id=16536) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16536&action=view) Reduced performance case from cpu2006/454.calculix 50% regression with IRA core2 on trunk revision 140

[Bug target/37364] [4.4 Regression] IRA generates inefficient code due to missing regmove pass

2008-10-24 Thread Joey dot ye at intel dot com
--- Comment #21 from Joey dot ye at intel dot com 2008-10-25 04:14 --- To me the scheduler is irrelevant here. GCC has no core2 pipeline description, so the instruction scheduling doesn't look optimized. But for an OOO processor like core2, IMHO scheduling shouldn't make that much

[Bug target/37364] [4.4 Regression] IRA generates inefficient code due to missing regmove pass

2008-10-27 Thread Joey dot ye at intel dot com
--- Comment #23 from Joey dot ye at intel dot com 2008-10-28 01:19 --- (In reply to comment #22) > Created an attachment (id=16571) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16571&action=view) > A patch to re-enable regmove > After applying this pa

[Bug c/34921] Misalign stack variable referenced by nested function

2008-01-21 Thread Joey dot ye at intel dot com
--- Comment #1 from Joey dot ye at intel dot com 2008-01-22 06:38 --- This patch should fix it: Index: gcc/tree-nested.c === --- gcc/tree-nested.c (revision 131342) +++ gcc/tree-nested.c (working copy) @@ -183,6 +183,10

[Bug c/34921] New: Misalign stack variable referenced by nested function

2008-01-21 Thread Joey dot ye at intel dot com
referenced by nested function Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: Joey dot ye at intel dot com http://gcc.gn

[Bug middle-end/34921] Misalign stack variable referenced by nested function

2008-01-22 Thread Joey dot ye at intel dot com
--- Comment #5 from Joey dot ye at intel dot com 2008-01-23 01:45 --- (In reply to comment #2) > I bet if you put jj in struct and don't have a nested function, this will be > the same issue. Not the same. In fact it passes if not referenced by a nested function. The root

[Bug target/39082] union with long double doesn't follow x86-64 psABI

2009-02-03 Thread Joey dot ye at intel dot com
--- Comment #1 from Joey dot ye at intel dot com 2009-02-04 02:17 --- GCC doesn't follow the x86-64 psABI in this case. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39082

[Bug target/39146] Unnecessary stack alignment

2009-02-09 Thread Joey dot ye at intel dot com
--- Comment #1 from Joey dot ye at intel dot com 2009-02-10 05:35 --- The argument needs 32-byte alignment, and there is no way to guarantee the argument won't be spilled. That's why the stack adjustment is there. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39146

[Bug target/39137] [4.4 Regression] -mpreferred-stack-boundary=2 causes lots of dynamic realign

2009-02-10 Thread Joey dot ye at intel dot com
--- Comment #10 from Joey dot ye at intel dot com 2009-02-11 01:03 --- (In reply to comment #9) > Created an attachment (id=17279) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17279&action=view) > A patch to add a new -malign-double= option This patch l

[Bug target/39146] Unnecessary stack alignment

2009-02-11 Thread Joey dot ye at intel dot com
--- Comment #5 from Joey dot ye at intel dot com 2009-02-12 01:45 --- Stack realign is finalized by stack_realign = (incoming_stack_boundary < (current_function_is_leaf ? crtl->max_used_stack_slot_ali

[Bug target/39146] Unnecessary stack alignment

2009-02-11 Thread Joey dot ye at intel dot com
--- Comment #7 from Joey dot ye at intel dot com 2009-02-12 02:26 --- Created an attachment (id=17283) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17283&action=view) A patch to fix this problem Impact on other tests is unknown. Testing is under way. HJ, can you also help to ver

[Bug target/39148] -Os increase code size when stack is aligned

2009-02-11 Thread Joey dot ye at intel dot com
--- Comment #6 from Joey dot ye at intel dot com 2009-02-12 02:33 --- (In reply to comment #5) > If ACCUMULATE_OUTGOING_ARGS is off, ECX will be used > for stack alignment and it may lead to code size > increase due to register spill since ia32 has very > few registe

[Bug target/39146] Unnecessary stack alignment

2009-02-11 Thread Joey dot ye at intel dot com
--- Comment #9 from Joey dot ye at intel dot com 2009-02-12 02:40 --- (In reply to comment #8) > We still have push and mov. I guess it may be the best we can do. I believe so too. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39146

[Bug target/39146] Unnecessary stack alignment

2009-02-12 Thread Joey dot ye at intel dot com
--- Comment #10 from Joey dot ye at intel dot com 2009-02-12 15:20 --- (In reply to comment #8) > We still have push and mov. I guess it may be the best we can do. > But please run full 32 and 64bit testsuite with your patch as well > as under emx-avx-sim. full 32/64 bit test

[Bug target/39146] Unnecessary stack alignment

2009-02-16 Thread Joey dot ye at intel dot com
--- Comment #12 from Joey dot ye at intel dot com 2009-02-16 08:49 --- Created an attachment (id=17305) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17305&action=view) New patch attached Testing finished. No regression with emx_avx_sim. Waiting to check in to 4.5 -- Joey do

[Bug target/39137] [4.4 Regression] -mpreferred-stack-boundary=2 causes lots of dynamic realign

2009-02-17 Thread Joey dot ye at intel dot com
--- Comment #20 from Joey dot ye at intel dot com 2009-02-17 09:18 --- (In reply to comment #19) > Just for the record, here is an unsuccessful attempt to avoid stack > realignment > just because of DImode for -m32 or because of DFmode at -m32 -Os. This patch > unfortunat

[Bug target/39137] [4.4 Regression] -mpreferred-stack-boundary=2 causes lots of dynamic realign

2009-02-22 Thread Joey dot ye at intel dot com
--- Comment #31 from Joey dot ye at intel dot com 2009-02-23 03:15 --- How about this patch? 1. Only reduce DI mode when -Os 2. Ignore TYPE_USER_ALIGN, so that stack realign happens for case in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39137#c28, which IMHO is acceptable. Index

[Bug middle-end/39315] Unaligned move used on aligned stack variable

2009-02-26 Thread Joey dot ye at intel dot com
--- Comment #3 from Joey dot ye at intel dot com 2009-02-27 02:53 --- (In reply to comment #2) > Created an attachment (id=17368) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17368&action=view) > A patch > Does this patch make sense? It works fi

[Bug target/39137] [4.4 Regression] -mpreferred-stack-boundary=2 causes lots of dynamic realign

2009-03-03 Thread Joey dot ye at intel dot com
--- Comment #35 from Joey dot ye at intel dot com 2009-03-04 01:41 --- (In reply to comment #32) > I don't see the reason for && optimize_function_for_size_p (cfun), care to > back > up with benchmarks that forcing dynamic realignment for long long variables

[Bug target/39137] [4.4 Regression] -mpreferred-stack-boundary=2 causes lots of dynamic realign

2009-03-11 Thread Joey dot ye at intel dot com
--- Comment #47 from Joey dot ye at intel dot com 2009-03-12 06:51 --- (In reply to comment #46) > Created an attachment (id=17444) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17444&action=view) > gcc.target/i386/stackalign/longlong-2.c for -mnostackalig

[Bug middle-end/32598] [4.3 Regression]: 27_io/basic_stringbuf/setbuf/wchar_t/4.cc needs more than 6GB memory to compile

2007-07-03 Thread Joey dot ye at intel dot com
--- Comment #4 from Joey dot ye at intel dot com 2007-07-04 01:17 --- 126198 brought the regression -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32598

[Bug rtl-optimization/32755] Seg fault when compile CPU2000 with -fsee

2007-07-13 Thread Joey dot ye at intel dot com
--- Comment #1 from Joey dot ye at intel dot com 2007-07-13 09:21 --- Created an attachment (id=13909) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13909&action=view) Reduced testcase GCC crashes with gcc -O2 -fsee case-see.c -c Fails on all recent 4.3 trunk. --

[Bug rtl-optimization/32755] New: Seg fault when compile CPU2000 with -fsee

2007-07-13 Thread Joey dot ye at intel dot com
fault when compile CPU2000 with -fsee Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: Joey dot ye at int

[Bug rtl-optimization/32755] Seg fault when compile CPU2000 with -fsee

2007-07-13 Thread Joey dot ye at intel dot com
--- Comment #2 from Joey dot ye at intel dot com 2007-07-13 09:27 --- The root cause appears to be at see.c line 1643: emit_insn_after (merged_ref, ref); delete_insn (ref); where merged_ref and ref have the same INSN_UID. delete_insn will clear the df information of that UID

[Bug tree-optimization/32921] [4.3 Regression] Revision 126326 causes 12% slowdown

2007-10-22 Thread Joey dot ye at intel dot com
--- Comment #28 from Joey dot ye at intel dot com 2007-10-23 02:23 --- Got a similar result on x86_64: Core 2 improves 24% from 129469 to 129504. That's great. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32921

[Bug target/34709] [4.3 regression]: revision 131342 miscompiled 481.wrf on Linux/Intel64

2008-01-17 Thread Joey dot ye at intel dot com
--- Comment #8 from Joey dot ye at intel dot com 2008-01-17 10:11 --- A small case and patch are available at http://gcc.gnu.org/ml/gcc-patches/2008-01/msg00747.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34709

[Bug middle-end/36078] New: gfortran fails to build cpu2006/465.tonto

2008-04-29 Thread Joey dot ye at intel dot com
Priority: P3 Component: middle-end AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: Joey dot ye at intel dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36078

[Bug middle-end/36074] [4.4 Regression]: 447.dealII in SPEC CPU 2006 failed to compile

2008-04-29 Thread Joey dot ye at intel dot com
--- Comment #5 from Joey dot ye at intel dot com 2008-04-29 10:41 --- May be related to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36078, where I do have a small case. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36074

[Bug tree-optimization/36054] bad code generation with -ftree-vectorize

2008-04-30 Thread Joey dot ye at intel dot com
--- Comment #8 from Joey dot ye at intel dot com 2008-04-30 10:53 --- (In reply to comment #6) > (In reply to comment #4) > > > have you tried to compile with -march=core2 -mfpmath=sse -msse? > > Yes, I've compiled it as following: > > % g++ -g -O3 -m

[Bug tree-optimization/36054] bad code generation with -ftree-vectorize

2008-04-30 Thread Joey dot ye at intel dot com
--- Comment #9 from Joey dot ye at intel dot com 2008-04-30 10:56 --- (In reply to comment #8) > -m32 doesn't work. You have to use 4.3.0 release branch. Recent mainline > change Correction: -m32 is a must, but doesn't fix all. Options I'm using: g++ -g -O3 -mar

[Bug tree-optimization/36054] bad code generation with -ftree-vectorize

2008-04-30 Thread Joey dot ye at intel dot com
--- Comment #11 from Joey dot ye at intel dot com 2008-05-01 04:31 --- Tim, Since it doesn't link, I can only check the .s file. There are a couple of constructors called Environment; which one is the problematic function? grep Environment kernel_build.s|grep glob ... .

[Bug tree-optimization/36054] bad code generation with -ftree-vectorize

2008-05-05 Thread Joey dot ye at intel dot com
--- Comment #13 from Joey dot ye at intel dot com 2008-05-05 07:22 --- It is helpful. Root cause is that memory allocated by new is only aligned to 8 bytes under i386. In your case, object Environment is allocated by new and its constructor tried to use movdqa to initialize its members

[Bug tree-optimization/36054] bad code generation with -ftree-vectorize

2008-05-05 Thread Joey dot ye at intel dot com
--- Comment #14 from Joey dot ye at intel dot com 2008-05-05 07:29 --- HJ, AVX will have a similar problem on x86_64, where new only returns objects aligned at 16 bytes. A dynamically allocated __m256 won't be guaranteed to sit on a 32-byte boundary. -- http://gcc.gnu.org/bug
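
The arithmetic behind the comment above can be sketched in a few lines. This is a minimal Python illustration with hypothetical helper names and a made-up address; it only shows that a 16-byte-aligned address need not be 32-byte aligned, and says nothing about how any particular allocator behaves.

```python
# Hypothetical helpers; the address below is made up for illustration.
def is_aligned(addr, boundary):
    """True if addr is a multiple of boundary (a power of two)."""
    return addr & (boundary - 1) == 0


def align_up(addr, boundary):
    """Round addr up to the next multiple of boundary (a power of two)."""
    return (addr + boundary - 1) & ~(boundary - 1)


if __name__ == "__main__":
    addr = 0x1010  # 16-byte aligned, but not 32-byte aligned
    print(is_aligned(addr, 16), is_aligned(addr, 32))
    # Next address at which a 32-byte-aligned object could start.
    print(hex(align_up(addr, 32)))
```

Over-allocating and rounding up with something like `align_up` is one common way to get a stricter alignment than the allocator promises.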

[Bug tree-optimization/36765] [4.4 Regression] Revision 137573 miscompiles 464.h264ref in SPEC CPU 2006

2008-07-10 Thread Joey dot ye at intel dot com
--- Comment #1 from Joey dot ye at intel dot com 2008-07-11 05:46 --- Created an attachment (id=15897) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15897&action=view) Small test case reduced from cpu2006.464.h264ref /home/jye2/work/bug-37665> gcc -v Using built-in spe

[Bug tree-optimization/36765] [4.4 Regression] Revision 137573 miscompiles 464.h264ref in SPEC CPU 2006

2008-07-10 Thread Joey dot ye at intel dot com
--- Comment #2 from Joey dot ye at intel dot com 2008-07-11 05:49 --- The effect of line 76 buffer_frame[0] = InitFullness; is eliminated by the optimizer due to a bug in GCC. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36765

[Bug tree-optimization/36835] New: Trunk 137774 miscompile cpu2006.473.astar

2008-07-15 Thread Joey dot ye at intel dot com
gcc Version: 4.4.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: Joey dot ye at intel dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36835

[Bug tree-optimization/36835] Trunk 137774 miscompile cpu2006.473.astar

2008-07-16 Thread Joey dot ye at intel dot com
--- Comment #1 from Joey dot ye at intel dot com 2008-07-16 13:14 --- Fixed by revision 137859 -- Joey dot ye at intel dot com changed: What|Removed |Added

[Bug middle-end/36983] New: Trunk 138207 miscompiles 172.mgrid on x86-64

2008-07-31 Thread Joey dot ye at intel dot com
07 miscompiles 172.mgrid on x86-64 Product: gcc Version: 4.4.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: Joey dot ye at intel dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36983

[Bug middle-end/36983] Trunk 138207 miscompiles 172.mgrid on x86-64

2008-07-31 Thread Joey dot ye at intel dot com
--- Comment #2 from Joey dot ye at intel dot com 2008-07-31 10:50 --- Yes. Just noticed that the latest trunk passes. -- Joey dot ye at intel dot com changed: What|Removed |Added

[Bug middle-end/36986] New: Trunk 138207 miscompiles 447.dealII

2008-07-31 Thread Joey dot ye at intel dot com
ry: Trunk 138207 miscompiles 447.dealII Product: gcc Version: 4.4.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: Joey dot ye at in

[Bug middle-end/36986] Trunk 138207 miscompiles 447.dealII

2008-07-31 Thread Joey dot ye at intel dot com
--- Comment #1 from Joey dot ye at intel dot com 2008-07-31 11:33 --- Created an attachment (id=15982) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15982&action=view) Preprocessed test case -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36986

[Bug c++/37012] numerous stackalign related testsuite failures on i686-apple-darwin9

2008-08-04 Thread Joey dot ye at intel dot com
--- Comment #18 from Joey dot ye at intel dot com 2008-08-04 07:24 --- (In reply to comment #9) > Joey, I think the problem is the usage of STACK_BOUNDARY / BITS_PER_UNIT > for stack alignment. On MacOS, STACK_BOUNDARY 128 on ia32. Shouldn't > we use UNITS_PER_WORD in some

[Bug target/37010] -mno-accumulate-outgoing-args doesn't work with stack alignment

2008-08-04 Thread Joey dot ye at intel dot com
--- Comment #6 from Joey dot ye at intel dot com 2008-08-04 08:28 --- (In reply to comment #3) > Joey, when we compute frame layout, we don't count the duplicated > return address pushed onto stack when DRAP is used. Also when we > push return address, shouldn't we

[Bug target/37010] -mno-accumulate-outgoing-args doesn't work with stack alignment

2008-08-04 Thread Joey dot ye at intel dot com
--- Comment #7 from Joey dot ye at intel dot com 2008-08-04 09:03 --- This problem is associated with -mpreferred-stack-boundary=2, rather than with stack alignment. The following case fails on trunk before merging with the stack branch: $ cat y1.c /* PR middle-end/37010 */ /* { dg-do run

[Bug target/37010] -mno-accumulate-outgoing-args doesn't work with stack alignment

2008-08-04 Thread Joey dot ye at intel dot com
--- Comment #8 from Joey dot ye at intel dot com 2008-08-04 09:11 --- The root cause is that the outgoing parameter frame is aligned based on the stack pointer. Namely, address_of_stack_param = SP + offset + fixed_padding. With -mpreferred-stack-boundary=2, alignment of SP is only 4 bytes
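
The formula in the comment above can be illustrated with a small Python sketch. It assumes, hypothetically, that the padding is fixed at compile time as if SP were 16-byte aligned; the offset and SP values are made up, and the helper names are not GCC's.

```python
# Hypothetical illustration; offsets and SP values are made up.
def fixed_padding(offset, boundary=16):
    """Padding chosen at compile time, assuming SP is boundary-aligned."""
    return -offset % boundary


def stack_param_address(sp, offset, boundary=16):
    """address_of_stack_param = SP + offset + fixed_padding."""
    return sp + offset + fixed_padding(offset, boundary)


if __name__ == "__main__":
    offset = 4
    # First SP is 16-byte aligned, second is only 4-byte aligned.
    for sp in (0x7FF0, 0x7FF4):
        addr = stack_param_address(sp, offset)
        print(hex(sp), hex(addr), addr % 16 == 0)
```

With the 16-byte-aligned SP the parameter address comes out aligned; with the 4-byte-aligned SP the same fixed padding leaves it misaligned, since a compile-time constant cannot compensate for an SP whose low bits are unknown.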

[Bug target/37010] -mno-accumulate-outgoing-args doesn't work with stack alignment

2008-08-04 Thread Joey dot ye at intel dot com
--- Comment #11 from Joey dot ye at intel dot com 2008-08-04 14:11 --- (In reply to comment #10) > Did you mean we needed 2 "additional 'and $-16, sp" insns to align the > stack? I don't think so. Definitely not. Solution 1: Just ignore it. __m128 paramete