[Bug driver/48697] New: gcc: error trying to exec 'f951': execvp: No such file or directory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48697 Summary: gcc: error trying to exec 'f951': execvp: No such file or directory Product: gcc Version: 4.4.3 Status: UNCONFIRMED Severity: critical Priority: P3 Component: driver AssignedTo: unassig...@gcc.gnu.org ReportedBy: crive...@icmpe.cnrs.fr Hi, since the newest versions of gcc, when I precompile a file.F > file.f before a Fortran compilation, I get the following message: # gcc file.F # gcc: error trying to exec 'f951': execvp: No such file or directory I didn't get this error when I used gcc in the old Ubuntu 8.04 version. Thanks for any rescue tips. \jc
[Bug c/48685] [4.5/4.6/4.7 regression] ICE in gimplify_expr, at gimplify.c:7034
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48685 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org, ||jsm28 at gcc dot gnu.org --- Comment #3 from Jakub Jelinek 2011-04-20 08:08:27 UTC --- int main () { int v = 1; (void) (1 == 2 ? (void) 0 : (v = 0)); return v; } I believe normally c_fully_fold_internal drops NON_LVALUE_EXPRs in STRIP_TYPE_NOPS at the end, but in this case the void type COND_EXPR is folded into int type MODIFY_EXPR (and even a fold_convert wouldn't fix that up, for MODIFY_EXPR it is just fold_ignored_result and still stays at the original type) and thus the void type NON_LVALUE_EXPR is kept. I wonder whether we should be creating NON_LVALUE_EXPRs with void type at all (aren't void values never lvalues?), or when handling NON_LVALUE_EXPR in c_fully_fold_internal we should drop them if they are void type.
[Bug tree-optimization/48694] [4.7 Regression] possible memory hog bug
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48694 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |NEW Known to work||4.6.0 Last reconfirmed||2011.04.20 08:17:14 Component|c |tree-optimization CC||jakub at gcc dot gnu.org Ever Confirmed|0 |1 Summary|possible memory hog bug |[4.7 Regression] possible ||memory hog bug Target Milestone|--- |4.7.0 Known to fail||4.7.0 --- Comment #3 from Jakub Jelinek 2011-04-20 08:17:14 UTC --- This is PRE, in func_1.
[Bug preprocessor/48677] cpp.exe broken ?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48677 --- Comment #3 from ralphengels at gmail dot com 2011-04-20 08:48:38 UTC --- Created attachment 24055 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24055 screenshot of gdb output screenshot of gdb's output. hope it helps.
[Bug fortran/48692] [4.7 Regression] ICE with gfortran.dg/module_write_1.f90
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48692 Janne Blomqvist changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2011.04.20 08:53:15 CC||jb at gcc dot gnu.org Ever Confirmed|0 |1 --- Comment #4 from Janne Blomqvist 2011-04-20 08:53:15 UTC --- Confirmed.
[Bug libfortran/48682] Incorrect field justification with Gw.d edit descriptor
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48682 Janne Blomqvist changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||INVALID --- Comment #2 from Janne Blomqvist 2011-04-20 09:01:24 UTC --- (In reply to comment #1) > The observed behaviour is completely conforming to Fortran 2008: > > 10.7.5.2.2 Generalized real and complex editing > 4 ... Equivalent Conversion: F(w-n).(d-1),n('b') > > where b is a blank, n is 4 for Gw.d > > and that complete field (including the 4 spaces) is correctly right-justified > according to the mentioned rule in 10.7.2.1 Ah, indeed, so it seems.
[Bug preprocessor/48677] cpp.exe broken ?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48677 --- Comment #4 from Jonathan Wakely 2011-04-20 09:10:21 UTC --- gdb's output is just text, wouldn't it have been easier to paste 4 lines of text instead of a 700KB screenshot showing your entire desktop?! You never answered my question about whether name is null. That screenshot shows it has the value 0xbaadf00d which indicates uninitialized memory from the Windows heap
[Bug driver/48697] gcc: error trying to exec 'f951': execvp: No such file or directory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48697 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2011.04.20 09:12:52 Ever Confirmed|0 |1 --- Comment #1 from Jonathan Wakely 2011-04-20 09:12:52 UTC --- http://gcc.gnu.org/bugs/#need
[Bug rtl-optimization/48688] [x64]: shift/or instead of lea
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48688 Richard Guenther changed: What|Removed |Added Keywords||missed-optimization Target||x86_64-*-*, i?86-*-* Status|UNCONFIRMED |WAITING Last reconfirmed||2011.04.20 09:18:24 Ever Confirmed|0 |1 --- Comment #2 from Richard Guenther 2011-04-20 09:18:24 UTC --- Please provide a compilable testcase.
[Bug middle-end/48689] ICE in fold-const.c:13798 with fold checking
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48689 Richard Guenther changed: What|Removed |Added Summary|ICE in fold-const.c:13798 |ICE in fold-const.c:13798 ||with fold checking --- Comment #1 from Richard Guenther 2011-04-20 09:19:19 UTC --- You have fold checking enabled, I think it's known to be broken (sometimes).
[Bug driver/48691] Assembler file clobbered with -save-temps (LTO)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48691 Richard Guenther changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2011.04.20 09:21:24 Ever Confirmed|0 |1 --- Comment #1 from Richard Guenther 2011-04-20 09:21:24 UTC --- Confirmed.
[Bug tree-optimization/48694] [4.7 Regression] possible memory hog bug
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48694 Richard Guenther changed: What|Removed |Added Status|NEW |ASSIGNED AssignedTo|unassigned at gcc dot |rguenth at gcc dot gnu.org |gnu.org | --- Comment #4 from Richard Guenther 2011-04-20 09:22:09 UTC --- I'll have a looksee.
[Bug rtl-optimization/48695] [4.6/4.7 Regression] Runtime with an array of std::vectors
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48695 Richard Guenther changed: What|Removed |Added Status|NEW |ASSIGNED AssignedTo|unassigned at gcc dot |rguenth at gcc dot gnu.org |gnu.org | --- Comment #7 from Richard Guenther 2011-04-20 09:26:30 UTC --- Mine.
[Bug preprocessor/48677] cpp.exe broken ?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48677 --- Comment #5 from ralphengels at gmail dot com 2011-04-20 09:30:51 UTC --- Sorry about that, it's just that I have no idea how to copy the text from gdb's console window. About checking whether name is null, I'm not sure how I should go about it. Something like if (name[0] == NULL) print some error?
[Bug driver/48697] gcc: error trying to exec 'f951': execvp: No such file or directory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48697 --- Comment #2 from crivello 2011-04-20 09:32:56 UTC --- What you need * the exact version of GCC; gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) <-- ubuntu 10.10 * the system type; Using built-in specs. Target: x86_64-linux-gnu * the options given when GCC was configured/built; Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.3-4ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix * the complete command line that triggers the bug; gcc file.F * the compiler output (error messages, warnings, etc.); and gcc: error trying to exec 'f951': execvp: No such file or directory * the preprocessed file (*.i*) that triggers the bug, generated by adding -save-temps to the complete compilation command, or, in the case of a bug report for the GNAT front end, a complete set of source files (see below). I don't have it. I hope that it is now clear enough.
[Bug preprocessor/48677] cpp.exe broken ?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48677 --- Comment #6 from Jonathan Wakely 2011-04-20 09:32:56 UTC --- it's not null. it has the value 0xbaadf00d.
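Editorial sketch, not part of the original thread: the check proposed in comment #5, name[0] == NULL, dereferences the pointer before testing anything, so it would fault on exactly the bad value being discussed. A null check compares the pointer itself. The C fragment below is a minimal illustration under stated assumptions: the enum, the macro and the function name classify_name are hypothetical, and the only fact taken from the thread is that 0xbaadf00d marks uninitialized Windows heap memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a pointer value seen in the debugger. */
enum ptr_state { PTR_NULL, PTR_UNINIT_FILL, PTR_PLAUSIBLE };

/* Fill value quoted in the thread for uninitialized Windows heap memory. */
#define HEAP_UNINIT_FILL ((uintptr_t)0xBAADF00Du)

/* Inspect the pointer value only; the pointee is never read. */
enum ptr_state classify_name(const char *name)
{
    if (name == NULL)                        /* compare the pointer, not name[0] */
        return PTR_NULL;
    if ((uintptr_t)name == HEAP_UNINIT_FILL) /* never initialized, not null */
        return PTR_UNINIT_FILL;
    return PTR_PLAUSIBLE;                    /* may still be invalid */
}
```

A value that is neither null nor the fill pattern is only plausible, not proven valid; the sketch cannot tell a dangling pointer from a good one.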
[Bug preprocessor/48677] cpp.exe broken ?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48677 --- Comment #7 from Jonathan Wakely 2011-04-20 09:34:16 UTC --- (In reply to comment #5) > sorry about that its just i have no idea how to copy the text from gdb's > console window. right-click, choose "Select All", hit Enter
[Bug driver/48697] gcc: error trying to exec 'f951': execvp: No such file or directory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48697 Jonathan Wakely changed: What|Removed |Added Status|WAITING |UNCONFIRMED Ever Confirmed|1 |0
[Bug tree-optimization/48498] Several gcc.dg/vect tests XPASS on SPARC
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48498 --- Comment #6 from Ira Rosen 2011-04-20 09:38:20 UTC --- Thanks Rainer. It is caused by my patch http://gcc.gnu.org/viewcvs?view=revision&revision=171569 that changed these tests to work correctly on NEON doubleword vectors. The patch actually fixed these tests for SPARC as well, because SPARC also uses smaller vector size. The problem is that these tests are expected to fail on vect_no_align targets, but actually no realignment mechanism is needed here except for forcing alignment of arrays or having them aligned. We don't have a keyword for this and I think it is hard to have such because it's a combination of the target behavior and the test. We can try to remove vect_no_align keyword from these tests and check what happens on other no_align targets if { [istarget mipsisa64*-*-*] || [istarget sparc*-*-*] || [istarget ia64-*-*] || [check_effective_target_arm_vect_no_misalign] || ([istarget mips*-*-*] && [check_effective_target_mips_loongson]) } { set et_vect_no_align_saved 1 I checked it with a cross-compiler on ia64 (which I think is supposed to have a similar behavior to SPARC here). I think the tests XPASS on ia64 as well at the moment. I'll submit a patch with RFT.
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 Richard Guenther changed: What|Removed |Added Target||x86_64-*-*, i?86-*-* Status|UNCONFIRMED |NEW Known to work||3.3.3, 4.0.3 Version|unknown |4.5.2 Keywords||missed-optimization Last reconfirmed||2011.04.20 09:43:28 Component|other |rtl-optimization CC||rguenth at gcc dot gnu.org Ever Confirmed|0 |1 Known to fail||4.0.4, 4.1.2, 4.3.5, 4.5.2, ||4.6.0, 4.7.0 --- Comment #2 from Richard Guenther 2011-04-20 09:43:28 UTC --- First of all, confirmed. We already expand to a byte read-modify-write and later do not realize that with a word one we could combine the later read. Thus, we optimize the bitfield store on its own, not looking at surroundings. I'm not sure where to best address this, rather than throwing in again the idea of lowering bitfield accesses early on trees. Eventually simply always doing whole-bitfield read-modify-write at expansion time will be beneficial, at least when not optimizing for size.
[Bug driver/48697] gcc: error trying to exec 'f951': execvp: No such file or directory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48697 Richard Guenther changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||INVALID --- Comment #3 from Richard Guenther 2011-04-20 09:44:34 UTC --- You need to install the corresponding gfortran package.
[Bug middle-end/47976] [4.5/4.6 Regression] Recent gfortran.dg/actual_array_constructor_3.f90 regression on arm-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47976 --- Comment #13 from Richard Guenther 2011-04-20 09:48:07 UTC --- Author: rguenth Date: Wed Apr 20 09:48:00 2011 New Revision: 172765 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172765 Log: 2011-04-20 Richard Guenther Backport from mainline 2011-04-19 Bernd Schmidt PR fortran/47976 * reload1.c (inc_for_reload): Return void. All callers changed. (emit_input_reload_insns): Don't try to delete previous output reloads to a register, or record spill_reg_store for autoincs. Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/reload1.c
[Bug middle-end/48689] [4.7 Regression] ICE in fold-const.c:13798 with fold checking
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48689 Jakub Jelinek changed: What|Removed |Added CC||froydnj at gcc dot gnu.org, ||jakub at gcc dot gnu.org Target Milestone|--- |4.7.0 Summary|ICE in fold-const.c:13798 |[4.7 Regression] ICE in |with fold checking |fold-const.c:13798 with ||fold checking
[Bug preprocessor/48677] cpp.exe broken ?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48677 --- Comment #8 from ralphengels at gmail dot com 2011-04-20 10:00:24 UTC --- My bad, I ran cpp.exe by pulling it directly into gdb (had to use a command prompt). Here's the output.
-0x417f90: mov 0x4(%esp),%eax
-0x417f94: movzbl (%eax),%edx // breaks here
-0x417f97: movzbl %dl,%ecx
0x417f9a: movzwl 0x445b20(%ecx,%ecx,1),%ecx
-0x417fa2: and $0x88,%ecx
-0x417fa8: je 0x417fb0
-0x417faa: cmpb $0x3a,0x1(%eax)
-0x417fae: je 0x417fe0
-0x417fb0: test %dl,%dl
-0x417fb2: je 0x417fcc
-0x417fb4: lea 0x1(%eax),%ecx
-0x417fb7: cmp $0x5c,%dl
-0x417fba: je 0x417fd0
-0x417fbc: cmp $0x2f,%dl
-0x417fbf: je 0x417fd0
-0x417fc1: add $0x1,%ecx
-0x417fc4: movzbl -0x1(%ecx),%edx
-0x417fc8: test %dl,%dl
-0x417fca: jne 0x417fb7
-0x417fcc: repz ret
-0x417fce: xchg %ax,%ax
-0x417fd0: mov %ecx,%eax
-0x417fd2: add $0x1,%ecx
-0x417fd5: movzbl -0x1(%ecx),%edx
-0x417fd9: test %dl,%dl
-0x417fdb: jne 0x417fb7
-0x417fdd: jmp 0x417fcc
-0x417fdf: nop
-0x417fe0: movzbl 0x2(%eax),%edx
-0x417fe4: add $0x2,%eax
-0x417fe7: jmp 0x417fb0
-0x417fe9: nop
-0x417fea: nop
-0x417feb: nop
-0x417fec: nop
-0x417fed: nop
-0x417fee: nop
-0x417fef: nop
in code
179int mainCRTStartup (void)
-180{
181 int ret = 255;
182#ifdef __SEH__
183 asm ("\t.l_start:\n"
184"\t.seh_handler __C_specific_handler, @except\n"
185"\t.seh_handlerdata\n"
186"\t.long 1\n"
187"\t.rva .l_start, .l_end, _gnu_exception_handler ,.l_end\n"
188"\t.text"
189);
190#endif
I'm pretty new to gdb, so bear with me.
[Bug rtl-optimization/48688] [x64]: shift/or instead of lea
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48688 Jakub Jelinek changed: What|Removed |Added Status|WAITING |NEW CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek 2011-04-20 10:04:37 UTC --- int foo (int x) { return (x << 3) | 5; } int bar (int x) { return (x * 8) + 5; } int baz (int x) { return (x << 3) + 5; }
[Bug rtl-optimization/48688] [x64]: shift/or instead of lea
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48688 Jakub Jelinek changed: What|Removed |Added CC||ubizjak at gmail dot com --- Comment #4 from Jakub Jelinek 2011-04-20 10:25:07 UTC --- All 6 variants: int fn1 (int x) { return (x << 3) | 5; } int fn2 (int x) { return (x * 8) | 5; } int fn3 (int x) { return (x << 3) + 5; } int fn4 (int x) { return (x * 8) + 5; } int fn5 (int x) { return (x << 3) ^ 5; } int fn6 (int x) { return (x * 8) ^ 5; } The problem is that for + this is turned into lea only in combine, and at that spot it only calls ix86_legitimate_address_p, not ix86_legitimize_address. So, either ix86_legitimate_address_p (well, ix86_decompose_address) should also handle the IOR/XOR with constant case because we can't just adjust it to the canonical plus through ix86_legitimize_address, or either simplify-rtx.c or fold-const.c or whatever should try to canonicalize the IOR/XOR resp. BIT_IOR_EXPR/BIT_XOR_EXPR into PLUS_EXPR in those cases. For simplify-rtx.c or fold-const.c, the question is why should be PLUS generally preferred over IOR/XOR though, yeah, it helps one target, but x86 isn't the only target GCC supports.
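Editorial sketch, not part of the original thread: the canonicalization discussed in comment #4 rests on the fact that x << 3 always has its low three bits clear, so combining it with a constant below 8 gives the same result whether the operator is |, + or ^. The C fragment below illustrates this with unsigned arithmetic (an assumption made here to avoid signed-overflow concerns; the testcases in the thread use int). The function names are hypothetical.

```c
#include <stdint.h>

/* The variants from the thread, rewritten on unsigned operands. */
uint32_t shl3_or5(uint32_t x)  { return (x << 3) | 5u; }
uint32_t shl3_add5(uint32_t x) { return (x << 3) + 5u; }
uint32_t shl3_xor5(uint32_t x) { return (x << 3) ^ 5u; }
uint32_t mul8_add5(uint32_t x) { return (x * 8u) + 5u; }

/* Returns 1 when all four forms agree for x, 0 otherwise.
   They agree because (x << 3) and 5 share no set bits. */
int forms_agree(uint32_t x)
{
    uint32_t r = shl3_add5(x);
    return shl3_or5(x) == r && shl3_xor5(x) == r && mul8_add5(x) == r;
}
```

The equivalence depends on the constant fitting in the bits cleared by the shift. With a constant of 9, for example, (1 << 3) | 9 is 9 while (1 << 3) + 9 is 17, so any such rewrite has to check for non-overlapping bits first.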
[Bug libstdc++/48698] New: gnu-versioned-namespace problems
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48698 Summary: gnu-versioned-namespace problems Product: gcc Version: 4.6.0 Status: UNCONFIRMED Keywords: rejects-valid Severity: normal Priority: P3 Component: libstdc++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: r...@gcc.gnu.org CC: b...@gcc.gnu.org Bootstrap fails with errors like: /tmp/b/x86_64-unknown-linux-gnu/libstdc++-v3/include/cwchar: In function 'wchar_t* std::_6::wcschr(wchar_t*, wchar_t)': /tmp/b/x86_64-unknown-linux-gnu/libstdc++-v3/include/cwchar:215:55: error: invalid conversion from 'const wchar_t*' to 'wchar_t*' [-fpermissive] /tmp/b/x86_64-unknown-linux-gnu/libstdc++-v3/include/cwchar:214:3: error: initializing argument 1 of 'wchar_t* std::_6::wcschr(wchar_t*, wchar_t)' [-fpermissive] This is because the using declaration for ::wcschr is not in the versioned namespace, so name lookup inside the versioned namespace finds the declaration there and doesn't look in the enclosing namespace. Fixed by: --- include/c_global/cwchar 2011-04-20 10:09:58.580607848 + +++ include/c_global/cwchar 2011-04-20 10:11:39.348231280 + @@ -136,6 +136,8 @@ namespace std _GLIBCXX_VISIBILITY(default) { +_GLIBCXX_BEGIN_NAMESPACE_VERSION + using ::wint_t; using ::btowc; @@ -207,8 +209,6 @@ using ::wcsstr; using ::wmemchr; -_GLIBCXX_BEGIN_NAMESPACE_VERSION - #ifndef __CORRECT_ISO_CPP_WCHAR_H_PROTO inline wchar_t* wcschr(wchar_t* __p, wchar_t __c) That allows the bootstrap to finish, but the choice of "_6" as the inline namespace means we cannot compile this valid program or it's C++0x equivalent: #include int f(int i); int g() { std::tr1::bind(f, std::tr1::placeholders::_6); } b.cc: In function 'int g()': b.cc:7:49: error: expected primary-expression before ')' token The problem is that "std::tr1::placeholders::_6" is a namespace not a bind placeholder // Inline namespace for symbol versioning. #if _GLIBCXX_INLINE_VERSION namespace std { ... namespace tr1 { ... namespace placeholders { inline namespace _6 { } } ... } ... 
namespace placeholders { inline namespace _6 { } } This won't work: placeholders::_6 is required to be a bind placeholder, so it can't be a namespace. It might be better to use _v6, although that would fail if users say #define _v6 bleurgh, which is allowed. The inline namespaces should really use a name reserved for the implementation, such as __6 or __v6.
[Bug libstdc++/48698] gnu-versioned-namespace problems
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48698 --- Comment #1 from Jonathan Wakely 2011-04-20 10:33:23 UTC --- (In reply to comment #0) > Bootstrap fails with errors like: Just to be clear, this is only when configuring with --enable-symvers=gnu-versioned-namespace
[Bug middle-end/47976] [4.5/4.6 Regression] Recent gfortran.dg/actual_array_constructor_3.f90 regression on arm-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47976 --- Comment #14 from Richard Guenther 2011-04-20 11:05:13 UTC --- Author: rguenth Date: Wed Apr 20 11:05:09 2011 New Revision: 172766 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172766 Log: 2011-04-20 Richard Guenther Backport from 4.6 branch 2011-04-19 Bernd Schmidt PR fortran/47976 * reload1.c (inc_for_reload): Return void. All callers changed. (emit_input_reload_insns): Don't try to delete previous output reloads to a register, or record spill_reg_store for autoincs. Modified: branches/gcc-4_5-branch/gcc/ChangeLog branches/gcc-4_5-branch/gcc/reload1.c
[Bug middle-end/47976] [4.5/4.6 Regression] Recent gfortran.dg/actual_array_constructor_3.f90 regression on arm-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47976 Richard Guenther changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED --- Comment #15 from Richard Guenther 2011-04-20 11:05:42 UTC --- Fixed.
[Bug bootstrap/48148] [4.7 Regression] LTO bootstrap failed with bootstrap-profiled
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48148 --- Comment #29 from Eric Botcazou 2011-04-20 11:18:57 UTC --- Author: ebotcazou Date: Wed Apr 20 11:18:50 2011 New Revision: 172767 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172767 Log: Backport from mainline 2011-04-19 Eric Botcazou PR lto/48148 * gimple.c (gimple_types_compatible_p_1) : Do not merge the types if they have different enumeration identifiers. 2011-04-18 Eric Botcazou PR lto/48492 * cfgexpand.c (expand_debug_expr) : Return NULL for a DECL_IN_CONSTANT_POOL without RTL. Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/cfgexpand.c branches/gcc-4_6-branch/gcc/gimple.c
[Bug lto/48492] [4.7 Regression] LTO bootstrap failure in copy_constant
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48492 --- Comment #7 from Eric Botcazou 2011-04-20 11:18:57 UTC --- Author: ebotcazou Date: Wed Apr 20 11:18:50 2011 New Revision: 172767 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172767 Log: Backport from mainline 2011-04-19 Eric Botcazou PR lto/48148 * gimple.c (gimple_types_compatible_p_1) : Do not merge the types if they have different enumeration identifiers. 2011-04-18 Eric Botcazou PR lto/48492 * cfgexpand.c (expand_debug_expr) : Return NULL for a DECL_IN_CONSTANT_POOL without RTL. Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/cfgexpand.c branches/gcc-4_6-branch/gcc/gimple.c
[Bug fortran/48636] Enable more inlining with -O2 and higher
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636 Janne Blomqvist changed: What|Removed |Added CC||jb at gcc dot gnu.org --- Comment #6 from Janne Blomqvist 2011-04-20 11:21:15 UTC --- Note that some of these issues might change with the new array descriptor that we must introduce at some point (the hope is that it'll get in for 4.7, but it remains to be seen if there is enough time). See http://gcc.gnu.org/wiki/ArrayDescriptorUpdate For instance (comments inline): (In reply to comment #4) > (In reply to comment #3) > > > The second item is interesting - it would be cool if backend was able to > > work > > out that the code is supposed to simplify after inlining. Either by itself > > or > > by frontend hint. > > Can you provide me very simple testcase for that I can look into how it > > looks > > like in backend? Perhaps some kind of frontend hinting would work well > > here. > > Here is some sample code (extreme, I admit) which profits a lot from > inlining: > > - Strides are known to be one when inlining (a common case, but you can > never be sure if the user doesn't call a(1:5:2)) Not strictly related to inlining, but in the new descriptor we'll have a field specifying whether the array is simply contiguous, so it might make sense to generate two loops for each loop over the array in the source, one for the contiguous case where it can be vectorized etc. and another loop for the general case. This might reduce the profitability of inlining. > - Expensive setting up of, and reading from the array descriptor As we're planning to use the TR 29113 descriptor as the native one, this has some implications for the procedure call interface as well. See http://gcc.gnu.org/ml/fortran/2011-03/msg00215.html This will reduce the procedure call overhead substantially, at the cost of some extra work in the caller in the case of non-default lower bounds.
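Editorial sketch, not part of the original thread: the idea in comment #6 of generating two loops per source loop, one for the contiguous case and one for the general case, can be illustrated in C as below. The descriptor struct is a hypothetical, heavily simplified rank-1 stand-in; it is not gfortran's descriptor nor the TR 29113 one, and the field names are assumptions.

```c
#include <stddef.h>

/* Hypothetical rank-1 array descriptor (illustration only). */
struct desc1 {
    const double *base;   /* address of the first element */
    size_t        extent; /* number of elements */
    ptrdiff_t     stride; /* distance between elements, in elements */
};

/* Sum the elements, with one loop version per stride case. */
double sum_versioned(const struct desc1 *a)
{
    double s = 0.0;
    if (a->stride == 1) {
        /* Contiguous case: unit-stride access, easy to vectorize. */
        const double *p = a->base;
        for (size_t i = 0; i < a->extent; ++i)
            s += p[i];
    } else {
        /* General case: arbitrary stride taken from the descriptor. */
        for (size_t i = 0; i < a->extent; ++i)
            s += a->base[(ptrdiff_t)i * a->stride];
    }
    return s;
}
```

The cost of this approach is code size, since every loop body exists twice; the benefit is that the unit-stride version no longer needs inlining to learn that the stride is one.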
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 Eric Botcazou changed: What|Removed |Added CC||ebotcazou at gcc dot ||gnu.org --- Comment #3 from Eric Botcazou 2011-04-20 11:47:00 UTC --- So something changed between 4.0.3 and 4.0.4? Or maybe a typo?
[Bug driver/48697] gcc: error trying to exec 'f951': execvp: No such file or directory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48697 crivello changed: What|Removed |Added Resolution|INVALID |FIXED --- Comment #4 from crivello 2011-04-20 12:00:09 UTC --- well done Richard ! thanks
[Bug fortran/48699] New: [OOP] MOVE_ALLOC of polymorphic variables
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48699 Summary: [OOP] MOVE_ALLOC of polymorphic variables Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran AssignedTo: unassig...@gcc.gnu.org ReportedBy: sfilipp...@uniroma2.it Created attachment 24056 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24056 test-case Hello, The attached examples fail in rather strange ways, on both current trunk and 4.6.0. -- [sfilippo@donald bug31]$ gfortran -v Using built-in specs. COLLECT_GCC=gfortran COLLECT_LTO_WRAPPER=/usr/local/gnu47/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../gcc/configure --prefix=/usr/local/gnu47 --enable-languages=c,c++,fortran --with-gmp=/home/travel/GNUBUILD/gmp --with-mpfr=/home/travel/GNUBUILD/mpfr --with-mpc=/home/travel/GNUBUILD/mpc : (reconfigured) ../gcc/configure --prefix=/usr/local/gnu47 --enable-languages=c,c++,fortran --with-gmp=/home/travel/GNUBUILD/gmp --with-mpfr=/home/travel/GNUBUILD/mpfr --with-mpc=/home/travel/GNUBUILD/mpc : (reconfigured) ../gcc/configure --prefix=/usr/local/gnu47 --enable-languages=c,c++,fortran --with-gmp=/home/travel/GNUBUILD/gmp --with-mpfr=/home/travel/GNUBUILD/mpfr --with-mpc=/home/travel/GNUBUILD/mpc Thread model: posix gcc version 4.7.0 20110418 (experimental) (GCC) [sfilippo@donald bug31]$ gfortran -o testmv1 testmv1.f90 /tmp/ccIl5I0L.o:(.data+0x58): undefined reference to `__copy_foo2_Bar2.1685' collect2: ld returned 1 exit status [sfilippo@donald bug31]$ gfortran -c testmv2.f90 testmv2.f90:38.20: call move_alloc(sm,dat%sm) 1 Error: 'from' argument of 'move_alloc' intrinsic at (1) must be ALLOCATABLE The symptoms seem to be about two different things, though.
[Bug fortran/48699] [OOP] MOVE_ALLOC of polymorphic variables
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48699 --- Comment #1 from Salvatore Filippone 2011-04-20 12:07:04 UTC --- Created attachment 24057 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24057 test-case
[Bug fortran/48700] New: [OOP] memory leak with MOVE_ALLOC of polymorphic variables
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48700 Summary: [OOP] memory leak with MOVE_ALLOC of polymorphic variables Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran AssignedTo: unassig...@gcc.gnu.org ReportedBy: sfilipp...@uniroma2.it Created attachment 24058 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24058 test-case Running the attached test case through valgrind gives memory leak warnings, both with current trunk and 4.6.0 [sfilippo@donald bug31]$ gfortran -v Using built-in specs. COLLECT_GCC=gfortran COLLECT_LTO_WRAPPER=/usr/local/gnu47/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../gcc/configure --prefix=/usr/local/gnu47 --enable-languages=c,c++,fortran --with-gmp=/home/travel/GNUBUILD/gmp --with-mpfr=/home/travel/GNUBUILD/mpfr --with-mpc=/home/travel/GNUBUILD/mpc : (reconfigured) ../gcc/configure --prefix=/usr/local/gnu47 --enable-languages=c,c++,fortran --with-gmp=/home/travel/GNUBUILD/gmp --with-mpfr=/home/travel/GNUBUILD/mpfr --with-mpc=/home/travel/GNUBUILD/mpc : (reconfigured) ../gcc/configure --prefix=/usr/local/gnu47 --enable-languages=c,c++,fortran --with-gmp=/home/travel/GNUBUILD/gmp --with-mpfr=/home/travel/GNUBUILD/mpfr --with-mpc=/home/travel/GNUBUILD/mpc Thread model: posix gcc version 4.7.0 20110418 (experimental) (GCC) [sfilippo@donald bug31]$ gfortran -o testmv3 testmv3.f90 -ggdb [sfilippo@donald bug31]$ valgrind --leak-check=full ./testmv3 ==25909== Memcheck, a memory error detector ==25909== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al. 
==25909== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==25909== Command: ./testmv3
==25909==
==25909==
==25909== HEAP SUMMARY:
==25909==     in use at exit: 216 bytes in 4 blocks
==25909==   total heap usage: 26 allocs, 22 frees, 12,368 bytes allocated
==25909==
==25909== 40 bytes in 1 blocks are definitely lost in loss record 3 of 4
==25909==    at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==25909==    by 0x401425: MAIN__ (testmv3.f90:37)
==25909==    by 0x401729: main (testmv3.f90:22)
==25909==
==25909== 176 (96 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 4
==25909==    at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==25909==    by 0x400DCF: MAIN__ (testmv3.f90:30)
==25909==    by 0x401729: main (testmv3.f90:22)
==25909==
==25909== LEAK SUMMARY:
==25909==    definitely lost: 136 bytes in 2 blocks
==25909==    indirectly lost: 80 bytes in 2 blocks
==25909==      possibly lost: 0 bytes in 0 blocks
==25909==    still reachable: 0 bytes in 0 blocks
==25909==         suppressed: 0 bytes in 0 blocks
==25909==
==25909== For counts of detected and suppressed errors, rerun with: -v
==25909== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 6 from 6)
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 --- Comment #4 from Richard Guenther 2011-04-20 12:11:56 UTC --- (In reply to comment #3) > So something changed between 4.0.3 and 4.0.4? Or maybe a typo?
I only have 32bit compilers for both and see, for 4.0.3:
show_bug:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %edx
        movl    (%edx), %eax
        andl    $-64, %eax
        movl    %eax, (%edx)
        shrl    $6, %eax
        popl    %ebp
        movzwl  %ax, %eax
        ret
and for 4.0.4:
show_bug:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        andb    $-64, (%eax)
        movl    (%eax), %eax
        leave
        shrl    $6, %eax
        movzwl  %ax, %eax
        ret
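Editorial sketch, not part of the original thread: the reporter's testcase is not included in this digest, so the C fragment below is a hypothetical reconstruction that merely matches the constants in the two listings of comment #4 (a 6-bit field cleared with a mask of -64, followed by a 16-bit field extracted with a shift by 6). The struct layout, field names and function names are all assumptions, and bit-field layout is implementation-defined. The second function spells out the whole-word read-modify-write that the 4.0.3 listing performs, where the value just stored is reused instead of being loaded again.

```c
#include <stdint.h>

/* Hypothetical layout: 6-bit field followed by a 16-bit field. */
struct bits {
    unsigned a : 6;
    unsigned b : 16;
};

/* Source-level form: clear one bit-field, then read its neighbour. */
unsigned clear_a_read_b(struct bits *p)
{
    p->a = 0;
    return p->b;
}

/* Explicit word-sized read-modify-write on a plain 32-bit word,
   assuming field a occupies bits 0..5 and field b bits 6..21. */
uint32_t word_rmw_clear_read(uint32_t *word)
{
    uint32_t w = *word & ~UINT32_C(0x3F); /* one load, clear low 6 bits */
    *word = w;                            /* one store */
    return (w >> 6) & UINT32_C(0xFFFF);   /* reuse w, no second load */
}
```

The byte-sized variant in the 4.0.4 listing stores through a narrower access and then reloads the full word, which is the pattern the bug report objects to.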
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 --- Comment #5 from Richard Guenther 2011-04-20 12:15:14 UTC --- (In reply to comment #4)
> (In reply to comment #3)
> > So something changed between 4.0.3 and 4.0.4? Or maybe a typo?
>
> I only have 32bit compilers for both and see, for 4.0.3:
>
> show_bug:
>         pushl   %ebp
>         movl    %esp, %ebp
>         movl    8(%ebp), %edx
>         movl    (%edx), %eax
>         andl    $-64, %eax
>         movl    %eax, (%edx)
>         shrl    $6, %eax
>         popl    %ebp
>         movzwl  %ax, %eax
>         ret
>
> and for 4.0.4:
>
> show_bug:
>         pushl   %ebp
>         movl    %esp, %ebp
>         movl    8(%ebp), %eax
>         andb    $-64, (%eax)
>         movl    (%eax), %eax
>         leave
>         shrl    $6, %eax
>         movzwl  %ax, %eax
>         ret
Actually the 4.0.4 compiler is x86_64, the code with -m32. The 4.0.3 compiler is i586.
/space/rguenther/install/gcc-4.0.3/libexec/gcc/i686-pc-linux-gnu/4.0.3/cc1 -quiet -v t.c -quiet -dumpbase t.c -m32 -mtune=pentiumpro -auxbase t -O2 -version -o t.s
/space/rguenther/install/gcc-4.0.4/libexec/gcc/x86_64-unknown-linux-gnu/4.0.4/cc1 -quiet -v t.c -quiet -dumpbase t.c -m32 -mtune=k8 -auxbase t -O2 -version -o t.s
but no -march/tune combination makes the bug vanish for the 4.0.4 compiler (maybe a HWI dependent "optimization")
[Bug fortran/48636] Enable more inlining with -O2 and higher
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636 --- Comment #7 from Tobias Burnus 2011-04-20 12:29:02 UTC --- (In reply to comment #6) > > Here is some sample code (extreme, I admit) which profits a lot from > > inlining: > > > > - Strides are known to be one when inlining (a common case, but you can > > never be sure if the user doesn't call a(1:5:2)) First, you do not have any issue with strides if the dummy argument is either allocatable, has the contiguous attribute, or is an explicit or assumed-sized array. For inlining, I see only one place where information loss happens: If a simply-contiguous array is passed as actual argument to an assumed-shape dummy. Then the Fortran front-end knows that the stride of the actual argument is 1, but the callee needs to assume an arbitrary stride. The middle-end will continue to do so as the "simply contiguous" information is lost - even though it would be profitable for inlining. > Not strictly related to inlining, but in the new descriptor we'll have a field > specifying whether the array is simply contiguous I am not sure we will indeed have one; initially I thought one should, but I am no longer convinced that it is the right approach. My impression is now that setting and updating the flag all the time is more expensive than doing an is_contiguous() check once. The TR descriptor also does not have such a flag - thus one needs to handle such arrays - if they come from C - with extra care. (Unless one requires the C side to call a function, which could set this flag. I think one does not need to do so.) By the way, the latest version of the TR draft is linked at http://j3-fortran.org/pipermail/interop-tr/2011-April/000582.html > so it might make sense to > generate two loops for each loop over the array in the source, one for the > contiguous case where it can be vectorized etc. and another loop for the > general case. Maybe. Definitely not for -Os. 
It would be best if the middle end were able to automatically generate a stride-free version when it thinks that it is profitable. The FE could also do it, if one had a way to tell the ME that it may drop the stride-free version if it thinks that is more profitable. > As we're planning to use the TR 29113 descriptor as the native one, this has > some implications for the procedure call interface as well. See > http://gcc.gnu.org/ml/fortran/2011-03/msg00215.html Regarding: "For a descriptor of an assumed-shape array, the value of the lower-bound member of each element of the dim member of the descriptor shall be zero." That's actually also not that different from the current situation: In Fortran, the lower bound of assumed-shape arrays is also always the same: It is 1. Which makes sense as one can then do the following w/o worrying about the lbound:

  subroutine bar(a)
    real :: a(:)
    do i = 1, ubound(a, dim=1)
      a(i) = ...

For explicit-shape/assumed-size arrays one does not have a descriptor and for deferred-shape arrays (allocatables, pointers) the TR keeps the lbound - which is the same as currently in Fortran. > This will reduce the procedure call overhead substantially, at the cost > of some extra work in the caller in the case of non-default lower bounds. Which is actually nothing new ... That's the reason that one often creates a new descriptor for procedure calls.
[Bug rtl-optimization/48688] [x64]: shift/or instead of lea
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48688 --- Comment #5 from Jakub Jelinek 2011-04-20 12:30:10 UTC --- Actually, I've managed to handle this by adding a new define_insn_and_split (*lea_general_4).
[Bug libfortran/48602] Invalid F conversion of G descriptor for values close to powers of 10
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48602 --- Comment #38 from Thomas Henlich 2011-04-20 12:38:53 UTC --- As an alternative we might consider leaving the code as it was before and instead putting OUTPUT_FLOAT_FMT_G(4) OUTPUT_FLOAT_FMT_G(8) OUTPUT_FLOAT_FMT_G(10) into separate files and compile with -mpc32 -mpc64 -mpc80 respectively.
[Bug target/48688] [x64]: shift/or instead of lea
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48688 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED Component|rtl-optimization|target AssignedTo|unassigned at gcc dot |jakub at gcc dot gnu.org |gnu.org |
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 Eric Botcazou changed: What|Removed |Added Known to work||4.1.0 --- Comment #6 from Eric Botcazou 2011-04-20 12:59:40 UTC --- Very likely PR rtl-optimization/22563 for the regression in 4.0.x and 4.1.x.
[Bug libfortran/48602] Invalid F conversion of G descriptor for values close to powers of 10
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48602 --- Comment #39 from Tobias Burnus 2011-04-20 13:01:34 UTC --- (In reply to comment #38) > and compile with -mpc32 -mpc64 -mpc80 respectively. Then I like Janne's proposal more: compiling libgfortran/io/*.c with -fexcess-precision=standard.
[Bug target/48688] [x64]: shift/or instead of lea
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48688 --- Comment #6 from Jakub Jelinek 2011-04-20 13:01:40 UTC --- Created attachment 24059 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24059 gcc47-pr48688.patch Untested fix.
[Bug fortran/48699] [OOP] MOVE_ALLOC of polymorphic variables
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48699 Tobias Burnus changed: What|Removed |Added CC||burnus at gcc dot gnu.org, ||janus at gcc dot gnu.org --- Comment #2 from Tobias Burnus 2011-04-20 13:08:39 UTC --- See also PR 48700 (memleak with polymorphic vars in MOVE_ALLOC), which might be a duplicate.
[Bug fortran/48636] Enable more inlining with -O2 and higher
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636 --- Comment #8 from Janne Blomqvist 2011-04-20 13:09:51 UTC --- (In reply to comment #7) > (In reply to comment #6) > > > Here is some sample code (extreme, I admit) which profits a lot from > > > inlining: > > > > > > - Strides are known to be one when inlining (a common case, but you can > > > never be sure if the user doesn't call a(1:5:2)) > > First, you do not have any issue with strides if the dummy argument is either > allocatable, has the contiguous attribute, or is an explicit or assumed-sized > array. > > For inlining, I see only one place where information loss happens: If a > simply-contiguous array is passed as actual argument to a assumed-shape dummy. > Then the Fortran front-end knows that the stride of the actual argument is 1, > but the callee needs to assume an arbitrary stride. The middle-end will > continue to do so as the "simply contiguous" information is lost - even though > it would be profitable for inlining. Passing simply contiguous arrays to assumed-shape dummies is a fairly common case in "modern Fortran", so it would be nice if we could generate fast code for this. > > Not strictly related to inlining, but in the new descriptor we'll have a > > field > > specifying whether the array is simply contiguous > > I am not sure we will indeed have one; initially I thought one should, but I > am > no longer convinced that it is the right approach. My impression is now that > setting and updating the flag all the time is more expensive then doing once a > is_contiguous() check. Hmm, maybe. Shouldn't it be necessary to update the contiguous flag only when passing slices to procedures with explicit interfaces? But OTOH, calculating whether an array is simply contiguous at procedure entry is just a few arithmetic operations anyway. But, in any case I don't have any profiling data to argue which approach would be better. 
> The TR descriptor also does not such an flag - thus one > needs to handle such arrays - if they come from C - with extra care. (Unless > one requires the C side to call a function, which could set this flag. I think > one does not need to do so.) I suppose one cannot require the C side to set such a flag, as the TR doesn't require its presence? Thus we'd need to calculate whether the array is simply contiguous anyway if it's possible the array comes from C. Do such procedures have to be marked with BIND(C) in some way or how does this work? In any case, maybe this is what tips it in favor of always calculating the contiguousness instead of having a flag in the descriptor - it would be one single way of handling it, reducing the possibility of bugs. Also, if the contiguousness isn't used for anything in the procedure, dead code elimination should delete the check anyway. > > As we're planning to use the TR 29113 descriptor as the native one, this has > > some implications for the procedure call interface as well. See > > http://gcc.gnu.org/ml/fortran/2011-03/msg00215.html > > Regarding: > "For a descriptor of an assumed-shape array, the value of the > lower-bound member of each element of the dim member of the descriptor > shall be zero." > > That's actually also not that different from the current situation: In > Fortran, > the lower bound of assumed-shape arrays is also always the same: It is 1. Yes. But what is different from the current situation is that the above is what the standard requires semantically, and the implementation is free to implement it as it sees fit. In the TR, OTOH, we have the explicit requirement that on procedure entry the lower bounds in the descriptor should be 0. This of course applies only to inter-operable procedures, for "pure Fortran" we're still free to do as we please. But again, it might make sense to do it the same way in both cases in order to reduce the implementation and maintenance burden. 
> For explicit-shape/assumed-size arrays one does not have a descriptor and for > deferred-shape arrays (allocatables, pointers) the TR keeps the lbound - which > is the same as currently in Fortran. Yes. > > This will reduce the procedure call overhead substantially, at the cost > > of some extra work in the caller in the case of non-default lower bounds. > > Which is actually nothing new ... That's the reason that one often creates a > new descriptor for procedure calls. But do we actually do this? I did some tests a while ago, and IIRC for assumed shape dummy arguments the procedure always calculates new bounds such that they start from 1. That is, the procedure assumes that the actual argument descriptor may have lower bounds != 1. So my argument is basically that with the new descriptor it might make sense to switch the responsibility around such that it's the caller who makes sure that all lower bounds are 0 (as we must have the capability to do this anyway in order to call inter-operable procedures, no?) instead of the callee.
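The "few arithmetic operations" needed to decide at procedure entry whether an array is contiguous, as discussed in comments #7 and #8, can be sketched in C. This is only an illustration under stated assumptions: the `desc_t` and `dim_t` layouts and the function name `is_contiguous` are hypothetical, loosely modelled on a stride/lbound/ubound triplet per dimension with strides counted in elements. They are not gfortran's actual descriptor, nor the TR 29113 one.

```c
#include <stddef.h>

#define MAX_RANK 7

/* Hypothetical per-dimension triplet; strides are counted in elements. */
typedef struct {
    ptrdiff_t stride;
    ptrdiff_t lbound;
    ptrdiff_t ubound;
} dim_t;

/* Hypothetical array descriptor (an assumption, not gfortran's layout). */
typedef struct {
    void  *base;
    int    rank;
    dim_t  dim[MAX_RANK];
} desc_t;

/* Return 1 if the elements are laid out back to back in column-major
   order: each dimension's stride must equal the product of the extents
   of all lower dimensions.  Returns 0 otherwise. */
int is_contiguous(const desc_t *d)
{
    ptrdiff_t expected = 1;
    int i;

    /* A zero-sized array has no gaps, so treat it as contiguous. */
    for (i = 0; i < d->rank; i++)
        if (d->dim[i].ubound < d->dim[i].lbound)
            return 1;

    for (i = 0; i < d->rank; i++) {
        ptrdiff_t extent = d->dim[i].ubound - d->dim[i].lbound + 1;

        /* With a single element in this dimension the stride is never
           used, so only check it when extent > 1. */
        if (extent > 1 && d->dim[i].stride != expected)
            return 0;
        expected *= extent;
    }
    return 1;
}
```

For a rank-1 array this is a single compare, which fits the remark that the check is cheap; a section such as a(1:5:2) would show up here as a stride of 2 and be rejected.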
[Bug rtl-optimization/48695] [4.6/4.7 Regression] Runtime with an array of std::vectors
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48695 --- Comment #8 from Richard Guenther 2011-04-20 13:11:12 UTC --- Author: rguenth Date: Wed Apr 20 13:11:06 2011 New Revision: 172768 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172768 Log: 2011-04-20 Richard Guenther PR middle-end/48695 * tree-ssa-alias.c (aliasing_component_refs_p): Compute base objects and types here. Adjust for their offset before comparing. * g++.dg/torture/pr48695.C: New testcase. Added: trunk/gcc/testsuite/g++.dg/torture/pr48695.C Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-alias.c
[Bug fortran/48699] [OOP] MOVE_ALLOC of polymorphic variables
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48699 --- Comment #3 from Salvatore Filippone 2011-04-20 13:20:48 UTC --- (In reply to comment #2) > See also PR 48700 (memleak with polymorphic vars in MOVE_ALLOC), which might > be > a duplicate. They are related in the sense that the test cases for this one were obtained while searching for a memleak test case. Whether the root cause is the same is beyond me..
[Bug target/48701] New: [missed optimization] GCC fails to use aliasing of ymm and xmm registers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48701 Summary: [missed optimization] GCC fails to use aliasing of ymm and xmm registers Product: gcc Version: 4.6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target AssignedTo: unassig...@gcc.gnu.org ReportedBy: kr...@kde.org The two functions in the attached test case demonstrate the problem. The intermediate stores/loads on the stack really should be optimized away. testStore output now: vmovdqa %xmm1,-0x30(%rsp) vmovdqa %xmm0,-0x20(%rsp) vmovdqa -0x30(%rsp),%ymm0 vmovdqa %ymm0,() should be either: vinsertf128 $1,%xmm0,%ymm1,%ymm0 vmovdqa %ymm0,() or: vmovdqa %xmm1,() vmovdqa %xmm0,0x10() depending on the target microarchitecture and accompanying code. likewise the testLoad output now is: vmovdqa (),%ymm0 vmovdqa %ymm0,-0x30(%rsp) vmovdqa -0x20(%rsp),%xmm1 vmovdqa -0x30(%rsp),%xmm0 and should be either: vmovdqa (),%ymm0 vextractf128 $1,%ymm0,%xmm1 or: vmovdqa (),%xmm0 vmovdqa 0x10(),%xmm1
[Bug target/48701] [missed optimization] GCC fails to use aliasing of ymm and xmm registers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48701 --- Comment #1 from Matthias Kretz 2011-04-20 13:26:56 UTC --- Created attachment 24060 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24060 testcase
[Bug target/48678] [4.6/4.7 Regression] unable to find a register to spill in class ‘GENERAL_REGS’
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48678 Uros Bizjak changed: What|Removed |Added Status|RESOLVED|ASSIGNED Resolution|FIXED | AssignedTo|jakub at gcc dot gnu.org|ubizjak at gmail dot com --- Comment #10 from Uros Bizjak 2011-04-20 13:29:58 UTC --- Created attachment 24061 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24061 Expand movstrict pattern as pinsr for vector subregs. For the testcase, the attached patch generates:

        movdqa  (%rsi), %xmm0
        pinsrw  $0, (%rdi), %xmm0
        pcmpeqw (%rdx), %xmm0
        ret
[Bug target/48678] unable to find a register to spill in class ‘GENERAL_REGS’
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48678 Jakub Jelinek changed: What|Removed |Added Target Milestone|4.6.1 |4.7.0 Summary|[4.6/4.7 Regression] unable |unable to find a register |to find a register to spill |to spill in class |in class ‘GENERAL_REGS’ |‘GENERAL_REGS’ --- Comment #11 from Jakub Jelinek 2011-04-20 13:32:01 UTC --- Removing regression flag, as it is no longer a regression, just an enhancement.
[Bug bootstrap/48671] [4.7 Regression] LTO bootstrap failed with bootstrap-profiled
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48671 H.J. Lu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||FIXED --- Comment #1 from H.J. Lu 2011-04-20 13:32:05 UTC --- Fixed by revision 172751: http://gcc.gnu.org/ml/gcc-cvs/2011-04/msg00947.html
[Bug target/18145] Do not emit __do_copy_data or __do_clear_bss if .data or .bss is empty.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18145 --- Comment #4 from Georg-Johann Lay 2011-04-20 13:38:09 UTC --- Author: gjl Date: Wed Apr 20 13:38:05 2011 New Revision: 172769 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172769 Log: PR target/18145 * config/avr/avr.h (TARGET_ASM_INIT_SECTIONS): Delete. (ASM_OUTPUT_COMMON, ASM_OUTPUT_LOCAL): Delete. (ASM_OUTPUT_ALIGNED_DECL_COMMON): Define. (ASM_OUTPUT_ALIGNED_DECL_LOCAL): Define. (TARGET_ASM_NAMED_SECTION): Change to avr_asm_named_section. * config/avr/avr-protos.h (avr_asm_output_aligned_common): New prototype. * config/avr/avr.c (TARGET_ASM_INIT_SECTIONS): Define. (avr_asm_named_section,avr_asm_output_aligned_common, avr_output_data_section_asm_op,avr_output_bss_section_asm_op): New functions to update... (avr_need_clear_bss_p, avr_need_copy_data_p): ...these new variables. (avr_asm_init_sections): Overwrite section callbacks for data_section, bss_section. (avr_file_start): Move output of __do_copy_data, __do_clear_bss from here to... (avr_file_end): ...here. Modified: trunk/gcc/ChangeLog trunk/gcc/config/avr/avr-protos.h trunk/gcc/config/avr/avr.c trunk/gcc/config/avr/avr.h
[Bug target/18145] Do not emit __do_copy_data or __do_clear_bss if .data or .bss is empty.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18145 Georg-Johann Lay changed: What|Removed |Added Keywords|FIXME | Status|NEW |RESOLVED CC||gjl at gcc dot gnu.org Known to work||4.7.0 Resolution||FIXED --- Comment #5 from Georg-Johann Lay 2011-04-20 13:46:42 UTC --- Closed resolved+fixed in 4.7.0
[Bug rtl-optimization/48695] [4.6 Regression] Runtime with an array of std::vectors
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48695 Richard Guenther changed: What|Removed |Added Known to work||4.7.0 Summary|[4.6/4.7 Regression]|[4.6 Regression] Runtime |Runtime with an array of|with an array of |std::vectors|std::vectors --- Comment #9 from Richard Guenther 2011-04-20 13:46:54 UTC --- Fixed on trunk so far.
[Bug target/48701] [missed optimization] GCC fails to use aliasing of ymm and xmm registers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48701 Richard Guenther changed: What|Removed |Added Target||x86_64-*-* Status|UNCONFIRMED |NEW Keywords||missed-optimization Last reconfirmed||2011.04.20 13:49:21 Ever Confirmed|0 |1 Severity|normal |enhancement --- Comment #2 from Richard Guenther 2011-04-20 13:49:21 UTC --- Confirmed.
[Bug rtl-optimization/48702] New: optimization regression with gcc-4.6 on x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48702 Summary: optimization regression with gcc-4.6 on x86_64-unknown-linux-gnu Product: gcc Version: 4.6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization AssignedTo: unassig...@gcc.gnu.org ReportedBy: mariah.le...@gmail.com

/* optimization regression with gcc-4.6.0 on x86_64-unknown-linux-gnu
% gcc-4.6.0 -O2 -o foo foo.c
% foo
% 1
%
% gcc-4.6.0 -O1 -o foo foo.c
% foo
% 4
%
% gcc-4.5.1 -O2 -o foo foo.c
% foo
% 4
% gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/gcc-4.6.0/x86_64-Linux-core2-fc/libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /usr/local/gcc-4.6.0/src/gcc-4.6.0/configure --enable-languages=c,c++,fortran --with-gnu-as --with-gnu-as=/usr/local/binutils-2.21/x86_64-Linux-core2-fc-gcc-4.5.1-rh/bin/as --with-gnu-ld --with-ld=/usr/local/binutils-2.21/x86_64-Linux-core2-fc-gcc-4.5.1-rh/bin/ld --with-gmp=/usr/local/mpir-2.3.0/x86_64-Linux-core2-fc-gcc-4.5.1-rh --with-mpfr=/usr/local/mpfr-3.0.0/x86_64-Linux-core2-fc-mpir-2.3.0-gcc-4.5.1-rh --with-mpc=/usr/local/mpc-0.9/x86_64-Linux-core2-fc-mpir-2.3.0-mpfr-3.0.0-gcc-4.5.1-rh --prefix=/usr/local/gcc-4.6.0/x86_64-Linux-core2-fc
Thread model: posix
gcc version 4.6.0 (GCC)
%
*/
#include <stdio.h>
#define LEN 4

void unpack(int array[LEN])
{
  int ii, val;
  val = 1;
  for (ii = 0; ii < LEN; ii++) {
    array[ii] = val % 2;
    val = val / 2;
  }
  return;
}

int pack(int array[LEN])
{
  int ans, ii;
  ans = 0;
  for (ii = LEN-1; ii >= 0; ii--) {
    ans = 2 * ans + array[ii];
  }
  return ans;
}

int foo()
{
  int temp, ans;
  int array[LEN];
  unpack(array);
  temp = array[0];
  array[0] = array[2];
  array[2] = temp;
  ans = pack(array);
  return ans;
}

int main(void)
{
  int val;
  val = foo();
  printf("%d\n", val);
  return 0;
}
[Bug target/48576] wrong code when accessing variables in a large stack frame
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48576 --- Comment #6 from Ramana Radhakrishnan 2011-04-20 14:20:10 UTC --- Can an RM reprioritize this one ? It smells of something higher than P3 since this is a wrong code regression from 4.4 ? cheers Ramana
[Bug target/48678] unable to find a register to spill in class ‘GENERAL_REGS’
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48678 --- Comment #12 from Uros Bizjak 2011-04-20 14:27:30 UTC --- Hm, if line 14 in the testcase is changed to:

- ((T *) &s.d)[0] = *x;
+ ((T *) &s.d)[1] = *x;

then gcc does not touch the movstrict pattern at all and generates the following code:

        movdqa  (%rsi), %xmm0
        movabsq $-4294901761, %rsi
        movzwl  (%rdi), %eax
        movdqa  %xmm0, -24(%rsp)
        movq    -24(%rsp), %rcx
        salq    $16, %rax
        andq    %rsi, %rcx
        orq     %rax, %rcx
        movq    %rcx, -24(%rsp)
        movdqa  -24(%rsp), %xmm0
        pcmpeqw (%rdx), %xmm0
        ret

However, when the byte offset reaches sizeof (void *), i.e. 8 bytes on a 64-bit target, as in:

- ((T *) &s.d)[0] = *x;
+ ((T *) &s.d)[4] = *x;

then we again get:

        movdqa  (%rsi), %xmm0
        pinsrw  $4, (%rdi), %xmm0
        pcmpeqw (%rdx), %xmm0
        ret

I didn't investigate this in detail, but perhaps someone can shed some light here?
[Bug rtl-optimization/48702] optimization regression with gcc-4.6 on x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48702 Richard Guenther changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2011.04.20 14:28:27 Known to work||4.7.0 Ever Confirmed|0 |1 --- Comment #1 from Richard Guenther 2011-04-20 14:28:27 UTC --- Confirmed. Seems to work on trunk. The following fails at -O1:

extern void abort (void);
#define LEN 4

static inline void unpack(int array[LEN])
{
  int ii, val;
  val = 1;
  for (ii = 0; ii < LEN; ii++) {
    array[ii] = val % 2;
    val = val / 2;
  }
}

static inline int pack(int array[LEN])
{
  int ans, ii;
  ans = 0;
  for (ii = LEN-1; ii >= 0; ii--) {
    ans = 2 * ans + array[ii];
  }
  return ans;
}

int __attribute__((noinline)) foo()
{
  int temp, ans;
  int array[LEN];
  unpack(array);
  temp = array[0];
  array[0] = array[2];
  array[2] = temp;
  ans = pack(array);
  return ans;
}

int main(void)
{
  int val;
  val = foo();
  if (val != 4)
    abort ();
  return 0;
}
[Bug rtl-optimization/48702] [4.6 Regression] optimization regression with gcc-4.6 on x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48702 Richard Guenther changed: What|Removed |Added Keywords||wrong-code Target Milestone|--- |4.6.1 Summary|optimization regression |[4.6 Regression] |with gcc-4.6 on |optimization regression |x86_64-unknown-linux-gnu|with gcc-4.6 on ||x86_64-unknown-linux-gnu
[Bug rtl-optimization/48702] [4.6 Regression] optimization regression with gcc-4.6 on x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48702 Richard Guenther changed: What|Removed |Added Status|NEW |ASSIGNED AssignedTo|unassigned at gcc dot |rguenth at gcc dot gnu.org |gnu.org | --- Comment #2 from Richard Guenther 2011-04-20 14:38:51 UTC --- DSE2 kills the stores of the shuffle

  temp = array[0];
  array[0] = array[2];
  array[2] = temp;

likely being confused about the loop's use. Mine.
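A quick way to see that the reported wrong result is consistent with the shuffle stores being dropped is to model the testcase with and without the swap. The sketch below reuses the unpack/pack logic from the report; the function name `foo_model` and its `keep_swap` parameter are made up for illustration and say nothing about how DSE2 itself works.

```c
#define LEN 4

/* Same logic as the testcase: array becomes the binary digits of 1,
   least significant first, i.e. {1, 0, 0, 0}. */
static void unpack(int array[LEN])
{
    int ii, val = 1;
    for (ii = 0; ii < LEN; ii++) {
        array[ii] = val % 2;
        val = val / 2;
    }
}

/* Reassemble the digits, most significant (highest index) first. */
static int pack(const int array[LEN])
{
    int ans = 0, ii;
    for (ii = LEN - 1; ii >= 0; ii--)
        ans = 2 * ans + array[ii];
    return ans;
}

/* Hypothetical model of foo(): keep_swap == 0 mimics a compiler that
   has deleted the three stores of the shuffle. */
int foo_model(int keep_swap)
{
    int array[LEN];
    unpack(array);
    if (keep_swap) {
        int temp = array[0];
        array[0] = array[2];
        array[2] = temp;
    }
    return pack(array);
}
```

With the swap, the array is {0, 0, 1, 0} and the result is 4; without it, the array stays {1, 0, 0, 0} and the result is 1, which is the value the reporter saw with gcc-4.6.0 -O2.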
[Bug debug/48703] New: segfault in canonicalize_for_substitution
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48703 Summary: segfault in canonicalize_for_substitution Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: debug AssignedTo: unassig...@gcc.gnu.org ReportedBy: m...@gcc.gnu.org Created attachment 24062 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24062 testcase The patch for PR48207 broke compiling any moderately complex C++ program with -g -flto. We can't use the C++ specific langhook for mangling when we've already cleared out language specific stuff in free_lang_data. $ cc1plus -g -flto bug.ii /matz/gcc/svn/real-trunk/dev/x86_64-unknown-linux-gnu/libstdc++-v3/include/istream: In instantiation of ‘std::basic_istream<_CharT, _Traits>::sentry::operator bool() const [with _CharT = wchar_t, _Traits = std::char_traits]’: EtherAppCli.cc:221:1: instantiated from here /matz/gcc/svn/real-trunk/dev/x86_64-unknown-linux-gnu/libstdc++-v3/include/istream:686:7: internal compiler error: Segmentation fault
[Bug fortran/48704] New: ICE: gfortran dies when '-finstrument-functions' option is used
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48704 Summary: ICE: gfortran dies when '-finstrument-functions' option is used Product: gcc Version: 4.6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran AssignedTo: unassig...@gcc.gnu.org ReportedBy: mg1...@web.de Created attachment 24063 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24063 Preprocesses input file, generated by '-save-temps' When compiling the attached source file with gfortran 4.6.0 and the '-finstrument-functions' option specified, gfortran dies with a SIGSEGV. Without the option and with previous versions of the compiler (e.g., 4.5.2), compilation works fine. DETAILS === $ gfortran -v -save-temps -finstrument-functions -c jacobi.F90 Using built-in specs. COLLECT_GCC=gfortran COLLECT_LTO_WRAPPER=/opt/packages/gcc/4.6.0/libexec/gcc/i686-pc-linux-gnu/4.6.0/lto-wrapper Target: i686-pc-linux-gnu Configured with: ../gcc-4.6.0/configure --prefix=/opt/packages/gcc/4.6.0 --enable-languages=c,c++,fortran --enable-__cxa_atexit --enable-threads --disable-multilib Thread model: posix gcc version 4.6.0 (GCC) COLLECT_GCC_OPTIONS='-v' '-save-temps' '-finstrument-functions' '-c' '-mtune=generic' '-march=pentiumpro' /opt/packages/gcc/4.6.0/libexec/gcc/i686-pc-linux-gnu/4.6.0/f951 jacobi.F90 -cpp=jacobi.f90 -quiet -v jacobi.F90 -quiet -dumpbase jacobi.F90 -mtune=generic -march=pentiumpro -auxbase jacobi -version -finstrument-functions -fintrinsic-modules-path /opt/packages/gcc/4.6.0/lib/gcc/i686-pc-linux-gnu/4.6.0/finclude -o jacobi.s GNU Fortran (GCC) version 4.6.0 (i686-pc-linux-gnu) compiled by GNU C version 4.6.0, GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.1 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 ignoring nonexistent directory "/opt/packages/gcc/4.6.0/lib/gcc/i686-pc-linux-gnu/4.6.0/../../../../i686-pc-linux-gnu/include" #include "..." 
search starts here: #include <...> search starts here: /opt/packages/gcc/4.6.0/lib/gcc/i686-pc-linux-gnu/4.6.0/finclude /opt/packages/gcc/4.6.0/lib/gcc/i686-pc-linux-gnu/4.6.0/include /usr/local/include /opt/packages/gcc/4.6.0/include /opt/packages/gcc/4.6.0/lib/gcc/i686-pc-linux-gnu/4.6.0/include-fixed /usr/include End of search list. GNU Fortran (GCC) version 4.6.0 (i686-pc-linux-gnu) compiled by GNU C version 4.6.0, GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.1 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 jacobi.F90: In function 'jacobi': jacobi.F90:137:0: internal compiler error: Segmentation fault STEP TO REPRODUCE = $ gfortran -finstrument-functions -c jacobi.f90
[Bug rtl-optimization/48702] [4.6/4.7 Regression] optimization regression with gcc-4.6 on x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48702 Richard Guenther changed: What|Removed |Added Known to work|4.7.0 | Summary|[4.6 Regression]|[4.6/4.7 Regression] |optimization regression |optimization regression |with gcc-4.6 on |with gcc-4.6 on |x86_64-unknown-linux-gnu|x86_64-unknown-linux-gnu Known to fail||4.7.0 --- Comment #3 from Richard Guenther 2011-04-20 15:08:32 UTC --- I have a patch that makes it fail on trunk as well. IVOPTs generates for (p = &a; p != &a - 3; --p) *(p + 3) = ... and alias analysis doesn't like this invalid pointer.
[Bug libstdc++/36231] ostream includes unistd.h outside namespace std, polluting
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36231 --- Comment #12 from Jonathan Wakely 2011-04-20 15:17:37 UTC --- N.B. same issue reported at https://bugzilla.redhat.com/show_bug.cgi?id=502251
[Bug preprocessor/48677] cpp.exe broken ?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48677 --- Comment #9 from ralphengels at gmail dot com 2011-04-20 15:17:40 UTC --- if its any help i noticed that cpp.exe seems to have a dependency on libstdc++6.dll "somewhere" since dependency walker says it doesnt but it barfs pretty loudly if its not there. strange thing is the libstdc++6.dll it wants is the gcc-4.6.0 one (which isnt even built yet) so i had to copy an earlier build to path for it to pick it up. ill try a static build to see if it still throws the error.
[Bug middle-end/48704] ICE: gfortran dies when '-finstrument-functions' option is used
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48704 Tobias Burnus changed: What|Removed |Added CC||burnus at gcc dot gnu.org Component|fortran |middle-end --- Comment #1 from Tobias Burnus 2011-04-20 15:18:59 UTC --- With 4.7, I get the following; I have not yet analyzed the failure, but the valgrind output indicates that the segfault happens deep in the middle end: Invalid read of size 2 at 0x68DD55: gimple_build_call (gimple.c:267) by 0x69CB98: gimplify_function_tree (gimplify.c:7896) by 0x91708B: cgraph_analyze_function (cgraphunit.c:790) by 0x9188C9: cgraph_analyze_functions (cgraphunit.c:976) by 0x91931B: cgraph_finalize_compilation_unit (cgraphunit.c:1087) by 0x6D317E: write_global_declarations (langhooks.c:303) by 0x79376A: do_compile (toplev.c:591) by 0x793F24: toplev_main (toplev.c:1967) by 0x38FA81D993: (below main) (in /lib64/libc-2.5.so)
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 --- Comment #7 from Linus Torvalds 2011-04-20 15:30:17 UTC --- (In reply to comment #2) > > I'm not sure where to best address this, rather than throwing in again > the idea of lowering bitfield accesses early on trees. So my gut feel is that getting rid of the bitfield as early as possible, and turning all bitfield accesses into regular load/shift/mask/store operations is always the right thing to do. I also think that doing it with the size that the user specified is generally a good idea, ie I sincerely hope that gcc hasn't thrown away the "unsigned int" part of the type when it does the lowering of the bitfield op. If gcc has forgotten the underlying type, and only looks at the bitfield size and offset, gcc will likely never do a good job at it unless gcc gets _really_ smart and looks at all the accesses around it and decides "I need to do these as 'int' just because (ie in the example, the "unsigned" base type is as important as is the "bits 0..5" range information). So I suspect it's better to just do a totally mindless expansion of bitfield accesses early, and then use all the regular optimizations on them. Rather than keep them around as bitfields and try to optimize at some higher level. In an ironic twist, the real program that shows this optimization problem is "sparse" (the kernel source code checker), which can actually do a "linearize and optimize" the test-case itself, and in this case does this all better than gcc (using its "dump the linearized IR" test-program):

[torvalds@i5 ~]$ ./src/sparse/test-linearize test.c
test.c:7:5: warning: symbol 'show_bug' was not declared. Should it be static?
show_bug:
.L0x7f4cf7b93010:
        load.32     %r2 <- 0[%arg1]
        and.32      %r3 <- %r2, $-64
        store.32    %r3 -> 0[%arg1]
        lsr.32      %r7 <- %r3, $6
        cast.32     %r8 <- (16) %r7
        ret.32      %r8

Heh. Sparse may get a lot of other things wrong, but it got this particular case right.
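The "mindless expansion" into load/mask/store/shift described in comment #7 can be written out by hand in C. The sketch below is an assumption-laden illustration: the struct and field names are hypothetical (the actual testcase is not quoted in this thread), and the layout it relies on, a 6-bit field in bits 0..5 followed by a 16-bit field in bits 6..21 of one 32-bit word, is what GCC uses on little-endian x86 but is implementation-defined in general.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical struct; field names and widths are assumptions chosen
   to match the and $-64 / shift by 6 / 16-bit cast seen in the dumps. */
struct packed_fields {
    unsigned int type : 6;
    unsigned int pos  : 16;
    unsigned int rest : 10;
};

/* Bitfield version: clear the low field, return the next one. */
int show_bug_bitfield(struct packed_fields *p)
{
    p->type = 0;
    return p->pos;
}

/* Hand-lowered version operating on the containing 32-bit word:
   one full-width load, one mask, one full-width store, then shift
   and narrow to 16 bits. */
int show_bug_lowered(uint32_t *word)
{
    uint32_t v = *word;         /* load.32                     */
    v &= ~UINT32_C(63);         /* and.32 with -64: bits 0..5  */
    *word = v;                  /* store.32                    */
    return (uint16_t)(v >> 6);  /* lsr.32 by 6, then cast (16) */
}
```

Because the load and the store have the same width, the value just stored can be reused from the register, which is the shape of both the 4.0.3 output and the sparse output above. A caller holding a `struct packed_fields` could copy it into a `uint32_t` with `memcpy` to compare the two versions on a target with the assumed layout.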
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 --- Comment #8 from Richard Guenther 2011-04-20 15:39:38 UTC --- (In reply to comment #7) > (In reply to comment #2) > > > > I'm not sure where to best address this, rather than throwing in again > > the idea of lowering bitfield accesses early on trees. > > So my gut feel is that getting rid of the bitfield as early as possible, and > turning all bitfield accesses into regular load/shift/mask/store operations is > always the right thing to do. > > I also think that doing it with the size that the user specified is generally > a > good idea, ie I sincerely hope that gcc hasn't thrown away the "unsigned int" > part of the type when it does the lowering of the bitfield op. Yeah, I was working on this some time ago. > If gcc has forgotten the underlying type, and only looks at the bitfield size > and offset, gcc will likely never do a good job at it unless gcc gets _really_ > smart and looks at all the accesses around it and decides "I need to do these > as 'int' just because (ie in the example, the "unsigned" base type is as > important as is the "bits 0..5" range information). Unfortunately the underlying type isn't easily available (at least I didn't yet find it ...). But I suppose we have to guess anyway considering targets that don't handle unaligned accesses well or packed bitfields. Thus, an idea was to use aligned word-size loads/stores and only at the start/end of a structure fall back to smaller accesses (for strict align targets). I still hope to eventually find that underlying type info somewhere ... > So I suspect it's better to just do a totally mindless expansion of bitfield > accesses early, and then use all the regular optimizations on them. Rather > than > keep them around as bitfields and try to optimize at some higher level. Yep. Same for how we currently deal with unaligned loads on targets that do not support them - the generated code is very similar.
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 --- Comment #9 from Richard Guenther 2011-04-20 15:41:09 UTC --- Btw, the branch from the work "some time ago" created

show_bug:
.LFB2:
        movl    (%rdi), %eax
        andl    $-64, %eax
        movl    %eax, (%rdi)
        shrl    $6, %eax
        movzwl  %ax, %eax
        ret

for your testcase.
[Bug fortran/48636] Enable more inlining with -O2 and higher
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636 --- Comment #9 from Tobias Burnus 2011-04-20 15:39:47 UTC --- > But do we actually do this? I did some tests a while ago, and IIRC for assumed > shape dummy arguments the procedure always calculates new bounds such that > they > start from 1. That is, the procedure assumes that the actual argument > descriptor may have lower bounds != 1. > So my argument is basically that with the new descriptor it might make sense > to > switch the responsibility around such that it's the caller who makes sure that > all lower bounds are 0 (as we must have the capability to do this anyway in > order to call inter-operable procedures, no?) instead of the callee. No, the conversion is already done in the caller:

  subroutine bar(B)
    interface; subroutine foo(a); integer :: a(:); end subroutine foo
    end interface
    integer :: B(:)
    call foo(B)
  end subroutine bar

Shows:

  parm.4.dim[0].lbound = 1;
  [...]
  foo (&parm.4);

For assumed-shape actual arguments, creating a new descriptor is actually not needed - only for deferred shape ones - or if one does not have a full array ref. Cf. gfc_conv_array_parameter, which is called by gfc_conv_procedure_call. However, some additional calculation is also done in the callee to determine the stride and offset; e.g.

  ubound.0 = (b->dim[0].ubound - b->dim[0].lbound) + 1;

again, if the dummy argument is not deferred-shaped (allocatable or pointer), one actually knows that "b->dim[0].lbound" == 1. I think we have some redundancy here -> missed optimization.
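The redundancy pointed out in comment #9 can be illustrated with a small C sketch. The `bounds_t` type and both function names are hypothetical and only mimic the `dim[0].lbound` / `dim[0].ubound` fields visible in the dump above; this is not gfortran's real descriptor or code.

```c
#include <stddef.h>

/* Hypothetical per-dimension bounds, mimicking the fields in the dump. */
typedef struct {
    ptrdiff_t lbound;
    ptrdiff_t ubound;
} bounds_t;

/* Caller side: shift the bounds so that the lower bound becomes
   new_lbound while the extent stays the same. */
void rebase_bounds(bounds_t *d, ptrdiff_t new_lbound)
{
    ptrdiff_t extent = d->ubound - d->lbound + 1;
    d->lbound = new_lbound;
    d->ubound = new_lbound + extent - 1;
}

/* Callee side: the generic computation from the dump,
   ubound.0 = (ubound - lbound) + 1, which yields the upper bound of a
   dummy whose lower bound is 1. */
ptrdiff_t callee_ubound(const bounds_t *d)
{
    return (d->ubound - d->lbound) + 1;
}
```

Once the caller has rebased the descriptor to a lower bound of 1, `callee_ubound` returns exactly `d->ubound`, so the subtraction and addition in the callee carry no information. If the convention were a lower bound of 0 instead, as the TR requires for interoperable procedures, the same reasoning would reduce the callee's work to `d->ubound + 1`.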
[Bug c/47892] Fails to vectorize comparison code, if-conversion fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47892 --- Comment #7 from Richard Guenther 2011-04-20 15:50:38 UTC --- Author: rguenth Date: Wed Apr 20 15:50:26 2011 New Revision: 172774 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172774 Log: 2011-04-20 Richard Guenther PR tree-optimization/47892 * tree-if-conv.c (if_convertible_stmt_p): Const builtins are if-convertible. * gcc.dg/vect/fast-math-ifcvt-1.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/vect/fast-math-ifcvt-1.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-if-conv.c
[Bug c/47892] Fails to vectorize comparison code, if-conversion fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47892 Richard Guenther changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED Target Milestone|--- |4.7.0 --- Comment #8 from Richard Guenther 2011-04-20 15:51:13 UTC --- Fixed on trunk.
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 --- Comment #10 from Jan Hubicka 2011-04-20 15:54:17 UTC --- > Actually the 4.0.4 compiler is x86_64, the code with -m32. The 4.0.3 > compiler is i586. > > /space/rguenther/install/gcc-4.0.3/libexec/gcc/i686-pc-linux-gnu/4.0.3/cc1 > -quiet -v t.c -quiet -dumpbase t.c -m32 -mtune=pentiumpro -auxbase t -O2 > -version -o t.s > > > /space/rguenther/install/gcc-4.0.4/libexec/gcc/x86_64-unknown-linux-gnu/4.0.4/cc1 > -quiet -v t.c -quiet -dumpbase t.c -m32 -mtune=k8 -auxbase t -O2 -version -o > t.s > > but no -march/tune combination makes the bug vanish for the 4.0.4 compiler > (maybe a HWI dependent "optimization") We do have i386 flag to disable instruction choice that change memory access size, it is X86_TUNE_MEMORY_MISMATCH_STALL and I remember myself adding code to check that we don't do instruction selection that changes memory access size (i.e. andl->andb promotion) unless when optimizing for size. This code seems to've evaporated from backend, so I guess it needs re-adding ;) It did not however work quite reliably since combiner actually does this kind of transform himself from time to time and I did not get approval to get target hook for that back then (in 2000 or so). Honza
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 --- Comment #11 from Linus Torvalds 2011-04-20 16:16:52 UTC ---
(In reply to comment #8)
>
> Unfortunately the underlying type isn't easily available (at least I didn't
> yet find it ...). But I suppose we have to guess anyway considering
> targets that don't handle unaligned accesses well or packed bitfields.
> Thus, an idea was to use aligned word-size loads/stores and only at the
> start/end of a structure fall back to smaller accesses (for strict align
> targets).

That sounds fine.

The only reason to bother with the "underlying type" is that I suspect it could be possible for educated programmers to use it as a code generation hint. IOW, if all the individual fields end up fitting nicely in "char", using that as a base type (even if the _total_ fields don't fit in a single byte) might be a good hint for the compiler that it can/should use byte accesses and small constants.

But using the biggest aligned word-size is probably equally good in practice.

And if you end up narrowing the types on _reads_, I think that's fine on x86. I forget the exact store buffer forwarding rules (and they probably vary a bit between different microarchitectures anyway), but I think almost all of them support forwarding a larger store into a smaller (aligned) load.

It's just the writes that should normally not be narrowed.

(Of course, sometimes you may really want to narrow it. Replacing a

    andl $0xff00,(%rax)

with a simple

    movb $0,(%rax)

is certainly a very tempting optimization, but it really only works if there are no subsequent word-sized loads that would get fouled by the write buffer entry.)
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 --- Comment #12 from Jakub Jelinek 2011-04-20 16:19:05 UTC --- Well, there is also the expander that can and often does increase the size of the accesses, see e.g. PR48124 for more details. And e.g. for C++0x memory model as well as -fopenmp or, I guess, kernel SMP as well, the access size should never grow into following non-bitfield fields and the bitfield access lowering has to take it into account.
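The constraint in the comment above (an access must not grow into a following non-bitfield field) can be pictured as clamping the access width to the bytes owned by the bitfield group. The helper below is a hypothetical sketch of that clamping only, not code from GCC: `start` and `end` are assumed byte offsets delimiting the bitfield group within the structure, and `max_width` is assumed to be a power of two such as the target word size.

```cpp
#include <cstddef>

// Hypothetical helper: returns the widest naturally aligned, power-of-two
// access (in bytes) that begins at byte offset `start` and does not reach
// `end`, the offset of the first byte that belongs to some other field.
// Preconditions (assumed): start < end, max_width is a power of two.
inline std::size_t widest_safe_access(std::size_t start, std::size_t end,
                                      std::size_t max_width) {
    std::size_t width = max_width;
    // Halve the width until the access is aligned and stays inside the group.
    while (width > 1 && (start % width != 0 || start + width > end)) {
        width /= 2;
    }
    return width;
}
```

For a group occupying bytes 0..2 with an unrelated `char` at byte 3, this yields a 2-byte access at offset 0, so a lowering pass would need a further 1-byte access for byte 2 instead of a single 4-byte access that also covers byte 3.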
[Bug fortran/48636] Enable more inlining with -O2 and higher
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636 --- Comment #10 from Thomas Koenig 2011-04-20 16:40:46 UTC ---
(In reply to comment #6)
> Not strictly related to inlining, but in the new descriptor we'll have a field
> specifying whether the array is simply contiguous, so it might make sense to
> generate two loops for each loop over the array in the source, one for the
> contiguous case where it can be vectorized etc. and another loop for the
> general case.

This might reduce the profitability of inlining. Consider the following, hand-crafted matmul:

Here, we have three nested loops. The most interesting one is the innermost loop of the matmul, which we vectorize by inlining if we omit the call to my_matmul with non-unity stride for a when compiling with -fwhole-program -O3.

How many versions of the loop should we generate? Two or eight, depending on what the caller may do? ;-)

module foo
  implicit none
contains
  subroutine my_matmul(a,b,c)
    implicit none
    integer :: count, m, n
    real, dimension(:,:), intent(in) :: a,b
    real, dimension(:,:), intent(out) :: c
    integer :: i,j,k
    m = ubound(a,1)
    n = ubound(b,2)
    count = ubound(a,2)
    c = 0
    do j=1,n
      do k=1, count
        do i=1,m
          c(i,j) = c(i,j) + a(i,k) * b(k,j)
        end do
      end do
    end do
  end subroutine my_matmul
end module foo

program main
  use foo
  implicit none
  integer, parameter :: factor=100
  integer, parameter :: n = 2*factor, m = 3*factor, count = 4*factor
  real, dimension(m, count) :: a
  real, dimension(count, n) :: b
  real, dimension(m,n) :: c1, c2
  real, dimension(m/2, n) :: ch_1, ch_2
  call random_number(a)
  call random_number(b)
  call my_matmul(a,b,c1)
  c2 = matmul(a,b)
  if (any(abs(c1 - c2) > 1e-5)) call abort
  call my_matmul(a(1:m:2,:),b,ch_1)
  ch_2 = matmul(a(1:m:2,:),b)
  if (any(abs(ch_1 - ch_2) > 1e-5)) call abort
end program main
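To make the loop-versioning question concrete, here is a minimal C++ sketch of the innermost update with two versions selected on the stride of one operand only. It is an illustration under assumptions, not what gfortran generates: the function name and signature are hypothetical, and only `a` is versioned. Versioning independently on all three arrays would give 2^3 = 8 variants, which is presumably the "eight" in the question above.

```cpp
#include <cstddef>

// Computes c[i] += a(i) * b for i = 0..m-1, where consecutive elements of
// `a` are `a_stride` floats apart (a_stride == 1 means contiguous).
inline void column_update(float* c, const float* a, std::ptrdiff_t a_stride,
                          float b, std::size_t m) {
    if (a_stride == 1) {
        // Contiguous version: unit-stride loads, a good vectorization candidate.
        for (std::size_t i = 0; i < m; ++i) {
            c[i] += a[i] * b;
        }
    } else {
        // General version: strided loads, as for a section like a(1:m:2,:).
        for (std::size_t i = 0; i < m; ++i) {
            c[i] += a[static_cast<std::ptrdiff_t>(i) * a_stride] * b;
        }
    }
}
```

Each versioned array doubles the code size of the loop nest, which is the cost that has to be weighed against inlining.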
[Bug middle-end/48585] [4.7 Regression] 483.xalancbmk in SPEC CPU 2006 failed to build
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48585 --- Comment #8 from Martin Jambor 2011-04-20 17:24:04 UTC --- Looking into this a bit more, this has probably nothing to do with IPA-CP at all. When I dump the body of the function being inlined I get: -- (gdb) call debug_function (id->src_fn, 0) __base_ctor (struct XalanDOMStringPool * const this, long unsigned int theBlockSize, long unsigned int theBucketCount, long unsigned int theBucketSize) { struct XalanDOMStringHashTable * D.46218; struct XalanDOMStringAllocator * D.46217; : this_1(D)->_vptr.XalanDOMStringPool = &_ZTVN10xalanc_1_818XalanDOMStringPoolE[2]; D.46217_2 = &this_1(D)->m_stringAllocator; __comp_ctor (D.46217_2, theBlockSize_3(D)); this_1(D)->m_stringCount = 0; D.46218_4 = &this_1(D)->m_hashTable; __comp_ctor (D.46218_4, theBucketCount_5(D), theBucketSize_6(D)); : return; : __comp_dtor (D.46217_2); resx 1 } -- Yet when I look up the function in the pre-LTO release_ssa dump, I see: -- xalanc_1_8::XalanDOMStringPool::XalanDOMStringPool(unsigned long, unsigned long, unsigned long) (struct XalanDOMStringPool * const this, block_size_type theBlockSize, bucket_count_type theBucketCount, bucket_size_type theBucketSize) { struct XalanDOMStringHashTable * D.22278; struct AllocatorType * D.22277; : this_1(D)->_vptr.XalanDOMStringPool = &_ZTVN10xalanc_1_818XalanDOMStringPoolE[2]; D.22277_2 = &this_1(D)->m_stringAllocator; xalanc_1_8::XalanDOMStringAllocator::XalanDOMStringAllocator (D.22277_2, theBlockSize_3(D)); this_1(D)->m_stringCount = 0; D.22278_4 = &this_1(D)->m_hashTable; xalanc_1_8::XalanDOMStringHashTable::XalanDOMStringHashTable (D.22278_4, theBucketCount_5(D), theBucketSize_6(D)); : return; : xalanc_1_8::XalanDOMStringAllocator::~XalanDOMStringAllocator (D.22277_2); resx 1 } The call to the destructor is already there, it almost looks like we lost the part of the call graph that represents it somewhere in streaming, WPA, partitioning or streaming for LTRANS...
[Bug target/48678] unable to find a register to spill in class ‘GENERAL_REGS’
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48678 Uros Bizjak changed: What|Removed |Added Attachment #24061|0 |1 is obsolete|| --- Comment #13 from Uros Bizjak 2011-04-20 17:41:59 UTC --- Created attachment 24064 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24064 Expand movstrict pattern as pinsr for vector subregs, v2.
[Bug target/48678] unable to find a register to spill in class ‘GENERAL_REGS’
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48678 --- Comment #14 from Uros Bizjak 2011-04-20 17:42:45 UTC --- (In reply to comment #12) > Hm, if line 14 in the testcase is changed to: > > - ((T *) &s.d)[0] = *x; > + ((T *) &s.d)[1] = *x; We should go through insv pattern. Patch v2 attached above.
[Bug target/48690] gcc-4.3.5 fails for target m68k
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48690 --- Comment #7 from diggskevin38 at gmail dot com 2011-04-20 17:42:51 UTC --- This is also busted in all of 4.3 (0, 1, 2, 3, & 4) and 4.5.1.
[Bug target/48690] gcc-4.3.5 fails for target m68k
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48690 --- Comment #8 from diggskevin38 at gmail dot com 2011-04-20 17:46:09 UTC --- Would a diff of the 4.2.4 and 4.3.0 m68k.md file be a useful attachment?
[Bug target/48690] gcc-4.3.5 fails for target m68k
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48690 Andreas Tobler changed: What|Removed |Added CC||andreast at gcc dot gnu.org --- Comment #9 from Andreas Tobler 2011-04-20 17:49:59 UTC --- Is r122895 the last revision which 'works'? (Check out a 4.3 branch and do an 'svn up -r 122895'. For checkout details see: http://gcc.gnu.org/svn.html)
[Bug fortran/48588] [4.6/4.7 Regression] ICE (segfault) in gfc_get_nodesc_array_type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48588 --- Comment #9 from Tobias Burnus 2011-04-20 18:07:56 UTC --- Author: burnus Date: Wed Apr 20 18:07:52 2011 New Revision: 172782 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172782 Log: 2011-04-19 Tobias Burnus PR fortran/48588 PR fortran/48692 * module.c (fix_mio_expr): Commit created symbol. Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/module.c
[Bug fortran/48692] [4.7 Regression] ICE with gfortran.dg/module_write_1.f90
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48692 --- Comment #5 from Tobias Burnus 2011-04-20 18:07:56 UTC --- Author: burnus Date: Wed Apr 20 18:07:52 2011 New Revision: 172782 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172782 Log: 2011-04-19 Tobias Burnus PR fortran/48588 PR fortran/48692 * module.c (fix_mio_expr): Commit created symbol. Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/module.c
[Bug fortran/48692] [4.7 Regression] ICE with gfortran.dg/module_write_1.f90
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48692 Tobias Burnus changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED --- Comment #6 from Tobias Burnus 2011-04-20 18:12:15 UTC --- FIXED
[Bug fortran/48636] Enable more inlining with -O2 and higher
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636 --- Comment #11 from Janne Blomqvist 2011-04-20 18:14:20 UTC --- (In reply to comment #9) > > But do we actually do this? I did some tests a while ago, and IIRC for > > assumed > > shape dummy arguments the procedure always calculates new bounds such that > > they > > start from 1. That is, the procedure assumes that the actual argument > > descriptor may have lower bounds != 1. > > So my argument is basically that with the new descriptor it might make > > sense to > > switch the responsibility around such that it's the caller who makes sure > > that > > all lower bounds are 0 (as we must have the capability to do this anyway in > > order to call inter-operable procedures, no?) instead of the callee. > > No, the conversion is already done in the caller: > > subroutine bar(B) > interface; subroutine foo(a); integer :: a(:); end subroutine foo > end interface > integer :: B(:) > call foo(B) > end subroutine bar > > Shows: > parm.4.dim[0].lbound = 1; > [...] > foo (&parm.4); > For assumed-shape actual arguments, creating a new descriptor is actually not > needed - only for deferred shape ones - or if one does not have a full array > ref. > Cf. gfc_conv_array_parameter, which is called by gfc_conv_procedure_call. > > However, some additional calculation is also done in the the callee to > determine the stride and offset; e.g. > ubound.0 = (b->dim[0].ubound - b->dim[0].lbound) + 1; > again, if the dummy argument is not deferred-shaped (allocatable or pointer), > one actually knows that "b->dim[0].lbound" == 1. I think we have some > redundancy here -> missed optimization. Yes, there seems to be some redundancy indeed in that case. 
I dug up my testcase: module asstest implicit none contains subroutine assub(a, r) real, intent(in) :: a(:,:) real, intent(out) :: r r = a(42,43) end subroutine assub subroutine assub2(a, r) real, intent(in), allocatable :: a(:,:) real, intent(out) :: r r = a(42,43) end subroutine assub2 end module asstest The -fdump-tree-original tree for this module is: assub2 (struct array2_real(kind=4) & a, real(kind=4) & r) { *r = (*(real(kind=4)[0:] *) a->data)[(a->dim[0].stride * 42 + a->dim[1].stride * 43) + a->offset]; } assub (struct array2_real(kind=4) & a, real(kind=4) & r) { integer(kind=8) ubound.0; integer(kind=8) stride.1; integer(kind=8) ubound.2; integer(kind=8) stride.3; integer(kind=8) offset.4; integer(kind=8) size.5; real(kind=4)[0:D.1567] * a.0; integer(kind=8) D.1567; bit_size_type D.1568; D.1569; { integer(kind=8) D.1566; D.1566 = a->dim[0].stride; stride.1 = D.1566 != 0 ? D.1566 : 1; a.0 = (real(kind=4)[0:D.1567] *) a->data; ubound.0 = (a->dim[0].ubound - a->dim[0].lbound) + 1; stride.3 = a->dim[1].stride; ubound.2 = (a->dim[1].ubound - a->dim[1].lbound) + 1; size.5 = stride.3 * NON_LVALUE_EXPR ; offset.4 = -stride.1 - NON_LVALUE_EXPR ; D.1567 = size.5 + -1; D.1568 = (bit_size_type) size.5 * 32; D.1569 = () size.5 * 4; } *r = (*a.0)[(stride.1 * 42 + stride.3 * 43) + offset.4]; } So if we make sure that the caller fixes up the descriptor so that bounds are correct for assumed-shape parameters (as the TR requires for inter-operable procedures), then assub could be as simple as assub2.
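The difference between the two dumps can be modelled outside the compiler. The C++ sketch below uses a simplified, hypothetical descriptor with only the fields that appear in the dump (`data`, `offset`, `dim[].stride`, `dim[].lbound`, `dim[].ubound`); it is not gfortran's actual descriptor type. In the `assub`-style function the operand that was lost from the `offset.4` line of the dump is assumed to be `stride.3`.

```cpp
#include <cstddef>

// Simplified stand-in for a rank-2 array descriptor (hypothetical type).
struct Dim { std::ptrdiff_t stride, lbound, ubound; };
struct Descriptor2 {
    const float* data;
    std::ptrdiff_t offset;
    Dim dim[2];
};

// assub2-style access: trust the stride and offset stored in the descriptor.
inline float element_direct(const Descriptor2& a, std::ptrdiff_t i,
                            std::ptrdiff_t j) {
    return a.data[a.dim[0].stride * i + a.dim[1].stride * j + a.offset];
}

// assub-style access: recompute stride and offset in the callee so that
// indices start at 1, ignoring whatever offset the caller stored.
inline float element_rebased(const Descriptor2& a, std::ptrdiff_t i,
                             std::ptrdiff_t j) {
    const std::ptrdiff_t stride0 = a.dim[0].stride != 0 ? a.dim[0].stride : 1;
    const std::ptrdiff_t stride1 = a.dim[1].stride;
    // Assumption: offset.4 = -stride.1 - stride.3 in the dump above.
    const std::ptrdiff_t offset = -stride0 - stride1;
    return a.data[stride0 * i + stride1 * j + offset];
}
```

When the caller has already normalised the descriptor (lower bounds of 1 and `offset == -(stride0 + stride1)`), both functions read the same element, which is why the recomputation in the callee is redundant in that case.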
[Bug middle-end/48585] [4.7 Regression] 483.xalancbmk in SPEC CPU 2006 failed to build
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48585 --- Comment #9 from Martin Jambor 2011-04-20 18:17:14 UTC ---
Actually, IPA-CP is clearly involved, the function we're inlining to is:

(gdb) call debug_generic_expr(id->dst_fn)
_ZN10xalanc_1_818XalanDOMStringPoolC2Emmm.constprop.15285

Looking at the WPA cgraph dump, it is apparent that also the src_fn is cloned by IPA-CP. However, it remains to be seen why the original and not the clone is being inlined here. When I look at the ASM name of the src function:

(gdb) call debug_generic_expr(decl_assembler_name(id->src_fn))
_ZN10xalanc_1_818XalanDOMStringPoolC2Emmm.9230

And then look it up in the WPA cgraph dump, it shows no callers or callees at all:

__base_ctor /77004(-1) @0x7f91ca86f000 (asm: _ZN10xalanc_1_818XalanDOMStringPoolC2Emmm) availability:not_available local prevailing_def_ironly finalized
  called by:
  calls:
  References:
  Refering this function:
  aliases & thunks: __comp_ctor /77005 (asm: _ZN10xalanc_1_818XalanDOMStringPoolC1Emmm)

So I don't quite understand how it can be scheduled to be inlined...
[Bug tree-optimization/48611] [4.6/4.7 Regression] ICE: SIGSEGV in remap_eh_region_nr (tree-inline.c:1194) with -Os -fopenmp -fexceptions -fno-tree-ccp -fno-tree-copy-prop on basic code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48611 --- Comment #3 from Jakub Jelinek 2011-04-20 18:18:19 UTC --- Author: jakub Date: Wed Apr 20 18:18:16 2011 New Revision: 172783 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172783 Log: PR tree-optimization/48611 * tree-eh.c (note_eh_region_may_contain_throw): Don't propagate beyond ERT_MUST_NOT_THROW region. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-eh.c
[Bug tree-optimization/48611] [4.6/4.7 Regression] ICE: SIGSEGV in remap_eh_region_nr (tree-inline.c:1194) with -Os -fopenmp -fexceptions -fno-tree-ccp -fno-tree-copy-prop on basic code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48611 --- Comment #4 from Jakub Jelinek 2011-04-20 18:19:50 UTC --- Author: jakub Date: Wed Apr 20 18:19:47 2011 New Revision: 172786 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172786 Log: PR tree-optimization/48611 * tree-eh.c (note_eh_region_may_contain_throw): Don't propagate beyond ERT_MUST_NOT_THROW region. Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/tree-eh.c
[Bug tree-optimization/48611] [4.6/4.7 Regression] ICE: SIGSEGV in remap_eh_region_nr (tree-inline.c:1194) with -Os -fopenmp -fexceptions -fno-tree-ccp -fno-tree-copy-prop on basic code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48611 Jakub Jelinek changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED --- Comment #5 from Jakub Jelinek 2011-04-20 18:29:48 UTC --- Fixed.
[Bug libstdc++/36231] ostream includes unistd.h outside namespace std, polluting
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36231 Paolo Carlini changed: What|Removed |Added CC||jwakely.gcc at gmail dot ||com --- Comment #13 from Paolo Carlini 2011-04-20 18:42:26 UTC ---
The last time I checked, the problem boiled down to:

  typedef __gthread_mutex_t __c_lock;

in c_io_stdio.h, which we cannot remove right away for ABI reasons, because we have a __c_lock data member in iostream classes. Of course the member is normally completely unused these days, thus a possible ABI-safe way to attack the problem would be replacing the data member with a dummy member of the same size and alignment, the equivalent of:

  class stream
  {
    union
    {
      char __data[sizeof(__gthread_mutex_t)];
      struct __attribute__((__aligned__ ((__alignof__(__gthread_mutex_t))))) { } __align;
    } __dummy_member;
  };

In order to figure out those quantities, ie, sizeof(__gthread_mutex_t) and __alignof__(__gthread_mutex_t) we could probably use something like [GLIBCXX_COMPUTE_STDIO_INTEGER_CONSTANTS], Ralf, people, what do you think?
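The layout-preserving idea can be checked in isolation. The sketch below is a hedged illustration in standard C++11, not libstdc++ code: `legacy_lock` is a hypothetical stand-in for `__gthread_mutex_t`, `alignas` replaces the GNU attribute, and the size and alignment are taken directly from the type, whereas the proposal above would obtain them as configure-time constants so that the header no longer needs to see the mutex type at all.

```cpp
#include <cstddef>

// Hypothetical stand-in for the real lock type (__gthread_mutex_t).
struct legacy_lock {
    long state;
    void* owner;
    char pad[24];
};

// Opaque placeholder with the same size and alignment as legacy_lock.
struct alignas(alignof(legacy_lock)) lock_placeholder {
    unsigned char data[sizeof(legacy_lock)];
};

static_assert(sizeof(lock_placeholder) == sizeof(legacy_lock),
              "placeholder must have the same size");
static_assert(alignof(lock_placeholder) == alignof(legacy_lock),
              "placeholder must have the same alignment");

// A class before and after the member is swapped for the placeholder.
struct stream_old { int flags; legacy_lock lock; int tail; };
struct stream_new { int flags; lock_placeholder lock; int tail; };

// The surrounding layout is unchanged, which is what ABI stability needs.
static_assert(sizeof(stream_old) == sizeof(stream_new),
              "class size must not change");
static_assert(offsetof(stream_old, tail) == offsetof(stream_new, tail),
              "offsets of later members must not change");
```

If any of the static assertions fired on some target, the replacement would change the class layout there and would not be ABI-safe.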
[Bug libstdc++/36231] ostream includes unistd.h outside namespace std, polluting
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36231 Paolo Carlini changed: What|Removed |Added CC||rwild at gcc dot gnu.org --- Comment #14 from Paolo Carlini 2011-04-20 18:44:18 UTC --- Ralf, what do you think about my last Comment?
[Bug fortran/48705] New: [OOP] ICE with generic TBP
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48705 Summary: [OOP] ICE with generic TBP Product: gcc Version: 4.7.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: fortran AssignedTo: unassig...@gcc.gnu.org ReportedBy: bur...@gcc.gnu.org CC: ja...@gcc.gnu.org The following program fails with: generic_deferred_01_pos.f90:45:0: internal compiler error: in fold_convert_loc, at fold-const.c:1915 The test case is part of LRZ's fortran_tests. module generic_deferred implicit none type, abstract :: addable contains private procedure(add), deferred :: a generic, public :: operator(+) => a end type addable abstract interface function add(x, y) result(res) import :: addable class(addable), intent(in) :: x, y class(addable), allocatable :: res end function add end interface type, extends(addable) :: vec integer :: i(2) contains procedure :: a => a_vec end type contains function a_vec(x, y) result(res) class(vec), intent(in) :: x class(addable), intent(in) :: y class(addable), allocatable :: res integer :: ii(2) select type(y) class is (vec) ii = y%i end select allocate(vec :: res) select type(res) type is (vec) res%i = x%i + ii end select end function end module generic_deferred program prog use generic_deferred implicit none type(vec) :: x, y class(addable), allocatable :: z ! x = vec( (/1,2/) ); y = vec( (/2,-2/) ) x%i = (/1,2/); y%i = (/2,-2/) allocate(z, source= x + y) select type(z) type is(vec) if (z%i(1) /= 3 .or. z%i(2) /= 0) then write(*,*) 'FAIL' else write(*,*) 'OK' end if end select end program prog