[Bug target/16641] fr30-elf-gcc compiler error when building newlib-1.12.0
--- Comment #6 from roger at eyesopen dot com 2006-04-23 21:19 --- This should now be fixed on mainline. I've confirmed that a cross-compiler to fr30-elf currently builds newlib without problems. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16641
[Bug target/21283] [4.0/4.1 regression] ICE with doubles
--- Comment #4 from roger at eyesopen dot com 2006-04-23 21:27 --- This has now been fixed on mainline. I've confirmed that a cross-compiler to fr30-elf can currently compile all of newlib without problems. If anyone has an fr30 board or a simulator to check the testsuite, that would be great. -- roger at eyesopen dot com changed: What|Removed |Added Summary|[4.0/4.1/4.2 regression] ICE|[4.0/4.1 regression] ICE |with doubles|with doubles Target Milestone|4.1.1 |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21283
[Bug target/27282] [4.2 regression] ICE in final_scan_insn, at final.c:2448 - could not split insn
--- Comment #6 from roger at eyesopen dot com 2006-04-25 14:09 --- Paolo's fix looks good to me. The bugzilla PR shows that this is a 4.2 regression, probably due to the more aggressive RTL optimizations on mainline. So I'll preapprove Paolo's fix for mainline (please post the version you commit and a new testcase when you commit it). As for 4.1, do we have an example of a failure or wrong code generation against the branch? I can't tell from bugzilla whether this is safely latent in 4.0 and 4.1, or just hasn't been investigated there yet ("known to work" is blank, but the summary only lists [4.2]). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27282
[Bug target/27282] [4.2 regression] ICE in final_scan_insn, at final.c:2448 - could not split insn
--- Comment #8 from roger at eyesopen dot com 2006-04-25 15:41 --- Grr. David's patch is also good. Perhaps better if we follow the usual protocol of posting patches to gcc-patches *after* bootstrap and regression testing, for review and approval. Posting untested patch fragments to bugzilla without ChangeLog entries and asking for preapproval etc... seems to, in this instance at least, demonstrate why GCC has the contribution protocols that it has. Thanks to David for catching this. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27282
[Bug target/21283] [4.0 regression] ICE with doubles
--- Comment #6 from roger at eyesopen dot com 2006-04-26 18:59 --- This has now been fixed on the 4.1 branch. Unfortunately, it's difficult to determine whether this patch is still needed on the 4.0 branch, or if other backports are also required, as libiberty and top-level configure are now incompatible between the gcc-4_0-branch and mainline "src", making an uberbaum build of a 4.0 cross-compiler almost impossible. -- roger at eyesopen dot com changed: What|Removed |Added Known to fail|4.0.1 4.1.0 |4.0.1 Known to work|3.4.5 |3.4.5 4.1.1 4.2.0 Summary|[4.0/4.1 regression] ICE|[4.0 regression] ICE with |with doubles|doubles Target Milestone|4.2.0 |4.1.1 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21283
[Bug rtl-optimization/13335] cse of sub-expressions of zero_extend/sign_extend expressions
--- Comment #9 from roger at eyesopen dot com 2006-04-30 19:52 --- This bug is a duplicate of PR17104 which was fixed by Nathan Sidwell in November 2004. If you read comment #4, you'll notice that the failure of CSE to handle the zero_extends emitted by the rs6000 backend's rs6000_emit_move is identical. *** This bug has been marked as a duplicate of 17104 *** -- roger at eyesopen dot com changed: What|Removed |Added Status|WAITING |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13335
[Bug rtl-optimization/17104] Non-optimal code generation for bitfield initialization
--- Comment #9 from roger at eyesopen dot com 2006-04-30 19:52 --- *** Bug 13335 has been marked as a duplicate of this bug. *** -- roger at eyesopen dot com changed: What|Removed |Added CC||dje at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17104
[Bug fortran/27269] Segfault with EQUIVALENCEs in modules together with ONLY clauses
--- Comment #5 from roger at eyesopen dot com 2006-05-02 14:24 --- This should now be fixed on mainline, thanks to Paul's patch. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27269
[Bug fortran/27324] Initialized module equivalence member causes assembler error
--- Comment #4 from roger at eyesopen dot com 2006-05-02 14:26 --- This should now be fixed on mainline by Paul's patch. Thanks. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27324
[Bug c/25309] [4.0/4.1/4.2 Regression] ICE on initialization of a huge array
--- Comment #10 from roger at eyesopen dot com 2006-05-04 00:14 --- This should now be fixed on mainline and all active branches. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|4.1.1 |4.0.4 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25309
[Bug tree-optimization/27285] [4.1 regression] ivopts postgresql miscompilation
--- Comment #8 from roger at eyesopen dot com 2006-05-08 15:29 --- I've now reconfirmed that this has been fixed on the gcc-4_1-branch by Jakub's backport of Zdenek's patch. Thanks to you both. -- roger at eyesopen dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27285
[Bug target/26600] [4.1/4.2 Regression] internal compiler error: in push_reload, at reload.c:1303
--- Comment #8 from roger at eyesopen dot com 2006-05-11 17:22 --- Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2006-05/msg00472.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26600
[Bug middle-end/20722] select_section invoked with argument "unlikely"
--- Comment #2 from roger at eyesopen dot com 2006-05-13 18:59 --- This is the correct documented behaviour. See the section entitled "USE_SELECT_SECTION_FOR_FUNCTIONS" in doc/tm.texi, which reads:

> @defmac USE_SELECT_SECTION_FOR_FUNCTIONS
> Define this macro if you wish TARGET_ASM_SELECT_SECTION to be called
> for @code{FUNCTION_DECL}s as well as for variables and constants.
>
> In the case of a @code{FUNCTION_DECL}, @var{reloc} will be zero if the
> function has been determined to be likely to be called, and nonzero if
> it is unlikely to be called.
> @end defmac

This is also cross-referenced from the TARGET_ASM_SELECT_SECTION target hook documentation, as the semantics for selecting function sections. The only backend that defines USE_SELECT_SECTION_FOR_FUNCTIONS, darwin, appears to implement the semantics as described above. The two calls in function_section and current_function_section are guarded by #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS. Admittedly, this could have been implemented by a second target hook, and/or the variable names in the varasm functions could be less confusing, but this isn't a bug, and certainly not P1. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20722
[Bug middle-end/26729] [4.0 regression] bad bitops folding
--- Comment #20 from roger at eyesopen dot com 2006-05-14 17:39 --- Hi APL, Re: comment #18. It was actually stevenb that changed the "known to work" line, and assigned this PR to me, after I'd committed a fix to the gcc-4_1-branch. See http://gcc.gnu.org/ml/gcc-bugs/2006-05/msg01351.html Marking 4.1.0 as known to work was a simple mistake/typo, and it should read that 4.1.0 is known to fail, but 4.1.1 is known to work. I retested b.cxx explicitly to confirm that it really is fixed on the release branch. -- roger at eyesopen dot com changed: What|Removed |Added Known to fail|3.3.6 3.4.3 4.0.2 |3.3.6 3.4.3 4.0.2 4.1.0 Known to work|2.95.4 4.2.0 4.1.0 |2.95.4 4.2.0 4.1.1 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26729
[Bug rtl-optimization/14261] ICE due to if-conversion
--- Comment #5 from roger at eyesopen dot com 2006-05-15 17:37 --- This should now be fixed on both mainline and the 4.1 branch. Thanks Andreas. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.1.2 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14261
[Bug middle-end/26729] [4.0 regression] bad bitops folding
--- Comment #22 from roger at eyesopen dot com 2006-05-15 17:41 --- This should now be fixed on all open branches. -- roger at eyesopen dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED Target Milestone|4.1.1 |4.0.4 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26729
[Bug target/26600] [4.1/4.2 Regression] internal compiler error: in push_reload, at reload.c:1303
--- Comment #12 from roger at eyesopen dot com 2006-05-18 01:50 --- This is now fixed on both mainline and the 4.1 branch. -- roger at eyesopen dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26600
[Bug middle-end/21067] Excessive optimization of floating point expression
--- Comment #4 from roger at eyesopen dot com 2006-05-20 15:14 --- This problem is fixed by specifying the -frounding-math command line option, which informs the compiler that non-default rounding modes may be used. With gcc-3.4, specifying this command line option disables this potentially problematic transformation. Strangely, on mainline, it looks like this transformation is no longer triggered, which may now indicate a missed optimization regression with (the default) -fno-rounding-math. We should also catch the division case. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21067
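[An illustrative sketch of the class of code affected, not the PR's testcase: constant folding of an inexact expression such as 1.0/3.0 happens at compile time under round-to-nearest, ignoring the mode set at run time, unless -frounding-math is given (compile with -frounding-math, link with -lm).]

#include <fenv.h>
#include <stdio.h>

int main (void)
{
  fesetround (FE_DOWNWARD);
  /* If 1.0/3.0 is folded at compile time it is rounded to nearest;
     with -frounding-math the division is kept and performed under
     the FE_DOWNWARD mode set above.  */
  volatile double third = 1.0 / 3.0;
  printf ("%.20g\n", third);
  return 0;
}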
[Bug tree-optimization/23452] Optimizing CONJG_EXPR (a) * a
--- Comment #3 from roger at eyesopen dot com 2006-06-01 02:41 --- This is now fixed on mainline provided the user specifies -ffast-math. There are some complications where imagpart(z*~z) can be non-zero if imagpart(z) is non-finite, such as an Inf or a NaN. It's unclear from the Fortran 95 standard whether gfortran is allowed to optimize this even without -ffast-math. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23452
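[In C terms, the non-finite corner case reads as follows; an illustrative example, not from the PR.]

#include <complex.h>

/* For z = a + b*I with b infinite, the imaginary part of z * conj(z)
   evaluates as b*a + a*(-b) = Inf - Inf = NaN rather than 0.0, which
   is why the folding is only safe under -ffast-math.  */
double imag_of_norm (double complex z)
{
  return cimag (z * conj (z));
}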
[Bug target/26223] [4.0 regression] ICE on long double with -mno-80387
--- Comment #13 from roger at eyesopen dot com 2006-06-06 22:41 --- This should now be fixed on all active branches. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26223
[Bug target/27082] segfault with virtual class and visibility ("hidden")
--- Comment #14 from roger at eyesopen dot com 2006-06-19 23:50 --- Unfortunately, I'm unable to reproduce this failure with a cross-compiler to alphaev68-unknown-linux-gnu. However, examination of the tracebacks attached to this PR and the relevant source code reveals there is a potential problem. It looks like alpha_expand_mov can call force_const_mem on RTL expressions that are CONSTANT_P but for which targetm.cannot_force_const_mem is true, such as CONST etc... This would lead to precisely the failures observed in the discussion: operands[1] gets overwritten by NULL_RTX, and we then call validize_mem on a NULL pointer! Kaboom! I think one aspect of the solution is the following patch:

Index: alpha.c
===================================================================
*** alpha.c     (revision 114721)
--- alpha.c     (working copy)
*************** alpha_expand_mov (enum machine_mode mode
*** 2227,2232 ****
--- 2227,2237 ----
        return true;
      }

+   /* Don't call force_const_mem on things that we can't force
+      into the constant pool.  */
+   if (alpha_cannot_force_const_mem (operands[1]))
+     return false;
+
    /* Otherwise we've nothing left but to drop the thing to memory.  */
    operands[1] = force_const_mem (mode, operands[1]);
    if (reload_in_progress)

However, it's not impossible that this will prevent the current failure only to pass the problematic operand on to somewhere else in the compiler. Could someone who can reproduce this failure try the above patch and see if there's any downstream fallout? It would also be great to see what the problematic RTX looks like. I'm pretty sure it's either a SYMBOL_REF, a LABEL_REF or a CONST. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27082
[Bug middle-end/28131] [4.2 Regression] FAIL: gcc.c-torture/execute/va-arg-25.c compilation (ICE)
--- Comment #5 from roger at eyesopen dot com 2006-06-22 00:37 --- Doh! My apologies for the breakage! I think Dave's patch looks good, but the one suggestion that I would make would be to test for MODE_INT first, then call the type_for_mode langhook. This saves calling type_for_mode on unusual modes.

  tree tmp = NULL_TREE;

  if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT
      || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
    return const_vector_from_tree (exp);
  if (GET_MODE_CLASS (mode) == MODE_INT)
-   tmp = fold_unary (VIEW_CONVERT_EXPR,
-                     lang_hooks.types.type_for_mode (mode, 1),
-                     exp);
+   {
+     tree type_for_mode = lang_hooks.types.type_for_mode (mode, 1);
+     if (type_for_mode)
+       tmp = fold_unary (VIEW_CONVERT_EXPR, type_for_mode, exp);
+   }
  if (!tmp)

I'll pre-approve that change, provided it bootstraps and regression tests OK. Unfortunately, extern "C" conflicts for errno in the HPUX system headers mean that I'm unable to test on my HPPA box myself at the moment :-( -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28131
[Bug target/27861] [4.0 regression] ICE in expand_expr_real_1, at expr.c:6916
--- Comment #10 from roger at eyesopen dot com 2006-06-22 04:46 --- This should now be fixed on all active branches. Thanks to Martin for confirming the fix bootstraps and regression tests fine on mipsel-linux-gnu. And thanks, as always, to Andrew Pinski for maintaining the PR in bugzilla. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27861
[Bug middle-end/27889] [4.1/4.2 Regression] ICE on complex assignment in nested function
--- Comment #14 from roger at eyesopen dot com 2006-06-26 00:24 --- The problem appears to be that DECL_COMPLEX_GIMPLE_REG_P is not getting set on the declarations correctly. The VAR_DECLs that are operands to the additions don't have DECL_COMPLEX_GIMPLE_REG_P set, so fail the is_gimple_val check in verify_stmts. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27889
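[A hypothetical reduction of the failing construct, not necessarily the PR's testcase: a complex assignment involving a variable shared with a nested function, the situation where, per the analysis above, DECL_COMPLEX_GIMPLE_REG_P ends up out of sync with how the operands are used.]

void f (void)
{
  _Complex double c = 0.0;
  /* GNU C nested function modifying the enclosing function's
     complex variable.  */
  void g (void) { c = c + 1.0; }
  g ();
}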
[Bug middle-end/28283] SuperH: Very unoptimal code generated for 64-bit ints
--- Comment #2 from roger at eyesopen dot com 2006-07-06 19:17 --- Investigating... I suspect that the SH backend's rtx_costs are parameterized incorrectly, such that a 64-bit shift by the constant 32 looks to be at least 32 times more expensive than a 64-bit addition. The middle-end then uses these numbers to select the appropriate code sequence to generate. Combine also doesn't bother cleaning this up because it uses the same invalid rtx_costs, and discovers that combining additions into shifts doesn't appear to be a win on this target. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28283
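[The flavour of source affected is simply a wide shift; an illustrative example, not from the PR.]

/* A 64-bit shift by 32: if the backend reports DImode shifts as
   prohibitively expensive, the middle-end will synthesize the shift
   from operations it believes are cheaper, such as long chains of
   additions.  */
unsigned long long lo_to_hi (unsigned long long x)
{
  return x << 32;
}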
[Bug middle-end/28283] SuperH: Very unoptimal code generated for 64-bit ints
--- Comment #5 from roger at eyesopen dot com 2006-07-06 19:47 --- No, the rtx_costs for a DImode shift really are wrong. The use of the constant 1 in sh.c:shift_costs instructs the middle-end to avoid using DImode shifts at all costs. The semantics of rtx_costs is that it is expected to provide an estimate of the cost of performing an instruction (either in size when optimize_size, or in performance when !optimize_size) even if the hardware doesn't support that operation directly. For example, a backend may even need to provide an estimate of the time taken by a libcall to libgcc, if such an operation is necessary, or, when optimizing for size, of how large setting up and executing such a call sequence would be. It's only by providing accurate information such as this that an embedded backend such as SH is able to provide fine control over the code sequences selected by the GCC middle-end. As for the little-endian vs. big-endian issue, that looks like a second bug. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28283
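[As an illustration of the expected contract, a sketch only, with hypothetical names and numbers, not the actual sh.c code: a 4.x-era TARGET_RTX_COSTS hook should report a finite, realistic cost for a DImode shift rather than a prohibitive sentinel.]

/* Sketch: report a DImode shift as costing roughly its expanded
   insn/libcall sequence, so the middle-end can weigh it fairly
   against alternative sequences.  */
static bool
example_rtx_costs (rtx x, int code, int outer_code ATTRIBUTE_UNUSED,
                   int *total)
{
  if (code == ASHIFT && GET_MODE (x) == DImode)
    {
      *total = COSTS_N_INSNS (4);  /* hypothetical estimate */
      return true;
    }
  return false;  /* fall back to the default costs */
}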
[Bug c/25995] switch/case does not detect invalid enum values if default-case is used
--- Comment #3 from roger at eyesopen dot com 2006-07-08 14:31 --- I tried fixing this bug, only to discover why things are exactly as they are. The short answer is that GCC will warn about the example code if you specify the -Wswitch-enum command line option. Specifying -Wall implies the weaker -Wswitch, which intentionally disables the checking of enumeration literals in switch statements. But why would anyone want to disable the warning for the example code, I thought to myself, until bootstrapping GCC itself discovered a large number of cases identical to the one reported. Internally, GCC itself uses an enumeration called tree_code that tracks the different types of node of GCC's abstract syntax tree (AST). However, numerous front-ends supplement this enumeration with their own front-end specific tree codes, for example, COMPOUND_LITERAL_EXPR. Hence, the various GCC front-ends are littered with source code that looks like:

  switch (TREE_CODE (t))
    {
    case COMPOUND_LITERAL_EXPR:
      ...

where the case value isn't one of the values of the original enum tree_code enumeration. Similar problems appeared in dwarf2out.c and other GCC source files. At first I started changing things to "switch ((int) TREE_CODE (t))" to silence the warning, but quickly became overwhelmed by the number of source files that needed updating. Hence, the current status quo. GCC uses the "default: break;" idiom to indicate which switch statements may be bending the rules, to turn off this warning with the default -Wall/-Wswitch used during bootstrap. Well written user code, on the other hand, should probably always use -Wswitch-enum. If you read the documentation of -Wswitch vs. -Wswitch-enum, you'll see that the disabling of these warnings when a default case is specified is a curious "feature", purely to aid GCC in compiling itself. As Andrew Pinski points out in comment #2, it's valid C/C++ so shouldn't warrant an immediate warning, so the explicit -Wswitch-enum, requesting stricter checking, seems reasonable. I hope this helps, and that -Wswitch-enum fulfils this enhancement request. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25995
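[A minimal example of the distinction; mine, not from the PR.]

enum color { RED, GREEN, BLUE };

int f (enum color c)
{
  switch (c)
    {
    case RED:
      return 1;
    default:   /* -Wall/-Wswitch stay silent about GREEN and BLUE
                  because of this default; -Wswitch-enum still warns
                  that they are not handled explicitly.  */
      return 0;
    }
}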
[Bug other/22313] [4.2 Regression] profiledbootstrap is broken on the mainline
--- Comment #37 from roger at eyesopen dot com 2006-07-17 22:15 --- I've now tested "make profiledbootstrap" on both mainline and the gcc-4_1-branch, on both x86_64-unknown-linux-gnu and i686-pc-linux-gnu, and not only does the profiled bootstrap build fine, but the dejagnu testsuite looks identical to a baseline "make bootstrap". Could anyone confirm whether they're still seeing this problem? It's likely that Andrew Pinski's patches together with the resolution of PRs 25518 and 26449 have now resolved this issue. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22313
[Bug middle-end/5169] paradoxical subreg problem
--- Comment #13 from roger at eyesopen dot com 2005-12-04 18:06 --- This bug has been fixed, and not just hidden. Jeff Law's proposed solution to this problem http://gcc.gnu.org/ml/gcc/2002-01/msg01872.html which was proposed in January 2002, was contained as part of Jeff/HP's patch of April 2002 (which is subversion revision 51785).

2002-04-03  Jeffrey A Law  ([EMAIL PROTECTED])
            Hans-Peter Nilsson  <[EMAIL PROTECTED]>

        * combine.c (simplify_comparison): Avoid narrowing a comparison
        with a paradoxical subreg when doing so would drop significant bits.

This explains why no-one has been able to reproduce the problem since that date, and why it was assumed to have been hidden (gone latent) by an unrelated change. Since then that solution has been corrected/improved further by Ulrich in revision 70785.

2003-08-25  Ulrich Weigand  <[EMAIL PROTECTED]>

        * combine.c (simplify_comparison): Re-enable widening of comparisons
        with non-paradoxical subregs of non-REG expressions.

Alan Modra's (self-described) "band-aid" patch was never ideal. It's not unreasonable for combine to eliminate an AND expression with a read from memory, if the paradoxical subreg semantics for the target imply zero extension. It's the later "unsafe" simplification of the comparison that was at fault. Hence the current situation, where simplify_comparison has been fixed and we don't needlessly disable a useful optimization with Alan's patch, is the most appropriate outcome. Alan's work-around would have been suitable for a release branch, if we didn't yet have the correct fix or such a fix was too intrusive. -- roger at eyesopen dot com changed: What|Removed |Added ---------------- CC||roger at eyesopen dot com Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5169
[Bug c/7776] const char* p = "foo"; if (p == "foo") ... is compiled without warning!
--- Comment #14 from roger at eyesopen dot com 2005-12-06 15:39 --- Fixed for gcc v4.2 -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=7776
[Bug c++/25263] [4.2 regression] ICE on invalid array bound: int x[1/0];
--- Comment #3 from roger at eyesopen dot com 2005-12-06 15:43 --- Fixed. I've checked all other uses of TREE_OVERFLOW in cp/decl.c and c-decl.c to confirm that the C front-end isn't affected by a similar issue there. Sorry for any inconvenience. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25263
[Bug rtl-optimization/25432] [4.1/4.2 Regression] Reload ICE in gen_add2_insn
--- Comment #8 from roger at eyesopen dot com 2005-12-22 16:05 --- Alan's patch has already been approved by Ian here: http://gcc.gnu.org/ml/gcc-patches/2005-12/msg01397.html I think it would also be a good idea to add the original bugzilla test case, from comment #1, to the testsuite, to prevent future problems. Pre-approved :-) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25432
[Bug target/25213] [3.4 only] -fpic/-fPIC testsuite failures in gcc.dg/i386-387-3.c and i386-387-4.c
--- Comment #2 from roger at eyesopen dot com 2005-12-29 19:42 --- Investigating further, PR25213 looks like a duplicate of PR23098. In that bugzilla trail, Andrew correctly identified it as a regression from gcc 3.2.3 when using -fpic/-fPIC on x86, but the PR was closed once the fix was applied to 4.0 and later. I suspect that Jakub's fix for this problem (committed to the 4.0 branch in comment #11) is the correct fix, i.e. the changed splitter in i386.md is where mainline currently selects the fldpi instruction. Looking through the patch it seems safe enough for the 3.4.x branch; I'll try bootstrapping and regression testing a backport. -- roger at eyesopen dot com changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|0000-00-00 00:00:00 |2005-12-29 19:42:15 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25213
[Bug other/22313] [4.2 Regression] profiledbootstrap is broken on the mainline
--- Comment #39 from roger at eyesopen dot com 2006-07-24 00:45 --- My latest analysis and a possible patch/workaround have been posted here: http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01015.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22313
[Bug middle-end/28473] [4.0/4.1/4.2 Regression] with -O, casting result of round(x) to uint64_t produces wrong values for x > INT_MAX
--- Comment #6 from roger at eyesopen dot com 2006-07-25 20:02 --- Grr. I've just noticed richi has just assigned this PR to himself. I also have a patch that has been bootstrapped and has nearly finished regression testing, that I was just about to post/commit. richi, what does your fix look like? Mine contains several copies of:

      if (!TARGET_C99_FUNCTIONS)
        break;
!     if (outprec < TYPE_PRECISION (long_integer_type_node)
!         || (outprec == TYPE_PRECISION (long_integer_type_node)
!             && !TYPE_UNSIGNED (type)))
        fn = mathfn_built_in (s_intype, BUILT_IN_LCEIL);
+     else if (outprec == TYPE_PRECISION (long_long_integer_type_node)
+              && !TYPE_UNSIGNED (type))
+       fn = mathfn_built_in (s_intype, BUILT_IN_LLCEIL);
      break;

[Serves me right for not assigning this when pinski asked me to investigate. I knew there was a good reason I don't normally bother with recent PRs]. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28473
[Bug middle-end/28473] [4.0/4.1/4.2 Regression] with -O, casting result of round(x) to uint64_t produces wrong values for x > INT_MAX
--- Comment #7 from roger at eyesopen dot com 2006-07-25 20:08 --- Ahh, I've just found Richard's patch submission posting at http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01065.html I agree with Andrew Pinski; I think my changes are the better fix. We also need to investigate whether (unsigned int)round(x) is better implemented as (unsigned int)llround(x). For the time being, my patch doesn't perform this transformation, and using lround is unsafe. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28473
[Bug middle-end/28915] [4.2 regression] ICE: tree check: expected class 'constant', have 'declaration' (var_decl) in build_vector, at tree.c:973
--- Comment #11 from roger at eyesopen dot com 2006-09-06 15:27 --- Hmm, yep, I guess it was caused by my change, most probably this part of it:

        * tree.c (build_constructor_single): Mark a CONSTRUCTOR as constant,
        if all of its elements/components are constant.
        (build_constructor_from_list): Likewise.

It looks like someplace is changing the contents of this CONSTRUCTOR to a VAR_DECL "t.0", but not resetting the TREE_CONSTANT flag. Hence on PPC we end up with a bogus constant constructor during RTL expansion!? Scalar replacement perhaps?? Grr. I'll investigate. Sorry for the inconvenience. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28915
[Bug middle-end/28915] [4.2 regression] ICE: tree check: expected class 'constant', have 'declaration' (var_decl) in build_vector, at tree.c:973
--- Comment #12 from roger at eyesopen dot com 2006-09-06 15:36 --- Here's the .102t.final_cleanup:

;; Function f (f)

f ()
{
  int D.1524;
  int D.1522;
  int D.1520;
  int t.0;

<bb 2>:
  t.0 = (int) &t;
  D.1520 = (int) &t[1];
  D.1522 = (int) &t[2];
  D.1524 = (int) &t[3];
  return {t.0, D.1520, D.1522, D.1524};
}

The CONSTRUCTOR in the return incorrectly has the TREE_CONSTANT flag set. So the problem is somewhere in tree-ssa. One workaround/improvement might be for out-of-ssa to reconstitute the constructor back to a constant. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28915
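[Reading the dump back into source form, the testcase presumably resembles the following; a reconstruction, where the exact declarations are a guess and the pointer-to-int casts assume a 32-bit target such as the PPC configuration where this ICEs.]

typedef int v4si __attribute__ ((vector_size (16)));

int t[4];

v4si f (void)
{
  /* Element addresses are only known at link time, so the
     CONSTRUCTOR must not be marked TREE_CONSTANT after the casts
     are replaced by the variable t.0.  */
  return (v4si) { (int) &t[0], (int) &t[1],
                  (int) &t[2], (int) &t[3] };
}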
[Bug libgomp/28296] [4.2 Regression] libgomp fails to configure on Tru64 UNIX
--- Comment #7 from roger at eyesopen dot com 2006-09-11 16:36 --- Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2006-09/msg00406.html -- roger at eyesopen dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |roger at eyesopen dot com |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Last reconfirmed|0000-00-00 00:00:00 |2006-09-11 16:36:30 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28296
[Bug bootstrap/28784] [4.2 regression] Bootstrap comparison failure
--- Comment #6 from roger at eyesopen dot com 2006-09-11 16:52 --- I believe I have a patch. I'm just waiting for the fix for PR28672 (which I've just approved) to be applied, so I can complete bootstrap and regression test to confirm there are no unexpected side-effects. -- roger at eyesopen dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |roger at eyesopen dot com |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Last reconfirmed|0000-00-00 00:00:00 |2006-09-11 16:52:53 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28784
[Bug c/29132] [4.2 Regression] Mips exception handling broken.
--- Comment #1 from roger at eyesopen dot com 2006-09-18 21:27 --- Hi David, I was wondering, if you have a MIPS tree handy, whether you could easily test the following single line patch:

Index: dwarf2out.c
===================================================================
*** dwarf2out.c (revision 117035)
--- dwarf2out.c (working copy)
*************** dwarf2out_begin_prologue (unsigned int l
*** 2572,2578 ****
    fde = &fde_table[fde_table_in_use++];
    fde->decl = current_function_decl;
    fde->dw_fde_begin = dup_label;
!   fde->dw_fde_current_label = NULL;
    fde->dw_fde_hot_section_label = NULL;
    fde->dw_fde_hot_section_end_label = NULL;
    fde->dw_fde_unlikely_section_label = NULL;
--- 2572,2578 ----
    fde = &fde_table[fde_table_in_use++];
    fde->decl = current_function_decl;
    fde->dw_fde_begin = dup_label;
!   fde->dw_fde_current_label = dup_label;
    fde->dw_fde_hot_section_label = NULL;
    fde->dw_fde_hot_section_end_label = NULL;
    fde->dw_fde_unlikely_section_label = NULL;

Due to all the abstraction with debugging formats, it's difficult to tell the order in which things get executed, and whether this initial value for dw_fde_current_label survives long enough to avoid use of a set_loc. Many thanks in advance, -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29132
[Bug middle-end/26983] [4.0 Regression] Missing label with builtin_setjmp/longjmp
--- Comment #16 from roger at eyesopen dot com 2006-09-22 15:40 --- Fixed everywhere. Eric even has an improved patch/fix for mainline, but the backports of this change are sufficient to resolve the current PR. Thanks to Steven for coming up with the solution. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26983
[Bug debug/29132] [4.1 Regression] Mips exception handling broken.
--- Comment #7 from roger at eyesopen dot com 2006-09-22 16:51 --- Fixed on mainline (confirmed on mips-sgi-irix6.5). It'll take another day or two to backport to the 4.1 branch, as bootstrap and regtest on MIPS takes a while. -- roger at eyesopen dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |roger at eyesopen dot com |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Known to fail|4.1.2 4.2.0 |4.1.2 Known to work|4.1.1 |4.1.1 4.2.0 Last reconfirmed|0000-00-00 00:00:00 |2006-09-22 16:51:25 date|| Summary|[4.1/4.2 Regression] Mips |[4.1 Regression] Mips |exception handling broken. |exception handling broken. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29132
[Bug middle-end/22370] Vec lower produces mis-match types
--- Comment #2 from roger at eyesopen dot com 2007-01-28 02:58 --- Hi Andrew, could you recheck whether you can reproduce this problem on mainline? Updating the MODIFY_EXPR patch in PR 22368 to check GIMPLE_MODIFY_STMT, I'm unable to reproduce this failure on x86_64-unknown-linux-gnu, even with -m32. There has been at least one type clean-up patch to veclower, so I suspect this issue may have been resolved. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22370
[Bug rtl-optimization/17236] inefficient code for long long multiply on x86
--- Comment #5 from roger at eyesopen dot com 2007-02-02 00:17 --- It looks like Ian's recent subreg lowering pass patch has improved code generation on this testcase. Previously, we'd spill three integer registers to the stack for "LLM"; we're now down to two. [A significant improvement from the five we spilled when this bug was reported]

Before:

LLM:    subl    $12, %esp
        movl    %ebx, (%esp)
        movl    28(%esp), %edx
        movl    20(%esp), %ebx
        movl    16(%esp), %ecx
        movl    24(%esp), %eax
        movl    %esi, 4(%esp)
        movl    %edx, %esi
        movl    %edi, 8(%esp)
        movl    %ebx, %edi
        movl    (%esp), %ebx
        imull   %ecx, %esi
        imull   %eax, %edi
        mull    %ecx
        addl    %edi, %esi
        movl    8(%esp), %edi
        leal    (%esi,%edx), %edx
        movl    4(%esp), %esi
        addl    $12, %esp
        ret

After:

LLM:    subl    $8, %esp
        movl    %ebx, (%esp)
        movl    20(%esp), %eax
        movl    %esi, 4(%esp)
        movl    24(%esp), %ecx
        movl    12(%esp), %esi
        movl    16(%esp), %ebx
        imull   %esi, %ecx
        imull   %eax, %ebx
        mull    %esi
        movl    4(%esp), %esi
        addl    %ebx, %ecx
        movl    (%esp), %ebx
        addl    $8, %esp
        leal    (%ecx,%edx), %edx
        ret

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17236
[Bug middle-end/24427] missing optimization opportunity with binary operators
--- Comment #10 from roger at eyesopen dot com 2007-02-18 18:10 --- Hi Eric, It's not PR24427 that's the motivation for this backport, but PR 28173. In fact, it was *your* request in comment #2 of PR28173 to backport this! I'm a little disappointed you'd even question my decision/authority to backport a regression fix. :-) Roger -- -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24427
[Bug middle-end/30744] [4.2/4.3 Regression] ICE in compare_values, at tree-vrp.c:466
--- Comment #4 from roger at eyesopen dot com 2007-03-06 16:32 --- This should now be fixed on both mainline and the 4.2 release branch. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Known to fail|4.2.0 4.3.0 | Known to work|4.1.1 |4.1.1 4.2.0 4.3.0 Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30744
[Bug rtl-optimization/28173] [4.0/4.1 regression] misses constant folding
--- Comment #6 from roger at eyesopen dot com 2007-03-08 01:55 --- I suspect this problem is now fully resolved. The patch for PR24427 has been backported to the gcc-4_1-branch, and additionally on mainline, simplify-rtx.c has been enhanced to also perform the missed-optimization at the RTL level. Given that the 4.0 branch is now closed, I believe this is sufficient to close this PR. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28173
[Bug fortran/31620] [4.3 regression] Zeroing one component of array of derived types zeros the whole structure.
--- Comment #10 from roger at eyesopen dot com 2007-04-23 20:54 --- Many thanks to Paul for fixing this, and my apologies for being overloaded at work and not being available to investigate it fully myself. I believe that Paul's fix of explicitly checking expr1->ref->next is the correct way to determine whether a reference is too complex. My confusion is that this condition should already be checked/verified by the call to gfc_full_array_ref_p on the line immediately following his change. So on line 1124 of dependency.c in gfc_f_a_r_p is the clause

  if (ref->next)
    return false;

which should be doing exactly the same thing. The reason I mention this is perhaps GCC is miscompiling itself, and this gfortran failure is the visible manifestation. Alternatively, perhaps ref->next isn't getting set properly, or is getting clobbered somehow. Paul, does your new testcase fail without your fix? My apologies again if I'm missing something obvious. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31620
[Bug fortran/31620] [4.3 regression] Zeroing one component of array of derived types zeros the whole structure.
--- Comment #11 from roger at eyesopen dot com 2007-04-23 21:05 --- Duh! I am missing something obvious! The ref->u.ar.type == AR_FULL test on line 1120 returns true. The test for ref->next needs to be moved earlier. Sorry again for the inconvenience. Clearly, my brain isn't working properly at the moment :-( -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31620
[Bug target/33545] New: [4.3 regression] Bootstrap failure/broken exceptions on alpha/Tru64
I just tried compiling mainline on my dusty alphaev67-dec-osf5.1 and discovered that recent RTL CFG changes have broken the way that exceptions are implemented on alpha/Tru64. Natively, this is seen with "configure" and "make bootstrap" as a breakage configuring libstdc++-v3, where the "exception model" can't be detected in configure. The underlying cause is that exceptions now seem to be fundamentally broken on this target. The simple source file:

struct S { ~S(); };
void bar();

void foo()
{
  S s;
  bar();
}

triggers the following ICE:

conf.C: In function 'void foo()':
conf.C:7: error: end insn 27 for block 7 not found in the insn stream
conf.C:7: error: head insn 24 for block 7 not found in the insn stream
conf.C:7: internal compiler error: Segmentation fault

The problem is reproducible with a cross-compiler (for example from x86_64-linux) which is configured as "configure --target=alphaev67-dec-osf5.1" followed by "make". The build fails attempting to build mips-tfile, but not before creating a suitable "cc1plus" which can be used to demonstrate the problem/logic error. The underlying cause appears to be associated with the alpha backend's use of the middle-end RTL function "emit_insn_at_entry". Using grep, it appears "alpha" is the only backend that uses this function, and I suspect recent changes to its implementation (Honza?) are responsible for the breakage. The flow of events is that in the rest_of_handle_eh pass, except.c:finish_eh_generation calls gen_exception_receiver. In alpha.md, the define_expand for "exception_receiver" makes use of the function "alpha_gp_save_rtx" when TARGET_LD_BUGGY_LDGP is defined (as on Tru64 with the native ld). The implementation of alpha_gp_save_rtx in alpha/alpha.c creates a memory slot (with assign_stack_local) and then generates some RTL to initialize it, which it tries to insert at the start of the function using "emit_insn_at_entry". The logic error is that emit_insn_at_entry as currently implemented in cfgrtl.c uses insert_insn_on_edge and commit_edge_insertions. This occurs during RTL expansion when some of the basic blocks are not yet fully constructed, hence the expected invariants are not correctly satisfied, leading to the errors and the segmentation fault. The only other caller of emit_insn_at_entry appears to be integrate.c's emit_initial_value_sets. However, I'm guessing that because this doesn't occur during RTL expansion, there's no problem with using commit_edge_insertions. I'm not sure what the correct way to fix this is. I'd like to say that this is a middle-end problem with the change in semantics/implementation of emit_insn_at_entry causing a regression. However, because it's only the alpha that's misbehaving, it's probably more appropriate to classify this as a target bug, and get Jan/RTH or someone familiar with how this should work to correct alpha.c's alpha_gp_save_rtx. One approach might be to always emit the memory write in the function prologue, and rely on later RTL passes to eliminate the dead store and reclaim the unused stack slot from the function's frame. I'm happy to test (and approve) patches for folks who don't have access to this hardware. -- Summary: [4.3 regression] Bootstrap failure/broken exceptions on alpha/Tru64 Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: roger at eyesopen dot com GCC target triplet: alphaev67-dec-osf5.1 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33545
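[For reference, emit_insn_at_entry is roughly the following; a paraphrase of the cfgrtl.c code, with details abbreviated, which makes clear why it assumes a fully built CFG.]

/* Paraphrase, not the exact source: queue INSN on the entry edge and
   immediately commit all pending edge insertions -- valid only once
   the CFG invariants hold, which is not yet the case during RTL
   expansion.  */
void
emit_insn_at_entry (rtx insn)
{
  edge e = EDGE_SUCC (ENTRY_BLOCK_PTR, 0);

  insert_insn_on_edge (insn, e);
  commit_edge_insertions ();
}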
[Bug target/33545] [4.3 regression] Bootstrap failure/broken exceptions on alpha/Tru64
--- Comment #1 from roger at eyesopen dot com 2007-10-13 04:14 --- Many thanks to Eric Botcazou! It turns out that this bug was a duplicate of PR target/32325. I can confirm that with Eric's fix, once I'd committed my libstdc++ patch for the EOVERFLOW issue (mentioned by Eric in PR23225's comment #6), and with Alex's SRA fixes no longer miscompiling mips-tfile, I can once again bootstrap mainline on alphaev67-dec-osf5.1! Thanks again to Eric for a speedy fix. I'm sorry that this was a dupe, and that it took a while for mainline to stabilize to the point that I could confirm the problem is resolved. *** This bug has been marked as a duplicate of 32325 *** -- roger at eyesopen dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33545
[Bug target/32325] [4.3 Regression] cc1plus ICE configuring libstdc++ on Tru64 UNIX V5.1B: SEGV in rtl_verify_flow_info
--- Comment #7 from roger at eyesopen dot com 2007-10-13 04:14 --- *** Bug 33545 has been marked as a duplicate of this bug. *** -- roger at eyesopen dot com changed: What|Removed |Added CC||roger at eyesopen dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32325
[Bug bootstrap/33781] New: [4.3 Regression] "Arg list too long" building libgcc.a
The recent addition of a large number of libgcc objects (for fixed point arithmetic and other things) now breaks bootstrap on IRIX. The problem is that the command line in libgcc/Makefile.in, approx line 697 reads as: $(AR_CREATE_FOR_TARGET) $@ $$objects which doesn't defend against $objects being a huge list. Currently on 32-bit IRIX this is 1762 files. Indeed, even typing "ls *.o" in the directory mips-sgi-irix6.5/32/libgcc, returns "-bash: /usr/bin/ls: Arg list too long"! Alas I'm not a wizard in build machinery, but I suspect that all that's required is a one or two line change, perhaps to use "libtool" to create the archive, which contains logic to circumvent these host command line limits. I believe this is what we currently do for libgcj and other large libraries. Many thanks in advance to the kind build maintainer or volunteer who looks into the problem. I'm happy to test patches on my dusty MIPS/IRIX box. -- Summary: [4.3 Regression] "Arg list too long" building libgcc.a Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: bootstrap AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: roger at eyesopen dot com GCC host triplet: mips-sgi-irix6.5 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug middle-end/19988] [4.0 Regression] pessimizes fp multiply-add/subtract combo
--- Additional Comments From roger at eyesopen dot com 2005-02-16 19:17 --- Hmm. I don't think the problem in this case is at the tree-level, where I think keeping X-(Y*C) and -(Y*C) as a more canonical X + (Y*C') and Y*C' should help with reassociation and other tree-ssa optimizations. Indeed, it's these types of transformations that have enabled the use of fmadd on the PowerPC for mainline. The regression however comes from the (rare) interaction when a floating point constant and its negative now need to be stored in the constant pool. It's only when X and -X are required in a function (potentially in short succession) that this is a problem, and then only on machines that need to load floating point constants from memory (AVR and other platforms with immediate floating point constants, for example, are unaffected). Some aspects of keeping X and -X in the constant pool were addressed by my patch quoted in comment #1, which attempts to keep floating point constants positive *when* this doesn't interfere with GCC's other optimizations. I think the correct solution to this regression is to improve CSE/GCSE to recognize that X*C can be synthesized from a previously available X*(-C) at the cost of a negation, which is presumably cheaper than a multiplication on most platforms. Indeed, there's probably a set of targets for which loading a positive constant from the constant pool and then negating it is cheaper than loading both a positive constant and then loading a negative constant. Unfortunately, I doubt whether it'll be possible to simultaneously address this performance regression without reintroducing the 3.x issue mentioned in the original "PS". I doubt that on many platforms two multiply-adds are much faster than a single floating point multiplication whose result is shared by two additions. Though again it might be possible to do something at the RTL level, especially if duplicating the multiplication is a win with -Os. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19988
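[A minimal illustration of the rare case in question; an invented example, not the PR's testcase.]

/* After folding x*2.5 - y*2.5 to x*2.5 + y*(-2.5), both 2.5 and
   -2.5 may be needed in the constant pool, although a CSE pass could
   synthesize one from the other with a single negation.  */
double f (double x, double y)
{
  return x * 2.5 - y * 2.5;
}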
[Bug middle-end/19988] [4.0 Regression] pessimizes fp multiply-add/subtract combo
--- Additional Comments From roger at eyesopen dot com 2005-02-19 05:41 --- Re: comment #5 For floating point expressions, -(A+B) is only transformed into (-A)-B or (-B)-A when the user explicitly specifies -ffast-math, i.e. only when flag_unsafe_math_optimizations is true. Re: comment #6 Interesting. Although on a handful of rs6000 cores (mpccore, 601 and 603) a fused multiply-add is more expensive than an addition, it's always a win to perform two fma's rather than a mult and two adds. It might be possible (with some work) to teach combine to un-CSE the following:

double x;
double y;

void foo(double p, double q, double r, double s)
{
  double t = p * q;
  x = t + r;
  y = t + s;
}

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19988
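[For comparison, the un-CSE'd form combine would need to produce; illustrative only, reusing the globals x and y from the snippet above.]

/* Duplicating the multiplication lets each statement match a fused
   multiply-add pattern.  */
void foo_uncse (double p, double q, double r, double s)
{
  x = p * q + r;  /* fmadd */
  y = p * q + s;  /* fmadd */
}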
[Bug rtl-optimization/336] Superfluous instructions generated from bit-field operations
--- Additional Comments From roger at eyesopen dot com 2005-02-19 19:56 --- This bug has now been fixed for gcc 4.0. For the testcase attached to the PR, mainline now generates the following code on sparc-sun-solaris2.8 with -O2:

fun:    ld      [%sp+64], %o5
        sll     %o0, 2, %g1
        mov     %o5, %o0
        or      %g1, 2, %g1
        jmp     %o7+12
        st      %g1, [%o5]

i.e. we've now eliminated the unnecessary "and" and "or" instructions that were present in 2.95.2 (and still present in 3.4.3). -- What|Removed |Added Status|SUSPENDED |RESOLVED Resolution||FIXED Target Milestone|--- |4.0.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=336
[Bug middle-end/19466] [meta-bug] bit-fields are non optimal
-- Bug 19466 depends on bug 336, which changed state. Bug 336 Summary: Superfluous instructions generated from bit-field operations http://gcc.gnu.org/bugzilla/show_bug.cgi?id=336 What|Old Value |New Value Status|SUSPENDED |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19466
[Bug c++/19199] [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-03-09 01:28 --- Subject: Re: [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary

On 8 Mar 2005, Alexandre Oliva wrote:
> * fold-const.c (non_lvalue): Split tests into...
> (maybe_lvalue_p): New function.
> (fold_ternary): Use it to avoid turning a COND_EXPR lvalue into
> a MIN_EXPR rvalue.

This version is Ok for mainline, and currently open release branches. Thanks, Roger -- -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug middle-end/18628] [4.0/4.1 regression] miscompilation of switch statement in loop
--- Additional Comments From roger at eyesopen dot com 2005-03-09 16:13 --- Subject: Re: [PR middle-end/18628] do not fold to label load from tablejump to reg

On 9 Mar 2005, Alexandre Oliva wrote:
> This patch is meant to implement suggestion #3 proposed to fix the bug
> by Roger Sayle and selected by RTH in bugzilla. So far, I've only
> verified that it fixes the testcase included in the patch.
>
> Alexandre Oliva <[EMAIL PROTECTED]>
>
> PR middle-end/18628
> * cse.c (fold_rtx_mem): Instead of returning the label extracted
> from a tablejump, add it as an REG_EQUAL note, if the insn loaded
> from the table to a register.
> (cse_insn): Don't use it as src_eqv.

Thanks! OK for mainline if bootstrap and regression testing passes. Once this patch has been on mainline for a few days, to check that targets with different forms of tablejump and conditional branches don't have issues, OK to backport to the 4.0 branch. Thanks also to RTH for selecting which of the proposals in the bugzilla PR he preferred. I'll admit that I hadn't noticed he'd commented on them until you'd posted this patch. However, full credit goes to your patch. I hadn't appreciated that the problematic transformation takes place in fold_rtx_mem, which has the instruction context, allowing us to perform this transformation when it's safe (i.e. when we're directly converting a tablejump into an unconditional jump) but to avoid the problematic case of hoisting a label_ref into a register that can then escape. Cool.

> The thought of adding the REG_EQUAL note was to help other passes that
> might want to turn the indirect jump into a direct jump. I'm not sure
> this may actually happen. I'm also not sure how much this will help.

If you do encounter any problems with your patch, my first instinct would be to investigate not bothering with the REG_EQUAL note. We've had issues in the past with whether label_refs in REG_EQUAL notes are counted in the label's NUSES, and similar ugly corner cases. If there's no measurable performance impact, we're probably better off without the risk [on the 4.0 branch at least].

> Bootstrap and regtesting starting shortly. Ok to install if they
> pass?

Please forgive me for commenting on this, but it's kind of a pet peeve. There really are no patches so urgent that they can't be bootstrapped and regression tested before posting to gcc-patches. Even "obvious" fixes to target-independent bootstrap failures can afford the few hours it takes to confirm changes work and are safe. Indeed the language (and emphasis) in contribute.html is (are) quite clear on the matter. My apologies for bringing this up now, as you're certainly not amongst the worst offenders in this regard. Many thanks again for tackling high-priority PR middle-end/18628. Roger -- -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18628
[Bug middle-end/20493] [4.0/4.1 Regression] Bootstrap failure because of aliased symbols
--- Additional Comments From roger at eyesopen dot com 2005-03-17 05:06 --- Hmm, yep, probably caused by my change. It looks like with my change fold_widened_comparison is now converting (int)t == -1 into the equivalent t == (typeof(t))-1. Normally, this would be reasonable, but the "special" semantics of HAVE_canonicalize_funcptr_for_compare clearly screw this up. My suggestion would be to add the following to the top of the subroutine fold_widened_comparison:

#ifdef HAVE_canonicalize_funcptr_for_compare
  /* Disable this optimization if we're casting to a function pointer
     type on targets that require function pointer canonicalization.  */
  if (HAVE_canonicalize_funcptr_for_compare
      && TREE_CODE (shorter_type) == POINTER_TYPE
      && TREE_CODE (TREE_TYPE (shorter_type)) == FUNCTION_TYPE)
    return NULL_TREE;
#endif

Dave, could you give this a try and see if it restores bootstrap for you? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20493
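[For concreteness, a sketch of the flavour of code affected, taken from the analysis above.]

typedef void (*fptr) (void);

/* fold_widened_comparison was rewriting (int) t == -1 as
   t == (fptr) -1, bypassing the canonicalize_funcptr_for_compare
   sequence such targets need for function-pointer comparisons.  */
int f (fptr t)
{
  return (int) t == -1;
}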
[Bug middle-end/20539] [4.0/4.1 Regression] ICE in simplify_subreg, at simplify-rtx.c:3674
--- Additional Comments From roger at eyesopen dot com 2005-03-20 16:47 --- Patch here http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01871.html -- What|Removed |Added Keywords||patch http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20539
[Bug c++/19199] [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-03-25 06:03 --- Splitting non_lvalue into maybe_lvalue_p is a good thing, is totally safe, and is preapproved for both mainline and the 4.0 branch. The remaining changes to fold_ternary/fold_cond_expr_with_comparison are more controversial, and can theoretically be discussed independently. Reading through each of the transformations of COND_EXPR in fold, all of the problematic transformations are guarded in the block beginning at line 4268 of fold-const.c. This is the set of "A op B ? A : B" transformations. All other transformations are either lvalue-safe, or require either operand 2 or operand 3 to be a non-lvalue (typically operand 3 must be a constant). I believe a suitable 4.0-timescale (grotesque hack workaround) is (untested):

--- 4265,4275 ----
     a number and A is not.  The conditions in the original
     expressions will be false, so all four give B.  The min()
     and max() versions would give a NaN instead.  */
!  if (operand_equal_for_comparison_p (arg01, arg2, arg00)
!      && (in_gimple_form
!          || strcmp (lang_hooks.name, "GNU C++") != 0
!          || ! maybe_lvalue_p (arg1)
!          || ! maybe_lvalue_p (arg2)))
    {
      tree comp_op0 = arg00;
      tree comp_op1 = arg01;

The maybe_lvalue_p tests should be obvious from the previous versions of Alexandre's patch. The remaining two lines are both hideous hacks. The first is that these transformations only need to be disabled for the C++ front-end. This is (AFAIK) the only language front-end that uses COND_EXPRs as lvalues, and disabling the transformation of "x > y ? x : y" into MAX_EXPR (where x and y are VAR_DECLs) is less than ideal for the remaining front-ends. The second equally bad hack is to re-allow this transformation even for C++ once we're in the tree-ssa optimizers. I believe once we're in gimple, COND_EXPR is no longer allowed as the lhs of an assignment, hence the MAX_EXPR/MIN_EXPR recognition transformations at the gimple level should be able to clean up any rvalue COND_EXPRs they're presented with. Clearly, testing lang_hooks.name is an option of last resort. I'm increasingly convinced that the correct long term solution is to introduce a new LCOND_EXPR tree node for use by the C++ front-end. Either as a C++-only tree code, or a generic tree code. Additionally, depending upon whether we go ahead and deprecate >?= and <?=... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug c++/19199] [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-04-03 03:20 ---

> Excuse me for asking, but what is it that makes the latest patch I posted
> not reasonable for the 4.0 timeframe?

The performance regression on C, Java, Ada and Fortran code that isn't affected by this bug. The bug is marked with the "c++" component because it only affects the C++ front-end. A fix that disables MIN_EXPR/MAX_EXPR optimizations in all front-ends is not suitable for a release branch without SPEC testing to show how badly innocent targets will get burnt! There may be lots of places in the Linux kernel that depend upon generating min/max insns for performance reasons... I'd hoped I'd made this clear when I proposed the alternate strategy of disabling this optimization *only* in the C++ front-end, and *only* prior to tree-ssa. Whilst I agree this is a serious bug, it currently affects all release branches, so taking a universal performance hit to resolve it without considering the consequences seems a bad decision. Just grep for "MIN (" and "MAX (" in GCC's own source code to see how badly this could impact the compiler's own code/performance/bootstrap times. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug c++/19199] [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-04-04 16:02 --- Subject: Re: [Committed] PR c++/19199: Preserve COND_EXPR lvalueness in fold

Hi Alex, My apologies yet again for not being more explicit about all of the things that were wrong (and/or that I was unhappy with) in your proposed solution. I'd hoped that I was clear in the comments in the bugzilla thread, and that you'd appreciate the issues it addressed. Problems with your approach:

[1] The use of pedantic_lvalues to identify the non-C front-ends adversely affects code generation in the Java, Fortran and Ada front-ends. The use of COND_EXPRs as lvalues is unique to the C++ front-end, so ideally a fix shouldn't regress code quality on innocent front-ends. Certainly not without benchmarking to indicate how significant a performance hit these other languages are taking, prior to a release.

[2] The pedantic_lvalues flag is itself a hack used by the C front-end, that is currently being removed by Joseph in his clean-up of the C parser. Adding this use would block his efforts until an alternate solution is found. Admittedly, this isn't an issue for the 4.0 release, but creates more work or a regression once this is removed from mainline.

[3] Your patch is too invasive. Compared to the four-line counter proposal that disables just the problematic class of transformations, your much larger patch inherently contains a much larger risk. For example, there is absolutely no need to modify the source code on the "A >= 0 ? A : -A -> abs (A)" paths, as these transformations could never interfere with the lvalueness of an expression. Additionally, once one of the longer term solutions proposed by Mark or me is implemented, all of these workarounds will have to be undone/reverted. By only affecting a single clause, we avoid the potential for leaving historic code bit-rotting in the tree.

[4] Despite your counter claims, your approach *does* inhibit the ability of the tree-ssa optimizers to synthesize MIN_EXPR and MAX_EXPR nodes. Once the in_gimple_form flag is set, fold-const.c is able to optimize "A == B ? A : B -> B" even when compiling C++, as it knows that a COND_EXPR can't be used as an lvalue in the late middle-end.

[5] For the immediate term, I don't think it's worth worrying about converting non-lvalues into lvalues; the wrong-code bugs and diagnostic issues are related solely to the lvalue -> non-lvalue transition. At this stage of pre-release, lower risk changes are clearly preferable, and making this change will break code that may have erroneously compiled in the past. Probably OK for 4.0, but not for 3.4 (which also exhibits this problem).

And although not serious enough to warrant a [6], it should be pointed out that several of your recent patches have introduced regressions. Indeed, you've not yet reported that your patch has been successfully bootstrapped or regression tested on any target triple. Indeed, your approach of posting a patch before completing the prerequisite testing specified/stressed in contribute.html has, on more than one occasion recently, resulted in the patch having to be rewritten/tweaked. Indeed, as witnessed in "comment #17", I've already approved an earlier version of your patch once, only to later discover you were wasting my time. As a middle-end maintainer, having been pinged by the release manager that we/you weren't making sufficient progress towards addressing the issues with your patch, I took the opportunity to apply a fix that was within my authority to commit.
If you had made and tested the changes that I requested in a timely manner, I'm sure I'd have approved those efforts by now.

My apologies again for not being blunt earlier. My intention was that by listing all of the benefits of an alternate approach in comment #19 of the bugzilla PR, I wouldn't have to explicitly list them as the deficiencies of your approach. Some people prefer the carrot to the stick with patch reviews [others like RTH's "No"].

Perhaps I should ask the counter-question to your comment #21: in what way do you feel that the committed patch isn't clearly superior to your proposed solution?

p.s. Thanks for spotting my mistake of leaving a bogus comment above maybe_lvalue_p.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
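For readers following along, a minimal illustration (my own sketch, not from the PR) of the construct at the heart of PR19199: a COND_EXPR used as an lvalue, valid C++ but not C, which is precisely the lvalueness that fold must not destroy when folding conditionals:

  int x, y;

  void pick(bool b) {
    (b ? x : y) = 42;   // valid C++: the conditional expression is an lvalue
  }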
[Bug target/20126] [3.3/3.4/4.0/4.1 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-05 04:22 ---

The stricter version is mostly OK, except for one correction and one suggestion.

The correction is that in the case where the replacement wasn't a register, you shouldn't be calling validate_change_maybe_volatile inside a gcc_assert. When ENABLE_ASSERT_CHECKING is disabled, the side-effects of this statement will be lost (i.e. the replacement attempt using the new pseudo). You should instead try "if (!validate_change_maybe_volatile (...)) gcc_unreachable();" or alternatively use a temporary variable.

The minor suggestion is that any potential performance impact of this change can be reduced further by tweaking validate_change_maybe_volatile to check whether "object" contains any volatile mem references before attempting all of the fallback/retry logic. Something like:

  int
  validate_change_maybe_volatile (rtx object, rtx *loc, rtx new)
  {
    if (validate_change (object, loc, new, 0))
      return 1;

    if (volatile_ok
  +     || !for_each_rtx (&object, volatile_mem_p, 0)
        || !insn_invalid_p (object))
      return 0;
    ...

This has the "fail fast" advantage that if the original instruction didn't contain any volatile memory references (the typical case), we don't bother to attempt instruction recognitions with (and without) volatile_ok set. Admittedly, this new function is only called in one place which probably isn't on any critical path, but the above tweak should improve things (the for_each_rtx should typically be faster than the insn_invalid_p call, and certainly better than two calls to insn_invalid_p and one to validate_change when there's usually no need.)

p.s. I completely agree with your decision to implement a stricter test to avoid inserting/replacing/modifying volatile memory references.

Please could you bootstrap and regression test with the above changes and repost to gcc-patches? I'm prepared to approve with those changes, once testing confirms no unexpected interactions. Or if you disagree with the above comments, let me/someone know.

Thanks in advance,

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug c++/19199] [3.3/3.4 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-04-05 23:13 ---

Now that a fix has been applied to both mainline and the 4.0 branch, I've been investigating backporting the fix to 3.4. Unfortunately, a significant feature of the fixes proposed by Alex and me is that they disable the problematic transformations during parsing, but allow them to be caught by later tree-ssa passes. However, in gcc 3.4.x, if we disable constant folding of COND_EXPRs for C++, there are no tree-level optimizers that can take up the slack, and instead we'd rely on if-conversion to do what it can at the RTL level. I'm not convinced that 3.4 if-conversion is particularly effective in this respect (certainly across all targets), and I've not yet checked whether all of the affected transformations are implemented in ifcvt.c (even on mainline).

Might I propose that we close this bug as won't fix for both the 3.3 and 3.4 branches? GCC has always generated wrong code for this case, and the current state of 3.4.4 is that we issue an incorrect warning (which seems better than silently generating wrong code as we did with 3.1). It's a trade-off; but I'm keen to avoid degrading g++ 3.4.5 (in the common case) merely to fix a diagnostic regression. Or we could leave the PR open (and perhaps unassign it) in the hope that a 3.4 solution will be discovered, as the mainline/4.x fixes aren't suitable.

Mark? Alexandre?

--
What|Removed |Added
Known to fail|3.4.0 4.0.0 3.3.2 |3.4.0 3.3.2
Known to work||4.0.0 4.1.0
Summary|[3.3/3.4/4.0 Regression]|[3.3/3.4 Regression] Wrong |Wrong warning about |warning about returning a |returning a reference to a |reference to a temporary |temporary |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug c++/19199] [3.3/3.4 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-04-06 00:03 ---

That's my interpretation of Andrew Pinski's comment #6 in the bugzilla PR [n.b. I haven't reconfirmed his analysis personally].

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug target/20126] [3.3/3.4/4.0/4.1 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-08 17:03 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

Hi Alex,

On 8 Apr 2005, Alexandre Oliva wrote:
> Roger suggested some changes in the patch.  I've finally completed
> bootstrap and test with and without the patch on amd64-linux-gnu, and
> posted the results to the test-results list.  No regressions.  Ok to
> install?

Hmm. It looks like you misunderstood some of the comments in my review (comment #16 in the bugzilla PR)...

+      gcc_assert (validate_change_maybe_volatile (v->insn, v->location,
+                                                  reg));

This is still unsafe. If you look in system.h, you'll see that when ENABLE_ASSERT_CHECKING is undefined, the gcc_assert macro gets defined as:

  #define gcc_assert(EXPR) ((void)(0 && (EXPR)))

which means that EXPR will not get executed. Hence you can't put side-effecting statements (especially those whose changes you depend upon) naked inside a gcc_assert. Ahh, I now see the misunderstanding; you changed/fixed the other "safe" gcc_assert statement, and missed the important one that I was worried about. Sorry for the confusion.

Secondly:

+  if (volatile_ok
+      /* Make sure we're not adding or removing volatile MEMs.  */
+      || for_each_rtx (loc, volatile_mem_p, 0)
+      || for_each_rtx (&new, volatile_mem_p, 0)
+      || ! insn_invalid_p (object))
+    return 0;

The suggestion wasn't just to reorder the existing for_each_rtx to move these tests earlier, it was to confirm that the original "whole" instruction had a volatile memory reference in it, i.e. that this is a problematic case, before doing any more work. Something like:

+  if (volatile_ok
++     /* If there isn't a volatile MEM, there's nothing we can do.  */
++     || !for_each_rtx (&object, volatile_mem_p, 0)
+!     /* But make sure we're not adding or removing volatile MEMs.  */
+      || for_each_rtx (loc, volatile_mem_p, 0)
+      || for_each_rtx (&new, volatile_mem_p, 0)
+      || ! insn_invalid_p (object))
+    return 0;

This second change was just a micro-optimization, and I'd have approved your patch without it, but the use of gcc_assert in loop_givs_rescan is a real correctness issue.

Sorry again for the inconvenience,

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
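A self-contained sketch (mine, not from the PR) of the hazard described above: with checking disabled, gcc_assert expands to ((void)(0 && (EXPR))), and since && short-circuits, EXPR is never evaluated and its side effects vanish:

  #include <cstdio>

  #define gcc_assert(EXPR) ((void)(0 && (EXPR)))   // release-mode expansion

  static bool apply_change() { std::puts("change applied"); return true; }

  int main() {
    gcc_assert(apply_change());   // prints nothing: apply_change is never called
    return 0;
  }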
[Bug target/20126] [3.3/3.4/4.0/4.1 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-10 03:18 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

On 9 Apr 2005, Alexandre Oliva wrote:
> On Apr 8, 2005, Roger Sayle <[EMAIL PROTECTED]> wrote:
> > ++     /* If there isn't a volatile MEM, there's nothing we can do.  */
> > ++     || !for_each_rtx (&object, volatile_mem_p, 0)
>
> This actually caused crashes.  We don't want to scan the entire insn
> (it might contain NULLs), only the insn pattern.

Argh! Indeed, my mistake/oversight. Thanks.

>	PR target/20126
>	* loop.c (loop_givs_rescan): If replacement of DEST_ADDR failed,
>	set the original address pseudo to the correct value before the
>	original insn, if possible, and leave the insn alone, otherwise
>	create a new pseudo, set it and replace it in the insn.
>	* recog.c (validate_change_maybe_volatile): New.
>	* recog.h (validate_change_maybe_volatile): Declare.

This is OK for mainline, thanks.

Now that 4.0 is frozen for release candidate one, Mark needs to decide whether this patch can make it into 4.0.0 or will have to wait for 4.0.1. I also think we should wait for that decision before considering a backport for 3.4.x (or we'll have a strange temporal regression). I'd recommend committing this patch to mainline ASAP, so it can have a few days of testing before Mark has to make his decision.

Thanks again for your patience,

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-12 14:38 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

Hi Alexandre,

On 12 Apr 2005, Alexandre Oliva wrote:
> Does any expert in rtl loop care to chime in?

I'm not sure I qualify for the title "rtl loop" expert, but setting bl->all_reduced to zero after we fail to validate a change to the RTL looks to be a reasonable failure mode. I still like your fallbacks, in that by trying harder we perform better optimization, but as shown by the ARM's "stmia" instruction, I suspect there will always be cases where we can't reduce IV expressions in some backend instructions. Previously, we didn't even detect these cases and potentially generated bad code. I think the ICE was an improvement over the "potentially" bad code, but what we really need is a more graceful failure/degradation.

As you propose, I'd recommend something like (for your final clause):

	  /* If it wasn't a reg, create a pseudo and use that.  */
	  rtx reg, seq;
	  start_sequence ();
	  reg = force_reg (v->mode, *v->location);
!	  if (validate_change_maybe_volatile (v->insn, v->location, reg))
!	    {
!	      seq = get_insns ();
!	      end_sequence ();
!	      loop_insn_emit_before (loop, 0, v->insn, seq);
!	    }
!	  else
!	    {
!	      end_sequence ();
!	      if (loop_dump_stream)
!		fprintf (loop_dump_stream,
!			 "unable to reduce iv to register in insn %d\n",
!			 INSN_UID (v->insn));
!	      bl->all_reduced = 0;
!	      v->ignore = 1;
!	      continue;
!	    }

I think it's worthwhile keeping the validate_change_maybe_volatile calls/changes on mainline. But then for gcc 4.0.0 or 4.0.1 we can use the much simpler:

      if (v->giv_type == DEST_ADDR)
	/* Store reduced reg as the address in the memref where we found
	   this giv.  */
!	{
!	  if (!validate_change (v->insn, v->location, v->new_reg, 0))
!	    {
!	      if (loop_dump_stream)
!		fprintf (loop_dump_stream,
!			 "unable to reduce iv to register in insn %d\n",
!			 INSN_UID (v->insn));
!	      bl->all_reduced = 0;
!	      v->ignore = 1;
!	      continue;
!	    }
!	}

A much less intrusive regression fix than the previously proposed fix for 4.0. But perhaps one of the real "rtl loop" experts would like to comment?

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug middle-end/20739] [4.0/4.1 regression] ICE in gimplify_addr_expr
--- Additional Comments From roger at eyesopen dot com 2005-04-14 17:19 --- Thanks Alex! This is OK for mainline. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20739
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-14 17:37 ---

You'll notice in loop.c that everywhere we currently set all_reduced to zero, we also set ignore to true. This change is to avoid wasting CPU cycles: if we know that an IV can't be eliminated, there's no point in trying to modify any more instructions that use it. At best, this incurs wasted CPU cycles; at worst, we'll end up substituting in some places and not others, which will result in requiring both the original IV *and* the replacement IV, which will increase register pressure in the loop.

As your (Alex's) testing showed, I'm not sure that it's strictly required for correctness; it's mainly to preserve consistency with the existing all_reduced invariants by using the same idiom as used elsewhere, but also a potential compile-time/run-time micro-optimization. However, for 4.0 I thought it best to reuse/copy the existing idiom, rather than risk clearing all_reduced without setting ignore (which may have potentially exposed code paths not seen before).

We still need the 4.1 variant to be tested/committed to mainline.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-15 14:52 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

On 15 Apr 2005, Alexandre Oliva wrote:
> On Apr 12, 2005, Roger Sayle <[EMAIL PROTECTED]> wrote:
> > I still like your fallbacks, that by trying harder we perform
> > better optimization,
>
> The more I think about this, the more I have the impression that
> perhaps the fallbacks are not necessary.
> ...
> So I'm wondering if taking out all of the workarounds and going back
> to something like what is now in the 4.0 branch, except for the use of
> validate_change_maybe_volatile, wouldn't get us exactly what we want.
> ...
> Anyhow, in the meantime, could I check in the patch to fix Josh's
> arm-elf build failure?
> ...
> It would be nice to keep the hard failure in place for a bit longer,
> such that we stood a better chance of finding other situations that
> might require work arounds.

Sure. Your patch in comment #28 of bugzilla PR20126 is OK for mainline to resolve Josh's bootstrap failure. Sounds like you've already done the necessary testing, and I'll trust you on a suitable ChangeLog entry.

I agree with your proposed game plan of keeping the hard failure in place temporarily, to discover whether there are any other "fallback" strategies that would be useful. Ultimately though, I don't think we should close PR20126 until a "soft failure" is implemented on mainline, like we've (Jakub has) done on the gcc-4_0-branch (such as the mainline code proposed in comment #30).

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-17 00:21 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

Hi Alex,

On 16 Apr 2005, Alexandre Oliva wrote:
> On Apr 15, 2005, Roger Sayle <[EMAIL PROTECTED]> wrote:
> > I agree with your proposed game plan of keeping the hard failure in
> > place temporarily, to discover whether there are any other "fallback"
> > strategies that would be useful.  Ultimately though, I don't think we
> > should close PR20126 until a "soft failure" is implemented on mainline,
> > like we've (Jakub has) done on the gcc-4_0-branch (such as the
> > mainline code proposed in comment #30).
>
> But see, the problem with the soft failure mode is that, if it is ever
> legitimate to leave the giv alone and not make sure we set whatever
> register appears in it to the right value, then can't we always do it,
> removing all of the (thus useless) workarounds?
>
> And if there's any case in which it is not legitimate to do so, then
> the soft failure mode would be a disservice to the user, that would
> silently get a miscompiled program.  We should probably at least warn
> in this case.

I don't believe there are any cases in which it is not legitimate to leave the GIV alone, so we'll never silently miscompile anything. My understanding is that it's always possible to leave the giv alone (provided that we set all_reduced to false). The "workarounds", as we've got used to calling them, are not required for correctness, but for aggressive optimization. There's clearly a benefit to strength reducing GIVs, and the harder we try to replace them, the better the code we generate. Yes, they are (useless/not necessary) from a purely correctness point of view; we don't even have to call validate_change, we could just always give up and punt by clearing all_reduced (technically we don't have to perform any loop optimizations for correctness), but we'd generate pretty poor code.

The patch you proposed provides the soft failure mode we want (and now have on the release branch). We could, as you say, remove all of the current workarounds, and the only thing that would suffer is the quality of the code we generate. Needless to say, I'd prefer to keep these optimizations (for example, your recent one for Josh that allows us to strength reduce the ARM's stmia instruction). It's not unreasonable to try three or four approaches before giving up, and forcing the optimizers to preserve the original GIV.

Does this clear things up? Do you agree?

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-17 03:06 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

On 16 Apr 2005, Alexandre Oliva wrote:
> On Apr 16, 2005, Roger Sayle <[EMAIL PROTECTED]> wrote:
> > Does this clear things up?  Do you agree?
>
> Yup, for both questions.  Thanks for the clarification.  It wasn't
> clear to me that the assignments played any useful role, as soon as I
> found out the givs could be assumed to hold the correct value.  It
> all makes sense to me now.

Your patch (in comment #45) is OK for mainline, with a suitable ChangeLog entry. Hurray, we can close the PR.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug c/34720] ICE in real_to_decimal, at real.c:1656
--- Comment #7 from roger at eyesopen dot com 2008-01-29 01:12 --- I'm also seeing this same failure with "make profiledbootstrap" on x86_64-unknown-linux-gnu. A "make bootstrap" on the same machine completes and regression tests fine (14 unexpected failures in gcc). I suspect that the miscompilation is either non-deterministic or is caused by an optimization that only triggers on some targets and/or with additional profile information. Perhaps we should regression hunt for the change that broke things. It might not be anything to do with real.c or decimal floating point. -- roger at eyesopen dot com changed: What|Removed |Added CC| |roger at eyesopen dot com Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 GCC build triplet|powerpc64-unknown-linux-gnu |*64-unknown-linux-gnu GCC host triplet|powerpc64-unknown-linux-gnu |*64-unknown-linux-gnu GCC target triplet|powerpc64-unknown-linux-gnu |*64-unknown-linux-gnu Last reconfirmed|-00-00 00:00:00 |2008-01-29 01:12:50 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34720
[Bug fortran/31711] rhs array is changed while assigning to same lhs array
--- Comment #10 from roger at eyesopen dot com 2007-04-27 18:20 ---

Paul's fix looks correct to me. It appears that when the "#if 0" was added to disable broken loop shifting at some point in the distant past, the critical functionality

  if (nDepend)
    break;

was accidentally removed. This is similar to the idiom used a few lines earlier, "nDepend = 1; break;". I would normally have suggested that the type of nDepend and the return type of gfc_dep_resolver be changed to bool, and the dead "#if 0" code removed, to clean up this code...

However, I think a much better longer-term strategy is to completely remove gfc_conv_resolve_dependencies and gfc_dep_resolver, and the automatic handling of temporaries in the scalarizer. Instead, in line with the TODO above gfc_conv_resolve_dependencies, I think it's much better to handle this at a higher level using the new/preferred gfc_check_dependency API that's used throughout the front-end. For example, when lhs=rhs has a dependency, this can be transformed into "tmp=rhs; lhs=tmp" during lowering or at the front-end tree level, which greatly simplifies the complex code in the scalarizer. It also allows the post-assignment copy "lhs=tmp" to be performed efficiently [though I may have already implemented that in the scalarizer :-)].

One benefit of doing this at a higher level of abstraction is that loop-reversal isn't a simple matter of running our default scalarized loop backwards (as hinted by the comment, and the PRs in bugzilla). Consider the general case of a(s1:e1,s2:e2,s3:e3) = rhs. The dependency analysis in gfc_check_dependency may show that s1:e1 needs to run forwards to avoid a conflict, and that s3:e3 needs to run backwards for the same reason. It's trivial then to treat this almost like a source-to-source transformation (ala KAP) and convert the assignment into a(s1:e1:1,s2:e2,s3:e3:-1), instead of attempting to shoehorn all this into the scalarizer at a low level and perform book-keeping on the direction vectors. By the time we've lowered gfc_expr to gfc_ss, we've made things much harder for ourselves. And things get much more complex once we start seriously tackling CSHIFT, TRANSPOSE and friends! Keeping the scalarizer simple also eases support for autovectorization and/or moving it into the middle-end (the topic of Toon's GCC summit presentation).

I'm as surprised as Paul that this hasn't been a problem before. I suspect it's because we use the alternate gfc_check_dependency in the vast majority of cases, and because of the empirical observation in the research literature that most f90 array assignments in real code don't carry a dependency.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31711
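For illustration, a C++ sketch (a hypothetical helper, mine, not gfortran code) of the "tmp = rhs; lhs = tmp" rewrite for an assignment whose two sides may overlap: the rhs is evaluated into a temporary before anything is stored back.

  #include <algorithm>
  #include <vector>

  // a[lo_dst .. lo_dst+n) = a[lo_src .. lo_src+n) may overlap, so evaluate
  // the rhs section into a temporary first, then copy the temporary back.
  void assign_with_dependency(std::vector<int>& a, int lo_dst, int lo_src, int n) {
    std::vector<int> tmp(a.begin() + lo_src, a.begin() + lo_src + n);  // tmp = rhs
    std::copy(tmp.begin(), tmp.end(), a.begin() + lo_dst);             // lhs = tmp
  }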
[Bug target/23322] [4.1 regression] performance regression, possibly related to caching
--- Additional Comments From roger at eyesopen dot com 2005-08-11 14:56 ---

I'll take a look, but on first inspection this looks more like a register allocation issue than a reg-stack problem. In the first (4.0) case, the accumulator "result" is assigned a hard register in the loop, whilst in the second (4.1) it is placed in memory, at -16(%ebp). This may also explain why extracting that loop into a stand-alone function produces optimal/original code, as the register allocator gets less confused by other influences in the function. The extracted code is also even better than 4.0's, as it avoids writing "result" to memory on each iteration (store sinking).

The second failure does show an interesting reg-stack/reg-alloc interaction though. The "hot" accumulator value is live on the backedge and the exit edge of the loop, but not on the incoming edge. Clearly, the best fix is to make this value live on the incoming edge, but failing that, it is actually better to prevent it being live on the back and exit edges, and add compensation code after the loop. I.e. if the store to result in the loop used fstpl, you wouldn't need to fstp %st(0) on each loop iteration, but would instead need a compensating fldl after the loop.

I'm not sure how easy it would be to teach GCC's register allocation to take these considerations into account, or failing that, whether reg-stack could be tweaked/hacked to locally fix this up. But the fundamental problem is that reg-alloc should assign result to a hard register, as it clearly knows there are enough available in that block. reg-stack.c is just doing what it's told, and in this case it's being told to do something stupid.

--
What|Removed |Added
Status|UNCONFIRMED |NEW
Ever Confirmed||1
Last reconfirmed|-00-00 00:00:00 |2005-08-11 14:56:31 date||

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322
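Schematically, the kind of loop under discussion looks like this (my sketch, not the PR's testcase): the accumulator is live on the loop's back edge and exit edge, so spilling it to memory costs a load and a store per iteration instead of keeping it on the x87 stack.

  // "result" is the hot accumulator that wants a hard register.
  double dot(const double* a, const double* b, int n) {
    double result = 0.0;          // dead on the loop's incoming edge
    for (int i = 0; i < n; ++i)
      result += a[i] * b[i];      // live on the back edge each iteration
    return result;                // live on the exit edge
  }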
[Bug middle-end/21137] Convert (a >> 2) & 1 != 0 into a & 4 != 0
--- Additional Comments From roger at eyesopen dot com 2005-08-14 19:17 ---

Hi James,

Unfortunately, there are a few mistakes in your proposed patch for PR21137.

Firstly, Kazu's proposed transformation is only valid when the results of the bitwise-AND are being tested for equality or inequality with zero. I.e. it's safe to transform "((a >> 2) & 1) != 0" into "(a & 4) != 0", but not "x = (a >> 2) & 1;" into "x = (a & 4)". Your patch is in the general fold path for BIT_AND_EXPR, so you'll transform both. It's surprising there are no testsuite checks for the second example above; it might be worth adding them to prevent anyone making a similar mistake in future.

Secondly, this transformation is only valid if c1 + c2 < TYPE_PRECISION(type). Consider the following code:

  signed char c;
  if ((c >> 6) & 64) ...

this is not equivalent to

  if (c & (char)(64<<6)) ...
  i.e. if (c & (char)4096) ...
  i.e. if (c & 0) ...
  i.e. if (0)

Of course, when c1+c2 >= TYPE_PRECISION(type), there are two additional optimizations that can be performed. If TYPE_UNSIGNED(type), the result is always false, and if !TYPE_UNSIGNED(type), the condition is equivalent to "a < 0". So in the example of mine above, optimization should produce:

  if (c < 0) ...

Finally, in your patch you use "break" if the transformation is invalid. This isn't really the correct "idiom/style" for fold, where if the guard for a transformation fails, you shouldn't drop out of the switch, but instead continue onto the following/next transformation "in the list". So instead of "if (!guard) break; return transform();", this optimization should be written as "if (guard) return transform();". I haven't looked for other examples of "break" in fold_unary/fold_binary/fold_ternary, but if there are any, they're probably (latent) missed-optimization bugs.

Other than that, the patch looks good. Thanks for looking into this.

--
What|Removed |Added
CC||phython at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21137
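To make the counterexample above concrete, here is a small self-contained program (my own sketch, not from the PR; note that right-shifting a negative value and the narrowing cast are implementation-defined, though GCC on common targets behaves as shown):

  #include <cstdio>

  int main() {
    signed char c = -1;                              // sign bit set
    std::printf("%d\n", ((c >> 6) & 64) != 0);       // prints 1: (-1 >> 6) == -1
    std::printf("%d\n",
                (c & (signed char)(64 << 6)) != 0);  // prints 0: (signed char)4096 == 0
    return 0;
  }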
[Bug tree-optimization/23476] [4.1 Regression] ICE in VRP, remove_range_assertions
--- Additional Comments From roger at eyesopen dot com 2005-08-20 15:27 ---

My apologies for adding a comment to an already resolved PR, but I've some follow-up thoughts on Diego's recent solution to this regression. From a high-level perspective, it would probably be more efficient to require that conditions are always folded, as an invariant of our tree-ssa data structures. It's better to fold a conditional once, when it is constructed/modified, than to need to call fold on it each time it is examined. Some places that build/modify conditionals may know that fold doesn't need to be (or has already been) called, whilst requiring the many places that examine CFGs to call fold themselves is pessimistic. This also fits well with our recent "folded by construction" philosophy, using fold_buildN instead of build.

I appreciate that this is a meta-issue, and Diego's fix is fine for this problem, but ultimately I think that placing stricter invariants on our data structures will reduce the number of unnecessary calls to fold, and speed up the compiler. Eventually, most calls to build* should be fold_build*, and it should rarely be necessary to call fold() without a call to build (or in-place modification of a tree). But perhaps there are valid tree-ssa reasons why this shouldn't be a long-term goal?

--
What|Removed |Added
CC| |roger at eyesopen dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23476
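As a toy analogy (deliberately not GCC code; names here are mine) of the "folded by construction" invariant: the factory function folds constant operands the moment the node is built, so consumers never need to re-fold a node they are handed.

  #include <cstdio>
  #include <memory>
  #include <utility>

  struct Expr { virtual ~Expr() = default; };
  struct Const : Expr { int v; explicit Const(int v) : v(v) {} };
  struct Add : Expr {
    std::unique_ptr<Expr> l, r;
    Add(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
      : l(std::move(l)), r(std::move(r)) {}
  };

  // Analogue of fold_build2: fold constant operands at construction time.
  std::unique_ptr<Expr> fold_build_add(std::unique_ptr<Expr> l,
                                       std::unique_ptr<Expr> r) {
    auto* cl = dynamic_cast<Const*>(l.get());
    auto* cr = dynamic_cast<Const*>(r.get());
    if (cl && cr)
      return std::make_unique<Const>(cl->v + cr->v);       // folded now
    return std::make_unique<Add>(std::move(l), std::move(r)); // raw node
  }

  int main() {
    auto e = fold_build_add(std::make_unique<Const>(2),
                            std::make_unique<Const>(3));
    std::printf("%d\n", dynamic_cast<Const&>(*e).v);       // 5: already folded
  }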
[Bug bootstrap/33781] [4.3 Regression] "Arg list too long" building libgcc.a
--- Comment #4 from roger at eyesopen dot com 2007-11-01 17:15 ---

Thanks to both Jakub and DJ for their help. I just tried out the suggested patch on my IRIX box, and was surprised that it didn't resolve the error. My apologies that my initial analysis might have been wrong (or incomplete), but it looks like the error occurs earlier on the same command line. Not only does

  objects="$(objects)" ; $(AR_CREATE_FOR_TARGET) $@ $$objects

fail; in fact, stripping the command back to just

  objects="$(objects)"

is enough to trigger the error. Hoping that this was perhaps a limitation of IRIX's /bin/sh, I've tried again with SHELL=/usr/local/bin/bash, but alas I get the same error:

  make: execvp: /usr/local/bin/bash: Arg list too long

So it's not a bound on argc or the number of entries in argv[] that's the problem, but a hard limitation on command line length. So it looks like we can't even assign $objects, let alone use it, either directly or looping over it to use xargs.

Perhaps we could do something with "find". Just a wild guess here, as I don't understand the build machinery, but something like:

  find . -name '*.o' -exec ar rc libgcc.a {} \;

and then test afterwards:

  if test ! -f libgcc.a ; then {the eh_dummy.o stuff to avoid empty libgcc.a} ; fi

I'm not sure why I'm seeing this. There mustn't be many IRIX testers for mainline, and either MIPS is building more objects than other platforms (for saturating and fixed point math) or most OSes are less "restricted" than IRIX.

Many thanks again for peoples' help. Is "find" portable, or is there a better way to achieve the same thing without ever placing all of the filenames on a single command line? Sorry for any inconvenience.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug bootstrap/33781] [4.3 Regression] "Arg list too long" building libgcc.a
--- Comment #8 from roger at eyesopen dot com 2007-11-02 16:41 ---

Created an attachment (id=14471)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14471&action=view)
Default libgcc.a objects on mips-sgi-irix6.5

I'll respond to Jakub's latest comments before trying DJ's more recent patch. Running "getconf ARG_MAX" on my IRIX box returns 20480, which is 20K. I believe this is the default, out-of-the-box setting for my machine, which is running IRIX 6.5.19m. Using cut'n'paste from the failing "make" output, I measure the current "$$objects" to be 25949 bytes. I've attached the "attempted" value of $objects to this e-mail.

I'll give DJ's patch a spin... I apologise that this box isn't that speedy.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug bootstrap/33781] [4.3 Regression] "Arg list too long" building libgcc.a
--- Comment #9 from roger at eyesopen dot com 2007-11-02 17:12 ---

Doh! DJ's patch gets us a little further, but things are still broken. However, it's an excellent debugging tool, which shows that it's the invocation with libgcc-objects-15 that's broken. Applying the same trick as above shows that $libgcc-objects-15 alone is 19962 bytes, which combined with the "ar" etc. at the beginning of the command line exceeds the limit.

So it's the "fixed-conv-funcs" that are to blame. Perhaps "gen-fixed.sh" has gone insane with the large number of integer-like machine modes on MIPS. The correct fix might actually be in the optabs handling of the middle-end, so we don't need quite so many conversion functions in MIPS' libgcc.a. Or perhaps mips.md needs improved support (patterns) for this functionality. I've no idea what _satfractunsUTIUHA is; it's a recent addition and I've not been following gcc-patches lately. Splitting "_fract*" from "_sat*" with a patch similar to DJ's should work.

I hope this is enlightening. Is there a --disable option to avoid building fixed point conversion support? Looks like our command line usage is O(n^2) in the number of backend integer machine modes?

Thanks again for everyone's help on this. I'll owe you beers at the next GCC summit.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug libstdc++/35968] nth_element fails to meet its complexity requirements
--- Comment #2 from roger at eyesopen dot com 2008-04-21 03:22 ---

Yep, now that we're back in stage1, it's about time I got around to submitting the O(n) worst-case nth_element implementation that I mentioned last year. For Steven's benefit, the implementation I've already coded up uses the median-of-medians in groups of five strategy as a fallback to a modified quickselect. [Though I'll need to quickly read the paper you cite.] The trick for libstdc++ is to attempt to make the typical case as fast or faster than the existing implementation. Whilst the standard now requires O(n) worst case, what g++'s users perceive is the average case, and changing to an O(n) implementation with a large constant coefficient may upset some folks.

--
roger at eyesopen dot com changed:
What|Removed |Added
AssignedTo|unassigned at gcc dot gnu dot org |roger at eyesopen dot com
Status|UNCONFIRMED |ASSIGNED
Ever Confirmed|0 |1
Last reconfirmed|-00-00 00:00:00 |2008-04-21 03:22:22 date||

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968
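For the record, a rough sketch (mine, illustrative only, not the proposed libstdc++ code) of quickselect with a median-of-medians-of-five pivot, the combination that yields a guaranteed O(n) worst case:

  #include <algorithm>
  #include <vector>

  static int pivot_index(std::vector<int>& v, int lo, int hi);

  // After the call, v[n] holds the element a full sort of v[lo..hi) would put there.
  static void select_nth(std::vector<int>& v, int lo, int hi, int n) {
    while (hi - lo > 1) {
      std::swap(v[pivot_index(v, lo, hi)], v[hi - 1]);
      int store = lo;                      // Lomuto partition around the pivot
      for (int i = lo; i < hi - 1; ++i)
        if (v[i] < v[hi - 1])
          std::swap(v[i], v[store++]);
      std::swap(v[store], v[hi - 1]);      // pivot lands in its final slot
      if (n == store) return;
      if (n < store) hi = store; else lo = store + 1;
    }
  }

  // Median of medians: the pivot is guaranteed to lie away from the extremes,
  // so a constant fraction of elements is eliminated every round.
  static int pivot_index(std::vector<int>& v, int lo, int hi) {
    if (hi - lo <= 5) {
      std::sort(v.begin() + lo, v.begin() + hi);
      return lo + (hi - lo) / 2;
    }
    int dst = lo;
    for (int i = lo; i < hi; i += 5) {     // move each group-of-five median
      int end = std::min(i + 5, hi);       // to the front of the range
      std::sort(v.begin() + i, v.begin() + end);
      std::swap(v[i + (end - i) / 2], v[dst++]);
    }
    int mid = lo + (dst - lo) / 2;
    select_nth(v, lo, dst, mid);           // recurse on the medians
    return mid;
  }

  int main() {
    std::vector<int> v = {9, 1, 8, 2, 7, 3, 6, 4, 5, 0};
    select_nth(v, 0, (int) v.size(), 4);
    return v[4] == 4 ? 0 : 1;              // rank 4 of 0..9 is 4
  }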
[Bug libstdc++/35968] nth_element fails to meet its complexity requirements
--- Comment #5 from roger at eyesopen dot com 2008-04-24 15:01 ---

Well, I've now had time to read the Battiato, Hofri et al. 2002 paper, and the bad news is that such an approximate median selection algorithm can't be used to guarantee an O(N) worst-case std::nth_element implementation. It could be used in an implementation to guess a good pivot, but the quality of this median, i.e. how approximate it is, doesn't meet the necessary criterion to ensure an O(N) worst case. You'd still need a fallback method with guaranteed bounds, or an exact median, in order to achieve O(N). I.e. it could help improve the average-case performance, but doesn't help with the worst case.

For the mathematically inclined: in order to achieve O(N) worst-case performance, you need to guarantee that a constant fraction of elements can be eliminated at each level of the recursion. In comment #4, Steven fixates on "just as long as N/2 elements are reduced each time round", but the sum-of-geometric-series argument shows that eliminating any fixed constant fraction each round guarantees an O(N) worst case. Hence even if you can only guarantee eliminating 10% each round, you still achieve an O(N) worst case. So you need a method that provides an approximate median that, in the worst case, can guarantee elimination of say 10% of elements from consideration. This is why approximate medians offer some utility over exact medians if they can be found faster.

Unfortunately, the method of Battiato referenced in comment #1 doesn't provide such a constant-fraction guarantee. An analysis shows that at each round it can only eliminate (2^n-1)/3^n of the elements in its worst case, where n is log_3(N). By hand, naming the ranks 0..N-1: when N=3, the true median at rank 1 is selected. For N=9, the elements at rank 3, 4 or 5 may be considered as a median, i.e. 1/3 eliminated. For N=27, the elements between ranks 8 and 20 may be returned as the median, i.e. 7/27 eliminated. In the limit, as N (and hence n) tends to infinity, the eliminated fraction (2^n-1)/3^n tends to zero. I.e. the larger the input size, the less useful the worst-case median. The poor quality of the median is lamented by the authors in the penultimate paragraph of section 4.1 of the paper. They then go on to show that statistically such a worst case is rare, but unfortunately even a rare worst case breaks the C++ standard library's O(N) constraint.

This Achilles heel is already well documented in the algorithmic complexity community. The Blum, Floyd, Pratt, Rivest and Tarjan paper [BFRT73] and the Floyd and Rivest paper [FR75] analyse the issues with median-of-k-medians, and show that k>=5 is the lowest value capable of a guaranteed fractional worst case. I.e. they already consider and reject the algorithm given in the cited work (k=3) for the purpose of exact median finding.

Anyway, I hope you find this interesting. There will always be efficient methods for finding approximate medians; the question is how efficient vs. how approximate. Many quicksort implementations select the first element as a pivot, an O(1) method for selecting an extremely approximate median! Statistically, over all possible input orders, this first element will on average partition the input array at the median, with some variance. It's not that the paper is wrong or incorrect; it does what it describes, finding a statistically good approximate median very efficiently and with excellent worst-case running time.
Unfortunately, the problem we need to solve is not the problem the paper's authors were attempting to solve: we need a better approximation, perhaps at the cost of a more complex implementation. Anyway, thanks again for the reference. I'd not come across it before and really enjoyed reading it. Let me know if you spot a flaw in my reasoning above.

Dr Roger Sayle, Ph.D. Computer Science

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968
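A quick numeric check (my own sketch) of the worst-case fraction (2^n-1)/3^n quoted above, with N = 3^n: the guaranteed eliminated fraction tends to zero, so no constant fraction per round, and hence no O(N) bound.

  #include <cmath>
  #include <cstdio>

  int main() {
    // n=2 gives 3/9 = 1/3 (N=9); n=3 gives 7/27 (N=27), matching the text.
    for (int n = 1; n <= 12; ++n)
      std::printf("n=%2d  N=%7.0f  eliminated >= %.6f\n",
                  n, std::pow(3.0, n),
                  (std::pow(2.0, n) - 1) / std::pow(3.0, n));
    return 0;
  }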
[Bug bootstrap/33781] [4.3/4.4 Regression] "Arg list too long" building libgcc.a
--- Comment #20 from roger at eyesopen dot com 2008-06-12 21:31 ---

Hi Ralf,

Thanks for your patch. Sorry for the delay in replying; I needed to check out mainline on my IRIX box and rebuild a baseline, and once that had completed "make -k check", I tried configuring with "--enable-fixed-point", first without and then with your patch. The good news is that this allows the libgcc build to get further, but unfortunately the bad news is that we die just a little further on with a similar "execvp: /bin/sh: Arg list too long".

This second failure is where we run nm on all of the objects and pipe the results through mkmap-flat.awk to create tmp-libgcc.map. This looks to be in the same libgcc/Makefile.in, in the libgcc.map rule (when SHLIB_MKMAP is defined). I do like your PR33781.diff patch, which moves us in the right direction. Is it possible/safe to apply similar voodoo to the libgcc.map rule?

Many thanks again for your help. I've no personal interest in using fixed-point arithmetic on the MIPS, but resolving this issue on IRIX helps keep the build machinery portable. If it's not IRIX now, it'll be some other platform with a low ARG_MAX limit in the near future.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug rtl-optimization/25703] [4.2 Regression] ACATS cxa4024 failure
--- Comment #5 from roger at eyesopen dot com 2006-01-25 01:05 ---

I'm testing the following patch...

Index: combine.c
===================================================================
*** combine.c	(revision 109912)
--- combine.c	(working copy)
*************** try_combine (rtx i3, rtx i2, rtx i1, int
*** 1967,1972 ****
--- 1967,1983 ----
  	      if (BITS_BIG_ENDIAN)
  		offset = GET_MODE_BITSIZE (GET_MODE (XEXP (dest, 0)))
  			 - width - offset;
+ 
+ 	      /* If this is the low part, we're done.  */
+ 	      if (subreg_lowpart_p (XEXP (dest, 0)))
+ 		;
+ 	      /* Handle the case where inner is twice the size of outer.  */
+ 	      else if (GET_MODE_BITSIZE (GET_MODE (SET_DEST (temp)))
+ 		       == 2 * GET_MODE_BITSIZE (GET_MODE (XEXP (dest, 0))))
+ 		offset += GET_MODE_BITSIZE (GET_MODE (XEXP (dest, 0)));
+ 	      /* Otherwise give up for now.  */
+ 	      else
+ 		offset = -1;
  	    }
  	}
        else if (subreg_lowpart_p (dest))

My apologies for any inconvenience.

--
roger at eyesopen dot com changed:
What|Removed |Added
CC| |roger at eyesopen dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25703
[Bug rtl-optimization/25703] [4.2 Regression] ACATS cxa4024 failure
--- Comment #11 from roger at eyesopen dot com 2006-01-25 19:52 ---

Created an attachment (id=10729)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10729&action=view)
patch v2

Here's a revised version of the patch that also handles the STRICT_LOW_PART case. My apologies once again for the inconvenience. In the previous version of the patch I'd mistakenly assumed that STRICT_LOW_PART was some indication that the SUBREG only affected the "low_part". Investigating Jan's testcase with -mtune=i486, I now understand it really means STRICT_SUB_PART, and actually behaves identically to SUBREG in this optimization, as we preserve all of the unaffected bits anyway!

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25703
[Bug c++/26079] New: Template instantiation behavior change in 4.1 (regression?)
The following short code fragment no longer compiles with gcc 4.1. I've no clue if this is a regression or mandated by the standard.

#include <string>
#include <vector>
#include <utility>

int size(char x) { return (int) sizeof(x); }
int size(int x) { return (int) sizeof(x); }
int size(const std::string &x) { return (int) x.size() + (int) sizeof(int); }

template <typename T>
int size(const std::vector<T> &x)
{
  int result = (int) sizeof(int);
  typename std::vector<T>::const_iterator iter;
  for (iter = x.begin() ; iter != x.end() ; iter++)
    result += size(*iter);
  return result;
}

template <typename S, typename T>
int size(const std::pair<S,T> &x)
{
  return size(x.first) + size(x.second);
}

int foo()
{
  std::vector<std::pair<int,std::string> > pvec;
  return size(pvec);
}

Sorry to not reduce a stand-alone testcase without headers. The STL isn't important. The issue is that the list of candidates for "size(std::pair<...>)" doesn't include the templates, only the functions, when instantiating "size(std::vector<...>)". On IRC they thought this looked reasonable enough to file a PR. This works fine in 4.0.2 and 3.4.x and many other C++ compilers.

--
Summary: Template instantiation behavior change in 4.1 (regression?)
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: roger at eyesopen dot com
GCC host triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26079
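To illustrate the language rule without any headers, here is a hypothetical reduction (mine, not from the PR) of the two-phase lookup behaviour that gcc 4.1 now enforces; a conforming compiler is expected to reject it:

  template <typename T> struct Box { T t; };

  // size(b.t) is a dependent call: it sees only declarations visible at
  // this point, plus argument-dependent lookup at instantiation time.
  template <typename T>
  int size(const Box<T>& b) { return size(b.t); }

  int size(int) { return 4; }   // declared later; int has no associated
                                // namespace, so ADL can't find it either

  int foo()
  {
    Box<int> b = { 0 };
    return size(b);             // error under two-phase lookup (4.1),
  }                             // accepted by 4.0 and older compilers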
[Bug c++/26080] New: Template instantiation behavior change in 4.1 (regression?)
The following short code fragment no longer compiles with gcc 4.1. I've no clue if this is a regression or mandated by the standard.

#include <string>
#include <vector>
#include <utility>

int size(char x) { return (int) sizeof(x); }
int size(int x) { return (int) sizeof(x); }
int size(const std::string &x) { return (int) x.size() + (int) sizeof(int); }

template <typename T>
int size(const std::vector<T> &x)
{
  int result = (int) sizeof(int);
  typename std::vector<T>::const_iterator iter;
  for (iter = x.begin() ; iter != x.end() ; iter++)
    result += size(*iter);
  return result;
}

template <typename S, typename T>
int size(const std::pair<S,T> &x)
{
  return size(x.first) + size(x.second);
}

int foo()
{
  std::vector<std::pair<int,std::string> > pvec;
  return size(pvec);
}

Sorry to not reduce a stand-alone testcase without headers. The STL isn't important. The issue is that the list of candidates for "size(std::pair<...>)" doesn't include the templates, only the functions, when instantiating "size(std::vector<...>)". On IRC they thought this looked reasonable enough to file a PR. This works fine in 4.0.2 and 3.4.x and many other C++ compilers.

--
Summary: Template instantiation behavior change in 4.1 (regression?)
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: roger at eyesopen dot com
GCC host triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26080
[Bug c++/26080] Template instantiation behavior change in 4.1 (regression?)
--- Comment #1 from roger at eyesopen dot com 2006-02-02 18:43 --- *** This bug has been marked as a duplicate of 26079 *** -- roger at eyesopen dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26080
[Bug c++/26079] Template instantiation behavior change in 4.1 (regression?)
--- Comment #1 from roger at eyesopen dot com 2006-02-02 18:43 --- *** Bug 26080 has been marked as a duplicate of this bug. *** -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26079
[Bug bootstrap/26161] New: Configure tests for pthread.h sometimes need to use -pthread
The problem is that on some systems, including Tru64 and I believe AIX, the compiler has to be passed the -pthread command line option in order to use #include <pthread.h>. Effectively, the first lines of /usr/include/pthread.h contain the lines:

#ifndef _REENTRANT
#error POSIX pthreads are only available with the use of -pthreads
#endif

For this reason the autoconf tests for pthread.h in libstdc++-v3 and libgomp always fail. Fortunately, this was previously not serious, as the target configurations would include pthread.h anyway, and all the relevant source libraries are compiled with -pthread. In directories where they don't, GCC has workarounds, such as in gcc/gthr-posix.h, which contains the lines:

/* Some implementations of <pthread.h> require this to be defined.  */
#ifndef _REENTRANT
#define _REENTRANT 1
#endif

#include <pthread.h>

This issue escalated to a bootstrap failure in libgomp recently, which now aborts whilst configuring libgomp when pthread.h isn't detected. Prior to this change, libgomp built fine and the test results were quite reasonable on Alpha/Tru64. [Stretching the definition of a regression :-)]

I believe that what is needed is a "local" configure test for pthread.h that first decides whether the compiler supports -pthread (for example, GCC on IRIX currently does not), and then uses this flag when testing for the header. This is perhaps similar to the related patch I posted recently, where we need to test system headers with the same compiler options we'll be using to build the source files: http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00139.html See the related definitions of THREADCXXFLAGS and THREADLDFLAGS in libjava's configure.ac. Unfortunately, my autoconf-fu isn't strong enough to tackle this.

The temporary work-around is to use --disable-libgomp. The long-term fix would be to port libgomp to use GCC's gthreads library. But in the meantime, it would be good to correct the test for pthread.h and/or add a PTHREAD_CFLAGS that can be used by any project. I'm happy to test patches on affected systems. However, it should be trivial to re-create a model system with the above lines and using -D_REENTRANT as the compiler option that needs to be passed.

--
Summary: Configure tests for pthread.h sometimes need to use -pthread
Product: gcc
Version: 4.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: bootstrap
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: roger at eyesopen dot com
GCC host triplet: alpha*-*-osf*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26161
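For reference, a minimal stand-in (my construction, not the actual autoconf test) for the header check at issue: compiled without -pthread on such a system, the #error above fires because _REENTRANT is undefined, and configure concludes that pthread.h is unusable.

  #include <pthread.h>

  int main() {
    pthread_t self = pthread_self();   // any trivial use of the API
    (void) self;
    return 0;
  }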
[Bug bootstrap/26161] Configure tests for pthread.h sometimes need to use -pthread
--- Comment #2 from roger at eyesopen dot com 2006-02-07 21:15 --- I've discovered your bootstrap failure is PR16787. It'll take a while for me to try out your XCFLAGS fix on my slow machine. I'll also propose a fix for PR16787. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26161
[Bug bootstrap/26161] Configure tests for pthread.h sometimes need to use -pthread
--- Comment #3 from roger at eyesopen dot com 2006-02-08 04:04 --- Subject: Re: Configure tests for pthread.h sometimes need to use -pthread

On 7 Feb 2006, fxcoudert at gcc dot gnu dot org wrote:
> I tried to give it a look on alphaev68-dec-osf5.1b, but I couldn't
> get to the point of configuring libgomp :)
>
> cc -c -DHAVE_CONFIG_H -g -I. -I../../gcc/libiberty/../include -Wc++-compat
> ../../gcc/libiberty/floatformat.c -o ./floatformat.o
> cc: Error: ../../gcc/libiberty/floatformat.c, line 343: In this statement, the
> libraries on this platform do not yet support compile-time evaluation of the
> constant expression "0.0/0.0". (constfoldns)
> dto = NAN;

Hi FX,

Could you try the following for me, and I'll submit it to gcc-patches? Unfortunately, my OSF_DEV PAK has expired, so I rely on gcc for hosting GCC.

2006-02-07  Roger Sayle  <[EMAIL PROTECTED]>
	    R. Scott Bailey  <[EMAIL PROTECTED]>

	PR bootstrap/16787
	* floatformat.c: Include <float.h> where available.
	(NAN): Use value of DBL_QNAN if defined, and NAN isn't.

Index: floatformat.c
===================================================================
*** floatformat.c	(revision 110738)
--- floatformat.c	(working copy)
***************
*** 1,5 ****
  /* IEEE floating point support routines, for GDB, the GNU Debugger.
!    Copyright 1991, 1994, 1999, 2000, 2003, 2005 Free Software Foundation, Inc.
  
  This file is part of GDB.
--- 1,5 ----
  /* IEEE floating point support routines, for GDB, the GNU Debugger.
!    Copyright 1991, 1994, 1999, 2000, 2003, 2005, 2006 Free Software Foundation, Inc.
  
  This file is part of GDB.
*************** Foundation, Inc., 51 Franklin Street - F
*** 31,36 ****
--- 31,41 ----
  #include <string.h>
  #endif
  
+ /* On some platforms, <float.h> provides DBL_QNAN.  */
+ #ifdef STDC_HEADERS
+ #include <float.h>
+ #endif
+ 
  #include "ansidecl.h"
  #include "libiberty.h"
  #include "floatformat.h"
*************** Foundation, Inc., 51 Franklin Street - F
*** 44,51 ****
--- 49,60 ----
  #endif
  
  #ifndef NAN
+ #ifdef DBL_QNAN
+ #define NAN DBL_QNAN
+ #else
  #define NAN (0.0 / 0.0)
  #endif
+ #endif
  
  static unsigned long get_field (const unsigned char *,
  				enum floatformat_byteorders,

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26161
[Bug libgomp/25936] libgomp needs to link against rt on HPUX
--- Comment #4 from roger at eyesopen dot com 2006-02-08 17:46 --- This problem affects both hppa*-hp-hpux* and ia64-hp-hpux*. It appears that the required sem_init, sem_wait, sem_post, etc... symbols are defined both in the -lrt libraries on HPUX and in the -lc_r libraries. The fix is to update LIB_SPEC, perhaps in the -pthread clause, for HPUX, but I'm not sure if it requires adding -lrt or changing -lc to -lc_r, or adding -lc_r? I notice that config/pa/pa-hpux10.h does mention -lc_r, but for use with -threads. Should -pthread pull in the required symbols? i.e. is this a libgomp problem or a target problem? -- roger at eyesopen dot com changed: What|Removed |Added CC||roger at eyesopen dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25936
[Bug target/22209] [4.1 regression] libgfortran unresolvable symbols on irix6.5
--- Comment #10 from roger at eyesopen dot com 2006-02-09 14:41 ---

Hi David,

nm $objdir/gcc/libgcc.a contains both __ctzdi2 and __ctzti2 for me:

grasp% nm libgcc.a | grep ctz
_ctzsi2.o: T __ctzdi2
_ctzdi2.o: T __ctzti2

The post-commit bootstrap and regression test on IRIX 6.5.19m just completed fine for me, with the following gfortran test results:

gfortran
# of expected passes		11485
# of unexpected failures	20
# of expected failures		12
# of unsupported tests		26

Could you investigate this failure a bit further? I've no idea why you should be seeing these problems. If it makes any difference, I configure with:

${SRCDIR}/configure --with-gnu-as --with-as=/usr/local/bin/as \
    --with-gnu-ld --with-ld=/usr/local/bin/ld

where the above as and ld are both binutils 2.16. I've had trouble with binutils 2.16.1's ld on IRIX built both with MIPSPro cc and gcc 3.4.3, so we currently stick with 2.16, but I'll investigate if that makes a difference.

--
roger at eyesopen dot com changed:
What|Removed |Added
CC| |roger at eyesopen dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22209
[Bug target/22209] [4.1 regression] libgfortran unresolvable symbols on irix6.5
--- Comment #11 from roger at eyesopen dot com 2006-02-09 14:54 --- p.s. I can also confirm that this patch fixes the test case in PR25028 for me on mips-sgi-irix6.5. This failed previously with undefined references to __floattisf and __floattidf, but now not only compiles and links but produces the correct output. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22209
[Bug other/25028] TImode-to-floating conversions broken
--- Comment #4 from roger at eyesopen dot com 2006-02-09 15:00 --- My recent fix for PR target/22209 adding TImode support for MIPS, just fixed this PR's testcase for me on mips-sgi-irix6.5. The new fix*.c and float*.c source files may be useful in resolving the remaining PR25028 issue on ia64/HPUX? I'll investigate. -- roger at eyesopen dot com changed: What|Removed |Added CC| |roger at eyesopen dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25028