[Bug target/16641] fr30-elf-gcc compiler error when building newlib-1.12.0
--- Comment #6 from roger at eyesopen dot com 2006-04-23 21:19 --- This should now be fixed on mainline. I've confirmed that a cross-compiler to fr30-elf currently builds newlib without problems. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16641
[Bug target/21283] [4.0/4.1 regression] ICE with doubles
--- Comment #4 from roger at eyesopen dot com 2006-04-23 21:27 --- This has now been fixed on mainline. I've confirmed that a cross-compiler to fr30-elf can currently compile all of newlib without problems. If anyone has an fr30 board or a simulator to check the testsuite, that would be great. -- roger at eyesopen dot com changed: What|Removed |Added Summary|[4.0/4.1/4.2 regression] ICE|[4.0/4.1 regression] ICE |with doubles|with doubles Target Milestone|4.1.1 |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21283
[Bug target/27282] [4.2 regression] ICE in final_scan_insn, at final.c:2448 - could not split insn
--- Comment #6 from roger at eyesopen dot com 2006-04-25 14:09 --- Paolo's fix looks good to me. The bugzilla PR shows that this is a 4.2 regression, probably due to the more aggressive RTL optimizations on mainline. So I'll preapprove Paolo's fix for mainline (please post the version you commit and a new testcase when you commit it). As for 4.1, do we have an example of a failure or wrong code generation against the branch? I can't tell from bugzilla whether this is safely latent in 4.0 and 4.1, or just hasn't been investigated there yet ("known to work" is blank, but the summary only lists [4.2]). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27282
[Bug target/27282] [4.2 regression] ICE in final_scan_insn, at final.c:2448 - could not split insn
--- Comment #8 from roger at eyesopen dot com 2006-04-25 15:41 --- Grr. David's patch is also good. Perhaps better if we follow the usual protocol of posting patches to gcc-patches *after* bootstrap and regression testing, for review and approval. Posting untested patch fragments to bugzilla without ChangeLog entries and asking for preapproval etc... seems to, in this instance at least, demonstrate why GCC has the contribution protocols that it has. Thanks to David for catching this. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27282
[Bug target/21283] [4.0 regression] ICE with doubles
--- Comment #6 from roger at eyesopen dot com 2006-04-26 18:59 --- This has now been fixed on the 4.1 branch. Unfortunately, it's difficult to determine whether this patch is still needed on the 4.0 branch, or if other backports are also required, as libiberty and top-level configure are now incompatible between the gcc-4_0-branch and mainline "src", making an uberbaum build of a 4.0 cross-compiler almost impossible. -- roger at eyesopen dot com changed: What|Removed |Added Known to fail|4.0.1 4.1.0 |4.0.1 Known to work|3.4.5 |3.4.5 4.1.1 4.2.0 Summary|[4.0/4.1 regression] ICE|[4.0 regression] ICE with |with doubles|doubles Target Milestone|4.2.0 |4.1.1 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21283
[Bug rtl-optimization/13335] cse of sub-expressions of zero_extend/sign_extend expressions
--- Comment #9 from roger at eyesopen dot com 2006-04-30 19:52 --- This bug is a duplicate of PR17104 which was fixed by Nathan Sidwell in November 2004. If you read comment #4, you'll notice that the failure of CSE to handle the zero_extends emitted by the rs6000 backend's rs6000_emit_move is identical. *** This bug has been marked as a duplicate of 17104 *** -- roger at eyesopen dot com changed: What|Removed |Added Status|WAITING |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13335
[Bug rtl-optimization/17104] Non-optimal code generation for bitfield initialization
--- Comment #9 from roger at eyesopen dot com 2006-04-30 19:52 --- *** Bug 13335 has been marked as a duplicate of this bug. *** -- roger at eyesopen dot com changed: What|Removed |Added CC||dje at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17104
[Bug fortran/27269] Segfault with EQUIVALENCEs in modules together with ONLY clauses
--- Comment #5 from roger at eyesopen dot com 2006-05-02 14:24 --- This should now be fixed on mainline, thanks to Paul's patch. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27269
[Bug fortran/27324] Initialized module equivalence member causes assembler error
--- Comment #4 from roger at eyesopen dot com 2006-05-02 14:26 --- This should now be fixed on mainline by Paul's patch. Thanks. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27324
[Bug c/25309] [4.0/4.1/4.2 Regression] ICE on initialization of a huge array
--- Comment #10 from roger at eyesopen dot com 2006-05-04 00:14 --- This should now be fixed on mainline and all active branches. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|4.1.1 |4.0.4 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25309
[Bug tree-optimization/27285] [4.1 regression] ivopts postgresql miscompilation
--- Comment #8 from roger at eyesopen dot com 2006-05-08 15:29 --- I've now reconfirmed that this has been fixed on the gcc-4_1-branch by Jakub's backport of Zdenek's patch. Thanks to you both. -- roger at eyesopen dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27285
[Bug target/26600] [4.1/4.2 Regression] internal compiler error: in push_reload, at reload.c:1303
--- Comment #8 from roger at eyesopen dot com 2006-05-11 17:22 --- Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2006-05/msg00472.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26600
[Bug middle-end/20722] select_section invoked with argument "unlikely"
--- Comment #2 from roger at eyesopen dot com 2006-05-13 18:59 --- This is the correct documented behaviour. See the section entitled "USE_SELECT_SECTION_FOR_FUNCTIONS" in doc/tm.texi, which reads:

> @defmac USE_SELECT_SECTION_FOR_FUNCTIONS
> Define this macro if you wish TARGET_ASM_SELECT_SECTION to be called
> for @code{FUNCTION_DECL}s as well as for variables and constants.
>
> In the case of a @code{FUNCTION_DECL}, @var{reloc} will be zero if the
> function has been determined to be likely to be called, and nonzero if
> it is unlikely to be called.
> @end defmac

This is also cross-referenced from the TARGET_ASM_SELECT_SECTION target hook documentation, as the semantics for selecting function sections. The only backend that defines USE_SELECT_SECTION_FOR_FUNCTIONS, darwin, appears to implement the semantics as described above. The two calls in function_section and current_function_section are guarded by #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS. Admittedly, this could have been implemented by a second target hook, and/or the variable names in the varasm functions could be less confusing, but this isn't a bug, and certainly not P1. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20722
[Bug middle-end/26729] [4.0 regression] bad bitops folding
--- Comment #20 from roger at eyesopen dot com 2006-05-14 17:39 --- Hi APL, Re: comment #18. It was actually stevenb that changed the "known to work" line, and assigned this PR to me, after I'd committed a fix to the gcc-4_1-branch. See http://gcc.gnu.org/ml/gcc-bugs/2006-05/msg01351.html Marking 4.1.0 as known to work was a simple mistake/typo, and it should read that 4.1.0 is known to fail, but 4.1.1 is known to work. I retested b.cxx explicitly to confirm that it really is fixed on the release branch. -- roger at eyesopen dot com changed: What|Removed |Added Known to fail|3.3.6 3.4.3 4.0.2 |3.3.6 3.4.3 4.0.2 4.1.0 Known to work|2.95.4 4.2.0 4.1.0 |2.95.4 4.2.0 4.1.1 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26729
[Bug rtl-optimization/14261] ICE due to if-conversion
--- Comment #5 from roger at eyesopen dot com 2006-05-15 17:37 --- This should now be fixed on both mainline and the 4.1 branch. Thanks Andreas. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.1.2 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14261
[Bug middle-end/26729] [4.0 regression] bad bitops folding
--- Comment #22 from roger at eyesopen dot com 2006-05-15 17:41 --- This should now be fixed on all open branches. -- roger at eyesopen dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED Target Milestone|4.1.1 |4.0.4 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26729
[Bug target/26600] [4.1/4.2 Regression] internal compiler error: in push_reload, at reload.c:1303
--- Comment #12 from roger at eyesopen dot com 2006-05-18 01:50 --- This is now fixed on both mainline and the 4.1 branch. -- roger at eyesopen dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26600
[Bug middle-end/21067] Excessive optimization of floating point expression
--- Comment #4 from roger at eyesopen dot com 2006-05-20 15:14 --- This problem is fixed by specifying the -frounding-math command line option, which informs the compiler that non-default rounding modes may be used. With gcc-3.4, specifying this command line option disables this potentially problematic transformation. Strangely, on mainline, it looks like this transformation is no longer triggered, which may now indicate a missed optimization regression with (the default) -fno-rounding-math. We should also catch the division case. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21067
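[An illustrative sketch of the class of code affected, not the PR's testcase: constant folding of an inexact expression such as 1.0/3.0 happens at compile time under round-to-nearest, ignoring the mode set at run time, unless -frounding-math is given (compile with -frounding-math, link with -lm).]

#include <fenv.h>
#include <stdio.h>

int main (void)
{
  fesetround (FE_DOWNWARD);
  /* If 1.0/3.0 is folded at compile time it is rounded to nearest;
     with -frounding-math the division is kept and performed under
     the FE_DOWNWARD mode set above.  */
  volatile double third = 1.0 / 3.0;
  printf ("%.20g\n", third);
  return 0;
}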
[Bug tree-optimization/23452] Optimizing CONJG_EXPR (a) * a
--- Comment #3 from roger at eyesopen dot com 2006-06-01 02:41 --- This is now fixed on mainline provided the user specifies -ffast-math. There are some complications where imagpart(z*~z) can be non-zero if imagpart(z) is non-finite, such as an Inf or a NaN. It's unclear from the Fortran 95 standard whether gfortran is allowed to optimize this even without -ffast-math. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23452
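[In C terms, the non-finite corner case reads as follows; an illustrative example, not from the PR.]

#include <complex.h>

/* For z = a + b*I with b infinite, the imaginary part of z * conj(z)
   evaluates as b*a + a*(-b) = Inf - Inf = NaN rather than 0.0, which
   is why the folding is only safe under -ffast-math.  */
double imag_of_norm (double complex z)
{
  return cimag (z * conj (z));
}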
[Bug target/26223] [4.0 regression] ICE on long double with -mno-80387
--- Comment #13 from roger at eyesopen dot com 2006-06-06 22:41 --- This should now be fixed on all active branches. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26223
[Bug target/27082] segfault with virtual class and visibility ("hidden")
--- Comment #14 from roger at eyesopen dot com 2006-06-19 23:50 --- Unfortunately, I'm unable to reproduce this failure with a cross-compiler to alphaev68-unknown-linux-gnu. However, examination of the tracebacks attached to this PR and the relevant source code reveals there is a potential problem. It looks like alpha_expand_mov can call force_const_mem on RTL expressions that are CONSTANT_P but for which targetm.cannot_force_const_mem is true, such as CONST etc... This would lead to precisely the failures observed in the discussion: operands[1] gets overwritten by NULL_RTX, and we then call validize_mem on a NULL pointer! Kaboom! I think one aspect of the solution is the following patch:

Index: alpha.c
===================================================================
*** alpha.c     (revision 114721)
--- alpha.c     (working copy)
*************** alpha_expand_mov (enum machine_mode mode
*** 2227,2232 ****
--- 2227,2237 ----
        return true;
      }

+   /* Don't call force_const_mem on things that we can't force
+      into the constant pool.  */
+   if (alpha_cannot_force_const_mem (operands[1]))
+     return false;
+
    /* Otherwise we've nothing left but to drop the thing to memory.  */
    operands[1] = force_const_mem (mode, operands[1]);
    if (reload_in_progress)

However, it's not impossible that this will prevent the current failure only to pass the problematic operand on to somewhere else in the compiler. Could someone who can reproduce this failure try the above patch and see if there's any downstream fallout? It would also be great to see what the problematic RTX looks like. I'm pretty sure it's either a SYMBOL_REF, a LABEL_REF or a CONST. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27082
[Bug middle-end/28131] [4.2 Regression] FAIL: gcc.c-torture/execute/va-arg-25.c compilation (ICE)
--- Comment #5 from roger at eyesopen dot com 2006-06-22 00:37 --- Doh! My apologies for the breakage! I think Dave's patch looks good, but the one suggestion that I would make would be to test for MODE_INT first, then call the type_for_mode langhook. This saves calling type_for_mode on unusual modes.

  tree tmp = NULL_TREE;

  if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT
      || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
    return const_vector_from_tree (exp);
  if (GET_MODE_CLASS (mode) == MODE_INT)
-   tmp = fold_unary (VIEW_CONVERT_EXPR,
-                     lang_hooks.types.type_for_mode (mode, 1),
-                     exp);
+   {
+     tree type_for_mode = lang_hooks.types.type_for_mode (mode, 1);
+     if (type_for_mode)
+       tmp = fold_unary (VIEW_CONVERT_EXPR, type_for_mode, exp);
+   }
  if (!tmp)

I'll pre-approve that change, provided it bootstraps and regression tests OK. Unfortunately, extern "C" conflicts for errno in the HPUX system headers mean that I'm unable to test on my HPPA box myself at the moment :-( -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28131
[Bug target/27861] [4.0 regression] ICE in expand_expr_real_1, at expr.c:6916
--- Comment #10 from roger at eyesopen dot com 2006-06-22 04:46 --- This should now be fixed on all active branches. Thanks to Martin for confirming the fix bootstraps and regression tests fine on mipsel-linux-gnu. And thanks, as always, to Andrew Pinski for maintaining the PR in bugzilla. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27861
[Bug middle-end/27889] [4.1/4.2 Regression] ICE on complex assignment in nested function
--- Comment #14 from roger at eyesopen dot com 2006-06-26 00:24 --- The problem appears to be that DECL_COMPLEX_GIMPLE_REG_P is not getting set on the declarations correctly. The VAR_DECLs that are operands to the additions don't have DECL_COMPLEX_GIMPLE_REG_P set, so fail the is_gimple_val check in verify_stmts. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27889
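[A hypothetical reduction of the failing construct, not necessarily the PR's testcase: a complex assignment involving a variable shared with a nested function, the situation where, per the analysis above, DECL_COMPLEX_GIMPLE_REG_P ends up out of sync with how the operands are used.]

void f (void)
{
  _Complex double c = 0.0;
  /* GNU C nested function modifying the enclosing function's
     complex variable.  */
  void g (void) { c = c + 1.0; }
  g ();
}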
[Bug middle-end/28283] SuperH: Very unoptimal code generated for 64-bit ints
--- Comment #2 from roger at eyesopen dot com 2006-07-06 19:17 --- Investigating... I suspect that the SH backend's rtx_costs are parameterized incorrectly, such that a 64-bit shift by the constant 32 looks to be at least 32 times more expensive than a 64-bit addition. The middle-end then uses these numbers to select the appropriate code sequence to generate. Combine also doesn't bother cleaning this up because it uses the same invalid rtx_costs, and discovers that combining additions into shifts doesn't appear to be a win on this target. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28283
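[The flavour of source affected is simply a wide shift; an illustrative example, not from the PR.]

/* A 64-bit shift by 32: if the backend reports DImode shifts as
   prohibitively expensive, the middle-end will synthesize the shift
   from operations it believes are cheaper, such as long chains of
   additions.  */
unsigned long long lo_to_hi (unsigned long long x)
{
  return x << 32;
}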
[Bug middle-end/28283] SuperH: Very unoptimal code generated for 64-bit ints
--- Comment #5 from roger at eyesopen dot com 2006-07-06 19:47 --- No, the rtx_costs for a DImode shift really are wrong. The use of the constant 1 in sh.c:shift_costs instructs the middle-end to avoid using DImode shifts at all costs. The semantics of rtx_costs is that it is expected to provide an estimate of the cost of performing an instruction (either in size when optimize_size, or in performance when !optimize_size) even if the hardware doesn't support that operation directly. For example, a backend may even need to provide an estimate of the time taken by a libcall to libgcc, if such an operation is necessary, or, when optimizing for size, of how large setting up and executing such a call sequence would be. It's only by providing accurate information such as this that an embedded backend such as SH is able to provide fine control over the code sequences selected by the GCC middle-end. As for the little-endian vs. big-endian issue, that looks like a second bug. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28283
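[As an illustration of the expected contract, a sketch only, with hypothetical names and numbers, not the actual sh.c code: a 4.x-era TARGET_RTX_COSTS hook should report a finite, realistic cost for a DImode shift rather than a prohibitive sentinel.]

/* Sketch: report a DImode shift as costing roughly its expanded
   insn/libcall sequence, so the middle-end can weigh it fairly
   against alternative sequences.  */
static bool
example_rtx_costs (rtx x, int code, int outer_code ATTRIBUTE_UNUSED,
                   int *total)
{
  if (code == ASHIFT && GET_MODE (x) == DImode)
    {
      *total = COSTS_N_INSNS (4);  /* hypothetical estimate */
      return true;
    }
  return false;  /* fall back to the default costs */
}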
[Bug c/25995] switch/case does not detect invalid enum values if default-case is used
--- Comment #3 from roger at eyesopen dot com 2006-07-08 14:31 --- I tried fixing this bug, only to discover why things are exactly as they are. The short answer is that GCC will warn about the example code if you specify the -Wswitch-enum command line option. Specifying -Wall implies the weaker -Wswitch, which intentionally disables the checking of enumeration literals in switch statements. But why would anyone want to disable the warning for the example code, I thought to myself, until bootstrapping GCC itself discovered a large number of cases identical to the one reported. Internally, GCC itself uses an enumeration called tree_code that tracks the different types of node of GCC's abstract syntax tree (AST). However, numerous front-ends supplement this enumeration with their own front-end specific tree codes, for example, COMPOUND_LITERAL_EXPR. Hence, the various GCC front-ends are littered with source code that looks like:

  switch (TREE_CODE (t))
    {
    case COMPOUND_LITERAL_EXPR:
      ...

where the case value isn't one of the values of the original enum tree_code enumeration. Similar problems appeared in dwarf2out.c and other GCC source files. At first I started changing things to "switch ((int) TREE_CODE (t))" to silence the warning, but quickly became overwhelmed by the number of source files that needed updating. Hence, the current status quo. GCC uses the "default: break;" idiom to indicate which switch statements may be bending the rules, to turn off this warning with the default -Wall/-Wswitch used during bootstrap. Well written user code, on the other hand, should probably always use -Wswitch-enum. If you read the documentation of -Wswitch vs. -Wswitch-enum, you'll see that the disabling of these warnings when a default case is specified is a curious "feature", purely to aid GCC in compiling itself. As Andrew Pinski points out in comment #2, it's valid C/C++ so shouldn't warrant an immediate warning, so the explicit -Wswitch-enum, requesting stricter checking, seems reasonable. I hope this helps, and that -Wswitch-enum fulfils this enhancement request. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25995
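[A minimal example of the distinction; mine, not from the PR.]

enum color { RED, GREEN, BLUE };

int f (enum color c)
{
  switch (c)
    {
    case RED:
      return 1;
    default:   /* -Wall/-Wswitch stay silent about GREEN and BLUE
                  because of this default; -Wswitch-enum still warns
                  that they are not handled explicitly.  */
      return 0;
    }
}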
[Bug other/22313] [4.2 Regression] profiledbootstrap is broken on the mainline
--- Comment #37 from roger at eyesopen dot com 2006-07-17 22:15 --- I've now tested "make profiledbootstrap" on both mainline and the gcc-4_1-branch, on both x86_64-unknown-linux-gnu and i686-pc-linux-gnu, and not only does the profiled bootstrap build fine, but the dejagnu testsuite looks identical to a baseline "make bootstrap". Could anyone confirm whether they're still seeing this problem? It's likely that Andrew Pinski's patches together with the resolution of PRs 25518 and 26449 have now resolved this issue. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22313
[Bug middle-end/5169] paradoxical subreg problem
--- Comment #13 from roger at eyesopen dot com 2005-12-04 18:06 --- This bug has been fixed, and not just hidden. Jeff Law's proposed solution to this problem http://gcc.gnu.org/ml/gcc/2002-01/msg01872.html which was proposed in January 2002, was contained as part of Jeff/HP's patch of April 2002 (which is subversion revision 51785).

2002-04-03  Jeffrey A Law  ([EMAIL PROTECTED])
            Hans-Peter Nilsson  <[EMAIL PROTECTED]>

        * combine.c (simplify_comparison): Avoid narrowing a comparison
        with a paradoxical subreg when doing so would drop significant bits.

This explains why no-one has been able to reproduce the problem since that date, and why it was assumed to have been hidden (gone latent) by an unrelated change. Since then that solution has been corrected/improved further by Ulrich in revision 70785.

2003-08-25  Ulrich Weigand  <[EMAIL PROTECTED]>

        * combine.c (simplify_comparison): Re-enable widening of comparisons
        with non-paradoxical subregs of non-REG expressions.

Alan Modra's (self-described) "band-aid" patch was never ideal. It's not unreasonable for combine to eliminate an AND expression with a read from memory, if the paradoxical subreg semantics for the target imply zero extension. It's the later "unsafe" simplification of the comparison that was at fault. Hence the current situation, where simplify_comparison has been fixed and we don't needlessly disable a useful optimization with Alan's patch, is the most appropriate outcome. Alan's work-around would have been suitable for a release branch, if we didn't yet have the correct fix or such a fix was too intrusive. -- roger at eyesopen dot com changed: What|Removed |Added ---------------- CC||roger at eyesopen dot com Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5169
[Bug c/7776] const char* p = "foo"; if (p == "foo") ... is compiled without warning!
--- Comment #14 from roger at eyesopen dot com 2005-12-06 15:39 --- Fixed for gcc v4.2 -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Target Milestone|--- |4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=7776
[Bug c++/25263] [4.2 regression] ICE on invalid array bound: int x[1/0];
--- Comment #3 from roger at eyesopen dot com 2005-12-06 15:43 --- Fixed. I've checked all other uses of TREE_OVERFLOW in cp/decl.c and c-decl.c to confirm that the C front-end isn't affected by a similar issue there. Sorry for any inconvenience. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25263
[Bug rtl-optimization/25432] [4.1/4.2 Regression] Reload ICE in gen_add2_insn
--- Comment #8 from roger at eyesopen dot com 2005-12-22 16:05 --- Alan's patch has already been approved by Ian here: http://gcc.gnu.org/ml/gcc-patches/2005-12/msg01397.html I think it would also be a good idea to add the original bugzilla test case, from comment #1, to the testsuite, to prevent future problems. Pre-approved :-) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25432
[Bug target/25213] [3.4 only] -fpic/-fPIC testsuite failures in gcc.dg/i386-387-3.c and i386-387-4.c
--- Comment #2 from roger at eyesopen dot com 2005-12-29 19:42 --- Investigating further, PR25213 looks like a duplicate of PR23098. In that bugzilla trail, Andrew correctly identified it as a regression from gcc 3.2.3 when using -fpic/-fPIC on x86, but the PR was closed once the fix was applied to 4.0 and later. I suspect that Jakub's fix for this problem (committed to the 4.0 branch in comment #11) is the correct fix, i.e. the changed splitter in i386.md is where mainline currently selects the fldpi instruction. Looking through the patch it seems safe enough for the 3.4.x branch; I'll try bootstrapping and regression testing a backport. -- roger at eyesopen dot com changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|0000-00-00 00:00:00 |2005-12-29 19:42:15 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25213
[Bug other/22313] [4.2 Regression] profiledbootstrap is broken on the mainline
--- Comment #39 from roger at eyesopen dot com 2006-07-24 00:45 --- My latest analysis and a possible patch/workaround have been posted here: http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01015.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22313
[Bug middle-end/28473] [4.0/4.1/4.2 Regression] with -O, casting result of round(x) to uint64_t produces wrong values for x > INT_MAX
--- Comment #6 from roger at eyesopen dot com 2006-07-25 20:02 --- Grr. I've just noticed richi has just assigned this PR to himself. I also have a patch that has been bootstrapped and has nearly finished regression testing, that I was just about to post/commit. richi, what does your fix look like? Mine contains several copies of:

      if (!TARGET_C99_FUNCTIONS)
        break;
!     if (outprec < TYPE_PRECISION (long_integer_type_node)
!         || (outprec == TYPE_PRECISION (long_integer_type_node)
!             && !TYPE_UNSIGNED (type)))
        fn = mathfn_built_in (s_intype, BUILT_IN_LCEIL);
+     else if (outprec == TYPE_PRECISION (long_long_integer_type_node)
+              && !TYPE_UNSIGNED (type))
+       fn = mathfn_built_in (s_intype, BUILT_IN_LLCEIL);
      break;

[Serves me right for not assigning this when pinski asked me to investigate. I knew there was a good reason I don't normally bother with recent PRs]. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28473
[Bug middle-end/28473] [4.0/4.1/4.2 Regression] with -O, casting result of round(x) to uint64_t produces wrong values for x > INT_MAX
--- Comment #7 from roger at eyesopen dot com 2006-07-25 20:08 --- Ahh, I've just found Richard's patch submission posting at http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01065.html I agree with Andrew Pinski; I think my changes are the better fix. We also need to investigate whether (unsigned int)round(x) is better implemented as (unsigned int)llround(x). For the time being, my patch doesn't perform this transformation, and using lround is unsafe. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28473
[Bug middle-end/28915] [4.2 regression] ICE: tree check: expected class 'constant', have 'declaration' (var_decl) in build_vector, at tree.c:973
--- Comment #11 from roger at eyesopen dot com 2006-09-06 15:27 --- Hmm, yep, I guess it was caused by my change, most probably this part of it:

        * tree.c (build_constructor_single): Mark a CONSTRUCTOR as constant,
        if all of its elements/components are constant.
        (build_constructor_from_list): Likewise.

It looks like someplace is changing the contents of this CONSTRUCTOR to a VAR_DECL "t.0", but not resetting the TREE_CONSTANT flag. Hence on PPC we end up with a bogus constant constructor during RTL expansion!? Scalar replacement perhaps?? Grr. I'll investigate. Sorry for the inconvenience. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28915
[Bug middle-end/28915] [4.2 regression] ICE: tree check: expected class 'constant', have 'declaration' (var_decl) in build_vector, at tree.c:973
--- Comment #12 from roger at eyesopen dot com 2006-09-06 15:36 --- Here's the .102t.final_cleanup:

;; Function f (f)

f ()
{
  int D.1524;
  int D.1522;
  int D.1520;
  int t.0;

<bb 2>:
  t.0 = (int) &t;
  D.1520 = (int) &t[1];
  D.1522 = (int) &t[2];
  D.1524 = (int) &t[3];
  return {t.0, D.1520, D.1522, D.1524};
}

The CONSTRUCTOR in the return incorrectly has the TREE_CONSTANT flag set. So the problem is somewhere in tree-ssa. One workaround/improvement might be for out-of-ssa to reconstitute the constructor back to a constant. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28915
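[Reading the dump back into source form, the testcase presumably resembles the following; a reconstruction, where the exact declarations are a guess and the pointer-to-int casts assume a 32-bit target such as the PPC configuration where this ICEs.]

typedef int v4si __attribute__ ((vector_size (16)));

int t[4];

v4si f (void)
{
  /* Element addresses are only known at link time, so the
     CONSTRUCTOR must not be marked TREE_CONSTANT after the casts
     are replaced by the variable t.0.  */
  return (v4si) { (int) &t[0], (int) &t[1],
                  (int) &t[2], (int) &t[3] };
}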
[Bug libgomp/28296] [4.2 Regression] libgomp fails to configure on Tru64 UNIX
--- Comment #7 from roger at eyesopen dot com 2006-09-11 16:36 --- Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2006-09/msg00406.html -- roger at eyesopen dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |roger at eyesopen dot com |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Last reconfirmed|0000-00-00 00:00:00 |2006-09-11 16:36:30 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28296
[Bug bootstrap/28784] [4.2 regression] Bootstrap comparison failure
--- Comment #6 from roger at eyesopen dot com 2006-09-11 16:52 --- I believe I have a patch. I'm just waiting for the fix for PR28672 (which I've just approved) to be applied, so I can complete bootstrap and regression test to confirm there are no unexpected side-effects. -- roger at eyesopen dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |roger at eyesopen dot com |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Last reconfirmed|0000-00-00 00:00:00 |2006-09-11 16:52:53 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28784
[Bug c/29132] [4.2 Regression] Mips exception handling broken.
--- Comment #1 from roger at eyesopen dot com 2006-09-18 21:27 --- Hi David, I was wondering, if you have a MIPS tree handy, whether you could easily test the following single line patch:

Index: dwarf2out.c
===================================================================
*** dwarf2out.c (revision 117035)
--- dwarf2out.c (working copy)
*************** dwarf2out_begin_prologue (unsigned int l
*** 2572,2578 ****
    fde = &fde_table[fde_table_in_use++];
    fde->decl = current_function_decl;
    fde->dw_fde_begin = dup_label;
!   fde->dw_fde_current_label = NULL;
    fde->dw_fde_hot_section_label = NULL;
    fde->dw_fde_hot_section_end_label = NULL;
    fde->dw_fde_unlikely_section_label = NULL;
--- 2572,2578 ----
    fde = &fde_table[fde_table_in_use++];
    fde->decl = current_function_decl;
    fde->dw_fde_begin = dup_label;
!   fde->dw_fde_current_label = dup_label;
    fde->dw_fde_hot_section_label = NULL;
    fde->dw_fde_hot_section_end_label = NULL;
    fde->dw_fde_unlikely_section_label = NULL;

Due to all the abstraction with debugging formats, it's difficult to tell the order in which things get executed, and whether this initial value for dw_fde_current_label survives long enough to avoid use of a set_loc. Many thanks in advance, -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29132
[Bug middle-end/26983] [4.0 Regression] Missing label with builtin_setjmp/longjmp
--- Comment #16 from roger at eyesopen dot com 2006-09-22 15:40 --- Fixed everywhere. Eric even has an improved patch/fix for mainline, but the backports of this change are sufficient to resolve the current PR. Thanks to Steven for coming up with the solution. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26983
[Bug debug/29132] [4.1 Regression] Mips exception handling broken.
--- Comment #7 from roger at eyesopen dot com 2006-09-22 16:51 --- Fixed on mainline (confirmed on mips-sgi-irix6.5). It'll take another day or two to backport to the 4.1 branch, as bootstrap and regtest on MIPS takes a while. -- roger at eyesopen dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |roger at eyesopen dot com |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Known to fail|4.1.2 4.2.0 |4.1.2 Known to work|4.1.1 |4.1.1 4.2.0 Last reconfirmed|0000-00-00 00:00:00 |2006-09-22 16:51:25 date|| Summary|[4.1/4.2 Regression] Mips |[4.1 Regression] Mips |exception handling broken. |exception handling broken. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29132
[Bug middle-end/22370] Vec lower produces mis-match types
--- Comment #2 from roger at eyesopen dot com 2007-01-28 02:58 --- Hi Andrew, could you recheck whether you can reproduce this problem on mainline? Updating the MODIFY_EXPR patch in PR 22368 to check GIMPLE_MODIFY_STMT, I'm unable to reproduce this failure on x86_64-unknown-linux-gnu, even with -m32. There has been at least one type clean-up patch to veclower, so I suspect this issue may have been resolved. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22370
[Bug rtl-optimization/17236] inefficient code for long long multiply on x86
--- Comment #5 from roger at eyesopen dot com 2007-02-02 00:17 --- It looks like Ian's recent subreg lowering pass patch has improved code generation on this testcase. Previously, we'd spill three integer registers to the stack for "LLM"; we're now down to two. [A significant improvement from the five we spilled when this bug was reported]

Before:

LLM:    subl    $12, %esp
        movl    %ebx, (%esp)
        movl    28(%esp), %edx
        movl    20(%esp), %ebx
        movl    16(%esp), %ecx
        movl    24(%esp), %eax
        movl    %esi, 4(%esp)
        movl    %edx, %esi
        movl    %edi, 8(%esp)
        movl    %ebx, %edi
        movl    (%esp), %ebx
        imull   %ecx, %esi
        imull   %eax, %edi
        mull    %ecx
        addl    %edi, %esi
        movl    8(%esp), %edi
        leal    (%esi,%edx), %edx
        movl    4(%esp), %esi
        addl    $12, %esp
        ret

After:

LLM:    subl    $8, %esp
        movl    %ebx, (%esp)
        movl    20(%esp), %eax
        movl    %esi, 4(%esp)
        movl    24(%esp), %ecx
        movl    12(%esp), %esi
        movl    16(%esp), %ebx
        imull   %esi, %ecx
        imull   %eax, %ebx
        mull    %esi
        movl    4(%esp), %esi
        addl    %ebx, %ecx
        movl    (%esp), %ebx
        addl    $8, %esp
        leal    (%ecx,%edx), %edx
        ret

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17236
[Bug middle-end/24427] missing optimization opportunity with binary operators
--- Comment #10 from roger at eyesopen dot com 2007-02-18 18:10 --- Hi Eric, It's not PR24427 that's the motivation for this backport, but PR 28173. In fact, it was *your* request in comment #2 of PR28173 to backport this! I'm a little disappointed you'd even question my decision/authority to backport a regression fix. :-) Roger -- -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24427
[Bug middle-end/30744] [4.2/4.3 Regression] ICE in compare_values, at tree-vrp.c:466
--- Comment #4 from roger at eyesopen dot com 2007-03-06 16:32 --- This should now be fixed on both mainline and the 4.2 release branch. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Known to fail|4.2.0 4.3.0 | Known to work|4.1.1 |4.1.1 4.2.0 4.3.0 Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30744
[Bug rtl-optimization/28173] [4.0/4.1 regression] misses constant folding
--- Comment #6 from roger at eyesopen dot com 2007-03-08 01:55 --- I suspect this problem is now fully resolved. The patch for PR24427 has been backported to the gcc-4_1-branch, and additionally on mainline, simplify-rtx.c has been enhanced to also perform the missed-optimization at the RTL level. Given that the 4.0 branch is now closed, I believe this is sufficient to close this PR. -- roger at eyesopen dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28173
[Bug fortran/31620] [4.3 regression] Zeroing one component of array of derived types zeros the whole structure.
--- Comment #10 from roger at eyesopen dot com 2007-04-23 20:54 --- Many thanks to Paul for fixing this, and my apologies for being overloaded at work and not being available to investigate it fully myself. I believe that Paul's fix of explicitly checking expr1->ref->next is the correct way to determine whether a reference is too complex. My confusion is that this condition should already be checked/verified by the call to gfc_full_array_ref_p on the line immediately following his change. So on line 1124 of dependency.c in gfc_f_a_r_p is the clause

  if (ref->next)
    return false;

which should be doing exactly the same thing. The reason I mention this is perhaps GCC is miscompiling itself, and this gfortran failure is the visible manifestation. Alternatively, perhaps ref->next isn't getting set properly, or is getting clobbered somehow. Paul, does your new testcase fail without your fix? My apologies again if I'm missing something obvious. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31620
[Bug fortran/31620] [4.3 regression] Zeroing one component of array of derived types zeros the whole structure.
--- Comment #11 from roger at eyesopen dot com 2007-04-23 21:05 --- Duh! I am missing something obvious! The ref->u.ar.type == AR_FULL test on line 1120 returns true. The test for ref->next needs to be moved earlier. Sorry again for the inconvenience. Clearly, my brain isn't working properly at the moment :-( -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31620
[Bug target/33545] New: [4.3 regression] Bootstrap failure/broken exceptions on alpha/Tru64
I just tried compiling mainline on my dusty alphaev67-dec-osf5.1 and discovered that recent RTL CFG changes have broken the way that exceptions are implemented on alpha/Tru64. Natively, this is seen with "configure" and "make bootstrap" as a breakage configuring libstdc++-v3, where the "exception model" can't be detected in configure. The underlying cause is that exceptions now seem to be fundamentally broken on this target. The simple source file:

struct S { ~S(); };
void bar();

void foo()
{
  S s;
  bar();
}

triggers the following ICE:

conf.C: In function 'void foo()':
conf.C:7: error: end insn 27 for block 7 not found in the insn stream
conf.C:7: error: head insn 24 for block 7 not found in the insn stream
conf.C:7: internal compiler error: Segmentation fault

The problem is reproducible with a cross-compiler (for example from x86_64-linux) which is configured as "configure --target=alphaev67-dec-osf5.1" followed by "make". The build fails attempting to build mips-tfile, but not before creating a suitable "cc1plus" which can be used to demonstrate the problem/logic error. The underlying cause appears to be associated with the alpha backend's use of the middle-end RTL function "emit_insn_at_entry". Using grep, it appears "alpha" is the only backend that uses this function, and I suspect recent changes to its implementation (Honza?) are responsible for the breakage. The flow of events is that in the rest_of_handle_eh pass, except.c:finish_eh_generation calls gen_exception_receiver. In alpha.md, the define_expand for "exception_receiver" makes use of the function "alpha_gp_save_rtx" when TARGET_LD_BUGGY_LDGP is defined (as on Tru64 with the native ld). The implementation of alpha_gp_save_rtx in alpha/alpha.c creates a memory slot (with assign_stack_local) and then generates some RTL to initialize it, which it tries to insert at the start of the function using "emit_insn_at_entry". The logic error is that emit_insn_at_entry as currently implemented in cfgrtl.c uses insert_insn_on_edge and commit_edge_insertions. This occurs during RTL expansion when some of the basic blocks are not yet fully constructed, hence the expected invariants are not correctly satisfied, leading to the errors and the segmentation fault. The only other caller of emit_insn_at_entry appears to be integrate.c's emit_initial_value_sets. However, I'm guessing that because this doesn't occur during RTL expansion, there's no problem with using commit_edge_insertions. I'm not sure what the correct way to fix this is. I'd like to say that this is a middle-end problem with the change in semantics/implementation of emit_insn_at_entry causing a regression. However, because it's only the alpha that's misbehaving, it's probably more appropriate to classify this as a target bug, and get Jan/RTH or someone familiar with how this should work to correct alpha.c's alpha_gp_save_rtx. One approach might be to always emit the memory write in the function prologue, and rely on later RTL passes to eliminate the dead store and reclaim the unused stack slot from the function's frame. I'm happy to test (and approve) patches for folks who don't have access to this hardware. -- Summary: [4.3 regression] Bootstrap failure/broken exceptions on alpha/Tru64 Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: roger at eyesopen dot com GCC target triplet: alphaev67-dec-osf5.1 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33545
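[For reference, emit_insn_at_entry is roughly the following; a paraphrase of the cfgrtl.c code, with details abbreviated, which makes clear why it assumes a fully built CFG.]

/* Paraphrase, not the exact source: queue INSN on the entry edge and
   immediately commit all pending edge insertions -- valid only once
   the CFG invariants hold, which is not yet the case during RTL
   expansion.  */
void
emit_insn_at_entry (rtx insn)
{
  edge e = EDGE_SUCC (ENTRY_BLOCK_PTR, 0);

  insert_insn_on_edge (insn, e);
  commit_edge_insertions ();
}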
[Bug target/33545] [4.3 regression] Bootstrap failure/broken exceptions on alpha/Tru64
--- Comment #1 from roger at eyesopen dot com 2007-10-13 04:14 --- Many thanks to Eric Botcazou! It turns out that this bug was a duplicate of PR target/32325. I can confirm that with Eric's fix, once I'd committed my libstdc++ patch for the EOVERFLOW issue (mentioned by Eric in PR23225's comment #6), and with Alex's SRA fixes no longer miscompiling mips-tfile, I can once again bootstrap mainline on alphaev67-dec-osf5.1! Thanks again to Eric for a speedy fix. I'm sorry that this was a dupe, and that it took a while for mainline to stabilize to the point that I could confirm the problem is resolved. *** This bug has been marked as a duplicate of 32325 *** -- roger at eyesopen dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33545
[Bug target/32325] [4.3 Regression] cc1plus ICE configuring libstdc++ on Tru64 UNIX V5.1B: SEGV in rtl_verify_flow_info
--- Comment #7 from roger at eyesopen dot com 2007-10-13 04:14 --- *** Bug 33545 has been marked as a duplicate of this bug. *** -- roger at eyesopen dot com changed: What|Removed |Added CC||roger at eyesopen dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32325
[Bug bootstrap/33781] New: [4.3 Regression] "Arg list too long" building libgcc.a
The recent addition of a large number of libgcc objects (for fixed point arithmetic and other things) now breaks bootstrap on IRIX. The problem is that the command line in libgcc/Makefile.in, approx line 697 reads as: $(AR_CREATE_FOR_TARGET) $@ $$objects which doesn't defend against $objects being a huge list. Currently on 32-bit IRIX this is 1762 files. Indeed, even typing "ls *.o" in the directory mips-sgi-irix6.5/32/libgcc, returns "-bash: /usr/bin/ls: Arg list too long"! Alas I'm not a wizard in build machinery, but I suspect that all that's required is a one or two line change, perhaps to use "libtool" to create the archive, which contains logic to circumvent these host command line limits. I believe this is what we currently do for libgcj and other large libraries. Many thanks in advance to the kind build maintainer or volunteer who looks into the problem. I'm happy to test patches on my dusty MIPS/IRIX box. -- Summary: [4.3 Regression] "Arg list too long" building libgcc.a Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: bootstrap AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: roger at eyesopen dot com GCC host triplet: mips-sgi-irix6.5 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug middle-end/19988] [4.0 Regression] pessimizes fp multiply-add/subtract combo
--- Additional Comments From roger at eyesopen dot com 2005-02-16 19:17 --- Hmm. I don't think the problem in this case is at the tree-level, where I think keeping X-(Y*C) and -(Y*C) as a more canonical X + (Y*C') and Y*C' should help with reassociation and other tree-ssa optimizations. Indeed, it's these types of transformations that have enabled the use of fmadd on the PowerPC for mainline. The regression however comes from the (rare) interaction when a floating point constant and its negative now need to be stored in the constant pool. It's only when X and -X are required in a function (potentially in short succession) that this is a problem, and then only on machines that need to load floating point constants from memory (AVR and other platforms with immediate floating point constants, for example, are unaffected). Some aspects of keeping X and -X in the constant pool were addressed by my patch quoted in comment #1, which attempts to keep floating point constants positive *when* this doesn't interfere with GCC's other optimizations. I think the correct solution to this regression is to improve CSE/GCSE to recognize that X*C can be synthesized from a previously available X*(-C) at the cost of a negation, which is presumably cheaper than a multiplication on most platforms. Indeed, there's probably a set of targets for which loading a positive constant from the constant pool and then negating it is cheaper than loading both a positive constant and then loading a negative constant. Unfortunately, I doubt whether it'll be possible to simultaneously address this performance regression without reintroducing the 3.x issue mentioned in the original "PS". I doubt that on many platforms two multiply-adds are much faster than a single floating point multiplication whose result is shared by two additions. Though again it might be possible to do something at the RTL level, especially if duplicating the multiplication is a win with -Os. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19988
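[A minimal illustration of the rare case in question; an invented example, not the PR's testcase.]

/* After folding x*2.5 - y*2.5 to x*2.5 + y*(-2.5), both 2.5 and
   -2.5 may be needed in the constant pool, although a CSE pass could
   synthesize one from the other with a single negation.  */
double f (double x, double y)
{
  return x * 2.5 - y * 2.5;
}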
[Bug middle-end/19988] [4.0 Regression] pessimizes fp multiply-add/subtract combo
--- Additional Comments From roger at eyesopen dot com 2005-02-19 05:41 --- Re: comment #5 For floating point expressions, -(A+B) is only transformed into (-A)-B or (-B)-A when the user explicitly specifies -ffast-math, i.e. only when flag_unsafe_math_optimizations is true. Re: comment #6 Interesting. Although on a handful of rs6000 cores (mpccore, 601 and 603) a fused multiply-add is more expensive than an addition, it's always a win to perform two fma's rather than a mult and two adds. It might be possible (with some work) to teach combine to un-CSE the following:

double x;
double y;

void foo(double p, double q, double r, double s)
{
  double t = p * q;
  x = t + r;
  y = t + s;
}

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19988
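[For comparison, the un-CSE'd form combine would need to produce; illustrative only, reusing the globals x and y from the snippet above.]

/* Duplicating the multiplication lets each statement match a fused
   multiply-add pattern.  */
void foo_uncse (double p, double q, double r, double s)
{
  x = p * q + r;  /* fmadd */
  y = p * q + s;  /* fmadd */
}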
[Bug rtl-optimization/336] Superfluous instructions generated from bit-field operations
--- Additional Comments From roger at eyesopen dot com 2005-02-19 19:56 --- This bug has now been fixed for gcc 4.0. For the testcase attached to the PR, mainline now generates the following code on sparc-sun-solaris2.8 with -O2:

fun:    ld      [%sp+64], %o5
        sll     %o0, 2, %g1
        mov     %o5, %o0
        or      %g1, 2, %g1
        jmp     %o7+12
        st      %g1, [%o5]

i.e. we've now eliminated the unnecessary "and" and "or" instructions that were present in 2.95.2 (and still present in 3.4.3). -- What|Removed |Added Status|SUSPENDED |RESOLVED Resolution||FIXED Target Milestone|--- |4.0.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=336
[Bug middle-end/19466] [meta-bug] bit-fields are non optimal
-- Bug 19466 depends on bug 336, which changed state. Bug 336 Summary: Superfluous instructions generated from bit-field operations http://gcc.gnu.org/bugzilla/show_bug.cgi?id=336 What|Old Value |New Value Status|SUSPENDED |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19466
[Bug c++/19199] [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-03-09 01:28 --- Subject: Re: [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary

On 8 Mar 2005, Alexandre Oliva wrote:
> * fold-const.c (non_lvalue): Split tests into...
> (maybe_lvalue_p): New function.
> (fold_ternary): Use it to avoid turning a COND_EXPR lvalue into
> a MIN_EXPR rvalue.

This version is Ok for mainline, and currently open release branches. Thanks, Roger -- -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug middle-end/18628] [4.0/4.1 regression] miscompilation of switch statement in loop
--- Additional Comments From roger at eyesopen dot com 2005-03-09 16:13 --- Subject: Re: [PR middle-end/18628] do not fold to label load from tablejump to reg

On 9 Mar 2005, Alexandre Oliva wrote:
> This patch is meant to implement suggestion #3 proposed to fix the bug
> by Roger Sayle and selected by RTH in bugzilla. So far, I've only
> verified that it fixes the testcase included in the patch.
>
> Alexandre Oliva <[EMAIL PROTECTED]>
>
> PR middle-end/18628
> * cse.c (fold_rtx_mem): Instead of returning the label extracted
> from a tablejump, add it as an REG_EQUAL note, if the insn loaded
> from the table to a register.
> (cse_insn): Don't use it as src_eqv.

Thanks! OK for mainline if bootstrap and regression testing passes. Once this patch has been on mainline for a few days, to check that targets with different forms of tablejump and conditional branches don't have issues, OK to backport to the 4.0 branch. Thanks also to RTH for selecting which of the proposals in the bugzilla PR he preferred. I'll admit that I hadn't noticed he'd commented on them until you'd posted this patch. However, full credit goes to your patch. I hadn't appreciated that the problematic transformation takes place in fold_rtx_mem, which has the instruction context, allowing us to perform this transformation when it's safe (i.e. when we're directly converting a tablejump into an unconditional jump) but to avoid the problematic case of hoisting a label_ref into a register that can then escape. Cool.

> The thought of adding the REG_EQUAL note was to help other passes that
> might want to turn the indirect jump into a direct jump. I'm not sure
> this may actually happen. I'm also not sure how much this will help.

If you do encounter any problems with your patch, my first instinct would be to investigate not bothering with the REG_EQUAL note. We've had issues in the past with whether label_refs in REG_EQUAL notes are counted in the label's NUSES, and similar ugly corner cases. If there's no measurable performance impact, we're probably better off without the risk [on the 4.0 branch at least].

> Bootstrap and regtesting starting shortly. Ok to install if they
> pass?

Please forgive me for commenting on this, but it's kind of a pet peeve. There really are no patches so urgent that they can't be bootstrapped and regression tested before posting to gcc-patches. Even "obvious" fixes to target-independent bootstrap failures can afford the few hours it takes to confirm changes work and are safe. Indeed the language (and emphasis) in contribute.html is (are) quite clear on the matter. My apologies for bringing this up now, as you're certainly not amongst the worst offenders in this regard. Many thanks again for tackling high-priority PR middle-end/18628. Roger -- -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18628
[Bug middle-end/20493] [4.0/4.1 Regression] Bootstrap failure because of aliased symbols
--- Additional Comments From roger at eyesopen dot com 2005-03-17 05:06 --- Hmm, yep, probably caused by my change. It looks like with my change fold_widened_comparison is now converting (int)t == -1 into the equivalent t == (typeof(t))-1. Normally, this would be reasonable, but the "special" semantics of HAVE_canonicalize_funcptr_for_compare clearly screw this up. My suggestion would be to add the following to the top of the subroutine fold_widened_comparison:

#ifdef HAVE_canonicalize_funcptr_for_compare
  /* Disable this optimization if we're casting to a function pointer
     type on targets that require function pointer canonicalization.  */
  if (HAVE_canonicalize_funcptr_for_compare
      && TREE_CODE (shorter_type) == POINTER_TYPE
      && TREE_CODE (TREE_TYPE (shorter_type)) == FUNCTION_TYPE)
    return NULL_TREE;
#endif

Dave, could you give this a try and see if it restores bootstrap for you? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20493
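[For concreteness, a sketch of the flavour of code affected, taken from the analysis above.]

typedef void (*fptr) (void);

/* fold_widened_comparison was rewriting (int) t == -1 as
   t == (fptr) -1, bypassing the canonicalize_funcptr_for_compare
   sequence such targets need for function-pointer comparisons.  */
int f (fptr t)
{
  return (int) t == -1;
}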
[Bug middle-end/20539] [4.0/4.1 Regression] ICE in simplify_subreg, at simplify-rtx.c:3674
--- Additional Comments From roger at eyesopen dot com 2005-03-20 16:47 --- Patch here http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01871.html -- What|Removed |Added Keywords||patch http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20539
[Bug c++/19199] [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-03-25 06:03 --- Splitting non_lvalue into maybe_lvalue_p is a good thing, is totally safe, and is preapproved for both mainline and the 4.0 branch. The remaining changes to fold_ternary/fold_cond_expr_with_comparison are more controversial, and can theoretically be discussed independently. Reading through each of the transformations of COND_EXPR in fold, all of the problematic transformations are guarded in the block beginning at line 4268 of fold-const.c. This is the set of "A op B ? A : B" transformations. All other transformations are either lvalue-safe, or require either operand 2 or operand 3 to be a non-lvalue (typically operand 3 must be a constant). I believe a suitable 4.0-timescale (grotesque hack workaround) is (untested):

--- 4265,4275 ----
     a number and A is not.  The conditions in the original
     expressions will be false, so all four give B.  The min()
     and max() versions would give a NaN instead.  */
!  if (operand_equal_for_comparison_p (arg01, arg2, arg00)
!      && (in_gimple_form
!          || strcmp (lang_hooks.name, "GNU C++") != 0
!          || ! maybe_lvalue_p (arg1)
!          || ! maybe_lvalue_p (arg2)))
    {
      tree comp_op0 = arg00;
      tree comp_op1 = arg01;

The maybe_lvalue_p tests should be obvious from the previous versions of Alexandre's patch. The remaining two lines are both hideous hacks. The first is that these transformations only need to be disabled for the C++ front-end. This is (AFAIK) the only language front-end that uses COND_EXPRs as lvalues, and disabling the transformation of "x > y ? x : y" into MAX_EXPR (where x and y are VAR_DECLs) is less than ideal for the remaining front-ends. The second equally bad hack is to re-allow this transformation even for C++ once we're in the tree-ssa optimizers. I believe once we're in gimple, COND_EXPR is no longer allowed as the lhs of an assignment, hence the MAX_EXPR/MIN_EXPR recognition transformations at the gimple level should be able to clean up any rvalue COND_EXPRs they're presented with. Clearly, testing lang_hooks.name is an option of last resort. I'm increasingly convinced that the correct long term solution is to introduce a new LCOND_EXPR tree node for use by the C++ front-end. Either as a C++-only tree code, or a generic tree code. Additionally, depending upon whether we go ahead and deprecate >?= and <?=... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug c++/19199] [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-04-03 03:20 ---

> Excuse me for asking, but what is it that makes the latest patch I posted
> not reasonable for the 4.0 timeframe?

The performance regression on C, Java, Ada and Fortran code that isn't affected by this bug. The bug is marked with the "c++" component because it only affects the C++ front-end. A fix that disables MIN_EXPR/MAX_EXPR optimizations in all front-ends is not suitable for a release branch without SPEC testing to show how badly innocent targets will get burnt! There may be lots of places in the Linux kernel that depend upon generating min/max insns for performance reasons... I'd hoped I'd made this clear when I proposed the alternate strategy of disabling this optimization *only* in the C++ front-end, and *only* prior to tree-ssa. Whilst I agree this is a serious bug, it currently affects all release branches, so taking a universal performance hit to resolve it without considering the consequences seems a bad decision. Just grep for "MIN (" and "MAX (" in GCC's own source code to see how badly this could impact the compiler's own code/performance/bootstrap times. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug c++/19199] [3.3/3.4/4.0/4.1 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-04-04 16:02 --- Subject: Re: [Committed] PR c++/19199: Preserve COND_EXPR lvalueness in fold

Hi Alex, My apologies yet again for not being more explicit about all of the things that were wrong (and/or that I was unhappy with) in your proposed solution. I'd hoped that I was clear in the comments in the bugzilla thread, and that you'd appreciate the issues it addressed. Problems with your approach:

[1] The use of pedantic_lvalues to identify the non-C front-ends adversely affects code generation in the Java, Fortran and Ada front-ends. The use of COND_EXPRs as lvalues is unique to the C++ front-end, so ideally a fix shouldn't regress code quality on innocent front-ends. Certainly not without benchmarking to indicate how significant a performance hit these other languages are taking, prior to a release.

[2] The pedantic_lvalues flag is itself a hack used by the C front-end, that is currently being removed by Joseph in his clean-up of the C parser. Adding this use would block his efforts until an alternate solution is found. Admittedly, this isn't an issue for the 4.0 release, but creates more work or a regression once this is removed from mainline.

[3] Your patch is too invasive. Compared to the four-line counter proposal that disables just the problematic class of transformations, your much larger patch inherently contains a much larger risk. For example, there is absolutely no need to modify the source code on the "A >= 0 ? A : -A -> abs (A)" paths, as these transformations could never interfere with the lvalueness of an expression. Additionally, once one of the longer term solutions proposed by Mark or me is implemented, all of these workarounds will have to be undone/reverted. By only affecting a single clause, we avoid the potential for leaving historic code bit-rotting in the tree.

[4] Despite your counter claims, your approach *does* inhibit the ability of the tree-ssa optimizers to synthesize MIN_EXPR and MAX_EXPR nodes. Once the in_gimple_form flag is set, fold-const.c is able to optimize "A == B ? A : B -> B" even when compiling C++, as it knows that a COND_EXPR can't be used as an lvalue in the late middle-end.

[5] For the immediate term, I don't think it's worth worrying about converting non-lvalues into lvalues; the wrong-code bugs and diagnostic issues are related solely to the lvalue -> non-lvalue transition. At this stage of pre-release, lower risk changes are clearly preferable, and making this change will break code that may have erroneously compiled in the past. Probably OK for 4.0, but not for 3.4 (which also exhibits this problem).

And although not serious enough to warrant a [6], it should be pointed out that several of your recent patches have introduced regressions. Indeed, you've not yet reported that your patch has been successfully bootstrapped or regression tested on any target triple. Indeed, your approach of posting a patch before completing the prerequisite testing specified/stressed in contribute.html has, on more than one occasion recently, resulted in the patch having to be rewritten/tweaked. Indeed, as witnessed in "comment #17", I've already approved an earlier version of your patch once, only to later discover you were wasting my time. As a middle-end maintainer, having been pinged by the release manager that we/you weren't making sufficient progress towards addressing the issues with your patch, I took the opportunity to apply a fix that was within my authority to commit.
If you had made and tested the changes that I requested in a timely manner, I'm sure I'd have approved those efforts by now.

My apologies again for not being blunt earlier. My intention was that by listing all of the benefits of an alternate approach in comment #19 of the bugzilla PR, I wouldn't have to explicitly list them as the deficiencies of your approach. Some people prefer the carrot to the stick with patch reviews [others like RTH's "No"].

Perhaps I should ask the counter-question to your comment #21: in what way do you feel that the committed patch isn't clearly superior to your proposed solution?

p.s. Thanks for spotting my mistake of leaving a bogus comment above maybe_lvalue_p.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
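For readers following along, a minimal illustration (my own sketch, not from the PR) of the construct at the heart of PR19199: a COND_EXPR used as an lvalue, valid C++ but not C, which is precisely the lvalueness that fold must not destroy when folding conditionals:

  int x, y;

  void pick(bool b) {
    (b ? x : y) = 42;   // valid C++: the conditional expression is an lvalue
  }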
[Bug target/20126] [3.3/3.4/4.0/4.1 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-05 04:22 ---

The stricter version is mostly OK, except for one correction and one suggestion.

The correction is that in the case where the replacement wasn't a register, you shouldn't be calling validate_change_maybe_volatile inside a gcc_assert. When ENABLE_ASSERT_CHECKING is disabled, the side-effects of this statement will be lost (i.e. the replacement attempt using the new pseudo). You should instead try "if (!validate_change_maybe_volatile (...)) gcc_unreachable();" or alternatively use a temporary variable.

The minor suggestion is that any potential performance impact of this change can be reduced further by tweaking validate_change_maybe_volatile to check whether "object" contains any volatile mem references before attempting all of the fallback/retry logic. Something like:

  int
  validate_change_maybe_volatile (rtx object, rtx *loc, rtx new)
  {
    if (validate_change (object, loc, new, 0))
      return 1;

    if (volatile_ok
  +     || !for_each_rtx (&object, volatile_mem_p, 0)
        || !insn_invalid_p (object))
      return 0;
    ...

This has the "fail fast" advantage that if the original instruction didn't contain any volatile memory references (the typical case), we don't bother to attempt instruction recognitions with (and without) volatile_ok set. Admittedly, this new function is only called in one place which probably isn't on any critical path, but the above tweak should improve things (the for_each_rtx should typically be faster than the insn_invalid_p call, and certainly better than two calls to insn_invalid_p and one to validate_change when there's usually no need.)

p.s. I completely agree with your decision to implement a stricter test to avoid inserting/replacing/modifying volatile memory references.

Please could you bootstrap and regression test with the above changes and repost to gcc-patches? I'm prepared to approve with those changes, once testing confirms no unexpected interactions. Or if you disagree with the above comments, let me/someone know.

Thanks in advance,

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug c++/19199] [3.3/3.4 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-04-05 23:13 ---

Now that a fix has been applied to both mainline and the 4.0 branch, I've been investigating backporting the fix to 3.4. Unfortunately, a significant feature of the fixes proposed by Alex and me is that they disable the problematic transformations during parsing, but allow them to be caught by later tree-ssa passes. However, in gcc 3.4.x, if we disable constant folding of COND_EXPRs for C++, there are no tree-level optimizers that can take up the slack, and instead we'd rely on if-conversion to do what it can at the RTL level. I'm not convinced that 3.4 if-conversion is particularly effective in this respect (certainly across all targets), and I've not yet checked whether all of the affected transformations are implemented in ifcvt.c (even on mainline).

Might I propose that we close this bug as won't fix for both the 3.3 and 3.4 branches? GCC has always generated wrong code for this case, and the current state of 3.4.4 is that we issue an incorrect warning (which seems better than silently generating wrong code as we did with 3.1). It's a trade-off; but I'm keen to avoid degrading g++ 3.4.5 (in the common case) merely to fix a diagnostic regression. Or we could leave the PR open (and perhaps unassign it) in the hope that a 3.4 solution will be discovered, as the mainline/4.x fixes aren't suitable.

Mark? Alexandre?

--
What|Removed |Added
Known to fail|3.4.0 4.0.0 3.3.2 |3.4.0 3.3.2
Known to work||4.0.0 4.1.0
Summary|[3.3/3.4/4.0 Regression]|[3.3/3.4 Regression] Wrong |Wrong warning about |warning about returning a |returning a reference to a |reference to a temporary |temporary |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug c++/19199] [3.3/3.4 Regression] Wrong warning about returning a reference to a temporary
--- Additional Comments From roger at eyesopen dot com 2005-04-06 00:03 ---

That's my interpretation of Andrew Pinski's comment #6 in the bugzilla PR [n.b. I haven't reconfirmed his analysis personally].

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19199
[Bug target/20126] [3.3/3.4/4.0/4.1 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-08 17:03 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

Hi Alex,

On 8 Apr 2005, Alexandre Oliva wrote:
> Roger suggested some changes in the patch.  I've finally completed
> bootstrap and test with and without the patch on amd64-linux-gnu, and
> posted the results to the test-results list.  No regressions.  Ok to
> install?

Hmm. It looks like you misunderstood some of the comments in my review (comment #16 in the bugzilla PR)...

+      gcc_assert (validate_change_maybe_volatile (v->insn, v->location,
+                                                  reg));

This is still unsafe. If you look in system.h, you'll see that when ENABLE_ASSERT_CHECKING is undefined, the gcc_assert macro gets defined as:

  #define gcc_assert(EXPR) ((void)(0 && (EXPR)))

which means that EXPR will not get executed. Hence you can't put side-effecting statements (especially those whose changes you depend upon) naked inside a gcc_assert. Ahh, I now see the misunderstanding; you changed/fixed the other "safe" gcc_assert statement, and missed the important one that I was worried about. Sorry for the confusion.

Secondly:

+  if (volatile_ok
+      /* Make sure we're not adding or removing volatile MEMs.  */
+      || for_each_rtx (loc, volatile_mem_p, 0)
+      || for_each_rtx (&new, volatile_mem_p, 0)
+      || ! insn_invalid_p (object))
+    return 0;

The suggestion wasn't just to reorder the existing for_each_rtx to move these tests earlier, it was to confirm that the original "whole" instruction had a volatile memory reference in it, i.e. that this is a problematic case, before doing any more work. Something like:

+  if (volatile_ok
++     /* If there isn't a volatile MEM, there's nothing we can do.  */
++     || !for_each_rtx (&object, volatile_mem_p, 0)
+!     /* But make sure we're not adding or removing volatile MEMs.  */
+      || for_each_rtx (loc, volatile_mem_p, 0)
+      || for_each_rtx (&new, volatile_mem_p, 0)
+      || ! insn_invalid_p (object))
+    return 0;

This second change was just a micro-optimization, and I'd have approved your patch without it, but the use of gcc_assert in loop_givs_rescan is a real correctness issue.

Sorry again for the inconvenience,

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
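A self-contained sketch (mine, not from the PR) of the hazard described above: with checking disabled, gcc_assert expands to ((void)(0 && (EXPR))), and since && short-circuits, EXPR is never evaluated and its side effects vanish:

  #include <cstdio>

  #define gcc_assert(EXPR) ((void)(0 && (EXPR)))   // release-mode expansion

  static bool apply_change() { std::puts("change applied"); return true; }

  int main() {
    gcc_assert(apply_change());   // prints nothing: apply_change is never called
    return 0;
  }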
[Bug target/20126] [3.3/3.4/4.0/4.1 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-10 03:18 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

On 9 Apr 2005, Alexandre Oliva wrote:
> On Apr 8, 2005, Roger Sayle <[EMAIL PROTECTED]> wrote:
> > ++     /* If there isn't a volatile MEM, there's nothing we can do.  */
> > ++     || !for_each_rtx (&object, volatile_mem_p, 0)
>
> This actually caused crashes.  We don't want to scan the entire insn
> (it might contain NULLs), only the insn pattern.

Argh! Indeed, my mistake/oversight. Thanks.

>	PR target/20126
>	* loop.c (loop_givs_rescan): If replacement of DEST_ADDR failed,
>	set the original address pseudo to the correct value before the
>	original insn, if possible, and leave the insn alone, otherwise
>	create a new pseudo, set it and replace it in the insn.
>	* recog.c (validate_change_maybe_volatile): New.
>	* recog.h (validate_change_maybe_volatile): Declare.

This is OK for mainline, thanks.

Now that 4.0 is frozen for release candidate one, Mark needs to decide whether this patch can make it into 4.0.0 or will have to wait for 4.0.1. I also think we should wait for that decision before considering a backport for 3.4.x (or we'll have a strange temporal regression). I'd recommend committing this patch to mainline ASAP, so it can have a few days of testing before Mark has to make his decision.

Thanks again for your patience,

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-12 14:38 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

Hi Alexandre,

On 12 Apr 2005, Alexandre Oliva wrote:
> Does any expert in rtl loop care to chime in?

I'm not sure I qualify for the title "rtl loop" expert, but setting bl->all_reduced to zero after we fail to validate a change to the RTL looks to be a reasonable failure mode. I still like your fallbacks, in that by trying harder we perform better optimization, but as shown by the ARM's "stmia" instruction, I suspect there will always be cases where we can't reduce IV expressions in some backend instructions. Previously, we didn't even detect these cases and potentially generated bad code. I think the ICE was an improvement over the "potentially" bad code, but what we really need is a more graceful failure/degradation.

As you propose, I'd recommend something like (for your final clause):

	  /* If it wasn't a reg, create a pseudo and use that.  */
	  rtx reg, seq;
	  start_sequence ();
	  reg = force_reg (v->mode, *v->location);
!	  if (validate_change_maybe_volatile (v->insn, v->location, reg))
!	    {
!	      seq = get_insns ();
!	      end_sequence ();
!	      loop_insn_emit_before (loop, 0, v->insn, seq);
!	    }
!	  else
!	    {
!	      end_sequence ();
!	      if (loop_dump_stream)
!		fprintf (loop_dump_stream,
!			 "unable to reduce iv to register in insn %d\n",
!			 INSN_UID (v->insn));
!	      bl->all_reduced = 0;
!	      v->ignore = 1;
!	      continue;
!	    }

I think it's worthwhile keeping the validate_change_maybe_volatile calls/changes on mainline. But then for gcc 4.0.0 or 4.0.1 we can use the much simpler:

      if (v->giv_type == DEST_ADDR)
	/* Store reduced reg as the address in the memref where we found
	   this giv.  */
!	{
!	  if (!validate_change (v->insn, v->location, v->new_reg, 0))
!	    {
!	      if (loop_dump_stream)
!		fprintf (loop_dump_stream,
!			 "unable to reduce iv to register in insn %d\n",
!			 INSN_UID (v->insn));
!	      bl->all_reduced = 0;
!	      v->ignore = 1;
!	      continue;
!	    }
!	}

A much less intrusive regression fix than the previously proposed fix for 4.0. But perhaps one of the real "rtl loop" experts would like to comment?

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug middle-end/20739] [4.0/4.1 regression] ICE in gimplify_addr_expr
--- Additional Comments From roger at eyesopen dot com 2005-04-14 17:19 --- Thanks Alex! This is OK for mainline. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20739
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-14 17:37 ---

You'll notice in loop.c that everywhere we currently set all_reduced to zero, we also set ignore to true. This change is to avoid wasting CPU cycles: if we know that an IV can't be eliminated, there's no point in trying to modify any more instructions that use it. At best, this incurs wasted CPU cycles; at worst, we'll end up substituting in some places and not others, which will result in requiring both the original IV *and* the replacement IV, which will increase register pressure in the loop.

As your (Alex's) testing showed, I'm not sure that it's strictly required for correctness; it's mainly to preserve consistency with the existing all_reduced invariants by using the same idiom as used elsewhere, but also a potential compile-time/run-time micro-optimization. However, for 4.0 I thought it best to reuse/copy the existing idiom, rather than risk clearing all_reduced without setting ignore (which may have potentially exposed code paths not seen before).

We still need the 4.1 variant to be tested/committed to mainline.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-15 14:52 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

On 15 Apr 2005, Alexandre Oliva wrote:
> On Apr 12, 2005, Roger Sayle <[EMAIL PROTECTED]> wrote:
> > I still like your fallbacks, that by trying harder we perform
> > better optimization,
>
> The more I think about this, the more I have the impression that
> perhaps the fallbacks are not necessary.
> ...
> So I'm wondering if taking out all of the workarounds and going back
> to something like what is now in the 4.0 branch, except for the use of
> validate_change_maybe_volatile, wouldn't get us exactly what we want.
> ...
> Anyhow, in the meantime, could I check in the patch to fix Josh's
> arm-elf build failure?
> ...
> It would be nice to keep the hard failure in place for a bit longer,
> such that we stood a better chance of finding other situations that
> might require work arounds.

Sure. Your patch in comment #28 of bugzilla PR20126 is OK for mainline to resolve Josh's bootstrap failure. Sounds like you've already done the necessary testing, and I'll trust you on a suitable ChangeLog entry.

I agree with your proposed game plan of keeping the hard failure in place temporarily, to discover whether there are any other "fallback" strategies that would be useful. Ultimately though, I don't think we should close PR20126 until a "soft failure" is implemented on mainline, like we've (Jakub has) done on the gcc-4_0-branch (such as the mainline code proposed in comment #30).

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-17 00:21 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

Hi Alex,

On 16 Apr 2005, Alexandre Oliva wrote:
> On Apr 15, 2005, Roger Sayle <[EMAIL PROTECTED]> wrote:
> > I agree with your proposed game plan of keeping the hard failure in
> > place temporarily, to discover whether there are any other "fallback"
> > strategies that would be useful.  Ultimately though, I don't think we
> > should close PR20126 until a "soft failure" is implemented on mainline,
> > like we've (Jakub has) done on the gcc-4_0-branch (such as the
> > mainline code proposed in comment #30).
>
> But see, the problem with the soft failure mode is that, if it is ever
> legitimate to leave the giv alone and not make sure we set whatever
> register appears in it to the right value, then can't we always do it,
> removing all of the (thus useless) workarounds?
>
> And if there's any case in which it is not legitimate to do so, then
> the soft failure mode would be a disservice to the user, that would
> silently get a miscompiled program.  We should probably at least warn
> in this case.

I don't believe there are any cases in which it is not legitimate to leave the GIV alone, so we'll never silently miscompile anything. My understanding is that it's always possible to leave the giv alone (provided that we set all_reduced to false). The "workarounds", as we've got used to calling them, are not required for correctness, but for aggressive optimization. There's clearly a benefit to strength reducing GIVs, and the harder we try to replace them, the better the code we generate. Yes, they are (useless/not necessary) from a purely correctness point of view; we don't even have to call validate_change, we could just always give up and punt by clearing all_reduced (technically we don't have to perform any loop optimizations for correctness), but we'd generate pretty poor code.

The patch you proposed provides the soft failure mode we want (and now have on the release branch). We could, as you say, remove all of the current workarounds, and the only thing that would suffer is the quality of the code we generate. Needless to say, I'd prefer to keep these optimizations (for example, your recent one for Josh that allows us to strength reduce the ARM's stmia instruction). It's not unreasonable to try three or four approaches before giving up, and forcing the optimizers to preserve the original GIV.

Does this clear things up? Do you agree?

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug target/20126] [3.3/3.4/4.0 Regression] Inlined memcmp makes one argument null on entry
--- Additional Comments From roger at eyesopen dot com 2005-04-17 03:06 --- Subject: Re: [PR target/20126, RFC] loop DEST_ADDR biv replacement may fail

On 16 Apr 2005, Alexandre Oliva wrote:
> On Apr 16, 2005, Roger Sayle <[EMAIL PROTECTED]> wrote:
> > Does this clear things up?  Do you agree?
>
> Yup, for both questions.  Thanks for the clarification.  It wasn't
> clear to me that the assignments played any useful role, as soon as I
> found out the givs could be assumed to hold the correct value.  It
> all makes sense to me now.

Your patch (in comment #45) is OK for mainline, with a suitable ChangeLog entry. Hurray, we can close the PR.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20126
[Bug c/34720] ICE in real_to_decimal, at real.c:1656
--- Comment #7 from roger at eyesopen dot com 2008-01-29 01:12 --- I'm also seeing this same failure with "make profiledbootstrap" on x86_64-unknown-linux-gnu. A "make bootstrap" on the same machine completes and regression tests fine (14 unexpected failures in gcc). I suspect that the miscompilation is either non-deterministic or is caused by an optimization that only triggers on some targets and/or with additional profile information. Perhaps we should regression hunt for the change that broke things. It might not be anything to do with real.c or decimal floating point. -- roger at eyesopen dot com changed: What|Removed |Added CC| |roger at eyesopen dot com Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 GCC build triplet|powerpc64-unknown-linux-gnu |*64-unknown-linux-gnu GCC host triplet|powerpc64-unknown-linux-gnu |*64-unknown-linux-gnu GCC target triplet|powerpc64-unknown-linux-gnu |*64-unknown-linux-gnu Last reconfirmed|-00-00 00:00:00 |2008-01-29 01:12:50 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34720
[Bug fortran/31711] rhs array is changed while assigning to same lhs array
--- Comment #10 from roger at eyesopen dot com 2007-04-27 18:20 ---

Paul's fix looks correct to me. It appears that when the "#if 0" was added to disable broken loop shifting at some point in the distant past, the critical functionality

  if (nDepend)
    break;

was accidentally removed. This is similar to the idiom used a few lines earlier, "nDepend = 1; break;". I would normally have suggested that the type of nDepend and the return type of gfc_dep_resolver be changed to bool, and the dead "#if 0" code removed, to clean up this code...

However, I think a much better longer-term strategy is to completely remove gfc_conv_resolve_dependencies and gfc_dep_resolver, and the automatic handling of temporaries in the scalarizer. Instead, in line with the TODO above gfc_conv_resolve_dependencies, I think it's much better to handle this at a higher level using the new/preferred gfc_check_dependency API that's used throughout the front-end. For example, when lhs=rhs has a dependency, this can be transformed into "tmp=rhs; lhs=tmp" during lowering or at the front-end tree level, which greatly simplifies the complex code in the scalarizer. It also allows the post-assignment copy "lhs=tmp" to be performed efficiently [though I may have already implemented that in the scalarizer :-)].

One benefit of doing this at a higher level of abstraction is that loop-reversal isn't a simple matter of running our default scalarized loop backwards (as hinted by the comment, and the PRs in bugzilla). Consider the general case of a(s1:e1,s2:e2,s3:e3) = rhs. The dependency analysis in gfc_check_dependency may show that s1:e1 needs to run forwards to avoid a conflict, and that s3:e3 needs to run backwards for the same reason. It's trivial then to treat this almost like a source-to-source transformation (ala KAP) and convert the assignment into a(s1:e1:1,s2:e2,s3:e3:-1), instead of attempting to shoehorn all this into the scalarizer at a low level and perform book-keeping on the direction vectors. By the time we've lowered gfc_expr to gfc_ss, we've made things much harder for ourselves. And things get much more complex once we start seriously tackling CSHIFT, TRANSPOSE and friends! Keeping the scalarizer simple also eases support for autovectorization and/or moving it into the middle-end (the topic of Toon's GCC summit presentation).

I'm as surprised as Paul that this hasn't been a problem before. I suspect it's because we use the alternate gfc_check_dependency in the vast majority of cases, and because of the empirical observation in the research literature that most f90 array assignments in real code don't carry a dependency.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31711
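For illustration, a C++ sketch (a hypothetical helper, mine, not gfortran code) of the "tmp = rhs; lhs = tmp" rewrite for an assignment whose two sides may overlap: the rhs is evaluated into a temporary before anything is stored back.

  #include <algorithm>
  #include <vector>

  // a[lo_dst .. lo_dst+n) = a[lo_src .. lo_src+n) may overlap, so evaluate
  // the rhs section into a temporary first, then copy the temporary back.
  void assign_with_dependency(std::vector<int>& a, int lo_dst, int lo_src, int n) {
    std::vector<int> tmp(a.begin() + lo_src, a.begin() + lo_src + n);  // tmp = rhs
    std::copy(tmp.begin(), tmp.end(), a.begin() + lo_dst);             // lhs = tmp
  }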
[Bug target/23322] [4.1 regression] performance regression, possibly related to caching
--- Additional Comments From roger at eyesopen dot com 2005-08-11 14:56 ---

I'll take a look, but on first inspection this looks more like a register allocation issue than a reg-stack problem. In the first (4.0) case, the accumulator "result" is assigned a hard register in the loop, whilst in the second (4.1) it is placed in memory, at -16(%ebp). This may also explain why extracting that loop into a stand-alone function produces optimal/original code, as the register allocator gets less confused by other influences in the function. The extracted code is also even better than 4.0's, as it avoids writing "result" to memory on each iteration (store sinking).

The second failure does show an interesting reg-stack/reg-alloc interaction though. The "hot" accumulator value is live on the backedge and the exit edge of the loop, but not on the incoming edge. Clearly, the best fix is to make this value live on the incoming edge, but failing that, it is actually better to prevent it being live on the back and exit edges, and add compensation code after the loop. I.e. if the store to result in the loop used fstpl, you wouldn't need to fstp %st(0) on each loop iteration, but would instead need a compensating fldl after the loop.

I'm not sure how easy it would be to teach GCC's register allocation to take these considerations into account, or failing that, whether reg-stack could be tweaked/hacked to locally fix this up. But the fundamental problem is that reg-alloc should assign result to a hard register, as it clearly knows there are enough available in that block. reg-stack.c is just doing what it's told, and in this case it's being told to do something stupid.

--
What|Removed |Added
Status|UNCONFIRMED |NEW
Ever Confirmed||1
Last reconfirmed|-00-00 00:00:00 |2005-08-11 14:56:31 date||

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322
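Schematically, the kind of loop under discussion looks like this (my sketch, not the PR's testcase): the accumulator is live on the loop's back edge and exit edge, so spilling it to memory costs a load and a store per iteration instead of keeping it on the x87 stack.

  // "result" is the hot accumulator that wants a hard register.
  double dot(const double* a, const double* b, int n) {
    double result = 0.0;          // dead on the loop's incoming edge
    for (int i = 0; i < n; ++i)
      result += a[i] * b[i];      // live on the back edge each iteration
    return result;                // live on the exit edge
  }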
[Bug middle-end/21137] Convert (a >> 2) & 1 != 0 into a & 4 != 0
--- Additional Comments From roger at eyesopen dot com 2005-08-14 19:17 ---

Hi James,

Unfortunately, there are a few mistakes in your proposed patch for PR21137.

Firstly, Kazu's proposed transformation is only valid when the results of the bitwise-AND are being tested for equality or inequality with zero. I.e. it's safe to transform "((a >> 2) & 1) != 0" into "(a & 4) != 0", but not "x = (a >> 2) & 1;" into "x = (a & 4)". Your patch is in the general fold path for BIT_AND_EXPR, so you'll transform both. It's surprising there are no testsuite checks for the second example above; it might be worth adding them to prevent anyone making a similar mistake in future.

Secondly, this transformation is only valid if c1 + c2 < TYPE_PRECISION(type). Consider the following code:

  signed char c;
  if ((c >> 6) & 64) ...

this is not equivalent to

  if (c & (char)(64<<6)) ...
  i.e. if (c & (char)4096) ...
  i.e. if (c & 0) ...
  i.e. if (0)

Of course, when c1+c2 >= TYPE_PRECISION(type), there are two additional optimizations that can be performed. If TYPE_UNSIGNED(type), the result is always false, and if !TYPE_UNSIGNED(type), the condition is equivalent to "a < 0". So in the example of mine above, optimization should produce:

  if (c < 0) ...

Finally, in your patch you use "break" if the transformation is invalid. This isn't really the correct "idiom/style" for fold, where if the guard for a transformation fails, you shouldn't drop out of the switch, but instead continue onto the following/next transformation "in the list". So instead of "if (!guard) break; return transform();", this optimization should be written as "if (guard) return transform();". I haven't looked for other examples of "break" in fold_unary/fold_binary/fold_ternary, but if there are any, they're probably (latent) missed-optimization bugs.

Other than that, the patch looks good. Thanks for looking into this.

--
What|Removed |Added
CC||phython at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21137
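To make the counterexample above concrete, here is a small self-contained program (my own sketch, not from the PR; note that right-shifting a negative value and the narrowing cast are implementation-defined, though GCC on common targets behaves as shown):

  #include <cstdio>

  int main() {
    signed char c = -1;                              // sign bit set
    std::printf("%d\n", ((c >> 6) & 64) != 0);       // prints 1: (-1 >> 6) == -1
    std::printf("%d\n",
                (c & (signed char)(64 << 6)) != 0);  // prints 0: (signed char)4096 == 0
    return 0;
  }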
[Bug tree-optimization/23476] [4.1 Regression] ICE in VRP, remove_range_assertions
--- Additional Comments From roger at eyesopen dot com 2005-08-20 15:27 ---

My apologies for adding a comment to an already resolved PR, but I've some follow-up thoughts on Diego's recent solution to this regression. From a high-level perspective, it would probably be more efficient to require that conditions are always folded, as an invariant of our tree-ssa data structures. It's better to fold a conditional once, when it is constructed/modified, than to need to call fold on it each time it is examined. Some places that build/modify conditionals may know that fold doesn't need to be (or has already been) called, whilst requiring the many places that examine CFGs to call fold themselves is pessimistic. This also fits well with our recent "folded by construction" philosophy, using fold_buildN instead of build.

I appreciate that this is a meta-issue, and Diego's fix is fine for this problem, but ultimately I think that placing stricter invariants on our data structures will reduce the number of unnecessary calls to fold, and speed up the compiler. Eventually, most calls to build* should be fold_build*, and it should rarely be necessary to call fold() without a call to build (or in-place modification of a tree). But perhaps there are valid tree-ssa reasons why this shouldn't be a long-term goal?

--
What|Removed |Added
CC| |roger at eyesopen dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23476
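As a toy analogy (deliberately not GCC code; names here are mine) of the "folded by construction" invariant: the factory function folds constant operands the moment the node is built, so consumers never need to re-fold a node they are handed.

  #include <cstdio>
  #include <memory>
  #include <utility>

  struct Expr { virtual ~Expr() = default; };
  struct Const : Expr { int v; explicit Const(int v) : v(v) {} };
  struct Add : Expr {
    std::unique_ptr<Expr> l, r;
    Add(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
      : l(std::move(l)), r(std::move(r)) {}
  };

  // Analogue of fold_build2: fold constant operands at construction time.
  std::unique_ptr<Expr> fold_build_add(std::unique_ptr<Expr> l,
                                       std::unique_ptr<Expr> r) {
    auto* cl = dynamic_cast<Const*>(l.get());
    auto* cr = dynamic_cast<Const*>(r.get());
    if (cl && cr)
      return std::make_unique<Const>(cl->v + cr->v);       // folded now
    return std::make_unique<Add>(std::move(l), std::move(r)); // raw node
  }

  int main() {
    auto e = fold_build_add(std::make_unique<Const>(2),
                            std::make_unique<Const>(3));
    std::printf("%d\n", dynamic_cast<Const&>(*e).v);       // 5: already folded
  }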
[Bug bootstrap/33781] [4.3 Regression] "Arg list too long" building libgcc.a
--- Comment #4 from roger at eyesopen dot com 2007-11-01 17:15 ---

Thanks to both Jakub and DJ for their help. I just tried out the suggested patch on my IRIX box, and was surprised that it didn't resolve the error. My apologies that my initial analysis might have been wrong (or incomplete), but it looks like the error occurs earlier on the same command line. Not only does

  objects="$(objects)" ; $(AR_CREATE_FOR_TARGET) $@ $$objects

fail; in fact, stripping the command back to just

  objects="$(objects)"

is enough to trigger the error. Hoping that this was perhaps a limitation of IRIX's /bin/sh, I've tried again with SHELL=/usr/local/bin/bash, but alas I get the same error:

  make: execvp: /usr/local/bin/bash: Arg list too long

So it's not a bound on argc or the number of entries in argv[] that's the problem, but a hard limitation on command line length. So it looks like we can't even assign $objects, let alone use it, either directly or looping over it to use xargs.

Perhaps we could do something with "find". Just a wild guess here, as I don't understand the build machinery, but something like:

  find . -name '*.o' -exec ar rc libgcc.a {} \;

and then test afterwards:

  if test ! -f libgcc.a ; then {the eh_dummy.o stuff to avoid empty libgcc.a} ; fi

I'm not sure why I'm seeing this. There mustn't be many IRIX testers for mainline, and either MIPS is building more objects than other platforms (for saturating and fixed point math) or most OSes are less "restricted" than IRIX.

Many thanks again for peoples' help. Is "find" portable, or is there a better way to achieve the same thing without ever placing all of the filenames on a single command line? Sorry for any inconvenience.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug bootstrap/33781] [4.3 Regression] "Arg list too long" building libgcc.a
--- Comment #8 from roger at eyesopen dot com 2007-11-02 16:41 ---

Created an attachment (id=14471)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14471&action=view)
Default libgcc.a objects on mips-sgi-irix6.5

I'll respond to Jakub's latest comments before trying DJ's more recent patch. Running "getconf ARG_MAX" on my IRIX box returns 20480, which is 20K. I believe this is the default, out-of-the-box setting for my machine, which is running IRIX 6.5.19m. Using cut'n'paste from the failing "make" output, I measure the current "$$objects" to be 25949 bytes. I've attached the "attempted" value of $objects to this e-mail.

I'll give DJ's patch a spin... I apologise that this box isn't that speedy.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug bootstrap/33781] [4.3 Regression] "Arg list too long" building libgcc.a
--- Comment #9 from roger at eyesopen dot com 2007-11-02 17:12 ---

Doh! DJ's patch gets us a little further, but things are still broken. However, it's an excellent debugging tool, which shows that it's the invocation with libgcc-objects-15 that's broken. Applying the same trick as above shows that $libgcc-objects-15 alone is 19962 bytes, which combined with the "ar" etc. at the beginning of the command line exceeds the limit.

So it's the "fixed-conv-funcs" that are to blame. Perhaps "gen-fixed.sh" has gone insane with the large number of integer-like machine modes on MIPS. The correct fix might actually be in the optabs handling of the middle-end, so we don't need quite so many conversion functions in MIPS' libgcc.a. Or perhaps mips.md needs improved support (patterns) for this functionality. I've no idea what _satfractunsUTIUHA is; it's a recent addition and I've not been following gcc-patches lately. Splitting "_fract*" from "_sat*" with a patch similar to DJ's should work.

I hope this is enlightening. Is there a --disable option to avoid building fixed point conversion support? Looks like our command line usage is O(n^2) in the number of backend integer machine modes?

Thanks again for everyone's help on this. I'll owe you beers at the next GCC summit.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug libstdc++/35968] nth_element fails to meet its complexity requirements
--- Comment #2 from roger at eyesopen dot com 2008-04-21 03:22 ---

Yep, now that we're back in stage1, it's about time I got around to submitting the O(n) worst-case nth_element implementation that I mentioned last year. For Steven's benefit, the implementation I've already coded up uses the median-of-medians in groups of five strategy as a fallback to a modified quickselect. [Though I'll need to quickly read the paper you cite.] The trick for libstdc++ is to attempt to make the typical case as fast or faster than the existing implementation. Whilst the standard now requires O(n) worst case, what g++'s users perceive is the average case, and changing to an O(n) implementation with a large constant coefficient may upset some folks.

--
roger at eyesopen dot com changed:
What|Removed |Added
AssignedTo|unassigned at gcc dot gnu dot org |roger at eyesopen dot com
Status|UNCONFIRMED |ASSIGNED
Ever Confirmed|0 |1
Last reconfirmed|-00-00 00:00:00 |2008-04-21 03:22:22 date||

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968
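For the record, a rough sketch (mine, illustrative only, not the proposed libstdc++ code) of quickselect with a median-of-medians-of-five pivot, the combination that yields a guaranteed O(n) worst case:

  #include <algorithm>
  #include <vector>

  static int pivot_index(std::vector<int>& v, int lo, int hi);

  // After the call, v[n] holds the element a full sort of v[lo..hi) would put there.
  static void select_nth(std::vector<int>& v, int lo, int hi, int n) {
    while (hi - lo > 1) {
      std::swap(v[pivot_index(v, lo, hi)], v[hi - 1]);
      int store = lo;                      // Lomuto partition around the pivot
      for (int i = lo; i < hi - 1; ++i)
        if (v[i] < v[hi - 1])
          std::swap(v[i], v[store++]);
      std::swap(v[store], v[hi - 1]);      // pivot lands in its final slot
      if (n == store) return;
      if (n < store) hi = store; else lo = store + 1;
    }
  }

  // Median of medians: the pivot is guaranteed to lie away from the extremes,
  // so a constant fraction of elements is eliminated every round.
  static int pivot_index(std::vector<int>& v, int lo, int hi) {
    if (hi - lo <= 5) {
      std::sort(v.begin() + lo, v.begin() + hi);
      return lo + (hi - lo) / 2;
    }
    int dst = lo;
    for (int i = lo; i < hi; i += 5) {     // move each group-of-five median
      int end = std::min(i + 5, hi);       // to the front of the range
      std::sort(v.begin() + i, v.begin() + end);
      std::swap(v[i + (end - i) / 2], v[dst++]);
    }
    int mid = lo + (dst - lo) / 2;
    select_nth(v, lo, dst, mid);           // recurse on the medians
    return mid;
  }

  int main() {
    std::vector<int> v = {9, 1, 8, 2, 7, 3, 6, 4, 5, 0};
    select_nth(v, 0, (int) v.size(), 4);
    return v[4] == 4 ? 0 : 1;              // rank 4 of 0..9 is 4
  }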
[Bug libstdc++/35968] nth_element fails to meet its complexity requirements
--- Comment #5 from roger at eyesopen dot com 2008-04-24 15:01 ---

Well, I've now had time to read the Battiato, Hofri et al. 2002 paper, and the bad news is that such an approximate median selection algorithm can't be used to guarantee an O(N) worst-case std::nth_element implementation. It could be used in an implementation to guess a good pivot, but the quality of this median, i.e. how approximate it is, doesn't meet the necessary criterion to ensure an O(N) worst case. You'd still need a fallback method with guaranteed bounds, or an exact median, in order to achieve O(N). I.e. it could help improve the average-case performance, but doesn't help with the worst case.

For the mathematically inclined: in order to achieve O(N) worst-case performance, you need to guarantee that a constant fraction of elements can be eliminated at each level of the recursion. In comment #4, Steven fixates on "just as long as N/2 elements are reduced each time round", but the sum-of-geometric-series argument shows that eliminating any fixed constant fraction each round guarantees an O(N) worst case. Hence even if you can only guarantee eliminating 10% each round, you still achieve an O(N) worst case. So you need a method that provides an approximate median that, in the worst case, can guarantee elimination of say 10% of elements from consideration. This is why approximate medians offer some utility over exact medians if they can be found faster.

Unfortunately, the method of Battiato referenced in comment #1 doesn't provide such a constant-fraction guarantee. An analysis shows that at each round it can only eliminate (2^n-1)/3^n of the elements in its worst case, where n is log_3(N). By hand, naming the ranks 0..N-1: when N=3, the true median at rank 1 is selected. For N=9, the elements at rank 3, 4 or 5 may be considered as a median, i.e. 1/3 eliminated. For N=27, the elements between ranks 8 and 20 may be returned as the median, i.e. 7/27 eliminated. In the limit, as N (and hence n) tends to infinity, the eliminated fraction (2^n-1)/3^n tends to zero. I.e. the larger the input size, the less useful the worst-case median. The poor quality of the median is lamented by the authors in the penultimate paragraph of section 4.1 of the paper. They then go on to show that statistically such a worst case is rare, but unfortunately even a rare worst case breaks the C++ standard library's O(N) constraint.

This Achilles heel is already well documented in the algorithmic complexity community. The Blum, Floyd, Pratt, Rivest and Tarjan paper [BFRT73] and the Floyd and Rivest paper [FR75] analyse the issues with median-of-k-medians, and show that k>=5 is the lowest value capable of a guaranteed fractional worst case. I.e. they already consider and reject the algorithm given in the cited work (k=3) for the purpose of exact median finding.

Anyway, I hope you find this interesting. There will always be efficient methods for finding approximate medians; the question is how efficient vs. how approximate. Many quicksort implementations select the first element as a pivot, an O(1) method for selecting an extremely approximate median! Statistically, over all possible input orders, this first element will on average partition the input array at the median, with some variance. It's not that the paper is wrong or incorrect; it does what it describes, finding a statistically good approximate median very efficiently and with excellent worst-case running time.
Unfortunately, the problem we need to solve is not the problem the paper's authors were attempting to solve: we need a better approximation, perhaps at the cost of a more complex implementation. Anyway, thanks again for the reference. I'd not come across it before and really enjoyed reading it. Let me know if you spot a flaw in my reasoning above.

Dr Roger Sayle, Ph.D. Computer Science

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968
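A quick numeric check (my own sketch) of the worst-case fraction (2^n-1)/3^n quoted above, with N = 3^n: the guaranteed eliminated fraction tends to zero, so no constant fraction per round, and hence no O(N) bound.

  #include <cmath>
  #include <cstdio>

  int main() {
    // n=2 gives 3/9 = 1/3 (N=9); n=3 gives 7/27 (N=27), matching the text.
    for (int n = 1; n <= 12; ++n)
      std::printf("n=%2d  N=%7.0f  eliminated >= %.6f\n",
                  n, std::pow(3.0, n),
                  (std::pow(2.0, n) - 1) / std::pow(3.0, n));
    return 0;
  }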
[Bug bootstrap/33781] [4.3/4.4 Regression] "Arg list too long" building libgcc.a
--- Comment #20 from roger at eyesopen dot com 2008-06-12 21:31 ---

Hi Ralf,

Thanks for your patch. Sorry for the delay in replying; I needed to check out mainline on my IRIX box and rebuild a baseline, and once that had completed "make -k check", I tried configuring with "--enable-fixed-point", first without and then with your patch. The good news is that this allows the libgcc build to get further, but unfortunately the bad news is that we die just a little further on with a similar "execvp: /bin/sh: Arg list too long".

This second failure is where we run nm on all of the objects and pipe the results through mkmap-flat.awk to create tmp-libgcc.map. This looks to be in the same libgcc/Makefile.in, in the libgcc.map rule (when SHLIB_MKMAP is defined). I do like your PR33781.diff patch, which moves us in the right direction. Is it possible/safe to apply similar voodoo to the libgcc.map rule?

Many thanks again for your help. I've no personal interest in using fixed-point arithmetic on the MIPS, but resolving this issue on IRIX helps keep the build machinery portable. If it's not IRIX now, it'll be some other platform with a low ARG_MAX limit in the near future.

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33781
[Bug rtl-optimization/25703] [4.2 Regression] ACATS cxa4024 failure
--- Comment #5 from roger at eyesopen dot com 2006-01-25 01:05 ---

I'm testing the following patch...

Index: combine.c
===================================================================
*** combine.c	(revision 109912)
--- combine.c	(working copy)
*************** try_combine (rtx i3, rtx i2, rtx i1, int
*** 1967,1972 ****
--- 1967,1983 ----
  	      if (BITS_BIG_ENDIAN)
  		offset = GET_MODE_BITSIZE (GET_MODE (XEXP (dest, 0)))
  			 - width - offset;
+ 
+ 	      /* If this is the low part, we're done.  */
+ 	      if (subreg_lowpart_p (XEXP (dest, 0)))
+ 		;
+ 	      /* Handle the case where inner is twice the size of outer.  */
+ 	      else if (GET_MODE_BITSIZE (GET_MODE (SET_DEST (temp)))
+ 		       == 2 * GET_MODE_BITSIZE (GET_MODE (XEXP (dest, 0))))
+ 		offset += GET_MODE_BITSIZE (GET_MODE (XEXP (dest, 0)));
+ 	      /* Otherwise give up for now.  */
+ 	      else
+ 		offset = -1;
  	    }
  	}
        else if (subreg_lowpart_p (dest))

My apologies for any inconvenience.

--
roger at eyesopen dot com changed:
What|Removed |Added
CC| |roger at eyesopen dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25703
[Bug rtl-optimization/25703] [4.2 Regression] ACATS cxa4024 failure
--- Comment #11 from roger at eyesopen dot com 2006-01-25 19:52 ---

Created an attachment (id=10729)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10729&action=view)
patch v2

Here's a revised version of the patch that also handles the STRICT_LOW_PART case. My apologies once again for the inconvenience. In the previous version of the patch I'd mistakenly assumed that STRICT_LOW_PART was some indication that the SUBREG only affected the "low_part". Investigating Jan's testcase with -mtune=i486, I now understand it really means STRICT_SUB_PART, and actually behaves identically to SUBREG in this optimization, as we preserve all of the unaffected bits anyway!

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25703
[Bug c++/26079] New: Template instantiation behavior change in 4.1 (regression?)
The following short code fragment no longer compiles with gcc 4.1. I've no clue if this is a regression or mandated by the standard.

#include <string>
#include <vector>
#include <utility>

int size(char x) { return (int) sizeof(x); }
int size(int x) { return (int) sizeof(x); }
int size(const std::string &x) { return (int) x.size() + (int) sizeof(int); }

template <typename T>
int size(const std::vector<T> &x)
{
  int result = (int) sizeof(int);
  typename std::vector<T>::const_iterator iter;
  for (iter = x.begin() ; iter != x.end() ; iter++)
    result += size(*iter);
  return result;
}

template <typename S, typename T>
int size(const std::pair<S,T> &x)
{
  return size(x.first) + size(x.second);
}

int foo()
{
  std::vector<std::pair<int,std::string> > pvec;
  return size(pvec);
}

Sorry to not reduce a stand-alone testcase without headers. The STL isn't important. The issue is that the list of candidates for "size(std::pair<...>)" doesn't include the templates, only the functions, when instantiating "size(std::vector<...>)". On IRC they thought this looked reasonable enough to file a PR. This works fine in 4.0.2 and 3.4.x and many other C++ compilers.

--
Summary: Template instantiation behavior change in 4.1 (regression?)
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: roger at eyesopen dot com
GCC host triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26079
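To illustrate the language rule without any headers, here is a hypothetical reduction (mine, not from the PR) of the two-phase lookup behaviour that gcc 4.1 now enforces; a conforming compiler is expected to reject it:

  template <typename T> struct Box { T t; };

  // size(b.t) is a dependent call: it sees only declarations visible at
  // this point, plus argument-dependent lookup at instantiation time.
  template <typename T>
  int size(const Box<T>& b) { return size(b.t); }

  int size(int) { return 4; }   // declared later; int has no associated
                                // namespace, so ADL can't find it either

  int foo()
  {
    Box<int> b = { 0 };
    return size(b);             // error under two-phase lookup (4.1),
  }                             // accepted by 4.0 and older compilers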
[Bug c++/26080] New: Template instantiation behavior change in 4.1 (regression?)
The following short code fragment no longer compiles with gcc 4.1. I've no clue if this is a regression or mandated by the standard.

#include <string>
#include <vector>
#include <utility>

int size(char x) { return (int) sizeof(x); }
int size(int x) { return (int) sizeof(x); }
int size(const std::string &x) { return (int) x.size() + (int) sizeof(int); }

template <typename T>
int size(const std::vector<T> &x)
{
  int result = (int) sizeof(int);
  typename std::vector<T>::const_iterator iter;
  for (iter = x.begin() ; iter != x.end() ; iter++)
    result += size(*iter);
  return result;
}

template <typename S, typename T>
int size(const std::pair<S,T> &x)
{
  return size(x.first) + size(x.second);
}

int foo()
{
  std::vector<std::pair<int,std::string> > pvec;
  return size(pvec);
}

Sorry to not reduce a stand-alone testcase without headers. The STL isn't important. The issue is that the list of candidates for "size(std::pair<...>)" doesn't include the templates, only the functions, when instantiating "size(std::vector<...>)". On IRC they thought this looked reasonable enough to file a PR. This works fine in 4.0.2 and 3.4.x and many other C++ compilers.

--
Summary: Template instantiation behavior change in 4.1 (regression?)
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: roger at eyesopen dot com
GCC host triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26080
[Bug c++/26080] Template instantiation behavior change in 4.1 (regression?)
--- Comment #1 from roger at eyesopen dot com 2006-02-02 18:43 --- *** This bug has been marked as a duplicate of 26079 *** -- roger at eyesopen dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26080
[Bug c++/26079] Template instantiation behavior change in 4.1 (regression?)
--- Comment #1 from roger at eyesopen dot com 2006-02-02 18:43 --- *** Bug 26080 has been marked as a duplicate of this bug. *** -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26079
[Bug bootstrap/26161] New: Configure tests for pthread.h sometimes need to use -pthread
The problem is that on some systems, including Tru64 and I believe AIX, the compiler has to be passed the -pthread command line option in order to use #include <pthread.h>. Effectively, the first lines of /usr/include/pthread.h contain the lines:

#ifndef _REENTRANT
#error POSIX pthreads are only available with the use of -pthreads
#endif

For this reason the autoconf tests for pthread.h in libstdc++-v3 and libgomp always fail. Fortunately, this was previously not serious, as the target configurations would include pthread.h anyway, and all the relevant source libraries are compiled with -pthread. In directories where they don't, GCC has workarounds, such as in gcc/gthr-posix.h, which contains the lines:

/* Some implementations of <pthread.h> require this to be defined.  */
#ifndef _REENTRANT
#define _REENTRANT 1
#endif

#include <pthread.h>

This issue escalated to a bootstrap failure in libgomp recently, which now aborts whilst configuring libgomp when pthread.h isn't detected. Prior to this change, libgomp built fine and the test results were quite reasonable on Alpha/Tru64. [Stretching the definition of a regression :-)]

I believe that what is needed is a "local" configure test for pthread.h that first decides whether the compiler supports -pthread (for example, GCC on IRIX currently does not), and then uses this flag when testing for the header. This is perhaps similar to the related patch I posted recently, where we need to test system headers with the same compiler options we'll be using to build the source files: http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00139.html See the related definitions of THREADCXXFLAGS and THREADLDFLAGS in libjava's configure.ac. Unfortunately, my autoconf-fu isn't strong enough to tackle this.

The temporary work-around is to use --disable-libgomp. The long-term fix would be to port libgomp to use GCC's gthreads library. But in the meantime, it would be good to correct the test for pthread.h and/or add a PTHREAD_CFLAGS that can be used by any project. I'm happy to test patches on affected systems. However, it should be trivial to re-create a model system with the above lines and using -D_REENTRANT as the compiler option that needs to be passed.

--
Summary: Configure tests for pthread.h sometimes need to use -pthread
Product: gcc
Version: 4.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: bootstrap
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: roger at eyesopen dot com
GCC host triplet: alpha*-*-osf*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26161
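For reference, a minimal stand-in (my construction, not the actual autoconf test) for the header check at issue: compiled without -pthread on such a system, the #error above fires because _REENTRANT is undefined, and configure concludes that pthread.h is unusable.

  #include <pthread.h>

  int main() {
    pthread_t self = pthread_self();   // any trivial use of the API
    (void) self;
    return 0;
  }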
[Bug bootstrap/26161] Configure tests for pthread.h sometimes need to use -pthread
--- Comment #2 from roger at eyesopen dot com 2006-02-07 21:15 --- I've discovered your bootstrap failure is PR16787. It'll take a while for me to try out your XCFLAGS fix on my slow machine. I'll also propose a fix for PR16787. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26161
[Bug bootstrap/26161] Configure tests for pthread.h sometimes need to use -pthread
--- Comment #3 from roger at eyesopen dot com 2006-02-08 04:04 --- Subject: Re: Configure tests for pthread.h sometimes need to use -pthread

On 7 Feb 2006, fxcoudert at gcc dot gnu dot org wrote:
> I tried to give it a look on alphaev68-dec-osf5.1b, but I couldn't
> get to the point of configuring libgomp :)
>
> cc -c -DHAVE_CONFIG_H -g -I. -I../../gcc/libiberty/../include -Wc++-compat
> ../../gcc/libiberty/floatformat.c -o ./floatformat.o
> cc: Error: ../../gcc/libiberty/floatformat.c, line 343: In this statement, the
> libraries on this platform do not yet support compile-time evaluation of the
> constant expression "0.0/0.0". (constfoldns)
> dto = NAN;

Hi FX,

Could you try the following for me, and I'll submit it to gcc-patches? Unfortunately, my OSF_DEV PAK has expired, so I rely on gcc for hosting GCC.

2006-02-07  Roger Sayle  <[EMAIL PROTECTED]>
	    R. Scott Bailey  <[EMAIL PROTECTED]>

	PR bootstrap/16787
	* floatformat.c: Include <float.h> where available.
	(NAN): Use value of DBL_QNAN if defined, and NAN isn't.

Index: floatformat.c
===================================================================
*** floatformat.c	(revision 110738)
--- floatformat.c	(working copy)
***************
*** 1,5 ****
  /* IEEE floating point support routines, for GDB, the GNU Debugger.
!    Copyright 1991, 1994, 1999, 2000, 2003, 2005 Free Software Foundation, Inc.
  
  This file is part of GDB.
--- 1,5 ----
  /* IEEE floating point support routines, for GDB, the GNU Debugger.
!    Copyright 1991, 1994, 1999, 2000, 2003, 2005, 2006 Free Software Foundation, Inc.
  
  This file is part of GDB.
*************** Foundation, Inc., 51 Franklin Street - F
*** 31,36 ****
--- 31,41 ----
  #include <string.h>
  #endif
  
+ /* On some platforms, <float.h> provides DBL_QNAN.  */
+ #ifdef STDC_HEADERS
+ #include <float.h>
+ #endif
+ 
  #include "ansidecl.h"
  #include "libiberty.h"
  #include "floatformat.h"
*************** Foundation, Inc., 51 Franklin Street - F
*** 44,51 ****
--- 49,60 ----
  #endif
  
  #ifndef NAN
+ #ifdef DBL_QNAN
+ #define NAN DBL_QNAN
+ #else
  #define NAN (0.0 / 0.0)
  #endif
+ #endif
  
  static unsigned long get_field (const unsigned char *,
  				enum floatformat_byteorders,

Roger

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26161
[Bug libgomp/25936] libgomp needs to link against rt on HPUX
--- Comment #4 from roger at eyesopen dot com 2006-02-08 17:46 --- This problem affects both hppa*-hp-hpux* and ia64-hp-hpux*. It appears that the required sem_init, sem_wait, sem_post, etc... symbols are defined both in the -lrt libraries on HPUX and in the -lc_r libraries. The fix is to update LIB_SPEC, perhaps in the -pthread clause, for HPUX, but I'm not sure if it requires adding -lrt or changing -lc to -lc_r, or adding -lc_r? I notice that config/pa/pa-hpux10.h does mention -lc_r, but for use with -threads. Should -pthread pull in the required symbols? i.e. is this a libgomp problem or a target problem? -- roger at eyesopen dot com changed: What|Removed |Added CC||roger at eyesopen dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25936
[Bug target/22209] [4.1 regression] libgfortran unresolvable symbols on irix6.5
--- Comment #10 from roger at eyesopen dot com 2006-02-09 14:41 ---

Hi David,

nm $objdir/gcc/libgcc.a contains both __ctzdi2 and __ctzti2 for me:

grasp% nm libgcc.a | grep ctz
_ctzsi2.o: T __ctzdi2
_ctzdi2.o: T __ctzti2

The post-commit bootstrap and regression test on IRIX 6.5.19m just completed fine for me, with the following gfortran test results:

gfortran
# of expected passes		11485
# of unexpected failures	20
# of expected failures		12
# of unsupported tests		26

Could you investigate this failure a bit further? I've no idea why you should be seeing these problems. If it makes any difference, I configure with:

${SRCDIR}/configure --with-gnu-as --with-as=/usr/local/bin/as \
    --with-gnu-ld --with-ld=/usr/local/bin/ld

where the above as and ld are both binutils 2.16. I've had trouble with binutils 2.16.1's ld on IRIX built both with MIPSPro cc and gcc 3.4.3, so we currently stick with 2.16, but I'll investigate if that makes a difference.

--
roger at eyesopen dot com changed:
What|Removed |Added
CC| |roger at eyesopen dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22209
[Bug target/22209] [4.1 regression] libgfortran unresolvable symbols on irix6.5
--- Comment #11 from roger at eyesopen dot com 2006-02-09 14:54 --- p.s. I can also confirm that this patch fixes the test case in PR25028 for me on mips-sgi-irix6.5. This failed previously with undefined references to __floattisf and __floattidf, but now not only compiles and links but produces the correct output. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22209
[Bug other/25028] TImode-to-floating conversions broken
--- Comment #4 from roger at eyesopen dot com 2006-02-09 15:00 --- My recent fix for PR target/22209 adding TImode support for MIPS, just fixed this PR's testcase for me on mips-sgi-irix6.5. The new fix*.c and float*.c source files may be useful in resolving the remaining PR25028 issue on ia64/HPUX? I'll investigate. -- roger at eyesopen dot com changed: What|Removed |Added CC| |roger at eyesopen dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25028