gcc@gcc.gnu.org

2015-06-06 Thread Steven

Hi Mikhail,

Thanks for the comments. I haven't updated my GDB yet, and I will test it 
again once I have a newer version of GDB.



Yuhang

On 06/06/2015 09:31 PM, Mikhail Maltsev wrote:

On 07.06.2015 0:15, steven...@gmail.com wrote:

Dear GCC developers,

I have successfully compiled & installed GCC 4.9.2. Could you comment on the 
results of 'make check' (see below). Here is the relevant information:


You can verify it against published test results:
https://www.gnu.org/software/gcc/gcc-4.9/buildstat.html

=== gfortran tests ===


Running target unix
FAIL: gfortran.dg/guality/pr41558.f90  -O2  line 7 s == 'foo'
FAIL: gfortran.dg/guality/pr41558.f90  -O3 -fomit-frame-pointer  line 7 s == 'foo'
FAIL: gfortran.dg/guality/pr41558.f90  -O3 -fomit-frame-pointer -funroll-loops  line 7 s == 'foo'
FAIL: gfortran.dg/guality/pr41558.f90  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  line 7 s == 'foo'
FAIL: gfortran.dg/guality/pr41558.f90  -O3 -g  line 7 s == 'foo'
FAIL: gfortran.dg/guality/pr41558.f90  -Os  line 7 s == 'foo'


The guality testsuite checks generated debug information. It is a functional
test, i.e. it performs a real GDB invocation, so the results might also
depend on your version of GDB, its settings, etc.

There are similar issues on CentOS 6 in this test report
https://gcc.gnu.org/ml/gcc-testresults/2015-03/msg03335.html (though
it's i686).

BTW, this failure also reproduces for me on current trunk.





Dead include file: dwarf.h ?

2006-10-14 Thread Steven Bosscher
Hi,

As far as I can tell, dwarf.h is not included anywhere in gcc/
or any of its subdirectories.  Is there any reason not to remove
this file?

Thanks,

Gr.
Steven


Re: TARGET_SCHED_PROLOG defined twice

2006-10-18 Thread Steven Bosscher

On 10/18/06, Marcin Dalecki <[EMAIL PROTECTED]> wrote:

Looking at rs6000.opt I have found that the above command line switch
variable is defined TWICE:

msched-prolog
Target Report Var(TARGET_SCHED_PROLOG) Init(1)
Schedule the start and end of the procedure

msched-epilog
Target Undocumented Var(TARGET_SCHED_PROLOG) VarExists

This appears of course to be wrong.


The latter probably ought to be TARGET_SCHED_EPILOG, if that exists, eh?

Apparently we also don't have test cases to actually verify that the
proper forms of these options are accepted and have the desired
effect...

Gr.
Steven


Question about LTO dwarf reader vs. artificial variables and formal arguments

2006-10-21 Thread Steven Bosscher
Hello,

I want to make gfortran produce better debug information, but I want to do it 
in a way that doesn't make it hard/impossible to read back in sufficient 
information for LTO to work for gfortran.  

I haven't really been following the whole LTO thing much, but if I understand 
correctly, the goal is to reconstruct information about declarations from 
DWARF information that we write out for those declarations.  If that's the 
case, I wonder how LTO will handle artificial "variables" and formal argument 
lists. 

For example, gfortran adds additional formal arguments for functions that take 
a CHARACTER string as a formal argument, e.g.

program test
implicit none
call sub("Hi World!")

contains
   subroutine sub(c)
   character*10 c
   end subroutine

end

produces as a GIMPLE dump:

MAIN__ ()
{
  static void sub (char[1:10] &, int4);

  _gfortran_set_std (70, 127, 0);
  sub ("Hi World!", 9);
}


sub (c, _c)
{
  (void) 0;
}

where _c is strlen("Hi World!").  From a user perspective, it would be better 
to hide _c for the debugger because it is not something that the user had in 
the original program.  I have a patch to hide that parameter, that is, it 
stops GCC from writing out DW_TAG_formal_parameter for _c.  But I am worried 
about how this will work out later if/when someone tries to make LTO work for 
gfortran too.
Can you still reconstruct the correct function prototype for LTO from the 
debug info if you don't write debug info for _c?

Similarly, LTO has to somehow deal with DECL_VALUE_EXPR and the debug 
information that is produced from it.  Gfortran (and iiuc other front ends 
and SRA) use this DECL_VALUE_EXPR to produce fake variables that point to 
some location to improve the debug experience of the user.  For Fortran we 
use it to create fake variables to point at members of a COMMON block, for 
example, so that the user can do "p A" for a variable A in a common block, 
instead of "p name_of_the_common_block.A".  Is there already some provision 
to handle this kind of trickery in LTO?

Finally, consider another Fortran example:

program debug_array_dimensions
implicit none
integer i(10,10)
i(2,9) = 1
end

Gfortran currently produces the following wrong debug information for this 
example:

 <2><94>: Abbrev Number: 3 (DW_TAG_variable)
 DW_AT_name: i
 DW_AT_decl_file   : 1
 DW_AT_decl_line   : 1
 DW_AT_type: 
 DW_AT_location: 3 byte block: 91 e0 7c (DW_OP_fbreg: -416)
 <1>: Abbrev Number: 4 (DW_TAG_array_type)
 DW_AT_type: 
 DW_AT_sibling : 
 <2>: Abbrev Number: 5 (DW_TAG_subrange_type)
 DW_AT_type: 
 DW_AT_lower_bound : 0
 DW_AT_upper_bound : 99
 <1>: Abbrev Number: 6 (DW_TAG_base_type)
 DW_AT_byte_size   : 8
 DW_AT_encoding: 5  (signed)
 DW_AT_name: int8
 <1>: Abbrev Number: 6 (DW_TAG_base_type)
 DW_AT_byte_size   : 4
 DW_AT_encoding: 5  (signed)
 DW_AT_name: int4

Note the single DW_TAG_subrange_type <0, 99> for the type of "i", instead of 
two DW_TAG_subrange_type entries <1, 10>.  This happens because in 
gfortran all arrays are flattened (iirc to make code generation easier).  I 
would like to make gfortran write out the correct debug information, e.g. 
something with

 <2>: Abbrev Number: 5 (DW_TAG_subrange_type)
 DW_AT_type: 
 DW_AT_upper_bound : 10
 <2>: Abbrev Number: 5 (DW_TAG_subrange_type)
 DW_AT_type: 
 DW_AT_upper_bound : 10

but what would happen if LTO reads this in and reconstructs the type of "i" 
from this information?  I imagine it would lead to mismatches between the GIMPLE 
code that you read in, where "i" is a 1x100 array, and the reconstructed 
variable "i", which would be a 10x10 2D array.

Has anyone working on LTO already thought of these challenges?

I'm all new to both DWARF and LTO, so forgive me if my rant doesn't make 
sense ;-)  

Gr.
Steven



Re: Re: LOOP_HEADER tree code?

2006-10-25 Thread Steven Bosscher

On 10/25/06, Devang Patel <[EMAIL PROTECTED]> wrote:

> > However, various optimizers need to know about this special tree node.
>
> not really (not any more than they know about other tree codes that are
> not interesting for them).

If we take an example of Jump Threading pass then it needs to  know
about this tree node and update it properly.


Yes, when it modifies the CFG in ways that affect the loops info. And
one nice thing about this LOOP_HEADER idea is that, in your example,
Jump Threading:
- can see that node so it knows there is something to update
- knows what it is changing so it also knows how that affects the loops info
- can change it on-the-fly

This means, no need for a cleanup pass after all changes are done.


So, the passes that manipulate the loop structure need to know about
LOOP_HEADER, and others do not need to worry about LOOP_HEADER.


More accurately, the passes that manipulate the CFG. Right now most of
these passes don't even know they modify the loop structure.


Now, focusing on the passes that manipulate the loop structure: are these
passes responsible for fixing loop info, or is it the responsibility of a cleanup pass?


It seems to me that a cleanup pass would defeat the purpose of keeping
loop info up to date. Your cleanup pass would probably end up just
recomputing everything.

That said, I don't really see what a LOOP_HEADER node would give you
that you can't get by making the cfg-modifying passes actually
loop-aware, or perhaps by using cfghooks to update the loop
information on the fly when a pass changes the CFG. It would be
helpful if Zdenek could give an example where a LOOP_HEADER node is
really the only way to help keep loop info accurate.

Gr.
Steven


Re: Re: Re: Re: LOOP_HEADER tree code?

2006-10-25 Thread Steven Bosscher

On 10/25/06, Devang Patel <[EMAIL PROTECTED]> wrote:

> > One way to achieve this is to mark n_1 (in your example) as
> > "do not dead strip because I know it is used" , kind of attribute((used)).
>
> This is what as I understand LOOP_HEADER is used for.

Big difference. New tree vs TREE_USED or DECL_PRESERVE_P bit.


DECL_PRESERVE_P wouldn't work, because afaiu the number of iterations
is stored in an SSA_NAME tree node, not a *DECL node.

You could use TREE_USED,  but your suggestion implies that dead code
should be retained in the program, just for the sake of knowing how
many iterations a loop has. I wouldn't be surprised if some passes are
not prepared to handle that, and it sounds like just a really bad
idea.

Gr.
Steven


Re: Re: LOOP_HEADER tree code?

2006-10-25 Thread Steven Bosscher

On 10/25/06, Zdenek Dvorak <[EMAIL PROTECTED]> wrote:

it definitely is not the only way, and seeing the reaction of people,
I probably won't use it.  The main reason for considering to use
the tree node for me was the possibility to make the number of iterations
of the loop as its operand, so that I would not need to worry about
keeping it alive through dce, copy/constant propagation, etc. (without
a statement carrying it in IL, I do not see a solution that would not
be just asking for introducing bugs and getting broken accidentally).


I wouldn't give up so fast.  If there are convincing technical reasons
for this kind of tree node, then your idea should be seriously
considered.  Many people thought ASSERT_EXPRs were a really bad idea
too, when they were invented...

Gr.
Steven


Re: Re: Re: Re: Re: LOOP_HEADER tree code?

2006-10-25 Thread Steven Bosscher

On 10/26/06, Devang Patel <[EMAIL PROTECTED]> wrote:

On 10/25/06, Steven Bosscher <[EMAIL PROTECTED]> wrote:

> You could use TREE_USED,  but your suggestion implies that dead code
> should be retained in the program,

Maybe I misunderstood, but it is not dead code. Here is what Zdenek said,

"
...
To keep the information valid, we need
> >   to prevent optimizations from destroying it (e.g., if the number
> >   is n_1 = n_2 - 1, and this is the last use of n_1, we do not want
> >   DCE to remove it);

..."


So you would mark n_1 with TREE_USED, and never let it be removed?
What would happen if e.g. the entire loop turns out to be dead code?
Or if the loop is rewritten (e.g. vectorized) in a way that changes
the number of iterations of the loop? Then the assignment to n_1 would
be _really_ dead, but there wouldn't be any way to tell.

The nice thing about the LOOP_HEADER node is that it makes these uses
of SSA names explicit.

Gr.
Steven


Re: Re: LOOP_HEADER tree code?

2006-10-26 Thread Steven Bosscher

On 10/26/06, Jeffrey Law <[EMAIL PROTECTED]> wrote:

> So, the passes that manipulate the loop structure need to know about
> LOOP_HEADER, and others do not need to worry about LOOP_HEADER.
Passes which do code motions may need to know about it -- they don't
need to update its contents, but they may need to be careful about
how statements are moved around in the presence of a LOOP_HEADER note.


It is not a note, it's a statement. The problem with RTL loop notes
was that they were not statements, but rather markers, e.g. "a loop
starts/ends here".  The LOOP_HEADER node, on the other hand, is more
like a placeholder for the result of the number of iterations
computation. Basically it is a statement that does not produce a
result, but does have uses.

I don't see why a code motion pass would have to worry about the
LOOP_HEADER node. The LOOP_HEADER node is before the loop, IIUC, so
any code moved out of the loop would not affect the value of the use
operand for the LOOP_HEADER (by definition, because we're in SSA form
so DEFs inside the loop can't reach the LOOP_HEADER node).

Gr.
Steven


Re: build failure, GMP not available

2006-10-30 Thread Steven Bosscher

On 30 Oct 2006 22:56:59 -0800, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:


I'm certainly not saying that we should pull out GMP and MPFR.  But I
am saying that we need to do much much better about making it easy for
people to build gcc.


Can't we just make it so that, if gmp/ and mpfr/ directories exist in
the toplevel, they are built along with GCC?  I don't mean actually
including gmp and mpfr in the gcc SVN repo, but just making it
possible to build them when someone unpacks gmp/mpfr tarballs in the
toplevel dir.

Gr.
Steven


Re: build failure, GMP not available

2006-10-31 Thread Steven Bosscher

On 10/31/06, Marcin Dalecki <[EMAIL PROTECTED]> wrote:

This question is not related to the apparent
instability and thus
low quality of GMP/MPFR at all.


This is the second time I see someone complain about GMP/MPFR
instability. What is this complaint based on?  We've used GMP in g95
and later gfortran since the project incarnation 7 years ago, and as
far as I know we've never had to change anything for reasons of
instability. In fact, AFAIK we still had source compatibility when we
moved from GMP3 to GMP4. Is there some bug report / web page somewhere
that describes the instability problems you folks apparently have on
Macs?

Gr.
Steven


Re: defunct fortran built by default for cross-compiler

2006-11-01 Thread Steven Bosscher

On 11/1/06, Joern RENNECKE <[EMAIL PROTECTED]> wrote:

With literally more than ten thousand lines of error messages per
multilib for fortran, that makes the test results unreportable.


So you don't report any error messages at all and leave us guessing?

Gr.
Steven


Re: [PING] fwprop in 4.3 stage 1?

2006-11-01 Thread Steven Bosscher

On 10/31/06, Roger Sayle <[EMAIL PROTECTED]> wrote:

I foresee no problems in getting the fwprop pass merged into mainline
this week.  One detail I would like resolved however, is if you and
Steven Bosscher could confirm you're both co-ordinating your efforts.
Presumably, adding fwprop is part of the agreed upon game-plan, and
not something that will complicate Steven's CSE efforts.


We're not co-ordinating the effort right now, but we've obviously been
working very hard together in GCC 4.2 stage1, and fwprop was "part of
the plan" back then to eliminate CSE path following completely (a goal
that I've since abandoned).

What fwprop should achieve, is:
- catch the optimizations we miss with CSE skip-blocks disabled
- make the first gcse.c local const/copy prop pass redundant

It used to do both these things quite well late last year, and I have
no reason to believe that it would be any different right now. The
only downside is that the compile time benefit is not as big as it
would have been if CSE path following could have been eliminated, but
fwprop is really fast anyway.

Also, fwprop is a nice example pass for how to use df.c and how to use
the CFG instead of working around it like CSE does ;-)

So, having fwprop in the trunk will only be a good thing IMHO.

Gr.
Steven


Re: Handling of extern inline in c99 mode

2006-11-01 Thread Steven Bosscher

On 11/1/06, Paolo Bonzini <[EMAIL PROTECTED]> wrote:


> According to the proposal, we will restore the GNU handling for
> "extern inline" even when using -std=c99, which will fix the problem
> when using glibc.

I am probably overlooking something, but if the only problematic system
is glibc, maybe this can be fixed with a fixincludes hack?


That would be a massive hack.

Gr.
Steven


Re: GCSE again: bypass_conditional_jumps -vs- commit_edge_insertions - problem with ccsetters?

2006-11-01 Thread Steven Bosscher

On 11/2/06, Roger Sayle <[EMAIL PROTECTED]> wrote:

Steven Bosscher might even have plans for reorganizing jump bypassing
already as part of his CSE/GCSE overhaul?


Yes, and one part of that plan is to pre-split all critical edges so
that you never have to insert on edges.  That would make your problem
go away, iiuc.

Gr.
Steven


Re: compiling very large functions.

2006-11-05 Thread Steven Bosscher

On 11/5/06, Richard Guenther <[EMAIL PROTECTED]> wrote:

> I lean to leave the numbers static even if they do increase as time goes
> by.  Otherwise you get two effects: the first optimizations get to be
> run more, and you get the weird non-linear step functions where small
> changes in some upstream function affect the downstream.

Ok, I guess we can easily flag each function as having
 - many BBs
 - big BBs
 - complex CFG (many edges)
and set these flags at CFG construction time during the lowering phase
(which is after the early inlining pass I believe).


IMHO any CFG-based criteria should use dynamic numbers, simply
because they are available at all times. Large BBs are a more
interesting criterion, because in general they don't get smaller during
optimizations.

What Kenny suggests here is not new, BTW.  I know that gcse already
disables itself on very large functions (see
gcse.c:is_too_expensive()), and probably some other passes do this as
well. A grep for OPT_Wdisabled_optimization *should* show all the
places where we throttle or disable passes, but it appears that
warnings have not been added consistently when someone throttled a
pass.

AFAIK not one of the tree optimizers disables itself, but perhaps we
should. The obvious candidates would be the ones that require
recomputation of alias analysis, and the ones that don't update SSA
info on the fly (i.e. require update_ssa, which is a horrible compile
time hog).

Gr.
Steven


Re: compiling very large functions.

2006-11-05 Thread Steven Bosscher

On 11/5/06, Eric Botcazou <[EMAIL PROTECTED]> wrote:

> AFAIK not one of the tree optimizers disables itself, but perhaps we
> should. The obvious candidates would be the ones that require
> recomputation of alias analysis, and the ones that don't update SSA
> info on the fly (i.e. require update_ssa, which is a horrible compile
> time hog).

Tree alias analysis can partially disable itself though:

  /* If the program has too many call-clobbered variables and/or function
 calls, create .GLOBAL_VAR and use it to model call-clobbering
 semantics at call sites.  This reduces the number of virtual operands
 considerably, improving compile times at the expense of lost
 aliasing precision.  */
  maybe_create_global_var (ai);

We have found this to be quite helpful on gigantic elaboration procedures
generated for Ada packages instantiating gazillions of generics.  We have
actually lowered the threshold locally.


Heh, I believe you! :-)

IMHO we should add a OPT_Wdisabled_optimization warning there, though.

Gr.
Steven


Re: compiling very large functions.

2006-11-05 Thread Steven Bosscher

On 11/5/06, Kenneth Zadeck <[EMAIL PROTECTED]> wrote:

I would like to point out that the central point of my proposal was to
have the compilation manager be the process that manages whether an
optimization is skipped or not, rather than having each pass make a
decision on its own.  If we have a central mechanism, then it is
relatively easy to find some sweet spots.  If every pass rolls its own, it
is more difficult to balance.


Hmm, I don't understand this.  Why is it harder to find a sweet spot
if every pass decides for itself whether to run or not?  I would think
that this decision should be made by each pass individually, because
the pass manager is one abstraction level higher where it shouldn't
have to know the behavior of each pass.

Gr.
Steven


Re: Polyhedron performance regression

2006-11-11 Thread Steven Bosscher

On 11/11/06, Paul Thomas <[EMAIL PROTECTED]> wrote:

Richard,
>
> If I had to guess I would say it was the forwprop merge...
The what? :-)


fwprop, see
http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00141.html

If someone can confirm that this patch causes the drop, I can help
trying to find a fix.

Gr.
Steven


Re: vectorizer data dependency graph

2006-11-15 Thread Steven Bosscher

On 11/15/06, Sebastian Pop <[EMAIL PROTECTED]> wrote:


There is a ddg in this patch if somebody wants the classic Allen&Kennedy
way to look at the dependences:
http://gcc.gnu.org/wiki/OptimizationCourse?action=AttachFile&do=get&target=loop-distribution-patch-against-gcc-4.1.0-release.patch



Any plans to merge this into the FSF trunk?

Gr.
Steven


Re: EXPR_HAS_LOCATION seems to always return false

2006-11-16 Thread Steven Bosscher

On 11/17/06, Brendon Costa <[EMAIL PROTECTED]> wrote:

Is there something I should be doing before using EXPR_HAS_LOCATION()?


Compile with -g, perhaps?

Gr.
Steven


Why does flow_loops_find modify the CFG, again?

2006-11-18 Thread Steven Bosscher
Hi Zdenek, all,

I'm running into some troubles with an if-conversion pass that runs
after reload, where we have to avoid lifting insns across a loop
exit edge into a loop.  ifcvt.c uses flow_loops_find to find loops
and mark all loop exit edges:

  if ((! targetm.cannot_modify_jumps_p ())
  && (!flag_reorder_blocks_and_partition || !no_new_pseudos
  || !targetm.have_named_sections))
{
  struct loops loops;

  flow_loops_find (&loops);
  mark_loop_exit_edges (&loops);
  flow_loops_free (&loops);
  free_dominance_info (CDI_DOMINATORS);
}

I was wondering why we would sometimes *not* mark exit edges, but then
I remembered that for some reason flow_loops_find modifies the CFG,
which may lead to problems that we have to work around here.

But if we do not mark loop exit edges, we can sometimes end up doing
unprofitable if-conversions!

It seems to me that a function called "flow_loops_find" is supposed to
do *just* analysis, and not transformations.  Apparently it now first
transforms all loops into some canonical form, but that is completely
inappropriate and unnecessary for some users of this loops analysis.

Is this something that could be easily fixed?  E.g. can we make it
that flow_loops_find only performs transformations if asked to (by
adding a function argument for that)?

Gr.
Steven



Re: [avr-gcc-list] Re: AVR byte swap optimization

2006-11-19 Thread Steven Bosscher

On 11/19/06, Eric Weddington <[EMAIL PROTECTED]> wrote:

> Use gcc head, __builtin_bswap and make sure the AVR backend
> implements the
> bswap rtl patterns.

There's the problem. You can't just glibly say "make sure the AVR backend
implements the bswap rtl patterns". There are precious few volunteers who
are familiar enough with gcc internals and the avr port in particular to go
do just that. AFAIK, there is no bswap rtl pattern in the avr port, at least
there doesn't seem to be in 4.1.1.


Why is that a problem?
Do you have a different solution in mind?


> Future versions of gcc may also be able to recognise these
> idioms without
> using the builtin, but AFAIK that's not been implemented yet.

Plus there is a long lead time between when it is implemented on HEAD, then
branched, released from a branch, and then when it shows up in binary
distributions.


That happens with all improvements that are implemented between
releases, so I don't see your point.

Gr.
Steven


Re: [PATCH] Canonical types (1/3)

2006-11-28 Thread Steven Bosscher

On 11/28/06, Doug Gregor <[EMAIL PROTECTED]> wrote:

* tree.h (TYPE_CANONICAL): New.
(TYPE_STRUCTURAL_EQUALITY): New.
(struct tree_type): Added structural_equality, unused_bits,
canonical fields.


If I understand your patches correctly, this stuff is only needed for
the C-family languages.  So why steal two pointers on the generic
struct tree_type?  Are you planning to make all front ends use these
fields, or is it just additional bloat for e.g. Ada, Fortran, Java?
;-)

Gr.
Steven


Re: rtl dumps

2006-12-01 Thread Steven Bosscher

On 12/1/06, Andrija Radicevic <[EMAIL PROTECTED]> wrote:

Hi,

I have noticed that the INSN_CODE for all patterns in the RTL dumps
.00.expand is -1 ... does this mean that the .md file was not used for the
initial RTL generation?


It was used, but it is assumed that the initial RTL produced by
'expand' is valid, i.e. you should be able to call recog() on all
insns and not fail.

Gr.
Steven


Re: expand_builtin_memcpy bug exposed by TER and gfortran

2006-12-05 Thread Steven Bosscher

On 12/5/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:

My preference is to check in the TER code which exposes this bug, and
open a PR against the failure with this info.  That way we don't lose
track of the problem, and someone can fix it at their leisure. Until
then there will be a testsuite failure in gfortran for the testcase
which triggers this.

Does that seem reasonable? or would everyone prefer I get it fixed
before checking in the TER code?


No, IMHO.

It's unfortunate enough if a patch introduces a bug that we only find
later.  It's Very Bad And Very Wrong to allow in patches that cause
test suite failures.

Frankly, I don't understand why you even ask.  We have rules for
testing for a reason.

Gr.
Steven


Re: void* vector

2006-12-09 Thread Steven Bosscher

On 12/9/06, Alexey Smirnov <[EMAIL PROTECTED]> wrote:

typedef void* handle_t;

DEF_VEC_I(handle_t);
DEF_VEC_ALLOC_I(handle_t,heap);


Why DEF_VEC_I instead of DEF_VEC_P?

See vec.h.

Gr.
Steven


Re: Bootstrap broken on mipsel-linux...

2006-12-10 Thread Steven Bosscher

On 12/11/06, David Daney <[EMAIL PROTECTED]> wrote:

 From svn r119726 (Sun, 10 Dec 2006) I am getting an ICE during
bootstrap on mipsel-linux.  This is a new failure  since Wed Dec 6
06:34:07 UTC 2006 (revision 119575) which bootstrapped and tested just
fine. I don't really want to do a regression hunt as bootstraps take 3
or 4 days for me.  I will update and try it again.


No need.  It's my CSE patch, no doubt:
http://gcc.gnu.org/ml/gcc-patches/2006-12/msg00698.html

I'll try to figure out what's wrong.


/home/build/gcc-build/./prev-gcc/xgcc
-B/home/build/gcc-build/./prev-gcc/
-B/usr/local/mipsel-unknown-linux-gnu/bin/ -c   -g -O2 -DIN_GCC   -W
-Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings
-Wold-style-definition -Wmissing-format-attribute -Werror -fno-common
-DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/.
-I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include
-I../../gcc/gcc/../libdecnumber -I../libdecnumber
../../gcc/gcc/c-decl.c -o c-decl.o
../../gcc/gcc/c-decl.c: In function 'set_type_context':
../../gcc/gcc/c-decl.c:691: internal compiler error: in cse_find_path,
at cse.c:5930
Please submit a full bug report,
with preprocessed source if appropriate.


Sic :-)  A test case would be helpful.

Gr.
Steven


Re: Bootstrap broken on mipsel-linux...

2006-12-11 Thread Steven Bosscher

On 12/11/06, David Daney <[EMAIL PROTECTED]> wrote:

Let's assume that it doesn't affect i686 or x86_64.  Because if it did,
someone else would have been hit by it by now.


I'm sure it doesn't, I bootstrapped&tested on those targets (and on ia64).


So you would need a mips[el]-linux system in order to reproduce it.  But
if you had that, you could compile c-decl.c yourself to reproduce it.
But if you really want it, I can get you a preprocessed version of
c-decl.c.  I suppose one could try it on a  cross-compiler, but I have
no idea if that would fail in the same manner.


If you have a test case, I should be able to reproduce it with a
cross.  Getting a test case with a cross-compiler is the more
difficult part.  I could try to use a preprocessed c-decl.c from the
cross-compiler configuration. But it wouldn't be the same input file
as the one from your ICE, so whether that would allow me to reproduce
the problem remains to be seen. If you have a preprocessed c-decl.c
that ICEs for you, that would be helpful.  If not, I'll just have to
figure out a way to reproduce the ICE in some different way.

Gr.
Steven


Re: Bootstrap broken on mipsel-linux...

2006-12-11 Thread Steven Bosscher

On 12/11/06, Kaz Kojima <[EMAIL PROTECTED]> wrote:

It seems that the first tree dump which differs before and
after r119711 is .099t.optimized.


In that case, this is a different problem, probably caused by the new
out-of-SSA pass.  But to be sure, I suggest you revert my CSE patch
and see if that makes the problem go away for you.

Gr.
Steven


Re: Bootstrap broken on mipsel-linux...

2006-12-11 Thread Steven Bosscher

On 12/12/06, Kaz Kojima <[EMAIL PROTECTED]> wrote:

"Steven Bosscher" <[EMAIL PROTECTED]> wrote:
> In that case, this is a different problem, probably caused by the new
> out-of-SSA pass.  But to be sure, I suggest you revert my CSE patch
> and see if that makes the problem go away for you.

I've confirmed that the problem remains after reverting the
r119706 changes to cse.c.  So it may be another problem, though
it might produce a wrong stage 1 compiler for mipsel-linux and
end up with the ICE in stage 2.


In the mipsel-linux case, we ended up with a diamond region where the
jump in the IF-block was folded, so that we could extend the path
along one of the diamond's arms with the JOIN-block.  This could
happen because cse_main traversed the basic blocks in DFS order
instead of in topological order.  I have just posted a hopeful fix for
this.

Gr.
Steven


Re: 32 bit jump instruction.

2006-12-13 Thread Steven Bosscher

On 12/13/06, Joern Rennecke <[EMAIL PROTECTED]> wrote:

In http://gcc.gnu.org/ml/gcc/2006-12/msg00328.html, you wrote:
However, because the SH has delayed branches, there is always a guaranteed way
to find a register - one can be saved, and then be restored in the delay slot.


Heh, that's an interesting feature :-)

How does that work?  I always thought that the semantics of delayed
insns is that the insn in the delay slot is executed *before* the
branch. But that is apparently not the case, or the branch register
would have been over-written before the branch. How does that work on
SH?

Gr.
Steven


Re: g++ doesn't unroll a loop it should unroll

2006-12-13 Thread Steven Bosscher

On 12/13/06, Benoît Jacob <[EMAIL PROTECTED]> wrote:

g++ -DUNROLL -O3 toto.cpp -o toto   ---> toto runs in 0.3 seconds
g++ -O3 toto.cpp -o toto---> toto runs in 1.9 seconds

So what can I do? Is that a bug in g++? If yes, any hope to see it fixed soon?


You could try adding -funroll-loops.

Gr.
Steven


Re: Memory allocation for local variables.

2006-12-13 Thread Steven Bosscher

On 12/13/06, Sandeep Kumar <[EMAIL PROTECTED]> wrote:

Hi all,
I tried compiling the above two programs :
on x86, 32 bit machines.
[EMAIL PROTECTED] ~]# gcc test.c


Try with optimization enabled (try -O1 and/or -O2).

Gr.
Steven


Re: Back End Responsibilities + RTL Generation

2006-12-13 Thread Steven Bosscher

On 12/13/06, Frank Riese <[EMAIL PROTECTED]> wrote:

One of my professors stated that a GCC Back End uses the Control Flow Graph as
its input and that generation of RTL expressions occurs later on.


That is not true.


What roles
do Back and Middle End play in generation of RTL? Would you consider the CFG
or RTL expressions as the input for a GCC Back End?


Let me first say that the definitions of front end, back end, and
middle end are a bit hairy.  You have to carefully define what you
classify as belonging to the middle end or the back end.  I actually
try to avoid the terms nowadays.

Also, you have to be specific about the version of GCC that you're
talking about. GCC2, GCC3 and GCC4 are completely different
internally, and even the differences between various GCC4 releases are
quite significant.

Anyway...

The steps through the compiler are as follows:

1. front end runs, produces GENERIC
2. GENERIC is lowered to GIMPLE
3. a CFG is constructed for GIMPLE
4. GIMPLE (tree-ssa) optimizers run
5. GIMPLE is expanded to RTL, while preserving the CFG
6. RTL optimizers run
7. assembly is written out

The RTL generation in step 5 is done one statement at a time.   The
part of the compiler that generates the RTL is a mix of shared code
and of back-end code: a single GIMPLE statement at a time is passed to
the middle-end expand routines, which try to produce RTL for this
statement using instructions available on the target machine.  The
available instructions are defined by the target machine description
(i.e. the back end).

Try to understand cfgexpand.c and the section on named RTL patterns in
the GCC internals manual.
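To watch these steps on a concrete function, the per-pass dump flags are handy (a sketch; the exact dump file names vary between GCC versions):

```c
/* t.c -- compile with, e.g.:
 *
 *   gcc -O1 -c t.c -fdump-tree-gimple -fdump-tree-optimized -fdump-rtl-expand
 *
 * This writes dump files next to t.c: one after gimplification (step 2),
 * one after the GIMPLE optimizers (step 4), and one with the RTL produced
 * by expansion (step 5).  Comparing them shows what each phase did.  */
int sum_upto (int n)
{
  int i, s = 0;
  for (i = 1; i <= n; i++)
    s += i;
  return s;
}
```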


I also remembered having read the following line from the gcc internals
documentation. However, I'm still not sure how to interpret this:

"A control flow graph (CFG) is a data structure built on top of the
intermediate code representation (the RTL or tree instruction stream)
abstracting the control flow behavior of a function that is being compiled"

Does that mean that a control flow graph is built after RTL has been generated,
or that information about the control flow is incorporated into the RTL data
structures?


Neither.

I'm assuming you're interested in how this works in recent GCC
releases, i.e. GCC4 based.  In GCC4, the control flow graph is built on
GIMPLE; the tree-ssa optimizers need a CFG too.  This CFG is kept
up-to-date through the optimizers and through expansion to RTL.  This
means that GCC builds the CFG only once for each function.

The data structures for the CFG are in basic-block.h.  These data
structures are most definitely *not* incorporated into the RTL
structures.  The CFG is independent of the intermediate
representations for the function instructions.  It has to be, or we
could not have the same CFG data structures for both GIMPLE and RTL.

Hope this helps,

Gr.
Steven


Re: g++ doesn't unroll a loop it should unroll

2006-12-13 Thread Steven Bosscher

On 12/14/06, Benoît Jacob <[EMAIL PROTECTED]> wrote:

I don't understand why you say that. At the language specification level,
templates come with no inherent speed overhead. All of the template stuff is
unfolded at compile time, none of it remains visible in the binary, so it
shouldn't make the binary slower.


You're confusing theory and practice...

Gr.
Steven


Re: Do we want non-bootstrapping "make" back?

2006-12-30 Thread Steven Bosscher

On 12/30/06, Daniel Jacobowitz <[EMAIL PROTECTED]> wrote:

Once upon a time, the --disable-bootstrap configure option wasn't
necessary.  "make" built gcc, and "make bootstrap" bootstrapped it.

Is this behavior useful?  Should we have it back again?


For me the current behavior works Just Fine.

Gr.
Steven


Nested libcalls (was: Re: RFC: SMS problem with emit_copy_of_insn_after copying REG_NOTEs)

2006-12-30 Thread Steven Bosscher
On Sunday 31 December 2006 00:59, Jan Hubicka wrote:
> > Also I should mention, this also fixes a possible bug with libcalls that
> > are embedded in one another.  Before we were just assuming if we have a
> > REG_RETVAL, then the previous REG_LIBCALL would be the start of the
> > libcall but that would be incorrect with embedded libcalls.
>
> We should not have nested libcalls at all.  One level of libcalls is
> painful enough and we take care to not do this.

It's unclear whether we can have nested libcalls or not.  We expect them
in some places (especially, see libcall_stack in gcse.c:local_cprop_pass)
but are bound to fail miserably in others.

This is something I've been wondering for a while.  Maybe someone can
give a definitive answer: Can libcalls be nested, or not?

Gr.
Steven


Re: changing "configure" to default to "gcc -g -O2 -fwrapv ..."

2006-12-31 Thread Steven Bosscher

On 12/31/06, Paul Eggert <[EMAIL PROTECTED]> wrote:

Also, as I understand it this change shouldn't affect gcc's
SPEC benchmark scores, since they're typically done with -O3
or better.


It's not all about benchmark scores.  I think most users compile at
-O2 and they also won't understand why they get a performance drop on
their code.

You say you doubt it affects performance.  Based on what?  Facts
please, not guesses and hand-waving...

Gr.
Steven


Re: gcc 3.4 > mainline performance regression

2007-01-05 Thread Steven Bosscher

On 05 Jan 2007 07:18:47 -0800, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:

At the tree level, the problem is that the assignment to a[0] is seen
as aliasing a[1].  This causes the use of a[1] to look like a USE of
an SMT, and the assignment to a[0] to look like a DEF of the same
SMT.  So in tree-ssa-loop-im.c the statements look like they are not
loop invariant.

I don't know if we can do better with our current aliasing
representation.  Unless we decide to do some sort of array SRA.

Or perhaps we could make the loop invariant motion pass more
complicated: when it sees a use or assignment of a memory tag, it
could explicitly check all the other uses/assignments in the loop and
see if they conflict.  I don't really know how often this would pay
off, though.


How about using dependence analysis instead?



At the RTL level we no longer try to hoist MEM references out of
loops.  We now assume that is handled at the tree level.


We do hoist MEMs out of loops, in gcse.c.

Gr.
Steven


Re: gcc 3.4 > mainline performance regression

2007-01-05 Thread Steven Bosscher

On 1/5/07, Andrew Haley <[EMAIL PROTECTED]> wrote:

This is from the gcc-help mailing list.  It's mentioned there for ARM,
but it's just as bad for x86-64.

It appears that memory references to arrays aren't being hoisted out
of loops: in this test case, gcc 3.4 doesn't touch memory at all in
the loop, but 4.3pre (and 4.2, etc) does.

Here's the test case:

void foo(int *a)
{
  int i;
  for (i = 0; i < 100; i++)
    a[0] += a[1];
}

gcc 3.4.5 -O2:

.L5:
	leal	(%rcx,%rsi), %edx
	decl	%eax
	movl	%edx, %ecx
	jns	.L5

gcc 4.3pre -O2:

.L2:
	addl	4(%rdi), %eax
	addl	$1, %edx
	cmpl	$100, %edx
	movl	%eax, (%rdi)
	jne	.L2

Thoughts?


What does the code look like if you compile with -O2  -fgcse-sm?
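For reference, the missing transformation amounts to hoisting the load of a[1] and sinking the store to a[0] by hand, which is safe here because the two elements cannot overlap. A hand-transformed sketch (function name made up):

```c
/* What gcc 3.4 effectively did: keep the running sum in a register and
 * touch memory only before and after the loop, not on every iteration. */
void foo_hoisted (int *a)
{
  int t  = a[0];   /* sunk store: accumulate in a register */
  int a1 = a[1];   /* hoisted load: a[1] is loop-invariant */
  int i;

  for (i = 0; i < 100; i++)
    t += a1;

  a[0] = t;
}
```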

Gr.
Steven


Re: gcc 3.4 > mainline performance regression

2007-01-05 Thread Steven Bosscher

On 1/5/07, David Edelsohn <[EMAIL PROTECTED]> wrote:

>>>>> Steven Bosscher writes:

Steven> What does the code look like if you compile with -O2  -fgcse-sm?

Yep.  Mark and I recently discussed whether gcse-sm should be
enabled by default at some optimization level.  We're hiding performance
from GCC users.


The problem with it used to be that it was just very broken. When I
fixed PR24257, it was still not possible to bootstrap with gcse store
motion enabled.

Putting someone on fixing tree load&store motion is probably more
useful anyway, if you're going to do load&store motion for
performance.  In RTL, we can't move loads and stores that are not
simple loads or stores (i.e. reg <- mem, or mem <- reg). There are two
very popular targets where this is the common case ;-)

Gr.
Steven


We have no active maintainer for the i386 port

2007-01-06 Thread Steven Bosscher
Hi,

We currently do not have an active maintainer for the i386 port.  The
only listed maintainer for the port is rth, and he hasn't been around
to approve patches in a while.   This situation is a bit strange for
a port that IMHO is one of the most important ports GCC has...

In the mean time, patches don't get approved (see e.g. [1]), or they
get approved by middle-end maintainers who, strictly speaking, should
not be approving backend patches, as I understand it.

So, can the SC please appoint a new/extra i386 port maintainer?

Thanks,

Gr.
Steven



[1] http://gcc.gnu.org/ml/gcc-patches/2007-01/msg00379.html


Re: dump after RTL expand

2007-01-11 Thread Steven Bosscher

On 1/11/07, Andrija Radicevic <[EMAIL PROTECTED]> wrote:

Hi,
how could I find out from which patterns, in the md file, the 00.expand file 
was generated (i.e. to map the patterns in the expand file with the ones in the 
.md file)? Is there a compiler option/switch which would tell the compiler to mark 
the patterns in the expand file with the insn names from the md file?


There isn't.

You would have to walk over the insns and make recog assign them an insn code.

Gr.
Steven


Re: dump after RTL expand

2007-01-12 Thread Steven Bosscher

On 1/12/07, Andrija Radicevic <[EMAIL PROTECTED]> wrote:

> On Thursday 11 January 2007 19:27, Steven Bosscher wrote:
> > On 1/11/07, Andrija Radicevic <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > > how could I find out from which patterns, in the md file, the
> 00.expand
> > > file was generated (i.e. to map the patterns in the expand file with
> the
> > > ones in the .md file)? Is there a compiler option/switch which would
> tell
> > > the compiler mark the patterns in the expand file with the insns names
> > > from the md file?
> >
> > There isn't.
> >
> > You would have to walk over the insn and make recog assign them an insn
> > code.
>
> That still wouldn't tell you what names were used to generate them. It's
> common to have a named expander that generates other (possibly anonymous)
> insns.
>

Does that mean that the expand file isn't the dump after the initial rtl
generation phase? According to internals manual, only the named
define_insn and define_expand are used during rtl generation phase.


The manual is correct, but the define_expands can produce the anonymous insns.

If you recog an insn that isn't a named pattern, you still get the
"name" of the define_insn (with the "*" in front of it) or just "" if
the insn doesn't have a name. You always get at least the insn code.
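A sketch of what such a walk could look like (a GCC-internals fragment, not standalone code; recog_memoized and get_insn_name are the relevant hooks, the dump call is illustrative):

```c
/* For each INSN_P insn, ask recog for its code and print the
   (possibly "*"-prefixed, possibly empty) pattern name.  */
int icode = recog_memoized (insn);
if (icode >= 0)
  fprintf (stderr, "insn %d: pattern \"%s\" (code %d)\n",
           INSN_UID (insn), get_insn_name (icode), icode);
```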

Gr.
Steven


Ada and the TREE_COMPLEXITY field on struct tree_exp

2007-01-18 Thread Steven Bosscher
Hello,

Ada is the last user of the tree_exp->complexity field.  Removing this
field should reduce GCC's memory usage by about 5% on a 64 bit host.
Could an Ada maintainer see if it possible to remove the use of this
field?  I would think it shouldn't be too hard -- TREE_COMPLEXITY is
used only inside ada/decl.c.  But I haven't been able to figure out
myself yet how to avoid using TREE_COMPLEXITY there...

Thanks,

Gr.
Steven



Re: CSE not combining equivalent expressions.

2007-01-18 Thread Steven Bosscher
On Thursday 18 January 2007 09:31, Jeffrey Law wrote:
> I haven't followed this thread that closely, but it seems to me this
> could be done in the propagation engine.
>
> Basically we keep track of the known zero, sign bit copies and known
> nonzero bits for SSA names, then propagate them in the obvious ways.
> Basically replicating a lot of what combine & cse do in this area,
> but at the tree level.   It's something I've always wanted to see
> implemented, but never bothered to do...

I had this implemented at one point (2 years ago??) and I could not
show any real benefit.  There were almost no opportunities for this
kind of optimization in GCC itself or in some benchmarks I looked at.

There appear to be more bit operations in RTL, so perhaps it is a
better idea to implement a known-bits propagation pass for RTL, with
the new dataflow engine.

Gr.
Steven



Re: Ada and the TREE_COMPLEXITY field on struct tree_exp

2007-01-18 Thread Steven Bosscher

On 1/18/07, Richard Kenner <[EMAIL PROTECTED]> wrote:

> Ada is the last user of the tree_exp->complexity field.  Removing
> this field should reduce GCC's memory usage by about 5% on a 64 bit
> host.  Could an Ada maintainer see if it possible to remove the use
> of this field?  I would think it shouldn't be too hard --
> TREE_COMPLEXITY is used only inside ada/decl.c.  But I haven't been
> able to figure out myself yet how to avoid using TREE_COMPLEXITY there...

It's just being used as a cache to avoid recomputing a value.  My
suggestion would be to replace it with a hash table. It'll tend to keep nodes
around a little more than usual, but that should be a tiny cost.


I had thought of a hash table, too, but I couldn't figure out where to
initialize and free it (i.e. where it is a "live" table, so to speak).
For example, I don't know if this table would be required after
gimplification, and I also don't even know how GNAT translates its own
representation to GIMPLE (whole translation unit at once? function at
a time?).

Gr.
Steven


Re: raising minimum version of Flex

2007-01-21 Thread Steven Bosscher

On 21 Jan 2007 22:13:06 -0800, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:

Ben Elliston <[EMAIL PROTECTED]> writes:

> I think it's worth raising the minimum required version from 2.5.4 to
> 2.5.31.

I want to point out that Fedora Core 5 appears to still ship flex
2.5.4.  At least, that is what flex --version reports.  (I didn't
bother to check this before.)  I think we need a very strong reason to
upgrade our requirements ahead of common distributions.  We've already
run into that problem with MPFR.


For MPFR, everyone needs to have the latest installed to be able to
build gcc. That is not the case with flex. No-one needs flex at all to
build gcc, except gcc hackers who modify one of the (two or three?)
remaining flex files and regenerate the lexers. So you can't really
compare flex and MPFR this way.

If flex 2.5.31 is already four years old, it doesn't seem unreasonable
to me to expect people to upgrade if their distribution ships with an
even older flex.

Gr.
Steven


Re: About building conditional expressions

2007-01-23 Thread Steven Bosscher

On 1/23/07, Ferad Zyulkyarov <[EMAIL PROTECTED]> wrote:

But, as I noticed, this function "build" is not maintained (used) by
gcc any more. Instead of build, what else may I use to create a
conditional expression node?


Look for buildN where N is a small integer ;-)

I think you want build2 for EQ_EXPR.
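A sketch of the calls (GCC 4.x internal API, not standalone code; `a`, `b`, `x`, `y` stand for trees you already have): build2 takes the tree code, the result type, and the two operands.

```c
/* "a == b" as a tree node, then "a == b ? x : y" via build3.  */
tree cond = build2 (EQ_EXPR, boolean_type_node, a, b);
tree pick = build3 (COND_EXPR, TREE_TYPE (x), cond, x, y);
```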

Gr.
Steven


Re: [RFC] Our release cycles are getting longer

2007-01-23 Thread Steven Bosscher

On 1/23/07, Diego Novillo <[EMAIL PROTECTED]> wrote:


So, I was doing some archeology on past releases and we seem to be
getting into longer release cycles.  With 4.2 we have already crossed
the 1 year barrier.


Heh.

Maybe part of the problem here is that the release manager isn't very
actively pursuing a release. The latest GCC 4.2 status report is from
October 17, 2006, according to the web site.  That is already more
than 100 days ago.



For 4.3 we have already added quite a bit of infrastructure that is all
good on paper but still needs some amount of TLC.


And the entire backend dataflow engine is about to be replaced, too.
GCC 4.3 is probably going to be the most experimental release since
GCC 4.0...



There was some discussion on IRC that I would like to move to the
mailing list so that we get a wider discussion.  There's been thoughts
about skipping 4.2 completely, or going to an extended Stage 3, etc.


Has there ever been a discussion about releasing "on demand"? Almost
all recent Linux and BSD distributions appear to converge on GCC 4.1
as the system compiler, so maybe there just isn't a "market" for GCC
4.2.

I don't see any point in an extended Stage 3.  People work on what
they care about, and we see time and again that developers just work
on branches instead of on bug fixes for the trunk when it is in Stage
3.

IMHO the real issue with the GCC release plan, is that there is no way
for the RM to make people fix bugs. I know the volunteer blah-blah,
but at the end of the day many bugs are caused by the people who work
on new projects on a branch when the trunk is in Stage 3.

Maybe there should just be some rules about accepting projects for the
next release cycle. Like, folks with many bugs assigned to them, or in
their area of expertise, are not allowed to merge a branch or big
patches into the trunk during Stage 1.

Not that I *really* believe that would work...  But skipping releases
is IMHO not really a better idea.

Gr.
Steven


Re: Signed int overflow behaviour in the security context

2007-01-25 Thread Steven Bosscher

On 1/25/07, Andreas Bogk <[EMAIL PROTECTED]> wrote:

"It's not my fault if
people write buggy software" is a lame excuse for sloppy engineering on
the part of gcc.


So basically you're saying gcc developers should compensate for other
people's sloppy engineering?  ;-)

Gr.
Steven


Re: [RFC] Our release cycles are getting longer

2007-01-25 Thread Steven Bosscher

On 1/25/07, H. J. Lu <[EMAIL PROTECTED]> wrote:

> >Gcc 4.2 has a serious FP performace issue:
> >
> >http://gcc.gnu.org/ml/gcc/2007-01/msg00408.html
> >
> >on both ia32 and x86-64. If there will be a 4.2.0 release, I hope it
> >will be addressed.
>
> As always, the best way to ensure that it is addressed if it is
> important to you is to address it yourself, or pay someone to do so :-)

The fix is in mainline. The question is if it should be backported to
4.2.


ISTR Dan already made it clear more than once that the answer to that
question is a loud NO.

Gr.
Steven


G++ OpenMP implementation uses TREE_COMPLEXITY?!?!

2007-01-27 Thread Steven Bosscher
Hello rth,

Can you explain what went through your mind when you picked the 
tree_exp.complexity field for implementing something new...  :-(

You know (or so I assume) this was a very Very VERY BAD thing to do,
if we are ever going to get rid of TREE_COMPLEXITY, which is a major
memory hog.  We are all better off if we can remove TREE_COMPLEXITY.

I thought we were there with an Ada patch I've just crafted, and
with the effort of Tom Tromey to remove TREE_COMPLEXITY usage from
the java front end.

But my all-languages bootstrap failed, and guess what:

/* Used to store the operation code when OMP_ATOMIC_DEPENDENT_P is set.  */
#define OMP_ATOMIC_CODE(NODE) \
  (OMP_ATOMIC_CHECK (NODE)->exp.complexity)

We should _not_ be doing this.  *Especially* not through anything
other than the TREE_COMPLEXITY accessor macro to hide the issue...
or maybe that was done on purpose? ;-)

I don't know if there is another place where we can store this
value, but we definitely should.  It is hugely disappointing to see
that, just when we're there with all other front ends, you've just
introduced another user of the tree_exp.complexity field.

Can you please help me fix this ASAP?

Gr.
Steven



Re: Ada and the TREE_COMPLEXITY field on struct tree_exp

2007-01-27 Thread Steven Bosscher

On 1/18/07, Richard Kenner <[EMAIL PROTECTED]> wrote:

> I had thought of a hash table, too, but I couldn't figure out where to
> initialize and free it (i.e. where it is a "live" table, so to speak).  For
> example, I don't know if this table would be required after gimplification,
> and I also don't even know how GNAT translates its own representation to
> GIMPLE (whole translation unit at once? function at a time?).

It's fairly conventional in that part.

But that's not relevant here.  This is used for transmitting location
information on FIELD_DECLs back to the front end.  Most records in Ada are
defined at GCC's global level, so there's little point in doing anything
other than a hash table that's initialized early on (e.g., in the routine
"gigi") and never freed.  Also, the current code just saves the result for
EXPR_P nodes since only those have TREE_COMPLEXITY, but if you're switching
to a hash table, it's probably best just to record *all* results in it.


OK, attached is the preliminary hack I created some time ago.  After
some changes, it now bootstraps, but I haven't tested it yet.  I'm
passing it as an RFC.

I did not go as far as what you suggested, because I don't want to
change code I don't really understand.  This is the minimum patch I
would need to remove the complexity field from struct tree_exp.  If
one of you can do better than this, for the purpose of GNAT, please go
ahead and change it any way you see fit ;-)


No point in getting too sophisticated here: this is just a small hack to
avoid pathological compile-time behavior when compiling certain very complex
record types.


Are these test cases in the FSF test suite?

Thanks,
Gr.
Steven
2007-xx-xx  Steven Bosscher  <[EMAIL PROTECTED]>

gcc/
	* tree.c (iterative_hash_expr): Handle types generically.
	Also handle PLACEHOLDER_EXPR nodes.

ada/
	* decl.c: Include hashtab.h and gt-ada-decl.h
	(struct cached_annotate_value_t, cached_annotate_value_tab,
	cached_annotate_value_hash, cached_annotate_value_eq,
	cached_annotate_value_marked_p, cached_annotate_value_lookup,
	cached_annotate_value_insert): New data structures and support
	functions to implement a cache for annotate_value results.
	(annotate_value): Use the hash table as a cache, instead of
	using TREE_COMPLEXITY.

Index: tree.c
===
--- tree.c	(revision 121230)
+++ tree.c	(working copy)
@@ -5158,12 +5158,21 @@ iterative_hash_expr (tree t, hashval_t v
 	  /* DECL's have a unique ID */
 	  val = iterative_hash_host_wide_int (DECL_UID (t), val);
 	}
+  else if (class == tcc_type)
+	{
+	  /* TYPEs also have a unique ID.  */
+	  val = iterative_hash_host_wide_int (TYPE_UID (t), val);
+	}
   else
 	{
-	  gcc_assert (IS_EXPR_CODE_CLASS (class));
-	  
 	  val = iterative_hash_object (code, val);
 
+	  /* The tree must be a placeholder now, or an expression.
+	 For anything else, die.  */
+	  if (code == PLACEHOLDER_EXPR)
+	return val;
+	  gcc_assert (IS_EXPR_CODE_CLASS (class));
+
 	  /* Don't hash the type, that can lead to having nodes which
 	 compare equal according to operand_equal_p, but which
 	 have different hash codes.  */
Index: ada/decl.c
===
--- ada/decl.c	(revision 121230)
+++ ada/decl.c	(working copy)
@@ -34,6 +34,7 @@
 #include "convert.h"
 #include "ggc.h"
 #include "obstack.h"
+#include "hashtab.h"
 #include "target.h"
 #include "expr.h"
 
@@ -5864,6 +5865,104 @@ compare_field_bitpos (const PTR rt1, con
 return 1;
 }
 
+
+/* In annotate_value, we compute an Uint to be placed into an Esize,
+   Component_Bit_Offset, or Component_Size value in the GNAT tree.
+   Because re-computing the value is expensive, we cache the unique
+   result for each tree in a hash table.  The hash table key is the
+   hashed GNU tree, and the hash table value is the useful data in
+   the buckets.
+
+   The hash table entries are pointers to cached_annotate_value_t.
+   We hash GNU_SIZE in the insert and lookup functions for this
+   hash table, using iterative_hash_expr.  Caching the hash
+   value on the bucket entries speeds up the hash table quite a
+   bit during resizing.  */
+
+struct cached_annotate_value_t GTY(())
+{
+  /* Cached hash value for this table entry.  */
+  hashval_t hashval;
+
+  /* The cached value.
+     ??? This should be an Uint but gengtype chokes on that.  */
+  int value;
+
+  /* The tree that the value was computed for.  */
+  tree gnu_size;
+};
+
+/* The hash table used as the annotate_value cache.  */
+static GTY ((if_marked ("cached_annotate_value_marked_p"),
+	 param_is (struct cached_annotate_value_t)))
+  htab_t cached_annotate_value_tab;
+
+/* Hash an annotate_value result for the annotate_valu

Re: G++ OpenMP implementation uses TREE_COMPLEXITY?!?!

2007-01-29 Thread Steven Bosscher

On 1/28/07, Mark Mitchell <[EMAIL PROTECTED]> wrote:

It's entirely reasonable to look for a way to get rid of this use of
TREE_COMPLEXITY, but things like:

> You know (or so I assume) this was a very Very VERY BAD thing to do

are not helpful.  Of course, if RTH had thought it was a bad thing, he
wouldn't have done it.


Fine.

Then consider all my efforts to remove it finished.

Gr.
Steven


Re: Ada and the TREE_COMPLEXITY field on struct tree_exp

2007-01-29 Thread Steven Bosscher

On 1/28/07, Steven Bosscher <[EMAIL PROTECTED]> wrote:

OK, attached is the preliminary hack I created some time ago.  After
some changes, it now bootstraps, but I haven't tested it yet.  I'm
passing it as an RFC.


This patch is hereby withdrawn.

Gr.
Steven


Re: G++ OpenMP implementation uses TREE_COMPLEXITY?!?!

2007-01-29 Thread Steven Bosscher

On 1/29/07, Paolo Bonzini <[EMAIL PROTECTED]> wrote:

I hope Steven accepts a little deal: he exits angry-stevenb-mode, and I
donate him this untested patch to remove TREE_COMPLEXITY from C++.


No, thank you.

I've decided long ago that I'm not going to work on anything unless
there is nobody working in the other direction.

In the case of TREE_COMPLEXITY, one of the best and most prominent gcc
hackers decided to use something that, I believe, everyone thinks
should go. And he did so in a way almost as if to cover it up, by
accessing the field directly instead of through the accessor macros.

So I freak out, which is not good, I know.  I apologize to those who
feel offended, because I did not mean to. My "what was on your mind"
remark was *very* tongue-in-cheek, because clearly rth wouldn't have
done this when he would have had more time/patience/whatever. See his
own remark in the commit mail about his state of mind when he commited
this bit.

But then to have Mark *support* rth's change, that really shows the
total lack of leadership and a common plan in the design of gcc.

Why should I spend hours on this kind of cleanup, only to feel
frustrated, to make others dislike me, and to have zero result in the
end?  I'll just work on something else instead.

Gr.
Steven


Re: G++ OpenMP implementation uses TREE_COMPLEXITY?!?!

2007-01-29 Thread Steven Bosscher

On 1/29/07, Joe Buck <[EMAIL PROTECTED]> wrote:

On Mon, Jan 29, 2007 at 03:24:56PM +0100, Steven Bosscher wrote:
> But then to have Mark *support* rth's change, that really shows the
> total lack of leadership and a common plan in the design of gcc.

There you go again.


Actually, there *you* go again :-)  Do you know I can't find even just
one mail from you that did not in one way or another criticize the way
I said something?


 Mark did not support or oppose rth's change, he just
said that rth probably thought he had a good reason.


Well, forgive me for missing the subtle difference between supporting
a change and suggesting there was a good reason for the change.

Also, to say that there is no common plan in GCC, or that there is no
good leadership of the project, is just the expression of my opinion.
I really am of the opinion that gcc is a strange project which claims
to be open but where the maintainers are appointed by a group of
people that hasn't changed in ten years time. If you see that as
attacking someone personally, that is your problem, not mine.



If you think that there's a problem with a patch, there are ways to say so
without questioning the competence or good intentions of the person who
made it.


Where did I question rth's competence? If you're going to be so picky
about everything I say, can you at least be specific?

Gr.
Steven


Re: G++ OpenMP implementation uses TREE_COMPLEXITY?!?!

2007-01-29 Thread Steven Bosscher

On 1/29/07, Mark Mitchell <[EMAIL PROTECTED]> wrote:

Email is a tricky thing.  I've learned -- the hard way -- that it's best
to put a smiley on jokes, because otherwise people can't always tell
that they're jokes.


I did use a smiley.

Maybe I should put the smiley smiling then, instead of a sad-looking smiley.

Gr.
Steven


Re: Use of INSN_CODE

2007-02-01 Thread Steven Bosscher

On 2/1/07, Pranav Bhandarkar <[EMAIL PROTECTED]> wrote:

However, the internals only warn against using INSN_CODE on use,
clobber, asm_input, addr_vec, and addr_diff_vec. There is no mention of
the other members of RTX_EXTRA. Or shouldn't recog_memoized have an
INSN_P check in it?
Am I missing something here?


recog* should ICE if what it gets passed is not an insn (i.e. !INSN_P).

Gr.
Steven


Re: "error: unable to generate reloads for...", any hints?

2007-02-08 Thread Steven Bosscher

On 2/8/07, 吴曦 <[EMAIL PROTECTED]> wrote:

Thanks. But what does it mean by saying:
"Sometimes an insn can match more than one instruction pattern. Then
the pattern that appears first in the machine description is the one
used."


Basically it means, "Don't do that" ;-)

Make your insns match only one pattern.
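For illustration, a made-up machine-description fragment in which two patterns match the same insn; recog always picks the one that appears first, so the second can never be selected:

```
;; Both of these match (set reg (plus reg reg)).  The pattern that
;; appears first in the .md file wins; the second is effectively dead.
(define_insn "addsi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))]
  ""
  "add %0,%1,%2")

(define_insn "*addsi3_never_used"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))]
  ""
  "add2 %0,%1,%2")
```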

Gr.
Steven


Re: Division by zero

2007-02-10 Thread Steven Bosscher

On 2/10/07, Jie Zhang <[EMAIL PROTECTED]> wrote:

The code I posted in my first email is from libgloss/libnosys/_exit.c.
It's used to cause an exception deliberately. From your replies, it
seems it should find another way to do that.


Maybe you can use __builtin_trap() ?

Gr.
Steven


Re: Some thoughts and questions about the data flow infrastructure

2007-02-12 Thread Steven Bosscher

On 2/12/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote:


  I have liked the df infrastructure code from day one for its clarity.
Unfortunately users don't see it and probably don't care about it.
From my point of view the df infrastructure has a design flaw: it
extracts a lot of information about the RTL and keeps it on the side.
That does not make the code fast.


It also does not make code slow.  And the data it extracts and keeps
on the side, could be used to simplify many algorithms in gcc (most
notably, cprop, mode switching, and regmove).  There is a tremendous
potential for speedups in RTL passes if they start using the df
register caches, instead of traversing the PATTERN of every insn.

Gr.
Steven


Re: Some thoughts and questions about the data flow infrastructure

2007-02-12 Thread Steven Bosscher

On 2/13/07, Vladimir N. Makarov <[EMAIL PROTECTED]> wrote:

> There are certainly performance issues here.  There are limits on
> how much I, and the others who have worked on this have been able to
> change before we do our merge.  So far, only those passes that were
> directly hacked into flow, such as dce, and auto-inc-dec detection
> have been rewritten from the ground up to fully utilize the new
> framework.  However, it had gotten to the point where the two
> frameworks really should not coexist.  Both implementations expect
> to work in an environment where the information is maintained from
> pass to pass and doing with two systems was not workable.  So the
> plan accepted by the steering committee accommodates the wholesale
> replacement of the dataflow analysis but even after the merge, there
> will still be many passes that will be changed.

Does it mean that the compiler will be even slower?


No, it will mean the compiler will be faster.  Sooner if you help. You
seem to believe that the DF infrastructure is fundamentally slower
than flow is.  I believe that there are other reasons for the current
differences in compile time.

AFAICT the current compile time slowdowns on the dataflow branch are due to:

* bitmaps bitmaps bitmaps. We badly need a faster bitmap implementation.

* duplicate work on insn scanning:
  1. DF scans all insns and makes accurate information available
  2. Many (most) passes see it and think, "Hey, I can do that myself!",
     and they rescan all insns for no good reason.
  The new passes, which use the new infrastructure, are among the fastest
  in the RTL path right now.  The slow passes are the ones doing their own
  thing (CSE, GCSE, regmove, etc.).

* duplicate work between passes (minor):
  - on the trunk, regmove can make auto increment insns
  - on the df branch, the auto-inc-dec pass makes those
transformations redundant

* earlier availability of liveness information:
  - On the trunk we compute liveness for the first time just before combine
  - On the dataflow branch, we have liveness already after the first CSE pass
Updating it between CSE and combine over ~20 passes is probably costly
compared to doing nothing on the trunk.  (I believe having cfglayout mode
early in the compiler will help reduce this cost thanks to no iterations in
cleanup_cfg)

Maybe I overestimate the cost of some of these items, and maybe I'm
missing a few items. But the message is the same: There is still
considerable potential for speeding up GCC using the new dataflow
infrastructure.

Gr.
Steven


Re: Some thoughts and questions about the data flow infrastructure

2007-02-13 Thread Steven Bosscher

On 2/13/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote:

  Wow, I got so many emails. I'll try to answer them in one email.
Let us look at the major RTL optimizations: combiner, scheduler, RA.


...PRE, CPROP, SEE, RTL loop optimizers, if-conversion, ...  It is easy
to make your arguments look valid if you take it as a proposition that
only register allocation and scheduling ought to be done on RTL.

The reality is that GIMPLE is too high level (by design) to catch many
useful transformations performed on RTL. Think CSE of lowered
addresses, expanded builtins, code sequences generated for bitfield
operations and expensive instructions (e.g. mul, div).  So we are
going to have more RTL optimizers than just regalloc and sched.

Many RTL optimizations still matter very much (disable some of them
and test SPEC again, if you're unconvinced).  Having a uniform
dataflow framework for those optimizations is IMHO a good thing.



Do
we need a global analysis for building def-use use-def chains?   We
don't need it for combiner (only in bb scope)


It seems to me that this limitation is only there because when combine
was written, the idea of "global dataflow information" was in the
"future work" section for most practical compilers.  So, perhaps
combine, as it is now, does not need DU/UD chains. But maybe we can
improve passes like this if we re-implement them in, or migrate them
to a better dataflow framework.

Gr.
Steven


Re: Some thoughts and questions about the data flow infrastructure

2007-02-13 Thread Steven Bosscher

On 2/13/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote:

I am just trying to convince you that the proposed df infrastructure is
not ready and might create serious problems for this release and future
development because it is slow.  Danny is saying that the beauty of the
infrastructure is just in improving it in one place.  I partially agree
with this.  I am only afraid that a solution for a faster infrastructure
(e.g. another slimmer data representation) might change the interface
considerably.  I am not sure that I can convince you of this.  But I am
more worried about the 4.3 release and I really believe that inclusion of
the data flow infrastructure should be the 1st step of stage 1 to give
people more time to solve at least some problems.



I recall this wonderful quote of just a few days ago, which perfectly
expresses my feelings about the proposed merge of the dataflow branch
for GCC 4.3:

"I would hope that the
community would accept the major structural improvement, even if it is
not a 100% complete transition, and that we can then work on any
remaining conversions in the fullness of time."
   -- Mark Mitchell, 11 Feb 2007  [1]

:-D

Gr.
Steven





[1] http://gcc.gnu.org/ml/gcc-patches/2007-02/msg01012.html


Re: Some thoughts and questions about the data flow infrastructure

2007-02-13 Thread Steven Bosscher

On 2/13/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote:

>   Why is it unacceptable for it to mature further on mainline like
>Tree-SSA?
>
>
Two slow releases one after another are to be avoided.  Not one real
experiment has been done to rewrite an RTL optimization to figure out how
def-use chains will work.


Vlad, this FUD-spreading is beginning to annoy me. Please get your
view of the  facts in order.

There *are* passes rewritten in the new framework to figure out how
this will work. In fact, some of those passes existed even before the
rest of the backend was converted to the new dataflow scheme. Existing
on trunk even now: fwprop, see, web, loop-iv. New on the branch: at
least auto-inc-dec.

Gr.
Steven


Re: Some thoughts and questions about the data flow infrastructure

2007-02-13 Thread Steven Bosscher

On 2/12/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote:

Getting 0.5% and 11.5% slowness
(308sec vs 275sec for compiling SPECINT2000) does not seem reasonable


Just to be sure: Did you build with --disable-checking for both
compilers?  I often find myself comparing compilers with checking
enabled, so, you know, just checking... ;-)
Thanks,

Gr.
Steven


Call for help: when can compare_and_jump_seq produce sequences with control flow insns?

2007-02-16 Thread Steven Bosscher
Hello,

As some folks perhaps have noticed, my effort to make gcc use cfglayout
mode between expand and, roughly, sched1, have stagnated a bit.  I am
completely stuck on a problem that I basically can't trigger.  In other
words, I *know* I should expect problems if I make a certain change, but
I haven't been able to actually trigger that problem.

Let me explain that...  Consider the following code, from loop-unroll.c:

basic_block
split_edge_and_insert (edge e, rtx insns)
{
  basic_block bb;

  if (!insns)
return NULL;
  bb = split_edge (e);
  emit_insn_after (insns, BB_END (bb));
  bb->flags |= BB_SUPERBLOCK;
  return bb;
}

We call this function to insert insns sequences produced by either
compare_and_jump_seq or expand_simple_binop.  compare_and_jump_seq can
produce insns sequences with control flow insns in it (i.e. jumps) (I am
not sure about expand_simple_binop, but I think it never needs control
flow insns).

We have to split a block with multiple control flow insns into multiple
blocks at some point.  We could split it in place, but loop-unroll
decides to defer it until going out of cfglayout mode, where we now have
to call break_superblocks to split the basic blocks with BB_SUPERBLOCK
set on them.

Here comes the problem: break_superblocks() doesn't work in cfglayout
mode.  There is no serialized insns stream, so you can't know what the
fallthrough edge should be.  I could fix this, but it is both unclean
and hard.  The alternative is to go out of cfglayout mode, fixup the
CFG, and go back into cfglayout mode.  Wasteful, IMHO, so I'd like to
avoid that solution, too.

I don't want to go out of cfglayout mode (I want to stay in it,
that's the whole point ;-) and since break_superblocks() doesn't work
for me, I have re-introduced find_sub_basic_blocks (which was removed
by Kazu long ago), made it work in cfglayout mode, and use it in
split_edge_and_insert().  That way, I update the CFG in place, and
simply avoid break_superblocks.

This was not hard to do.  I think.

But now that I've implemented it, I need to test the new code somehow.
And I can't find a test case.  I've tried to craft some test case based
on how I understand loop-unroll should work, but I did not succeed.  So
I moved on to brute force methods.

I have tested a small patch on i686, x86_64, ia64, mips, and sh:
--
Index: loop-unroll.c
===
--- loop-unroll.c   (revision 122011)
+++ loop-unroll.c   (working copy)
@@ -879,7 +879,6 @@ split_edge_and_insert (edge e, rtx insns
 return NULL;
   bb = split_edge (e);
   emit_insn_after (insns, BB_END (bb));
-  bb->flags |= BB_SUPERBLOCK;
   return bb;
 }

--

My thoughts here were, that if I can make the compiler crash with this
patch, my new find_sub_basic_blocks should fix that crash.  So I'm
trying to make gcc crash with the above patch.

I know for sure that the compiler would crash if it finds a basic block
with more than one control flow insn, but without the BB_SUPERBLOCK flag.
verify_flow_info has good checks for this.

But the patch doesn't trigger failures on the targets I tested.  I can't
get gcc to ICE with this patch, hence I can't find a test case for my
patch.

So I'm looking for help here: Who can help me find a test case to trigger
a verify_flow_info ICE in GCC with the above patch applied?  Can people
try this patch on their favorite target, and see if they can trigger a
test suite failure?

Hope you can help,

Thanks,

Gr.
Steven




Re: GCC 4.2.0 Status Report (2007-02-19)

2007-02-20 Thread Steven Bosscher

On 2/20/07, Vladimir Makarov <[EMAIL PROTECTED]> wrote:

>[Option 1] Instead of 4.2, we should backport some functionality from
>4.2 to the 4.1 branch, and call that 4.2.
>
>[Option 2] Instead of 4.2, we should skip 4.2, stabilize 4.3, and call
>that 4.2.
>
>[Option 3] Like (2), but create (the new) 4.2 branch before merging the
>dataflow code.
>
>
>
>
...

>Considering the options above:
>
>* I think [Option 3] is unfair to Kenny, Seongbae, and others who have
>worked on dataflow code.  The SC set criteria for that merge and a
>timeline to do the merge, and I believe that the dataflow code has met,
>or has nearly met, those criteria.
>
>
In terms of ports, yes, I agree.  As for the performance, even with
Paolo's latest patches (some changes could be applied to the mainline too,
so it is not only about df), the branch compiler is still 8.7% slower for
SPECint2000 compilation on a 2.66GHz Core2 with --enable-checking=release.


I mostly agree with Vlad. IMHO the dataflow branch is in a state where
merging it early in a stage1 of a release cycle makes sense, but for
gcc 4.3 it is getting a bit late.

A lot depends on the current state of the trunk, of course. Do we also
have some quality indicators (bug numbers, compile time performance,
SPEC numbers, etc.) to compare it with the current gcc 4.2 and gcc 4.1
branches?

I don't think it would be very useful to stabilize the trunk if that
can't be done in a matter of, say, two months. If it takes longer than
that, releasing gcc 4.2 as-is would be my choice. Yes, there is a SPEC
performance gap, but SPEC is not the one-benchmark-to-rule-them-all,
and there are things in the current gcc 4.2 release branch (such as
OpenMP, and a hugely improved GFortran) that I would like to see
released.

Not releasing GCC 4.2 is IMHO not a really good option. If we do that,
GCC 4.3 will contain so much new code that the number of not yet
uncovered bugs that our users may run into, may be larger than we can
handle.

Gr.
Steven


Re: Question about source-to-source compilation

2007-02-21 Thread Steven Bosscher

On 2/21/07, Thomas Bernard <[EMAIL PROTECTED]> wrote:

Hello all,

As far as I know, GCC 4.x is easily retargetable for a new architecture.
I would be interested by source-to-source compilation with the GCC
framework. For instance, let's say the input language is C and the
output language is C annotated with pragmas which are the results of
some code analysis (done at middle-end level). I do not think that the
GCC back-end could support a programming language such as C, C++ or Java.
Is GCC 4.x designed to source-to-source compilation ? Is that possible
or do I miss something here ?


It is not always possible.  GCC is certainly not designed for it. You
will have problems mostly with types and decls, which are hard to
reproduce from the intermediate representation once it has been
lowered to GIMPLE.

Gr.
Steven


Re: Inconsistent next_bb info when EXIT is a successor

2007-03-02 Thread Steven Bosscher

On 3/2/07, Andrey Belevantsev <[EMAIL PROTECTED]> wrote:

I have tried to reorganize the check so that the "e->src->next_bb ==
e->dest" condition is checked for all edges (see the patch below).  Of
course, GCC does not bootstrap with this patch, triggering an assert of
incorrect fallthru block in cfg_layout_finalize, after RTL loop
optimizations.  In my case, combine has broken that condition.


No. The condition you're checking is simply not true in cfglayout
mode. The whole point of cfglayout mode is to get rid of the
requirement that basic blocks are serial. That means a fallthru edge
in cfglayout mode doesn't have to go to next_bb. It can go to *any*
bb.

Gr.
Steven


Re: Inconsistent next_bb info when EXIT is a successor

2007-03-03 Thread Steven Bosscher

On 3/2/07, Andrey Belevantsev <[EMAIL PROTECTED]> wrote:

Steven Bosscher wrote:
> No. The condition you're checking is simply not true in cfglayout
> mode. The whole point of cfglayout mode is to get rid of the
> requirement that basic blocks are serial. That means a fallthru edge
> in cfglayout mode doesn't have to go to next_bb. It can go to *any*
> bb.
Yes, but I'm not in cfglayout mode, because I'm either in sched1 or
sched2.  In that case, should this condition be preserved or not?


The condition should always be preserved when you are not in cfglayout
mode, but...

You wrote:

> > During my work on the selective scheduler I have triggered an assert in
> > our code saying that a fall-through edge should have e->src->next_bb ==
> > e->dest. This was for a bb with EXIT_BLOCK as its fall-through
> > successor, but its next_bb pointing to another block.


I don't understand this.  You're saying there is a fallthrough edge
from your e->src to  EXIT_BLOCK. This case is explicitly allowed by
the checking code. It is an exception from the rule: For a fallthrough
edge to EXIT, e->src->next_bb != e->dest is OK.

It is hard to tell without more context what your problem is. That
assert, is it an assert in your own code?  Maybe it is too strict?

Gr.
Steven





Re: CFG question

2007-03-04 Thread Steven Bosscher

On 3/4/07, Sunzir Deepur <[EMAIL PROTECTED]> wrote:

 hello ppl,

 when I use -fdump-rtl-all with -dv I get CFG files.

 where can I learn the syntax of that CFG files ?
 it seems some kind of LISP language...


As the fine manual says:

   `-dv'
 For each of the other indicated dump files (either with `-d'
 or `-fdump-rtl-PASS'), dump a representation of the control
 flow graph suitable for viewing with VCG to `FILE.PASS.vcg'.

So my guess is that the syntax is VCG's.

Gr.
Steven


Re: CFG question

2007-03-04 Thread Steven Bosscher

On 3/4/07, Sunzir Deepur <[EMAIL PROTECTED]> wrote:

Forgive me, I made a mistake in the question - I meant the debug dump
files that we get just by using -fdump-rtl-all, not the vcg files.
How can I understand their syntax?


http://gcc.gnu.org/onlinedocs/gccint/RTL.html#RTL

Gr.
Steven


Re: Improvements of the haifa scheduler

2007-03-04 Thread Steven Bosscher

On 3/4/07, Andrew Pinski <[EMAIL PROTECTED]> wrote:

On 3/4/07, Vladimir N. Makarov <[EMAIL PROTECTED]> wrote:
> Another important thing to do is to make the 1st scheduler register
> pressure sensitive.

I don't know how many times this has to be said, no this is not the
correct approach to fix that issue.  The correct fix is to enable the
register allocator to work correctly and fix up the IR.


Andrew, your truth isn't necessarily _the_ truth in this matter ;-)

Gr.
Steven


Re: BUG: wrong function call

2007-03-06 Thread Steven Bosscher

On 3/6/07, W. Ivanov <[EMAIL PROTECTED]> wrote:

Paulo J. Matos wrote:
> On 3/6/07, W. Ivanov <[EMAIL PROTECTED]> wrote:
>> Hi, I use multiple inheritance in my project. In the child class I have
>> the functions GetParam() and SetParam().
>> In the cpp-file I call the GetParam() function, but I end up in the
>> SetParam() function.
>> Can You help me?
>>
>
> Don't take me wrong but it is most likely a bug in your code. Still,
> you might want to inform the developers (not me) through this mailing
> list which code you're compiling and which version of gcc you're
> using.
>
> Cheers,
>
Please give me the mail addresses of the developers.


You're already reaching pretty much all of them through this mailing list.

Gr.
Steven


Looking for specific pages from Muchnick's book

2007-03-08 Thread Steven Bosscher

Hi,

I found this old patch
(http://gcc.gnu.org/ml/gcc-patches/2003-06/msg01669.html) that refers
to pages 202-214 of Muchnick's "Advanced Compiler Design and
Implementation" book. That book still is not in my own compiler books
collection because of its price. I used to have access to a copy in a
university library, but that copy has been removed from the collection
and, apparently, it's been disposed of :-(

Could someone scan those pages and send them to me, please?

Thanks,

Gr.
Steven


Re: Looking for specific pages from Muchnick's book

2007-03-08 Thread Steven Bosscher

On 3/8/07, Steven Bosscher <[EMAIL PROTECTED]> wrote:

Could someone scan those pages and send them to me, please?


I received some private mails from people that are concerned about
copyright issues and all that.

I should have said I've actually ordered the book from Amazon (the
price used to be a problem, back when I was a student), but the
shipping to Europe takes at least 9 days, and in my experience usually
more than a month. In order to move ahead with a plan I'm pursuing, I
just want to read those pages asap, not many weeks from now ;-)

Gr.
Steven


Re: Looking for specific pages from Muchnick's book

2007-03-08 Thread Steven Bosscher

On 3/8/07, Robert Dewar <[EMAIL PROTECTED]> wrote:

Dave Korn wrote:

>   A few pages for personal study?  That's fair use by any meaningful
> definition, no matter how much the RIAA/MPAA/similar-copyright-nazis would
> like to redefine the meanings of perfectly clear words and phrases in the
> english language.

It is of course way off topic, but just so no one is confused, "fair
use" does not mean "use that anyone would consider fair", it refers
specifically to the fair use section of the copyright act, which lays
out very specific criteria. So the question of "perfectly clear words
and phrases" is not the issue. I suggest anyone interested actually
read the statute!


In the mean time, I've received those pages.  I'll make sure to
ritually burn them when I finally receive the book.

To stray a bit further off topic, I'd actually much more enjoy the
book if I would not have to print out almost as many pages as the book
has to get all the errata. IMVHO very few books have a poorer
quality/price ratio than Muchnick, which is why I have never found it
worth it to buy it before. There should be a law that says you can
freely copy books with too many errata ;-)

Gr.
Steven


Re: Looking for specific pages from Muchnick's book

2007-03-09 Thread Steven Bosscher

On 3/9/07, Vladimir N. Makarov <[EMAIL PROTECTED]> wrote:

  o Muchnick's book is a fat one.  It is rather an encyclopedia
of optimizations and can be considered a collection of articles with
many details (sometimes too many).  But some themes (like RA and
scheduling) are not described in depth.


Muchnick is also famous for its >150 A4 pages of errata, especially
the 1st and 2nd print.  I really wouldn't recommend it to you unless
you're looking for a compiler algorithms cook book.



  o Robert Morgan.  Building an Optimizing compiler.


This is my favorite book.  If you've read the Dragon book and this
one, you're well under way to being a compiler expert. I agree with
Vlad about the contents of the book, but it is the only fairly
comprehensive introduction text I know of that deals with LCM and SSA
at a level that even I can understand ;-)


  o Appel.  Modern Compiler implementation in C/Java/ML.  Another good
book to start to study compilers from parser to code generation and
basic optimizations.  I especially like the version in ML (Modern
compiler implementation in ML).


The version in ML is the best of the three.  The other two look too
much like "had to do this"-books where algorithms are translated from
ML, which makes them look very unnatural in C/Java.


  o Aho/Lam/Sethi/Ulman.  Compilers: Principles, Techniques, and
Tools. 2nd edition.  Personally I don't like it because it is based on
an outdated (although classical) book.  I attached a review of this book
which I wrote more than a year ago (when the book was not ready).


This one is old, but it is a classic. The 1st edition should be on
every compiler engineer's book shelf, just because.  I have never seen
the 2nd edition myself.

Grune et al. "Modern Compiler Design" is another good introduction
text, especially if you're interested in various parsing techniques.

Gr.
Steven


Re: Problem with reg_equiv_alt_mem

2007-03-12 Thread Steven Bosscher

On 3/12/07, Unruh, Erwin <[EMAIL PROTECTED]> wrote:

In a private port I had the problem that reg_equiv_alt_mem_list contained
the same RTL as reg_equiv_memory_loc.  This caused an assert in
delete_output_reload, where these are compared with rtx_equal_p.
The list is built with push_reg_equiv_alt_mem, but only when "tem != orig".
The value tem is built with find_reloads_address.  Within that function we
have some code which simply unshares the RTL.  The return value will be
the new RTL, even in some cases where the RTL did not change.

I think the test in front of push_reg_equiv_alt_mem should be done via
rtx_equal_p. This would match the assert in delete_output_reload.

My private port is based on GCC 4.1.0, but the code looks the same in
4.3.

I do not have papers on file so someone else should prepare a patch.


For sufficiently small patches (usually less than 10 lines changed is
used as the norm) you don't need to have a copyright assignment on
file.  Such small changes are apparently not covered by copyright.

So if you could send a patch, that'd be quite helpful ;-)

Gr.
Steven


Re: S/390 Bootstrap failure: ICE in cse_find_path, at cse.c:5930

2007-03-12 Thread Steven Bosscher

On 3/12/07, Andreas Krebbel <[EMAIL PROTECTED]> wrote:

Hi,

gcc currently doesn't boostrap on s390 and s390x:


See http://gcc.gnu.org/ml/gcc-bugs/2007-03/msg00930.html

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-12 Thread Steven Bosscher

On 3/12/07, David Edelsohn <[EMAIL PROTECTED]> wrote:

I thought that the Tuples conversion was suppose to address this
in the long term.


The tuples conversion is only going to make things worse in the short term.

Doug, isn't there a lang_tree bit you can make available, and use it
to make the tree code field 9 bits wide?  I know this is also not
quite optimal, but adding 24 bits like this is an invitation to
everyone to start using those bits, and before you know it we're stuck
with a larger-than-necessary tree structure... :-(  (plus, it's not 32
bits but 64 bits extra on 64 bits hosts...)

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-12 Thread Steven Bosscher

On 3/12/07, Paolo Carlini <[EMAIL PROTECTED]> wrote:

we are unavoidably
adding tree codes and we must solve the issue, one way or another.


Another real solution would perhaps be to not use 'tree' for front end
specific data structures in C++, and instead just define g++ specific
data structures to represent all the language details ;-)

G++ needs 64 (!) language specific tree codes, almost 7 times more
than any other front end, and in total more than twice as many as all
other front ends (java, ada, and objc)  together.

IMHO, now all languages are going to suffer from a larger 'tree' and a
slower compiler, because g++ basically abuses a shared data structure.

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-12 Thread Steven Bosscher

On 3/12/07, Andrew Pinski <[EMAIL PROTECTED]> wrote:

Can I recommend something just crazy, rewrite the C and C++ front-ends
so they don't use the tree structure at all except when lowering to
GIMPLE, like the rest of the GCC front ends?


The C front end already emits generic, so there's almost no win in
rewriting it (one lame tree code in c-common.def -- not worth the
effort ;-).

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-12 Thread Steven Bosscher

On 3/12/07, Paolo Carlini <[EMAIL PROTECTED]> wrote:

In my opinion, "visions" for a better
future do not help here.


No, I fully agree.  I mean, imagine we'd have a long term plan for
GCC. That would be so out of line!

;-)

I'm not arguing against a practical solution. But to me at least it is
just *so* frustrating to know that this issue was known literally
years ago, and yet nothing has happened to avoid the situation before
it could occur.

Now we're looking at another set of hacks to "fix" the issue.  And you
know just as well as I do, that we're going to be stuck with those
hacks forever, because nobody will have any motivation to fix the real
problem for once.

But oh well.  SEP.

Gr.
Steven


Re: No ifcvt during ce1 pass (fails i386/ssefp-2.c)

2007-03-15 Thread Steven Bosscher

On 3/15/07, Uros Bizjak <[EMAIL PROTECTED]> wrote:

compile this with -O2 -msse2 -mfpmath=sse, and this testcase should
compile to maxsd.


I'll look into it this weekend.

Gr.
Steven


Re: No ifcvt during ce1 pass (fails i386/ssefp-2.c)

2007-03-15 Thread Steven Bosscher

On 3/15/07, Uros Bizjak <[EMAIL PROTECTED]> wrote:

BTW: Your patch also causes

FAIL: gcc.dg/torture/pr25183.c  -O0  (internal compiler error)
FAIL: gcc.dg/torture/pr25183.c  -O0  (test for excess errors)


Yes. Known. I bootstrapped a fix and had a box test it yesterday.
I'll look at the test results tonight and commit the fix if there are
no new failures (and this one is fixed).

This failure is caused by problems with dead jump tables. There's
another bug (with a PR filed for it) that is also related to dead jump
tables. The fix I have should fix both these cases.

Gr.
Steven


Re: No ifcvt during ce1 pass (fails i386/ssefp-2.c)

2007-03-15 Thread Steven Bosscher

On 3/15/07, Uros Bizjak <[EMAIL PROTECTED]> wrote:

The testcase is:

double x;
q()
{
  x=x<5?5:x;
}

compile this with -O2 -msse2 -mfpmath=sse, and this testcase should
compile to maxsd.


This happens because a "fallthrough edge" is meaningless in cfglayout
mode, but ifcvt.c still gives special meaning to the fallthrough edge.
This should not matter, but it does for some reason, and I'm
investigating this right now. I'll try to come up with a fix asap.

Gr.
Steven


Re: RFC: obsolete __builtin_apply?

2007-03-16 Thread Steven Bosscher

On 3/16/07, Andrew Pinski <[EMAIL PROTECTED]> wrote:

On 3/16/07, Steve Ellcey <[EMAIL PROTECTED]> wrote:
> My thinking is that if libobjc was changed then we could put in a
> deprecated message on these builtins for 4.3 and maybe remove them for
> 4.4.

libobjc has not changed yet.  There was a patch a while back to change
libobjc to use libffi but I need to go back to it and review it (as it
was before I became a libobjc maintainer).


Do you mean this patch:
http://gcc.gnu.org/ml/gcc-patches/2004-12/msg00841.html
?

Gr.
Steven


Re: Building mainline and 4.2 on Debian/amd64

2007-03-18 Thread Steven Bosscher

On 3/18/07, Florian Weimer <[EMAIL PROTECTED]> wrote:

I don't need the 32-bit libraries, so disabling their compilation
would be fine. --enable-targets at configure time might do the trick,
but I don't know what arguments are accepted.


Would --disable-multilib work?

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-19 Thread Steven Bosscher

On 3/19/07, Doug Gregor <[EMAIL PROTECTED]> wrote:

I went ahead and implemented this, to see what the real impact would
be. The following patch frees up TREE_LANG_FLAG_5, and uses that extra
bit for the tree code.

On tramp3d, memory usage remains the same (obviously), and the
performance results are not as bad as I had imagined:

8-bit tree code, --enable-checking:

real    1m56.776s
user    1m54.995s
sys     0m0.541s

9-bit tree code, --enable-checking:

real    2m16.095s
user    2m12.132s
sys     0m0.562s

8-bit tree code, --disable-checking:

real    0m55.693s
user    0m43.734s
sys     0m0.414s

9-bit tree code, --disable-checking:

real    0m58.821s
user    0m46.122s
sys     0m0.443s

So, about 16% slower with --enable-checking, 5% slower with --disable-checking.


Just because I'm curious and you have a built tree ready...  Does the
patch that Alex sent to gcc-patches the other day help reduce this 5%
penalty?

See the patch here:
http://gcc.gnu.org/ml/gcc-patches/2007-03/msg01234.html

There are other bitfield optimization related bugs (Roger Sayle should
know more about those) that we can give a higher priority if we decide
to go with the 9 bit tree code field.  IMHO this is still the better
solution than the subcodes idea.

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-19 Thread Steven Bosscher

On 3/19/07, Doug Gregor <[EMAIL PROTECTED]> wrote:

GCC has also been getting improved functionality, better
optimizations, and better language support. Some of these improvements
are going to cost us at compile time, because better optimizations can
require more time, and today's languages require more work to compile
and optimize than yesterday's. No, I don't want my compiler to be 5%
slower, but I'll give up 5% for better standards conformance and
improved code generation.


Of course, the problem is not 5%, but the yet again 5%, on top of, I
don't know, 200% since GCC 2.95.3??

Also, it is "better optimizations" for some purposes, but not for
others. For example, many of the >140 passes are redundant for typical
C code.



It's not all bad news, either. Canonical types got us 3-5% speedup in
the C++ front end (more on template-heavy code), so I figure I have at
least a 3% speedup credit I can apply against the 9-bit code patch.
That brings this patch under 2% net slow-down, so we should just put
it in now :)


But only for C++.

I'm still in favor of the 9-bit code patch.  But I think the slowdown
should not be taken so lightly as you appear to do ;-)

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-19 Thread Steven Bosscher

On 3/20/07, Mark Mitchell <[EMAIL PROTECTED]> wrote:

As for the current problem, I think a 3% hit is pretty big.  I didn't
find subcodes that ugly, so I guess I'm inclined to go with subcodes,
and avoid the hit.


We know that one still mostly unaddressed problem that tree-ssa left
us with is poorer code for bitfield operations. That means the 3% can
probably be reduced further.

Another thing I like about the 9-bit tree code approach is that we
keep the size of the 'tree' data structure the same, so there is no
effect on memory.

I think that 3% is unfortunate but worth it, because the impact on the
structure of the compiler is negligible, while subcodes require
significant rewrites of some parts of gcc.

Let's be fair here: A 3% hit is small compared to the cumulative
slowdown we already have in GCC 4.3 since the start of stage 1, and
negligible compared to the total slowdown we've accumulated over the
years. I know this is not really an argument, but let's face it: Much
larger patches and branch merges have unintentionally increased
compile time by more than 3%, and we didn't have a large discussion
about it. Those were the power plants, and Doug's patch is the
(you've guessed it!) bikeshed! ;-)


Back to the technical arguments...

Subcodes require a bigger 'tree' data structure so there will be a
memory usage hit, I don't think there's disagreement about that. We
don't know if subcodes will have no compiler speed hit. At least, I
don't recall seeing any numbers yet. But if 'tree' is bigger, the
chances are that we'll see poorer cache behavior, and therefore a
slower compiler. So the subcodes approach may end up no better than
the 9-bit tree code approach wrt. compiler speed. (Of course, for a
good technical decision, you'd have to try both approaches and do a
fair comparison.)

I also think subcodes are bug prone, because you have more cases to
handle and people are unfamiliar with this new structure. The impact
of subcodes on the existing code bases is just too large for my taste.



 I think it's fair for front ends to pay for their
largesse.  There are also relatively cheap changes in the C++ front end
to salvage a few codes, and postpone the day of reckoning.


I think that day of reckoning will come very soon again, with more
C++0x work, more autovect work, OpenMP 3.0, and the tuples and LTO
projects, etc., all requiring more tree codes.

And if there comes a point somewhen, where we can go back to a smaller
tree code field, it is much easier to do so with the 9-bit tree code
approach, than with subcodes.

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-20 Thread Steven Bosscher

On 3/20/07, Doug Gregor <[EMAIL PROTECTED]> wrote:

> So the memory hit shouldn't be
> as big as e.g. going to 16 bit tree codes if that means increasing the size
> of most of the trees the compiler uses.

Yes, this is true.


But this could be solved if all LANG_TREE_x bits could move to
language specific trees, could it?

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-22 Thread Steven Bosscher

On 3/22/07, Joe Buck <[EMAIL PROTECTED]> wrote:

But these numbers show that subcodes don't cost *ANY* time, or the
cost is in the noise, unless enable-checking is on.  The difference
in real-time seems to be an artifact, since the user and sys times
are basically the same.


The subcodes cost complexity. And the cost with checking enabled is
IMHO unacceptable.

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-22 Thread Steven Bosscher

On 3/22/07, Doug Gregor <[EMAIL PROTECTED]> wrote:

The results, compile time:


For what test case?


For a bootstrapped, --disable-checking compiler:

8-bit tree code (baseline):

real    0m51.987s
user    0m41.283s
sys     0m0.420s

subcodes (this patch):

real    0m53.168s
user    0m41.297s
sys     0m0.432s

9-bit tree code (alternative):

real    0m56.409s
user    0m43.942s
sys     0m0.429s


Did the 9-bit tree code include Alexandre Oliva's latest bitfield
optimization improvements patch
(http://gcc.gnu.org/ml/gcc-patches/2007-03/msg01397.html)?

What about the 16-bit tree code?

Gr.
Steven


Re: We're out of tree codes; now what?

2007-03-22 Thread Steven Bosscher

On 3/22/07, Mike Stump <[EMAIL PROTECTED]> wrote:

is more obvious than the correctness of the subcoding. Thoughts?


I fully agree.

Gr.
Steven

