Re: expand_omp_parallel typo?

2006-10-18 Thread Diego Novillo

Marcin Dalecki wrote on 10/18/06 00:27:


   bsi_insert_after (&si, t, TSI_SAME_STMT);

Shouldn't this be

   bsi_insert_after (&si, t, BSI_SAME_STMT);

instead?

Yes.  We lucked out because both symbols have the same numeric value. 
Patch pre-approved as obvious.



PS: Please do not use existing threads to start an unrelated one.


Re: stability of libgomp and libssp

2006-10-19 Thread Diego Novillo

Eric Christopher wrote on 10/19/06 17:33:


I was wondering if anyone planned on changing abi or if we can depend
on all changes not breaking the abi of these libraries?

There is nothing planned in that area, but I wouldn't want to guarantee 
ABI stability, mostly because of bug fixing.  Since this will be the 
first official release, I expect several bugs whose fixes may introduce 
ABI problems.

Unless you can use some kind of versioning, I don't see a good way to 
address this.


Re: Question about LTO dwarf reader vs. artificial variables and formal arguments

2006-10-21 Thread Diego Novillo

Ian Lance Taylor wrote on 10/21/06 14:59:


That is, we are not going to write out DWARF.  We can't, because DWARF
is not designed to represent all the details which the compiler needs
to represent.  What we are going to write out is a superset of DWARF.
And in fact, if it helps, I think that we shouldn't hesitate to write
out something which is similar to but incompatible with DWARF.

In general reading and writing trees is far from the hardest part of
the LTO effort.  I think it is a mistake for us to get too tied up in
the details of how to represent things in DWARF.  (I also think that
we could probably do better by defining our own bytecode language, one
optimized for our purposes, but it's not an issue worth fighting
over.)

Agreed.  I don't think we'll get far if we focus too much on DWARF, as 
it clearly cannot be used as a bytecode language for our purposes.


We will need to evolve our own bytecode language, either as an extension 
to DWARF (much like we did with SIMPLE) or do something from scratch. 
Implementing type support starting from DWARF is a start, but we should 
not constrain ourselves to it.


Re: fdump-tree explanation

2006-10-26 Thread Diego Novillo

Dino Puller wrote on 10/26/06 10:11:


How many times does gcc simplify expressions like x/x, 0*x, 1*y, a+0,
x*x/x, and so on?

You are probably looking at folding then.  An initial idea might be to 
put some code in fold-const.c:fold that compares the input tree 
expression with the output; if they are different, increment your counter.


Re: memory benchmark of tuples branch

2006-10-27 Thread Diego Novillo

Aldy Hernandez wrote on 10/26/06 10:40:


As we have hoped, every single function exhibits memory savings.  Yay.


Nice!


I don't know if this merits merging into mainline, or if it's preferable to
keep plodding along and convert the rest of the tuples.  What do you guys
think?  Either way, I have my work cut out for me, though I believe the
hardest part is over (FLW).

My vote is to merge into mainline sooner rather than later.  However, it 
is a big patch and affects just about every module in the compiler, so I 
wouldn't want to barge in without getting some consensus first.


As for the rest of the conversion, I don't think there are many changes 
left that are as invasive as this one.  Mostly global search and 
replace, but it's been a few weeks since I looked at the code.


I'm hoping we can extend the rest of the changes into late Stage1 and 
Stage2, but we can see about that as we go.


Re: memory benchmark of tuples branch

2006-10-27 Thread Diego Novillo

Aldy Hernandez wrote on 10/27/06 09:35:


How does this sound to y'all?

Sounds good to me.  I would add an additional memory savings check 
between 4 and 5.


Re: fdump-tree explanation

2006-10-27 Thread Diego Novillo

Dino Puller wrote on 10/27/06 11:25:


The idea is a bit complex.  Anyway, the fold function has more than one
return, so I can't compare the input tree with the output one, and it is
called from many other functions in other files.

Of course you can.  fold() does not modify the input tree anymore.  Just 
write a wrapper for it which calls the real fold().  Rename fold() to 
something else and that should be all you need to do.


Now, if you want to do this in a per-pass manner, then it would probably 
require a bit more code.  In that case, you'd have to write entry/exit 
code in passes.c:execute_one_pass.



Even if I find the right places to put my code, how can I output the
collected information?

Collect it in a buffer and dump it at the end of compilation.  Check in 
final.c:rest_of_handle_final.


Re: memory benchmark of tuples branch

2006-10-27 Thread Diego Novillo

Mark Mitchell wrote on 10/27/06 12:25:

Aldy Hernandez wrote:

Does the tuples branch include the CALL_EXPR reworking from the LTO branch?

No.


Though, that is a similar global-touch-everything project, so hopefully 
whatever consensus develops from tuples will carry over.


I feel the same about LTO.  We seem to have lots of destabilizing stuff 
in various branches.  It may be better to move chunks of long-lived 
branches as we go along.  Particularly things that we feel won't change 
much over the lifetime of the branch.


So, if the CALL_EXPR rework in LTO is "done", we should think about 
moving it in.  But other folks may want to play it more conservatively, 
so I would rather have consensus here.


Re: compiling very large functions.

2006-11-06 Thread Diego Novillo

Kenneth Zadeck wrote on 11/04/06 15:17:


1) defining the set of optimizations that need to be skipped. 2)
defining the set of functions that trigger the special processing.


This seems too simplistic.  Number of variables/blocks/statements is a
factor, but they may interact in ways that are difficult or impossible
to compute until after the optimization has started (it may depend on
how many blocks have this or that property, in/out degree, number of
variables referenced in statements, grouping of something or other, etc.).

So, in my view, each pass should be responsible for throttling itself.
The pass gate functions already give us the mechanism for on/off.  I
agree that we need more graceful throttles.  And then we have components
of the pipeline that cannot really be turned on/off (like alias
analysis) but could throttle themselves based on size (working on that).


The compilation manager could then look at the options, in particular
 the -O level and perhaps some new options to indicate that this is a
 small machine or in the other extreme "optimize all functions come
hell or high water!!" and skip those passes which will cause
performance problems.


All this information is already available to the gate functions.  There
isn't a lot here that the pass manager needs to do.  We already know
compilations options, target machine features, and overall optimization
level.

What we do need is for each pass to learn to throttle itself and/or turn
itself off.  Turning the pass off statically and quickly could be done
in the gating function.  A quick analysis of the CFG made by the pass
itself may be enough to decide.

We could provide a standard group of heuristics with standard metrics
that lazy passes could use.  Say a 'cfg_too_big_p' or 'cfg_too_jumpy_p'
that passes could call and decide not to run, or set internal flags that 
would partially disable parts of the pass (much like DCE can work with 
or without control-dependence information).


Re: compiling very large functions.

2006-11-06 Thread Diego Novillo

Kenneth Zadeck wrote on 11/06/06 12:54:


I am not saying that my original proposal was the best of all possible
worlds, but hacking things on a pass-by-pass or PR-by-PR basis is not
really solving the problem.

I don't think it's a hackish approach.  We have policy setting at the 
high level (-O[123]), and local implementation of that policy via the 
gating functions.


Providing common predicates that every pass can use to decide whether to 
switch itself off is fine (size estimators, high connectivity, etc), but 
ultimately the analysis required to determine whether a function is too 
expensive for a pass may not be the same from one pass to the other.


OTOH, just using the gating function is not enough.  Sometimes you want 
the pass to work in a partially disabled mode (like the CD-DCE example I 
mentioned earlier).


In terms of machinery, I don't think we are missing a lot.  All the 
information is already there.  What we are missing is the implementation 
of more throttling/disabling mechanisms.


Re: compiling very large functions.

2006-11-06 Thread Diego Novillo

Brooks Moses wrote on 11/06/06 17:41:

Is there a need for any fine-grained control on this knob, though, or 
would it be sufficient to add an -O4 option that's equivalent to -O3 but 
with no optimization throttling?


We need to distinguish two orthogonal issues here: effort and enabled 
transformations.  Currently, -O3 means enabling transformations that (a) 
may not result in an optimization improvement, and (b) may change the 
semantics of the program.  -O3 will also enable "maximum effort" out of 
every transformation.


In terms of effort, we currently have individual knobs in the form of -f 
and/or --params settings.  It should not be hard to introduce a global 
-Oeffort=xxx parameter, but it will take some tweaking to coordinate 
which -f/--params/-m switches it should enable.


Re: compiling very large functions.

2006-11-07 Thread Diego Novillo

Jan Hubicka wrote on 11/07/06 05:07:

-O3 enables inlining, unswitching and GCSE after reload.  How do those 
change the semantics of the program?



Bah, I was convinced we were switching on -ffast-math at -O3.  Never mind.


Re: Control Flow Graph

2006-11-15 Thread Diego Novillo

[EMAIL PROTECTED] wrote on 11/15/06 06:06:

Hi all,
I must use the CFG library to build and manipulate a control flow graph.
I have read a lot but have not found an answer to my question: is it
possible to build a CFG structure directly from a .cfg file?  How can I
build a CFG from a file?
Thanks to all,

Ask for a dump using the -blocks switch and post-process the dump file 
with the attached script.


$ gcc -fdump-tree-all-blocks file.c
$ dump2dot file.c.XXXt.yyy

It generates a graphviz file with the flow graph of the function.  The 
script is fairly simplistic and will not handle more than one function 
too gracefully, but that should be easy to change.
#!/bin/sh
#
# (C) 2005 Free Software Foundation
# Contributed by Diego Novillo <[EMAIL PROTECTED]>.
#
# This script is Free Software, and it can be copied, distributed and
# modified as defined in the GNU General Public License.  A copy of
# its license can be downloaded from http://www.gnu.org/copyleft/gpl.html

if [ "$1" = "" ] ; then
echo "usage: $0 file"
echo
echo "Generates a GraphViz .dot graph file from 'file'."
echo "It assumes that 'file' has been generated with -fdump-tree-...-blocks"
echo
exit 1
fi

file=$1
out=$file.dot
echo "digraph cfg {"> $out
echo "  node [shape=box]"   >>$out
echo '  size="11,8.5"'  >>$out
echo>>$out
(grep -E '# BLOCK|# PRED:|# SUCC:' $file |  \
sed -e 's:\[\([0-9\.%]*\)*\]::g;s:([a-z_,]*)::g' |  \
awk '{  #print $0;  \
if ($2 == "BLOCK")  \
{   \
bb = $3;\
print "\t", bb, "[label=\"", bb, "\", style=filled, color=gray]"; \
}   \
else if ($2 == "PRED:") \
{   \
for (i = 3; i <= NF; i++)   \
print "\t", $i, "->", bb, ";";  \
}   \
}') >> $out
echo "}">> $out


Re: Control Flow Graph

2006-11-15 Thread Diego Novillo

albino aiello wrote on 11/15/06 10:14:

Thanks, but I want to use the .cfg file to directly construct a
tree_cfg in C using the TREE SSA libraries of GCC.


There is no such thing as a tree ssa library.  If you are adding a pass 
to GCC, then you already have the CFG at your disposal.  In fact, you 
are pretty much forced to work over the CFG.  If you want to use this 
functionality outside of GCC, I'm afraid you cannot do that (without a 
lot of work).


Re: how to load a cfg from a file by tree-ssa

2006-11-24 Thread Diego Novillo

Rob Quill wrote on 11/23/06 12:41:

I haven't looked into this yet, but as I think I may need to be able
to do something similar, is it possible to parse the cfg file that is
given out, and build a C structure like that?

Parsing a CFG dump is trivial.  See the script I posted in 
http://gcc.gnu.org/ml/gcc/2006-11/msg00576.html.  You can then convert 
it to whichever C data structure you want.


Re: machine-dependent Passes on GIMPLE/SSA Tree's?

2006-11-27 Thread Diego Novillo

Markus Franke wrote on 11/27/06 12:50:


Are there also some other optimisation passes working on the GIMPLE/SSA
representation which make use of any machine-dependent features?

Yes.  Passes like vectorization and the loop optimizations use 
so-called 'target hooks', which allow the high-level passes to query the 
target for various capabilities and attributes.  See the tree-vect*.c 
files for several examples.


Re: writing a new pass: association with an option string

2006-12-04 Thread Diego Novillo

Andrea Callia D'Iddio wrote on 12/04/06 03:48:

Dear all,
I wrote a new pass for gcc.  Actually the pass is always executed, but
I'd like to execute it only if I specify an option from the shell (e.g.,
gcc --mypass pippo.c).  How can I do that?

Create a new flag in common.opt and read its value in the gate function 
of your pass.  I *believe* this is documented somewhere in the internals 
manual, but I'm not sure.


You can check how other passes do it.  See, for instance, 
flag_tree_vectorize in common.opt and in the vectorizer's gating predicate.
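For illustration, an entry of the kind described might look like this in common.opt (the option name and variable are invented):

```
; Hypothetical option for the new pass
fmypass
Common Report Var(flag_mypass)
Enable the mypass transformation
```

The pass's gate function would then simply test flag_mypass.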


Re: [PATCH]: Require MPFR 2.2.1

2006-12-04 Thread Diego Novillo

Richard Guenther wrote on 12/04/06 11:23:

On 12/3/06, Kaveh R. GHAZI <[EMAIL PROTECTED]> wrote:


I'd like to give everyone enough time to update their personal
installations and regression testers before installing this.  Does
one week sound okay?  If there are no objections, that's what I'd
like to do.


Please don't.  It'll be a hassle for us again and will cause 
automatic testers to again miss some days or weeks during stage1 
(given christmas holiday season is near). Rather defer to the start 
of stage3 please.



Agreed, please don't.  The whole MPFR thing is already fairly annoying.
I have just updated all my machines with a special RPM I got from Jakub. 
 I don't want to go through that again so soon.


Re: Announce: MPFR 2.2.1 is released

2006-12-05 Thread Diego Novillo

Kaveh R. GHAZI wrote on 12/04/06 21:32:


That idea got nixed, but I think it's time to revisit it.  Paolo has
worked out the kinks in the configury and we should apply his patch and
import the gmp/mpfr sources, IMHO.

Yes, I vote to include gmp/mpfr in the tree.  If gmp/mpfr is still a 
fluid target, we could add svn glue code to avoid commits to the 
sub-tree and rely exclusively on wholesale import.


Re: How to save a va_list object into a buffer and restore it from there?

2006-12-06 Thread Diego Novillo

Hoehenleitner, Thomas wrote on 12/06/06 07:08:


after unsuccessful search in the doc, the web and this mailing list I
decided to launch this question here:
 
Off-topic in this forum.  Please use [EMAIL PROTECTED] or comp.lang.c.
This list is for GCC *development*.


Re: Gimple Type System

2006-12-06 Thread Diego Novillo

Richard Warburton wrote on 12/06/06 07:59:


I would be most grateful of an answer to these questions, since I find
the implementation of the gimple type system to be a little puzzling.

That's because there is *no* GIMPLE type system.  GIMPLE latches on to 
the type system of the input language, via the so called 'language 
hooks' (see lang_hooks.types_compatible_p).  This is a limitation that 
has not been addressed yet.


Re: Gimple Type System

2006-12-06 Thread Diego Novillo

Richard Warburton wrote on 12/06/06 09:44:

Thanks for this information.  I presume from your response that there
is a plan to address these issues; is this something that will be
happening in the 'near term', by which I mean within the next 6-9
months?

Well, we will need something like this for the LTO project 
(http://gcc.gnu.org/wiki/LinkTimeOptimization).  I am not sure whether 
there is anyone actively working on it ATM, though.  As far as 
timelines, that is fairly hard to predict.  It may be a few months or a 
couple of years.


The only change in this area that I'm aware of is the streaming of type 
information using DWARF.  But that is not the same as having a proper 
GIMPLE type system.


Re: Help with traversing block statements in pragma handler

2006-12-15 Thread Diego Novillo

Ferad Zyulkyarov wrote on 12/15/06 05:02:


FOR_EACH_BB_FN (bb, this_cfun)
  for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
    {
      tree stmt = bsi_stmt (bsi);
      debug_tree (stmt);
      /* Do something */
    }
} /* End of void handle_pragma_test */

This is way too early in the compilation.  At this point we are not even 
in GIMPLE mode, so there will not be a flowgraph.


I recommend that you follow what happens when GCC handles structurally 
similar pragmas.  In your case, try following through #pragma omp 
parallel.  Its behaviour is very similar to what you show in your 
#pragma test.


Re: Help with traversing block statements in pragma handler

2006-12-15 Thread Diego Novillo

Ferad Zyulkyarov wrote on 12/15/06 08:46:


Also, what is the difference between c_register_pragma and
cpp_register_deferred_pragma?  Unfortunately, I couldn't find
descriptive information about these two functions.

You need to look in ../libcpp/directives.c.  Deferred pragmas are 
registered to avoid calling pragma handling while we are pre-processing.


Re: SSA_NAMES: should there be an unused, un-free limbo?

2006-12-21 Thread Diego Novillo

Robert Kennedy wrote on 12/21/06 11:37:

The situation is that some SSA_NAMEs are disused (removed from the 
code) without being released onto the free list by 
release_ssa_name().


Yes, it happens if a name is put into the set of names to be updated by 
update_ssa.


After update_ssa, it should be true that every SSA name with no 
SSA_NAME_DEF_STMT is in the free list.


However, if we have SSA names with no defining statement that are still 
considered active, I would hardly consider it a serious bug.  It's a 
waste of memory, which you are more than welcome to fix, but it should 
not cause correctness issues.



Please discuss.


Test case?


Re: SSA_NAMES: should there be an unused, un-free limbo?

2006-12-21 Thread Diego Novillo

Daniel Berlin wrote on 12/21/06 12:21:


for (i = 0; i < num_ssa_names; i++)
  {
    tree name = ssa_name (i);
    if (name && !SSA_NAME_IN_FREELIST (name))
      DFS (name);
  }

I see that you are not checking for IS_EMPTY_STMT.  Does DFS need to 
access things like bb_for_stmt?


In any case, that is not important.  I agree that every SSA name in the 
SSA table needs to have a DEF_STMT that is either (a) an empty 
statement, or, (b) a valid statement still present in the IL.


Note that this is orthogonal to the problem of whether we free up unused 
names from this list.  Every time a statement S disappears, we should 
make sure that the names defined by S get their SSA_NAME_DEF_STMT set to 
NOP.


Frankly, I'm a bit surprised that we are running into this. I'd like to 
see a test case, if you have one.


Re: SSA_NAMES: should there be an unused, un-free limbo?

2006-12-21 Thread Diego Novillo

Robert Kennedy wrote on 12/21/06 13:58:

Right now we can have SSA_NAMEs in the
list which are no longer used, and we have no way to tell whether they
are used or not.  Thus the only way to see all valid SSA_NAMEs is to
walk the code.


To wit: are there iteration macros somewhere that will help me walk
the code while abstracting away all the ugly details like stmt/bb
boundaries, etc.?

No.  The code is segmented into basic blocks.  To walk the code, you 
must walk each basic block.


Something I forgot to add in my previous message.  Notice that it is not 
altogether rare to find cases where we have more SSA names than 
statements.  Are you walking the SSA names because you assume it's 
always shorter than walking the statements?


Re: SSA_NAMES: should there be an unused, un-free limbo?

2006-12-21 Thread Diego Novillo

Jeffrey Law wrote on 12/21/06 12:48:

True.  But remember, the stated purpose of the SSA_NAME recycling
code was _not_ to track every SSA_NAME that went "dead" and recycle
it, but instead to get the majority of them (and to ultimately save
memory by recycling them).  Orphan SSA_NAMEs were always expected.

But this is orthogonal to the recycling issue.  They are traversing the 
SSA name table and finding SSA names that have invalid DEF_STMT entries. 
I believe that we should support this kind of usage of the SSA table.




Alternately we can revisit the entire recycling question as well --
things have changed significantly since that code was written and
I've speculated that the utility of the recycling code has
diminished, possibly to the point of being a useless waste of time
and code.

That'd be interesting to try, yes.  Though we *do* want to invalidate 
SSA_NAME_DEF_STMT for the SSA names whose defining statement gets deleted.


Re: SSA_NAMES: should there be an unused, un-free limbo?

2006-12-21 Thread Diego Novillo

Ian Lance Taylor wrote on 12/21/06 13:08:


If that is acceptable, then there is no issue here.  If that is not
acceptable, then we need to fix the code to correctly mark SSA_NAMEs
which are no longer used.  Whether we recycle the memory in the unused
SSA_NAMEs is a separate (and less interesting) discussion.


Agreed.

We have various passes that walk through the SSA table, so I want to 
keep supporting that.


We do have cases where an SSA name may get its defining statement zapped 
and yet we need to keep it around.  The renamer uses names_to_release in 
those cases, and makes sure not to visit the defining statement.


If every statement removal were to set SSA_NAME_DEF_STMT to NOP for 
every name generated by the removed statement, then the renamer would 
probably not need to do that.  However, the renamer *needs* the SSA name 
itself not to be recycled (for name->name mappings).


Re: SSA_NAMES: should there be an unused, un-free limbo?

2006-12-21 Thread Diego Novillo

Robert Kennedy wrote on 12/21/06 15:01:
Something I forgot to add in my previous message.  Notice that it is not 
altogether rare to find cases where we have more SSA names than 
statements.  Are you walking the SSA names because you assume it's 
always shorter than walking the statements?


No. I'm walking the SSA names because logically that's what the
algorithm is interested in. At that level, the algorithm doesn't care
about statements.

OK.  Good enough.  To fix this bug, I also suggest what Jeff and Ian 
have been discussing:


1- A verifier in verify_ssa

2- Fix bugs found in #1 by making sure that every time we remove a 
statement, the SSA_NAME_DEF_STMT of all the affected names is changed to 
point to an empty statement.


Re: SSA_NAMES: should there be an unused, un-free limbo?

2006-12-22 Thread Diego Novillo

Jeffrey Law wrote on 12/22/06 01:09:

On Thu, 2006-12-21 at 14:05 -0500, Diego Novillo wrote:
In any case, that is not important.  I agree that every SSA name in the 
SSA table needs to have a DEF_STMT that is either (a) an empty 
statement, or, (b) a valid statement still present in the IL.

Just to be 100% clear.  This is not true at the current time; see the
discussion about the sharing of a single field for TREE_CHAIN and
SSA_NAME_DEF_STMT.  If you want to make that statement true, then
you need to fix both the orphan problem and the sharing of a field
for SSA_NAME_DEF_STMT and TREE_CHAIN.


I think we are agreeing violently.


[mem-ssa] Updated documentation

2006-12-22 Thread Diego Novillo


I've updated the document describing Memory SSA.  The section on mixing 
static and dynamic partitioning is still being implemented, so it's a 
bit sparse on details and things will probably shift somewhat before I'm 
done.


http://gcc.gnu.org/wiki/mem-ssa

Feedback welcome.  Thanks.


Re: changing "configure" to default to "gcc -g -O2 -fwrapv ..."

2007-01-02 Thread Diego Novillo

Mark Mitchell wrote on 01/01/07 14:46:


What a thread this has turned out to be.


Indeed.

In general, I'm not too thrilled with the idea of disabling
transformations for the sake of non-conforming code.  However, I would
not mind a -fconforming flag similar to -fstrict-aliasing.



I haven't yet seen that anyone has actually tried the obvious: run SPEC
with and without -fwrapv.  Would someone please do that?  Or, pick your
favorite high-performance application and do the same.  But, let's get
some concrete data as to how much this optimization helps.


On x86_64:

SPEC2000int is almost identical.
SPEC2000fp shows a -6% drop when using -fwrapv.


-
  HARDWARE
  
CPU: Intel(R) Core(TM)2 CPU  6400  @ 2.13GHz
CPU MHz: 2128.001
FPU: Integrated
 CPU(s) enabled: 2
Secondary Cache: 2048 KB
 Memory: 2053792 kB


  SOFTWARE
  
   Operating System: Linux 2.6.18-1.2868.fc6
   Compiler: GNU C version 4.3.0 20070101 (experimental) (x86_64-unknown-linux-gnu)

   NOTES
   -
Base is -O2 -march=nocona -mtune=generic -fwrapv
Peak is -O2 -march=nocona -mtune=generic


   SPEC CINT2000 Summary

Estimated Estimated
  Base  Base  Base  Peak  Peak  Peak
  BenchmarksRef Time  Run Time   RatioRef Time  Run Time   Ratio
              
  164.gzip      1400   127  1099*  1400   127  1106*
  175.vpr       1400   114  1227*  1400   116  1209*
  176.gcc       X X
  181.mcf       1800   195   921*  1800   194   927*
  186.crafty    1000  48.7  2054*  1000  48.2  2076*
  197.parser    1800   197   915*  1800   195   923*
  252.eon       1300  64.7  2011*  1300  64.8  2005*
  253.perlbmk   X X
  254.gap       1100  64.9  1696*  1100  65.3  1685*
  255.vortex    X X
  256.bzip2     1500   108  1384*  1500   108  1395*
  300.twolf     3000   169  1771*  3000   169  1775*
  Est. SPECint_base2000 1391
  Est. SPECint20001394



   SPEC CFP2000 Summary

Estimated Estimated
  Base  Base  Base  Peak  Peak  Peak
  BenchmarksRef Time  Run Time   RatioRef Time  Run Time   Ratio
              
  168.wupwise   1600   103  1547*  1600  88.3  1812*
  171.swim      3100   152  2033*  3100   145  2131*
  172.mgrid     1800   193   935*  1800   131  1376*
  173.applu     2100   195  1076*  2100   188  1116*
  177.mesa      1400  71.1  1968*  1400  71.3  1964*
  178.galgel    2900   107  2699*  2900  99.1  2927*
  179.art       2600  96.7  2689*  2600  94.6  2749*
  183.equake    1300  69.0  1884*  1300  67.1  1939*
  187.facerec   1900   149  1273*  1900   146  1302*
  188.ammp      2200   170  1292*  2200   168  1312*
  189.lucas     2000   102  1965*  2000  98.7  2025*
  191.fma3d     2100   195  1079*  2100   192  1092*
  200.sixtrack  1100   199   553*  1100   198   556*
  301.apsi      2600   208  1248*  2600   196  1329*
  Est. SPECfp_base2000  1463
  Est. SPECfp2000 1560

-


Re: Do we want non-bootstrapping "make" back?

2007-01-05 Thread Diego Novillo

Daniel Jacobowitz wrote on 12/30/06 02:08:

Once upon a time, the --disable-bootstrap configure option wasn't
necessary.  "make" built gcc, and "make bootstrap" bootstrapped it.

Is this behavior useful?  Should we have it back again?


That'd be great.  I miss the old behaviour.


[RFC] Our release cycles are getting longer

2007-01-23 Thread Diego Novillo


So, I was doing some archeology on past releases and we seem to be 
getting into longer release cycles.  With 4.2 we have already crossed 
the 1 year barrier.


For 4.3 we have already added quite a bit of infrastructure that is all 
good on paper but still needs some amount of TLC.


There was some discussion on IRC that I would like to move to the 
mailing list so that we get a wider discussion.  There's been thoughts 
about skipping 4.2 completely, or going to an extended Stage 3, etc.


Thoughts?


release-cycle.pdf
Description: Adobe PDF document


Re: [mem-ssa] Updated documentation

2007-01-24 Thread Diego Novillo

Ira Rosen wrote on 01/02/07 03:44:


In the example of dynamic partitioning below (Figure 6), I don't understand
why MEM7 is not killed in line 13 and is killed in line 20 later.  As far as
I understand, in line 13 'c' is in the alias set, and its currdef is MEM7,
so it must be killed by the store in line 14.  What am I missing?

You are absolutely correct.  MEM7 should indeed be killed in line 13 
(serves me right for manually changing the code).


Thanks for pointing it out.  I will correct the document.


Re: Which optimization levels affect gimple?

2007-01-24 Thread Diego Novillo

Paulo J. Matos wrote on 01/24/07 12:44:


I checked what kind of GIMPLE code I get with -fdump-tree-gimple, and
-O0 and -O3 have different results,


-fdump-tree-gimple is the first dump *before* any optimizations occur.
To see the effect of all the GIMPLE optimizations you should use
-fdump-tree-optimized.


however, -O3 and -O9 have exactly the same output.  Will -Ox for x > 3
generate the same GIMPLE trees?  (i.e., is everything beyond that done
in the GCC backend?)


-On for n >= 3 is identical to -O3.  This may change in the future.


Re: [RFC] Our release cycles are getting longer

2007-01-25 Thread Diego Novillo

Mark Mitchell wrote on 01/25/07 00:09:


First, I haven't had as much time to put in as RM lately as in past, so
I haven't been nagging people as much.

Sure, but this is a trend that started with 3.1 and it's gotten 
progressively worse.  Granted, we are now dealing with a much bigger 
project and perhaps the amount of engineering cycles has not kept up:


Release   Year  Size (KLOC)
   1.21   1988           58
   1.38   1990           87
    2.0   1992          229
  2.8.1   1998          416
   EGCS   1998          603
   2.95   1999          715
    3.0   2001        1,007
    3.1   2002        1,336
    4.0   2005        1,813
    4.1   2006        2,109
    4.2   2007        2,379


some people want/suggest more frequent releases.  But, I've also had a
number of people tell me that the 4.2 release cycle was too quick in its
early stages, and that we didn't allow enough time to get features in --
even though doing so would likely have left us even more bugs to fix.

That's also true.  The duration of our stage1 cycles has gone down quite 
a bit since 3.3.  The data I have for the 3.x releases is a bit 
incomplete and we had a strange 3.2 release which I didn't include 
because we suddenly jumped from branching 3.1 to releasing 3.2 (that was 
the C++ ABI thing, IIRC).  Anyway, here's the data I got from our 
release schedule.  These are the durations of each stage since 3.1


Release   Stage 1  Stage 2  Stage 3  Release
3.1 2002        0       65       69      212
3.3 2003      169        1       61      271
3.4 2004      262      103       93      289
4.0 2005      172       64      170      288
4.1 2006       59       74      133      309
4.2 2007       61       59      216      393

There is some correlation between the length of Stage1 to Stage3.  It's 
as if longer Stage1s lead to shorter Stage3s.  Perhaps we could consider 
lengthening the early stages, which by all accounts are the more "fun", 
and shorten the pain during stage 3.


Long-lived branches are painful to maintain.  If we allow them more time 
to get in mainline, it may help spread the stabilization work during 
stage1 (a lot more exposure).


Another thing we could try again is going into mini-freeze cycles 
spanning 2-3 weeks.  We've done that in the past when mainline was in a 
pathetic state and I think it was helpful.




Some folks have suggested that we ought to try to line up FSF releases
to help the Linux distributors.

I don't think that's in our best interest.  We can't really help what 
distros do.  The fact is, however, that when distros pick up a specific 
release, that release tends to be pretty solid (e.g. 4.1).



I don't think that some of the ideas (like saying that you have to fix N
bugs for every patch you contribute) are very practical.  What we're
seeing is telling us something about "the market" for GCC; there's more
pressure for features, optimization, and ports than bug fixes.  If there
were enough people unhappy about bugs, there would be more people
contributing bug fixes.

Agreed.  We are now in a featuritis phase.  We still have many marketing 
bullet points that folks want filled in.  I believe this will continue 
for at least a couple more releases.  We are also being pulled from many 
directions at once, our user base is very diverse.


Making the infrastructure more palatable for external folks to get 
involved in development and attract more engineering cycles is probably 
one of our best long term bets.


Re: gcc compile time support for assumptions

2007-01-25 Thread Diego Novillo

Ian Lance Taylor wrote on 01/18/07 10:51:


Well, internally, we do have ASSERT_EXPR.  It would probably take a
little work to permit the frontends to generate it, but the optimizers
should understand it.

By default, they do not.  When I initially implemented VRP, I was adding 
ASSERT_EXPRs right after gimplification.  The rationale was to have the 
ASSERT_EXPRs rewritten into SSA form by the initial SSA pass.  This was 
convenient, but it destroyed the quality of the generated code.  Suddenly, 
few or no copies or constants were being propagated, jump threading 
wasn't working, PRE wasn't doing its job, etc.


The problem was that all these ASSERT_EXPRs were not being grokked by 
the optimizers, every optimizer would see the assertion, think the worst 
and block transformations.


It also meant quite a bit of bulk added to the IL which increased 
compilation times.  So, if we decide to add ASSERT_EXPRs early in the 
pipeline, we have to mind these issues.


In the end, I went for adding assertions inside VRP and fixing up the 
SSA form incrementally.  Perhaps we can do something similar for other 
passes that may want to deal with assertions.
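For illustration, the assertions VRP inserts look roughly like this in
GIMPLE dump notation (a sketch; the SSA version numbers are made up):

```
  x_1 = foo ();
  if (x_1 > 10) goto <bb 3>; else goto <bb 4>;

<bb 3>:
  # On this path x_1 is known to lie in [11, +INF], so VRP inserts:
  x_2 = ASSERT_EXPR <x_1, x_1 > 10>;
  # ... and uses of x_1 dominated by the assertion are rewritten to x_2.
```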


Now, if these are assertions inserted by the user, that's another 
problem.  The IL wouldn't bulk up so much, but we would still need to 
handle them everywhere.  Assertions shouldn't block scalar cleanups.


Re: Which optimization levels affect gimple?

2007-01-26 Thread Diego Novillo

Paulo J. Matos wrote on 01/26/07 06:52:


Is the output of -fdump-tree-optimized a subset of GIMPLE?


Yes.  The output is an incomplete textual representation of the GIMPLE
form of the program.



Re: Which optimization levels affect gimple?

2007-01-26 Thread Diego Novillo

Richard Guenther wrote on 01/26/07 07:28:


It's after doing TER, so the statements are no longer valid GIMPLE statements.

Silly me.  Richard's right.  You want the output of -fdump-tree-uncprop. 
That's the last GIMPLE dump (if my memory doesn't fail me again).


Re: Can C and C++ object files be linked into an executable?

2007-01-27 Thread Diego Novillo

Ray Hurst wrote on 01/27/07 16:48:

I think this was the answer I was looking for.
By the way, was this the correct place to post it?


No.  That was a language question.  gcc-help or comp.std.c++ would have 
been a better forum.  This deals with the development of GCC, not its use.


Re: Which optimization levels affect gimple?

2007-01-28 Thread Diego Novillo

Paulo J. Matos wrote on 01/28/07 18:03:

On 1/24/07, Diego Novillo <[EMAIL PROTECTED]> wrote:

Paulo J. Matos wrote on 01/24/07 12:44:


check what kind of gimple code you get with -fdump-tree-gimple and
-O0 and -O3 have different results,


-fdump-tree-gimple is the first dump *before* any optimizations occur.
To see the effect of all the GIMPLE optimizations you should use
-fdump-tree-optimized.



So the dump-tree-optimized will also return GIMPLE? a subset of... GIMPLE?

No, that's back to GENERIC.  During out-of-ssa we recombine GIMPLE 
expressions into GENERIC because of limitations in the way RTL expansion 
works.  In the future, we may go from GIMPLE SSA directly into RTL, so 
this may change.
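For example (a sketch; the temporaries are illustrative), TER substitutes
single-use temporaries back into their use sites, turning a sequence of
one-operation GIMPLE statements into a combined GENERIC-style expression:

```
  /* Valid GIMPLE: one operation per statement.  */
  t1 = a + b;
  t2 = t1 * c;

  /* After TER, as seen in -fdump-tree-optimized: the temporary has been
     forward-substituted, which is no longer valid GIMPLE.  */
  t2 = (a + b) * c;
```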


Re: Which optimization levels affect gimple?

2007-01-28 Thread Diego Novillo

Paulo J. Matos wrote on 01/28/07 18:08:

On 1/26/07, Diego Novillo <[EMAIL PROTECTED]> wrote:

Richard Guenther wrote on 01/26/07 07:28:


It's after doing TER, so the statements are no longer valid GIMPLE statements.


Silly me.  Richard's right.  You want the output of -fdump-tree-uncprop.
  That's the last GIMPLE dump (if my memory doesn't fail me again).



Ok, so, I guess that being GIMPLE a language, there's somewhere a
specification of possible constructs, right? Where can I find it? I
tried to search in the inside gcc manual but there seems to be nothing
like a formal specification of the nodes in the GIMPLE AST.

The GIMPLE grammar is documented in the internals manual.  In the tree 
ssa sections.  Search for "GIMPLE grammar" or something along those lines.



Moreover, is there anything referencing which optimizations are
performed between fdump-tree-gimple and fdump-tree-uncprop (being
those, afaik, the first and last gimple dumps)?

-fdump-tree-all gives you all the dumps by the high-level optimizers. 
-fdump-all-all gives you all the dumps by both GIMPLE and RTL optimizers.



If reading source code could help me understand GIMPLE, could please
someone tell me where to start looking at it. I've svn'ed gcc and it's
huge so I'm quite lost. Where is the output of fdump-tree-gimple done?


Start with passes.c:init_optimization_passes.  That's the pass sequencer.


Re: 2007 GCC Developers Summit

2007-01-29 Thread Diego Novillo

Ben Elliston wrote on 01/28/07 17:45:


One idea I've always pondered is to have brief (perhaps 1-2 hr)
tutorials, given by people in their area of expertise, as a means for
other developers to come up to speed on a topic that interests them.  Is
this something that appeals to others?

Sounds good to me.  For instance, the new java front end, a description 
of the new build system, etc.


Re: Which optimization levels affect gimple?

2007-01-29 Thread Diego Novillo

Paulo J. Matos wrote on 01/29/07 06:35:

On 1/29/07, Diego Novillo <[EMAIL PROTECTED]> wrote:

-fdump-tree-all gives you all the dumps by the high-level optimizers.
-fdump-all-all gives you all the dumps by both GIMPLE and RTL optimizers.



Is this -fdump-all-all version specific? Doesn't work on 4.1.1:
$ g++ -fdump-all-all allocation.cpp
cc1plus: error: unrecognized command line option "-fdump-all-all"

No, I goofed.  I must've dreamed the -all-all switch.  You have to use 
-fdump-tree-<pass> for GIMPLE dumps and -fdump-rtl-<pass> for RTL dumps. 
It's also possible that -fdump-rtl-<pass> doesn't work on the 4.1 series (I 
don't recall when the -fdump-rtl- family was introduced, sorry).


Check the invocation sections in the GCC 4.1 manual.  Grep for fdump-.
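A hypothetical session (the dump numbers and exact file names vary between
releases, so treat this only as a sketch of the naming scheme):

```
$ gcc -O2 -c -fdump-tree-all foo.c
$ ls foo.c.t*
foo.c.t02.gimple   foo.c.t21.ssa   ...   foo.c.t86.uncprop
foo.c.t88.optimized
```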


Re: After GIMPLE...

2007-01-31 Thread Diego Novillo

Paulo J. Matos wrote on 01/30/07 10:11:


Well, I spent the morning looking at the code and since what I need is
only the flow of gcc up until I have the GIMPLE tree, I could add a
pass after the pass which generates the gimple tree, in that pass I do
what I need with the gimple tree and then call exit(). Would this be a
good idea?

It would probably not be a good idea.  Passes are called for each 
function in the callgraph.  If you stop immediately after your pass, you 
will leave all the other functions unprocessed.


What is it that you want to do?  If you need dataflow information, you 
probably also need to have the GIMPLE code in SSA form.



If yes, then the idea would be to create a pass and add it in passes.c
after the line
NEXT_PASS (pass_lower_cf);

since from what I heard in #gcc, this is where the gimple tree is
created, right?

Well, it depends on what you need.  If your pass can work in high GIMPLE 
then you can insert it before that.  pass_lower_cf lowers control flow 
and lexical scopes, but not EH.


Perhaps if you describe a little bit what you are trying to do, we can 
give you a better idea.


Re: After GIMPLE...

2007-01-31 Thread Diego Novillo

Paulo J. Matos wrote on 01/31/07 11:26:


So, ideally, I would like just the gcc part until the first part of
the middleend where you have a 'no optimizations', language
independent AST of the source file.


OK, so you probably want to inject your pass right before pass_build_ssa
(in init_optimization_passes).  All the facilities to traverse the IL 
and flowgraph described in the Tree SSA section of the internals manual 
should apply.


Re: After GIMPLE...

2007-02-01 Thread Diego Novillo

Paulo J. Matos wrote on 02/01/07 04:37:


What can I do then to stop gcc to further process things? After
informing the user there's no more reason on my site to continue.

Stop gracefully or just stop?  The latter is easy.  The former involves 
writing code to skip all passes after a certain point, or just not 
scheduling the passes you don't want to run.  See init_optimization_passes.


Re: After GIMPLE...

2007-02-06 Thread Diego Novillo

Paulo J. Matos wrote on 02/06/07 14:19:


Why before pass_build_ssa? (version 4.1.1)

It depends on the properties your pass requires.  If you ask for 
PROP_cfg and PROP_gimple_any then you should schedule it after the CFG 
has been built, but if you need PROP_ssa, then you must be after 
pass_build_ssa which implies that your pass only gets enabled at -O1+.
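A sketch of what the registration could look like against the 4.1-era pass
manager; everything prefixed with my_ is hypothetical, and the field layout
follows struct tree_opt_pass as it existed then:

```c
/* Hypothetical pass; not actual GCC source.  */
static void
execute_my_pass (void)
{
  /* cfun and its CFG are available here because PROP_cfg is listed
     in properties_required below.  */
}

struct tree_opt_pass pass_my_pass =
{
  "mypass",                     /* name; also enables -fdump-tree-mypass */
  NULL,                         /* gate: NULL means always run */
  execute_my_pass,              /* execute */
  NULL, NULL,                   /* sub, next */
  0,                            /* static_pass_number */
  0,                            /* tv_id */
  PROP_cfg | PROP_gimple_any,   /* properties_required */
  0, 0, 0,                      /* provided, destroyed, todo_start */
  TODO_dump_func,               /* todo_flags_finish */
  0                             /* letter */
};
```

With PROP_ssa in properties_required instead, the pass manager would only
schedule it once the SSA form exists, i.e. at -O1 and above.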


Re: which opt. flags go where? - references

2007-02-07 Thread Diego Novillo

Kenneth Hoste wrote on 02/07/07 08:56:


[1] Almagor et al., Finding effective compilation sequences (LCES'04)
[2] Cooper et al., Optimizing for Reduced Code Space using Genetic  
Algorithms (LCTES'99)
[3] Almagor et al., Compilation Order Matters: Exploring the  
Structure of the Space of Compilation Sequences Using Randomized  
Search Algorithms  (Tech.Report)
[4] Acovea: Using Natural Selection to Investigate Software  
Complexities (http://www.coyotegulch.com/products/acovea/)


You should also contact Ben Elliston (CC'd) and Grigori Fursin (sorry, 
no email).


Ben worked on dynamic reordering of passes, his thesis will have more 
information about it.


Grigori is working on an API for iterative and adaptive optimization, 
implemented in GCC.  He presented at the HiPEAC 2007 GCC workshop; his 
presentation should be available at http://www.hipeac.net/node/746



Some other questions:

* I'm planning to do this work on an x86 platform (i.e. Pentium4),  
but richi told me that's probably not a good idea, because of the low  
number of registers available on x86. Comments?


When deriving ideal flag combinations for -Ox, we will probably want 
common sets for the more popular architectures, so I would definitely 
include x86.


* Since we have done quite some analysis on the SPEC2k benchmarks,  
we'll also be using them for this work. Other suggestions are highly  
appreciated.


We have a collection of tests from several user communities that we use 
as performance benchmarks (DLV, TRAMP3D, MICO).  There should be links 
to the testers somewhere in http://gcc.gnu.org/


* Since there has been some previous work on this, I wonder why none  
of it has made it into GCC development. Were the methods proposed  
unfeasible for some reason? What would be needed to make an approach  
to automatically find suitable flags for -Ox interesting enough to  
incorporate it into GCC? Any references to this previous work?


It's one of the things I would like to see implemented in GCC in the 
near future.  I've been chatting with Ben and Grigori about their work 
and it would be a great idea if we could discuss this at the next GCC 
Summit.  I'm hoping someone will propose a BoF about it.


Re: Performance regression on the 4.3 branch?

2007-02-14 Thread Diego Novillo

H. J. Lu wrote on 02/14/07 09:22:


Is this the same as

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30735


No, it isn't.


Re: Finalizer after pass?

2007-02-28 Thread Diego Novillo

Paulo J. Matos wrote on 02/28/07 11:07:


Is there a way to install a finalizing function? (to be called after
all functions in the pass have been processed)
Or to know if the current function being processed is the last one?
(maybe if I know the number of times my pass will be called!)

Perhaps it's easier to implement your feature as an IPA pass.  For IPA 
passes, you are not called with a specific function.  Instead, you get 
to traverse the callgraph yourself.  See passes like ipa-cp.c for details.
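The traversal itself is a plain walk over the cgraph (a sketch against the
4.x-era API; the loop body is purely illustrative):

```c
/* Hypothetical IPA pass body; not actual GCC source.  */
static void
execute_my_ipa_pass (void)
{
  struct cgraph_node *node;

  /* Nothing calls us once per function; we visit every analyzed
     function ourselves.  */
  for (node = cgraph_nodes; node; node = node->next)
    if (node->analyzed)
      fprintf (stderr, "visiting %s\n", cgraph_node_name (node));
}
```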


Re: Finalizer after pass?

2007-03-01 Thread Diego Novillo

Paulo J. Matos wrote on 03/01/07 10:41:


My IPA pass seems to be run only for -On, n>=1, is there a way to make
it ran even for -O0?


No, we only run IPA passes if flag_unit_at_a_time is set.  That only is 
set when optimizing.  At -O0, we simply emit functions individually.


Re: Accessing function code from CFG

2007-03-02 Thread Diego Novillo

Paulo J. Matos wrote on 03/02/07 10:12:


In an IPA pass, for each CFG node, I have a tree decl member from
which I can access the return type, name of the function, argument
names and its types, but I can't seem to find a way to get the
function code. I would guess it would be a basic block list but I
don't know where I can get it.


You need to get at the function structure from the cgraph node with 
DECL_STRUCT_FUNCTION (cgraph_node->decl).  Then you can use one of the 
CFG accessors like basic_block_info_for_function().
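Putting that together, the per-function walk from an IPA pass looks roughly
like this (a sketch; error checking omitted, and depending on the release
you may need to push/pop cfun around the traversal):

```c
struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
basic_block bb;
block_stmt_iterator bsi;

/* Walk every basic block of this function's CFG, then every
   statement within each block.  */
FOR_EACH_BB_FN (bb, fn)
  for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
    print_generic_stmt (stderr, bsi_stmt (bsi), 0);
```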


Re: Massive SPEC failures on trunk

2007-03-03 Thread Diego Novillo
Grigory Zagorodnev wrote on 03/03/07 02:27:

> There are three checkins, candidates for the root of regression:
>   http://gcc.gnu.org/viewcvs?view=rev&revision=122487
>   http://gcc.gnu.org/viewcvs?view=rev&revision=122484
>   http://gcc.gnu.org/viewcvs?view=rev&revision=122479
> 
SPEC2k works as usual[1] for me on x86_64 as of revision 122484.  The only new
compile failure I see is building 300.twolf with:

mt.c: In function 'MTEnd':
mt.c:46: warning: incompatible implicit declaration of built-in function 'free'
mt.c:46: error: too many arguments to function 'free'
specmake: *** [mt.o] Error 1

Ian, looks like your VRP patch may be involved.


[1] 176.gcc and 253.perlbmk usually miscompare for me.  Not sure why.


Re: Improvements of the haifa scheduler

2007-03-05 Thread Diego Novillo
Maxim Kuvyrkov wrote on 03/05/07 02:14:

>o Fix passes that invalidate tree-ssa alias export.

Yes, this should be good and shouldn't need a lot of work.

>o { Fast but unsafe Gupta's aliasing patch, Unsafe tree-ssa alias 
> export } in scheduler's data speculation.

"unsafe" alias export?  I would definitely like to see the tree->rtl
alias information transfer fixed once and for all.  Finishing RAS's
tree->rtl work would probably make a good SoC project.


Re: Signed overflow patches OK for 4.2?

2007-03-05 Thread Diego Novillo
Eric Botcazou wrote on 03/05/07 15:59:
>> Then it should also be disabled by default also in 4.1.3 and should
>> have been disabled in 4.1.2 which was only released last month so
>> there is no reason why it has to be disabled in 4.2.0 if everyone is
>> using 4.1 anyways.
> 
> VRP has become more aggressive in 4.2.x than in 4.1.x though.

Agreed.  I don't see the need to backport this functionality to 4.1.  It
has been out for quite some time now, used in various distros and we
have not been flooded with requests from users.

While this represents a new feature in 4.2, I don't think it's too
risky.  Whatever failures are triggered should be easy to identify and fix.

I personally don't like this feature very much as it may represent a
slippery slope into forcing us to warn in every optimization that
exploits undefined aspects of the standard.  But user pressure obviously
exists, so *shrug*.


Re: Signed overflow patches OK for 4.2?

2007-03-05 Thread Diego Novillo
Ian Lance Taylor wrote on 03/05/07 18:23:

> I gather you are saying here that it is OK with you to backport
> -fstrict-overflow/-Wstrict-overflow to 4.2.

Yes.


Re: Massive SPEC failures on trunk

2007-03-06 Thread Diego Novillo
Ian Lance Taylor wrote on 03/06/07 09:49:
> "Vladimir Sysoev" <[EMAIL PROTECTED]> writes:
> 
>> Bug has been already reported
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31037
> 
> I don't think this one could have anything to do with my VRP changes,
> but I'll try to take a look later today.
> 
Actually, this looks more related to my aliasing patch.  I'll be dealing
with this one soon.



Re: Accessing function code from CFG

2007-03-07 Thread Diego Novillo
Paulo J. Matos wrote on 03/07/07 10:36:

> Is this normal? It seems there are no basic blocks set for the
> functions. Probably my pass is being run before the bbs are created?

Looks like it.  Set a breakpoint in build_tree_cfg and your function.
If gdb stops in your function first, you found the problem.




Re: Accessing function code from CFG

2007-03-07 Thread Diego Novillo
Paulo J. Matos wrote on 03/07/07 11:43:

> What am I missing?

You are debugging the wrong binary.  I'd suggest you browse through
http://gcc.gnu.org/wiki/DebuggingGCC

You need to debug one of cc1/cc1plus/jc1



Re: Libiberty functions

2007-03-08 Thread Diego Novillo
Dave Korn wrote on 03/08/07 07:30:

>   (Also, bear in mind that if you want your new pass to work correctly with
> pre-compiled headers, you really ought to be using Gcc's garbage-collected
> memory management facilities.  See
> http://gcc.gnu.org/onlinedocs/gccint/Type-Information.html#Type-Information
> for the full gory details)

That's not right.  GCC does not use GC memory everywhere, passes can
also heap-allocate and/or use obstacks.  You do have to be careful when
mixing heap-allocated with GC memory, but for pass-local memory
allocation schemes, heap and obstacks are perfectly fine.

Paulo, you may also want to use the XNEW/XCNEW wrapper macros.  They are
 handy shorthand wrappers around malloc.

Another convenient way of allocating a pool of memory is to use obstacks
(See libiberty/obstack.c).


Re: Import GCC 4.2.0 PRs

2007-03-13 Thread Diego Novillo
Richard Guenther wrote on 03/13/07 05:57:

> Yes, this is a similar issue as PR30840 on the mainline, the CCP propagator
> goes up the lattice in some cases.  This is something Diego promised me to
> look at ;)  But we might be able to paper over this issue in 4.2 ...

I'll take a look later this week.



Re: Query regarding struct variables in GIMPLE

2007-03-13 Thread Diego Novillo
Karthikeyan M wrote on 03/13/07 21:32:

> appears as  x.j = 10 inside the GIMPLE dump of the function body . Is
> there some place from where I can get it in the following( or any
> other simpler ) form

No, we don't unnecessarily take addresses of variables.  Structure
references are left intact.  For some aggregates that cannot be
scalarized we try to create artificial tags to represent the fields (to
get field sensitive results in points-to resolution).


Re: can't find VCG viewer

2007-03-15 Thread Diego Novillo
Sunzir Deepur wrote on 03/14/07 05:36:

> any idea where I can find a (free) graphical VCG viewer suitable
> for gcc's vcg outputs ?

I'd recommend the attached script.  Feed the output to GraphViz.  The
script may need changes if you are using RTL dumps.
#!/bin/sh
#
# (C) 2005 Free Software Foundation
# Contributed by Diego Novillo <[EMAIL PROTECTED]>.
#
# This script is Free Software, and it can be copied, distributed and
# modified as defined in the GNU General Public License.  A copy of
# its license can be downloaded from http://www.gnu.org/copyleft/gpl.html

if [ "$1" = "" ] ; then
echo "usage: $0 file"
echo
echo "Generates a GraphViz .dot graph file from 'file'."
echo "It assumes that 'file' has been generated with -fdump-tree-...-blocks"
echo
exit 1
fi

file=$1
out=$file.dot
echo "digraph cfg {"> $out
echo "  node [shape=box]"   >>$out
echo '  size="11,8.5"'  >>$out
echo>>$out
(grep -E '# BLOCK|# PRED:|# SUCC:' $file |  \
sed -e 's:\[\([0-9\.%]*\)*\]::g;s:([a-z_,]*)::g' |  \
awk '{  #print $0;  \
if ($2 == "BLOCK")  \
{   \
bb = $3;\
print "\t", bb, "[label=\"", bb, "\", style=filled, color=gray]";   \
}   \
else if ($2 == "PRED:") \
{   \
for (i = 3; i <= NF; i++)   \
print "\t", $i, "->", bb, ";";  \
}   \
}') >> $out
echo "}">> $out


ANNOUNCE: Gelato ICE GCC track, San Jose, CA, April 16-18, 2007

2007-03-15 Thread Diego Novillo

The GCC track will be on Mon 16/Apr and Tue 17/Apr.  The program should
be complete by now

 - Program at-a-glance:
http://ice.gelato.org/pdf/gelatoICE_ataglance.pdf

 - Speaker list and abstracts:
http://ice.gelato.org/program/program.php

-

The following GCC track is part of the Gelato ICE (Itanium Conference
& Expo) technical program, April 16-18, 2007, San Jose, CA. All
interested GCC developers are invited to attend.
A working list of speakers and topics can be found here:


This year there is a strong focus on Linux. Andrew Morton and Wim
Coekaerts, Senior Director for Linux Engineering at Oracle, are
keynote speakers. In addition to the GCC track, there are tracks
covering the Linux IA-64 kernel, virtualization, tools and tuning,
multi-core programming, and research.

GCC Track at Gelato ICE:

- Update on Scheduler Work & Discussion of New Software Pipelining
Work, Arutyun Avetisyan, Russian Academy of Science
- GPL2 and GPL3, Dan Berlin, Google
- Update on the Gelato GCC Build Farm, Matthieu Delahaye,
Gelato Central Operations
- Update on Prefetching Work, Zdenek Dvorak, SuSE
- Interprocedural Optimization Framework, Jan Hubicka, SuSE
- Update on Superblock Work, Bob Kidd, University of Illinois
- GCC and Osprey Update, Shin-Ming Liu, HP
- Compiling Debian Using GCC 4.2 and Osprey, Martin Michlmayr, Debian
- Update on Alias Analysis Work, Diego Novillo, Red Hat
- Update on LTO, Kenneth Zadeck, NaturalBridge



Re: Query regarding struct variables in GIMPLE

2007-03-15 Thread Diego Novillo
Karthikeyan M wrote on 03/15/07 15:06:
> Thanks.
> Can you point me to documentation / code where I can get more
> information about these artificial tags ?

gcc/tree-ssa-alias.c:create_structure_vars()

The section on Structural alias analysis in the internals documentation
should also help.


Re: Google SoC Project Proposal: Better Uninitialized Warnings

2007-03-19 Thread Diego Novillo
Manuel López-Ibáñez wrote on 03/17/07 14:28:

> This is the project proposal that I am planning to submit to Google
> Summer of Code 2007. It is based on previous work of Jeffrey Laws,
> Diego Novillo and others. I hope someone will find it interesting and

Yes, I can act as a mentor.

I'm particularly interested in what we are going to do at -O0.  Ideally,
I would try to build the SSA form and/or a predicated SSA form and try
to phrase the problem in terms of propagation of the uninitialized
attribute.

I agree with your goal of consistency.  The erratic behaviour of the
current -Wuninitialized implementation is, to me, one of the most
annoying traits of GCC.  We can't even reorder the pass pipeline without
running into this problem.


Re: We're out of tree codes; now what?

2007-03-19 Thread Diego Novillo
Steven Bosscher wrote on 03/19/07 10:14:

> IMHO this is still the better solution than the subcodes idea.

Agreed.  If the performance hit is not too large, getting a wider tree
code is much simpler to maintain.


Re: Google SoC Project Proposal: Better Uninitialized Warnings

2007-03-19 Thread Diego Novillo
Manuel López-Ibáñez wrote on 03/19/07 14:45:

> Is building this early SSA form something that can be tackled by a
> newbie developer with almost zero middle-end knowledge within the time
> frame of the Summer of Code?

Yes, it should not be too hard.  See tree_lowering_passes.

You may also want to read on the compilation flow to get some idea on
how things are processed after parsing.  There are some diagrams at
http://people.redhat.com/dnovillo/Papers/#cgo2007


Re: Creating parameters for functions calls

2007-03-28 Thread Diego Novillo
Antoine Eiche wrote on 03/27/07 13:28:

> Thanks for any help in finishing this pass

See how omp-low.c builds calls to the child parallel functions
(create_omp_child_function).


Re: GCC 4.2.0 Status Report (2007-03-22)

2007-03-29 Thread Diego Novillo
Mark Mitchell wrote on 03/22/07 22:10:

> Diego, Roger, Jason, would you please let me know if you can work on the
> issues above?  I'm going to try to test Jim's patch for PR 31273 tonight.

I'm looking at 29585 today.


Re: GCC 4.2.0 Status Report (2007-03-22)

2007-03-30 Thread Diego Novillo
Mark Mitchell wrote on 03/22/07 22:10:

> PR 29585 (Novillo): ICE-on-valid

This one seems to be a bug in the C++ FE, compounded by alias analysis
papering over the issue.  We are failing to mark DECLs in vtbl
initializers as addressable.  This causes the failure during aliasing
because it is added to a points-to set but not marked for renaming.

Since the variable does not have its address taken, we initially do not
consider it interesting in the setup routines of alias analysis.
However, the variable ends up inside points-to sets and later on we put
it inside may-alias sets.  This causes it to appear in virtual operands,
but since it had not been marked for renaming, we fail.

I traced the problem back to the building of vtables.  I'm simply
calling cxx_mark_addressable after building the ADDR_EXPR (I'm wondering
if building ADDR_EXPR shouldn't just call langhooks.mark_addressable).

Another way of addressing this would be to mark symbols addressable
during referenced var discovery.  But that is a bit hacky.

Mark, does this look OK?  (not tested yet)

Index: cp/class.c
===
--- cp/class.c  (revision 123332)
+++ cp/class.c  (working copy)
@@ -7102,6 +7102,7 @@
   /* Figure out the position to which the VPTR should point.  */
   vtbl = TREE_PURPOSE (l);
   vtbl = build1 (ADDR_EXPR, vtbl_ptr_type_node, vtbl);
+  cxx_mark_addressable (vtbl);
   index = size_binop (PLUS_EXPR,
  size_int (non_fn_entries),
  size_int (list_length (TREE_VALUE (l))));



Re: GCC 4.2.0 Status Report (2007-03-22)

2007-03-30 Thread Diego Novillo
Jason Merrill wrote on 03/30/07 11:45:

> Looks fine to me.  Many places in the front end use build_address rather 
> than build1 (ADDR_EXPR) to avoid this issue.

Yeah, I found other cases in Java and in c-*.c.  In one case, we are
building the address of a LABEL_DECL for a computed goto
(finish_label_address_expr).

Interestingly enough, mark_addressable refuses to mark the label as
addressable, but we need the label addressable so that it's processed
properly by the compute_may_aliases machinery.

Given that we need to be very consistent about addressability marking in
the FEs, wouldn't we be better off doing this in build1_stat()?

Index: tree.c
===
--- tree.c  (revision 123332)
+++ tree.c  (working copy)
@@ -2922,7 +2922,11 @@ build1_stat (enum tree_code code, tree t

 case ADDR_EXPR:
   if (node)
-   recompute_tree_invariant_for_addr_expr (t);
+   {
+ recompute_tree_invariant_for_addr_expr (t);
+ if (DECL_P (node))
+   TREE_ADDRESSABLE (node) = 1;
+   }
   break;

 default:


Thanks.


Re: GCC 4.2.0 Status Report (2007-03-22)

2007-03-30 Thread Diego Novillo
Mark Mitchell wrote on 03/30/07 12:22:

> So, I think the right fix is (a) the change above, (b) remove the
> TREE_ADDRESSABLE setting from mark_vtable_entries (possibly replacing it
> with an assert.)

After removing the papering over TREE_ADDRESSABLE we were doing in the
aliaser, I found that other users of ADDR_EXPR are not consistently
setting the addressability bit.

This led me to this patch, which I'm now testing.  This removes the
workaround we had in the aliaser and consistently marks every DECL
that's put in an ADDR_EXPR as addressable.

One thing that I'm wondering about this patch is why hasn't this been
done before?  We seem to purposely separate TREE_ADDRESSABLE from
ADDR_EXPR.  Perhaps to prevent pessimistic assumptions?  The current
aliasing code removes addressability when it can prove otherwise.

This patch bootstraps all default languages.  I'll test Ada later on,
but I need input from all the FE folks.


Thanks.


2007-03-30  Diego Novillo  <[EMAIL PROTECTED]>

	* tree.c (build1_stat): When building ADDR_EXPR of a DECL,
	mark it addressable.
	* tree-ssa-alias.c (add_may_alias): Assert that ALIAS may be
	aliased.
	* c-typeck.c (c_mark_addressable): Handle LABEL_DECL.

Index: tree.c
===
--- tree.c	(revision 123332)
+++ tree.c	(working copy)
@@ -2922,7 +2922,11 @@ build1_stat (enum tree_code code, tree t
 
 case ADDR_EXPR:
   if (node)
-	recompute_tree_invariant_for_addr_expr (t);
+	{
+	  recompute_tree_invariant_for_addr_expr (t);
+	  if (DECL_P (node))
+	lang_hooks.mark_addressable (node);
+	}
   break;
 
 default:
Index: tree-ssa-alias.c
===
--- tree-ssa-alias.c	(revision 123332)
+++ tree-ssa-alias.c	(working copy)
@@ -2045,11 +2045,7 @@ add_may_alias (tree var, tree alias)
   gcc_assert (var != alias);
 
   /* ALIAS must be addressable if it's being added to an alias set.  */
-#if 1
-  TREE_ADDRESSABLE (alias) = 1;
-#else
   gcc_assert (may_be_aliased (alias));
-#endif
 
   if (v_ann->may_aliases == NULL)
 v_ann->may_aliases = VEC_alloc (tree, gc, 2);
Index: c-typeck.c
===
--- c-typeck.c	(revision 123332)
+++ c-typeck.c	(working copy)
@@ -3247,6 +3247,7 @@ c_mark_addressable (tree exp)
 
 	/* drops in */
   case FUNCTION_DECL:
+  case LABEL_DECL:
 	TREE_ADDRESSABLE (x) = 1;
 	/* drops out */
   default:


Re: GCC 4.2.0 Status Report (2007-03-22)

2007-03-30 Thread Diego Novillo
Diego Novillo wrote on 03/30/07 13:21:

> This patch bootstraps all default languages.  I'll test Ada later on,
> but I need input from all the FE folks.

Sigh.  I forgot to include Mark's suggestion in the patch.  With this
patch, calling build_address in dfs_accumulate_vtbl_inits is not
strictly required (because we mark the DECL addressable in build1 now),
but I will include it in the final version.


Re: GCC 4.2.0 Status Report (2007-03-22)

2007-03-30 Thread Diego Novillo
Richard Kenner wrote on 03/30/07 13:45:

> One concern I have in marking a DECL addressable that early on is that
> it may stay "stuck" even if the ADDR_EXPR is later eliminated.  This can
> be common in inlined situations, I thought.

The aliaser is fairly aggressive at removing TREE_ADDRESSABLE from
variables that do not need it anymore, so that should not be a problem.


> We *do* have to make up our mind, of course, on a precise time when it's
> set and be very clear about whether we can reset it (and how) if we
> discover later that the address actually isn't being taken.

Agreed.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Richard Guenther wrote on 04/10/07 08:01:

> It looks decent, but I also would go one step further with location
> information and
> PHI node "canonicalization".

What would be the "step further" for PHI nodes?  We haven't really left
anything to spare inside GS_PHI.

For insn locators, the idea is to simply move the RTL insn locator code
into GIMPLE.  This can be even done early in the implementation process,
but for simplicity we left it for later.  If someone wants to work on
that, then great.

> Further for memory usage we may want to use
> available padding on gimple_statement_base as flags or somehow trick gcc to 
> use
> tail-padding for inheritance...

There is only going to be padding on 64 bit hosts.  Instructions with no
subcode will use those bits as bitflags.

> For traversal speed I'd also put the chain first in the structure:

Sure.  That's easy enough to experiment with while doing the implementation.
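A rough sketch of the kind of base record being discussed; the field
widths, names and order here are illustrative, not the final layout:

```c
struct gimple_statement_base
{
  struct gimple_statement_base *next;  /* chain first, for traversal */
  struct gimple_statement_base *prev;
  unsigned int code : 8;       /* GS_ASSIGN, GS_COND, GS_CALL, ...  */
  unsigned int subcode : 16;   /* operation; doubles as flag bits for
                                  codes that have no subcodes */
  struct basic_block_def *bb;  /* containing basic block */
  source_locus locus;          /* statement locator */
};
```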


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Andrew Pinski wrote on 04/10/07 01:43:

> Yes clobbers for asm are strings and don't really need to be full
> trees.  This should help out more.  though it does make it harder to
> implement.

Hmm, could be.  The problem is that __asm__ statements are so infrequent 
that I'm not sure it's worth the additional headache.

> For "GS COND", you forgot about all the unorder conditionals which
> don't happen that often but can.

Good point.  Thanks.

> Most of the labels can go away with better handling of the CFG.  Gotos
> really should pointing to the basic block rather than labels.

Remember that when we emit GIMPLE, we are not working with a CFG. One
could argue that we would want to make the GOTO target a union of a
label and block, but a label-to-block map is just as good, and easier to
work with.

> I also think we can improve our current gimplification which produces
> sometimes inefficient gimplification which will change your numbers of
> how many copies exist, see PRs 27798, 27800, 23401, 27810 (this one
> especially as combine.i numbers show that).

True.  But remember that the stats were gathered inside the SSA
verification routines, which happens during optimization.  All those
gimplification inefficiencies are quickly eliminated by the first few
scalar cleanups.

I very much doubt that you would see significantly different instruction
distribution profiles if you made improvements in the gimplifier.


> Also I noticed in your pdf, you have "PHI NODE" as 12%, we can improve
> the memory usage for this statement by removing the usage of
> TREE_CHAIN/TREE_TYPE, so we can save 4/8 bytes for those 12% without
> doing much work.  I can send a patch in the next week or so (I am busy
> at a conference the next two days but I can start writing a patch
> tomorrow).

Well, if you want to do this patch for 4.3 and it gives us sufficient
benefit, go for it.  But it will have no effect on the tuples branch.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
J.C. Pizarro wrote on 04/10/07 01:24:

> 1. Are there fields for flags, annotations, .. for special situations?

Yes, some instructions have no sub-codes and will use the subcodes field
for flags.  Annotations and such will be discouraged as much as
possible.  If an attribute is very frequently used and causes
significantly degraded behaviour in pointer-maps or hash tables, then we
can see about adding it somewhere.

> 2. These structures are poorly specified.
>   Do they have advanced structures like lists, e.g., lists of the
>   predecessor instructions of loops, predecessor instructions of
>   forwarded jumps, etc., instead of a poor "prev"?

I think you are missing the point.  This structure defines the bits
needed for representing GIMPLE.  All we need is a double-chain of
instructions.  These chains are embedded inside basic blocks.

> 3. Are there fields for more debug information?

More debug information?  What debug information are you looking for?


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
J.C. Pizarro wrote on 04/10/07 02:08:

> However, "conditional moves" have appeared to avoid jumping and
> consequently to reduce the penalty of the conditional jump.

We already have conditional moves.  Notice that subcodes for GS_ASSIGN
have the same meaning as they do today.  GS_ASSIGN:{EQ_EXPR,NE_EXPR...}
will have three operands.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Steven Bosscher wrote on 04/10/07 02:43:
> On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote:
>> Thoughts/comments on the proposal?
> 
> This looks a lot like the RTL insn!
> 
> For locus, you can use just an "int" instead of a word if you use the
> same representation for locations as we do for RTL (INSN_LOCATOR). You
> mention this step as "straightforward to implement" after the
> conversion to tuples is complete. Why did you decide not to just use
> the scheme right from the start?

One less thing to think about.  That's the only reason, if anyone wants
to work on this from day 1, then great.  But since this is easily
fixable afterwards, and we'll already have enough headaches with the
basic conversion, it seemed simpler just to let this be for now.

> this: You can use the same locator information in GIMPLE as in RTL,
> which saves a conversion step; and you free up 32 bits on 64-bits
> hosts, which is nice when you add the inevitable so-many-bits for
> flags to the GIMPLE tuples ;-)

Absolutely.  The advantages are very clear.  Location information at
every insn is extremely redundant.

> I don't really like the idea for promoting subcodes to first-level
> codes, like you do for GS_COND NE and EQ. Looks complicated and
> confusing to me. What is the benefit of this?

Mostly, speed of recognition.  I'm not totally against dropping this.
As Andrew M. mentioned during our internal discussions, we will now have
to implement predicates that recognize all the insns in the "GS_COND"
family.

This is something we can experiment with.

> Looks like a nice document overall.  I hope we can keep it up to date,
> it's a good start for new-GIMPLE documentation ;-)

That's certainly a challenge.  I could mumble something about doxygen
and how much easier it would be if we could embed this in the source
code.  But the synchronization problem still remains.  Many times we
have comments totally out of sync with the code.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
J.C. Pizarro wrote on 04/10/07 08:17:

> Is a need to build several tables in HTML of the codes (with subcodes).
> Each table has an explanation. It's like a roadmap.

Hmm, what?


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
J.C. Pizarro wrote on 04/10/07 10:24:

> For example, worth weights, use frequencies, statistical data, ... of GIMPLE.
> To debug the GIMPLE too.

That's kept separately.  Pointer maps, hash tables...

> How you debug the failed GIMPLE?

Lots of debug_*() functions available.  You also use -fdump-tree-... a
lot.  In the future, I would like us to be able to inject GIMPLE
directly at any point in the pipeline to give us the illusion of unit
testing.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Richard Guenther wrote on 04/10/07 10:45:
> On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote:
>> Steven Bosscher wrote on 04/10/07 02:43:
>>> I don't really like the idea for promoting subcodes to first-level
>>> codes, like you do for GS_COND NE and EQ. Looks complicated and
>>> confusing to me. What is the benefit of this?
>> Mostly, speed of recognition.  I'm not totally against dropping this.
>> As Andrew M. mentioned during our internal discussions, we will now have
>> to implement predicates that recognize all the insns in the "GS_COND"
>> family.
>>
>> This is something we can experiment with.
> 
> Will this replace the tree code class or is it merely sub-classing and
> the sub-code
> is the real code we have now?

Sorry, I don't understand what you are trying to say.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Richard Guenther wrote on 04/10/07 11:02:

> Well, we now have a tcc_comparison for example - is the gimple statement code
> something like that?  Or will the grouping of gimple statement codes
> to code classes
> persist?  If we were able to encode the class directly in the gimple
> statement we
> can avoid a memory reference to look up the operands class.

For most situations, I would like to avoid the class lookup and be able
to go off the statement code directly.  I have to admit that I am not
totally convinced that this promotion of subcodes to first-level codes
is a good idea.

Richard suggested it to avoid having to lookup the subcode when
recognizing frequent codes like copy assignments.  But that also means
that we now have to add more cases to switch() statements and/or need || predicates
to determine what kind of GS_ASSIGN we are dealing with.

I'm ready to be convinced either way.  OTOH, either approach would not
make the design drastically different.  We could explore both options
and get numbers.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Ian Lance Taylor wrote on 04/10/07 13:53:

> Don't you need four operands for a conditional move?  Is that what you
> meant?

Ah, yes.  The two comparison operands and the two assignment values.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Ian Lance Taylor wrote on 04/10/07 13:53:

> I seem to recall that at one point somebody worked on a gensimplify
> program or something like that.  Would it make sense to revive that
> approach, and use it to generate simplifiers for trees, GIMPLE, and
> RTL, to avoid triplification of these basic optimizations?

Perhaps.  This would allow us to define folding/simplification using a
pattern matching system.  I think I like this better than the other two
choices.

Replicating fold-const.c for GIMPLE would involve a bit of code
duplication, but since GIMPLE is a strict subset of ASTs, I think it
would be a fraction of what we have today.  Still, it's annoying and we
should probably avoid it.


> Or should we instead rewrite fold-const.c to work on GIMPLE rather
> than trees, thus essentially removing constant folding from the
> front-ends?  If we follow that path somebody would need to think about
> the effect on warnings issued by the front-end, and on
> __builtin_constant_p.

I don't think we want to do that.  Folding and simplification needs to
be done at just about every level.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Ian Lance Taylor wrote on 04/10/07 14:13:
> Diego Novillo <[EMAIL PROTECTED]> writes:
> 
>> Following up on the recent discussion about GIMPLE tuples
>> (http://gcc.gnu.org/ml/gcc/2007-03/msg01126.html), we have summarized
>> our main ideas and implementation proposal in the attached document.
>>
>> This should be enough to get the implementation going, but there will be
>> many details that still need to be addressed.
>>
>> Thoughts/comments on the proposal?
> 
> For purposes of LTO, it is essential that we be able to efficiently
> load the IR during the whole program optimization phase.

Certainly.  Now, is GIMPLE going to be our bytecode?  Or do we want to
invest the time in some other bytecode representation with an eye
towards a future JIT component?

Regardless of that, streaming GIMPLE is certainly something worth
pursuing.  Even if it is just to give us the ability to load IL snippets
and inject them into the optimizers without having to go through all the
intermediate steps.

> Part of that is almost certainly going to mean having some sort of
> index which will tell us whether to load the IR at all--if the
> functions represented in some .o file are rarely called, then we
> should use the .o file as-is, and not try to further optimize it.
> This is not part of the current LTO plan, but I think it is inevitable
> if we are to avoid an hours-long compilation process.
> 
> But there is another part: we need to have an IR which can be very
> quickly loaded into memory for further processing.  When it comes to
> loading IR, there is nothing faster than mmap.  That requires that the
> IR be stored in the .o file in a form which can be used directly when
> it is read in from memory.  And ideally that means no pointer
> swizzling: the IR should be usable when loaded without modification.
> And because the link phase can see arbitrary .o files, we can not use
> the PCH hack of requiring a specific memory address.  So that requires
> an IR which is position independent.
> 
> The obvious way to make the proposed tuples position independent would
> be to use array offsets rather than pointers.  This has the obvious
> disadvantage that every access through a pointer requires an
> additional memory reference.  On the other hand, it has some other
> advantages: it may no longer be necessary to keep a previous pointer

I doubt this.  We had started with singly-linked chains but reverse
traversals do occur, and they were very painful and slow.

> in each tuple; we can delete tuples by marking them with a deleted
> code, and we can periodically garbage collect deleted tuples and fix
> up the next pointers.  On a 64-bit system, we do not need to burn 64
> bits for each pointer; 32 bits will be sufficient for an array offset.
> 
> I would like us to seriously think about this approach.  Most of the
> details would be hidden by accessor macros when it comes to actual
> coding.  The question is whether we can tolerate some slow down for
> normal processing in return for a benefit to LTO.
> 
> If anybody can see how to get the best of both worlds, that would of
> course be even better.

I've thought about this a little bit and it may not be all that onerous.
 So, if you take the components of a tuple:

  next    Could be a UID to the next tuple
  prev    Likewise
  bb      Use bb->index here
  locus   Not needed.  INSN_LOCATOR.
  block   Likewise.

The operands may get tricky, but perhaps not so much.  We have

a- _DECLs.  These are easily replaced with their UID and a symbol table.
b- SSA_NAMEs.  Just the SSA name version is enough.
c- *_CONST.  They're just a bit pattern, no swizzling required.  But we
may need to define byte ordering.
d- *_REF.  These may be tricky, but we could emit them using a REF table
and just put the index here.

We then reserve the first few bits to distinguish the class of operand
and the remaining bits as the index into the respective table.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Daniel Berlin wrote on 04/10/07 15:18:
> There is no need for the addresses_taken bitmap, it's a waste of space.

Awesome.  I was going to check, but I forgot.  I did check the
stmt_makes_clobbering_call and that one is also write-only.


> Neither of these really needs it, and i have a patch to remove it entirely.

Excellent.  Thanks.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Dave Korn wrote on 04/10/07 15:39:

>   Reverse-traversing an array really isn't all that painful or slow!

Instructions are not laid out in an array.  Insertions and removals
happen constantly, so using arrays to represent basic blocks would be
very expensive.

>   How about delta-linked lists?  Makes your iterators bigger, but makes every
> single node smaller.

Worth a shot, I guess.  Don't recall what other properties these things had.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Andrew Pinski wrote on 04/10/07 16:04:
> On 4/10/07, Andrew Pinski <[EMAIL PROTECTED]> wrote:
>> Here is the quick patch (thanks to the work done for gimple tuple)
>> which does this, removes the unneeded type from phi nodes.  I have not
>> tested it except for a quick test on some small testcases so there
>> might be more places which use TREE_CHAIN instead of PHI_CHAIN.
> 
> This patch has one problem, the GC issue with chain_next and
> lang_tree_node, this problem is not hard to fix.

I'm not sure what you want me to do with this patch.  It's not related
to this thread and would not be applicable to the tuples branch.

I would suggest that you thoroughly test the patch, measure the benefit
and if it's good enough propose it for 4.3.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Richard Henderson wrote on 04/10/07 20:30:

> Perhaps I misunderstood what Diego was proposing, but I 
> would have thought the subcode would continue to be the
> tree PLUS_EXPR, and not a GS_PLUS something.

Yes.

> With that, build_foldN does essentially what we want,
> without having to regenerate tree nodes on the input side.

Sure, but things will be different if/when the operands stop being 'tree'.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Ian Lance Taylor wrote on 04/10/07 20:49:

> I'm having a hard time seeing it.  fold_build2 calls fold_binary; I
> agree that if we can handle fold_binary, we can handle fold_build2.
> But fold_binary takes trees as parameters.  How are you thinking of
> calling it?

The GIMPLE version of z = x + y is

  stmt -> GS_ASSIGN <PLUS_EXPR, z, x, y>

and we have a wrapper that calls:

  op0 = GIMPLE_OPERAND (stmt, 0);  /* z */
  op1 = GIMPLE_OPERAND (stmt, 1);  /* x */
  op2 = GIMPLE_OPERAND (stmt, 2);  /* y */
  t = fold_build2 (GIMPLE_SUBCODE (stmt), TREE_TYPE (op0), op1, op2);

and then stuffs t's operands back into 'stmt'.


Re: RFC: GIMPLE tuples. Design and implementation proposal

2007-04-10 Thread Diego Novillo
Richard Henderson wrote on 04/10/07 21:19:
> On Tue, Apr 10, 2007 at 08:48:27PM -0400, Diego Novillo wrote:
>> Sure, but things will be different if/when the operands stop being 'tree'.
> 
> We'll burn that bridge when we come to it.

Works for me.


New wiki page on testing compile times and memory usage

2007-04-12 Thread Diego Novillo

I've added a collection of scripts that I have gathered over time to
test compile time and memory usage when making changes to the compiler.

http://gcc.gnu.org/wiki/PerformanceTesting

If you have other scripts or tests that could be used for this, please
add them to this page.


Thanks.


GIMPLE tuples document uploaded to wiki

2007-04-13 Thread Diego Novillo

I have added the design document and links to most of the discussions
we've had so far.  Aldy updated the document to reflect the latest thread.

http://gcc.gnu.org/wiki/tuples


Re: GIMPLE tuples document uploaded to wiki

2007-04-14 Thread Diego Novillo
Jan Hubicka wrote on 04/14/07 16:14:

> Looks great, still I think "locus" and "block" could both be merged into
> a single integer, like RTL land has INSN_LOCATOR.

That's the idea.  But it's simpler to do this for now.  The insn locator
is easily done at anytime during the implementation.

> Also ssa_operands structures should be somewhere in the header and uid
> would be handy for on-side datastructures.

No.  SSA operands need to be split in the instructions that actually
need them.  Also, UIDs are tempting but not really needed.  I would only
consider them if using pointer-maps or hash tables gets outrageously
expensive.


> In the CFG, getting rid of labels in GS_COND would actually save us
> noticeable amount of memory by avoiding the need for labels.  Perhaps we
> can simply define "true branch target"/"false branch target" to point to
> label or BB depending on CFG presence or to be NULL after CFG conversion
> and rely on CFG edges. GS_SWITCH would be harder, since association with
> CFG edges is not so direct.

Sure, that would be something to consider.


Re: GIMPLE tuples document uploaded to wiki

2007-04-14 Thread Diego Novillo
Jan Hubicka wrote on 04/14/07 21:14:

> I just wondered if your document is documenting the final shape or what
> should be done during the first transition.  If the second, probably 2
> words should be accounted for location, as source_locus is currently a
> structure.

The document is describing what the initial implementation will look
like.  It will/should evolve as the implementation progresses.


> So you expect the ssa_operands to be associated via a hashtable

Hmm?  No, they are there.  Notice gimple_statement_with_ops and
gimple_statement_with_memory_ops.


> Concerning uids, it is always difficult to get some good data on this
> sort of thing.  It seems to me that the UID would be handy and easy to
> bundle with some other integer, but it is not too important, especially
> if we get some handy abstraction to map data to statements that we can
> easily switch between hashtables and arrays to see the difference,
> instead of writing hashtables by hand in every pass doing this.

I grepped for uid and IIRC there are only two passes using UIDs: DSE and
PRE.  We should see how they react to having uid taken away from them.


> I have some data for this, lets discuss it at ICE.

Sounds good.

