Re: [C++] Possible GCC bug

2012-11-16 Thread Dodji Seketeli
Jiri Palecek wrote:

> Ulf Magnusson wrote:
>> On Wed, Nov 14, 2012 at 6:10 PM, Piotr Wyderski
>>   wrote:
>>> The following snippet:
>>>
>>> class A {};
>>> class B : public A {
>>>
>>> typedef A super;
>>>
>>> public:
>>>
>>> class X {};
>>> };
>>>
>>>
>>> class C : public B {
>>>
>>> typedef B super;
>>>
>>> class X : public super::X {
>>>
>>> typedef super::X super;
>>> };
>>> };
>>>
>>> compiles without a warning on Comeau and MSVC, but GCC (4.6.1 and
>>> 4.7.1) fails with the following message:
>>>
>>> $ gcc -c bug.cpp
>>> bug.cpp:18:24: error: declaration of ‘typedef class B::X C::X::super’
>>> [-fpermissive]
>>> bug.cpp:14:14: error: changes meaning of ‘super’ from ‘typedef class B
>>> C::super’ [-fpermissive]
>>>
>>> Should I file a report?
>>>
>>>  Best regards, Piotr
>> Here's a two-line TC:
>>
>> typedef struct { typedef int type; } s1;
>> struct S2 { s1::type s1; };
>>
>> Fails with GCC 4.6.3; succeeds with clang 3.0. Looks like a bug to me.
>>
>> /Ulf
>>
> In your example, GCC is in fact right. Basically, you mustn't have a name 
> refer to two things in a class:
>
> 3.3.6/1: ... A name N used in a class S shall refer to the same declaration 
> in its context and when re-evaluated in the
> completed scope of S. No diagnostic is required for a violation of
> this rule. ...

That, and [dcl.typedef]/6 says:

In a given scope, a typedef specifier shall not be used to redefine
the name of any type declared in that scope to refer to a different
type.

So, I tend to think that GCC is right here.
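
As an illustration of the class-scope rule quoted above, here is a minimal sketch of how Piotr's snippet could be restructured so that the name `super` is never looked up inside `C::X` before `C::X` declares its own `super`. Writing `B::X` in place of `super::X`, and using `struct` so that the members are publicly checkable, are assumptions made for this example; nobody in the thread proposed this exact rewrite.

```cpp
#include <type_traits>

struct A {};

struct B : A {
    typedef A super;
    struct X {};
};

struct C : B {
    typedef B super;            // C::super names B

    // The base is spelled B::X rather than super::X, so the name
    // 'super' is not used inside C::X before C::X::super is declared.
    struct X : B::X {
        typedef B::X super;     // C::X::super names B::X
    };
};

static_assert(std::is_same<C::super, B>::value, "C::super is B");
static_assert(std::is_same<C::X::super, B::X>::value, "C::X::super is B::X");
```

With this spelling each use of a name inside `C::X` refers to the same declaration both at the point of use and in the completed class, which is what the quoted rule asks for.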

-- 
Dodji


Re: Unifying the GCC Debugging Interface

2012-11-16 Thread Martin Jambor
Hi,

On Wed, Nov 14, 2012 at 05:12:15PM -0800, Lawrence Crowl wrote:
> Diego and I seek your comments on the following (loose) proposal.
> 
> 
> It is sometimes hard to remember which printing function is used
> for debugging a type, or even which type you have.

Yeah, from time to time I still need to look into my list.  I like the
goal, however...

> 
> We propose to rely on overloading to unify the interface to a small
> set of function names.  Every major data type should have associated
> debug/dump functionality.  We will unify the current *_dump/*_debug
> functions under the same common overloaded name.
> 
> We intend to only apply this approach to functions that take the
> type to display as an argument, and that are routinely used in
> debugging.
> 
> We propose to provide several function overload sets, as below.
> 
> 
> dump_pretty
> 
> This function overload set provides the bulk of the printing.
> They will use the existing pretty-printer functions in their
> implementation.

So you do not plan to replace/rename at least some of them?  This
seems like unnecessary and confusing layering just to avoid the work
to do the right thing.

> 
> dump_raw
> 
> This function overload set provides the raw oriented dump,
> e.g. a tuple.

I'm not sure I understand the whole raw thing.

> 
> dump_verbose
> 
> This function overload set provides the extra details dump.
> 

> 
> All of these functions come in two forms.
> 
> function (FILE *, item_to_dump, formatting)
> function (item_to_dump, formatting)
> 

I'm afraid we can't really always rely on overloading.  For example,
even though I often use debug_tree to examine a tree, probably even
more often I just use debug_generic_expr.  When I write stuff into a
dump file, I rarely ever use the verbose variants but I certainly want
them to exist.  And there might be other similar cases, like the
.*_brief dumping functions that are sometimes also used.

Nevertheless, it would be great if we had fewer and consistent names
of dumping functions, even though perhaps not just three.  It would
also be nice if all the file variants had an integer indent parameter
;-)

Thanks,

Martin

> If the FILE* is not specified, the output is to stderr.  The
> formatting argument is optional, with a default suitable to the kind
> of item to dump.
> 
> 
> We should remove tree-browser.c.  It is not used at all and it is
> likely broken.
> 
> -- 
> Lawrence Crowl


Re: RFC - Alternatives to gengtype

2012-11-16 Thread Basile Starynkevitch
On Thu, Nov 15, 2012 at 07:59:36PM -0500, Diego Novillo wrote:
> As we continue adding new C++ features in the compiler, gengtype
> is becoming an increasing source of pain.  In this proposal, we
> want to explore different approaches to GC that we could
> implement.

Just a minor remark: we are speaking not only of Gengtype, but also of Ggc
and of PCH. I agree that they are all closely related (and perhaps even LTO
serialization might be affected). And I am biased, because GCC MELT is built
around the GCC garbage collector (if you don't know about MELT see
http://gcc-melt.org/ ; MELT is a domain specific language to extend GCC).
However, I will probably be able to adapt MELT to new conventions.

> 
> At this point, we are trying to reach consensus on the general
> direction that we should take.  Given how intertwined GC and PCH
> are, the choices we make for one affect the other.

My belief was that PCH (pre-compiled header) is deprecated in favour of PPH
(preprocessed headers). Will PCH continue to exist once the PPH effort goes
mainline? Or is PPH abandoned? How is the idea of "getting rid of GC"
related to PPH?

> We don't have a strong preference at the moment, but we are
> leaning in these directions:
> 
> Diego:
>   Get rid of GC completely.  Move into pool allocation.
>   If we continue to use GC, then either use boehm-gc or
>   continue to use the precise allocator, but with user
>   generated  marking.
> 
> Lawrence:
>   Get rid of GC completely.  Move into pool allocation.
>   If we continue to use GC, either move to user-generated
>   marking or implement the GTY attributes in cc1plus (read
>   below).  For PCH, use a fixed address where all the
>   PCH-able data would go, as this would represent less work
>   than implementing streamable types.

I actually disagree with the "Get rid of GC" idea, but I am not sure 
that we all understand the same thing about it (and I have the feeling 
of the opposite). I would probably agree with "Get rid of Gengtype+Ggc+PCH 
and replace it with something better" which might be what "Get rid of GC" means.


My strong belief is that a compiler project as gigantic as GCC needs some kind 
of garbage collection. I also believe that the current (4.7) garbage 
collection *implementation* (which is probably what both Diego and 
Lawrence call the "GC" to get rid of) is grossly unsatisfactory 
(so I parse "Get rid of GC" with a big lot of ambiguity).

To be more specific, I call garbage collection a scheme where (small, newbie)
GCC contributors can easily contribute some code to GCC without having to
understand when, and how precisely, some data will be freed. If a user adds
a pass (or a builtin) which adds e.g. some Gimple, he does not immediately
know when his Gimple should be freed (and it certainly should be freed
outside of his pass).

Memory liveness is a global non-modular property of data.
Whatever is done in GCC should be helpful to newbie GCC contributors.
The current situation is quite bad: only a few (perhaps a dozen or two)
GCC gurus have in their brain a clear enough picture of GCC memory to be
able to assess it reliably. And in the current scheme, a pass (e.g. added
by a plugin) may have to manage its memory manually.

I believe that we should state what the objectives of the memory
allocation part are. My opinion is that one of the strongest objectives
should be to ease the contributions of GCC newbies; they should not need
to read a lot of code or documentation related to memory management
before understanding how to add their own work, in a reliable way, to GCC.
I even have the opinion that adding features inside GCC which slightly
increase the compilation time (while maintaining the quality of the output
object code) can be justified by the ease of external contributions.
In particular, any rework of the memory management should make
the coding of plugins simpler, not harder.
 
A compiler as strong as GCC has a lot of data types (about 2000).
A compiler works on complex internal representations which are inherently
very circular: GCC is dealing with a lot of intermixed circular directed
graphs. This is in sharp contrast with e.g. huge graphical toolkits like Qt
or Gtk, where the memory references are tree-like (an X11 window contains a
set of sub-windows, and each X11 window belongs to exactly one parent; hence
naturally a Qt or Gtk widget belongs to at most one parent widget. We don't
have such nice properties inside GCC). So whatever we propose, we have to
deal with that unpleasant fact: the pointers inside a cc1 (or lto1) process
are very messy and circular, and it is not easy to know when a pointed zone
(i.e. a GCC "object") should be freed. We also have a difference with
graphical toolkits: the major resource we have to manage globally is
internal memory (in contrast, Qt or Gtk have to manage X11 windows which are
external resources inside the X11 server and outside of 

Re: RFC - Alternatives to gengtype

2012-11-16 Thread Laurynas Biveinis
> === Approach: Use the Boehm collector.
>
> The general approach is to define allocation and deallocation
> functions to the Boehm collector that handle memory within the
> PCH range.  We would also provide a class to register global
> roots at the point of declaration of those roots. We would likely
> configure the collector to collect only on command, not
> automatically.  Laurynas says previous efforts showed
> that the peak memory usage was slightly higher than with the
> existing collectors due to Boehm's conservativeness. The run time
> was comparable.

Yes. To be certain one would have to update Boehm's GC branch to the
current GCC and re-test. Barring any unknown unknowns, it requires
relatively little implementation effort.

> === Approach: Move RTL back to obstacks.
>
> Laurynas started this in 2011.  I'm not sure what the status of
> this is.  Laurynas?

Currently it's abandoned. I started it based on Bernd Schmidt's
experimental patch. I had good progress but could not (and still
cannot) allocate the time required to finish it. I think this project
is a good idea regardless of overall memory management strategy
chosen.

-- 
Laurynas


Re: RFC - Alternatives to gengtype

2012-11-16 Thread Uday P. Khedker



Basile Starynkevitch wrote, On Friday 16 November 2012 03:36 PM:



> To be more specific, I call garbage collection a scheme where (small newbie)
> GCC contributors can contribute easily some code to GCC without having to
> understand when, and how precisely, some data will be freed. If a user adds
> a pass (or a builtin) which adds e.g. some Gimple,


Couldn't agree with you more!

Imagine my frustration when GDFA (generic data flow analyzer,
http://www.cse.iitb.ac.in/grc/index.php?page=gdfa), which was working fine
with GCC-4.3.0, suddenly started seg faulting in the current version. The
first few passes (e.g. available expressions analysis, live variables
analysis, partially available expressions analysis) work fine but the later
pass of partial redundancy elimination seg faults. It took us some time to
figure out that the data flow information computed in earlier passes is no
longer available in the later passes.




> Memory liveness is a global non-modular property of data.
> Whatever is done in GCC should be helpful to newbie GCC contributors.
> The current situation is quite bad : only a few (perhaps a dozen or two)
> GCC gurus have in their brain a clear enough picture of GCC memory to be
> able to assess it reliably. And in the current scheme, a pass (e.g. added
> by a plugin) may have to manage its memory manually.


Can someone point me to a document (or maybe comments in some file) that
describes the overall memory management strategy? Which data created by a
pass can be expected to live outside of the pass, and which data cannot be
expected to live outside of the pass?



> I believe that we should state what are the objectives with the memory
> allocation part.
> My opinion is one of the strongest objective should be to ease the
> contribution of GCC newbies; they should not need to read a lot of code or
> documentation related to memory management before understanding how to add
> their own work, in a reliable way, to GCC.


Completely agree. It is very important to avoid surprises and unimaginable
side effects.

Uday Khedker.




Re: RFC - Alternatives to gengtype

2012-11-16 Thread Georg-Johann Lay
Diego Novillo wrote:

> === Approach: Move GTY to cc1plus.
> 
> Instead of a separate weak parser, we would make cc1plus understand GTY
> attributes.  The compiler would emit IL in the object files instead of
> generating source.

Does this mean GTY is implemented as a C++ language extension and the generated
code cares for whatever is needed automatically, e.g. might need support of
something like a libgty++ support library?

Johann

> This solution would require a first boot stage that did not support PCH,
> because we cannot rely on the seed compiler supporting GTY.  We would
> probably need to use the Boehm collector during the first stage as well.
> 
> Because the first stage would be fundamentally different from the second
> stage, we may need to add an additional pass for correct object file
> comparisons.


Re: Simplifying Gimple Generation

2012-11-16 Thread Michael Matz
Hi,

On Thu, 15 Nov 2012, Lawrence Crowl wrote:

> They allow us to use the same name for the same actions in two
> different contexts.  In particular, distinguishing between statement
> construction in SSA and non-SSA.

I don't see the difference, and I don't see where you need context data to 
distinguish that, in case you really need to.

> > That is, I'm not yet convinced that you really need any local
> > data for which you'd have to add helper classes.
> 
> We also need to ask if we will ever need local data.  If we plan
> for it now, future changes might be possible without affecting
> existing code.  Otherwise, we might require more substantial patches.

But so it _now_ requires substantial patches without it being clear that 
we ever need the complications.  Please don't optimize prematurely.

> > statements without introduction of any helper class.  Basically, 
> > whenever a helper class merely contains one member of a different 
> > type, then I think that other type should be improved so that the 
> > wrapper class isn't necessary anymore.  Fewer types -> good :)
> 
> While I would agree that unnecessary types are bad, I have found that 
> having types to represent distinct concepts is a good way to structure 
> my thinking and catch my errors.

But where exactly are your concepts different?  We have gimple_seq, we can 
append to it.  You had ssa_seq, you could append to it (nicer).  We can 
construct new statements already (boilerplaty), you can construct new 
statements (nicer).  It's just the same with a leaner interface.


Ciao,
Michael.


Re: Unifying the GCC Debugging Interface

2012-11-16 Thread Diego Novillo
On Fri, Nov 16, 2012 at 4:38 AM, Martin Jambor  wrote:

> So you do not plan to replace/rename at least some of them?  This
> seems like unnecessary and confusing layering just to avoid the work
> to do the right thing.

No, we plan to replace all the existing dumping routines.  We are just
not planning to add *new* ones.  Duplicating the existing routines
would indeed be useless.

>>
>> dump_raw
>>
>> This function overload set provides the raw oriented dump,
>> e.g. a tuple.
>
> I'm not sure I understand the whole raw thing.

This is to distinguish between:

a = b + c;

from



ASTs and RTL have something similar.  The raw output gives you a
different view.  Not every data structure will have that distinction.


> I'm afraid we can't really always rely on overloading.  For example,
> even though I often use debug_tree to examine a tree, probably even
> more often I just use debug_generic_expr.  When I write stuff into a
> dump file, I rarely ever use the verbose variants but I certainly want
> them to exist.  And there might be other similar cases, like the
> .*_brief dumping functions that are sometimes also used.

Sure.  The idea is to provide these variants via symbolic TDF_ style
flags.  For combinations that are very popular, we provide alternate
entry names so that you don't have to be specifying the flags all the
time.
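
To make the shape of this concrete, here is a minimal sketch of one overload set with both call forms, symbolic flags, an indent parameter, and one alternate entry name. Everything in it (`toy_assign`, `format_item`, `dump`, `dump_raw_entry`, the `DUMP_*` flags, and the output spelling) is hypothetical and chosen for illustration; none of it is an existing GCC interface.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical flag names, loosely modelled on the "TDF_ style" idea.
enum dump_flags { DUMP_DEFAULT = 0, DUMP_RAW = 1 << 0, DUMP_VERBOSE = 1 << 1 };

// Toy stand-in for a compiler data structure: lhs = op1 + op2.
struct toy_assign {
    std::string lhs, op1, op2;
};

// Core formatter shared by every entry point.
std::string format_item(const toy_assign &s, int flags = DUMP_DEFAULT,
                        int indent = 0)
{
    std::string out(indent, ' ');
    if (flags & DUMP_RAW)
        out += "assign <plus, " + s.lhs + ", " + s.op1 + ", " + s.op2 + ">";
    else
        out += s.lhs + " = " + s.op1 + " + " + s.op2 + ";";
    if (flags & DUMP_VERBOSE)
        out += "  # toy_assign";
    return out;
}

// Form 1: the caller names the stream.
void dump(FILE *f, const toy_assign &s, int flags = DUMP_DEFAULT,
          int indent = 0)
{
    assert(f != nullptr);
    std::fprintf(f, "%s\n", format_item(s, flags, indent).c_str());
}

// Form 2: no stream given, so the output goes to stderr.
void dump(const toy_assign &s, int flags = DUMP_DEFAULT, int indent = 0)
{
    dump(stderr, s, flags, indent);
}

// Alternate entry name for a popular flag combination, so that the
// flags do not have to be spelled out on every call.
void dump_raw_entry(const toy_assign &s) { dump(s, DUMP_RAW); }
```

From a debugger session one would call `dump (stmt)` or `dump_raw_entry (stmt)`; code writing to a dump file would use the `FILE *` form and pass flags and an indent explicitly.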


> Nevertheless, it would be great if we had fewer and consistent names
> of dumping functions, even though perhaps not just three.  It would
> also be nice if all the file variants had an integer indent parameter
> ;-)

Ah, good idea.  Thanks.


Diego.


Re: Unifying the GCC Debugging Interface

2012-11-16 Thread Diego Novillo
On Wed, Nov 14, 2012 at 8:48 PM, Andrew Pinski  wrote:

> Here is my proposal though I don't have time to work on it.  Make some
> python scripts which do the basic function of the debug_* functions.

No.  Debug traces and -fdump-* support.

Python pretty printers for gdb would be great, of course.  They are
just not a replacement for having our own dump facilities in the
compiler.


Diego.


Re: Simplifying Gimple Generation

2012-11-16 Thread Diego Novillo
On Thu, Nov 15, 2012 at 2:31 AM, Xinliang David Li  wrote:

>> ssa_stmt t = q.stmt (NE_EXPR, shadow, 0);
>> ssa_stmt a = q.stmt (BIT_AND_EXPR, base_addr, 7);
>> ssa_stmt b = q.stmt (shadow_type, a);
>> ssa_stmt c = q.stmt (PLUS_EXPR, b, offset);
>> ssa_stmt d = q.stmt (GE_EXPR, c, shadow);
>> ssa_stmt e = q.stmt (BIT_AND_EXPR, t, d);
>
>
> ssa_seq::stmt(...) sounds like a getter interface, not a creator.

Sure. They could be named new_stmt() or build_stmt() or something similar.

> x = q.new_assignment (...);
> x = q.new_call (..);
> x.add_arg(..);
> x = q.new_icall (..);
>
> l1 = q.new_label ("xx");
> l2 = q.new_label ("xxx");
> join_l = q.new_label ("...");
>
> x = new_if_then_else (cond, l1, l2, join_l);
> q.insert_label (l1);
> q.new_assignment (...);
> q.insert_label(l2);
> ...
> q.insert_label(join_l);
> q.close_if_then_else(x);

What I was thinking for if_then_else constructs was something along
the lines of:

stmt_seq then_body(s1);
then_body.add_stmt(s2);

stmt_seq else_body(r1);
else_body.add_stmt(r2);
stmt if_then_else(cond, then_body, else_body);

You can then take 'if_then_else' and insert it inside a basic block or
an edge.  When that happens, the builder takes care of the block/edge
splitting for you.

>> .. The statement result type is that of the arguments.
>>
>> .. The type of integral constant arguments is that of the other
>> argument.  (Which implies that only one argument can be constant.)
>>
>> .. The 'stmt' method handles linking the statement into the sequence.
>>
>> .. The 'set_location' method iterates over all statements.
>>
>> There will be another class of builders for generating GIMPLE
>> in normal form (gimple_stmt).  We expect that this will mostly
>> affect all transformations that need to generate new expressions
>> and statements, like instrumentation passes.
>
> What are the uses of the raw forms?

Sorry, what are these "raw forms" that you refer to?

>> tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
>> record_builder rec ("__asan_global");
>> rec.field ("__beg", const_ptr_type_node);
>> rec.field ("__size", inttype);
>> rec.field ("__size_with_redzone", inttype);
>> rec.field ("__name", const_ptr_type_node);
>> rec.field ("__has_dynamic_init", inttype);
>> rec.finish ();
>> tree ret = rec.as_tree ();
>
> Again, something like new_field or add_field is more intuitive.

Sure.


Diego.


Re: Simplifying Gimple Generation

2012-11-16 Thread Diego Novillo
On Thu, Nov 15, 2012 at 1:06 AM, Basile Starynkevitch
 wrote:
> On Wed, Nov 14, 2012 at 05:13:12PM -0800, Lawrence Crowl wrote:
>> Diego and I seek your comments on the following (loose) proposal.
>>
>>
>> Generating gimple and tree expressions requires lots of detail,
>> which is hard to remember and easy to get wrong.  There is some
>> amount of boilerplate code that can, in most cases, be reduced and
>> managed automatically.
>>
>> We will add a set of helper classes to be used as local variables
>> to manage the details of handling the existing types.  That is,
>> a layer over 'gimple_build_*'. We intend to provide helpers for
>> those facilities that are both commonly used and have room for
>> significant simplification.
>
> I do agree (in principle) on this and the previous (debugging-like) proposal, 
> but:
>
>   do you target the 4.8 release? (I believe not, since its stage 1 is ending)

No, this would be a 4.9 feature.

>   do you intend to remove the current way of doing?

No.  The simplified interface will necessarily not be able to handle
all kinds of IL creation.  It is a layer over the low-level routines
that helps with the common cases.


Diego.


Re: Simplifying Gimple Generation

2012-11-16 Thread Diego Novillo
On Thu, Nov 15, 2012 at 9:48 AM, Michael Matz  wrote:
> Hi Lawrence,
>
> On Wed, 14 Nov 2012, Lawrence Crowl wrote:
>
>> Diego and I seek your comments on the following (loose) proposal.
>
> In principle I agree with the goal, I'm not sure I like the specific way
> yet, and even if I do I have some suggestions:

Sure.  We do not have very firm notions yet.  We have only started
exploring this recently.  We wanted to discuss our ideas early on to
make sure we are going in the right direction.

>> We will add a set of helper classes to be used as local variables
>> to manage the details of handling the existing types.
>
> I think one goal should be to minimize the number of those helper classes
> if we can.  And use clear names, for the statement builder e.g.
> stmt_builder, or the like, not just ssa_seq.

Sure.

>
>> We propose a simplified form using new build helper classes ssa_seq
>> and ssa_stmt that would allow the above code to be written as
>> follows.
>>
>> ssa_seq q;
>> ssa_stmt t = q.stmt (NE_EXPR, shadow, 0);
>> ssa_stmt a = q.stmt (BIT_AND_EXPR, base_addr, 7);
>> ssa_stmt b = q.stmt (shadow_type, a);
>
> I think consistency should trump brevity here, so also add a tree code for
> the converter, i.e.
>   ssa_stmt b = q.stmt (NOP_EXPR, shadow_type, a);

Ah, yes.  This one was amusing.  When we were drafting the proposal,
Lawrence kept wondering what this NOP_EXPR thing is.  I've been
suffering this name for so long, that it no longer irritates me.  Had
it been named CAST_EXPR, or even NOP_CAST_EXPR, he would have probably
kept it in the example code :)

> The method name should imply the action, e.g. 'add_stmt' or append_stmt
> or the like.  I'm not sure if we need the ssa_stmt class.  We could use
> overloading to accept 'gimple' as operands, with the understanding that
> those will be implicitly converted to 'tree' by accessing the LHS:

Hm, maybe you are right.  The main goal was to reduce all the ssa name
and temporary creation needed to glue the statements together.

> gimple append_stmt (gimple g, tree_code code, gimple op1, tree op2)
> {
>   return append_stmt (g, code, gimple_lhs (op1), op2);
> }
>
> (where gimple_lhs would ICE when the stmt in question doesn't have
> one).  As gimple statements are their own iterators meanwhile I'm not even
> sure if you need the ssa_seq (or ssa_stmt_builder) class.  The above
> append_stmt would simply add the newly created statement to after 'g'.

Yes, I think that could work.  More details will surface as we start
the implementation, of course.
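
A minimal sketch of the overloading idea, using toy types rather than GCC's `gimple` and `tree`: a statement passed as an operand stands for its left-hand side, so no wrapper class is needed to glue statements together. The names `toy_stmt`, `toy_seq`, and `build_example`, the `_N` temporary naming, and the use of a `std::vector` as the sequence are all assumptions made for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins; these are not GCC's 'tree' or 'gimple' types.
struct toy_stmt {
    std::string lhs;        // name of the temporary the statement defines
    std::string code;       // operation, e.g. "BIT_AND_EXPR"
    std::string op1, op2;   // operand names
};

typedef std::vector<toy_stmt> toy_seq;

// Base form: both operands are plain names.  A fresh temporary is
// created for the result and the statement is appended to the sequence.
toy_stmt append_stmt(toy_seq &seq, const std::string &code,
                     const std::string &op1, const std::string &op2)
{
    toy_stmt s;
    s.lhs = "_" + std::to_string(seq.size() + 1);
    s.code = code;
    s.op1 = op1;
    s.op2 = op2;
    seq.push_back(s);
    return s;
}

// Overload: a statement used as an operand is replaced by its LHS.
// The assert plays the role of the ICE mentioned above for statements
// that have no LHS.
toy_stmt append_stmt(toy_seq &seq, const std::string &code,
                     const toy_stmt &op1, const std::string &op2)
{
    assert(!op1.lhs.empty());
    return append_stmt(seq, code, op1.lhs, op2);
}

// Usage: the result of the first statement feeds the second one
// without the caller ever naming a temporary.
toy_seq build_example()
{
    toy_seq q;
    toy_stmt a = append_stmt(q, "BIT_AND_EXPR", "base_addr", "7");
    append_stmt(q, "PLUS_EXPR", a, "offset");
    return q;
}
```

The sketch keeps an explicit sequence argument for simplicity; the variant discussed above would instead insert the new statement after an existing one.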

> All in all I think we can severely improve on building gimple statements
> without introduction of any helper class.  Basically, whenever a helper
> class merely contains one member of a different type, then I think that
> other type should be improved so that the wrapper class isn't necessary
> anymore.  Fewer types -> good :)

Sure.  If the helper classes do not contain any significant state that
cannot be gleaned from the operands to the builder, then we won't
introduce them.

>
>> Generating a Type
>
> Apart from naming of some methods to imply the action done, I don't have
> any opinion about this at this point.  Though I agree that again the
> boilerplate sequencing (_CONTEXT and _CHAIN) should be improved.

Right.  I think it's mostly record types that will benefit from this
type of shortening.  Building other types is usually more
straightforward.


Diego.


Re: RFC - Alternatives to gengtype

2012-11-16 Thread Diego Novillo
On Fri, Nov 16, 2012 at 5:06 AM, Basile Starynkevitch
 wrote:
> On Thu, Nov 15, 2012 at 07:59:36PM -0500, Diego Novillo wrote:
>> As we continue adding new C++ features in the compiler, gengtype
>> is becoming an increasing source of pain.  In this proposal, we
>> want to explore different approaches to GC that we could
>> implement.
>
> Just a minor remark: we don't only speak of Gengtype, but also of Ggc and of 
> PCH.
> I agree that they all closely related (and perhaps even LTO serialization 
> might
> be affected). And I am biased, because GCC MELT is built about the GCC 
> garbage collector
> (if you don't know about MELT see http://gcc-melt.org/ ; MELT is a domain 
> specific
> language to extend GCC). However, I probably will be able to adapt MELT to new
> conventions.
>
>>
>> At this point, we are trying to reach consensus on the general
>> direction that we should take.  Given how intertwined GC and PCH
>> are, the choices we make for one affect the other.
>
> My belief was that PCH (pre-compiled header) is deprecated with PPH
> (preprocessed headers). Will PCH continue to exist once the PPH effort goes 
> mainline.
> Or is PPH abandoned??? How is the idea of "getting rid of GC" related to PPH?

PPH in its current implementation is on hold.  It will come back in a
different shape if/when C++ adds modules to the language.

> I actually disagree with the "Get rid of GC" idea, but I am not sure
> that we all understand the same thing about it (and I have the feeling
> of the opposite). I would probably agree with "Get rid of Gengtype+Ggc+PCH
> and replace it with something better" which might be what "Get rid of GC" 
> means.

We mean get rid of it.  No garbage collection, whatsoever.  We both
think that it is better to structure the compiler around memory pools.
However, we also concede that we are probably in the minority and we
will need to keep GC around.

> My strong belief is that a compiler project as gigantic as GCC needs some kind
> of garbage collection. I also believe that the current (4.7) garbage
> collection *implementation* (which is probably what both Diego and
> Lawrence call the "GC" to get rid of) is grossly unsatisfactory
> (so I parse "Get rid of GC" with a big lot of ambiguity).

No.  We mean no garbage collection.  Period.

> To be more specific, I call garbage collection a scheme where (small newbie) 
> GCC contributors
> can contribute easily some code to GCC without having to understand when, and 
> how precisely,
> some data will be freed. If a user adds a pass (or a builtin) which adds e.g. 
> some Gimple,
> he does not immediately know when should his Gimple be freed (and it 
> certainly should be
> freed outside of his pass).

Memory pools.  At the end of your pass, you simply discard the pool
you were using.  Additionally, by using C++ one can use other
type-based memory strategies like smart pointers.
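
A minimal sketch of the pool idea, assuming a toy bump allocator that never runs destructors and releases everything when the pool object goes out of scope. The names `toy_pool`, `toy_node`, and `run_toy_pass` are hypothetical; this is not GCC's allocator, only an illustration of why circular data is not a problem when the whole pool is discarded at once.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

// Objects are carved out of large blocks and are all released together
// when the pool is destroyed.  Destructors are never run, so the sketch
// only accepts trivially destructible types.
class toy_pool {
public:
    explicit toy_pool(std::size_t block_size = 4096)
        : block_size_(block_size), capacity_(0), used_(0) {}

    template <typename T>
    T *create()
    {
        static_assert(std::is_trivially_destructible<T>::value,
                      "toy_pool never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "over-aligned types are not supported");
        return new (allocate(sizeof(T), alignof(T))) T();
    }

private:
    void *allocate(std::size_t size, std::size_t align)
    {
        // Round the current offset up to the requested alignment.
        std::size_t offset = (used_ + align - 1) / align * align;
        if (blocks_.empty() || offset + size > capacity_) {
            // Start a new block, large enough for this request.
            capacity_ = size > block_size_ ? size : block_size_;
            blocks_.push_back(std::unique_ptr<char[]>(new char[capacity_]));
            offset = 0;
        }
        used_ = offset + size;
        return blocks_.back().get() + offset;
    }

    std::size_t block_size_, capacity_, used_;
    std::vector<std::unique_ptr<char[]> > blocks_;
};

struct toy_node {
    toy_node *next;
    int value;
};

// Models a pass that builds a circular structure in its own pool and
// returns only a plain value; the pool is discarded on return.
int run_toy_pass()
{
    toy_pool pool;
    toy_node *a = pool.create<toy_node>();
    toy_node *b = pool.create<toy_node>();
    a->value = 1;
    b->value = 2;
    a->next = b;
    b->next = a;                    // cycle: harmless, the pool owns both
    assert(a->next->next == a);
    return a->value + a->next->value;
}
```

The cost of this scheme is the one raised in the thread: nothing allocated in a pass's pool may be referenced after the pass returns, so data that must outlive the pass has to be placed in a longer-lived pool.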

> The fact that a compiler deals with a big lot of circular data makes me think
> that naive reference-counting approaches (and in my opinion, reference
> counting is just a very poor method of doing garbage collection, which does
> not work well with circular references) cannot work inside a compiler, while
> they do work inside graphical widget libraries à la Qt or GTK.

> Hence, I don't understand well how a pool-allocator would work in GCC
> without touching to a huge amount of code.

Right.  It would be a large effort to sort out.  Particularly, since
GC has been around for a while and we've gotten lazy and are probably
relying on it quite a bit.  For these reasons, even if we convinced
the community to go in this direction, it would take a while to get
there.

> I would also remark that GC might be related to LTO, even if currently it is 
> not. LTO is
> the serialization of GCC internal representations to disk, and that problem 
> is very
> related to memory management (in both cases, we are dealing with some 
> transitive closure
> of pointer references, so copying GC-s use exactly the same algorithms as 
> serialization).
> I actually don't understand why PCH uses gengtype but not LTO

Because LTO relies on proper bytecode streaming.  PCH simply writes
memory pages out.
PCH uses the wrong approach to streaming (though we understand why it
was implemented this way).

>> === Approach: Limit the Language Used
>>
>> We could avoid the problem by limiting the language we use to
>> near that which gengtype currently understands.  This approach
>> has significant consequences. It will make the standard library
>> incompatible with GTY.
>
> Which standard library are we talking about? I guess it is libstdc++
> and its standard containers like std::map and std::vector, but
> I am not sure. (Maybe is it just libiberty?)

Yes, libstdc++.

>> Full C++ support would essentially require building a new C++
>> parser.
>
> I tend to disagree with that conclusion. I believe we should strongly
> separate the header parts with the code parts. It seems to me that we
> could pe

Re: Simplifying Gimple Generation

2012-11-16 Thread Michael Matz
Hi,

On Fri, 16 Nov 2012, Diego Novillo wrote:

> > I think consistency should trump brevity here, so also add a tree code for
> > the converter, i.e.
> >   ssa_stmt b = q.stmt (NOP_EXPR, shadow_type, a);
> 
> Ah, yes.  This one was amusing.  When we were drafting the proposal,
> Lawrence kept wondering what this NOP_EXPR thing is.  I've been
> suffering this name for so long, that it no longer irritates me.  Had
> it been named CAST_EXPR, or even NOP_CAST_EXPR, he would have probably
> kept it in the example code :)

We have CONVERT_EXPR, but it currently doesn't do _quite_ the same as 
NOP_EXPR.  I once wanted to merge them (with CONVERT_EXPR surviving), but 
it stalled somewhere, a couple of years ago.


Ciao,
Michael.


Re: RFC - Alternatives to gengtype

2012-11-16 Thread Ian Lance Taylor
On Fri, Nov 16, 2012 at 2:06 AM, Basile Starynkevitch
 wrote:
>
> My strong belief is that a compiler project as gigantic as GCC needs some kind
> of garbage collection.

I suspect that is correct, especially given the way the compiler is
currently implemented.  But I also suspect that we could use C++
reference-counted pointers rather than a mark-and-sweep garbage
collector.
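
One caveat with reference-counted pointers, raised elsewhere in this thread, is circular data. A minimal sketch with `std::shared_ptr` and `std::weak_ptr`, using toy node types (`strong_node`, `weak_node`, and the `live_nodes` counter are hypothetical names, not GCC data structures): two nodes that own each other stay alive after the last outside handle is gone, while a non-owning back edge avoids that.

```cpp
#include <cassert>
#include <memory>
#include <utility>

static int live_nodes = 0;  // nodes constructed and not yet destroyed

struct strong_node {
    std::shared_ptr<strong_node> next;  // owning edge
    strong_node() { ++live_nodes; }
    ~strong_node() { --live_nodes; }
};

struct weak_node {
    std::shared_ptr<weak_node> next;    // owning edge
    std::weak_ptr<weak_node> back;      // non-owning back edge
    weak_node() { ++live_nodes; }
    ~weak_node() { --live_nodes; }
};

// Returns how many nodes are still alive once the local handles to a
// two-node owning cycle have gone out of scope.
int leaked_after_strong_cycle()
{
    int before = live_nodes;
    strong_node *raw = nullptr;
    {
        auto a = std::make_shared<strong_node>();
        auto b = std::make_shared<strong_node>();
        a->next = b;
        b->next = a;            // a and b now keep each other alive
        raw = a.get();
    }
    assert(raw != nullptr);
    int leaked = live_nodes - before;
    // Break the cycle by hand so that the demonstration itself does not
    // leak; 'rest' releases both nodes when it goes out of scope.
    std::shared_ptr<strong_node> rest = std::move(raw->next);
    return leaked;
}

// Same shape, but the back edge does not own its target.
int leaked_after_weak_back_edge()
{
    int before = live_nodes;
    {
        auto a = std::make_shared<weak_node>();
        auto b = std::make_shared<weak_node>();
        a->next = b;
        b->back = a;            // does not contribute to a's count
    }
    return live_nodes - before;
}
```

The weak back edge only helps when the graph has an obvious ownership direction, which is the property the thread argues compiler data often lacks.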

Ian


Re: [C++] Possible GCC bug

2012-11-16 Thread Piotr Wyderski
Dodji Seketeli wrote:

> That, and [dcl.typedef]/6 says:
>
> In a given scope, a typedef specifier shall not be used to redefine
> the name of any type declared in that scope to refer to a different
> type.
>
> So, I tend to think that GCC is right here.

Right *where*? In case of the snippet provided by Ulf -- yes, obviously.
In my case there is no "that scope", i.e. I redefine the type "super" defined
in the *surrounding* scope, not in the very same, as Ulf did. It is exactly
the same situation as:

{ int i;
  float i;
}

and:

{int i;
{ float i;}
}

Best regards, Piotr


Re: RFC - Alternatives to gengtype

2012-11-16 Thread Andrew MacLeod

On 11/15/2012 07:59 PM, Diego Novillo wrote:


> At this point, we are trying to reach consensus on the general
> direction that we should take.  Given how intertwined GC and PCH
> are, the choices we make for one affect the other.
>
> We don't have a strong preference at the moment, but we are
> leaning in these directions:


Highlander:
   Long term, I don't see much need for GC in the compiler. As we move 
to a better structured source base, we ought to be able to either 
determine the proper lifetime, or have the object/component/whatever 
figure it out if it's longer living.   This is particularly true if we 
can get to the point where we don't have objects dangling from one 
component into another, ie with a complete FE/ME/BE separation(or 
whatever we eventually end up at), plus good pass to pass separation.  I 
also think we can have better facilities to detect out-of-lifetime usage 
which was the real problem with obstacks back in the day.


   So whatever direction one heads, I'd keep eventual elimination in
mind as the long term goal.  I *do* know I hate the current GC setup...
It often encourages us to be lazy, which isn't usually a good thing.
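
As a rough illustration of the "memory pool" idea (a sketch only, not
GCC code): a pass-local arena ties the lifetime of everything allocated
from it to a single owner, so nothing needs to be marked or swept.  The
names Arena, create, block_count and TempNode below are hypothetical and
exist only for this example.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical pass-local memory pool; not an existing GCC facility.
class Arena {
public:
    explicit Arena(std::size_t block_size = 4096) : block_size_(block_size) {}
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Construct a T inside the pool.  Destructors are never run, so only
    // trivially destructible types are accepted.
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "Arena never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "over-aligned types are not supported");
        void* p = allocate(sizeof(T), alignof(T));
        return new (p) T{std::forward<Args>(args)...};
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    // Bump-pointer allocation; starts a new block when the current one is full.
    void* allocate(std::size_t size, std::size_t align) {
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (blocks_.empty() || start + size > capacity_) {
            capacity_ = size > block_size_ ? size : block_size_;
            blocks_.push_back(
                std::unique_ptr<unsigned char[]>(new unsigned char[capacity_]));
            start = 0;
        }
        used_ = start + size;
        return blocks_.back().get() + start;
    }

    std::size_t block_size_;
    std::size_t capacity_ = 0;  // size of the current block
    std::size_t used_ = 0;      // bytes used in the current block
    std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};

// Hypothetical node type standing in for a pass-local temporary.
struct TempNode {
    int id;
    TempNode* operand;
};
```

All memory is released at once when the Arena goes out of scope, which
is the "component figures out the lifetime" model; the price is that a
pointer into the arena must not outlive it.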


Andrew

PS I'd also prefer the term 'memory pool' or something... the term 
'obstack' still makes my skin crawl :-)






Re: RFC - Alternatives to gengtype

2012-11-16 Thread Gabriel Dos Reis
On Fri, Nov 16, 2012 at 9:42 AM, Andrew MacLeod  wrote:

> PS I'd also prefer the term 'memory pool' or something... the term 'obstack'
> still makes my skin crawl :-)

Amen.

-- Gaby, old enough to remember the obstack days


Re: [C++] Possible GCC bug

2012-11-16 Thread Andrew Pinski
On Fri, Nov 16, 2012 at 7:15 AM, Piotr Wyderski
 wrote:
> Dodji Seketeli wrote:
>
>> That, and [dcl.typedef]/6 says:
>>
>> In a given scope, a typedef specifier shall not be used to redefine
>> the name of any type declared in that scope to refer to a different
>> type.
>>
>> So, I tend to think that GCC is right here.
>
> Right *where*? In case of the snippet provided by Ulf -- yes, obviously.
> In my case there is no "that scope", i.e. I redefine the type "super" defined
> in the *surrounding* scope, not in the very same, as Ulf did. It is exactly
> the same situation as:
>
> { int i;
>   float i;
> }
>
> and:
>
> {int i;
> { float i;}
> }
>

The scope is obvious:

 class X : public super::X {
   typedef super::X super;
 };

> 3.3.6/1: ... A name N used in a class S shall refer to the same declaration 
> in its context and when re-evaluated in the
> completed scope of S. No diagnostic is required for a violation of
> this rule. ...

The N here is super, the class S here is X.  At the point where super
is used in the declaration, it means the outer-scope super, and its
meaning has then changed by the end of the completed scope of X.  So GCC
is correct in erroring if you go by the literal wording of 3.3.6/1.  The
other compilers are correct only because no diagnostic is required, but
this is still invalid code.

Thanks,
Andrew Pinski


Re: [C++] Possible GCC bug

2012-11-16 Thread Jiri Palecek

Piotr Wyderski wrote:
> Dodji Seketeli wrote:
>
>> That, and [dcl.typedef]/6 says:
>>
>>  In a given scope, a typedef specifier shall not be used to redefine
>>  the name of any type declared in that scope to refer to a different
>>  type.
>>
>> So, I tend to think that GCC is right here.
>
> Right *where*? In case of the snippet provided by Ulf -- yes, obviously.
> In my case there is no "that scope", i.e. I redefine the type "super" defined
> in the *surrounding* scope, not in the very same, as Ulf did. It is exactly
> the same situation as:
>
>  { int i;
>    float i;
>  }
>
> and:
>
>  {int i;
>  { float i;}
>  }


GCC is right in both cases and, similarly, [dcl.typedef]/6 doesn't affect
either of them, because, as you noted, there aren't any redeclarations in
the same scope in either of the two snippets.

However, you should not forget about [basic.scope.class]/1. I will not post the 
wording again, but it means a name used in a class must mean the same thing in 
all uses and at the end of the class. [basic.lookup.unqual]/7 says

A name used in the definition of a class X outside of a member function 
body or nested class definition [This refers to unqualified names following the 
class name; such a name may be used in the base-clause or may be used in the 
class definition.] ...

So, the name of the base class *is* used in the derived class and as such, must not be 
redefined throughout the whole class; in your example, "super" is violating 
that rule.

Note that no diagnostic is required for violation of this rule, so the compiler 
might just accept it. However, it is still invalid code and gcc is right to 
reject it.
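
One possible way to keep the "super" idiom while staying within the rule
is sketched below.  It is only an illustration under the reading of
[basic.scope.class]/1 given above: the nested class names its base
explicitly as B::X, so the name "super" is never used inside C::X before
C::X declares its own.  Structs are used instead of classes purely so
the typedefs are public and can be checked.

```cpp
#include <type_traits>

struct A {};

struct B : A {
    typedef A super;
    struct X {};
};

struct C : B {
    typedef B super;

    // The base-clause says B::X rather than super::X, so the enclosing
    // C::super is not looked up anywhere inside C::X.
    struct X : B::X {
        typedef B::X super;  // first and only meaning of "super" in C::X
    };
};

static_assert(std::is_same<C::super, B>::value,
              "C::super names the direct base of C");
static_assert(std::is_same<C::X::super, B::X>::value,
              "C::X::super names the direct base of C::X");
```

Choosing a different typedef name in the nested class (for example
base_type, a hypothetical name) would avoid the clash in the same way.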

Regards
Jiří Paleček





Re: Simplifying Gimple Generation

2012-11-16 Thread Andrew Pinski
On Fri, Nov 16, 2012 at 6:30 AM, Michael Matz  wrote:
> Hi,
>
> On Fri, 16 Nov 2012, Diego Novillo wrote:
>
>> > I think consistency should trump brevity here, so also add a tree code for
>> > the converter, i.e.
>> >   ssa_stmt b = q.stmt (NOP_EXPR, shadow_type, a);
>>
>> Ah, yes.  This one was amusing.  When we were drafting the proposal,
>> Lawrence kept wondering what this NOP_EXPR thing is.  I've been
>> suffering this name for so long, that it no longer irritates me.  Had
>> it been named CAST_EXPR, or even NOP_CAST_EXPR, he would have probably
>> kept it in the example code :)
>
> We have CONVERT_EXPR, but it currently doesn't do _quite_ the same as
> NOP_EXPR.  I once wanted to merge them (with CONVERT_EXPR surviving), but
> it stalled somewhere, couple years ago.

I think the only difference now is in the front-ends IIRC.  Everything
else has been merged with respect to CONVERT_EXPR and NOP_EXPR.  So we
should recommend using CONVERT_EXPR in new code.

Thanks,
Andrew Pinski


gcc-4.6-20121116 is now available

2012-11-16 Thread gccadmin
Snapshot gcc-4.6-20121116 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.6-20121116/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch 
revision 193577

You'll find:

 gcc-4.6-20121116.tar.bz2 Complete GCC

  MD5=f45f723057e5de739847a3b22589f4a2
  SHA1=eb81caef649f7f1c164e3548a714ccc122e2a5a8

Diffs from 4.6-20121109 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Unused Field in graphite-poly.h?

2012-11-16 Thread Lawrence Crowl
I think the field "htab_t original_pddrs" in struct scop in
graphite-poly.h is unused.  A macro to access that field is also
unused.

/* A SCOP is a Static Control Part of the program, simple enough
   to be represented in polyhedral form.  */
struct scop
{
  ...
  /* A hashtable of the data dependence relations for the original
 scattering.  */
  htab_t original_pddrs;
  ...
}

$ grep 'original_pddrs' {,*/,*/*/}*.[ch]
graphite-poly.h:  htab_t original_pddrs;
graphite-poly.h:#define SCOP_ORIGINAL_PDDRS(S) (S->original_pddrs)

I have successfully built with them #ifdef'd out.  Should I remove
them?

-- 
Lawrence Crowl


Re: Simplifying Gimple Generation

2012-11-16 Thread Xinliang David Li
On Fri, Nov 16, 2012 at 5:13 AM, Diego Novillo  wrote:
> On Thu, Nov 15, 2012 at 2:31 AM, Xinliang David Li  wrote:
>
>>> ssa_stmt t = q.stmt (NE_EXPR, shadow, 0);
>>> ssa_stmt a = q.stmt (BIT_AND_EXPR, base_addr, 7);
>>> ssa_stmt b = q.stmt (shadow_type, a);
>>> ssa_stmt c = q.stmt (PLUS_EXPR, b, offset);
>>> ssa_stmt d = q.stmt (GE_EXPR, c, shadow);
>>> ssa_stmt e = q.stmt (BIT_AND_EXPR, t, d);
>>
>>
>> seq_seq::stmt(...) sounds like a getter interface, not a creator.
>
> Sure. They could be named new_stmt() or build_stmt() or something similar.
>
>> x = q.new_assignment (...);
>> x = q.new_call (..);
>> x.add_arg(..);
>> x = q.new_icall (..);
>>
>> l1 = q.new_label ("xx");
>> l2 = q.new_label ("xxx");
>> join_l = q.new_label ("...");
>>
>> x = new_if_then_else (cond, l1, l2, join_l);
>> q.insert_label (l1);
>> q.new_assignment (...);
>> q.insert_label(l2);
>> ...
>> q.insert_label(join_l);
>> q.close_if_then_else(x);
>
> What I was thinking for if_then_else constructs was something along
> the lines of:
>
> stmt_seq then_body(s1);
> then_body.add_stmt(s2);
>
> stmt_seq else_body(r1);
> else_body.add_stmt(r2);
> stmt if_then_else(cond, then_body, else_body);


That looks good. The interface should also allow the user to specify the
branch probability. It is probably also useful to consider supporting the
creation of if-then-else with multiple conditions and short-circuit
semantics. The interface should look very similar.
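
To make the suggestion concrete, here is a toy sketch of what such an
interface might look like.  Nothing in it is GCC code or part of the
actual proposal: toy_stmt, toy_stmt_seq, toy_if_then_else,
new_if_then_else and the then_probability parameter are all hypothetical
stand-ins, and statements are plain strings.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; none of these types exist in GCC.
struct toy_stmt {
    std::string text;
};

class toy_stmt_seq {
public:
    // Creator-style name: builds a statement, links it into the sequence,
    // and returns the index of the new statement.
    std::size_t new_stmt(const std::string& text) {
        stmts_.push_back(toy_stmt{text});
        return stmts_.size() - 1;
    }

    std::size_t size() const { return stmts_.size(); }
    const toy_stmt& at(std::size_t i) const { return stmts_.at(i); }

private:
    std::vector<toy_stmt> stmts_;
};

struct toy_if_then_else {
    // Conditions that must all hold; a real builder would evaluate them
    // left to right with short-circuit semantics.
    std::vector<std::string> conds;
    toy_stmt_seq then_body;
    toy_stmt_seq else_body;
    double then_probability;  // hypothetical branch-probability hint
};

toy_if_then_else new_if_then_else(std::vector<std::string> conds,
                                  toy_stmt_seq then_body,
                                  toy_stmt_seq else_body,
                                  double then_probability = 0.5) {
    return toy_if_then_else{std::move(conds), std::move(then_body),
                            std::move(else_body), then_probability};
}
```

The single-condition case is then just a one-element list, which is one
way the two interfaces could "look very similar".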

thanks,

David


>
> You can then take 'if_then_else' and insert it inside a basic block or
> an edge.  When that happens, the builder takes care of the block/edge
> splitting for you.
>
>>> .. The statement result type is that of the arguments.
>>>
>>> .. The type of integral constant arguments is that of the other
>>> argument.  (Which implies that only one argument can be constant.)
>>>
>>> .. The 'stmt' method handles linking the statement into the sequence.
>>>
>>> .. The 'set_location' method iterates over all statements.
>>>
>>> There will be another class of builders for generating GIMPLE
>>> in normal form (gimple_stmt).  We expect that this will mostly
>>> affect all transformations that need to generate new expressions
>>> and statements, like instrumentation passes.
>>
>> What are the uses of the raw forms?
>
> Sorry, what are these "raw forms" that you refer to?
>
>>> tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
>>> record_builder rec ("__asan_global");
>>> rec.field ("__beg", const_ptr_type_node);
>>> rec.field ("__size", inttype);
>>> rec.field ("__size_with_redzone", inttype);
>>> rec.field ("__name", const_ptr_type_node);
>>> rec.field ("__has_dynamic_init", inttype);
>>> rec.finish ();
>>> tree ret = rec.as_tree ();
>>
>> Again, something like new_field or add_field is more intuitive.
>
> Sure.
>
>
> Diego.


Re: RFC - Alternatives to gengtype

2012-11-16 Thread Xinliang David Li
>
>
> My strong belief is that a compiler project as gigantic as GCC needs some kind
> of garbage collection

Can you name another compiler written in C/C++ using GC ? :)

>. I also believe that the current (4.7) garbage
> collection *implementation* (which is probably what both Diego and
> Lawrence call the "GC" to get rid of) is grossly unsatisfactory
> (so I parse "Get rid of GC" with a big lot of ambiguity).
>
> To be more specific, I call garbage collection a scheme where (small newbie)
> GCC contributors can contribute easily some code to GCC without having to
> understand when, and how precisely, some data will be freed. If a user adds
> a pass (or a builtin) which adds e.g. some Gimple, he does not immediately
> know when should his Gimple be freed (and it certainly should be freed
> outside of his pass).

What is missing is a more modern memory system implementing
various allocation policies.  Relying on GC eventually leads to
sloppiness. Compared with its competitors, GCC's memory usage is high and
its compile time is long. One of the big contributors to the compile time
is garbage collection. The following profile is from compiling a large
C++ source at -O0 using an older compiler:

142516  6.30%  check_qualified_type
 79699  3.52%  gt_ggc_mx_lang_tree_node
 74232  3.28%  ggc_set_mark
 57612  2.55%  ggc_internal_alloc_stat
 55323  2.45%  lookup_field_1


>
> Memory liveness is a global non-modular property of data.
> Whatever is done in GCC should be helpful to newbie GCC contributors.
> The current situation is quite bad : only a few (perhaps a dozen or two)
> GCC gurus have in their brain a clear enough picture of GCC memory to be
> able to assess it reliably. And in the current scheme, a pass (e.g. added by
> a plugin) may have to manage its memory manually.

But relying on GC is to hide the problem -- not to solve it.

>
> I believe that we should state what are the objectives with the memory
> allocation part. My opinion is one of the strongest objective should be to
> ease the contribution of GCC newbies; they should not need to read a lot of
> code or documentation related to memory management before understanding how
> to add their own work, in a reliable way, to GCC. I even have the opinion
> that adding features inside GCC which increases slightly the compilation
> time (while maintaining the quality of the output object code) can be
> justified by the ease of external contributions. In particular, any rework
> of the memory management should not make the coding of plugins harder but
> simpler.
>
> A compiler as strong as GCC has a lot of data types (about 2000).
> A compiler is working on complex internal representations which are
> inherently very circular: GCC is dealing with a lot of intermixed circular
> directed graphs. This is in sharp contrast with e.g. huge graphical toolkits
> like Qt or Gtk, where the memory references are tree-like (an X11 window
> contains a set of sub-windows, and each X11 window belongs to exactly one
> parent; hence naturally a Qt or Gtk widget belongs to at most one parent
> widget. We don't have such nice properties inside GCC). So whatever we
> propose, we have to deal with that unpleasant fact: the pointers inside a
> cc1 (or lto1) process are very very messy and circular, and it is not easy
> to know when a pointed zone (i.e. a GCC "object") should be freed. We also
> have a difference with graphical toolkits: the major resource we have to
> manage globally is internal memory (in contrast, Qt or Gtk have to manage
> X11 windows which are external resources inside the X11 server and outside
> of the Gtk or Qt application).
>

This should not be too hard a problem to solve. Ironically, even though
GCC uses GC, it is not the easiest infrastructure for newbies.


> The fact that a compiler deals with a big lot of circular data makes me think
> that naive reference-counting approaches (and in my opinion, reference
> counting is just a very poor method of doing garbage collection, which does
> not work well with circular references) cannot work inside a compiler, while
> they do work inside graphical widget libraries à la Qt or GTK.
> Hence, I don't understand well how a pool-allocator would work in GCC
> without touching to a huge amount of code.

A lot of changes are certainly expected.
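
For readers unfamiliar with the circular-reference concern quoted above,
here is a small self-contained sketch using the standard library's
reference-counted pointers.  It is not GCC code; Node, leak_with_cycle
and no_leak_with_weak are hypothetical names.  Two nodes that own each
other through std::shared_ptr are never destroyed, whereas making one
direction a non-owning std::weak_ptr lets both be reclaimed.

```cpp
#include <memory>

// Hypothetical graph node that counts how many instances are alive.
struct Node {
    std::shared_ptr<Node> next;  // owning edge
    std::weak_ptr<Node> back;    // non-owning edge
    static int live;
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Returns how many nodes are still alive after the handles go out of scope.
int leak_with_cycle() {
    int before = Node::live;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;  // owning cycle: neither count can reach zero
    }
    return Node::live - before;
}

int no_leak_with_weak() {
    int before = Node::live;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->back = a;  // back edge does not own, so the cycle is broken
    }
    return Node::live - before;
}
```

The catch for a compiler is that someone has to decide which edges of
an IR graph are the non-owning ones, which is the kind of global
knowledge the quoted paragraph worries about.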

>
>
> I would also remark that GC might be related to LTO, even if currently it is
> not. LTO is the serialization of GCC internal representations to disk, and
> that problem is very related to memory management

It does not have to be that way if GCC has a well specified persistent
IR (See WHIRL, LLVM bitcode).


David


> (in both cases, we are dealing with some transitive closure
> of pointer references, so copying GC-s use exactly the same algorithms as 
> serialization).
> I actually don't understand why PCH uses gengtype but not LTO
> (the good reason is that gengtype is so messy that nobody likes hacking 
>