On Thu, Nov 15, 2012 at 07:59:36PM -0500, Diego Novillo wrote:
> As we continue adding new C++ features in the compiler, gengtype
> is becoming an increasing source of pain.  In this proposal, we
> want to explore different approaches to GC that we could
> implement.

Just a minor remark: we are not only speaking of Gengtype here, but also of Ggc
and of PCH.
I agree that they are all closely related (and perhaps even LTO serialization
might be affected). And I am biased, because GCC MELT is built on top of the GCC
garbage collector
(if you don't know about MELT, see http://gcc-melt.org/ ; MELT is a domain
specific
language to extend GCC). However, I will probably be able to adapt MELT to new
conventions.

> 
> At this point, we are trying to reach consensus on the general
> direction that we should take.  Given how intertwined GC and PCH
> are, the choices we make for one affect the other.

My belief was that PCH (pre-compiled headers) is deprecated in favor of PPH
(preprocessed headers). Will PCH continue to exist once the PPH effort goes
mainline?
Or is PPH abandoned? How is the idea of "getting rid of GC" related to PPH?

> We don't have a strong preference at the moment, but we are
> leaning in these directions:
> 
> Diego:
>       Get rid of GC completely.  Move into pool allocation.
>       If we continue to use GC, then either use boehm-gc or
>       continue use the precise allocator, but with user
>       generated  marking.
> 
> Lawrence:
>       Get rid of GC completely.  Move into pool allocation.
>       If we continue to use GC, either move to user-generated
>       marking or implement the GTY attributes in cc1plus (read
> below).  For PCH, use a fixed address where all the
>       PCH-able data would go, as this would represent less work
>       than implementing streamable types.

I actually disagree with the "Get rid of GC" idea, but I am not sure
that we all understand the same thing by it (and I have the feeling
that we do not). I would probably agree with "Get rid of Gengtype+Ggc+PCH
and replace them with something better", which might be what "Get rid of GC" means.


My strong belief is that a compiler project as gigantic as GCC needs some kind
of garbage collection. I also believe that the current (4.7) garbage
collection *implementation* (which is probably what both Diego and
Lawrence call the "GC" to get rid of) is grossly unsatisfactory
(so I parse "Get rid of GC" with a great deal of ambiguity).

To be more specific, I call garbage collection a scheme under which (small,
newbie) GCC contributors
can easily contribute code to GCC without having to understand when, and
how precisely,
some data will be freed. If a user adds a pass (or a builtin) which creates e.g.
some Gimple,
he does not immediately know when his Gimple should be freed (and it certainly
should be
freed outside of his pass).

Memory liveness is a global, non-modular property of data.
Whatever is done in GCC should be helpful to newbie GCC contributors.
The current situation is quite bad: only a few (perhaps a dozen or two)
GCC gurus have in their brain a clear enough picture of GCC memory to be
able to assess it reliably. And in the current scheme, a pass (e.g. one added
by a plugin)
may have to manage its memory manually.

I believe that we should state what the objectives of the memory allocation
work are.
My opinion is that one of the strongest objectives should be to ease the
contributions of GCC newbies;
they should not need to read a lot of code or documentation related to memory
management
before understanding how to add their own work, in a reliable way, to GCC.
I am even of the opinion that adding features inside GCC which slightly
increase
compilation time (while maintaining the quality of the output object code)
can be justified by the ease of external contributions.
In particular, any rework of the memory management should make
the coding of plugins simpler, not harder.
 
A compiler as strong as GCC has a lot of data types (about 2000).
A compiler works on complex internal representations which are inherently
very circular:
GCC deals with a lot of intermixed circular directed graphs.
This is in sharp contrast with e.g. huge graphical toolkits like Qt or Gtk,
where the memory references are tree-like (an X11 window contains a set of
sub-windows, and each X11 window belongs to exactly one parent; hence a Qt or
Gtk widget naturally belongs to at most one parent widget; we don't have such
nice properties
inside GCC). So whatever we propose, we have to deal with that unpleasant fact:
the pointers
inside a cc1 (or lto1) process are very messy and circular, and it is not
easy
to know when a pointed zone (i.e. a GCC "object") should be freed. We also
differ from graphical toolkits in another way: the major resource we have to
manage globally is
internal memory (in contrast, Qt or Gtk have to manage X11 windows, which are
external
resources living inside the X11 server, outside of the Gtk or Qt application).

The fact that a compiler deals with a great deal of circular data makes me think
that naive reference-counting approaches cannot work inside a compiler, while
they do work inside graphical widget libraries à la Qt or GTK (in my opinion,
reference counting is just a very poor way of doing garbage collection, one
which does not cope with circular references).
Hence, I don't understand how a pool allocator could work in GCC
without touching a huge amount of code.


I would also remark that GC might be related to LTO, even if currently it is
not. LTO is
the serialization of GCC internal representations to disk, and that problem is
closely
related to memory management (in both cases, we are dealing with some
transitive closure
of pointer references, so copying GCs use exactly the same algorithms as
serialization).
I actually don't understand why PCH uses gengtype but LTO does not
(the likely reason is that gengtype is so messy that nobody likes hacking inside
it).
And it seems that this makes some things incredibly hard with LTO. In my
understanding,
it is currently very difficult to write a plugin which, in some new pass in
cc1,
adds new gimples outside of any function, and then, in some later pass in lto1,
decides whether
to incorporate these outside gimples into some function. In other
words,
current LTO requires every serialized gimple to belong to a function, and that
is a pain.
 
A major issue with memory management is that it concerns every GCC contributor;
very probably, any improvement or change should be accepted by everyone
(and should avoid breaking existing GCC code when possible).
Hence, few people dare to work on that memory management problem: there is a
considerable risk that a huge amount of their work will be rejected by
others (unless they are as famous as Diego).
This is one of the reasons I (Basile) never dared to work on this (even though
I am attracted by the subject).

The current Ggc has a huge shortcoming which should be addressed:
it does not handle local pointers, so it cannot easily be called inside
a pass (only between passes). Because of that, GCC contributors are dissuaded
from
using it inside their passes, and have to code their pass-internal
data very differently from data which should survive a single GCC pass.
 
> Gengtype is a poor C parser. It is a long way from handling all
> of the C++ constructs that are used by the C++ standard library.
> We have a few potential solutions. Solution: Stick with Gengtype
> 
> 
> As gengtype currently does not handle what we need, something
> must change.
> 
> 
> === Approach: Limit the Language Used
> 
> We could avoid the problem by limiting the language we use to
> near that which gengtype currently understands.  This approach
> has significant consequences. It will make the standard library
> incompatible with GTY.

Which standard library are we talking about? I guess it is libstdc++
and its standard containers like std::map and std::vector, but
I am not sure. (Or maybe it is just libiberty?)

> It will prevent the use of common idioms,
> such as iterators, within GTY types.  These limitations are
> significant and not desirable. Approach: Upgrade Gengtype
> 
> 
> Full C++ support would essentially require building a new C++
> parser.

I tend to disagree with that conclusion. I believe we should strongly
separate the header parts from the code parts. It seems to me that we
could perhaps afford to require changes to the gcc/*.h & gcc/*.def header files,
but we could far less afford to require changes to the gcc/*.c & gcc/*.cc code
files (simply because most of GCC's source code is in gcc/*.{c,cc} files, so
having to touch all of them is unrealistic and not desirable).
Having to touch header files may be more acceptable,
provided the code files still work.

An alternative route might be to describe all the data types inside GCC
in some other files (perhaps using a GTY-friendly syntax, perhaps not),
a bit like IDLs
(http://en.wikipedia.org/wiki/Interface_description_language) do.

Notice that today's gengtype is sort-of doing that already, and that we could
translate to a new
IDL syntax with a tool which parses the current gtype.state file and
transforms it into our IDL syntax.

Parsing C++ with GTY annotations is painful, in particular because C++
header files
mix code (e.g. inline member functions) and data, so gengtype has to parse
nearly all the complexity of C++. Maybe a C++-friendly gengtype would be easier
if we had a convention of moving all small inline member functions outside
of the class {...} declarations.

We could have an IDL which cares only about data fields, single inheritance,
and member functions'
signatures (but not their code), and which would output the C++ declarations
and some major
C++ utility code (notably that related to memory, serialization, and perhaps
dump,
PCH, LTO and introspection). Both gengtype in GCC and moc in Qt sort-of
do that.

Such an approach does not limit the use of libstdc++ containers: we could have
the
IDL generate std::vector or std::map things.
And that IDL could be designed and implemented to emit C++ code which fits very
well
into the current code base.

Notice that such an IDL could generate a significant amount of code that today
has to be written by hand:
not only GC marking (or forwarding), but dump routines, LTO & PPH
serialization,
and perhaps even Python scripts for gdb debugging, etc. I do believe that
generating
some introspective data & code would be a considerable plus
(see https://live.gnome.org/GObjectIntrospection in GTK),
notably for GCC plugins (and for those wishing to extend GCC in other
languages,
be it MELT, Lua, Python or Ocaml).

> 
> === Approach: Both Limit and Upgrade
> 
> We can try upgrading gengtype to handle a more extensive subset
> of C++, without trying to handle the full language. See Thoughts
> on Gengtype and Single Inheritance. Taking this approach would
> likely mean we would be unable to use the C++ standard library
> within GCC.
> 
> 
> === Approach: Move GTY to cc1plus.
> 
> Instead of a separate weak parser, we would make cc1plus
> understand GTY attributes.  The compiler would emit IL in the
> object files instead of generating source.

I like the idea. The GTY thing could be implemented as a GCC plugin.
Actually, we could push the idea even further,
and consider GCC to be coded in its
own superset, GCC++, of (a subset of) standard C++.
[gengtype is already sort-of doing that, by the way.]
The first stage could even be implemented as a GCC++-to-C translator,
by having it emit the Gimple in C form.
(Notice that Qt does the same: its moc translates Qt's C++
extensions
into plain code, and clever preprocessing tricks are used to ignore these
extensions.)
Having GCC coded in its own superset GCC++ is a bit like my IDL idea.
And we could add some further tricks inside GCC++, e.g. a pattern-matching
statement.

> This solution would require a first boot stage that did not
> support PCH, because we cannot rely on the seed compiler
> supporting GTY.  We would probably need to use the Boehm
> collector during the first stage as well.
> 
> Because the first stage would be fundamentally different from the
> second stage, we may need to add an additional pass for correct
> object file comparisons.

Did you mean "additional stage" instead of "additional pass"? I understand it
as
an additional stage.

This first stage (which we might call stage zero) would not even need to link
in a lot of
GCC optimizations, so it could be made to compile quite quickly
(we could put a lot of optimizing passes into some library, and
not even link that library in stage zero).
 

> 
> 
> === Approach: Do GTY manually
> 
> In this approach we would be converting GTY from a declarative
> form into a procedural one using the C++ type system and
> manually-implemented structure field handlers.  At the highest
> level, this roughly means that all the structures using GTY(())
> markers will start using GTY((user)).

I am not a big fan of that idea. But I don't understand why you say that it
converts GTY
into a procedural thing.
> 
> The marking code dealing with the semantics of the marked data
> structure is co-located with the data structure.  When GC
> problems occur, this simplifies debugging.  The user is no longer
> faced with a mass of auto generated code that is unfamiliar and
> hard to trace. The code to implement is straightforward.  For
> every pointer field in the given instance pointer, a single
> "mark" function needs to be called.
> 
> Common facilities to mark arrays will be provided via a base GC
> class. No generated header files to #include at the end of the
> source file. No new dependencies to add to the Makefile. No need
> to parse C++.

I don't follow you. Why don't you like generated header files?
> 
> The problem with this approach is that the declarative approach
> puts the field additions and the GTY annotations in the same
> place, as opposed to in separate code. While it is possible to
> use standard library containers with GC pointers in them, we
> would not be able to write these containers to PCH images.

> 
> === Approach: Get rid of GTY
> 
> This approach engenders more choices.  The pre-compiled header
> implementation uses gengtype, so this approach is not viable
> unless we have an alternate implementation for PCH. We will also
> need another approach to memory management. 
> 
> 
> === Approach: Permanent Addresses for PCH


But I thought that PPH is superseding PCH, so in the long run we
don't care about PCH anymore?

> Create an alternate implementation of PCH using mmap and a region
> allocator. The essential difference is that we need to allocate
> the data structures in their final location, rather than the
> current approach of allocating someplace and then walking the
> trees to move them to the desired and fixed mmap address. Because
> this implementation would not do a final copy, it may leave
> allocation holes in the memory region.  To reduce this cost, we
> can zero all unallocated memory in the region and then compress
> it before writing the PCH file.  Compressing the file has already
> proven effective on existing PCH files, and so the compression
> work would not be a problem.
> 
> We still need to indicate which objects go into PCH memory.  For
> compiler-specific types that always go into PCH memory, we can
> create a base class that provides class-member operators new and
> delete. These would allocate from the region allocator.  For C++
> standard library structures parameterized by allocators, we could
> provide an allocator that does the same thing.  For other types,
> we would require more explicit allocation and deallocation.
> 
> 
> === Approach: Manual PCH using streamable types
> 
> Every type knows how to build a bytecode vector for its contents.
> A stream manager asks all the instances to build their bytecode
> vectors and writes them to disk. This builds on the streaming
> support implemented for LTO.
> 
> 
> === Approach: Use the Boehm collector.
> 
> The general approach is to define allocation and deallocation
> functions to the Boehm collector that handle memory within the
> PCH range.  We would also provide a class to register global
> roots at the point of declaration of those roots. We would likely
> configure the collector to collect only on command, not
> automatically.  Laurynas says previous efforts showed
> that the peak memory usage was slightly higher than with the
> existing collectors due to Boehm's conservativeness. The run time
> was comparable.

A big advantage of using the Boehm GC is that its interface is familiar to
everyone,
and that it handles local pointers very nicely, without pain.
So data inside passes could also be GC-ed.

> 
> 
> === Approach: Move RTL back to obstacks.
> 
> Laurynas started this in 2011.  I'm not sure what the status of
> this is.  Laurynas?

I don't know RTL well, but my intuition is that going back to manual
obstack management in any part of GCC would be a huge mistake, because I believe
it would make it much harder for external contributors (and plugin writers) to
understand how to manage their memory in a way compatible with GCC.
> 
> 
> === Approach: Replace GC with smart pointers.
> 
> We need to identify all pointers that result in circular
> structures and not use smart pointers for them.  We would
> probably still have some circular structures resulting from the
> compilation of mutually referencing structs.

I think that circular data is so central to optimized compilation that
this approach is not feasible, and newbie contributors
would have a hard time understanding how to handle circularity manually
and when it happens. It could happen "accidentally",
because there are so many types inside GCC that few people understand all
the relations between them and can tell when circularity is
happening. I just don't think that identifying the circular structures is
possible
(and a purely syntactic approach might not be enough; one could add some
circularity in type declarations which never actually occurs at runtime).


> 
> In this, we want to stress that we would rather shoot for
> long-term viability/maintainability, even if it means
> short/medium term pain.  We are thinking of targetting 4.9 or
> even 5.0.

I am glad of that focus on maintainability. I translate it as
"making newbie contributors to GCC, and GCC plugin developers,
more comfortable".
> 
> In the meantime, we will need to apply some more duct tape and
> string to keep gengtype puttering along.

Yes. However, we could also have something parse the gtype.state file
and output some code to help us. My feeling is that the distinction
between header files and code files could be relevant here.

A big thanks to Diego & Lawrence (and others Google people) 
for their work and discussion, and to all for having read my long reply.

Regards.

-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
