On 12-05-24 04:16 , Richard Guenther wrote:
On Wed, May 23, 2012 at 9:48 PM, Diego Novillo<dnovi...@google.com> wrote:
This series of 3 patches re-implement vec.[ch] in C++.
The main goal of this first step is to minimize changes to client
code. I tried very hard to maintain the existing API. This means
that vectors are still simple structures accessed via the existing
VEC_* functions.
The major change introduced here is the removal of all the macro-based
type creation in vec.h:
1- Given a type T and an allocation strategy A, VEC(T,A) used to
expand into the C type 'struct VEC_T_A'. The family of accessor
functions for VEC_T_A were created via macro expansion and token
pasting.
After this patch, VEC(T,A) expands to 'vec_t<T>':
template<typename T>
struct GTY(()) vec_t
{
vec_prefix prefix;
T GTY((length ("%h.prefix.num"))) vec[1];
};
The family of accessor functions for vec_t<T> are free functions
templated on<T> or<T,A> depending on whether the function needs
to perform allocation operations.
2- All the DEF_VEC_[IOP] and DEF_VEC_ALLOC_[IOP] macros expand to the
nil expansion 'struct vec_swallow_trailing_semi'. This represents
the bulk of the removal in vec.h.
3- I removed one indirection from the existing VEC structures. Before
this patch, the visible structure created by VEC_T_A had a single
element 'base', which, in turn, was the structure holding 'prefix'
and 'vec'. The new vec_t<T> has 'prefix' and 'vec' in it. This
may change in the second iteration of this patch.
I believe this was because of aliasing - you may dig in the svn history
to find out what we miscompiled when not doing this and whether it is
still necessary or not.
Some client code changes were needed:
1- The allocation names 'heap', 'stack' and 'gc' are not embedded in
the type name anymore. They are enum values used as a template
argument for functions like VEC_alloc() and VEC_free(). Since they
are now separate identifiers, some existing code that declared
variables with the names 'heap' and 'stack' had to change.
I did not change these enum values to better names because that
would have meant changing a lot more client code. I will change
this in a future patch.
Yeah. Can we have
template<...>
struct vec {
enum { gc, heap, stack };
...
}
and use vec<T, vec::stack> instead? Not sure if this or something similar
is valid C++.
Sure. Something involving a new namespace, like Gaby suggested down-thread.
2- VEC_index and VEC_last now return a reference to the element. The
existing APIs implemented two versions of these functions. One
returning the element itself (used for vec_t<T *>). Another returning
a pointer to the element (used for vec_t<T>).
This is impossible to represent in C++ directly, as functions
cannot be overloaded by their return value. Instead, I made them
return a reference to the element. This means that vectors storing
pointers (T *) do not need to change, but vectors of objects (T) do
need to change.
We have fewer vec_t<T> than vec_t<T *>, so this was the smaller change
(which still meant changing a few hundred call sites).
We still can have const overloads when taking a const vec though?
I'm going to say 'sure' because I don't really think I understand this
question :)
3- The family of VEC_*push and VEC_*replace functions have overloads
for handling 'T' and 'const T *' objects. The former is used for
vec_t<T>, the latter for vec_t<T*>. This means, however, than when
pushing the special value 0 on a vec_t<T*>, the compiler will not
know which overload we meant, as 0 matches both. In these cases
(not very many), a static cast is all we need to help it.
I'm not terribly content with this one. I'll try to have a better
approach in the second iteration (which will liberally change all
client code).
I wonder whether explicitely specifying the template type is better? Thus,
VEC_safe_push<T *, gc>(0) instead of automatically deducing the template
argument?
Not sure. I didn't like the casts, Lawrence had another suggestion
involving something like C++11's nullptr. Lawrence, what was that
nullptr thing you were telling me earlier?
4- I extended gengtype to understand templated types. These changes
are not as ugly as I thought they would be. Gengtype has some
hardwired knowledge of VEC_*, which I renamed to vec_t. I'm not
even sure why gengtype needs to care about vec_t in this way, but I
was looking for minimal changes, so I just did it.
The other change is more generic. It allows gengtype to deal with
templated types. There is a new function (filter_type_name) that
recognizes C++ special characters '<','>' and ':'. It turns them
into '_'. This way, gengtype can generate a valid identifier for
the pre-processor. It only does this in contexts where it needs a
pre-processor identifier. For everything else, it emits the type
directly. So, the functions emitted in gt-*.h files have proper
template type references.
Well, that part is of course what needs to change. I don't think we want
gengtype to work like this for templates as this does not scale.
Well, certainly, but that was outside the scope of this one patch. I'm
trying to make small incremental steps.
2- Change VEC_index into operator[].
3- Change VEC_replace(...) to assignments using operator[].
I think these should not be done for now - they merely add syntactic
sugar, no?
It makes the code more compact, readable and easier to write. I can
certainly wait until the branch has been merged to make all these
changes. It's going to affect a boatload of call sites.
Diego.