On Tue, 2013-11-05 at 14:18 -0700, Jeff Law wrote:
> On 10/31/13 10:26, David Malcolm wrote:
> > The gimple statement types are currently implemented using a hand-coded
> > C inheritance scheme, with a "union gimple_statement_d" holding the
> > various possible structs for a statement.
> >
> > The following series of patches convert it to a C++ hierarchy, using the
> > existing structs, eliminating the union. The "gimple" typedef changes
> > from being a
> > (union gimple_statement_d *)
> > to being a:
> > (struct gimple_statement_base *)
> >
> > There are no virtual functions in the new code: the sizes of the various
> > structs are unchanged.
> >
> > It makes use of "is-a.h", using the as_a <T> template function to
> > perform downcasts, which are checked (via gcc_checking_assert) in an
> > ENABLE_CHECKING build, and are simple casts in an unchecked build,
> > albeit in an inlined function rather than a macro.
> >
> > For example, one can write:
> >
> > gimple_statement_phi *phi =
> > as_a <gimple_statement_phi> (gsi_stmt (gsi));
> >
> > and then directly access the fields of the phi, as a phi. The existing
> > accessor functions in gimple.h become somewhat redundant in this
> > scheme, but are preserved.
> >
> > The earlier versions of the patches made all of the types GTY((user))
> > and provided hand-written implementations of the gc and pch marker
> > routines. In this new version we rely on the support for simple
> > inheritance that I recently added to gengtype, by adding a "desc"
> > to the GTY marking for the base class, and a "tag" to the marking
> > for all of the concrete subclasses. (I say "class", but all the types
> > remain structs since their fields are all publicly accessible).
> >
> > As noted in the earlier patch, I believe this is a superior scheme to
> > the C implementation:
> >
> > * We can get closer to compile-time type-safety, checking the gimple
> > code once and downcasting with an as_a, then directly accessing
> > fields, rather than going through accessor functions that check
> > each time. In some places we may want to replace a "gimple" with
> > a subclass e.g. phis are always of the phi subclass, to get full
> > compile-time type-safety.
> >
> > * This scheme is likely to be easier for newbies to understand.
> >
> > * Currently in gdb, dereferencing a gimple leads to screenfuls of text,
> > showing all the various union values. With this, you get just the base
> > class, and can cast it to the appropriate subclass.
> >
> > * With this, we're working directly with the language constructs,
> > rather than rolling our own, and thus other tools can better
> > understand the code. (e.g. doxygen).
> >
> > Again, as noted in the earlier patch series, the names of the structs
> > are rather verbose. I would prefer to also rename them all to eliminate
> > the "_statement" component:
> > "gimple_statement_base" -> "gimple_base"
> > "gimple_statement_phi" -> "gimple_phi"
> > "gimple_statement_omp" -> "gimple_omp"
> > etc, but I didn't do this to minimize the patch size. But if the core
> > maintainers are up for that, I can redo the patch series with that
> > change also, or do that as a followup.
> >
> > The patch is in 6 parts; all of them are needed together.
> And that's part of the problem. There's understandable resistance to
> (for example) the as_a casting.
>
> There's a bit of natural tension between the desire to keep patches
> small and self-contained and the size/scope of the changes necessary to
> do any serious reorganization work. This set swings too far in the
> latter direction :-)
>
> Is there any way to go forward without the is_a/as_a stuff? i.e., is
> there a simpler step towards where we're trying to go that allows
> most of this to go forward now rather than waiting?
>
> >
> > * Patch 1 of 6: This patch adds inheritance to the various gimple
> > types, eliminating the initial baseclass fields, and eliminating the
> > union gimple_statement_d. All the types remain structs. They
> > become marked with GTY(()), gaining GSS_ tag values.
> >
> > * Patch 2 of 6: This patch ports various accessor functions within
> > gimple.h to the new scheme.
> >
> > * Patch 3 of 6: This patch is autogenerated by "refactor_gimple.py"
> > from https://github.com/davidmalcolm/gcc-refactoring-scripts
> > There is a test suite "test_refactor_gimple.py" which may give a
> > clearer idea of the changes that the script makes (and add
> > confidence that it's doing the right thing).
> > The patch converts code of the form:
> > {
> > GIMPLE_CHECK (gs, SOME_CODE);
> > gimple_subclass_get/set_some_field (gs, value);
> > }
> > to code of this form:
> > {
> > some_subclass *stmt = as_a <some_subclass> (gs);
> > stmt->some_field = value;
> > }
> > It also autogenerates specializations of
> > is_a_helper <T>::test
> > equivalent to a GIMPLE_CHECK() for use by is_a and as_a.
> Conceptually I'm fine with #1-#3.
>
> >
> > * Patch 4 of 6: This patch implements further specializations of
> > is_a_helper <T>::test, for gimple_has_ops and gimple_has_mem_ops.
> Here's where I start to get more concerned.
Thanks for looking through this.
Both you and Andrew objected to my use of the is-a.h stuff. Is this due
to the use of C++ templates in that code? If I were to rewrite things
in a more C idiom, would that be acceptable?
For instance, rather than, say:
  p = as_a <gimple_statement_asm> (
        gimple_build_with_ops (GIMPLE_ASM, ERROR_MARK,
                               ninputs + noutputs + nclobbers + nlabels));
we could have an inlined as_a equivalent in C syntax:
  p = gimple_as_a_gimple_asm (
        gimple_build_with_ops (GIMPLE_ASM, ERROR_MARK,
                               ninputs + noutputs + nclobbers + nlabels));
where there could be, say, a pair of functions like this (to handle
const vs non-const):
  inline gimple_asm
  gimple_as_a_gimple_asm (gimple gs)
  {
    GIMPLE_CHECK (gs, GIMPLE_ASM);
    return (gimple_asm)gs;
  }
  inline const_gimple_asm
  gimple_as_a_gimple_asm (const_gimple gs)
  {
    GIMPLE_CHECK (gs, GIMPLE_ASM);
    return (const_gimple_asm)gs;
  }
(where gimple_asm is "typedef gimple_statement_asm *gimple_asm;", and
const_gimple_asm is the corresponding pointer-to-const typedef)
That would avoid template usage within the patch, leaving the use of C++
inheritance as the only overtly C++ish aspect.
We could do the above using preprocessor magic, but I'd prefer to have
actual code to do it.
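
As a rough illustration of such a pair of non-template downcast functions, here is a standalone sketch. It is not taken from the patches: the cut-down enum, the stand-in struct definitions, the "ni" field, and the use of plain assert in place of the real checking macro are all assumptions made so that the example compiles on its own.

```cpp
#include <cassert>

// Stand-in types for illustration only; the real ones live in gimple.h.
enum gimple_code { GIMPLE_ASM, GIMPLE_PHI };

struct gimple_statement_base
{
  enum gimple_code code;
};

struct gimple_statement_asm : public gimple_statement_base
{
  unsigned ni;  // hypothetical field, used only by this sketch
};

typedef gimple_statement_base *gimple;
typedef const gimple_statement_base *const_gimple;
typedef gimple_statement_asm *gimple_asm;
typedef const gimple_statement_asm *const_gimple_asm;

// Checked downcast, non-const flavor.  assert stands in for the
// checking macro, so the test disappears when NDEBUG is defined.
inline gimple_asm
gimple_as_a_gimple_asm (gimple gs)
{
  assert (gs->code == GIMPLE_ASM);
  return static_cast <gimple_asm> (gs);
}

// Checked downcast, const flavor (a C++ overload of the same name).
inline const_gimple_asm
gimple_as_a_gimple_asm (const_gimple gs)
{
  assert (gs->code == GIMPLE_ASM);
  return static_cast <const_gimple_asm> (gs);
}
```

Note that having two functions with the same name relies on C++ overloading, so this is "C idiom" only in the sense of avoiding templates.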
Similarly, instead of:
  const gimple_statement_with_ops *ops_stmt =
    dyn_cast <const gimple_statement_with_ops> (g);
  if (!ops_stmt)
    return NULL;
we could have:
  const_gimple_with_ops ops_stmt =
    gimple_dyn_cast_gimple_with_ops (g);
  if (!ops_stmt)
    return NULL;
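
A non-template dyn_cast equivalent along those lines might look like the sketch below. Again this is illustrative only: the cut-down enum, the stand-in structs, the "num_ops" field, and the stmt_has_ops_p predicate are hypothetical, and the real test for "has operands" in gimple.h may well differ.

```cpp
#include <cstddef>

// Stand-in types for illustration only.
enum gimple_code { GIMPLE_COND, GIMPLE_ASSIGN, GIMPLE_NOP };

struct gimple_statement_base
{
  enum gimple_code code;
};

struct gimple_statement_with_ops : public gimple_statement_base
{
  unsigned num_ops;  // hypothetical field, used only by this sketch
};

typedef const gimple_statement_base *const_gimple;
typedef const gimple_statement_with_ops *const_gimple_with_ops;

// Hypothetical predicate: which codes carry operands in this sketch.
inline bool
stmt_has_ops_p (const_gimple gs)
{
  return gs->code == GIMPLE_COND || gs->code == GIMPLE_ASSIGN;
}

// Returns the statement viewed as the subclass, or NULL if the
// statement is not of a kind that has operands.
inline const_gimple_with_ops
gimple_dyn_cast_gimple_with_ops (const_gimple gs)
{
  if (!stmt_has_ops_p (gs))
    return NULL;
  return static_cast <const_gimple_with_ops> (gs);
}
```

Unlike the as_a flavor, the test here is always performed, since the caller depends on the NULL result.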
> > * Patch 5 of 6: This patch does the rest of porting from union access
> > to subclass access (all the fiddly places that the script in patch 3
> > couldn't handle).
> >
> > * Patch 6 of 6: This patch updates the gdb python pretty-printing
> > hook.
> Conceptually #5 and #6 shouldn't be terribly controversial.
(...though they're implicitly using the template specializations from #3
and #4)
> The question is can we move forward without patch #4, even if that means
> we aren't getting the static typechecking we want?
Maybe. If the above idea still goes too far, we could keep the
GIMPLE_CHECK checking, and cast by hand. I suspect the results would be
more ugly (though it's clear that beauty is in the eye of the beholder
here :))
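
For comparison, the "keep the check and cast by hand" fallback could look roughly like this in an accessor. This is a sketch under assumptions: STMT_CHECK is a hypothetical stand-in for GIMPLE_CHECK built on assert, and the stand-in structs and the accessor names are invented for the example.

```cpp
#include <cassert>

// Stand-in types for illustration only.
enum gimple_code { GIMPLE_PHI, GIMPLE_ASM };

struct gimple_statement_base
{
  enum gimple_code code;
};

struct gimple_statement_phi : public gimple_statement_base
{
  unsigned nargs;
};

typedef gimple_statement_base *gimple;
typedef const gimple_statement_base *const_gimple;

// Hypothetical stand-in for GIMPLE_CHECK (GS, CODE).
#define STMT_CHECK(GS, CODE) assert ((GS)->code == (CODE))

// Every accessor repeats the check and then casts by hand.
inline void
phi_set_nargs (gimple gs, unsigned n)
{
  STMT_CHECK (gs, GIMPLE_PHI);
  static_cast <gimple_statement_phi *> (gs)->nargs = n;
}

inline unsigned
phi_nargs (const_gimple gs)
{
  STMT_CHECK (gs, GIMPLE_PHI);
  return static_cast <const gimple_statement_phi *> (gs)->nargs;
}
```

The check and the cast are written out in every accessor, and nothing ties the code being checked to the type being cast to, which is the part a dedicated downcast function would centralize.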
BTW, how do you feel about static_cast<> vs C-style casts?
Thanks
Dave