Re: Instructions vs Expressions in the backend (was Re: RFA: Rework FOR_BB_INSNS iterators)

Oleg Endo Mon, 23 Jun 2014 13:39:14 -0700

On Mon, 2014-06-23 at 14:54 -0400, David Malcolm wrote:

> FWIW I'm actually working on such a change, or, at least, to make
> instructions be a subclass of rtx_def; presumably this should make it
> easier to make it an entirely separate type, if that's desirable.
> 
> Executive summary is that it's still a work in progress, and I'm going
> to be giving a talk about it at Cauldron next month ("A proposal for
> typesafe RTL", currently scheduled for the Sunday at 12.45 in Stream 2).
> 
> More detailed version:
> I experimented with this a few months ago, and came up with a 159-patch
> patch series:
> http://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v1/
> 
> That patch kit builds on x86_64 (and regrtests iirc), and uses a
> separate subclass for insns e.g. in the core basic block class, in the
> scheduler, register allocators, etc, but:
>   (A) it only builds on about 70 of the ~200 configurations in
> config-list.mk
>   (B) there was a tangled mess of dependencies between the patches
> making it hard to fix (A)
>   (C) I went with the rather verbose "rtx_base_insn" name, which in v1
> is a typedef for a pointer to an insn, along with a family of other
> typedefs, which I now realize is frowned upon e.g.:
> https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01520.html
> https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01632.html
>   (D) I was using as_a *methods* rather than just as_a <>, like
> in v1 of my gimple-classes patches:
> (https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00887.html)
> 
> 
> So I've been reworking it, incorporating feedback from the gimple
> statement classes work; the latest patch series is v7 and can be seen
> here:
> http://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v7/
> 
> It builds and regrtests on x86_64 on top of r211061 (aka
> dcd5393fd5b3ac53775a5546f3103958b64789bb) and builds on 184 of the
> configurations in config-list.mk; I have patches to fix the remaining
> ones to achieve parity with an unpatched build.
> 
> The current class hierarchy looks like this:
> /* Subclasses of rtx_def, using indentation to show the class
>    hierarchy, along with the relevant invariant.  */
> class rtx_def;
>   class rtx_insn; /* (INSN_P (X) || NOTE_P (X)
>                      || JUMP_TABLE_DATA_P (X) || BARRIER_P (X)
>                      || LABEL_P (X)) */
>     class rtx_real_insn;         /* INSN_P (X) */
>       class rtx_debug_insn;      /* DEBUG_INSN_P (X) */
>       class rtx_nonjump_insn;    /* NONJUMP_INSN_P (X) */
>       class rtx_jump_insn;       /* JUMP_P (X) */
>       class rtx_call_insn;       /* CALL_P (X) */
>     class rtx_jump_table_data;   /* JUMP_TABLE_DATA_P (X) */
>     class rtx_barrier;           /* BARRIER_P (X) */
>     class rtx_code_label;        /* LABEL_P (X) */
>     class rtx_note;              /* NOTE_P (X) */
> 
> and the code now uses "rtx_insn *" in hundreds of places where it would
> previously have used "rtx".
> 
> My aim is that it always builds on every config: each patch is
> self-contained.  To do this, the initial patches add scaffolding that
> adds some strategically-placed checked downcasts to rtx_insn * (e.g. on
> the result of PREV_INSN/NEXT_INSN).  The subsequent patches then use
> these promises in order to strengthen the insn-handling code to work on
> the new subclass. The idea is that the final patches then remove this
> scaffolding, with the core BB types becoming rtx_insn *.  The drawback
> of course is that during the process the resulting compiler is much
> slower - until the scaffolding is removed.
> 
> v1 of the patch series also added some non-insn subclasses of rtx, for
> EXPR_LIST, INSN_LIST, and for SEQUENCE, allowing rewriting of iterations
> like this (in jump.c:rebuild_jump_labels_1), where "insn" and
> "forced_labels" are INSN_LIST and hence are "rtx_insn_list *":
> 
> -    for (insn = forced_labels; insn; insn = XEXP (insn, 1))
> -      if (LABEL_P (XEXP (insn, 0)))
> -       LABEL_NUSES (XEXP (insn, 0))++;
> +    for (insn = forced_labels; insn; insn = insn->next ())
> +      if (LABEL_P (insn->element ()))
> +       LABEL_NUSES (insn->element ())++;
> 
> Thoughts?


Personally, I'd like to see usage of standard STL-like iterator usage.
I've proposed something for edge_iterator a while ago, but people don't
seem very fond of it.  See also
https://gcc.gnu.org/ml/gcc-patches/2013-12/msg01129.html

Have you also considered passing the new rtx_* types by value or
reference instead of pointer?  A long time ago I've quickly put together
a class 'rtxx' which was just a pointer wrapper for the rtx_def*
(basically the same as I proposed for edge_iterator).
I've converted the SH backend code to use it just to see what it would
look like.  The conversion itself was pretty straight forward -- just
replace 'rtx' with 'rtxx'.  Appropriate conversion
operators/constructors in 'class rtxx' made both interchangeable and
allowed co-existence of both and thus step-by-step conversion of the
code base.
Another advantage of passing around by value/ref is that it allows
operator overloading.  One use case could be instead of:

if (MEM_P (XEXP (x, 0)))
  *total = address_cost (XEXP (XEXP (x, 0), 0),
                         GET_MODE (XEXP (x, 0)),
                         MEM_ADDR_SPACE (XEXP (x, 0)), true);


something like that (overloading operator[]):
if (x[0] == rtx_mem::type)
  *total = address_cost (x[0][0], x[0].mode (),
                         x[0].mem_addr_space (), true);

... where rtx_mem::type would be some type for which 'rtxx' (or whatever
the name of the base class is) would provide the according operator
==, != overloads.

Another thing is rtx construction in C++ code, which could look like:

emit_insn (rtx_insn (rtx_set (rtx_reg (0),
                              rtx_plus (rtx_reg (1), rtx_reg (2)))));
emit_insn (rtx_barrier ());

Although I'm not sure how often this is needed in current practice,
since most of the time rtx instances are created from the .md patterns.
Maybe it could be useful for implementing some kind of ad-hoc rtx
matching, as it's found in cost calculations, predicates, constrants,
e.g.

if ((GET_CODE (XEXP (x, 0)) == SMAX || GET_CODE (XEXP (x, 0)) == SMIN)
    && CONST_INT_P (XEXP (XEXP (x, 0), 1))
    && REG_P (XEXP (XEXP (x, 0), 0))
    && CONST_INT_P (XEXP (x, 1)))

could become:
if (matches (x, rtx_smax (reg_rtx (), const_int (), const_int ()))
    || matches (x, rtx_smin (reg_rtx (), const_int (), const_int ()))

somehow I find that easier to read and write.

Cheers,
Oleg

Re: Instructions vs Expressions in the backend (was Re: RFA: Rework FOR_BB_INSNS iterators)

Reply via email to