On 12/13/06, Frank Riese <[EMAIL PROTECTED]> wrote:
One of my professors stated that a GCC Back End uses the Control Flow Graph as
its input and that generation of RTL expressions occurs later on.

That is not true.

What roles
do Back and Middle End play in generation of RTL? Would you consider the CFG
or RTL expressions as the input for a GCC Back End?

Let me first say that the definitions of front end, back end, and
middle end are a bit hairy.  You have to carefully define what you
classify as belonging to the middle end or the back end.  I actually
try to avoid the terms nowadays.

Also, you have to be specific about the version of GCC that you're
talking about. GCC2, GCC3 and GCC4 are completely different
internally, and even the differences between various GCC4 releases are
quite significant.

Anyway...

The steps through the compiler are as follows:

1. front end runs, produces GENERIC
2. GENERIC is lowered to GIMPLE
3. a CFG is constructed for GIMPLE
4. GIMPLE (tree-ssa) optimizers run
5. GIMPLE is expanded to RTL, while preserving the CFG
6. RTL optimizers run
7. assembly is written out

The RTL generation in step 5 is done one statement at a time.   The
part of the compiler that generates the RTL is a mix of shared code
and of back end code: A single GIMPLE statement at a time is passed to
the middle-end expand routines, which tries to produce RTL for this
statement using instructions available on the target machine.  The
available instructions are defined by the target machine description
(i.e. the back end).

Try to understand cfgexpand.c and the section on named RTL patterns in
the GCC internals manual.

I also remembered having read the following line from the gcc internals
documentation. However, I'm still not sure how to interpret this:

"A control flow graph (CFG) is a data structure built on top of the
intermediate code representation (the RTL or tree instruction stream)
abstracting the control flow behavior of a function that is being compiled"

Does that mean that a control flow graph is built after rtl has been generated
or that information about that information about the control flow is
incorporated into the RTL data structures?

Neither.

I'm assuming you're interested in how this works in recent GCC
releases, i.e. GCC4 based. In GCC4, the control flow graph is built on
GIMPLE, the tree-ssa optimizers need a CFG too.  This CFG is kept
up-to-date through the optimizers and through expansion to RTL.  This
means that GCC builds the CFG only once for each function.

The data structures for the CFG are in basic-block.h.  These data
structures are most definitely *not* incorporated into the RTL
structures.  The CFG is independent of the intermediate
representations for the function instructions.  It has to be, or you
could have the same CFG data structures for both GIMPLE and RTL.

Hope this helps,

Gr.
Steven

Reply via email to