I'll be submitting our current ptx backend in a series of 23 patches in
reply to this mail. This is currently a work-in-progress and still rough
around the edges. We'd like to do all our OpenACC work on the gomp4
branch, so I'm submitting this as a proposal to see if it would be
acceptable for this branch in its current state. Beyond these patches,
some other pieces are necessary for it to be useful: a post-processor
for gcc output that reorders it in such a way that ptxas can process it,
a small "library" assembly file that provides a _main entry point that
calls main as expected, and a modified version of the CUDA ptxjit
example program that can take an assembly file produced by the toolchain
and execute that _main entry. This makes it possible to run a fair
number of gcc testcases.
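
As a rough illustration of the kind of post-processing meant here, the
following is a sketch only, not the actual tool. It assumes the
reordering amounts to moving declaration lines ahead of the code that
uses them while keeping the PTX header directives first. The function
name, the set of prefixes treated as declarations, and the sample input
are all hypothetical.

```python
# Hypothetical sketch of a reordering post-processor. Assumption: the
# assembler wants the module header first, then declarations, then
# everything else. This is NOT the tool described in this mail.

# Real PTX module-level header directives.
HEADER_PREFIXES = (".version", ".target", ".address_size")
# Assumption for illustration: only ".extern" lines count as declarations.
DECL_PREFIXES = (".extern",)


def hoist_declarations(lines):
    """Return lines as header + declarations + rest, each in original order."""
    header, decls, rest = [], [], []
    for line in lines:
        stripped = line.lstrip()
        if stripped.startswith(HEADER_PREFIXES):
            header.append(line)
        elif stripped.startswith(DECL_PREFIXES):
            decls.append(line)
        else:
            rest.append(line)
    return header + decls + rest


if __name__ == "__main__":
    # Simplified, made-up input in which foo is used before it is declared.
    sample = [
        ".version 3.1",
        ".visible .func main ()",
        "{",
        "\tcall foo;",
        "}",
        ".extern .func foo;",
    ]
    print("\n".join(hoist_declarations(sample)))
```

A line-based pass like this only works if each declaration fits on a
single line; anything spanning several lines would need real parsing.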

The first 22 patches are preliminary. Some fix problems exposed by the
port and could be applied to mainline now if someone wanted to approve
them. Others deal with a number of issues unique to ptx:

 * It's a virtual target, which means register allocation happens in the
   "assembler". We added a target hook to gcc that disables almost
   everything from IRA onwards.
 * The assembly syntax is sufficiently different from what is normal
   that we need extra hooks to print out variables. We also need
   to print declarations for all referenced functions and variables in
   the output file.
 * Everything must live in an address space. There are separate ones
   for global variables, constant data, and local variables. We have C
   frontend changes that apply implicit address spaces and deal with
   them.

It's not clear whether we'll want the C frontend changes in the final
version of this - we may not care about compiling C directly to ptx
without going through OpenACC. For the moment, though, they're very
useful because they let us run parts of the gcc testsuite.

Bootstrapped and tested on x86_64-linux with all patches applied to
gomp4-branch.


Bernd
