On Mon, Oct 20, 2014 at 4:17 PM, Bernd Schmidt <ber...@codesourcery.com> wrote:
> This is a patch kit that adds the nvptx port to gcc. It contains preliminary
> patches to add needed functionality, the target files, and one somewhat
> optional patch with additional target tools. There'll be more patch series,
> one for the testsuite, and one to make the offload functionality work with
> this port. Also required are the previous four rtl patches, two of which
> weren't entirely approved yet.
>
> For the moment, I've stripped out all the address space support that got
> bogged down in review by brokenness in our representation of address spaces.
> The ptx address spaces are of course still defined and used inside the
> backend.
>
> Ptx really isn't a usual target - it is a virtual target which is then
> translated by another compiler (ptxas) to the final code that runs on the
> GPU. There are many restrictions, some imposed by the GPU hardware, and some
> by the fact that not everything you'd want can be represented in ptx. Here
> are some of the highlights:
>  * Everything is typed - variables, functions, registers. This can
>    cause problems with K&R style C or anything else that doesn't
>    have a proper type internally.
>  * Declarations are needed, even for undefined variables.
>  * Can't emit initializers referring to their variable's address since
>    you can't write forward declarations for variables.
>  * Variables can be declared only as scalars or arrays, not
>    structures. Initializers must be in the variable's declared type,
>    which requires some code in the backend, and it means that packed
>    pointer values are not representable.
>  * Since it's a virtual target, we skip register allocation - no good
>    can probably come from doing that twice. This means asm statements
>    aren't fixed up and will fail if they use matching constraints.

So with this restriction I wonder why it didn't make sense to go the
HSA "backend" route emitting PTX from a GIMPLE SSA pass.  This
would have avoided the LTO dance as well ...

That is, what is the advantage of expanding to RTL here? What
main benefits do you get from RTL that you thought would be
difficult to handle if doing code generation from GIMPLE SSA?

For HSA we even do register allocation (to a fixed virtual register
set), which is simple enough on SSA.  We of course also have to do
instruction selection, but luckily virtual ISAs are easy to target.

So were you worried about "duplicating" instruction selection
and/or doing it manually instead of with well-known machine
descriptions?

I'm just curious - I am not asking you to rewrite the beast ;)

Thanks,
Richard.

>  * No support for indirect jumps, label values, nonlocal gotos.
>  * No alloca - ptx defines it, but it's not implemented.
>  * No trampolines.
>  * No debugging (at all, for now - we may add line number directives).
>  * Limited C library support - I have a hacked up copy of newlib
>    that provides a reasonable subset.
>  * malloc and free are defined by ptx (these appear to be
>    undocumented), but there isn't a realloc. I have one patch for
>    Fortran to use a malloc/memcpy helper function in cases where we
>    know the old size.
>
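
To illustrate the malloc/memcpy approach mentioned above for the missing
realloc, here is a minimal sketch in C. The helper name and signature are
hypothetical and are not taken from the patch; it only assumes that malloc,
free and memcpy are available and that the caller knows the old size.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (name and signature are assumptions, not from the
   patch): emulate realloc when only malloc/free exist and the caller
   knows the size of the old allocation.  */
void *
realloc_known_size (void *old, size_t old_size, size_t new_size)
{
  void *fresh;

  if (new_size == 0)
    {
      /* Treat a zero-size request as a plain free.  */
      free (old);
      return NULL;
    }

  fresh = malloc (new_size);
  if (fresh == NULL)
    return NULL;		/* The old block is left untouched on failure.  */

  if (old != NULL)
    {
      /* Copy only as many bytes as fit in both blocks.  */
      memcpy (fresh, old, old_size < new_size ? old_size : new_size);
      free (old);
    }
  return fresh;
}
```

The need to pass old_size is what limits such a helper to cases where the
compiler or runtime already tracks the allocation size.
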
> All in all, this is not intended to be used as a C (or any other source
> language) compiler. I've gone through a lot of effort to make it work
> reasonably well, but only in order to get sufficient test coverage from the
> testsuites. The intended use for this is only to build it as an offload
> compiler, and use it through OpenACC by way of lto1. That leaves the
> question of how we should document it - does it need the usual constraint
> and option documentation, given that users aren't expected to use any of
> it?
>
> A slightly earlier version of the entire patch kit was bootstrapped and
> tested on x86_64-linux. Ok for trunk?
>
>
> Bernd
