Richard Biener wrote:
We're going to look at supporting HSA from GCC (which would make it
more or less trivial to also target OpenCL, I think).
For the friends of link-time optimization (LTO):
Unless I missed some fine point in OpenACC and OpenMP's target, they
only work with directives which are locally visible. Thus, if one does a
function call in the device/target section, it can only be placed on the
accelerator if the function can be inlined.
Thus, it would be useful if LTO could be used to inline such functions
into device code. I know one OpenACC code which calls functions in
different translation units (TU) - and the Cray compiler handles this
via LTO. Thus, it would be great if the HSA/OpenMP target/OpenACC
middle-end infrastructure could do likewise. That also means deferring
the error that an external function cannot be used to the middle end/LTO
FE rather than issuing it already in the language FE. In the mentioned
code, the called function has no OpenACC annotation and consists only of
constructs which are permitted on the accelerator; since it is not
annotated, no accelerator code is generated automatically for its TU.
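
To illustrate the situation, here is a minimal sketch in C. It is not
taken from the code mentioned above; the function names and the file
names in the comments are hypothetical, and both parts are shown in one
listing only for brevity. The callee carries no OpenACC annotation, so
the call in the parallel region can only end up on the accelerator if
the compiler manages to inline it, which across TUs would need LTO.

```c
#include <assert.h>

/* Imagine this function in its own TU, e.g. a hypothetical "kernel.c".
   It has no OpenACC annotation (no "acc routine"), but its body uses
   only constructs that an accelerator could execute. */
double scale_add(double a, double x, double y)
{
    return a * x + y;
}

/* Imagine the caller in a different TU, e.g. a hypothetical "main.c".
   When OpenACC is not enabled, the pragma is ignored and the loop
   simply runs on the host. */
void axpy_on_device(int n, double a, const double *x, double *y)
{
    assert(n >= 0);  /* host-side sanity check, outside the region */

#pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = scale_add(a, x[i], y[i]);  /* call into the other TU */
}
```

If the two functions really are in separate files, the front end sees
only a declaration of scale_add while compiling the caller. That is why
the "cannot be used" error would have to wait until LTO has had a chance
to inline the call.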
(I just want to mention this to ensure that this kind of LTO/accelerator
inlining is kept in mind when implementing the infrastructure for
HSA/OpenACC/OpenMP target/OpenCL - even if cross-TU inlining is not
supported initially.)
Tobias