Ross Ridge wrote:
Basile STARYNKEVITCH writes:
It seems to me that some specifications
seem to be available. I am not a GPU expert, but
http://developer.amd.com/documentation/guides/Pages/default.aspx
contains an R8xx Family Instruction Set Architecture document at
http://developer.amd.com/gpu_assets/r600isa.pdf and at a very quick
first glance (perhaps wrongly) I feel that it could be enough to design &
write a code generator for it.
Oh, OK, that makes a world of difference. Even with just AMD GPU
support, a GCC-based OpenCL implementation becomes a lot more practical.
Ross Ridge
I am also interested in working on OpenCL support for GCC, for several
reasons.
1. This would be a way to increase accessibility of OpenCL to a wider
set of users, programming environments (and source languages), and
target platforms, including embedded ones (think of ARM/NVIDIA Tegra).
2. It would make good use of the high-level optimizations building up in
GCC, including LTO (for specialization purposes, something LLVM is great
at, but which could be done at a much larger scale in GCC), Graphite,
automatic vectorization for Larrabee-like SIMD architectures mixing
vectors and threads, etc.
3. It would certainly be a great step towards automatic parallelization
for heterogeneous architectures, and towards research in performance
portability in general.
These three reasons (and there may be others) argue for OpenCL awareness
in the front end and middle end, not only library support, and certainly
argue for doing it in GCC rather than anywhere else.
I think the killer argument for GCC support of OpenCL is Larrabee, and
heterogeneous multicores in general. GCC must see those architectures as
one single target with multiple sub-targets. This may become a survival
issue within a couple of years.
A side question is whether GCC should become a
single-source-multiple-target compiler, where a single compilation unit
can lead to code generated for multiple ISAs. Note that this is not out
of reach at all, since a shortcut exists: attributes guarding the code
generation of some functions or variable declarations, so that code is
generated for only one given target at a time. Several people have tried
it, and it does not require reworking any machine description, although
multiple runs of cc1/... would still be necessary.
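
To make the shortcut concrete, here is a minimal sketch of one compilation
unit holding code for two targets. It is only an emulation: GCC has no such
guarding attribute that I can point to, so preprocessor macros stand in for
it. All the names (BUILD_TARGET, TARGET_HOST, TARGET_ACCEL, HOST_ONLY,
ACCEL_ONLY, saxpy_kernel, saxpy_reference, build_target_name) are
hypothetical and chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical target tags; each compiler run selects exactly one. */
#define TARGET_HOST  0
#define TARGET_ACCEL 1

#ifndef BUILD_TARGET
#define BUILD_TARGET TARGET_HOST   /* default pass: host code */
#endif

/* Stand-ins for a guarding attribute: the wrapped declaration is kept
   in the pass for its own target and dropped in the other pass. */
#if BUILD_TARGET == TARGET_ACCEL
#define ACCEL_ONLY(...) __VA_ARGS__
#define HOST_ONLY(...)
#else
#define ACCEL_ONLY(...)
#define HOST_ONLY(...) __VA_ARGS__
#endif

/* Emitted only in the accelerator pass. */
ACCEL_ONLY(
void saxpy_kernel(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
)

/* Emitted only in the host pass: a plain reference loop. */
HOST_ONLY(
void saxpy_reference(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
)

/* Unguarded code is emitted in every pass. */
const char *build_target_name(void)
{
    return BUILD_TARGET == TARGET_ACCEL ? "accel" : "host";
}
```

Compiling the same file twice, once as is and once with -DBUILD_TARGET=1
(in practice with a compiler for the second ISA), mirrors the multiple runs
mentioned above. A real attribute would let the compiler itself do the
filtering instead of the preprocessor.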
Feedback welcome!
Albert Cohen