Hello,

I'm a master's student in high-performance computing at the Barcelona
Supercomputing Center, and I'm working on my thesis on implementing the
OpenMP accelerator model in our compiler (OmpSs). I have almost
finished implementing all the new directives to generate CUDA code, and
according to my design the same implementation for OpenCL should not
take much more work. I haven't yet tried Intel MIC, APUs, or other
hardware accelerators :) Now I'm benchmarking the kernel code
generated by my compiler. Although the output kernels are generally
naive, the speedup is not bad. When I compare results with the HMPP
OpenACC 3.2.x compiler, the speedups are almost the same, or in some
cases my results are slightly better. That's why this term I am going
to work on compiler-level or runtime-level optimizations for GPUs.

When I looked at the GCC OpenMP 4.0 project, I couldn't see anything
about code generation. Are you going to announce it later? Or should I
apply to GSoC with my idea about code generation and device code
optimizations?

Güray Özen
~grypp
