On Mon, Nov 09, 2015 at 08:59:15AM -0500, Nathan Sidwell wrote:
> >This I'm afraid performs often two copies rather than just one (one to copy
> >the host value to the present_copyin mapped value, another one in the
> >region),
>
> I don't think that can be avoided.  The host doesn't have control over when
> the CTAs (a gang) start -- they may even be serialized onto the same
> physical HW.  So each gang has to initialize its own instance.  Or did you
> mean something else?

So, what is the scope of the private and firstprivate vars in OpenACC?
In OpenMP, if a variable is private or firstprivate on the target construct,
then unless it is further privatized in inner constructs it is really shared
among all the threads in all the teams (so one var for all CTAs/workers, in
PTX terms).  Is that the case for OpenACC too, or are the vars e.g. private
to each CTA already, or to each thread in each CTA, or something else?

If they are shared by all CTAs, then you should hopefully be able to use
GOMP_MAP_FIRSTPRIVATE{,_INT}.  If not, then I'd say you should at least use
those to provide the initializer data from which to initialize your private
vars, as a cheaper alternative to mapping.

	Jakub
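
For reference, the OpenMP case described above could be sketched roughly as
follows.  This is only an illustration, not code from the patch under review:
the function name sum_with_seed and its parameters are hypothetical, and it
assumes an OpenMP 4.5 compiler (without OpenMP enabled the pragmas are simply
ignored and the loop runs serially with the same result).

```c
/* Hypothetical helper, not taken from the thread: adds `seed` to a
   running total `n` times inside a target region.  */
int sum_with_seed(int seed, int n)
{
    int total = 0;

    /* `seed` is firstprivate on the target construct, so the region gets
       one copy initialized from the host value.  The inner teams/parallel
       constructs do not privatize it again, so every team and thread
       reads that same copy.  */
    #pragma omp target firstprivate(seed) map(tofrom: total)
    #pragma omp teams distribute parallel for reduction(+: total)
    for (int i = 0; i < n; i++)
        total += seed;

    return total;
}
```

Calling sum_with_seed(3, 100) should give 300 whether the region is offloaded
or run on the host.  If OpenACC instead gives each gang its own instance, the
same host value would have to be used as initializer data for each of those
instances rather than for a single shared copy.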