On Mon, Oct 26, 2015 at 20:05:39 +0100, Jakub Jelinek wrote: > On Mon, Oct 26, 2015 at 09:35:52PM +0300, Ilya Verbin wrote: > > On Fri, Jul 17, 2015 at 15:05:59 +0200, Jakub Jelinek wrote: > > > As the testcases show, #pragma omp declare target has now a new form > > > (well, > > > two; with some issues on it pending), where it is used just as a single > > > declarative directive rather than a pair of them and allows marking > > > vars and functions by name as "omp declare target" vars/functions (which > > > the > > > middle-end etc. already handles), but also "omp declare target link", > > > which > > > is a deferred var, that is not initially mapped (on devices without shared > > > memory with host), but has to be mapped explicitly. > > > > I don't quite understand how link should work. OpenMP 4.5 says: > > > > "The list items of a link clause are not mapped by the declare target > > directive. > > Instead, their mapping is deferred until they are mapped by target data or > > target constructs. They are mapped only for such regions." > > > > But doesn't this mean that the example bellow should work identically > > with/without USE_LINK defined? Or is there some difference on other > > testcases? > > On your testcase, the end result is pretty much the same, the variable is > not mapped initially to the device, and at the beginning of omp target it is > mapped to device, at the end of the region it is unmapped from the device > (without copying back). > > But consider: > > int a = 1, b = 1; > #pragma omp declare target link (a) to (b) > int > foo (void) > { > return a++ + b++; > } > #pragma omp declare target to (foo) > int > main () > { > a = 2; > b = 2; > int res; > #pragma omp target map (to: a, b) map (from: res) > { > res = foo () + foo (); > } > // This assumes only non-shared address space, so would need to be guarded > // for that. > if (res != (2 + 1) + (3 + 2)) > __builtin_abort (); > return 0; > } > > Without declare target link or to, you can't use the global variables > in orphaned accelerated routines (unless you e.g. take the address of the > mapped variable in the region and pass it around). > The to variables (non-deferred) are always mapped and are initialized with > the original initializer, refcount is infinity. link (deferred) work more > like the normal mapping, referencing those vars when they aren't explicitly > (or implicitly) mapped is unspecified behavior, if it is e.g. mapped freshly > with to kind, it gets the current value of the host var rather than the > original one. But, beyond the mapping the compiler needs to ensure that > all uses of the link global var (or perhaps just all uses of the link global > var outside of the target construct body where it is mapped, because you > could use there the pointer you got from GOMP_target) are replaced by > dereference of some artificial pointer, so a becomes *a_tmp and &a becomes > &*a_tmp, and that the runtime library during registration of the tables is > told about the address of this artificial pointer. During registration, > I'd expect it would stick an entry for this range into the table, with some > special flag or something similar, indicating that it is deferred mapping > and where the offloading device pointer is. During mapping, it would map it > as any other not yet mapped object, but additionally would also set this > device pointer to the device address of the mapped object. We also need to > ensure that when we drop the refcount of that mapping back to 0, we get it > back to the state where it is described as a range with registered deferred > mapping and where the device pointer is.
Ok, got it, I'll try implement this... > > > we actually replace the variables with pointers to variables, then need > > > to somehow also mark those in the offloading tables, so that the library > > > > I see 2 possible options: use the MSB of the size, or introduce the third > > field > > for flags. > > Well, it can be either recorded in the host variable tables (which contain > address and size pair, right), or in corresponding offloading device table > (which contains the pointer, something else?). It contains a size too, which is checked in libgomp: gomp_fatal ("Can't map target variables (size mismatch)"); Yes, we can remove this check, and use second field in device table for flags. -- Ilya