On Tue, Dec 11, 2018 at 09:47:10PM +0800, Chung-Lin Tang wrote:
> I have revised the patch to make both gomp_[un]map_vars and
> gomp_[un]map_vars_async point to gomp_[un]map_vars_internal, which is
> static always_inline. This should alleviate that part of the concerns.
>
> @@ -263,8 +279,9 @@ gomp_to_device_kind_p (int kind)
>      }
>  }
>
> -static void
> +attribute_hidden void
>  gomp_copy_host2dev (struct gomp_device_descr *devicep,
> +                    struct goacc_asyncqueue *aq,
>                      void *d, const void *h, size_t sz,
>                      struct gomp_coalesce_buf *cbuf)

Have you tried sticking the struct goacc_asyncqueue * into
struct gomp_coalesce_buf?  If that doesn't work for some reason (please
explain why), then I'd prefer that argument to come last, not second,
because various targets have small limits on how many arguments they can
pass in registers.

> @@ -293,14 +310,23 @@ gomp_copy_host2dev (struct gomp_device_descr *devi
>          }
>      }
>  }
> -  gomp_device_copy (devicep, devicep->host2dev_func, "dev", d, "host", h, sz);
> +  if (aq)

Can you please use __builtin_expect (aq != NULL, 0) here?  A ptr != NULL
test is by default predicted as more likely than ptr == NULL, but the
gomp_device_copy call is the path taken for all of OpenMP and for OpenACC
except async, so that call is the more likely one.

> +    goacc_device_copy_async (devicep, devicep->openacc.async.host2dev_func,
> +                             "dev", d, "host", h, sz, aq);
> +  else
> +    gomp_device_copy (devicep, devicep->host2dev_func, "dev", d, "host", h, sz);
>  }
>
> -static void
> +attribute_hidden void
>  gomp_copy_dev2host (struct gomp_device_descr *devicep,
> +                    struct goacc_asyncqueue *aq,
>                      void *h, const void *d, size_t sz)
>  {
> -  gomp_device_copy (devicep, devicep->dev2host_func, "host", h, "dev", d, sz);
> +  if (aq)

Likewise.

	Jakub
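
To illustrate the two requests above (the queue argument last, and the async
branch marked as unlikely), here is a minimal standalone sketch.  It is not
libgomp code: struct async_queue, copy_sync, copy_async and copy_host2dev are
hypothetical stand-ins, and the "async" stub just copies immediately.  It
assumes a compiler that provides __builtin_expect (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct goacc_asyncqueue; not the real type.  */
struct async_queue { int pending; };

/* Counters so a caller can observe which path was taken.  */
static int sync_copies, async_copies;

/* Stand-in for the synchronous copy path (the common case).  */
static void
copy_sync (void *d, const void *h, size_t sz)
{
  memcpy (d, h, sz);
  sync_copies++;
}

/* Stand-in for the asynchronous path.  A real implementation would
   enqueue the copy on AQ; this stub copies right away and only
   records that something was queued.  */
static void
copy_async (void *d, const void *h, size_t sz, struct async_queue *aq)
{
  memcpy (d, h, sz);
  aq->pending++;
  async_copies++;
}

/* AQ is the last parameter, so the frequently used arguments keep
   their positions in the argument registers.  */
void
copy_host2dev (void *d, const void *h, size_t sz, struct async_queue *aq)
{
  assert (d != NULL && h != NULL);
  /* Hint that a non-NULL queue is the unlikely case, overriding the
     default "pointer is non-NULL" prediction.  */
  if (__builtin_expect (aq != NULL, 0))
    copy_async (d, h, sz, aq);
  else
    copy_sync (d, h, sz);
}
```

Passing NULL for the queue takes the synchronous path; passing a queue takes
the asynchronous one.  The hint only affects code layout and static branch
prediction, not behavior.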