Hi Thomas,
actually the current version of the acc_get/set_default_async patch is
combined into:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01426.html
This patch you're referring here was a version from early 2017.
I'll try to reply to the still applying comments here below.
On 2018/11/18 10:36 AM, Thomas Schwinge wrote:
--- include/gomp-constants.h (revision 245382)
+++ include/gomp-constants.h (working copy)
/* Asynchronous behavior. Keep in sync with
libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t. */
+#define GOMP_ASYNC_DEFAULT 0
#define GOMP_ASYNC_NOVAL -1
#define GOMP_ASYNC_SYNC -2
This means that "acc_set_default_async(acc_async_default)" will set
acc-default-async-var to "0", that is, the same as
"acc_set_default_async(0)". It thus follows that
"async"/"async(acc_async_noval)" is the same as "async(0)". Is that
intentional?
It is in line with the OpenACC 2.5 specification: "initial value [...] is
implementation defined", but I wonder why map it to "async(0)", and not
to its own, unspecified, but separate queue. In the latter case,
"acc_async_default" etc. would then map to a negative value to denote
this unspecified, but separate queue (and your changes would need to be
adapted for that).
I have not verified whether we're currently already having (on trunk
and/or openacc-gcc-8-branch) the semantics of the queue of
"async(acc_async_noval)" mapping to the same queue as "async(0)"?
As long as the thr->default_async variable == 0 (as it is initially)
then async(acc_async_noval) maps to async(0).
I'm fine to accept your changes as proposed (basically, everthing from
your patch posted that has a "default_async" in it), for that's an
incremental improvement anyway. But -- unless you tell me I've
misunderstood something -- I'll get the issue I raised clarified with the
OpenACC technical committee, and we will then later improve this further.
No matter what the outcome, the implementation-defined behavior should be
documented. (Can do that once we get the intentions clarified.)
Well, the current submitted implementation of async queues manages an array of
them
for each thread. So the intuitive default queue is the first (index 0), and
to support reverting to default when accepting 'acc_async_default' as an
argument,
defining acc_async_default == 0 is the logical choice.
The 'default' async is not symbolically a specific queue, it is simply a
thread-local
variable for what is referred by default when 'acc_async_noval' is encountered.
From that sense, initializing it as some negative integer doesn't make sense
Of course, if really desired, we implement the "default default" to be an
alternative
queue separate from the non-negative queue space, but I feel this is overkill.
+void
+acc_set_default_async (int async)
+{
+ if (async < acc_async_sync)
+ gomp_fatal ("invalid async argument: %d", async);
(This will nowadays use "async_valid_stream_id_p" or some such.)
Okay, I'll revise this part if needed. Although I am not sure if such a
specific check is really needed in the new async code, because most
(if not all) checking is centralized when indexing into the goacc_asyncqueue
array (and NULL is returned if error).
+ thr->default_async = async;
+}
--- libgomp/oacc-plugin.c (revision 245382)
+++ libgomp/oacc-plugin.c (working copy)
+/* Return the default async number from the TLS data for the current thread.
*/
+
+int
+GOMP_PLUGIN_acc_thread_default_async (void)
+{
+ struct goacc_thread *thr = goacc_thread ();
+ return thr ? thr->default_async : acc_async_default;
+}
As I understand, the need for this function will disappear with your
later "async re-work" changes, so OK as posted, but I wondered in which
cases we would not have a valid "goacc_thread" when coming here? (Might
again related to the "goacc_lazy_initialize" issue mentioned above.)
Yes, this thing was deleted in the final upstream async rework submission.
Any further questions need not apply, the new way is entirely different.
This plugin routine was kind of like an artifact of inappropriate layering of
logic :)
--- libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-2.c (revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-2.c (revision 0)
@@ -0,0 +1,904 @@
+[...]
+ acc_set_default_async (s);
+[...]
This is the one single test case using this functionality, but it only
verifies "correct results', but doesn't observe the actual queues (for
example, CUDA streams) being used.
Need test cases for "acc_get_default_async", too, and also Fortran ones.
We'll supplement more testcases later.
Generally, I envision test cases running a few "acc_get_cuda_stream"
calls with relevant argument values, to see whether the expected
queues/streames are being used. (Similar for other offload targets.)
But I suppose we might again need to get clarified whether
"acc_get_cuda_stream(acc_async_sync)",
"acc_get_cuda_stream(acc_async_noval)", or
"acc_get_cuda_stream(acc_async_default)" are actually valid calls (given
that these argument values are not valid "async value"s), and these would
then return the respective CUDA stream handles, different from the one
returned for "acc_get_cuda_stream(0)" etc.
That said, we can certainly implement it that way, because that's not
against the specification.
I think the likely clarification we'll ever get on this is that it's
implementation defined :P
Thanks,
Chung-Lin