[RFC][WIP Patch] OpenMP map with iterator + Fortran OpenMP deep mapping / custom allocator (+ Fortran co_reduce)
This is an RFC/WIP patch about:

(A) OpenMP (C/C++/Fortran)
    omp target map(iterator(i=n:m),to : x(i))

(B) Fortran:
    (1) omp target map(to : dt_var, class_var)
    (2) omp parallel allocator(my_alloc) firstprivate(class_var)
    (3) call co_reduce(dt_coarray, my_func)

The problem with (A) is that the number of iterations is not countable
at compile time, so the entries cannot easily be added to the array
used to call GOMP_target_ext.

The problem with (B) is that dt_var can have allocatable components,
which complicates matters. With recursive types, the number of
elements is not known at compile time, nor is it with polymorphic types,
as it depends on the recursion depth and the dynamic type, respectively.

Comments/questions/remarks ... to the proposal below?
Regarding mapping, I currently have no idea how to handle the
virtual table. Thoughts?

* * *

The idea for OpenMP mapping is a callback function - such that

  integer function f() result(ires)
    implicit none
    integer :: a
    !$omp target map(iterator(i=1:5), to: a)
    !$omp end target
    ires = 7
  end

becomes

  #pragma omp target map(iterator(integer(kind=4) i=1:5:1):to:a)

and then during gimplify:

  #pragma omp target num_teams(1) thread_limit(0) map(map_function:f_._omp_mapfn.0 [len: 0])

with

  unsigned long f_._omp_mapfn.0 (unsigned long (*) (void *) cb_fn, void * token, void * base, unsigned short flags)
  {
    ...

with the loop around the cb_fn call and flag = GOMP_MAP_TO.

(Not fully working yet. The ME part still needs to generate the loop,
similar to depend or affinity. For C/C++, the basic parsing is done
but some more code changes are needed in the FE.)
* * *

Fortran - with an OpenMP example:

  module m
    implicit none (type, external)
    type t3
    end type t3
    type t
      class(t3), allocatable :: cx
      type(t3), pointer :: ptx
    end type t
  end module m

  use m
  implicit none (type, external)
  class(t), allocatable :: var
  !$omp target map(to:var)
    if (allocated(var)) stop 1
  !$omp end target
  end

The idea is that this becomes:

  #pragma omp target map(to:var) map(map_function:var._vptr->_callback [len: 1]) map(to:var [len: 0])

That's:
* 'var' is first normally mapped
* Then the map function is added, which gets 'var' as argument

(For an array, I plan to add an internal function which calls the
callback function in a scalarization loop.)

On the Fortran side - this requires a new entry in the vtable
(*ABI breakage*), which points to:

  integer(kind=8) __callback_m_T (
      integer(kind=8) (*) (void *, void *, integer(kind=8), void (*) (void), integer(kind=2)) cb,
      void * token, struct t & restrict scalar, integer(kind=4) f_flags)
  {
    __result___callback_m_T = 0;
    if (scalar->cx._data != 0B)
      {
        void * D.4384;

        D.4384 = (void *) scalar->cx._data;
        __result___callback_m_T = cb (token, D.4384, scalar->cx._vptr->_size, 0B, 0) + __result___callback_m_T;
        __result___callback_m_T = cb (token, *scalar->cx._data, 0, *scalar->cx._vptr->_callback, 0) + __result___callback_m_T;
      }
    if (scalar->ptx != 0B)
      {
        void * D.4386;

        D.4386 = (void *) scalar->ptx;
        __result___callback_m_T = cb (token, D.4386, 0, 0B, 0) + __result___callback_m_T;
      }
    return __result___callback_m_T;
  }

That is:
* For pointers, the CB is called with SIZE = 0, permitting the caller
  to remap the pointer - or to ignore the callback call.
* For allocatables, it passes the SIZE, permitting the caller to map
  the allocatable.
* If the allocatable is a CLASS or has allocatable components, cb is
  called with a callback function (and SIZE = 0), so that those can
  be mapped as well.

(The GOMP_MAP_TO needs to be handled by libgomp, e.g. by putting
it into the void *token.)
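The callback protocol described above can be sketched in plain C. This is only an illustration under simplifying assumptions, not the code gfortran would generate: struct t3 gets a made-up payload so that it has a size, the callback type map_cb takes just (token, address, size), and the nested-callback and flags arguments of the real signature are left out. The names deep_map_t, byte_counter and count_bytes_cb are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Fortran types above.  t3 is empty in the
   Fortran source; it gets a payload here so that it has a size.  */
struct t3 { int payload; };

struct t {
  struct t3 *cx;   /* models the allocatable component 'cx' */
  struct t3 *ptx;  /* models the pointer component 'ptx' */
};

/* Callback type assumed for this sketch: (token, address, size) -> count.  */
typedef long (*map_cb) (void *token, void *addr, size_t size);

/* Models the per-type callback: report every component that needs handling
   and sum up what the callback returns.  */
static long
deep_map_t (map_cb cb, void *token, struct t *scalar)
{
  long n = 0;
  assert (cb != NULL && scalar != NULL);
  if (scalar->cx != NULL)
    /* Allocatable: pass the size so that the caller can map the data.  */
    n += cb (token, scalar->cx, sizeof *scalar->cx);
  if (scalar->ptx != NULL)
    /* Pointer: size 0, the caller may remap it or ignore the call.  */
    n += cb (token, scalar->ptx, 0);
  return n;
}

/* Example consumer: the token carries state, here a running byte count.  */
struct byte_counter { size_t bytes; };

static long
count_bytes_cb (void *token, void *addr, size_t size)
{
  struct byte_counter *c = token;
  (void) addr;        /* the address is not needed for counting */
  c->bytes += size;
  return 1;           /* one entry handled */
}
```

The token is what lets one callback serve several users: a mapping runtime could store the map kind in it, while an allocator or CO_REDUCE implementation could store its own state.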
The vtable's callback function can then also be used with
* OpenMP ALLOCATOR, or
* for deep copying with CO_REDUCE.

Question: Does this way of passing make sense or not?
Comments?

Tobias

PS: The patch has a lot of the pieces in place, but still lacks both some
glue code and some other bits. :-/

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

 gcc/c/c-parser.c              |  69 -
 gcc/cp/parser.c               |  70 +++--
 gcc/fortran/class.c           | 351 ++
 gcc/fortran/dump-parse-tree.c |  14 +-
 gcc/fortran/gfortran.h        |   1 +
 gcc/fortran/intrinsic.c       |   2 +-
 gcc/fortran/module.c          |   9 +-
 gcc/fortran/openmp.c          |  41 -
 gcc/fortran/resolve.c         |   2 +-
 gcc/fortran/trans-expr.c      |   5 +
 gcc/fortran/trans-intrinsic.c |   3 +-
 gcc/fortran/trans-openmp.c    |  59 ++-
 gcc/fortran/trans.h           |   1 +
 gcc/gimplify.c
Re: [RFC][WIP Patch] OpenMP map with iterator + Fortran OpenMP deep mapping / custom allocator (+ Fortran co_reduce)
On Mon, Dec 06, 2021 at 03:00:30PM +0100, Tobias Burnus wrote:
> This is a RFC/WIP patch about:
>
> (A) OpenMP (C/C++/Fortran)
>    omp target map(iterator(i=n:m),to : x(i))
>
> (B) Fortran:
> (1) omp target map(to : dt_var, class_var)
> (2) omp parallel allocator(my_alloc) firstprivate(class_var)
> (3) call co_reduce(dt_coarray, my_func)
>
> The problem with (A) is that there is not a compile-time countable
> number of iterations such that it cannot be easily add to the array
> used to call GOMP_target_ext.
>
> The problem with (B) is that dt_var can have allocatable components
> which complicates stuff and with recursive types, the number of
> elements it not known at compile time - not with polymorphic types
> as it depends on the recursion depth and dynamic type, respectively.

I think there is no reason why the 3 arrays passed to GOMP_target_ext
(etc., for target data {, enter, exit} too and because this affects
to and from clauses as well, target update as well) need to be constant
size. We can allocate them as VLA or from heap as well.
I guess the only complication for using __builtin_alloca_with_align
would be target data, where the construct body could be using alloca
and we wouldn't want to silently free those allocas at the end of the
construct, though I bet we already have that problem whenever we privatize
some variable length variables on constructs that don't result in
outlined body into a new function, and outlining a body into a new function
will also break alloca across the boundaries.

We do a lot of sorting of the map clauses especially during gimplification,
one question is whether it is ok to sort the whole map clause with iterator
as one clause, or if we'd need to do the sorting at runtime.
With arbitrary lvalue expressions, the clauses with iterator don't need
to be just
  map(iterator(i=0:n),to : x[i])
but can be e.g.
  map(iterator(i=0:n), tofrom : i == 0 ? a : i == 1 ? b : c[i - 2])
etc. (at least in C++, in C I think ?: doesn't give lvalues), or
  *(i == 0 ? &a : i == 1 ? &b : &c[i - 2])
otherwise, though I hope that is ok, it isn't much different from such
lvalue expressions when i isn't an iterator but say function parameter
or some other variable, I think we only map value in that case and don't
really remap the vars etc. (but sure, for
  map(iterator(i=0:n), to : foo(i).a[i].b[i])
we should follow the rules for []s and '.').

So, I wouldn't be really afraid of going into dynamic allocation of the
arrays if the count isn't compile time constant.

Another thing is that it would be nice to optimize some most common cases
where some mappings could be described in more compact ways, and that
wouldn't be solely about iterator clause, but also when we start properly
implementing all the mapping nastiness of 5.0 and beyond, like mapping of
references, or the declare mapper stuff etc.
So if we come up with something like array descriptors Fortran has to
describe mapping of some possibly non-contiguous multidimensional array
with strides etc. in a single map element, it will be nice, but I'd
prefer not to outline complex expressions from map's clause as separate
function each, it can use many variables etc. from the parent function
and calling those as callbacks would be too ugly.

	Jakub
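The point about not needing constant-size arrays can be illustrated with a minimal C sketch that expands map(iterator(i=0:n), to : x[i]) into heap-allocated arrays at run time. It assumes nothing about how GCC or libgomp would actually lay this out: struct map_arrays, its field names, expand_iterator_map, free_map_arrays and the kind value SKETCH_MAP_TO are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container for the three parallel arrays (addresses, sizes,
   kinds).  The layout is made up for this sketch.  */
struct map_arrays {
  size_t count;
  void **hostaddrs;
  size_t *sizes;
  unsigned short *kinds;
};

enum { SKETCH_MAP_TO = 1 };  /* placeholder value, not libgomp's GOMP_MAP_TO */

/* Release the arrays; safe to call after a failed expansion.  */
static void
free_map_arrays (struct map_arrays *m)
{
  free (m->hostaddrs);
  free (m->sizes);
  free (m->kinds);
  m->hostaddrs = NULL;
  m->sizes = NULL;
  m->kinds = NULL;
  m->count = 0;
}

/* Expand 'map(iterator(i=0:n), to : x[i])' into n entries at run time.
   Returns 0 on success and -1 if an allocation fails.  */
static int
expand_iterator_map (struct map_arrays *m, double *x, size_t n)
{
  size_t slots = n ? n : 1;  /* avoid zero-sized allocations */

  assert (m != NULL);
  m->count = 0;
  m->hostaddrs = calloc (slots, sizeof *m->hostaddrs);
  m->sizes = calloc (slots, sizeof *m->sizes);
  m->kinds = calloc (slots, sizeof *m->kinds);
  if (m->hostaddrs == NULL || m->sizes == NULL || m->kinds == NULL)
    {
      free_map_arrays (m);
      return -1;
    }

  /* The runtime loop that a constant-size array cannot express.  */
  for (size_t i = 0; i < n; i++)
    {
      m->hostaddrs[i] = &x[i];
      m->sizes[i] = sizeof x[i];
      m->kinds[i] = SKETCH_MAP_TO;
    }
  m->count = n;
  return 0;
}
```

The loop body stays in the function that owns x, so an arbitrary lvalue expression in place of &x[i] would still see all local variables without any outlining.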
Re: [RFC][WIP Patch] OpenMP map with iterator + Fortran OpenMP deep mapping / custom allocator (+ Fortran co_reduce)
On 06.12.21 16:16, Jakub Jelinek wrote:
> I think there is no reason why the 3 arrays passed to GOMP_target_ext
> (etc., for target data {, enter, exit} too and because this affects
> to and from clauses as well, target update as well) need to be constant
> size.
>
> We do a lot of sorting of the map clauses especially during gimplification,
> one question is whether it is ok to sort the whole map clause with iterator
> as one clause, or if we'd need to do the sorting at runtime.

Regarding sorting at runtime: It looks as if Julian's patches at
  [PATCH 00/16] OpenMP: lvalues in "map" clauses and struct handling rework
can do without run-time sorting.

Regarding the sorting and iterators: I think we already have this problem
intrinsically – for depend/affinity, we create for (iterator(...) : a, b)
a single loop - also to have a consistency with regards to the array bounds.

But if we want to put 'd' between 'a' and 'b' - we either need to split
the loop - or 'd' cannot be put between 'a' and 'b'. That's a fundamental
issue. I am not sure whether that's a real issue as all have the same map
type, but still.

> but I'd
> prefer not to outline complex expressions from map's clause as separate
> function each, it can use many variables etc. from the parent function
> and calling those as callbacks would be too ugly.

I concur that it would be useful to avoid using callbacks; it seems
as if it can be avoided for iterators. I am not sure how well, but okay.

But I have no idea how to avoid callbacks for allocatable components in
Fortran. For

  type t
    type(t), allocatable :: a
  end t
  type(t) :: var

(recursive type) - it is at least semi-known at compile time:
  e = var;
  while (e)
    { map(e); e = e->a; }
I am not sure how to pass this on to the middle end - but
code for it can be generated.

But as soon as polymorphism comes into play, I do not see how
callbacks can be avoided. Like for:
  class(t) :: var2
Here, it is known at compile time that var2%a exists (recursively).
But the dynamic type might additionally have var2%b(:) which in turn
might have var2%b(:)%c.

I see two places for calling the callback: Either by passing the Fortran
callback function on to libgomp or by generating the function
call handling inside omp-low.c - to populate a nonconstant array.

Which solution do you prefer?

Tobias
Re: [RFC][WIP Patch] OpenMP map with iterator + Fortran OpenMP deep mapping / custom allocator (+ Fortran co_reduce)
On Mon, Dec 06, 2021 at 05:06:10PM +0100, Tobias Burnus wrote:
> Regarding the sorting and iterators: I think we already have this problem
> intrinsically – for depend/affinity, we create for (iterator(...) : a, b)
> a single loop - also to have a consistency with regards to the array bounds.

depend and affinity don't need to sort anything, we ignore affinity
altogether, depend is just an unordered list of (from what we care about)
addresses with the kinds next to them, it can contain duplicates etc.
(and affinity if we implemented it can too).

> But if we want to put 'd' between 'a' and 'b' - we either need to split
> the loop - or 'd' cannot be put between 'a' and 'b'. That's a fundamental
> issue. I am not sure whether that's a real issue as all have the same map
> type, but still.
>
> > but I'd
> > prefer not to outline complex expressions from map's clause as separate
> > function each, it can use many variables etc. from the parent function
> > and calling those as callbacks would be too ugly.
>
> I concur that it would be useful to avoid using callbacks; it it seems
> as if it can be avoided for iterators. I am not sure how well, but okay.
>
> But I have no idea how to avoid callbacks for allocatable components in
> Fortran. For
>
> type t
>   type(t), allocatable :: a
> end t
> type(t) :: var
>
> (recursive type) - it is at least semi-known at compile time:
>   e = var;
>   while (e)
>    { map(e); e = e->a; }
> I am not sure how to pass this on to the middle end - but
> code for it can be generated.

I bet we'd need to add a target hook for that, but other than that, I don't
see why we'd need a callback at runtime.
Let a target hook in first phase compute how many slots in the 3 arrays
will be needed, then let's allocate the 3 arrays, fill in the static
parts in there and when filling such maps follow the target hook to
emit inline code that fills in those extra mappings.
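The count-then-fill scheme described above can be sketched in C for the recursive-allocatable case. This is a hedged illustration only: struct node models 'type t' with its allocatable component 'a', and count_chain, fill_chain and build_maps are hypothetical names. In the proposal this code would be emitted inline by the compiler rather than exist as separate functions, and only two of the three arrays are modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Models 'type t; type(t), allocatable :: a; end type': a chain of nodes.  */
struct node { struct node *a; };

/* Phase 1: count how many extra slots the chain starting at E needs.  */
static size_t
count_chain (const struct node *e)
{
  size_t n = 0;
  for (; e != NULL; e = e->a)
    n++;
  return n;
}

/* Phase 2: fill addrs[]/sizes[] from index FIRST on and return the next
   free index.  */
static size_t
fill_chain (struct node *e, void **addrs, size_t *sizes, size_t first)
{
  size_t i = first;
  for (; e != NULL; e = e->a, i++)
    {
      addrs[i] = e;
      sizes[i] = sizeof *e;
    }
  return i;
}

/* Allocate room for N_STATIC statically known entries plus the chain, then
   fill in the chain part.  Returns the total slot count, or 0 if an
   allocation fails.  Slots [0, N_STATIC) are left zeroed for the caller.  */
static size_t
build_maps (struct node *var, size_t n_static,
            void ***addrs_out, size_t **sizes_out)
{
  size_t total = n_static + count_chain (var);
  void **addrs = calloc (total ? total : 1, sizeof *addrs);
  size_t *sizes = calloc (total ? total : 1, sizeof *sizes);

  if (addrs == NULL || sizes == NULL)
    {
      free (addrs);
      free (sizes);
      *addrs_out = NULL;
      *sizes_out = NULL;
      return 0;
    }

  size_t filled = fill_chain (var, addrs, sizes, n_static);
  assert (filled == total);  /* both phases must agree on the count */
  (void) filled;

  *addrs_out = addrs;
  *sizes_out = sizes;
  return total;
}
```

Both phases walk the same chain, so the data must not change between counting and filling; the assertion documents that assumption.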
Note, I think it might be better to do declare mapper support before
doing the recursive allocatables or Fortran polymorphism, because
it will necessarily be affected by declare mapper at each level too.

But generally, I don't see why whatever you want to do with a callback
couldn't be done by just emitting a runtime loop that does something
when filling the arrays. After all, we'll have such runtime loops even
for simple iterator unless we optimize those as an array descriptor,
  map(iterator(i=0:n), to: *foo (i))
- in some way it is inlining what the callback would do at the
GOMP_target_ext etc. caller, but it is actually the other way around,
callbacks would mean outlining what can be done in mere runtime loops
inside of the function that has all the vars etc. accessible there.

	Jakub
[patch, Fortran] IEEE support for aarch64-apple-darwin
Hi everyone,

Since support for target aarch64-apple-darwin has been submitted for review, it’s time to submit the Fortran part, i.e. enabling IEEE support on that target.

The patch has been in use now for several months, in a developer branch shipped by some distros on macOS (including Homebrew). It was authored more than a year ago, but I figured it wasn’t relevant to submit until the target was actually close to being in trunk:
https://github.com/iains/gcc-darwin-arm64/commit/b107973550d3d9a9ce9acc751adbbe2171d13736

Bootstrapped and tested on aarch64-apple-darwin20 (macOS Big Sur) and aarch64-apple-darwin21 (macOS Monterey).

OK to merge? Can someone point me to the right way of formatting ChangeLogs and commit entries, nowadays?

Thanks,
FX

Attachment: b107973550d3d9a9ce9acc751adbbe2171d13736.patch (Description: Binary data)
[patch, power-ieee128, committed] First stab at the library
Hi,

here is a first stab at the library side of power-ieee128, committed to
the branch. It compiles, but probably still has a lot of issues. It is
also not called from the compiler yet.

Regards

        Thomas

2021-10-19  Thomas Koenig

Prepare library for REAL(KIND=17).

This prepares the library side for REAL(KIND=17).  It is not yet
tested, but at least compiles cleanly on POWER 9 and x86_64.

fixincludes/ChangeLog:

        * configure: Regenerate.
        * fixincl.x: Regenerate.

intl/ChangeLog:

        * aclocal.m4: Regenerate.
        * configure: Regenerate.

libatomic/ChangeLog:

        * Makefile.in: Regenerate.
        * configure: Regenerate.
        * testsuite/Makefile.in:

libcc1/ChangeLog:

        * Makefile.in: Regenerate.
        * configure: Regenerate.

libdecnumber/ChangeLog:

        * configure: Regenerate.

libgcc/ChangeLog:

        * configure: Regenerate.

libgfortran/ChangeLog:

        * Makefile.am: Add _r17 and _c17 files.  Build them
        with -mabi=ieeelongdouble on POWER.
        * Makefile.in: Regenerate.
        * configure: Regenerate.
        * configure.ac: New flag HAVE_REAL_17.
        * kinds-override.h: (HAVE_GFC_REAL_17): New macro.
        (HAVE_GFC_COMPLEX_17): New macro.
        (GFC_REAL_17_HUGE): New macro.
        (GFC_REAL_17_LITERAL_SUFFIX): New macro.
        (GFC_REAL_17_LITERAL): New macro.
        (GFC_REAL_17_DIGITS): New macro.
        (GFC_REAL_17_RADIX): New macro.
        * libgfortran.h (POWER_IEEE128): New macro.
        (gfc_array_r17): Typedef.
        (GFC_DTYPE_REAL_17): New macro.
        (GFC_DTYPE_COMPLEX_17): New macro.
        (__acoshieee128): Prototype.
        (__acosieee128): Prototype.
        (__asinhieee128): Prototype.
        (__asinieee128): Prototype.
        (__atan2ieee128): Prototype.
        (__atanhieee128): Prototype.
        (__atanieee128): Prototype.
        (__coshieee128): Prototype.
        (__cosieee128): Prototype.
        (__erfieee128): Prototype.
        (__expieee128): Prototype.
        (__fabsieee128): Prototype.
        (__jnieee128): Prototype.
        (__log10ieee128): Prototype.
        (__logieee128): Prototype.
        (__powieee128): Prototype.
        (__sinhieee128): Prototype.
        (__sinieee128): Prototype.
        (__sqrtieee128): Prototype.
        (__tanhieee128): Prototype.
        (__tanieee128): Prototype.
        (__ynieee128): Prototype.
        * m4/mtype.m4: Make a bit more readable.  Add KIND=17.
        * generated/_abs_c17.F90: New file.
        * generated/_abs_r17.F90: New file.
        * generated/_acos_r17.F90: New file.
        * generated/_acosh_r17.F90: New file.
        * generated/_aimag_c17.F90: New file.
        * generated/_aint_r17.F90: New file.
        * generated/_anint_r17.F90: New file.
        * generated/_asin_r17.F90: New file.
        * generated/_asinh_r17.F90: New file.
        * generated/_atan2_r17.F90: New file.
        * generated/_atan_r17.F90: New file.
        * generated/_atanh_r17.F90: New file.
        * generated/_conjg_c17.F90: New file.
        * generated/_cos_c17.F90: New file.
        * generated/_cos_r17.F90: New file.
        * generated/_cosh_r17.F90: New file.
        * generated/_dim_r17.F90: New file.
        * generated/_exp_c17.F90: New file.
        * generated/_exp_r17.F90: New file.
        * generated/_log10_r17.F90: New file.
        * generated/_log_c17.F90: New file.
        * generated/_log_r17.F90: New file.
        * generated/_mod_r17.F90: New file.
        * generated/_sign_r17.F90: New file.
        * generated/_sin_c17.F90: New file.
        * generated/_sin_r17.F90: New file.
        * generated/_sinh_r17.F90: New file.
        * generated/_sqrt_c17.F90: New file.
        * generated/_sqrt_r17.F90: New file.
        * generated/_tan_r17.F90: New file.
        * generated/_tanh_r17.F90: New file.
        * generated/bessel_r17.c: New file.
        * generated/cshift0_c17.c: New file.
        * generated/cshift0_r17.c: New file.
        * generated/cshift1_16_c17.c: New file.
        * generated/cshift1_16_r17.c: New file.
        * generated/cshift1_4_c17.c: New file.
        * generated/cshift1_4_r17.c: New file.
        * generated/cshift1_8_c17.c: New file.
        * generated/cshift1_8_r17.c
[PATCH] PR fortran/103591 - ICE in gfc_compare_string, at fortran/arith.c:1119
Dear all,

we didn't check the type of the upper bound in a case range.
Bummer. Simply add a corresponding check.

Regtested on x86_64-pc-linux-gnu. OK for mainline?

Thanks,
Harald

From b4e7aeae4f6c59d8fe950d7981832e3f9c6a8f0e Mon Sep 17 00:00:00 2001
From: Harald Anlauf
Date: Mon, 6 Dec 2021 23:15:11 +0100
Subject: [PATCH] Fortran: add check for type of upper bound in case range

gcc/fortran/ChangeLog:

        PR fortran/103591
        * match.c (match_case_selector): Check type of upper bound in case
        range.

gcc/testsuite/ChangeLog:

        PR fortran/103591
        * gfortran.dg/select_9.f90: New test.
---
 gcc/fortran/match.c                    |  9 +
 gcc/testsuite/gfortran.dg/select_9.f90 | 10 ++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/select_9.f90

diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index 2bf21434a42..52bc5af7542 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -6075,6 +6075,15 @@ match_case_selector (gfc_case **cp)
           m = gfc_match_init_expr (&c->high);
           if (m == MATCH_ERROR)
             goto cleanup;
+          if (m == MATCH_YES
+              && c->high->ts.type != BT_LOGICAL
+              && c->high->ts.type != BT_INTEGER
+              && c->high->ts.type != BT_CHARACTER)
+            {
+              gfc_error ("Expression in CASE selector at %L cannot be %s",
+                         &c->high->where, gfc_typename (c->high));
+              goto cleanup;
+            }
           /* MATCH_NO is fine.  It's OK if nothing is there!  */
         }
     }
diff --git a/gcc/testsuite/gfortran.dg/select_9.f90 b/gcc/testsuite/gfortran.dg/select_9.f90
new file mode 100644
index 000..c580e8162bd
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/select_9.f90
@@ -0,0 +1,10 @@
+! { dg-do compile }
+! PR fortran/103591 - ICE in gfc_compare_string
+! Contributed by G.Steinmetz
+
+program p
+  integer :: n
+  select case (n)
+  case ('1':2.)  ! { dg-error "cannot be REAL" }
+  end select
+end
-- 
2.26.2
Re: [power-ieee128] What should the math functions be annotated with?
On Sun, Dec 05, 2021 at 12:16:38PM +0100, Thomas Koenig wrote:
>
> On 05.12.21 01:35, Peter Bergner wrote:
> > Instead of setting LD_LIBRARY_PATH=/home/tkoenig/lib64 could you try
> > setting it to LD_LIBRARY_PATH='$ORIGIN/lib64' instead? This would
> > allow the other system binaries to not find your /home/tkoenig/lib64
> > directory so they'd behave normally. However, any binary that was
> > compiled in a directory where your lib64/ exists would find your
> > new libs and use them. I'm not sure if that cramps your testing
> > or not, to limit yourself to compiling your tests in that one directory.
> >
> > If that doesn't work, could you instead not set LD_LIBRARY_PATH and
> > instead compile using -L/home/bergner/lib64 -R/home/bergner/lib64 ?
>
> I think I shall forsake the dubious joys of dynamic linking and use
> -static-libgfortran instead.

Yes, I tend to use -static-libgfortran for running Fortran spec things, and
-static-libstdc++ for C++, since it can be a quagmire getting the right library
when you have several libraries on the system.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
[PATCH] PR fortran/103588 - ICE: Simplification error in gfc_ref_dimen_size, at fortran/array.c:2407
Dear all,

using a bad expression as stride in a subsequent array section led
to an internal error, which was invoked directly after the failure.
We had better return a failure code and let error recovery do its job.

Regtested on x86_64-pc-linux-gnu. OK for mainline?

Thanks,
Harald

From 1487d327b13b45acca79c0c691a748ca1a50bc04 Mon Sep 17 00:00:00 2001
From: Harald Anlauf
Date: Mon, 6 Dec 2021 23:34:17 +0100
Subject: [PATCH] Fortran: catch failed simplification of bad stride expression

gcc/fortran/ChangeLog:

        PR fortran/103588
        * array.c (gfc_ref_dimen_size): Do not generate internal error on
        failed simplification of stride expression; just return failure.

gcc/testsuite/ChangeLog:

        PR fortran/103588
        * gfortran.dg/pr103588.f90: New test.
---
 gcc/fortran/array.c                    | 8 +++-
 gcc/testsuite/gfortran.dg/pr103588.f90 | 8 
 2 files changed, 11 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr103588.f90

diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c
index 5762c8d92d4..5f9ed17f919 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -2403,11 +2403,9 @@ gfc_ref_dimen_size (gfc_array_ref *ar, int dimen, mpz_t *result, mpz_t *end)
         {
           stride_expr = gfc_copy_expr(ar->stride[dimen]);
 
-          if(!gfc_simplify_expr(stride_expr, 1))
-            gfc_internal_error("Simplification error");
-
-          if (stride_expr->expr_type != EXPR_CONSTANT
-              || mpz_cmp_ui (stride_expr->value.integer, 0) == 0)
+          if (!gfc_simplify_expr (stride_expr, 1)
+             || stride_expr->expr_type != EXPR_CONSTANT
+             || mpz_cmp_ui (stride_expr->value.integer, 0) == 0)
             {
               mpz_clear (stride);
               return false;
diff --git a/gcc/testsuite/gfortran.dg/pr103588.f90 b/gcc/testsuite/gfortran.dg/pr103588.f90
new file mode 100644
index 000..198e1766cd2
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr103588.f90
@@ -0,0 +1,8 @@
+! { dg-do compile }
+! PR fortran/103588 - ICE: Simplification error in gfc_ref_dimen_size
+! Contributed by G.Steinmetz
+
+program p
+  integer, parameter :: a(:) = [1,2] ! { dg-error "cannot be automatic or of deferred shape" }
+  integer :: b(2) = a(::a(1))        ! { dg-error "Invalid" }
+end
-- 
2.26.2
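The shape of the fix, folding the fallible step into the same condition that already returns failure, can be shown with a small standalone C sketch. It uses a made-up miniature expression type rather than gfortran's gfc_expr; struct expr, simplify and get_stride are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of an expression node; not gfortran's gfc_expr.  */
struct expr {
  bool is_constant;   /* true once the expression reduced to a constant */
  long value;         /* only meaningful if is_constant */
  bool simplifiable;  /* lets the sketch simulate a failing simplifier */
};

/* Stand-in for a simplifier that can fail on invalid input.  */
static bool
simplify (struct expr *e)
{
  return e->simplifiable;
}

/* Returns true and stores the stride on success.  On any failure it returns
   false, so the caller's error recovery continues instead of the program
   aborting with an internal error.  */
static bool
get_stride (struct expr *e, long *stride)
{
  assert (e != NULL && stride != NULL);
  if (!simplify (e)        /* simplification failed: stop here */
      || !e->is_constant   /* only evaluated if simplification succeeded */
      || e->value == 0)    /* a zero stride is never valid */
    return false;
  *stride = e->value;
  return true;
}
```

Because || short-circuits, the later checks never look at an expression whose simplification failed, which is what makes merging the conditions safe.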