[patch, fortran] Fix PR 83540
Hello world,

this rather self-explanatory patch makes sure we don't get an error
using reallocation on assignment for inlining matmul when
we don't have reallocation on assignment.

Regression-tested.  OK for trunk?

Regards

	Thomas

2017-12-25  Thomas Koenig

	PR fortran/83540
	* frontend-passes.c (create_var): If an array to be created
	has unknown size and -fno-realloc-lhs is in effect,
	return NULL.

2017-12-25  Thomas Koenig

	PR fortran/83540
	* gfortran.dg/inline_matmul_20.f90: New test.

Index: frontend-passes.c
===================================================================
--- frontend-passes.c	(revision 255788)
+++ frontend-passes.c	(working copy)
@@ -720,6 +720,11 @@ create_var (gfc_expr * e, const char *vname)
 
   if (e->expr_type == EXPR_CONSTANT || is_fe_temp (e))
     return gfc_copy_expr (e);
 
+  /* Creation of an array of unknown size requires realloc on assignment.
+     If that is not possible, just return NULL.  */
+  if (flag_realloc_lhs == 0 && e->rank > 0 && e->shape == NULL)
+    return NULL;
+
   ns = insert_block ();
   if (vname)

! { dg-do run }
! { dg-additional-options "-fno-realloc-lhs -ffrontend-optimize" }
! This used to segfault at runtime.
! Original test case by Harald Anlauf.

program gfcbug142
  implicit none
  real, allocatable :: b(:,:)
  integer :: n = 5
  character(len=20) :: line
  allocate (b(n,n))
  call random_number (b)
  write (unit=line,fmt='(2I5)') shape (matmul (b, transpose (b)))
  if (line /= '    5    5') call abort
end program gfcbug142
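For readers who do not have frontend-passes.c at hand, the condition added to create_var can be read in isolation. The following is only a minimal sketch in plain C: struct expr_sketch, must_skip_temporary and demo_must_skip_temporary are hypothetical names invented for illustration, and the realloc_lhs parameter merely stands in for flag_realloc_lhs. It is not the gfortran data structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for gfc_expr: only the two fields
   that the new check inspects.  */
struct expr_sketch
{
  int rank;           /* 0 for scalars, > 0 for arrays.  */
  const long *shape;  /* Extents if known at compile time, else NULL.  */
};

/* Mirrors the condition from the patch: a temporary array of unknown
   shape can only be filled through reallocation on assignment, so with
   that feature switched off the caller has to give up (the patch
   returns NULL at this point).  REALLOC_LHS stands in for
   flag_realloc_lhs.  */
bool
must_skip_temporary (const struct expr_sketch *e, int realloc_lhs)
{
  return realloc_lhs == 0 && e->rank > 0 && e->shape == NULL;
}

/* Small self-check covering the four interesting combinations.  */
void
demo_must_skip_temporary (void)
{
  static const long extents[2] = { 5, 5 };
  struct expr_sketch known = { 2, extents };
  struct expr_sketch unknown = { 2, NULL };
  struct expr_sketch scalar = { 0, NULL };

  assert (must_skip_temporary (&unknown, 0));   /* -fno-realloc-lhs  */
  assert (!must_skip_temporary (&unknown, 1));  /* -frealloc-lhs  */
  assert (!must_skip_temporary (&known, 0));    /* shape is known  */
  assert (!must_skip_temporary (&scalar, 0));   /* not an array  */
}
```

Only the case "array, unknown shape, no reallocation on assignment" is rejected; every other combination proceeds as before.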
[PATCH, PR82391] Fold acc_on_device with const arg
Hi,

the openacc standard states: If the acc_on_device routine has a
compile-time constant argument, it evaluates at compile time to a constant.

The purpose of this is to remove non-applicable device-specific code during
compilation. In the case of asm insns which are device-specific, removal is
even needed to be able to compile for host.

When optimizing, the compiler complies with this requirement, through
gimple_fold_builtin_acc_on_device and following optimizations. But that
doesn't work at -O0.

Consequently, a test-case like for instance loop-auto-1.c that has
device-specific asm insns:
...
#pragma acc routine seq
static int __attribute__((noinline)) place ()
{
  int r = 0;

  if (acc_on_device (acc_device_nvidia))
    {
      int g = 0, w = 0, v = 0;

      __asm__ volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (g));
      __asm__ volatile ("mov.u32 %0,%%tid.y;" : "=r" (w));
      __asm__ volatile ("mov.u32 %0,%%tid.x;" : "=r" (v));
      r = (g << 16) | (w << 8) | v;
    }
  return r;
}
...
skips -O0:
...
/* This code uses nvptx inline assembly guarded with acc_on_device, which is
   not optimized away at -O0, and then confuses the target assembler.
   { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
...

This patch adds folding of acc_on_device with constant argument at -O0. This
folding is done by fold_builtin_acc_on_device_cst_arg during
pass_oacc_device_lower, which also propagates the folded value to its uses,
which allows TODO_cleanup_cfg to remove the dead code.

This solution works fine for C, but for C++ things are a bit more
complicated. In C, the 'int acc_on_device (acc_device_t)' maps onto the
'int __builtin_acc_on_device (int)', but for C++ that's not the case. The
current solution for that problem is an inline function in openacc.h, but at
-O0 that adds too much indirection to still be able to remove the dead code.

The easiest solution is:
...
#define acc_on_device(dev) __builtin_acc_on_device ((int)dev)
...
but that's not strictly compliant with the openacc standard, which requires
an openacc interface function 'int acc_on_device(acc_device_t)', not a macro.

So we end up with a kludge in oacc_xform_acc_on_device that maps the openacc
interface function acc_on_device onto the builtin function.

Bootstrapped and reg-tested on x86_64.

Built and reg-tested for x86_64 with nvptx accelerator.

OK for trunk?

Thanks,
- Tom

Fold acc_on_device with const arg

2017-12-22  Tom de Vries

	PR libgomp/82391
	* omp-offload.c (fold_builtin_acc_on_device_cst_arg)
	(oacc_xform_acc_on_device, oacc_device_lower_non_offloaded): New
	function.
	(execute_oacc_device_lower): Call oacc_device_lower_non_offloaded.
	Call oacc_xform_acc_on_device.

	* openacc.h [__cplusplus] (acc_on_device (int)): Remove.
	[__cplusplus] (acc_on_device (acc_device_t)): Remove definition, and
	declare instead with __builtin_acc_on_device attributes.
	* testsuite/libgomp.oacc-c-c++-common/acc-on-device-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Remove int
	casts from args of acc_on_device calls.
	* testsuite/libgomp.oacc-c-c++-common/gang-static-2.c: Remove skip
	for -O0.
	* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-v-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-wv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/tile-1.c: Same.

---
 gcc/omp-offload.c                                  | 121 -
 libgomp/openacc.h                                  |  14 +--
 .../libgomp.oacc-c-c++-common/acc-on-device-4.c    |  18 +++
 .../libgomp.oacc-c-c++-common/gang-static-2.c      |   3 -
 .../libgomp.oacc-c-c++-common/loop-auto-1.c        |   4 -
 .../libgomp.oacc-c-c++-common/loop-dim-default.c   |   3 -
 .../testsuite/libgomp.oacc-c-c++-common/loop-g-1.c |   4 -
 .../testsuite/libgomp.oacc-c-c++-common/loop-g-2.c |   4 -
 .../libgomp.oacc-c-c++-common/loop-gwv-1.c         |   4 -
 .../libgomp.oacc-
Re: [PATCH] sel-sched: fix zero-usefulness case in sel_rank_for_schedule (PR 83513)
On 25.12.2017 19:47, Alexander Monakov wrote:

Hello,

we need the following follow-up fix for priority comparison in
sel_rank_for_schedule as demonstrated by PR 83513.  Checked on
x86_64 by running a bootstrap and also checking for no regressions in

make -k check-gcc RUNTESTFLAGS="--target_board=unix/-fselective-scheduling/-fschedule-insns"

OK to apply?

Yes.

Andrey

	PR rtl-optimization/83513
	* sel-sched.c (sel_rank_for_schedule): Order by non-zero usefulness
	before priority comparison.

diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index c1be0136551..be3813717ba 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -3396,17 +3396,22 @@ sel_rank_for_schedule (const void *x, const void *y)
   else if (control_flow_insn_p (tmp2_insn) && !control_flow_insn_p (tmp_insn))
     return 1;
 
+  /* Prefer an expr with non-zero usefulness.  */
+  int u1 = EXPR_USEFULNESS (tmp), u2 = EXPR_USEFULNESS (tmp2);
+
+  if (u1 == 0)
+    {
+      if (u2 == 0)
+        u1 = u2 = 1;
+      else
+        return 1;
+    }
+  else if (u2 == 0)
+    return -1;
+
   /* Prefer an expr with greater priority.  */
-  if (EXPR_USEFULNESS (tmp) != 0 || EXPR_USEFULNESS (tmp2) != 0)
-    {
-      int p2 = EXPR_PRIORITY (tmp2) + EXPR_PRIORITY_ADJ (tmp2),
-          p1 = EXPR_PRIORITY (tmp) + EXPR_PRIORITY_ADJ (tmp);
-
-      val = p2 * EXPR_USEFULNESS (tmp2) - p1 * EXPR_USEFULNESS (tmp);
-    }
-  else
-    val = EXPR_PRIORITY (tmp2) - EXPR_PRIORITY (tmp)
-          + EXPR_PRIORITY_ADJ (tmp2) - EXPR_PRIORITY_ADJ (tmp);
+  val = (u2 * (EXPR_PRIORITY (tmp2) + EXPR_PRIORITY_ADJ (tmp2))
+         - u1 * (EXPR_PRIORITY (tmp) + EXPR_PRIORITY_ADJ (tmp)));
   if (val)
     return val;
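The ordering introduced by the patch can be tried outside the scheduler. What follows is a hedged, stand-alone sketch and not the code from sel-sched.c: struct sched_expr, rank_by_usefulness_and_priority and demo_rank are hypothetical names, the struct carries only the three values the comparison reads, and the array holds the structs directly instead of the expr_t pointers used in GCC.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the scheduler's expression record.  */
struct sched_expr
{
  int usefulness;    /* Plays the role of EXPR_USEFULNESS.  */
  int priority;      /* Plays the role of EXPR_PRIORITY.  */
  int priority_adj;  /* Plays the role of EXPR_PRIORITY_ADJ.  */
};

/* qsort-style comparator covering only the usefulness/priority step.
   A negative result means X sorts before Y, i.e. X is preferred.  */
int
rank_by_usefulness_and_priority (const void *x, const void *y)
{
  const struct sched_expr *tmp = x, *tmp2 = y;
  int u1 = tmp->usefulness, u2 = tmp2->usefulness;

  if (u1 == 0)
    {
      if (u2 == 0)
        u1 = u2 = 1;  /* Both zero: compare by plain priority.  */
      else
        return 1;     /* Only X has zero usefulness: Y goes first.  */
    }
  else if (u2 == 0)
    return -1;        /* Only Y has zero usefulness: X goes first.  */

  return (u2 * (tmp2->priority + tmp2->priority_adj)
          - u1 * (tmp->priority + tmp->priority_adj));
}

/* Sorts three sample entries and checks the resulting order.  */
void
demo_rank (void)
{
  struct sched_expr v[] = { { 0, 10, 0 }, { 50, 3, 0 }, { 0, 20, 0 } };

  qsort (v, sizeof v / sizeof v[0], sizeof v[0],
         rank_by_usefulness_and_priority);

  assert (v[0].usefulness == 50);  /* Non-zero usefulness wins.  */
  assert (v[1].priority == 20);    /* Then higher priority.  */
  assert (v[2].priority == 10);
}
```

In this sketch an entry with non-zero usefulness always sorts ahead of one with zero usefulness, and two zero-usefulness entries fall back to comparing priority plus adjustment, which matches the three branches in the hunk above.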
Re: [Patch, fortran] PR83076 - [8 Regression] ICE in gfc_deallocate_scalar_with_status, at fortran/trans.c:1598
Hi All,

This is a complete rework of the patch and of the original mechanism
for adding caf token fields and finding them.

In this patch, the token fields are added to the derived types after
all the components have been resolved. This is done so that all the
tokens appear at the very end of the derived type, including the
hidden string lengths. This avoids the present situation, where the
token appears immediately after its associated component such that the
derived types are not compatible with modules or libraries compiled
without -fcoarray selected. All trans-types has to do now is to find
the component and have the component token field point to its
backend_decl.

PR83319 is fixed by unconditionally adding the token field to the
descriptor when -fcoarray=lib is in effect, whatever the value of
codimen. This is something of a belt-and-braces approach, in that the
token fields will sometimes be added when not needed. However, it is
better that than the ICEs that occur when they are missing.

Bootstrapped and regtested on FC23/x86_64 - OK for trunk and 7-branch?

Paul

2017-12-26  Paul Thomas

	PR fortran/83076
	* resolve.c (resolve_fl_derived0): Add caf_token fields for
	allocatable and pointer scalars, when -fcoarray selected.
	* trans-types.c (gfc_copy_dt_decls_ifequal): Copy the token
	field as well as the backend_decl.
	(gfc_get_derived_type): Flag GFC_FCOARRAY_LIB for module
	derived types that are not vtypes. Components with caf_token
	attribute are pvoid types. For a component requiring it, find
	the caf_token field and have the component token field point to
	its backend_decl.

	PR fortran/83319
	* trans-types.c (gfc_get_array_descriptor_base): Add the token
	field to the descriptor even when codimen not set.

2017-12-26  Paul Thomas

	PR fortran/83076
	* gfortran.dg/coarray_45.f90: New test.

	PR fortran/83319
	* gfortran.dg/coarray_46.f90: New test.

On 3 December 2017 at 23:48, Dominique d'Humières wrote:
> Dear Paul,
>
>> Bootstrapped and regtested on FC23/x86_64 - OK for trunk?
>
> See my comment 7 in the PR.
>
> Dominique
>

--
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein

Index: gcc/fortran/gfortran.h
===================================================================
*** gcc/fortran/gfortran.h	(revision 256000)
--- gcc/fortran/gfortran.h	(working copy)
*************** typedef struct
*** 870,876 ****
    unsigned alloc_comp:1, pointer_comp:1, proc_pointer_comp:1,
             private_comp:1, zero_comp:1, coarray_comp:1, lock_comp:1,
             event_comp:1, defined_assign_comp:1, unlimited_polymorphic:1,
!            has_dtio_procs:1;

    /* This is a temporary selector for SELECT TYPE or an associate
       variable for SELECT_TYPE or ASSOCIATE.  */
--- 870,876 ----
    unsigned alloc_comp:1, pointer_comp:1, proc_pointer_comp:1,
             private_comp:1, zero_comp:1, coarray_comp:1, lock_comp:1,
             event_comp:1, defined_assign_comp:1, unlimited_polymorphic:1,
!            has_dtio_procs:1, caf_token:1;

    /* This is a temporary selector for SELECT TYPE or an associate
       variable for SELECT_TYPE or ASSOCIATE.  */
Index: gcc/fortran/resolve.c
===================================================================
*** gcc/fortran/resolve.c	(revision 256000)
--- gcc/fortran/resolve.c	(working copy)
*************** resolve_fl_derived0 (gfc_symbol *sym)
*** 13992,13997 ****
--- 13992,14022 ----
    if (!success)
      return false;

+   /* Now add the caf token field, where needed.  */
+   if (flag_coarray != GFC_FCOARRAY_NONE
+       && !sym->attr.is_class && !sym->attr.vtype)
+     {
+       for (c = sym->components; c; c = c->next)
+         if (!c->attr.dimension && !c->attr.codimension
+             && (c->attr.allocatable || c->attr.pointer))
+           {
+             char name[GFC_MAX_SYMBOL_LEN+9];
+             gfc_component *token;
+             sprintf (name, "_caf_%s", c->name);
+             token = gfc_find_component (sym, name, true, true, NULL);
+             if (token == NULL)
+               {
+                 if (!gfc_add_component (sym, name, &token))
+                   return false;
+                 token->ts.type = BT_VOID;
+                 token->ts.kind = gfc_default_integer_kind;
+                 token->attr.access = ACCESS_PRIVATE;
+                 token->attr.artificial = 1;
+                 token->attr.caf_token = 1;
+               }
+           }
+     }
+
    check_defined_assignments (sym);

    if (!sym->attr.defined_assign_comp && super_type)
Index: gcc/fortran/trans-types.c
===================================================================
*** gcc/fortran/trans-types.c	(revision 256000)
--- gcc/fortran/trans-types.c	(working copy)
*************** gfc_get_array_descriptor_base (int dimen
*** 1837,1843 ****
        TREE_NO_WARNING (decl) = 1;
      }

!   if (flag_coarray == GFC_FCOARRAY_LIB && codimen)
      {
        decl = gfc_add_field_to_struc
Re: [patch, fortran] Fix PR 83540
OK - thanks for the patch.

Paul

On 26 December 2017 at 12:12, Thomas Koenig wrote:
> Hello world,
>
> this rather self-explanatory patch makes sure we don't get an error
> using reallocation on assignment for inlining matmul when
> we don't have reallocation on assignment.
>
> Regression-tested.  OK for trunk?
>
> Regards
>
>         Thomas
>
> 2017-12-25  Thomas Koenig
>
>         PR fortran/83540
>         * frontend-passes.c (create_var): If an array to be created
>         has unknown size and -fno-realloc-lhs is in effect,
>         return NULL.
>
> 2017-12-25  Thomas Koenig
>
>         PR fortran/83540
>         * gfortran.dg/inline_matmul_20.f90: New test.

--
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein
[testsuite, committed] Use relative line number in unroll-5.c
[ was: Re: [C/C++] Add support for #pragma GCC unroll v3 ]

On 11/25/2017 11:15 AM, Eric Botcazou wrote:

Index: testsuite/c-c++-common/unroll-5.c
===================================================================
--- testsuite/c-c++-common/unroll-5.c	(revision 0)
+++ testsuite/c-c++-common/unroll-5.c	(working copy)
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+
+extern void bar (int);
+
+int j;
+
+void test (void)
+{
+  #pragma GCC unroll 4+4
+  for (unsigned long i = 1; i <= 8; ++i)
+    bar(i);
+
+  #pragma GCC unroll -1	/* { dg-error "requires an assignment-expression that evaluates to a non-negative integral constant less than or equal to" } */
+  for (unsigned long i = 1; i <= 8; ++i)
+    bar(i);
+
+  #pragma GCC unroll 200	/* { dg-error "requires an assignment-expression that evaluates to a non-negative integral constant less than or equal to" } */
+  for (unsigned long i = 1; i <= 8; ++i)
+    bar(i);
+
+  #pragma GCC unroll j	/* { dg-error "requires an assignment-expression that evaluates to a non-negative integral constant less than or equal to" } */
+/* { dg-error "cannot appear in a constant-expression|is not usable in a constant expression" "" { target c++ } 21 } */
+  for (unsigned long i = 1; i <= 8; ++i)
+    bar(i);
+
+  #pragma GCC unroll 4.2	/* { dg-error "requires an assignment-expression that evaluates to a non-negative integral constant less than or equal to" } */
+  for (unsigned long i = 1; i <= 8; ++i)
+    bar(i);
+}

Hi,

this patch changes the absolute line number into a relative one.

Tested on x86_64 and committed.

Thanks,
- Tom

Use relative line number in unroll-5.c

2017-12-26  Tom de Vries

	* c-c++-common/unroll-5.c: Use relative line number.

---
 gcc/testsuite/c-c++-common/unroll-5.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/c-c++-common/unroll-5.c b/gcc/testsuite/c-c++-common/unroll-5.c
index 754f3b1..b728066 100644
--- a/gcc/testsuite/c-c++-common/unroll-5.c
+++ b/gcc/testsuite/c-c++-common/unroll-5.c
@@ -19,7 +19,7 @@ void test (void)
     bar(i);
 
   #pragma GCC unroll j	/* { dg-error "requires an assignment-expression that evaluates to a non-negative integral constant less than" } */
-/* { dg-error "cannot appear in a constant-expression|is not usable in a constant expression" "" { target c++ } 21 } */
+/* { dg-error "cannot appear in a constant-expression|is not usable in a constant expression" "" { target c++ } .-1 } */
   for (unsigned long i = 1; i <= 8; ++i)
     bar(i);
 
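As a side note on the directive form used above: a relative line number such as .-1 ties the expected diagnostic to the line directly before the directive, so adding or removing lines earlier in the file does not invalidate it. The file below is only an illustrative sketch and not part of the committed change; the function narrow, the choice of -Wconversion and the loose "conversion" pattern are assumptions picked so that the file also compiles cleanly without that option.

```c
/* { dg-do compile } */
/* { dg-options "-Wconversion" } */

/* Hypothetical example function: the implicit int-to-short conversion
   is only diagnosed when -Wconversion is given.  */
short
narrow (int i)
{
  short s = i;
  /* { dg-warning "conversion" "" { target *-*-* } .-1 } */
  return s;
}
```

Had the directive used an absolute number instead, it would need updating whenever a line is inserted above the declaration of s, which is the maintenance problem the one-line change to unroll-5.c avoids.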