[PATCH 04/40] Additional Fortran testsuite fixes for kernels loops annotation pass.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore 2020-03-27 Sandra Loosemore gcc/testsuite/ * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust line numbering. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-2.f95: Add

[PATCH 05/40] Fix bug in processing of array dimensions in data clauses.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore The g++ front end wraps the array length and low_bound values in NON_LVALUE_EXPR, causing the subsequent tests for INTEGER_CST to fail. The test case c-c++-common/goacc/kernels-loop-annotation-1.c was tickling this bug and giving bogus errors in g++ because it was falling t

[PATCH 06/40] Add a "combined" flag for "acc kernels loop" etc directives.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore 2020-08-19 Sandra Loosemore gcc/ * tree.h (OACC_LOOP_COMBINED): New. gcc/c/ * c-parser.c (c_parser_oacc_loop): Set OACC_LOOP_COMBINED. gcc/cp/ * parser.c (cp_parser_oacc_loop): Set OACC_LOOP_COMBINED. gcc/fortra

[PATCH 07/40] Annotate inner loops in "acc kernels loop" directives (C/C++).

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore Normally explicit loop directives in a kernels region inhibit automatic annotation of other loops in the same nest, on the theory that users have indicated they want manual control over that section of code. However there seems to be an expectation in user code that the co

[PATCH 08/40] Annotate inner loops in "acc kernels loop" directives (Fortran).

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore Normally explicit loop directives in a kernels region inhibit automatic annotation of other loops in the same nest, on the theory that users have indicated they want manual control over that section of code. However there seems to be an expectation in user code that the co

[PATCH 09/40] Permit calls to builtins and intrinsics in kernels loops.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore This tweak to the OpenACC kernels loop annotation relaxes the restrictions on function calls in the loop body. Normally calls to functions not explicitly marked with a parallelism attribute are not permitted, but C/C++ builtins and Fortran intrinsics have known semantics s

[PATCH 10/40] Fix patterns in Fortran tests for kernels loop annotation.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore Several of the Fortran tests for kernels loop annotation were failing due to changes in the formatting of "acc loop" constructs in the dump file. Now the "auto" clause appears first, instead of after "private". 2020-08-23 Sandra Loosemore gcc/testsuite/

[PATCH 11/40] Clean up loop variable extraction in OpenACC kernels loop annotation.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore The code for identifying annotatable loops in OpenACC kernels regions previously looked for the loop variable as the left-hand side of the comparison in the loop end test. However, front end optimizations sometimes switch the sense of the comparison, making this method unr

[PATCH 12/40] Relax some restrictions on the loop bound in kernels loop annotation.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore OpenACC loop semantics require that the loop bound be computable before entering the loop, rather than the C/C++ semantics where the end test is evaluated on every iteration. Formerly the kernels loop annotator permitted only constants and variables not modified in the loo
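The difference between the two loop-bound semantics mentioned in this snippet can be illustrated in plain C. This is a minimal sketch and not code from the patch; both function names are hypothetical. The first function re-reads the bound on every iteration, as C requires, while the second evaluates it once before entering the loop, which is what OpenACC loop semantics assume.

```c
#include <assert.h>

/* C semantics: the end test `i < *bound` is re-evaluated on every
   iteration, so a store to *bound inside the body changes the trip count. */
int count_c_semantics(int *bound)
{
    int trips = 0;
    for (int i = 0; i < *bound; i++) {
        if (i == 2)
            *bound = 5;          /* the loop body modifies the bound */
        trips++;
    }
    return trips;
}

/* Bound evaluated once before loop entry (hypothetical illustration of
   the "computable before entering the loop" requirement). */
int count_precomputed_bound(int *bound)
{
    assert(bound != 0);
    const int n = *bound;        /* snapshot of the bound */
    int trips = 0;
    for (int i = 0; i < n; i++) {
        if (i == 2)
            *bound = 5;          /* no longer affects the trip count */
        trips++;
    }
    return trips;
}
```

Starting from a bound of 10, the first function performs 5 iterations and the second performs 10, which is why a bound that the body may modify cannot simply be treated as loop-invariant.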

[PATCH 15/40] graphite: Extend SCoP detection dump output

2021-12-15 Thread Frederik Harwath
Extend dump output to make it easier (for GCC developers) to understand why Graphite refuses to include a loop in a SCoP. ChangeLog: * graphite-scop-detection.c (scop_detection::can_represent_loop): Output reason for failure to dump file. (scop_detection::harmful_loop_in_regi

[PATCH 13/40] Fortran: Delinearize array accesses

2021-12-15 Thread Frederik Harwath
The Fortran front end presently linearizes accesses to multi-dimensional arrays by combining the indices for the various dimensions into a series of explicit multiplies and adds with refactoring to allow CSE of invariant parts of the computation. Unfortunately this representation interferes with Gr

[PATCH 16/40] graphite: Rename isl_id_for_ssa_name

2021-12-15 Thread Frederik Harwath
The SSA names for which this function gets used are always SCoP parameters and hence "isl_id_for_parameter" is a better name. It also explains the prefix "P_" for those names in the ISL representation. gcc/ChangeLog: * graphite-sese-to-poly.c (isl_id_for_ssa_name): Rename to ...

[PATCH 17/40] graphite: Fix minor mistakes in comments

2021-12-15 Thread Frederik Harwath
gcc/ChangeLog: * graphite-sese-to-poly.c (build_poly_sr_1): Fix a typo and a reference to a variable which does not exist. * graphite-isl-ast-to-gimple.c (gsi_insert_earliest): Fix typo in comment. --- gcc/graphite-isl-ast-to-gimple.c | 2 +- gcc/graphite-sese-to-p

[PATCH 14/40] openacc: Move pass_oacc_device_lower after pass_graphite

2021-12-15 Thread Frederik Harwath
The OpenACC device lowering pass must run after the Graphite pass to allow for the use of Graphite for automatic parallelization of kernels regions in the future. Experimentation has shown that it is best, performance-wise, to run pass_oacc_device_lower together with the related passes pass_oacc_loo

[PATCH 18/40] Move compute_alias_check_pairs to tree-data-ref.c

2021-12-15 Thread Frederik Harwath
Move this function from tree-loop-distribution.c to tree-data-ref.c and make it non-static to enable its use from other parts of GCC. gcc/ChangeLog: * tree-loop-distribution.c (data_ref_segment_size): Remove function. (latch_dominated_by_data_ref): Likewise. (compute_alias_

[PATCH 19/40] graphite: Add runtime alias checking

2021-12-15 Thread Frederik Harwath
Graphite rejects a SCoP if it contains a pair of data references for which it cannot determine statically if they may alias. This happens very often, for instance in C code which does not use explicit "restrict". This commit adds the possibility to analyze a SCoP nevertheless and perform an alias
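The general idea of a runtime alias check can be sketched in plain C as loop versioning. This is an illustration under stated assumptions, not the code Graphite generates: the helper `ranges_disjoint` and the function `add_one_versioned` are hypothetical names, and both loop versions are written identically here because only the guard is being shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the n-element int ranges starting at a and b do not overlap.
   Addresses are compared as integers to avoid relational comparison of
   pointers into unrelated objects. */
static int ranges_disjoint(const int *a, const int *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(int));
    return pa + bytes <= pb || pb + bytes <= pa;
}

/* Computes dst[i] = src[i] + 1 for i in [0, n).
   Returns 1 if the alias check passed and the "independent" version ran,
   0 if the sequential fallback ran. */
int add_one_versioned(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (ranges_disjoint(dst, src, n)) {
        /* No overlap: iterations are independent, so a compiler would be
           free to reorder or parallelize this version. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] + 1;
        return 1;
    }
    /* Possible overlap: keep the original sequential order. */
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1;
    return 0;
}
```

Calling the function with two separate arrays takes the checked path, while calling it with `dst = c + 1` and `src = c` falls back to the sequential loop, whose result depends on iteration order.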

[PATCH 21/40] openacc: Add "can_be_parallel" flag info to "graph" dumps

2021-12-15 Thread Frederik Harwath
gcc/ChangeLog: * graph.c (oacc_get_fn_attrib): New declaration. (find_loop_location): New declaration. (draw_cfg_nodes_for_loop): Print value of the can_be_parallel flag at the top of loops in OpenACC functions. --- gcc/graph.c | 35

[PATCH 22/40] openacc: Remove unused partitioning in "kernels" regions

2021-12-15 Thread Frederik Harwath
With the old "kernels" handling, unparallelized regions would get executed with 1x1x1 partitioning even if the user provided explicit num_gangs, num_workers clauses etc. This commit restores this behavior by removing unused partitioning after assigning the parallelism dimensions to loops. gcc/Cha

[PATCH 23/40] Add function for printing a single OMP_CLAUSE

2021-12-15 Thread Frederik Harwath
Commit 89f4f339130c ("For 'OMP_CLAUSE' in 'dump_generic_node', dump the whole OMP clause chain") changed the dumping behavior for OMP_CLAUSEs. The old behavior is required for a follow-up commit ("openacc: Add data optimization pass") that optimizes single OMP_CLAUSEs. gcc/ChangeLog: * t

[PATCH 24/40] openacc: Add data optimization pass

2021-12-15 Thread Frederik Harwath
From: Andrew Stubbs Address PR90591 "Avoid unnecessary data transfer out of OMP construct", for simple (but common) cases. This commit adds a pass that optimizes data mapping clauses. Currently, it can optimize copy/map(tofrom) clauses involving scalars to copyin/map(to) and further to "private"

[PATCH 26/40] openacc: Warn about "independent" "kernels" loops with data-dependences

2021-12-15 Thread Frederik Harwath
This commit concerns loops in OpenACC "kernels" regions that have been marked up with an explicit "independent" clause by the user, but for which Graphite found data dependences. A discussion on the private internal OpenACC mailing list suggested that warning the user about the dependences would be

[PATCH 25/40] openacc: Add runtime alias checking for OpenACC kernels

2021-12-15 Thread Frederik Harwath
From: Andrew Stubbs This commit adds the code generation for the runtime alias checks for OpenACC loops that have been analyzed by Graphite. The runtime alias check condition gets generated in Graphite. It is evaluated by the code generated for the IFN_GOACC_LOOP internal function calls. If ali

[PATCH 27/40] openacc: Handle internal function calls in pass_lim

2021-12-15 Thread Frederik Harwath
The loop invariant motion pass correctly refuses to move statements out of a loop if any other statement in the loop is unanalyzable. The pass does not know how to handle the OpenACC internal function calls, which was not necessary until recently, when the OpenACC device lowering pass was moved to a

[PATCH 28/40] openacc: Disable pass_pre on outlined functions analyzed by Graphite

2021-12-15 Thread Frederik Harwath
The additional dependences introduced by partial redundancy elimination proper and by the code hoisting step of the pass very often cause Graphite to fail on OpenACC functions. On the other hand, the pass can also enable the analysis of OpenACC loops (cf. e.g. the loop-auto-transfer-4.f90 testcase)

[PATCH 29/40] graphite: Tune parameters for OpenACC use

2021-12-15 Thread Frederik Harwath
The default values of some parameters that restrict Graphite's resource usage are too low for many OpenACC codes. Furthermore, exceeding the limits does not always lead to user-visible diagnostic messages. This commit increases the parameter values on OpenACC functions. The values were chosen to

[PATCH 30/40] graphite: Adjust scop loop-nest choice

2021-12-15 Thread Frederik Harwath
The find_common_loop function is used in Graphite to obtain a common super-loop of all loops inside a SCoP. The function is applied to the loop of the destination block of the edge that leads into the SESE region and the loop of the source block of the edge that exits the region. The exit block i

[PATCH 31/40] graphite: Accept loops without data references

2021-12-15 Thread Frederik Harwath
It seems that the check that rejects loops without data references is only included to avoid handling non-profitable loops. Including those loops in Graphite's analysis enables more consistent diagnostic messages in OpenACC "kernels" code and does not introduce any testsuite regressions. If execu

[PATCH 32/40] Reference reduction localization

2021-12-15 Thread Frederik Harwath
From: Julian Brown gcc/ * gimplify.c (privatize_reduction): New struct. (localize_reductions_r, localize_reductions): New functions. (gimplify_omp_for): Call localize_reductions. (gimplify_omp_workshare): Likewise. * omp-low.c (lower_oacc_reductions

[PATCH 33/40] Fix tree check failure with reduction localization

2021-12-15 Thread Frederik Harwath
From: Julian Brown gcc/ * gimplify.c (gimplify_omp_workshare): Use OMP_CLAUSES, OMP_BODY instead of OMP_TARGET_CLAUSES, OMP_TARGET_BODY. --- gcc/gimplify.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 9a4331c7

[PATCH 34/40] Use more appropriate var in localize_reductions call

2021-12-15 Thread Frederik Harwath
From: Julian Brown gcc/ * gimplify.c (gimplify_omp_for): Use for_stmt in call to localize_reductions. --- gcc/gimplify.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 04ffbc256442..daa69ccf6202 100644 --- a/gcc

[PATCH 35/40] Handle references in OpenACC "private" clauses

2021-12-15 Thread Frederik Harwath
From: Julian Brown gcc/ * gimplify.c (localize_reductions): Rewrite references for OMP_CLAUSE_PRIVATE also. libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test. * testsuite/libgomp.oacc-c++/privatized-ref-2.C: New test.

[PATCH 36/40] openacc: Enable reduction variable localization for "kernels"

2021-12-15 Thread Frederik Harwath
gcc/ChangeLog: * gimplify.c (gimplify_omp_for): Enable localization on "kernels" regions. (gimplify_omp_workshare): Likewise. --- gcc/gimplify.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index bf37388f

[PATCH 37/40] Fix for is_gimple_reg vars to 'data kernels'

2021-12-15 Thread Frederik Harwath
From: Tobias Burnus Nearly all variable mapping is moved from 'kernels' to a surrounding 'data kernels' and then 'force_present' mapped for the 'kernels'. However, as libgomp.oacc-c-c++-common/declare-vla.c shows, moving 'int i, N' will fail as there is a special case for is_gimple_reg in mapping

[PATCH 38/40] openacc: fix privatization of by-reference arrays

2021-12-15 Thread Frederik Harwath
From: Tobias Burnus Replacing of a by-reference variable in a private clause by a local variable makes sense; however, for arrays, the size is not directly known by the type. This causes an ICE via create_tmp_var which indirectly invokes force_constant_size in this case - but the latter only hand

[PATCH 39/40] openacc: Check type for references in reduction lowering

2021-12-15 Thread Frederik Harwath
gcc/ChangeLog: * omp-low.c (lower_oacc_reductions): Only create a reference if variable has pointer type. --- gcc/omp-low.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index ae5cdfc5e260..2b8b848ec03a 100644 --- a/gcc/omp-

[PATCH 40/40] openacc: Adjust testsuite to new "kernels" handling

2021-12-16 Thread Frederik Harwath
Adjust the testsuite to changed expectations with the new Graphite-based "kernels" handling. libgomp/ChangeLog: * testsuite/libgomp.oacc-c++/privatized-ref-2.C: Adjust. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: Adjust. * testsuite/libgomp.oacc-c-c++-common/acc_prof

[PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-03-24 Thread Frederik Harwath
with both nvptx-none and amdgcn-amdhsa offloading. Best regards, Frederik Frederik Harwath (7): openmp: Add Fortran support for "omp unroll" directive openmp: Add C/C++ support for "omp unroll" directive openacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILE openmp: A

[PATCH 2/7] openmp: Add C/C++ support for "omp unroll" directive

2023-03-24 Thread Frederik Harwath
This commit implements the C and the C++ front end changes to support the "omp unroll" directive. The execution of the loop transformation relies on the pass that has been added as a part of the earlier Fortran patch. gcc/c-family/ChangeLog: * c-gimplify.cc (c_genericize_control_stmt): H
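For readers unfamiliar with the transformation, the effect of a partial unroll can be written out by hand in plain C. This is a conceptual sketch only, not the output of the pass described above; the function name and the unroll factor of 4 are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints with the loop body replicated four times per iteration,
   followed by a remainder loop for the last n % 4 elements. */
long sum_unrolled_by_4(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long s = 0;
    size_t i = 0;

    /* Main loop: four copies of the original body per trip. */
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }

    /* Remainder loop: handles trip counts that are not a multiple of 4. */
    for (; i < n; i++)
        s += a[i];

    return s;
}
```

The remainder loop is what keeps the result correct for any trip count.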

[PATCH 3/7] openacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILE

2023-03-24 Thread Frederik Harwath
OMP_CLAUSE_TILE will be used for the OpenMP 5.1 loop transformation construct "omp tile". gcc/ChangeLog: * tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TILE_LIST): Rename to ... (OMP_CLAUSE_OACC_TILE_LIST): ... this. (OMP_CLAUSE_

[PATCH 4/7] openmp: Add Fortran support for "omp tile"

2023-03-24 Thread Frederik Harwath
This commit implements the Fortran front end support for the "omp tile" directive and the corresponding middle end transformation. gcc/fortran/ChangeLog: * gfortran.h (enum gfc_statement): Add ST_OMP_TILE, ST_OMP_END_TILE. (enum gfc_exec_op): Add EXEC_OMP_TILE. (loop_trans

[PATCH 5/7] openmp: Add C/C++ support for "omp tile"

2023-03-24 Thread Frederik Harwath
This commit adds the C and C++ front end support for the "omp tile" directive. gcc/c-family/ChangeLog: * c-omp.cc (c_omp_directives): Add PRAGMA_OMP_TILE. * c-pragma.cc (omp_pragmas_simd): Likewise. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TILE. (enum pragma
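What a tiling transformation does to a two-deep loop nest can be sketched by hand in plain C. This is a conceptual illustration, not code produced by GCC; the function names and the tile-size parameters `ti` and `tj` are hypothetical.

```c
#include <assert.h>

/* Original loop nest over an n x m iteration space. */
long sum_plain(int n, int m)
{
    long s = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            s += (long)i * m + j;
    return s;
}

/* The same iteration space visited tile by tile: the two outer loops
   step over tiles of size ti x tj, the two inner loops walk one tile.
   The extra `i < n` and `j < m` tests handle partial tiles at the edges. */
long sum_tiled(int n, int m, int ti, int tj)
{
    assert(ti > 0 && tj > 0);
    long s = 0;
    for (int ii = 0; ii < n; ii += ti)
        for (int jj = 0; jj < m; jj += tj)
            for (int i = ii; i < ii + ti && i < n; i++)
                for (int j = jj; j < jj + tj && j < m; j++)
                    s += (long)i * m + j;
    return s;
}
```

Both functions visit every (i, j) pair exactly once, so they return the same value; only the visiting order differs, which is the property a tiling transformation relies on.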

[PATCH 6/7] openmp: Add Fortran support for loop transformations on inner loops

2023-03-24 Thread Frederik Harwath
So far the implementation of the "omp tile" and "omp unroll" directives restricted their use to the outermost loop of a loop-nest. This commit changes the Fortran front end to parse and verify the directives on inner loops. The transformation clauses are extended to carry the information about the

[PATCH 7/7] openmp: Add C/C++ support for loop transformations on inner loops

2023-03-24 Thread Frederik Harwath
Add the parsing of loop transformations on inner loops of a loop-nest. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_nested_loop_transform_clauses): Add argument for the level of loop-nest at which the clauses appear, ... (c_parser_omp_tile): ... adjust use here,

Re: [PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-05-16 Thread Frederik Harwath via Gcc-patches
Hi Jakub, On 15.05.23 12:19, Jakub Jelinek wrote: On Fri, Mar 24, 2023 at 04:30:38PM +0100, Frederik Harwath wrote: this patch series implements the OpenMP 5.1 "unroll" and "tile" constructs. It includes changes to the C, C++, and Fortran front end for parsing the new

Re: [PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-05-17 Thread Frederik Harwath via Gcc-patches
Hi Jakub, On 16.05.23 13:00, Jakub Jelinek wrote: On Tue, May 16, 2023 at 11:45:16AM +0200, Frederik Harwath wrote: The place where different compilers implement the loop transformations was discussed in an OpenMP loop transformation meeting last year. Two compilers (another one and GCC with

[PATCH] Docs, OpenMP: Small fixes to internal OMP_FOR doc

2023-04-19 Thread Frederik Harwath via Gcc-patches
= 0;   return D_2064; } (Strictly speaking, the OMP_FOR is represented as a gomp_for at this point, but this does not really matter.) Can I commit the patch? Best regards, Frederik From 8af01114c295086526a67f56f6256fc945b1ccb5 Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Wed, 19 Apr 2023 13

Re: [PATCH 1/7] openmp: Add Fortran support for "omp unroll" directive

2023-04-06 Thread Frederik Harwath via Gcc-patches
ay, even with 100 of repeated test executions ;-). Best regards, Frederik From 3f471ed293d2e97198a65447d2f0d2bb69a2f305 Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Thu, 6 Apr 2023 14:52:07 +0200 Subject: [PATCH] openmp: Fix loop transformation tests libgomp/ChangeLog: * testsuite/
