[committed] [OG10] Re: Re: [Patch] [OpenMP, Fortran] Add structure/derived-type element mapping

2020-08-19 Thread Kwok Cheung Yeung
2a0205da31e3948f67cc754e9208e85fb Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Wed, 19 Aug 2020 12:50:42 -0700 Subject: [PATCH] Fix gfortran.dg/goacc/pr70828.f90 testcase Array mapping was changed by the patch '[OpenMP, Fortran] Add structure/derived-type element mapping'.

[PATCH] fortran, openmp: PR fortran/93660 Fix ICE when coarrays used with 'omp declare simd'

2020-09-10 Thread Kwok Cheung Yeung
ith Nvidia offloading. Okay for trunk? Thanks Kwok commit e842728189edc14c3f8b7c0a93cb51b007f20220 Author: Kwok Cheung Yeung Date: Thu Sep 10 13:59:51 2020 -0700 Fix ICE when coarrays used with the OpenMP 'declare simd' directive The ICE occurs when Fortran coarray

Re: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-07-08 Thread Kwok Cheung Yeung
ted this on a x64 host with offloading to nvptx and gcn. On AMD GCN, it also produces a couple of extra linker warnings that I have added dg-warning entries for. Okay for trunk/OG10 together with the previous mkoffload patch? Thanks Kwok commit 43238117c261285a6b95d881bcc2f9efd9f752ad Author: K

[PATCH] libgomp: Fix hang when profiling OpenACC programs with CUDA 9.0 nvprof

2020-07-13 Thread Kwok Cheung Yeung
wok commit d20f269e8571a76d682a500e78654ddd260ffaf1 Author: Kwok Cheung Yeung Date: Fri Jul 10 14:06:26 2020 -0700 libgomp: Fix hang when profiling OpenACC programs with CUDA 9.0 nvprof The version of nvprof in CUDA 9.0 causes a hang when used to profile an OpenACC program. This is becaus

Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-07-15 Thread Kwok Cheung Yeung
On 01/07/2020 3:28 pm, Tom de Vries wrote: I looked at the implementation, and it looks ok to me, though I think we need to make explicit in a comment what the assumptions are: - that we have read and write access to the entire word, and Is there a situation where an 8/16-bit portion of memory

Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-07-20 Thread Kwok Cheung Yeung
patch on standalone nvptx, and that the new reduction-16.c testcase passes with both nvptx and AMD GCN offloading. Is this version okay for master and og10? Thanks Kwok commit 4661232905d55a4bc1354cb717b2e5d950d215af Author: Kwok Cheung Yeung Date: Thu Jul 16 12:00:24 2020 -0700 nv

[PATCH] [og10] Fix goacc/note-parallelism-combined-kernels-loop-auto.c test

2020-07-22 Thread Kwok Cheung Yeung
s-loop-auto.c were just missed by accident? This patch removes the dg-messages for the no-longer occurring messages. Okay for OG10? Kwok From 59c6bc996000fb69ee8cf2cee3ec2e279524e66f Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Thu, 18 Jun 2020 13:32:40 -0700 Subject: [PATCH 1/6] Fi

[PATCH] [og10] Fix goacc/routine-4-extern.c test

2020-07-22 Thread Kwok Cheung Yeung
nts - test cases' (commit 6a0b5806b24bfdefe0b0f3ccbcc51299e5195dca) did not include a fix for routine-4-extern.c. This patch removes the now outdated dg-error and dg-warnings from the test. Okay for OG10? Kwok From 5774f048563df311f2a35a654b8c2d7b1af9f2da Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung

[PATCH] [og10] Fix goacc/loop-processing-1.c testcase

2020-07-22 Thread Kwok Cheung Yeung
ues. Okay for OG10? Kwok From 22e91315f3ce7c486017c6b9245dc1ea2d6bdede Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Fri, 19 Jun 2020 06:58:40 -0700 Subject: [PATCH 3/6] Fix broken testcase gcc.dg/goacc/loop-processing-1.c 2020-07-21 Kwok Cheung Yeung gcc/testsuite/ * gcc.dg/goacc/loop-pr

[PATCH] [og10] Fix routine-nohost-1.c testcase

2020-07-22 Thread Kwok Cheung Yeung
vious'. Kwok From f921b0988c41ba086e968faf08e93f7a230e55a1 Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Fri, 19 Jun 2020 09:34:27 -0700 Subject: [PATCH 4/6] Fix failure in testcase c-c++-common/goacc/routine-nohost-1.c 2020-07-21 Kwok Cheung Yeung gcc/testsuite/

[PATCH] [og10] Fix goacc/loop-2-kernels.f95 testcase

2020-07-22 Thread Kwok Cheung Yeung
rsion the default; adjust and add tests' (commit 757f56ddc43fd80bb8740222ec352111b26d66e9), so the Fortran version should probably be XFAILed too. Okay for OG10? Kwok From 87cf165b9b45f4cedd9cda362d9238486024a527 Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Thu, 25 Jun 2020 12:0

[PATCH] [og10] Fix gfortran.dg/goacc/routine-module-mod-1.f90 testcase

2020-07-22 Thread Kwok Cheung Yeung
ned gang vector loop parallelism, which triggers the message as there is no worker parallelism. This patch makes the message expected. Okay for OG10 branch? Kwok From 824a4d600380a8b02bb65f055ff0423bbd849a4f Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Wed, 1 Jul 2020 08:26:42 -0700

Re: [PATCH] [og10] Fix goacc/routine-4-extern.c test

2020-07-26 Thread Kwok Cheung Yeung
g. I have reverted all the previous changes and replaced the orphan loop gang reductions with empty loops as suggested, and checked that the tests now pass. Is this version okay for OG10? Thanks Kwok From 280957dc80090bd0b92ad7a73f528851aad94051 Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeun

[PING] Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-08-04 Thread Kwok Cheung Yeung
Hello I posted a revised patchset about two weeks ago at: https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550291.html Are you able to take a look at it? Thanks Kwok

[PATCH] [amdgcn] Add support for unordered floating-point comparisons

2020-04-02 Thread Kwok Cheung Yeung
regressions noted. Okay for trunk? Kwok commit ea811ce38ae2127554f0aca9cd34aca6e42f814d Author: Kwok Cheung Yeung Date: Thu Apr 2 07:47:28 2020 -0700 amdgcn: Support unordered floating-point comparison operators 2020-04-02 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c

[PATCH] amdgcn: Add stub personality function

2020-04-22 Thread Kwok Cheung Yeung
standalone builds of GCC for AMD GCN. Okay for trunk? Kwok 2020-04-22 Kwok Cheung Yeung libgcc/ * config/gcn/unwind-gcn.c (__gxx_personality_v0): New. diff --git a/libgcc/config/gcn/unwind-gcn.c b/libgcc/config/gcn/unwind-gcn.c index 813f03f..6508b45 100644 --- a/libgcc/config/gcn

Re: [PATCH] amdgcn: Add stub personality function

2020-04-23 Thread Kwok Cheung Yeung
On 23/04/2020 12:05 pm, Thomas Schwinge wrote: So we should simply disable it properly (see below)... ... instead of adding such stub functions. I suggest we instead apply what I'd proposed a month ago in "[amdgcn] ld: error: undefined symbol: __gxx_personality_v0

Re: [PATCH 2/5, OpenACC] Support Fortran optional arguments in the firstprivate clause

2019-10-07 Thread Kwok Cheung Yeung
On 07/10/2019 10:25 am, Thomas Schwinge wrote: Hi Kwok, Tobias! On 2019-07-29T21:55:52+0100, Kwok Cheung Yeung wrote: > if (omp_is_reference (new_var) > && TREE_CODE (TREE_TYPE (new_var)) != POINTER_TYPE) As is, this code in lower_omp_targ

Re: [PATCH] openmp: Implicit 'declare target' for C++ static initializers

2020-12-18 Thread Kwok Cheung Yeung
ve retested all the gomp tests in the main testsuite, retested libgomp, and checked bootstrapping. Is this version okay for trunk now? Thanks Kwok From ef4a42c5174372dd0d72dc0efe2c608e693c7877 Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Thu, 17 Dec 2020 12:10:18 -0800 Subject: [P

[OG10][committed] Backport openmp: Implicit 'declare target' for C++ static initializers

2020-12-18 Thread Kwok Cheung Yeung
Hello I have now backported the "openmp: Implicit 'declare target' for C++ static initializers" patch from mainline to the devel/omp/gcc-10 branch. The testcase required a small tweak as the gimple output has changed since OG10 was branched. This has been committed as commit 83797c2d47aaa011b

Re: [PATCH] openmp: Implicit 'declare target' for C++ static initializers

2020-12-18 Thread Kwok Cheung Yeung
On 18/12/2020 7:31 pm, Jakub Jelinek wrote: On Fri, Dec 18, 2020 at 03:10:52PM +, Kwok Cheung Yeung wrote: 2020-12-17 Kwok Cheung Yeung gcc/testsuite/ * g++.dg/gomp/declare-target-3.C: New. Note the test fails on the trunk when one doesn't have offloading confi

Re: [OG10][committed] Backport openmp: Implicit 'declare target' for C++ static initializers

2020-12-18 Thread Kwok Cheung Yeung
On 18/12/2020 7:27 pm, Kwok Cheung Yeung wrote: Hello I have now backported the "openmp: Implicit 'declare target' for C++ static initializers" patch from mainline to the devel/omp/gcc-10 branch. The testcase required a small tweak as the gimple output has changed sin

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2021-01-11 Thread Kwok Cheung Yeung
wrote: On Wed, Dec 09, 2020 at 05:37:24PM +, Kwok Cheung Yeung wrote: --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -14942,6 +14942,11 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) pc = &OMP_CLAUSE_CHAIN (c); continue; + case OMP_CLAUSE_DE

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2021-01-15 Thread Kwok Cheung Yeung
On 10/12/2020 2:38 pm, Jakub Jelinek wrote: On Wed, Dec 09, 2020 at 05:37:24PM +, Kwok Cheung Yeung wrote: --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -14942,6 +14942,11 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) pc = &OMP_CLAUSE_CHAI

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2021-01-15 Thread Kwok Cheung Yeung
On 15/01/2021 3:07 pm, Kwok Cheung Yeung wrote: I have tested bootstrapping on x86_64 (no offloading) with no issues, and running the libgomp testsuite with Nvidia offloading shows no regressions. I have also tested all the gomp.exp tests in the main gcc testsuite, also with no issues. I am

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2021-01-16 Thread Kwok Cheung Yeung
Thanks for the review. On 16/01/2021 9:25 am, Jakub Jelinek wrote: On Fri, Jan 15, 2021 at 03:07:56PM +, Kwok Cheung Yeung wrote: + { + tree detach_decl = OMP_CLAUSE_DECL (*detach_seen); + + for (pc = &clauses, c = clauses; c ; c =

[PATCH] wwwdocs: Document devel/omp/gcc-10 branch

2020-06-10 Thread Kwok Cheung Yeung
to the old branch, as new development should occur on the newer branch. OK to push? Thanks, Kwok commit ef2aef1c8649a9620f0975a3fe5d4cadaa0b9d1e Author: Kwok Cheung Yeung Date: Wed Jun 10 06:08:06 2020 -0700 Document devel/omp/gcc-10 branch This also removes references to the

Re: [PATCH] wwwdocs: Document devel/omp/gcc-10 branch

2020-06-10 Thread Kwok Cheung Yeung
case it should probably go after openacc-gcc-9-branch)? Kwok On 10/06/2020 4:15 pm, Tobias Burnus wrote: On 6/10/20 3:34 PM, Kwok Cheung Yeung wrote: This patch updates the previous entry on the website for devel/omp/gcc-9 to refer to the new branch instead. This removes references to the old

[PATCH] nvptx: Add support for subword compare-and-swap

2020-06-15 Thread Kwok Cheung Yeung
cause any regressions on the nvptx offloading tests, and that the new test passes with both nvptx and amdgcn as offload targets. Okay for master and OG10? Kwok commit 7c3a9c23ba9f5b8fe953aa5492ae75617f2444a3 Author: Kwok Cheung Yeung Date: Mon Jun 15 12:34:55 2020 -0700 nvptx: Add support

Re: [PATCH] wwwdocs: Document devel/omp/gcc-10 branch

2020-06-15 Thread Kwok Cheung Yeung
where it all began. This is a history chapter, after all. commit b719899acf24974fd4c51f14538b426f99259384 Author: Kwok Cheung Yeung Date: Wed Jun 10 06:08:06 2020 -0700 Document devel/omp/gcc-10 branch This also moves the old devel/omp/gcc-9 branch to the inactive branches

Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-06-24 Thread Kwok Cheung Yeung
On 23/06/2020 5:44 pm, Thomas Schwinge wrote: Hi! On 2020-06-15T21:28:12+0100, Kwok Cheung Yeung wrote: This patch adds support on nvptx for __sync_val_compare_and_swap operations on 1- and 2-byte values. Is this a thorough review that these are the only functions missing, or did you just
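As a rough illustration of what a subword compare-and-swap involves (this is not the code from the patch), a 1-byte operation can be emulated with a word-sized compare-and-swap on the aligned 32-bit word that contains the byte. The helper name subword_cas_u8 is hypothetical, and the sketch assumes a GCC-compatible compiler and that the whole containing word is readable and writable.

```c
#include <stdint.h>

/* Hypothetical helper: emulate a 1-byte compare-and-swap using a 4-byte
   compare-and-swap on the aligned word containing *PTR.  Returns the byte
   value observed before the operation, like __sync_val_compare_and_swap.
   Assumption: the entire containing word may be read and written.  */
static uint8_t
subword_cas_u8 (uint8_t *ptr, uint8_t expected, uint8_t desired)
{
  uintptr_t addr = (uintptr_t) ptr;
  uint32_t *word = (uint32_t *) (addr & ~(uintptr_t) 3);
  unsigned byte_index = (unsigned) (addr & 3);
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  unsigned shift = (3 - byte_index) * 8;
#else
  unsigned shift = byte_index * 8;
#endif
  uint32_t mask = (uint32_t) 0xff << shift;

  for (;;)
    {
      uint32_t old_word = __atomic_load_n (word, __ATOMIC_SEQ_CST);
      uint8_t old_byte = (uint8_t) ((old_word & mask) >> shift);

      /* The target byte does not match: fail without writing.  */
      if (old_byte != expected)
        return old_byte;

      /* Replace only the target byte and keep its neighbours.  */
      uint32_t new_word = (old_word & ~mask) | ((uint32_t) desired << shift);

      /* Retry if a neighbouring byte changed in the meantime.  */
      if (__sync_bool_compare_and_swap (word, old_word, new_word))
        return old_byte;
    }
}
```

The retry loop is needed because the word-sized operation can fail when an unrelated byte in the same word is modified concurrently, even though the target byte still holds the expected value.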

[PATCH] libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs

2020-06-24 Thread Kwok Cheung Yeung
the gfortran and libgomp testsuites. Okay for master/OG10? Thanks Kwok commit 052993de7457af85d5749b2ab119ffcc65e341e5 Author: Kwok Cheung Yeung Date: Thu Jun 18 12:40:16 2020 -0700 libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs The unmod

Re: [PATCH] libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs

2020-06-25 Thread Kwok Cheung Yeung
as obvious.) Hello I have committed your patch along with the testcase as 'obvious'. I have confirmed that it does not regress the gfortran and libgomp testsuites. Kwok commit f530bac8a11da9c85bdd8e58d589747f9825e38d Author: Kwok Cheung Yeung Date: Thu Jun 25 04:40:53

Re: [PATCH] libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs

2020-06-26 Thread Kwok Cheung Yeung
s currently only non-zero for nvptx. I think the easiest fix would be to expect different numbers of matches depending on whether nvptx offloading is enabled. This requires an extra function in gcc/testsuite/lib/target-supports.exp. Okay for master/OG10? Thanks Kwok commit 04bdcaa20827d814c32384

Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-06-30 Thread Kwok Cheung Yeung
On 23/06/2020 5:51 pm, Jakub Jelinek wrote: On Tue, Jun 23, 2020 at 06:44:26PM +0200, Thomas Schwinge wrote: On 2020-06-15T21:28:12+0100, Kwok Cheung Yeung wrote: This patch adds support on nvptx for __sync_val_compare_and_swap operations on 1- and 2-byte values. Is this a thorough review

Re: [PATCH] openmp: Retire nest-var ICV

2020-11-18 Thread Kwok Cheung Yeung
On 18/11/2020 11:41 am, Jakub Jelinek wrote: On Thu, Nov 12, 2020 at 10:44:35PM +, Kwok Cheung Yeung wrote: + /* OMP_NESTED is deprecated in OpenMP 5.0. */ + if (parse_boolean ("OMP_NESTED", &nested)) + gomp_global_icv.max_active_levels_var = +

Re: [PATCH] openmp: Implicit 'declare target' for C++ static initializers

2020-11-19 Thread Kwok Cheung Yeung
regressions, and no regressions in the libgomp testsuite with Nvidia offloading. Thanks, Kwok From 0348b149474d0922d79209705e6777e7af271e0d Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Wed, 18 Nov 2020 13:54:01 -0800 Subject: [PATCH] openmp: Implicitly add 'declare target'

Nested declare target support

2020-11-20 Thread Kwok Cheung Yeung
Hello New OpenMP 5.0 features that won't be available in GCC 9, are planned for GCC 10 or later versions as time permits: ... - nested declare target support You said in an email two years ago that nested declare target was not supported yet. I do not see any patches that claim to implemen

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2020-11-27 Thread Kwok Cheung Yeung
Hello This is an updated version of the WIP patch for task detach support. Any comments would be welcome! On 11/11/2020 7:06 pm, Kwok Cheung Yeung wrote: - No error checking at the front-end. The detach clause is now parsed properly in C, C++ and Fortran, and will raise an error if the

PING Re: [PATCH] openmp: Implicit 'declare target' for C++ static initializers

2020-11-27 Thread Kwok Cheung Yeung
Hello This patch still needs review. Thanks Kwok On 19/11/2020 6:07 pm, Kwok Cheung Yeung wrote: On 29/10/2020 10:03 am, Jakub Jelinek wrote: I'm actually not sure how this can work correctly. Let's say we have int foo () { return 1; } int bar () { return 2; } int baz () { retur

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2020-12-09 Thread Kwok Cheung Yeung
e to wait until GCC 11 is branched off)? Thanks Kwok commit 3d82db0fc3623e9dc241bed4c4cfd266574d45e7 Author: Kwok Cheung Yeung Date: Wed Dec 9 09:33:46 2020 -0800 openmp: Add support for the OpenMP 5.0 task detach clause 2020-12-09 Kwok Cheung Yeung gcc/

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2020-12-09 Thread Kwok Cheung Yeung
On 09/12/2020 5:53 pm, Jakub Jelinek wrote: On Wed, Dec 09, 2020 at 05:37:24PM +, Kwok Cheung Yeung wrote: I believe this patch is largely complete now. I have done a bootstrap on x86_64 and run the testsuites with no regressions. I have also run the libgomp testsuite with offloading to

[PATCH] openmp: Add support for omp_get_supported_active_levels

2020-10-13 Thread Kwok Cheung Yeung
aa519103d7eeaeed825fd358e9532bf51f4be0a9 Author: Kwok Cheung Yeung Date: Wed Oct 7 09:34:32 2020 -0700 openmp: Add support for the omp_get_supported_active_levels runtime library routine This patch implements the omp_get_supported_active_levels runtime routine from the OpenMP 5.0 specification, which returns

Re: [PATCH] openmp: Add support for omp_get_supported_active_levels

2020-10-13 Thread Kwok Cheung Yeung
Now committed to trunk with the suggested fixes. Thanks for the quick review. Kwok On 13/10/2020 7:36 pm, Jakub Jelinek wrote: I'd suggest to #define gomp_supported_active_levels INT_MAX in libgomp.h and leave out the const variable. Another possibility is an enumerator, but we don't include

[PATCH] openmp: Implement support for OMP_TARGET_OFFLOAD

2020-10-14 Thread Kwok Cheung Yeung
ommit a22f434d5ec9e62c158912b693275ce89a2cbab0 Author: Kwok Cheung Yeung Date: Thu Oct 8 10:08:27 2020 -0700 openmp: Implement support for OMP_TARGET_OFFLOAD environment variable This implements support for the OMP_TARGET_OFFLOAD environment variable introduced in the OpenMP 5.0 standard,
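For readers unfamiliar with the variable, the following is a minimal sketch of how its value could be interpreted. It is not the libgomp implementation. The three accepted values (MANDATORY, DISABLED and DEFAULT) come from the OpenMP 5.0 specification; the enum, the function names and the choice to treat an unset variable as the default are assumptions made for illustration, and surrounding white space is not handled.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type for this sketch.  */
enum offload_policy
{
  OFFLOAD_DEFAULT,
  OFFLOAD_MANDATORY,
  OFFLOAD_DISABLED,
  OFFLOAD_INVALID
};

/* Return nonzero if A and B are equal when letter case is ignored.  */
static int
equal_ignoring_case (const char *a, const char *b)
{
  while (*a != '\0' && *b != '\0')
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }
  return *a == *b;
}

/* Map the text of OMP_TARGET_OFFLOAD to a policy.  VALUE would normally
   be the result of getenv ("OMP_TARGET_OFFLOAD"); NULL means unset.  */
static enum offload_policy
parse_offload_policy (const char *value)
{
  if (value == NULL || equal_ignoring_case (value, "default"))
    return OFFLOAD_DEFAULT;
  if (equal_ignoring_case (value, "mandatory"))
    return OFFLOAD_MANDATORY;
  if (equal_ignoring_case (value, "disabled"))
    return OFFLOAD_DISABLED;
  return OFFLOAD_INVALID;
}
```

Taking the string as a parameter instead of calling getenv inside the function keeps the parsing step easy to test in isolation.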

Re: [PATCH] openmp: Add support for omp_get_supported_active_levels

2020-10-15 Thread Kwok Cheung Yeung
On 14/10/2020 9:20 am, Jakub Jelinek wrote: On Tue, Oct 13, 2020 at 07:05:10PM +0100, Kwok Cheung Yeung wrote: +* omp_get_supported_active_levels:: Maxiumum number of active levels supported Sorry for not catching it during review, but there is a typo above. Fixed with patch below, committed

Re: [PATCH] openmp: Implement support for OMP_TARGET_OFFLOAD

2020-10-19 Thread Kwok Cheung Yeung
Thanks Kwok commit 82555f50d2930f973ab20782ebcb836b719bce96 Author: Kwok Cheung Yeung Date: Mon Oct 19 10:47:42 2020 -0700 openmp: Implement support for OMP_TARGET_OFFLOAD environment variable This implements support for the OMP_TARGET_OFFLOAD environment variable introduc

Re: [PATCH] openmp: Implement support for OMP_TARGET_OFFLOAD

2020-10-20 Thread Kwok Cheung Yeung
On 20/10/2020 1:57 pm, Jakub Jelinek wrote: On Tue, Oct 20, 2020 at 02:17:26PM +0200, Tobias Burnus wrote: On 10/20/20 2:11 PM, Tobias Burnus wrote: Unfortunately, the committed patch (r11-4121-g1bfc07d150790fae93184a79a7cce897655cb37b) causes build errors. The error seems to be provoked by f

[PATCH] openmp: Implicit 'declare target' for C++ static initializers

2020-10-28 Thread Kwok Cheung Yeung
the compiler with no offloading on x86-64. Okay for trunk? Thanks Kwok commit d2c8c5bd2826851b727e93a8ea2141596e50a621 Author: Kwok Cheung Yeung Date: Wed Oct 28 07:13:14 2020 -0700 openmp: Implicitly add 'declare target' directives for dynamic static initializers in C++

Re: deprecations in OpenMP 5.0

2020-10-28 Thread Kwok Cheung Yeung
Hello I found this almost two-year old thread while looking for how the OpenMP 5.0 deprecations were to be handled. E.g. if somebody tries hard to write portable OpenMP code and has: omp_lock_t lock; #if __OPENMP__ >= 201811L omp_init_lock_with_hint (&lock, omp_sync_hint_contended); #

[PATCH] amdgcn: Add builtins for vectorized native versions of abs, floorf and floor

2022-11-08 Thread Kwok Cheung Yeung
37f49b204d501327d0867b3e8a3f01b9445fb9bd Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Tue, 8 Nov 2022 11:59:58 + Subject: [PATCH] amdgcn: Add builtins for vectorized native versions of abs, floorf and floor 2022-11-08 Kwok Cheung Yeung gcc/ * config/gcn/gcn

[COMMITTED] amdgcn: Fix expansion of GCN_BUILTIN_LDEXPV builtin

2022-11-08 Thread Kwok Cheung Yeung
2001 From: Kwok Cheung Yeung Date: Tue, 8 Nov 2022 14:38:23 + Subject: [PATCH] amdgcn: Fix expansion of GCN_BUILTIN_LDEXPV builtin 2022-11-08 Kwok Cheung Yeung gcc/ * config/gcn/gcn.cc (gcn_expand_builtin_1): Expand first argument of GCN_BUILTIN_LDEXPV to V64DFmode

Re: [OG12] [committed] amdgcn: Enable SIMD vectorization of math library functions

2022-11-08 Thread Kwok Cheung Yeung
Hello These additional patches were pushed onto the devel/omp/gcc-12 branch to fix various issues with the SIMD math library: ecf1603b7ad amdgcn: Fix expansion of GCN_BUILTIN_LDEXPV builtin 6c40e3f5daa amdgcn: Various fixes for SIMD math library 8e6c5b18e10 amdgcn: Fixed intermittent failure i

[PATCH] openmp: Add support for 'present' modifier

2023-02-03 Thread Kwok Cheung Yeung
ses. Bootstrapped on x86-64, no regressions in GCC testsuite, libgomp tested with x86-64 (no offloading), AMD GCN and NVPTX offloading. This is too late for GCC 13 now, but will this be okay for GCC 14? Thanks Kwok From ba9368f88514a27f374d84e53e36ce36fa9ac5bc Mon Sep 17 00:00:00 2001 From: Kwo

[OG12][committed] openmp: Add support for the 'present' modifier

2023-02-09 Thread Kwok Cheung Yeung
Hello I've ported my patch for supporting the OpenMP 5.1 'present' modifier and committed it to the devel/omp/gcc-12 development branch: 229b705862c openmp: Add support for the 'present' modifier Tested with offloading on amdgcn and nvptx. Kwok

[committed] wwwdocs: Document devel/omp/gcc-12

2022-06-29 Thread Kwok Cheung Yeung
previous devel/omp/gcc-11 branch now joins the list of inactive OMP branches. Kwok From 0695e5e969eba730e517a6adbdf38b8774f89437 Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Wed, 29 Jun 2022 22:32:39 +0100 Subject: [PATCH] Document devel/omp/gcc-12 branch Also moves the old devel/omp/gcc-11

[commit] [OG10] amdgcn: Add gfx908 support

2021-03-25 Thread Kwok Cheung Yeung
Hello I have backported commit 3535402e20118655b2ad4085a6e1d4f1b9c46e92 (amdgcn: Add gfx908 support) from mainline to the devel/omp/gcc-10 branch as commit bb55967ccde0b48f285150caf6443a327159b4a2. This adds support for the gfx908 GPU type. Kwok

[PATCH] Check suitability of spill register for mode

2019-11-14 Thread Kwok Cheung Yeung
ps on x86_64, though it currently does not use spill registers. Okay for trunk? Kwok 2019-11-14 Kwok Cheung Yeung gcc/ * lra-spills.c (assign_spill_hard_regs): Check that the spill register is suitable for the mode. --- gcc/lra-spills.c | 3 ++- 1 file changed, 2 inser

[PATCH] [GCN] Fix handling of VCC_CONDITIONAL_REG

2019-11-14 Thread Kwok Cheung Yeung
SGPR_REGS) to avoid expensive spills into memory. Built for and tested on the AMD GCN target with no regressions. Okay for trunk? Kwok 2019-11-14 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (gcn_regno_reg_class): Return VCC_CONDITIONAL_REG register class for VCC_LO and

[PATCH 0/5] [amdgcn] Reduce register usage on AMD GCN

2019-11-14 Thread Kwok Cheung Yeung
Hello Although GCN has a large register file, these registers are distributed among the threads (wavefronts) running on the same compute unit, so (up to a point) the fewer registers used in a kernel, the more kernels can run concurrently. While this is of limited use in trunk at the moment wi

[PATCH 1/5] [amdgcn] Use first lane of v1 for zero constant

2019-11-14 Thread Kwok Cheung Yeung
2019-11-14 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (gcn_expand_prologue): Remove initialization and prologue use of v0. (print_operand_address): Use v1 for zero vector offset. --- gcc/config/gcn/gcn.c | 17 +++-- 1 file changed, 3 insertions(+), 14

[PATCH 2/5] [amdgcn] Reinitialize registers for every function

2019-11-14 Thread Kwok Cheung Yeung
implementation is actually dead code!). I have added a call to reinit_regs in gcn_init_cumulative_args to setup the available registers for each function. Okay for trunk? Kwok 2019-11-14 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (gcn_init_cumulative_args): Call reinit_regs

[PATCH 3/5] [amdgcn] Restrict register usage in non-kernel functions

2019-11-14 Thread Kwok Cheung Yeung
to the newlib patch 'Stash reent marker in upper bits of s1 on AMD GCN' and the first patch in this series). Okay to commit? Kwok 2019-11-14 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (default_requested_args): New. (gcn_parse_amdgpu_hsa_kernel_attribute):

[PATCH 5/5] [amdgcn] Unfix frame pointer

2019-11-14 Thread Kwok Cheung Yeung
unk? Kwok 2019-11-14 Kwok Cheung Yeung gcc/ * config/gcn/gcn.h (FIXED_REGISTERS): Unfix frame pointer. (CALL_USED_REGISTERS): Make frame pointer callee-saved. --- gcc/config/gcn/gcn.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/gcn/g

[PATCH 4/5] [amdgcn] Update lower limits requested by non-leaf kernels

2019-11-14 Thread Kwok Cheung Yeung
? Kwok 2019-11-14 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (MAX_NORMAL_SGPR_COUNT, MAX_NORMAL_VGPR_COUNT): New. (gcn_conditional_register_usage): Use constants in place of hard-coded values. (gcn_hsa_declare_function_name): Set lower bound for number of SGPRs/VGPRs in

Re: [PATCH 4/5] [amdgcn] Update lower limits requested by non-leaf kernels

2019-11-15 Thread Kwok Cheung Yeung
On 15/11/2019 11:32 am, Andrew Stubbs wrote: On 14/11/2019 15:33, Kwok Cheung Yeung wrote: The kernel attributes are changed to request at least 64 SGPRs and 24 VGPRs (i.e. the non-kernel maximum, otherwise the callees may not have enough registers to run in) for non-leaf kernels to take

[og9] Backport AMD GCN backend improvements from mainline

2019-11-18 Thread Kwok Cheung Yeung
for mode I will commit this later. Thanks, Kwok 2019-11-07 Kwok Cheung Yeung gcc/ * ira.c (setup_alloc_regs): Setup no_unit_alloc_regs for frame pointer in multiple registers. (ira_setup_eliminable_regset): Setup eliminable_regset, ira_no_alloc_regs and

[PATCH] [og9] Fix libgomp.oacc-fortran/lib-16.f90 test

2019-11-22 Thread Kwok Cheung Yeung
osed to be identical to lib-16.f90 except for using 'include "openacc_lib.h"' instead of 'use openacc', any changes to lib-16-2.f90 should also apply to lib-16.f90. When the changes are applied to lib-16.f90, the test passes. Okay to commit to OG9? Than

Re: [PATCH][OG12] amdgcn: Support AMD-specific 'isa' and 'arch' traits in OpenMP context selectors

2022-12-02 Thread Kwok Cheung Yeung
So this is the OG12-specific part (including metadirective and dynamic context selectors) of the previous patch. Once https://gcc.gnu.org/r13-4446-ge41b243302e996 is backported, is it OK for OG12? Looks good to me, thanks! Kwok

[PATCH] Fix detection of thread support with uClibc in libgcc

2014-10-11 Thread Kwok Cheung Yeung
C++11 thread library together with the uClibc implementation of libpthread. This caused a large number of failed tests from the g++, libgomp and libstdc++ testsuites when run on a MIPS Linux target with uClibc as the C library. Kwok 2014-10-11 Kwok Cheung Yeung libgcc/ * gthr-po

Re: [PATCH] Fix detection of thread support with uClibc in libgcc

2014-10-11 Thread Kwok Cheung Yeung
On 11/10/2014 5:56 PM, Andrew Pinski wrote: On Sat, Oct 11, 2014 at 9:42 AM, Kwok Cheung Yeung wrote: __gthread_active_p() in libgcc checks for thread support by looking for the presence of a symbol from libpthread. With glibc, it looks for __pthread_key_create. However, it determines that
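The detection technique described in this message, checking whether a symbol from another library is present at run time, can be sketched with a weak reference. This is only an illustration under the assumption of an ELF target and a GCC-compatible compiler; the symbol name hypothetical_threads_marker and the function threads_linked_p are made up and are not the names used in libgcc.

```c
/* Weak declaration: if no linked object defines this symbol, the link
   still succeeds and the symbol's address evaluates to null.
   The name is hypothetical and stands in for a real library symbol.  */
extern void hypothetical_threads_marker (void) __attribute__ ((weak));

/* Return nonzero if the marker symbol was provided by some linked
   library, which is taken as a sign that thread support is present.  */
static int
threads_linked_p (void)
{
  return hypothetical_threads_marker != 0;
}
```

The weakness of this approach is the one the thread is about: the answer is only as good as the choice of marker symbol, so a C library that defines the symbol differently, or not at all, leads to a wrong result.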

Re: [PATCH] Fix for PR26702: Emit .size for BSS variables on arm-eabi

2015-04-30 Thread Kwok Cheung Yeung
Hello The target of the pr26702.c testcase was changed while committing from: { target arm*-*-eabi* } in my original patch to: { target arm_eabi } The check_effective_target_arm_eabi test (in gcc/testsuite/lib/target-supports.exp) checks for the presence of the __ARM_EABI__ preprocessor def

[PATCH] Fix for PR26702: Emit .size for BSS variables on arm-eabi

2015-03-30 Thread Kwok Cheung Yeung
... 6: 4 NOTYPE LOCAL DEFAULT 3 static_foo ... The testsuite has been run with an i686-pc-linux-gnu hosted cross-compiler targeted at arm-none-eabi with no regressions. Kwok 2015-03-30 Kwok Cheung Yeung gcc/ PR target/26702 * conf

Re: [wwwdocs] Document existence of openacc-gcc-9-branch

2019-06-11 Thread Kwok Cheung Yeung
Hello On 04/06/2019 11:05 pm, Julian Brown wrote: Hi, I've pushed a new branch "openacc-gcc-9-branch" to the Git mirror (i.e. as a Git-only branch), for development of OpenACC and related functionality on top of the GCC 9 branch. It's currently based off the gcc-9_1_0-release tag, and contains

Re: [PATCH, og9] Port OpenACC profiling interface to OG9

2019-07-26 Thread Kwok Cheung Yeung
On 24/07/2019 11:45 am, Thomas Schwinge wrote: +2017-02-28 Thomas Schwinge + + [...] + * oacc-parallel.c (GOACC_parallel_keyed_internal): Set device_api for + profiling. --- a/libgomp/oacc-parallel.c +++ b/libgomp/oacc-parallel.c @@ -275,6 +275,8 @@ GOACC_parallel_keyed_in

Re: [PATCH 2/5, OpenACC] Support Fortran optional arguments in the firstprivate clause

2019-07-29 Thread Kwok Cheung Yeung
On 12/07/2019 12:41 pm, Jakub Jelinek wrote: +/* Return true if DECL is a Fortran optional argument. */ + +bool +omp_is_optional_argument (tree decl) +{ + /* A passed-by-reference Fortran optional argument is similar to + a normal argument, but since it can be null the type is a + POINT

Re: [PATCH 4/5, OpenACC] Allow optional arguments to be used in the use_device OpenACC clause

2019-07-29 Thread Kwok Cheung Yeung
On 12/07/2019 12:38 pm, Kwok Cheung Yeung wrote: This patch fixes a similar situation that occurs with the use_device clause, where the lowering would result in a null dereference if applied to a non-present optional argument. This patch builds a conditional check that skips the dereference if

Re: [PATCH 04/10, OpenACC] Turn OpenACC kernels regions into a sequence of, parallel regions

2019-08-05 Thread Kwok Cheung Yeung
On 18/07/2019 10:30 am, Jakub Jelinek wrote: On Wed, Jul 17, 2019 at 10:06:07PM +0100, Kwok Cheung Yeung wrote: --- a/gcc/omp-oacc-kernels.c +++ b/gcc/omp-oacc-kernels.c @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3. If not see #include "backend.h" #include "targe

Re: [PATCH 06/10, OpenACC] Adjust parallelism of loops in gang-single parts of OpenACC kernels regions

2019-08-05 Thread Kwok Cheung Yeung
The change to patch 04 (Turn OpenACC kernels regions into a sequence of parallel regions) necessitates an additional include of 'diagnostic-core.h' in omp-oacc-kernels.c, as it is no longer indirectly included by 'cp/cp-tree.h'. Kwok On 17/07/2019 10:12 pm, Kwok Cheung Ye

Re: [PATCH 02/10, OpenACC] Add OpenACC target kinds for decomposed kernels regions

2019-08-05 Thread Kwok Cheung Yeung
I have run the whole patch series through check_GNU_style.sh and fixed up the formatting where indicated. Do I need to post the reformatted patchset? Thanks Kwok On 18/07/2019 10:24 am, Jakub Jelinek wrote: On Wed, Jul 17, 2019 at 10:04:10PM +0100, Kwok Cheung Yeung wrote: @@ -2319,7

[PATCH 0/5, OpenACC] Add support for Fortran optional arguments in OpenACC

2019-07-12 Thread Kwok Cheung Yeung
This patchset allows the use of Fortran optional arguments in OpenACC programs in accordance with section 2.17 of the OpenACC 2.6 specification. These patches were originally posted at https://gcc.gnu.org/ml/gcc-patches/2019-01/msg01750.html for the OG8 branch. This version is targeted at trun

[PATCH 1/5, OpenACC] Allow NULL as an argument to OpenACC 2.6 directives

2019-07-12 Thread Kwok Cheung Yeung
Fortran pass-by-reference optional arguments behave much like normal Fortran arguments when lowered to GENERIC/GIMPLE, except they can be null (representing a non-present argument). Some parts of libgomp (those dealing with updating mappings) currently do not expect to take a null address and

[PATCH 2/5, OpenACC] Support Fortran optional arguments in the firstprivate clause

2019-07-12 Thread Kwok Cheung Yeung
Reference types used by Fortran often need to be treated specially in the OACC lowering to deal with the referenced object as well as the reference itself. However, as optional arguments can be null, they are pointer types rather than reference types, so the code to detect these situations need

[PATCH 3/5, OpenACC] Add support for allocatable arrays as optional arguments

2019-07-12 Thread Kwok Cheung Yeung
This patch allows allocatable arrays passed as Fortran optional arguments to be used in OpenACC. The GIMPLE code generated by the current lowering unconditionally attempts to access fields within the structure representing the array, resulting in a null dereference if the array is non-present.

[PATCH 4/5, OpenACC] Allow optional arguments to be used in the use_device OpenACC clause

2019-07-12 Thread Kwok Cheung Yeung
This patch fixes a similar situation that occurs with the use_device clause, where the lowering would result in a null dereference if applied to a non-present optional argument. This patch builds a conditional check that skips the dereference if the argument is non-present, and ensures that opt
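
As an illustration of the kind of guard described in the two snippets above, here is a minimal C sketch. It is not the GIMPLE that the patches generate: `struct array_desc`, `desc_data_or_null` and `sum_optional` are hypothetical names invented for this example, and the descriptor layout is a stand-in rather than gfortran's real array descriptor. A non-present optional argument is modelled as a null descriptor pointer, and the field access is skipped in that case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an array descriptor; this is NOT the
   layout gfortran actually uses.  */
struct array_desc
{
  int *data;
  size_t len;
};

/* Return the data pointer of DESC, or NULL when the optional argument
   is non-present (modelled here as a null descriptor pointer).  The
   field is only read after the presence check.  */
int *
desc_data_or_null (const struct array_desc *desc)
{
  return desc != NULL ? desc->data : NULL;
}

/* Sum the elements of DESC, treating a non-present argument as empty.  */
long
sum_optional (const struct array_desc *desc)
{
  /* A present, non-empty array must have storage behind it.  */
  assert (desc == NULL || desc->len == 0 || desc->data != NULL);

  const int *p = desc_data_or_null (desc);
  long total = 0;

  if (p == NULL)
    return 0;

  for (size_t i = 0; i < desc->len; i++)
    total += p[i];
  return total;
}
```

Calling `sum_optional (NULL)` takes the guarded path and never touches the descriptor fields, which is the behaviour the unconditional lowering lacked.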

[PATCH 5/5, OpenACC] Add tests for Fortran optional arguments in OpenACC 2.6

2019-07-12 Thread Kwok Cheung Yeung
This adds testcases exercising the use of optional arguments in the various OpenACC directives. Where applicable, both the present and non-present cases are tested, with an integer, array of integers and allocatable array of integers as the argument. libgomp/ * testsuite/libgom

Re: [PATCH 2/5, OpenACC] Support Fortran optional arguments in the firstprivate clause

2019-07-17 Thread Kwok Cheung Yeung
On 12/07/2019 12:41 pm, Jakub Jelinek wrote: This should be done through a langhook. Are really all PARM_DECLs with DECL_BY_REFERENCE and pointer type optional arguments? I mean, POINTER_TYPE is used for a lot of cases. Hmmm... I thought it was the case that if you pass an argument in by refer

[PATCH 00/10, OpenACC] Rework handling of OpenACC kernels regions

2019-07-17 Thread Kwok Cheung Yeung
This series of patches reworks the way that OpenACC kernels regions are processed by GCC. Instead of relying on the parloops pass for auto-parallelisation of the kernel region, the contents of the region are transformed into a sequence of offloaded regions, which are then processed individually

[PATCH 01/10, OpenACC] Use "-fopenacc-kernels=parloops" to document "parloops" test cases

2019-07-17 Thread Kwok Cheung Yeung
This patch introduces a new option "-fopenacc-kernels" to control how OpenACC kernels are processed. The current behaviour will be equivalent to '-fopenacc-kernels=parloops'. 2019-07-16 Thomas Schwinge gcc/ * flag-types.h (enum openacc_kernels): New type. gcc/c-fami

[PATCH 02/10, OpenACC] Add OpenACC target kinds for decomposed kernels regions

2019-07-17 Thread Kwok Cheung Yeung
This patch is in preparation for changes that will cut up OpenACC kernels regions into individual parts. For the new sub-regions that will be generated, this adds the following new kinds of OpenACC regions for internal use: - GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_PARALLELIZED for parts of ke

[PATCH 03/10, OpenACC] Separate OpenACC kernels regions in data and parallel parts

2019-07-17 Thread Kwok Cheung Yeung
In the future, kernels regions will be transformed into data regions containing a sequence of serial and parallel offloaded regions. This first patch sets up a new pass that is responsible for this transformation, and in a first step constructs the new data region containing a parallel region wi

[PATCH 04/10, OpenACC] Turn OpenACC kernels regions into a sequence of parallel regions

2019-07-17 Thread Kwok Cheung Yeung
This patch decomposes each OpenACC kernels region into a sequence of parallel regions. Each OpenACC loop nest turns into its own region; any code between such loop nests is gathered up into a region as well. The loop regions can be distributed across gangs if the original kernels region had a nu
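
The grouping rule in this snippet can be sketched as a toy model in C. This is only an illustration under stated assumptions, not the pass from the patch: `enum stmt_kind` and `split_into_regions` are hypothetical names, and a kernels region body is reduced to a flat array of statement kinds. Each loop nest receives its own region number, and each maximal run of other statements between loop nests shares one region number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the statements in a kernels region.  */
enum stmt_kind
{
  STMT_OTHER,
  STMT_LOOP_NEST
};

/* Assign a region number to each of the N statements in STMTS, storing
   the result in REGION_OF, and return the number of regions created.
   Every loop nest gets a region of its own; consecutive non-loop
   statements are gathered into a single region.  */
size_t
split_into_regions (const enum stmt_kind *stmts, size_t n, size_t *region_of)
{
  size_t nregions = 0;
  int in_other_run = 0;

  assert (n == 0 || (stmts != NULL && region_of != NULL));

  for (size_t i = 0; i < n; i++)
    {
      if (stmts[i] == STMT_LOOP_NEST)
        {
          /* A loop nest always opens a fresh region.  */
          region_of[i] = nregions++;
          in_other_run = 0;
        }
      else
        {
          /* Open a new region only at the start of a run.  */
          if (!in_other_run)
            {
              nregions++;
              in_other_run = 1;
            }
          region_of[i] = nregions - 1;
        }
    }
  return nregions;
}
```

For a body of two plain statements, two loop nests and one trailing statement, this model yields four regions: one for the leading statements, one per loop nest, and one for the trailing statement.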

[PATCH 05/10, OpenACC] Handle conditional execution of loops in OpenACC kernels regions

2019-07-17 Thread Kwok Cheung Yeung
Any OpenACC loop controlled by an if statement or a non-OpenACC loop must be executed in a gang-single region. Detecting such loops is not trivial as OpenACC kernels expansion is done on GIMPLE but before computation of the control flow graph. This patch adds an auxiliary analysis for determinin

[PATCH 06/10, OpenACC] Adjust parallelism of loops in gang-single parts of OpenACC kernels regions

2019-07-17 Thread Kwok Cheung Yeung
Loops in gang-single parts of kernels regions cannot be executed in gang-redundant mode. If the user specified gang clauses on such loops, emit an error and remove these clauses. Adjust automatic partitioning to exclude gang partitioning in gang-single regions. 2019-07-16 Gergö Barany

[PATCH 07/10, OpenACC] Launch kernels asynchronously in OpenACC kernels regions

2019-07-17 Thread Kwok Cheung Yeung
Kernels regions are decomposed into one or more smaller regions that are to be executed in sequence. With this patch, all of these regions are launched asynchronously, and a wait directive is added after them. This means that the host only waits once for the kernels to complete, not once per ker

[PATCH 08/10, OpenACC] New OpenACC kernels region decompose algorithm

2019-07-17 Thread Kwok Cheung Yeung
Previously, OpenACC kernels region bodies were decomposed into a sequence of alternating gang-single and gang-parallel "parallel" regions. The new algorithm in this patch introduces a third possibility: Loops that look like they might benefit from the parloops pass are converted into old "kernel

[PATCH 09/10, OpenACC] Avoid introducing 'create' mapping clauses for loop index variables in kernels regions

2019-07-17 Thread Kwok Cheung Yeung
This patch avoids adding CREATE mapping clauses for loop index variables. It also sets TREE_ADDRESSABLE on newly mapped declarations, which fixes an ICE that sometimes appears due to an assert firing in omp-low.c. 2019-07-16 Julian Brown gcc/ * omp-oacc-kernels.c (find_omp_f

[PATCH 10/10, OpenACC] Make new OpenACC kernels conversion the default; adjust and add tests

2019-07-17 Thread Kwok Cheung Yeung
This patch makes the new kernel conversion scheme the default, and adjusts the tests accordingly. 2019-07-16 Thomas Schwinge Kwok Cheung Yeung gcc/c-family/ * c.opt (fopenacc-kernels): Default to "split". gcc/fortran/ * lang.opt

Re: [PATCH 2/5, OpenACC] Support Fortran optional arguments in the firstprivate clause

2019-07-18 Thread Kwok Cheung Yeung
On 18/07/2019 10:28 am, Tobias Burnus wrote: Hi all, I played around and came up with a second way one gets a single "*" without 'optional'. I haven't checked which of those match the proposed omp_is_optional_argument's +&& DECL_BY_REFERENCE (decl) +&& TREE_CODE

[PATCH] Add myself to MAINTAINERS

2019-01-25 Thread Kwok Cheung Yeung
Fei Yang Jeffrey Yasskin Joey Ye +Kwok Cheung Yeung Greta Yorsh David Yuste
