Re: [OpenACC] Update OpenACC data clause semantics to the 2.5 behavior - runtime

2018-06-20 Thread Cesar Philippidis
On 06/20/2018 09:45 AM, Jakub Jelinek wrote: > On Tue, Jun 19, 2018 at 10:01:20AM -0700, Cesar Philippidis wrote: >> From 53ee03231c5e6e4747b4ef01335079a2d4a98480 Mon Sep 17 00:00:00 2001 >> From: Cesar Philippidis >> Date: Tue, 19 Jun 2018 09:33:04 -0700 >> Subjec

Re: [OpenACC] Update OpenACC data clause semantics to the 2.5 behavior - runtime

2018-06-20 Thread Cesar Philippidis
On 06/20/2018 10:03 AM, Jakub Jelinek wrote: > On Wed, Jun 20, 2018 at 09:59:29AM -0700, Cesar Philippidis wrote: >> If it means anything, we have a significant async change that removes >> the async_refcount field in that struct. > > Wasn't async_refcount removed 2 y

[patch] adjust default nvptx launch geometry for OpenACC offloaded regions

2018-06-20 Thread Cesar Philippidis
is patch OK for trunk? Thanks, Cesar 2018-06-20 Cesar Philippidis gcc/ * config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Delete define. (PTX_DEFAULT_RUNTIME_DIM): New define. (nvptx_goacc_validate_dims): Use it to allow the runtime to dynamically allocate

Re: [patch] adjust default nvptx launch geometry for OpenACC offloaded regions

2018-06-21 Thread Cesar Philippidis
On 06/20/2018 03:15 PM, Tom de Vries wrote: > On 06/20/2018 11:59 PM, Cesar Philippidis wrote: >> Now it follows the formula contained in >> the "CUDA Occupancy Calculator" spreadsheet that's distributed with CUDA. > > Any reason we're not using the cuda

Re: [patch] adjust default nvptx launch geometry for OpenACC offloaded regions

2018-06-29 Thread Cesar Philippidis
Ping. Cesar On 06/20/2018 02:59 PM, Cesar Philippidis wrote: > At present, the nvptx libgomp plugin does not take into account the > amount of shared resources on GPUs (mostly shared-memory and register > usage) when selecting the default num_gangs and num_workers. In certain > si

[patch] Update support for Fortran arrays in OpenACC

2018-06-29 Thread Cesar Philippidis
e reported line number in fortran combined OpenACC directives Is this patch OK for trunk? It bootstrapped / regression tested cleanly for x86_64 with nvptx offloading. Thanks, Cesar 2018-06-29 Cesar Philippidis gcc/fortran/ * trans-array.c (gfc_trans_array_bounds): Add an INIT_VLA ar

Re: [patch] Update support for Fortran arrays in OpenACC

2018-06-29 Thread Cesar Philippidis
On 06/29/2018 10:49 AM, Jakub Jelinek wrote: > On Fri, Jun 29, 2018 at 10:33:56AM -0700, Cesar Philippidis wrote: >> @@ -1044,21 +1046,6 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p) >> return; >> >>tree decl = OMP_CLAUSE_DECL (c); >> - >

[patch] various OpenACC reduction enhancements

2018-06-29 Thread Cesar Philippidis
s. Thanks, Cesar

Re: [patch] various OpenACC reduction enhancements - ME and nvptx changes

2018-06-29 Thread Cesar Philippidis
The attached patch includes the nvptx and GCC ME reduction enhancements. Is this patch OK for trunk? It bootstrapped / regression tested cleanly for x86_64 with nvptx offloading. Thanks, Cesar 2018-06-29 Cesar Philippidis Nathan Sidwell gcc/ * config/nvptx/nvptx.c

Re: [patch] various OpenACC reduction enhancements - FE changes

2018-06-29 Thread Cesar Philippidis
Attached are the FE changes for the OpenACC reduction enhancements. It depends on the ME patch. Is this patch OK for trunk? It bootstrapped / regression tested cleanly for x86_64 with nvptx offloading. Thanks, Cesar 2018-06-29 Cesar Philippidis Nathan Sidwell gcc/c/ * c-parser.c

Re: [patch] various OpenACC reduction enhancements - test cases

2018-06-29 Thread Cesar Philippidis
Attached are the updated reduction test cases. Again, these have been bootstrapped and regression tested cleanly for x86_64 with nvptx offloading. Is it OK for trunk? Thanks, Cesar 2018-06-29 Cesar Philippidis Nathan Sidwell gcc/testsuite/ * c-c++-common/goacc/orphan-reductions-1.c

[patch] Add OpenACC Fortran support for deviceptr and variable in common blocks

2018-06-29 Thread Cesar Philippidis
is patch OK for trunk? It bootstrapped / regression tested cleanly for x86_64 with nvptx offloading. Thanks, Cesar 2018-06-29 Cesar Philippidis James Norris gcc/fortran/ * openmp.c (gfc_match_omp_map_clause): Re-write handling of the deviceptr clause. Add new common_blocks a

Re: [patch] adjust default nvptx launch geometry for OpenACC offloaded regions

2018-06-29 Thread Cesar Philippidis
On 06/29/2018 10:12 AM, Cesar Philippidis wrote: > Ping. While porting the vector length patches to trunk, I realized that I mistakenly removed support for the environment variable GOMP_OPENACC_DIM in this patch (thanks for adding those test cases, Tom!). I'll post an updated version of th

Re: [PATCH,PTX] Add support for CUDA 9

2018-01-17 Thread Cesar Philippidis
On 12/27/2017 01:16 AM, Tom de Vries wrote: > On 12/21/2017 06:19 PM, Cesar Philippidis wrote: >> My test results are somewhat inconsistent. On MG's build servers, there >> are no regressions in CUDA 8. > > Ack. > >> On my laptop, there are fewer regressions

[PATCH,NVPTX] Fix PR83920

2018-01-17 Thread Cesar Philippidis
me failure with gemm example in the PR, so I didn't include it in the patch. However, this patch does fix the failure with da-1.c in og7. This patch does not cause any regressions. Is it OK for trunk? Thanks, Cesar diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index 5

Re: [PATCH,PTX] Add support for CUDA 9

2018-01-18 Thread Cesar Philippidis
On 12/19/2017 04:39 PM, Tom de Vries wrote: > On 12/20/2017 12:25 AM, Cesar Philippidis wrote: >> og7-ptx-cuda9.diff >> >> >> 2017-12-19  Cesar Philippidis  >> >> gcc/ >> * config/nvptx/nvptx.c (output_init_frag): Don't use generic addres

[og7] backport fix for PR83920

2018-01-19 Thread Cesar Philippidis
I've backported the patch Tom committed to trunk to fix PR83920 to openacc-gcc-7-branch in revision d0a1e0fa43ca4004fde33707cb6a93c01cb11507. No changes were required for og7. The original email can be found here <https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01729.html>. Cesar

[og7,nvptx] Backport CUDA 9 support from trunk.

2018-01-19 Thread Cesar Philippidis
into trunk. This patch keeps both trunk and og7 consistent. Cesar [nvptx] Backport CUDA 9 support from trunk. 2018-01-19 Cesar Philippidis Backport from mainline: 2018-01-19 Cesar Philippidis PR target/83790 gcc/ * config/nvptx/nvptx.c (output_init_frag): diff --git a/gcc/config/nvpt

[og7] Build libffi during bootstrap.

2018-01-25 Thread Cesar Philippidis
al variables at runtime. Cesar Build libffi during bootstrap. 2018-01-25 Cesar Philippidis * Makefile.def: Bootstrap module libffi. Add libffi dependency to all-target-libgomp. * Makefile.in: Regenerate. * configure.ac: Add libffi to bootstrap_target_libs when libgomp is bootstrapped. * config

[og7] Privatize independent OpenACC reductions

2018-01-26 Thread Cesar Philippidis
committee argue that the reduction variable in inner-reduction.c should be firstprivate, not copy. Cesar Privatize independent OpenACC reductions. 2018-01-26 Cesar Philippidis gcc/ * gimplify.c (oacc_privatize_reduction): New function. (omp_add_variable): Use it to determine if a reduction va

[og7] Enable firstprivate OpenACC reductions

2018-01-31 Thread Cesar Philippidis
nside gimplify.c:omp_add_variable. I know that it's been a while since you last worked on this. Let me know if you have any state on that code, otherwise I'll handle the cleanup. Cesar Enable firstprivate OpenACC reductions 2018-01-31 Cesar Philippidis gcc/ * gimplify.c (omp_add_variable): Allow certain Op

[og7] Properly handle alloca'd arrays in OpenACC data mappings

2018-01-31 Thread Cesar Philippidis
this problem would have been detected sooner. I'm considering moving the PTX .param pass later, possibly during oaccdevlow. But that will have to wait for some other time. I've applied this patch to openacc-gcc-7-branch. Cesar Properly handle alloca'd OpenACC data mappings 2018-01-3

[og7] vector_length extension part 1: generalize function and variable names

2018-03-01 Thread Cesar Philippidis
it will be used in other places, including nvptx_validate_dims and the nvptx reduction handling code. This patch has been committed to openacc-gcc-7-branch. Cesar 2018-03-01 Cesar Philippidis gcc/ * config/nvptx/nvptx.c (PTX_VECTOR_LENGTH, PTX_WORKER_LENGTH, PTX_DEFAULT_RUNTIME_DIM): Move

[og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-02 Thread Cesar Philippidis
oversial. I'll commit this patch to openacc-gcc-7-branch once the other patches are ready. There will be three more patches in this series. Cesar 2018-03-02 Cesar Philippidis gcc/ * config/nvptx/nvptx.c (oacc_bcast_partition): Declare. (nvptx_init_axis_predicate): Initi

[og7] vector_length extension part 3: reductions

2018-03-02 Thread Cesar Philippidis
finalizer will be slow. However, that's a project for another day. I'll commit this patch to openacc-gcc-7-branch after Tom reviews the new nvptx_red_partition insn. Cesar 2018-03-02 Cesar Philippidis gcc/ * config/nvptx/nvptx-protos.h (nvptx_output_red_partition): Decl

[og7] vector_length extension part 4: target hooks and automatic parallelism

2018-03-02 Thread Cesar Philippidis
code. Overall, the changes in this patch are mild. I'll apply it to openacc-gcc-7-branch after Tom approves the reduction patch. Cesar 2018-03-02 Cesar Philippidis gcc/ * config/nvptx/nvptx.c (NVPTX_GOACC_VL_WARP): Define. (nvptx_goacc_needs_vl_warp): New function. (nvptx_goac

[og7] vector_length extension part 5: libgomp and tests

2018-03-02 Thread Cesar Philippidis
c-7-branch once the reduction changes have been approved. Cesar 2018-03-02 Cesar Philippidis libgomp/ * plugin/plugin-nvptx.c (nvptx_exec): Adjust calculations of workers and vectors. * testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: New test. * testsuite/libgomp.oacc-fortran/gemm.f90: Ne

[og7] Update nvptx_fork/join barrier placement

2018-03-08 Thread Cesar Philippidis
et that I posted last week. However, that patch set didn't consider the placement of the joining barrier. I've applied this patch to openacc-gcc-7-branch. Tom, is a similar patch OK for trunk? The major difference between trunk and og7 is that og7 changed the name of nvptx_warp_sync to n

Re: [og7] vector_length extension part 1: generalize function and variable names

2018-03-09 Thread Cesar Philippidis
On 03/09/2018 07:29 AM, Thomas Schwinge wrote: > On Thu, 1 Mar 2018 13:17:01 -0800, Cesar Philippidis > wrote: >> To reduce the size of the final patch, >> I've separated all of the misc. function and variable renaming into this >> patch. > > Yes, please

Re: [og7] Update nvptx_fork/join barrier placement

2018-03-09 Thread Cesar Philippidis
On 03/09/2018 08:21 AM, Tom de Vries wrote: > On 03/09/2018 12:31 AM, Cesar Philippidis wrote: >> Nvidia Volta GPUs now support warp-level synchronization. > > Well, let's try to make that statement a bit more precise. > > All Nvidia architectures have supported synch

[og7] Backport PR74048 and PR81352 nvptx fixes

2018-03-12 Thread Cesar Philippidis
cts unused parallelism (in this case, num_workers was being set but there was no worker partitioned loop). That problem went away with an extra dg-warning line. Cesar 2018-03-12 Cesar Philippidis Backport from trunk: 2018-01-25 Tom de Vries PR target/84028 gcc/ * config/nvptx/nvptx.c (nv

Re: [og7] Update nvptx_fork/join barrier placement

2018-03-19 Thread Cesar Philippidis
On 03/19/2018 07:04 AM, Tom de Vries wrote: > On 03/09/2018 05:55 PM, Cesar Philippidis wrote: >> On 03/09/2018 08:21 AM, Tom de Vries wrote: >>> On 03/09/2018 12:31 AM, Cesar Philippidis wrote: >>>> Nvidia Volta GPUs now support warp-level synchronization. >

Re: [og7] Update nvptx_fork/join barrier placement

2018-03-19 Thread Cesar Philippidis
On 03/19/2018 10:02 AM, Tom de Vries wrote: > On 03/19/2018 03:55 PM, Cesar Philippidis wrote: >>> Note that this changes ordering of the vector-neutering jump and >>> worker-neutering jump at the end. In principle, this should not be >>> harmful, but it viol

[og7] backport fix for PR84952

2018-03-20 Thread Cesar Philippidis
to nvptx_cta_sync so that function can be used for both large vector_lengths along with workers. Other than that, I didn't have to make any changes to his patch. Cesar 2018-03-20 Cesar Philippidis gcc/ * config/nvptx/nvptx.c (nvptx_single): Revert changes from 7445a4d40. Backport fro

Re: [og7] vector_length extension part 4: target hooks and automatic parallelism

2018-03-21 Thread Cesar Philippidis
On 03/21/2018 08:49 AM, Tom de Vries wrote: > On 03/02/2018 08:18 PM, Cesar Philippidis wrote: > >> og7-vl-part4-hooks.diff > >> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c >> index 5642941c6a3..507c8671704 100644 >> --- a/gcc/config/nvptx/

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-21 Thread Cesar Philippidis
On 03/21/2018 10:10 AM, Tom de Vries wrote: > On 03/02/2018 05:55 PM, Cesar Philippidis wrote: >> In addition, nvptx_cta_sync and the corresponding nvptx_barsync insn, >> have been extended to take a barrier ID and a thread count. The idea >> here is to assign one barrier fo

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Cesar Philippidis
On 03/22/2018 06:43 AM, Tom de Vries wrote: > On 03/22/2018 04:59 AM, Cesar Philippidis wrote: >> On 03/21/2018 10:10 AM, Tom de Vries wrote: >>> Changing the code generation scheme for workers is fine, but obviously >>> that should be a minimal, separate patch

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Cesar Philippidis
On 03/22/2018 07:23 AM, Tom de Vries wrote: > On 03/02/2018 05:55 PM, Cesar Philippidis wrote: > >> (nvptx_declare_function_name): Emit a .maxntid directive hint and >> call nvptx_init_oacc_workers. > >> + >> +  /* Emit a .maxntid hint to help the PTX JIT

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Cesar Philippidis
On 03/22/2018 07:44 AM, Tom de Vries wrote: > On 03/02/2018 05:55 PM, Cesar Philippidis wrote: >> The attached patch generalizes the worker state propagation and >> synchronization code to handle large vectors. When the vector_length is >> larger than a CUDA warp, the nvptx B

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Cesar Philippidis
10; i++) >     a[i] = i; > >   return 0; > } > ... > > I get: > ... >  .maxntid 32, 16, 1 > ... > > That's the change you need to isolate. I attached an updated patch which incorporates the cfun->machine->axis_dim changes. It now generates more precise

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Cesar Philippidis
On 03/22/2018 10:39 AM, Tom de Vries wrote: > On 03/02/2018 05:55 PM, Cesar Philippidis wrote: >> +  rtx red_partition; /* Similar to bcast_partition, except for vector >> +    reductions.  */ > > Shouldn't this be in "[og7] vector_length extension part 3: r

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Cesar Philippidis
On 03/22/2018 10:51 AM, Tom de Vries wrote: > On 03/22/2018 06:24 PM, Cesar Philippidis wrote: >> On 03/22/2018 09:18 AM, Tom de Vries wrote: >> >>> That's obviously not good enough. >>> >>> When I compile this test-case: >>> ... >>

Re: [og7] vector_length extension part 4: target hooks and automatic parallelism

2018-03-26 Thread Cesar Philippidis
On 03/26/2018 07:14 AM, Tom de Vries wrote: > On 03/02/2018 08:18 PM, Cesar Philippidis wrote: >> diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c >> index ba3f4317f4e..f15ce6b8f8d 100644 >> --- a/gcc/omp-offload.c >> +++ b/gcc/omp-offload.c >> @@ -626,7 +626,8

[PATCH,nvptx] Fix PR85056

2018-03-26 Thread Cesar Philippidis
this patch OK for trunk if the results come back clean? Thanks, Cesar 2018-03-26 Cesar Philippidis gcc/ PR target/85056 * config/nvptx/nvptx.c (nvptx_assemble_decl_begin): Add '[]' to extern array declarations. gcc/testsuite/ * testsuite/gcc.target/nvptx/pr85056.c: New test.

Re: [PATCH,nvptx] Fix PR85056

2018-03-28 Thread Cesar Philippidis
On 03/27/2018 01:17 AM, Tom de Vries wrote: > On 03/26/2018 11:57 PM, Cesar Philippidis wrote: >> As noted in PR85056, the nvptx BE isn't declaring external arrays using >> PTX array notation. Specifically, it's emitting code that's missing the >> empty angle b

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-30 Thread Cesar Philippidis
On 03/30/2018 07:45 AM, Tom de Vries wrote: > On 03/30/2018 03:07 AM, Tom de Vries wrote: >> On 03/02/2018 05:55 PM, Cesar Philippidis wrote: >>> As a follow up patch will show, the nvptx BE falls back to using >>> vector_length = 32 when a vector loop is nested in

[PATCH] Handle empty infinite loops in OpenACC for PR84955

2018-04-06 Thread Cesar Philippidis
ds to be fixed up and call cleanup_cfg as necessary. But I wanted to keep the OMP and OACC code paths similar, so I took the former approach. I regression tested this patch on x86_64-linux using nvptx offloading. Is this patch OK for trunk and GCC 7 (and probably GCC 6)? Thanks, Cesar Fix PR84955 20

[og7] Enable worker partitioning with warp-sized vector_length

2018-04-10 Thread Cesar Philippidis
. Consequently, not all of the CUDA threads were being utilized when vector_length = 32 (which is the default case). I've committed this patch to openacc-gcc-7-branch which allows warp-sized vectors to nest inside worker-partitioned loops. Cesar 2018-04-10 Cesar Philippidis gcc/ * config/
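A minimal sketch of the loop shape this message is about, assuming a warp-sized vector loop (vector_length = 32) nested inside a gang/worker-partitioned loop. The function name row_sums, the flat row-major layout, and the num_workers(4) value are illustrative assumptions and do not come from the patch. Without OpenACC support the pragmas are ignored and the code runs sequentially.

```c
#include <assert.h>

/* Hypothetical helper: sum each row of a rows x cols matrix stored
   row-major in M.  The outer loop is gang/worker partitioned; the inner
   loop is vector partitioned with a warp-sized vector_length.  */
void
row_sums (const int *m, long *sums, int rows, int cols)
{
  assert (rows > 0 && cols > 0);

#pragma acc parallel loop gang worker num_workers(4) vector_length(32) \
  copyin(m[0:rows * cols]) copyout(sums[0:rows])
  for (int i = 0; i < rows; i++)
    {
      long s = 0;  /* Declared inside the loop body, so private to it.  */

#pragma acc loop vector reduction(+:s)
      for (int j = 0; j < cols; j++)
        s += m[i * cols + j];

      sums[i] = s;
    }
}
```

Calling row_sums on a 2 x 3 matrix holding 1 through 6 fills sums with 6 and 15. How the workers and vectors are actually laid out on the GPU is up to the compiler and runtime.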

Re: [PATCH] Handle empty infinite loops in OpenACC for PR84955

2018-04-11 Thread Cesar Philippidis
On 04/09/2018 04:31 AM, Richard Biener wrote: > On Fri, 6 Apr 2018, Jakub Jelinek wrote: > >> On Fri, Apr 06, 2018 at 06:48:52AM -0700, Cesar Philippidis wrote: >>> 2018-04-06 Cesar Philippidis >>> >>> PR middle-end/84955 >>> >>>

Re: [PATCH] Handle empty infinite loops in OpenACC for PR84955

2018-04-12 Thread Cesar Philippidis
On 04/12/2018 11:27 AM, H.J. Lu wrote: > On Wed, Apr 11, 2018 at 12:30 PM, Cesar Philippidis > wrote: >> On 04/09/2018 04:31 AM, Richard Biener wrote: >>> On Fri, 6 Apr 2018, Jakub Jelinek wrote: >>> >>>> On Fri, Apr 06, 2018 at 06:48:52AM -0700, Cesar

[openacc] Teach gfortran to lower OpenACC routine dims

2018-09-05 Thread Cesar Philippidis
, Cesar [openacc] Teach gfortran to lower OpenACC routine dims gcc/fortran/ * gfortran.h (oacc_function): New enum. (gfc_oacc_routine_name): Add locus loc field. * openmp.c (gfc_oacc_routine_dims): Return oacc_function. (gfc_match_oacc_routine): Update routine clause syntax checking. Populate

[patch][OpenACC] Add target hook TARGET_GOACC_ADJUST_PARALLELISM

2018-09-05 Thread Cesar Philippidis
vectors fit inside workers. The target hook itself doesn't do anything for the host, but the nvptx BE will make use of it. Is this patch OK for trunk? I regtested and bootstrapped for x86_64 with nvptx offloading. Thanks, Cesar [openacc] Add target hook TARGET_GOACC_ADJUST_PARALLELISM gcc/

[OpenACC] Enable firstprivate OpenACC reductions

2018-09-05 Thread Cesar Philippidis
for (...) { #pragma acc loop reduction(+:s2) Here s2 will be transferred into the accelerator as firstprivate instead of copy. Is this OK for trunk? I regtested and bootstrapped for x86_64 with nvptx offloading. Cesar [OpenACC] Enable firstprivate OpenACC reductions 2018-XX-YY Cesar
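A minimal sketch of the pattern the snippet describes, assuming a scalar s2 declared outside the parallel region and reduced in an inner loop. The function name inner_sums, the array names, and the num_gangs(1) clause are illustrative assumptions; the original test case is truncated here and is not reproduced. Whether s2 is mapped as firstprivate or copy is the compiler's decision; the sketch never reads s2 on the host after the region, so it does not depend on that choice.

```c
#include <assert.h>

/* Hypothetical example: OUT[i] receives the sum of row i of the
   n x m row-major matrix IN.  */
void
inner_sums (const int *in, int *out, int n, int m)
{
  int s2 = 0;  /* Lives outside the region, so its data mapping matters.  */

  assert (n > 0 && m > 0);

  /* A single gang keeps the result deterministic regardless of whether
     s2 ends up firstprivate or copied.  */
#pragma acc parallel num_gangs(1) copyin(in[0:n * m]) copyout(out[0:n])
  for (int i = 0; i < n; i++)
    {
      s2 = 0;

#pragma acc loop reduction(+:s2)
      for (int j = 0; j < m; j++)
        s2 += in[i * m + j];

      out[i] = s2;
    }
}
```

With in holding 1 through 6 as a 2 x 3 matrix, out becomes 6 and 15. When compiled without OpenACC the pragmas are ignored and the loops run on the host.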

Re: [patch,nvptx] Basic -misa support for nvptx

2018-09-05 Thread Cesar Philippidis
On 09/05/2018 07:30 AM, Tom de Vries wrote: > On 09/05/2018 12:19 AM, Cesar Philippidis wrote: >> On 09/02/2018 07:57 AM, Cesar Philippidis wrote: >>> On 09/01/2018 12:04 PM, Tom de Vries wrote: >>>> On 08/31/2018 04:14 PM, Cesar Philippidis wrote: >

Re: [PATCH, OpenACC 2.5, libgomp] Add *_async versions of runtime library API functions

2018-09-10 Thread Cesar Philippidis
} } */ > + > +#include > +#include > +#include > + > +int > +main (int argc, char **argv) > +{ > + const int N = 256; > + int i; > + int async = 8; > + unsigned char *h; > + > + h = (unsigned char *) malloc (N); > + > + for (i = 0; i < N; i++) > +{ > + h[i] = i; > +} > + > + acc_copyin_async (h, N, async); > + > + memset (h, 0, N); > + > + acc_wait (async); > + > + acc_copyout_async (h, N, async + 1); > + > + acc_wait (async + 1); > + > + for (i = 0; i < N; i++) > +{ > + if (h[i] != i) > + abort (); > +} > + > + free (h); > + > + return 0; > +} > Index: libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c > === > --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c (nonexistent) > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c (working copy) > @@ -0,0 +1,45 @@ > +/* { dg-do run } */ > +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ > + > +#include > +#include > +#include > + > +int > +main (int argc, char **argv) > +{ > + const int N = 256; > + int i, q = 5; > + unsigned char *h, *g; > + void *d; > + > + h = (unsigned char *) malloc (N); > + g = (unsigned char *) malloc (N); > + for (i = 0; i < N; i++) > +{ > + g[i] = i; > +} > + > + acc_create_async (h, N, q); > + > + acc_memcpy_to_device_async (acc_deviceptr (h), g, N, q); > + memset (&h[0], 0, N); > + > + acc_wait (q); > + > + acc_update_self_async (h, N, q + 1); > + acc_delete_async (h, N, q + 1); > + > + acc_wait (q + 1); > + > + for (i = 0; i < N; i++) > +{ > + if (h[i] != i) > + abort (); > +} > + > + free (h); > + free (g); > + > + return 0; > +} > Index: libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 > === > --- libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 (nonexistent) > +++ libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 (working copy) > @@ -0,0 +1,57 @@ > +! { dg-do run } > +! 
{ dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } > + > +program main > + use openacc > + implicit none > + > + integer, parameter :: N = 256 > + integer, allocatable :: h(:) > + integer :: i > + integer :: async = 5 > + > + allocate (h(N)) > + > + do i = 1, N > +h(i) = i > + end do > + > + call acc_copyin (h) > + > + do i = 1, N > +h(i) = i + i > + end do > + > + call acc_update_device_async (h, sizeof (h), async) > + > + if (acc_is_present (h) .neqv. .TRUE.) call abort > + > + h(:) = 0 > + > + call acc_copyout_async (h, sizeof (h), async) > + > + call acc_wait (async) > + > + do i = 1, N > +if (h(i) /= i + i) call abort > + end do > + > + call acc_copyin (h, sizeof (h)) > + > + h(:) = 0 > + > + call acc_update_self_async (h, sizeof (h), async) > + > + if (acc_is_present (h) .neqv. .TRUE.) call abort > + > + do i = 1, N > +if (h(i) /= i + i) call abort > + end do > + > + call acc_delete_async (h, async) > + > + call acc_wait (async) > + > + if (acc_is_present (h) .neqv. .FALSE.) call abort > + > +end program > While I can't approve this patch, it seems reasonable to me. I like how you cleaned up things from OG8 (e.g., replacing return (n ? 1 : 0) with return n != NULL). Are there any other OG8 async patches in your queue? Thanks, Cesar

Re: [PATCH, OpenACC] C++ reference mapping (PR middle-end/86336)

2018-09-10 Thread Cesar Philippidis
On 09/10/2018 10:37 AM, Jason Merrill wrote: > On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown wrote: >> This patch (by Cesar) changes the way C++ references are mapped in >> OpenACC regions, fixing an ICE in the non-scalar-data.C testcase. >> >> Post-patch, references

[patch,nvptx] Add atomic_fetch* support for SImode arguments.

2018-09-17 Thread Cesar Philippidis
named atomic-fetch-2.c incorrectly; there should be an underscore between atomic and fetch. This patch also fixes that. I tested this patch using both a standalone nvptx compiler and x86_64 Linux with nvptx offloading. Cesar [nvptx] Add atomic_fetch* support for SImode arguments. 2018-09-17

Re: [PATCH,nvptx] Remove use of CUDA unified memory in libgomp

2018-09-18 Thread Cesar Philippidis
On 08/01/2018 04:12 AM, Tom de Vries wrote: > On 07/31/2018 05:27 PM, Cesar Philippidis wrote: >>/* Copy the (device) pointers to arguments to the device (dp and hp might >> in >> fact have the same value on a unified-memory system). */ > > This comment

[nvptx] vector length patch series

2018-09-18 Thread Cesar Philippidis
x target with nvptx offloading. Thanks, Cesar

Re: [openacc] Teach gfortran to lower OpenACC routine dims

2018-09-20 Thread Cesar Philippidis
On 09/19/2018 03:27 PM, Bernhard Reutner-Fischer wrote: > On Wed, 5 Sep 2018 12:52:03 -0700 > Cesar Philippidis wrote: > >> At present, gfortran does not encode the gang, worker or vector >> parallelism clauses when it creates acc routines dim attribute for >> subro

[patch,openacc] Better distinguish OpenACC and OpenMP sections in libgomp.texi

2018-09-20 Thread Cesar Philippidis
unk? I verified that libgomp.pdf looks ok. Thanks, Cesar [OpenACC] Update _OPENACC value and documentation for OpenACC 2.5 2018-XX-YY Thomas Schwinge Cesar Philippidis gcc/c-family/ * c-cppbuiltin.c (c_cpp_builtins): Update "_OPENACC" to "201510". gcc/fortran/ * cpp.c

[patch,opencc] Don't mark OpenACC auto loops as independent inside acc parallel regions

2018-09-20 Thread Cesar Philippidis
directives https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01552.html Is this OK for trunk? I bootstrapped and regtested on x86_64 Linux with nvptx offloading. Thanks, Cesar [OpenACC] Don't mark OpenACC auto loops as independent inside acc parallel regions 2018-XX-YY Cesar Philippidis gc

[patch,openacc] Fix acc_shutdown issue

2018-09-20 Thread Cesar Philippidis
strapped and regtested it for x86_64 Linux with nvptx offloading and I didn't encounter any regressions. Thanks, Cesar [OpenACC] Fix acc_shutdown issue 2018-XX-YY James Norris Cesar Philippidis libgomp/ * oacc-init.c (acc_shutdown_1): Replace use of gomp_free_memmap with go

[patch,openacc] Fix infinite recursion in OMP clause pretty-printing, default label

2018-09-20 Thread Cesar Philippidis
this patch OK for trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. Thanks, Cesar Fix infinite recursion in OMP clause pretty-printing, default label Apparently, Tom ran into an ICE when we were adding support for new clauses back in the gomp-4_0-branch days. This

[patch,openacc] Generate sequential loop for OpenACC loop directive inside kernels

2018-09-20 Thread Cesar Philippidis
ntions that this allows the kernels parallelization to work when '#pragma acc loop' makes the front-ends create OMP_FOR, which the loop analysis phases don't understand. I bootstrapped and regtested it on x86_64 Linux with nvptx offloading. Is this patch OK for trunk? T
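A minimal sketch of the source-level construct under discussion, assuming a plain '#pragma acc loop' directive placed inside an OpenACC kernels region. The function name scale and its parameters are illustrative assumptions, not taken from the patch. How the compiler lowers the loop (sequentially or parallelized by the kernels pass) is not visible at this level.

```c
#include <assert.h>

/* Hypothetical example: DST[i] = K * SRC[i] for 0 <= i < n, written as a
   kernels region that contains an explicit loop directive.  */
void
scale (float *dst, const float *src, int n, float k)
{
  assert (n > 0);

#pragma acc kernels copyin(src[0:n]) copyout(dst[0:n])
  {
#pragma acc loop
    for (int i = 0; i < n; i++)
      dst[i] = k * src[i];
  }
}
```

Scaling the values 1, 2 and 4 by 0.5 yields 0.5, 1 and 2. The result is the same whether or not OpenACC is enabled, since the loop iterations are independent.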

[patch,openacc] handle missing OMP_LIST_ clauses in fortran's parse tree debugger

2018-09-20 Thread Cesar Philippidis
ted it for x86_64 Linux with nvptx offloading. Thanks, Cesar [OpenACC] handle missing OMP_LIST_ clauses in fortran's parse tree debugger 2018-XX-YY Cesar Philippidis gcc/fortran/ * dump-parse-tree.c (show_omp_clauses): Add missing omp list_types and reorder the switch cases to ma

[patch,openacc] Fix hang when running oacc exec with CUDA 9.0 nvprof

2018-09-20 Thread Cesar Philippidis
this for x86_64 Linux with nvptx offloading. Thanks, Cesar [OpenACC] Fix hang when running oacc exec with CUDA 9.0 nvprof 2018-XX-YY Tom de Vries Cesar Philippidis libgomp/ * oacc-init.c (acc_init_state_lock, acc_init_state, acc_init_thread): New variable. (acc_init_1): Set ac

[patch,openacc] Fix PR71959: lto dump of callee counts

2018-09-20 Thread Cesar Philippidis
'll add some support for member data OpenACC 2.6, but some of the OpenACC C++ semantics are still unclear. Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. Thanks, Cesar [PR71959] lto dump of callee counts 2018-XX-YY Nathan Sidwell Cesar P

[patch,openacc] Propagate independent clause for OpenACC kernels pass

2018-09-20 Thread Cesar Philippidis
t introduce any regressions. We do have a couple of other standalone kernels patches in og8, but those depend on other patches. Thanks, Cesar [OpenACC] Propagate independent clause for OpenACC kernels pass 2018-XX-YY Chung-Lin Tang Cesar Philippidis gcc/ * cfgloop.h (struct loop): Add 'boo

[patch,openacc] Set safelen to INT_MAX for oacc independent pragma

2018-09-20 Thread Cesar Philippidis
. The original discussion for this patch can be found here <https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01872.html>. Is this patch OK for trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. Thanks, Cesar [OpenACC] Set safelen to INT_MAX for oacc independent

[patch,openacc] Update _OPENACC value and documentation for OpenACC 2.5

2018-09-20 Thread Cesar Philippidis
nvptx offloading. Thanks, Cesar [OpenACC] Update _OPENACC value and documentation for OpenACC 2.5 2018-XX-YY Thomas Schwinge Cesar Philippidis gcc/c-family/ * c-cppbuiltin.c (c_cpp_builtins): Update "_OPENACC" to "201510". gcc/fortran/ * cpp.c (cpp_define_builtins): Upd
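A minimal sketch of how user code might test the macro value this ChangeLog updates, assuming only that _OPENACC is defined by the compiler when OpenACC is enabled and that 201510 corresponds to OpenACC 2.5. The helper names openacc_version and supports_openacc_2_5 are illustrative and not part of GCC or libgomp.

```c
/* Returns the value of the _OPENACC macro, or 0 when this translation
   unit was compiled without OpenACC support.  */
long
openacc_version (void)
{
#ifdef _OPENACC
  return _OPENACC;
#else
  return 0;
#endif
}

/* Nonzero if the compiler advertises OpenACC 2.5 (201510) or later.  */
int
supports_openacc_2_5 (void)
{
  return openacc_version () >= 201510L;
}
```

Code that relies on 2.5-only behavior can check supports_openacc_2_5 () at run time, or use the equivalent '#if _OPENACC >= 201510' guard at compile time.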

Re: [patch,openacc] handle missing OMP_LIST_ clauses in fortran's parse tree debugger

2018-09-20 Thread Cesar Philippidis
On 09/20/2018 11:22 AM, Paul Richard Thomas wrote: > Hi Cesar, > > It looks OK to me. > > Thanks for the patch. > > Paul Thanks! Committed in r264446. Cesar > On 20 September 2018 at 18:21, Cesar Philippidis > wrote: >> This patch updates Fortran's pa

Re: [patch,openacc] Generate sequential loop for OpenACC loop directive inside kernels

2018-09-20 Thread Cesar Philippidis
On 09/20/2018 10:14 AM, Cesar Philippidis wrote: > As Chung-Lin noted here > <https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01079.html>: > > This patch adjusts omp-low.c:expand_omp_for_generic() to expand to a > "sequential" loop form (without the OM

Re: [openacc] Teach gfortran to lower OpenACC routine dims

2018-09-24 Thread Cesar Philippidis
On 09/20/2018 09:10 AM, Bernhard Reutner-Fischer wrote: > On Thu, 20 Sep 2018 07:41:08 -0700 > Cesar Philippidis wrote: > >> On 09/19/2018 03:27 PM, Bernhard Reutner-Fischer wrote: >>> On Wed, 5 Sep 2018 12:52:03 -0700 >>> Cesar Philippidis wrote: > >>

[patch,openacc] update fortran nested parallelism error messages

2018-09-24 Thread Cesar Philippidis
nvptx offloading. Cesar [OpenACC] update fortran nested parallelism error messages 2018-09-24 Bernhard Reutner-Fischer Cesar Philippidis gcc/fortran/ * openmp.c (resolve_oacc_loop_blocks): gcc/testsuite/ * gfortran.dg/goacc/nested-parallelism.f90: New test. --- gcc/fortran/ope

Re: [PATCH][OpenACC] Update deviceptr handling during gimplification

2018-09-26 Thread Cesar Philippidis
On 09/25/2018 05:55 PM, Julian Brown wrote: > On Tue, 7 Aug 2018 15:09:38 -0700 > Cesar Philippidis wrote: > >> I had previously posted this patch as part of a monster deviceptr >> patch here >> <https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01911.html>. Thi

[patch,openacc] C, C++ OpenACC wait diagnostic change

2018-09-26 Thread Cesar Philippidis
x offloading. Thanks, Cesar [OpenACC] C, C++ OpenACC wait diagnostic change 2018-XX-YY James Norris Cesar Philippidis gcc/c/ * c-parser.c (c_parser_oacc_wait_list): Change error message. gcc/cp/ * parser.c (cp_parser_oacc_wait_list): Change error message. gcc/testsuite/

[patch,openacc] use existing local variable in cp_parser_oacc_enter_exit_data

2018-09-26 Thread Cesar Philippidis
This is an old gomp4 patch that updates the location of the clause for acc enter/exit data. Apparently, it didn't impact any test cases. Is this OK for trunk or should we drop it from OG8? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. Thanks, Cesar [OpenACC

[patch,openacc] Don't gimplify in ssa mode if seen_error in oacc_xform_loop

2018-09-26 Thread Cesar Philippidis
hat do you want to do with this patch Thomas? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. Cesar [OpenACC] Don't gimplify in ssa mode if seen_error in oacc_xform_loop 2018-XX-YY Tom de Vries Cesar Philippidis gcc/ PR tree-optimization/68977 * omp-offloa

[patch] nvptx libgcc atomic routines

2018-09-26 Thread Cesar Philippidis
tx BE. Therefore, I'm not sure if the nvptx port still needs support for atomic fetch_and_*. Tom and Thomas, do either of you have any thoughts on this? Should I commit it to trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. Thanks, Cesar nvptx libgcc atomic routines

[patch,openacc] Use correct location information for OpenACC shape and simple clauses in C/C++

2018-09-26 Thread Cesar Philippidis
it for x86_64 Linux with nvptx offloading. Cesar [OpenACC] Use correct location information for OpenACC shape and simple clauses in C/C++ 2018-XX-YY Thomas Schwinge Cesar Philippidis gcc/c/ * c-parser.c (c_parser_oacc_shape_clause) (c_parser_oacc_simple_clause): Add loc formal param

[patch,wip] warn on noncontiguous pointers

2018-09-26 Thread Cesar Philippidis
call random_number(fptr1) !Test pointer reshape II fptr3(1:2,1:2,1:2) => fptr1(4:) end program Note how fptr1 doesn't have a contiguous attribute. Does anyone have thoughts on this? Maybe the ScaTeLib code needs to be updated. Thanks, Cesar Disable "Assignment to contiguou

Re: [patch,wip] warn on noncontiguous pointers

2018-09-26 Thread Cesar Philippidis
On 09/26/2018 01:49 PM, Thomas Koenig wrote: > Hi Cesar, > >> As of GCC 8, gfortran now errors when a pointer with a contiguous >> attribute is set to point to a target without a contiguous attribute. I >> think this is overly strict, and should probably be demoted to

Re: [patch,openacc] C, C++ OpenACC wait diagnostic change

2018-09-26 Thread Cesar Philippidis
On 09/26/2018 12:50 PM, Joseph Myers wrote: > On Wed, 26 Sep 2018, Cesar Philippidis wrote: > >> Attached is an old patch which updated the C and C++ FEs to use %<)%> >> for the right ')' symbol. It's mostly a cosmetic change. All of the >> change

[patch,openacc] Use oacc_verify_routine_clauses for C/C++

2018-10-02 Thread Cesar Philippidis
OK for trunk? I bootstrapped and regression tested it for x86_64 Linux with nvptx offloading. This only touches the OpenACC code path. Cesar [OpenACC] Use oacc_verify_routine_clauses for C/C++ 2018-XX-YY Thomas Schwinge Cesar Philippidis gcc/ * omp-general.c (oacc_build_routine_dims): M

[patch,openacc] Add support for OpenACC routine nohost clause

2018-10-02 Thread Cesar Philippidis
r trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. Thanks Cesar [OpenACC] Add support for OpenACC routine nohost clause (was OpenACC bind, nohost changes) 2018-XX-YY Thomas Schwinge Cesar Philippidis gcc/ * tree-core.h (omp_clause_code): Add OMP_CLA

[patch,openacc] Repeated use of the OpenACC routine directive

2018-10-02 Thread Cesar Philippidis
ch too large. Is this patch OK for trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. This patch is also self-contained to the OpenACC code path. Thanks, Cesar [OpenACC] Repeated use of the OpenACC routine directive 2018-XX-YY Thomas Schwinge Cesar Philippidis

[patch,openacc] Check clauses with intrinsic function specified in !$ACC ROUTINE ( NAME )

2018-10-02 Thread Cesar Philippidis
this. Maybe certain intrinsic functions should default to having an implied acc routine directive. But I suppose that's something for another patch. Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. Thanks, Cesar [PR fortran/72741] Check clauses

[patch,openacc] Check for sufficient parallelism when calling acc routines in Fortran

2018-10-02 Thread Cesar Philippidis
tries to use an acc routine with insufficient parallelism, e.g., calling a gang routine from a vector loop. Is this patch OK for trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. Thanks, Cesar [OpenACC] Check for sufficient parallelism when calling acc routines in fo
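The nesting rule behind this check can be illustrated in C. This is a minimal sketch and not code from the patch: `row_sum` and `total` are hypothetical names, and the example shows only the valid direction, where an outer gang loop calls a routine that needs no more than vector parallelism. The reverse, calling a gang routine from inside a vector loop, is the insufficient-parallelism case the message describes.

```c
/* Hypothetical vector-level routine: it only needs vector parallelism,
   so a gang loop still has enough parallelism left to call it.  */
#pragma acc routine vector
static int row_sum (const int *row, int n)
{
  int sum = 0;
#pragma acc loop vector reduction (+:sum)
  for (int i = 0; i < n; i++)
    sum += row[i];
  return sum;
}

/* Valid nesting: the gang loop calls the vector routine once per row.
   Marking row_sum as a gang routine and calling it from a vector loop
   would be the invalid case.  */
int total (const int *data, int rows, int cols)
{
  int sum = 0;
#pragma acc parallel loop gang reduction (+:sum) copyin (data[0:rows * cols])
  for (int r = 0; r < rows; r++)
    sum += row_sum (data + r * cols, cols);
  return sum;
}
```

Without OpenACC enabled the pragmas are ignored and the code runs sequentially, so the sketch can be compiled and checked on the host alone.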

[patch,openacc] Add warning for unused acc routine parallelism

2018-10-02 Thread Cesar Philippidis
(although certain Fortran routines fall through to this). Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx offloading. This patch only touches the OpenACC code path. Thanks, Cesar [OpenACC] Add warning for unused acc routine parallelism (was [OpenACC] Don't err

[OpenACC] initial manual deep copy in c

2018-10-02 Thread Cesar Philippidis
is all an early work in progress. I'm still experimenting with some other functionality. If you checkout that branch, beware it may be rebased. Cesar [OpenACC] Initial Manual Deep Copy 2018-10-02 Cesar Philippidis gcc/c/ * c-typeck.c (handle_omp_array_sections_1): Enable structs

[PATCH] Update nvptx newlib installation requirements

2018-04-23 Thread Cesar Philippidis
This patch updates the install documentation to point to the upstream newlib sources instead of the Mentor Embedded github mirror. I don't see tarballs for any point releases on newlib's website, so I added a reference to the git revision containing the nvptx port. Is this OK for trunk?

Re: [PATCH] Update nvptx newlib installation requirements

2018-04-24 Thread Cesar Philippidis
sk. But in the meantime, having tarballs for the build dependencies would be nice. > Otherwise OK. > > Btw, can you also update the GCC wiki with regard to this change? Done. I added a new 'Build Dependencies' section to the nvptx wiki: https://gcc.gnu.org/wiki/nvptx Cesar

[PATCH] cleanup libgomp's coalesce chunk data structures

2018-05-02 Thread Cesar Philippidis
er by introducing a new gomp_coalesce_chunk structure with explicit start and end members. Beyond that, there are no functional changes in this patch. Is it OK for trunk? I tested it against x86_64-linux with nvptx acceleration. Thanks, Cesar 2018-05-02 Cesar Philippidis libgomp/ * target
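As a rough illustration of why explicit start and end members read more clearly than an anonymous pair of offsets, here is a minimal sketch in C. It is not the libgomp code: `struct chunk`, `chunk_try_extend` and the `max_gap` parameter are hypothetical names chosen for this example, and the merging policy is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a coalescing chunk: a half-open byte range
   [start, end) of a buffer that will be copied in one transfer.  */
struct chunk
{
  size_t start;
  size_t end;
};

/* Try to grow C so that it also covers [start, start + len).  This
   succeeds when the new region begins inside C or at most MAX_GAP bytes
   past its end.  Returns false, leaving C unchanged, when the caller
   should open a new chunk instead.  */
bool
chunk_try_extend (struct chunk *c, size_t start, size_t len, size_t max_gap)
{
  if (start < c->start || start > c->end + max_gap)
    return false;
  if (start + len > c->end)
    c->end = start + len;
  return true;
}
```

With named members, the bounds checks state their intent directly (`c->end + max_gap`) instead of relying on the reader remembering which array slot holds which offset.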

[og7] Update deviceptr handling in Fortran

2018-05-07 Thread Cesar Philippidis
ls on at least one legacy driver. Cesar 2018-05-07 Cesar Philippidis gcc/fortran/ * trans-openmp.c (gfc_omp_finish_clause): Don't create pointer data mappings for deviceptr clauses. (gfc_trans_omp_clauses_1): Likewise. gcc/ * gimplify.c (enum gimplify_omp_var_data): Add GOVD_DEVICETP

[og7] Backport libgomp gomp_copy_host2dev coalesce optimization from trunk

2018-05-07 Thread Cesar Philippidis
This patch backports Jakub's gomp_copy_host2dev optimization from <https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01800.html>. There were a couple of changes required due to the new async infrastructure in og7. I've applied this patch to og7. Cesar 2018-05-07 Thomas Schw

Re: [og7] Update deviceptr handling in Fortran

2018-05-09 Thread Cesar Philippidis
c-c++-common/goacc/deviceptr-4.c -std=c++98 (test for excess > errors) I forgot to update the expected data mapping in deviceptr-4.c. Now, instead of implicitly adding a 'copy' clause for known deviceptr variables, the gimplifier will assign a force_deviceptr clause. I've ap

Fix PR85782: C++ ICE with continue statements inside acc loops

2018-05-15 Thread Cesar Philippidis
cause cp_genericize_r uses if statements to check for statement types instead of a huge switch statement. Cesar 2018-05-15 Cesar Philippidis PR c++/85782 gcc/cp/ * cp-gimplify.c (cp_genericize_r): Call genericize_omp_for_stmt for OACC_LOOPs. gcc/testsuite/ * c-c++-common/goacc/pr85782.c

Re: Fix PR85782: C++ ICE with continue statements inside acc loops

2018-05-18 Thread Cesar Philippidis
Ping. For reference, I've attached the patch for gcc7. Cesar On 05/15/2018 07:11 AM, Cesar Philippidis wrote: > This patch resolves the issue in PR85782, which involves a C++ ICE > caused by OpenACC loops which contain continue statements. The problem > is that genericize_continu
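For readers unfamiliar with the construct involved, the kind of loop at issue looks roughly like the sketch below. This is not the test case from the PR: `sum_odd` is a hypothetical function written for illustration, showing only the shape of an OpenACC loop whose body contains a continue statement.

```c
/* Sum the odd values in DATA.  The 'continue' skips even elements and
   jumps to the next iteration of the OpenACC loop.  */
int sum_odd (const int *data, int n)
{
  int sum = 0;
#pragma acc parallel loop reduction (+:sum) copyin (data[0:n])
  for (int i = 0; i < n; i++)
    {
      if (data[i] % 2 == 0)
        continue;
      sum += data[i];
    }
  return sum;
}
```

The sketch is valid as both C and C++; when OpenACC is not enabled the pragma is ignored and the loop runs sequentially with the same result.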

[patch,gomp4] make fortran loop variables implicitly private in openacc

2014-08-11 Thread Cesar Philippidis
also appear inside a reduction clause. I've also included a fix for this in this patch. Is this OK for gomp-4_0-branch? Thanks, Cesar 2014-08-11 Cesar Philippidis gcc/fortran/ * openmp.c (oacc_compatible_clauses): New function. (resolve_omp_clauses): Use it. (oacc_current_ctx): Move it
