On 06/20/2018 09:45 AM, Jakub Jelinek wrote:
> On Tue, Jun 19, 2018 at 10:01:20AM -0700, Cesar Philippidis wrote:
>> From 53ee03231c5e6e4747b4ef01335079a2d4a98480 Mon Sep 17 00:00:00 2001
>> From: Cesar Philippidis
>> Date: Tue, 19 Jun 2018 09:33:04 -0700
>> Subjec
On 06/20/2018 10:03 AM, Jakub Jelinek wrote:
> On Wed, Jun 20, 2018 at 09:59:29AM -0700, Cesar Philippidis wrote:
>> If it means anything, we have a significant async change that removes
>> the async_refcount field in that struct.
>
> Wasn't async_refcount removed 2 y
is patch OK for trunk?
Thanks,
Cesar
2018-06-20 Cesar Philippidis
gcc/
* config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Delete define.
(PTX_DEFAULT_RUNTIME_DIM): New define.
(nvptx_goacc_validate_dims): Use it to allow the runtime to
dynamically allocate
On 06/20/2018 03:15 PM, Tom de Vries wrote:
> On 06/20/2018 11:59 PM, Cesar Philippidis wrote:
>> Now it follows the formula contained in
>> the "CUDA Occupancy Calculator" spreadsheet that's distributed with CUDA.
>
> Any reason we're not using the cuda
Ping.
Cesar
On 06/20/2018 02:59 PM, Cesar Philippidis wrote:
> At present, the nvptx libgomp plugin does not take into account the
> amount of shared resources on GPUs (mostly shared-memory and register
> usage) when selecting the default num_gangs and num_workers. In certain
> si
e reported line number in fortran combined OpenACC
directives
Is this patch OK for trunk? It bootstrapped / regression tested cleanly
for x86_64 with nvptx offloading.
Thanks,
Cesar
2018-06-29 Cesar Philippidis
gcc/fortran/
* trans-array.c (gfc_trans_array_bounds): Add an INIT_VLA ar
On 06/29/2018 10:49 AM, Jakub Jelinek wrote:
> On Fri, Jun 29, 2018 at 10:33:56AM -0700, Cesar Philippidis wrote:
>> @@ -1044,21 +1046,6 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
>> return;
>>
>>tree decl = OMP_CLAUSE_DECL (c);
>> -
>
s.
Thanks,
Cesar
The attached patch includes the nvptx and GCC ME reductions enhancements.
Is this patch OK for trunk? It bootstrapped / regression tested cleanly
for x86_64 with nvptx offloading.
Thanks,
Cesar
2018-06-29 Cesar Philippidis
Nathan Sidwell
gcc/
* config/nvptx/nvptx.c
Attached are the FE changes for the OpenACC reduction enhancements. They
depend on the ME patch.
Is this patch OK for trunk? It bootstrapped / regression tested cleanly
for x86_64 with nvptx offloading.
Thanks,
Cesar
2018-06-29 Cesar Philippidis
Nathan Sidwell
gcc/c/
* c-parser.c
Attached are the updated reduction test cases. Again, these have been
bootstrapped and regression tested cleanly for x86_64 with nvptx
offloading. Is it OK for trunk?
Thanks,
Cesar
2018-06-29 Cesar Philippidis
Nathan Sidwell
gcc/testsuite/
* c-c++-common/goacc/orphan-reductions-1.c
is patch OK for trunk? It bootstrapped / regression tested cleanly
for x86_64 with nvptx offloading.
Thanks,
Cesar
2018-06-29 Cesar Philippidis
James Norris
gcc/fortran/
* openmp.c (gfc_match_omp_map_clause): Re-write handling of the
deviceptr clause. Add new common_blocks a
On 06/29/2018 10:12 AM, Cesar Philippidis wrote:
> Ping.
While porting the vector length patches to trunk, I realized that I
mistakenly removed support for the environment variable GOMP_OPENACC_DIM
in this patch (thanks for adding those test cases, Tom!). I'll post an
updated version of th
On 12/27/2017 01:16 AM, Tom de Vries wrote:
> On 12/21/2017 06:19 PM, Cesar Philippidis wrote:
>> My test results are somewhat inconsistent. On MG's build servers, there
>> are no regressions in CUDA 8.
>
> Ack.
>
>> On my laptop, there are fewer regressions
me failure with gemm example
in the PR, so I didn't include it in the patch. However, this patch does
fix the failure with da-1.c in og7. This patch does not cause any
regressions.
Is it OK for trunk?
Thanks,
Cesar
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 5
On 12/19/2017 04:39 PM, Tom de Vries wrote:
> On 12/20/2017 12:25 AM, Cesar Philippidis wrote:
>> og7-ptx-cuda9.diff
>>
>>
>> 2017-12-19 Cesar Philippidis
>>
>> gcc/
>> * config/nvptx/nvptx.c (output_init_frag): Don't use generic addres
I've backported the patch Tom committed to trunk to fix PR83920 to
openacc-gcc-7-branch in revision
d0a1e0fa43ca4004fde33707cb6a93c01cb11507. No changes were required for
og7. The original email can be found here
<https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01729.html>.
Cesar
into trunk. This patch keeps both trunk and
og7 consistent.
Cesar
[nvptx] Backport CUDA 9 support from trunk.
2018-01-19 Cesar Philippidis
Backport from mainline:
2018-01-19 Cesar Philippidis
PR target/83790
gcc/
* config/nvptx/nvptx.c (output_init_frag):
diff --git a/gcc/config/nvpt
al
variables at runtime.
Cesar
Build libffi during bootstrap.
2018-01-25 Cesar Philippidis
* Makefile.def: Bootstrap module libffi. Add libffi dependency
to all-target-libgomp.
* Makefile.in: Regenerate.
* configure.ac: Add libffi to bootstrap_target_libs when libgomp
is bootstrapped.
* config
committee argue that
the reduction variable in inner-reduction.c should be firstprivate, not
copy.
Cesar
Privatize independent OpenACC reductions.
2018-01-26 Cesar Philippidis
gcc/
* gimplify.c (oacc_privatize_reduction): New function.
(omp_add_variable): Use it to determine if a reduction va
nside gimplify.c:omp_add_variable. I know that it's been
a while since you last worked on this. Let me know if you have any state
on that code, otherwise I'll handle the cleanup.
Cesar
Enable firstprivate OpenACC reductions
2018-01-31 Cesar Philippidis
gcc/
* gimplify.c (omp_add_variable): Allow certain Op
this problem would have been detected sooner. I'm
considering moving the PTX .param pass later, possibly during
oaccdevlow. But that will have to wait for some other time.
I've applied this patch to openacc-gcc-7-branch.
Cesar
Properly handle alloca'd OpenACC data mappings
2018-01-3
it will be used in other places,
including nvptx_validate_dims and the nvptx reduction handling code.
This patch has been committed to openacc-gcc-7-branch.
Cesar
2018-03-01 Cesar Philippidis
gcc/
* config/nvptx/nvptx.c (PTX_VECTOR_LENGTH, PTX_WORKER_LENGTH,
PTX_DEFAULT_RUNTIME_DIM): Move
oversial. I'll
commit this patch to openacc-gcc-7-branch once the other patches are
ready. There will be three more patches in this series.
Cesar
2018-03-02 Cesar Philippidis
gcc/
* config/nvptx/nvptx.c (oacc_bcast_partition): Declare.
(nvptx_init_axis_predicate): Initi
finalizer will be slow. However,
that's a project for another day.
I'll commit this patch to openacc-gcc-7-branch after Tom reviews the new
nvptx_red_partition insn.
Cesar
2018-03-02 Cesar Philippidis
gcc/
* config/nvptx/nvptx-protos.h (nvptx_output_red_partition): Decl
code.
Overall, the changes in this patch are mild. I'll apply it to
openacc-gcc-7-branch after Tom approves the reduction patch.
Cesar
2018-03-02 Cesar Philippidis
gcc/
* config/nvptx/nvptx.c (NVPTX_GOACC_VL_WARP): Define.
(nvptx_goacc_needs_vl_warp): New function.
(nvptx_goac
c-7-branch once the reduction changes
have been approved.
Cesar
2018-03-02 Cesar Philippidis
libgomp/
* plugin/plugin-nvptx.c (nvptx_exec): Adjust calculations of
workers and vectors.
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: New test.
* testsuite/libgomp.oacc-fortran/gemm.f90: Ne
et that I posted last week. However, that
patch set didn't consider the placement of the joining barrier.
I've applied this patch to openacc-gcc-7-branch.
Tom, is a similar patch OK for trunk? The major difference between trunk
and og7 is that og7 changed the name of nvptx_warp_sync to n
On 03/09/2018 07:29 AM, Thomas Schwinge wrote:
> On Thu, 1 Mar 2018 13:17:01 -0800, Cesar Philippidis
> wrote:
>> To reduce the size of the final patch,
>> I've separated all of the misc. function and variable renaming into this
>> patch.
>
> Yes, please
On 03/09/2018 08:21 AM, Tom de Vries wrote:
> On 03/09/2018 12:31 AM, Cesar Philippidis wrote:
>> Nvidia Volta GPUs now support warp-level synchronization.
>
> Well, let's try to make that statement a bit more precise.
>
> All Nvidia architectures have supported synch
cts
unused parallelism (in this case, num_workers was being set but there
was no worker partitioned loop). That problem went away with an extra
dg-warning line.
Cesar
2018-03-12 Cesar Philippidis
Backport from trunk:
2018-01-25 Tom de Vries
PR target/84028
gcc/
* config/nvptx/nvptx.c (nv
On 03/19/2018 07:04 AM, Tom de Vries wrote:
> On 03/09/2018 05:55 PM, Cesar Philippidis wrote:
>> On 03/09/2018 08:21 AM, Tom de Vries wrote:
>>> On 03/09/2018 12:31 AM, Cesar Philippidis wrote:
>>>> Nvidia Volta GPUs now support warp-level synchronization.
>>
On 03/19/2018 10:02 AM, Tom de Vries wrote:
> On 03/19/2018 03:55 PM, Cesar Philippidis wrote:
>>> Note that this changes ordering of the vector-neutering jump and
>>> worker-neutering jump at the end. In principle, this should not be
>>> harmful, but it viol
to nvptx_cta_sync so that function can be used for both
large vector_lengths along with workers. Other than that, I didn't have
to make any changes to his patch.
Cesar
2018-03-20 Cesar Philippidis
gcc/
* config/nvptx/nvptx.c (nvptx_single): Revert changes from
7445a4d40.
Backport fro
On 03/21/2018 08:49 AM, Tom de Vries wrote:
> On 03/02/2018 08:18 PM, Cesar Philippidis wrote:
>
>> og7-vl-part4-hooks.diff
>
>> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
>> index 5642941c6a3..507c8671704 100644
>> --- a/gcc/config/nvptx/
On 03/21/2018 10:10 AM, Tom de Vries wrote:
> On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
>> In addition, nvptx_cta_sync and the corresponding nvptx_barsync insn,
>> have been extended to take a barrier ID and a thread count. The idea
>> here is to assign one barrier fo
On 03/22/2018 06:43 AM, Tom de Vries wrote:
> On 03/22/2018 04:59 AM, Cesar Philippidis wrote:
>> On 03/21/2018 10:10 AM, Tom de Vries wrote:
>>> Changing the code generation scheme for workers is fine, but obviously
>>> that should be a minimal, separate patch
On 03/22/2018 07:23 AM, Tom de Vries wrote:
> On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
>
>> (nvptx_declare_function_name): Emit a .maxntid directive hint and
>> call nvptx_init_oacc_workers.
>
>> +
>> + /* Emit a .maxntid hint to help the PTX JIT
On 03/22/2018 07:44 AM, Tom de Vries wrote:
> On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
>> The attached patch generalizes the worker state propagation and
>> synchronization code to handle large vectors. When the vector_length is
>> larger than a CUDA warp, the nvptx B
10; i++)
> a[i] = i;
>
> return 0;
> }
> ...
>
> I get:
> ...
> .maxntid 32, 16, 1
> ...
>
> That's the change you need to isolate.
I attached an updated patch which incorporates the
cfun->machine->axis_dim changes. It now generates more precise
On 03/22/2018 10:39 AM, Tom de Vries wrote:
> On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
>> + rtx red_partition; /* Similar to bcast_partition, except for vector
>> + reductions. */
>
> Shouldn't this be in "[og7] vector_length extension part 3: r
On 03/22/2018 10:51 AM, Tom de Vries wrote:
> On 03/22/2018 06:24 PM, Cesar Philippidis wrote:
>> On 03/22/2018 09:18 AM, Tom de Vries wrote:
>>
>>> That's obviously not good enough.
>>>
>>> When I compile this test-case:
>>> ...
>>
On 03/26/2018 07:14 AM, Tom de Vries wrote:
> On 03/02/2018 08:18 PM, Cesar Philippidis wrote:
>> diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
>> index ba3f4317f4e..f15ce6b8f8d 100644
>> --- a/gcc/omp-offload.c
>> +++ b/gcc/omp-offload.c
>> @@ -626,7 +626,8
this patch OK for
trunk if the results come back clean?
Thanks,
Cesar
2018-03-26 Cesar Philippidis
gcc/
PR target/85056
* config/nvptx/nvptx.c (nvptx_assemble_decl_begin): Add '[]' to
extern array declarations.
gcc/testsuite/
* testsuite/gcc.target/nvptx/pr85056.c: New test.
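For context, a minimal sketch of the C-side construct this ChangeLog entry concerns: an array declared extern with no bound. This is not the actual pr85056.c test; the names table and sum_table are hypothetical, and the definition is placed in the same file only so the sketch is self-contained. The PTX that the nvptx BE emits for the declaration is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Declaration with an unknown bound, as seen by a translation unit
   that only references the array (hypothetical name).  */
extern int table[];

/* Sum the first N elements of the externally declared array.  */
int
sum_table (size_t n)
{
  assert (n <= 4);  /* The definition below has four elements.  */
  int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += table[i];
  return sum;
}

/* Definition, normally in another translation unit.  */
int table[4] = { 1, 2, 3, 4 };
```

Calling sum_table (3) on the host adds the first three elements of table.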
On 03/27/2018 01:17 AM, Tom de Vries wrote:
> On 03/26/2018 11:57 PM, Cesar Philippidis wrote:
>> As noted in PR85056, the nvptx BE isn't declaring external arrays using
>> PTX array notation. Specifically, it's emitting code that's missing the
>> empty angle b
On 03/30/2018 07:45 AM, Tom de Vries wrote:
> On 03/30/2018 03:07 AM, Tom de Vries wrote:
>> On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
>>> As a follow up patch will show, the nvptx BE falls back to using
>>> vector_length = 32 when a vector loop is nested in
ds to be fixed up and call cleanup_cfg as necessary. But I wanted to
keep the OMP and OACC code paths similar, so I took the former approach.
I regression tested this patch on x86_64-linux using nvptx offloading.
Is this patch OK for trunk and GCC 7 (and probably GCC 6)?
Thanks,
Cesar
Fix PR84955
20
. Consequently, not all of the CUDA threads were
being utilized when vector_length = 32 (which is the default case).
I've committed this patch to openacc-gcc-7-branch which allows
warp-sized vectors to nest inside worker-partitioned loops.
Cesar
2018-04-10 Cesar Philippidis
gcc/
* config/
On 04/09/2018 04:31 AM, Richard Biener wrote:
> On Fri, 6 Apr 2018, Jakub Jelinek wrote:
>
>> On Fri, Apr 06, 2018 at 06:48:52AM -0700, Cesar Philippidis wrote:
>>> 2018-04-06 Cesar Philippidis
>>>
>>> PR middle-end/84955
>>>
>>>
On 04/12/2018 11:27 AM, H.J. Lu wrote:
> On Wed, Apr 11, 2018 at 12:30 PM, Cesar Philippidis
> wrote:
>> On 04/09/2018 04:31 AM, Richard Biener wrote:
>>> On Fri, 6 Apr 2018, Jakub Jelinek wrote:
>>>
>>>> On Fri, Apr 06, 2018 at 06:48:52AM -0700, Cesar
,
Cesar
[openacc] Teach gfortran to lower OpenACC routine dims
gcc/fortran/
* gfortran.h (oacc_function): New enum.
(gfc_oacc_routine_name): Add locus loc field.
* openmp.c (gfc_oacc_routine_dims): Return oacc_function.
(gfc_match_oacc_routine): Update routine clause syntax checking.
Populate
vectors fit inside workers. The target hook
itself doesn't do anything for the host, but the nvptx BE will make use
of it.
Is this patch OK for trunk? I regtested and bootstrapped for x86_64 with
nvptx offloading.
Thanks,
Cesar
[openacc] Add target hook TARGET_GOACC_ADJUST_PARALLELISM
gcc/
for (...)
{
#pragma acc loop reduction(+:s2)
Here s2 will be transferred into the accelerator as firstprivate instead
of copy.
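A minimal sketch of the loop shape being described, assuming a reduction variable s2 that is declared outside the parallel region and used in an inner loop reduction. The function name row_sums, its parameters, and the data clauses are hypothetical and are not taken from the patch or its tests. Without -fopenacc the pragmas are ignored and the code runs sequentially on the host; the sketch makes no claim about which data mapping a given compiler picks for s2.

```c
#include <assert.h>

/* Write the sum of each row of the N x M matrix A into OUT.  */
void
row_sums (int n, int m, const int *a, int *out)
{
  assert (n >= 0 && m >= 0);
  int s2 = 0;  /* Declared outside the parallel region.  */

#pragma acc parallel copyin(a[0:n*m]) copyout(out[0:n])
  for (int i = 0; i < n; i++)
    {
      s2 = 0;
#pragma acc loop reduction(+:s2)
      for (int j = 0; j < m; j++)
        s2 += a[i * m + j];
      out[i] = s2;  /* Only OUT carries results back to the host.  */
    }
}
```

Because the results leave the region through out rather than through s2, the sketch does not depend on the final host value of s2.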
Is this OK for trunk? I regtested and bootstrapped for x86_64 with nvptx
offloading.
Cesar
[OpenACC] Enable firstprivate OpenACC reductions
2018-XX-YY Cesar
On 09/05/2018 07:30 AM, Tom de Vries wrote:
> On 09/05/2018 12:19 AM, Cesar Philippidis wrote:
>> On 09/02/2018 07:57 AM, Cesar Philippidis wrote:
>>> On 09/01/2018 12:04 PM, Tom de Vries wrote:
>>>> On 08/31/2018 04:14 PM, Cesar Philippidis wrote:
>>
} } */
> +
> +#include <string.h>
> +#include <stdlib.h>
> +#include <openacc.h>
> +
> +int
> +main (int argc, char **argv)
> +{
> + const int N = 256;
> + int i;
> + int async = 8;
> + unsigned char *h;
> +
> + h = (unsigned char *) malloc (N);
> +
> + for (i = 0; i < N; i++)
> +{
> + h[i] = i;
> +}
> +
> + acc_copyin_async (h, N, async);
> +
> + memset (h, 0, N);
> +
> + acc_wait (async);
> +
> + acc_copyout_async (h, N, async + 1);
> +
> + acc_wait (async + 1);
> +
> + for (i = 0; i < N; i++)
> +{
> + if (h[i] != i)
> + abort ();
> +}
> +
> + free (h);
> +
> + return 0;
> +}
> Index: libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c
> ===
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c (nonexistent)
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c (working copy)
> @@ -0,0 +1,45 @@
> +/* { dg-do run } */
> +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
> +
> +#include <string.h>
> +#include <stdlib.h>
> +#include <openacc.h>
> +
> +int
> +main (int argc, char **argv)
> +{
> + const int N = 256;
> + int i, q = 5;
> + unsigned char *h, *g;
> + void *d;
> +
> + h = (unsigned char *) malloc (N);
> + g = (unsigned char *) malloc (N);
> + for (i = 0; i < N; i++)
> +{
> + g[i] = i;
> +}
> +
> + acc_create_async (h, N, q);
> +
> + acc_memcpy_to_device_async (acc_deviceptr (h), g, N, q);
> + memset (&h[0], 0, N);
> +
> + acc_wait (q);
> +
> + acc_update_self_async (h, N, q + 1);
> + acc_delete_async (h, N, q + 1);
> +
> + acc_wait (q + 1);
> +
> + for (i = 0; i < N; i++)
> +{
> + if (h[i] != i)
> + abort ();
> +}
> +
> + free (h);
> + free (g);
> +
> + return 0;
> +}
> Index: libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
> ===
> --- libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 (nonexistent)
> +++ libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 (working copy)
> @@ -0,0 +1,57 @@
> +! { dg-do run }
> +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
> +
> +program main
> + use openacc
> + implicit none
> +
> + integer, parameter :: N = 256
> + integer, allocatable :: h(:)
> + integer :: i
> + integer :: async = 5
> +
> + allocate (h(N))
> +
> + do i = 1, N
> +h(i) = i
> + end do
> +
> + call acc_copyin (h)
> +
> + do i = 1, N
> +h(i) = i + i
> + end do
> +
> + call acc_update_device_async (h, sizeof (h), async)
> +
> + if (acc_is_present (h) .neqv. .TRUE.) call abort
> +
> + h(:) = 0
> +
> + call acc_copyout_async (h, sizeof (h), async)
> +
> + call acc_wait (async)
> +
> + do i = 1, N
> +if (h(i) /= i + i) call abort
> + end do
> +
> + call acc_copyin (h, sizeof (h))
> +
> + h(:) = 0
> +
> + call acc_update_self_async (h, sizeof (h), async)
> +
> + if (acc_is_present (h) .neqv. .TRUE.) call abort
> +
> + do i = 1, N
> +if (h(i) /= i + i) call abort
> + end do
> +
> + call acc_delete_async (h, async)
> +
> + call acc_wait (async)
> +
> + if (acc_is_present (h) .neqv. .FALSE.) call abort
> +
> +end program
>
While I can't approve this patch, it seems reasonable to me. I like how
you cleaned up things from OG8 (e.g., replacing 'return (n ? 1 : 0)' with
'return n != NULL'). Are there any other OG8 async patches in your queue?
Thanks,
Cesar
On 09/10/2018 10:37 AM, Jason Merrill wrote:
> On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown wrote:
>> This patch (by Cesar) changes the way C++ references are mapped in
>> OpenACC regions, fixing an ICE in the non-scalar-data.C testcase.
>>
>> Post-patch, references
named atomic-fetch-2.c
incorrectly; there should be an underscore between atomic and fetch.
This patch also fixes that.
I tested this patch using both a standalone nvptx compiler and x86_64
Linux with nvptx offloading.
Cesar
[nvptx] Add atomic_fetch* support for SImode arguments.
2018-09-17
On 08/01/2018 04:12 AM, Tom de Vries wrote:
> On 07/31/2018 05:27 PM, Cesar Philippidis wrote:
>>/* Copy the (device) pointers to arguments to the device (dp and hp might
>> in
>> fact have the same value on a unified-memory system). */
>
> This comment
x target with nvptx
offloading.
Thanks,
Cesar
On 09/19/2018 03:27 PM, Bernhard Reutner-Fischer wrote:
> On Wed, 5 Sep 2018 12:52:03 -0700
> Cesar Philippidis wrote:
>
>> At present, gfortran does not encode the gang, worker or vector
>> parallelism clauses when it creates acc routines dim attribute for
>> subro
unk? I verified that libgomp.pdf looks ok.
Thanks,
Cesar
[OpenACC] Update _OPENACC value and documentation for OpenACC 2.5
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update "_OPENACC" to "201510".
gcc/fortran/
* cpp.c
directives
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01552.html
Is this OK for trunk? I bootstrapped and regtested on x86_64 Linux with
nvptx offloading.
Thanks,
Cesar
[OpenACC] Don't mark OpenACC auto loops as independent inside acc parallel regions
2018-XX-YY Cesar Philippidis
gc
strapped and regtested it for x86_64 Linux with nvptx
offloading and I didn't encounter any regressions.
Thanks,
Cesar
[OpenACC] Fix acc_shutdown issue
2018-XX-YY James Norris
Cesar Philippidis
libgomp/
* oacc-init.c (acc_shutdown_1): Replace use of gomp_free_memmap with
go
this patch OK for trunk? I bootstrapped and regtested it for x86_64
Linux with nvptx offloading.
Thanks,
Cesar
Fix infinite recursion in OMP clause pretty-printing, default label
Apparently, Tom ran into an ICE when we were adding support for new
clauses back in the gomp-4_0-branch days. This
ntions that this
allows the kernels parallelization to work when '#pragma acc loop'
makes the front-ends create OMP_FOR, which the loop analysis phases
don't understand.
I bootstrapped and regtested it on x86_64 Linux with nvptx offloading.
Is this patch OK for trunk?
T
ted it for x86_64
Linux with nvptx offloading.
Thanks,
Cesar
[OpenACC] handle missing OMP_LIST_ clauses in fortran's parse tree debugger
2018-XX-YY Cesar Philippidis
gcc/fortran/
* dump-parse-tree.c (show_omp_clauses): Add missing omp list_types
and reorder the switch cases to ma
this for x86_64 Linux
with nvptx offloading.
Thanks,
Cesar
[OpenACC] Fix hang when running oacc exec with CUDA 9.0 nvprof
2018-XX-YY Tom de Vries
Cesar Philippidis
libgomp/
* oacc-init.c (acc_init_state_lock, acc_init_state, acc_init_thread):
New variable.
(acc_init_1): Set ac
'll add some support for member data OpenACC 2.6, but some of
the OpenACC C++ semantics are still unclear.
Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading.
Thanks,
Cesar
[PR71959] lto dump of callee counts
2018-XX-YY Nathan Sidwell
Cesar P
t introduce any regressions. We do have a couple
of other standalone kernels patches in og8, but those depend on other
patches.
Thanks,
Cesar
[OpenACC] Propagate independent clause for OpenACC kernels pass
2018-XX-YY Chung-Lin Tang
Cesar Philippidis
gcc/
* cfgloop.h (struct loop): Add 'boo
.
The original discussion for this patch can be found here
<https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01872.html>.
Is this patch OK for trunk? I bootstrapped and regtested it for x86_64
Linux with nvptx offloading.
Thanks,
Cesar
[OpenACC] Set safelen to INT_MAX for oacc independent
nvptx offloading.
Thanks,
Cesar
[OpenACC] Update _OPENACC value and documentation for OpenACC 2.5
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update "_OPENACC" to "201510".
gcc/fortran/
* cpp.c (cpp_define_builtins): Upd
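As an illustration of how user code might consume the updated macro, here is a minimal sketch that tests _OPENACC against the 201510 value named in the ChangeLog. The function names openacc_version and is_openacc_25_or_later are hypothetical; only the macro name and the value 201510 come from the thread.

```c
#include <assert.h>

/* Return the value of _OPENACC, or 0 when the compiler did not
   enable OpenACC (for example, when built without -fopenacc).  */
long
openacc_version (void)
{
#ifdef _OPENACC
  return _OPENACC;
#else
  return 0;
#endif
}

/* Nonzero when VERSION, a yyyymm value in the style of _OPENACC,
   is at least the OpenACC 2.5 value 201510.  */
int
is_openacc_25_or_later (long version)
{
  assert (version >= 0);
  return version >= 201510L;
}
```

A caller could write is_openacc_25_or_later (openacc_version ()) to guard code that relies on OpenACC 2.5 features.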
On 09/20/2018 11:22 AM, Paul Richard Thomas wrote:
> Hi Cesar,
>
> It looks OK to me.
>
> Thanks for the patch.
>
> Paul
Thanks! Committed in r264446.
Cesar
> On 20 September 2018 at 18:21, Cesar Philippidis
> wrote:
>> This patch updates Fortran's pa
On 09/20/2018 10:14 AM, Cesar Philippidis wrote:
> As Chung-Lin noted here
> <https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01079.html>:
>
> This patch adjusts omp-low.c:expand_omp_for_generic() to expand to a
> "sequential" loop form (without the OM
On 09/20/2018 09:10 AM, Bernhard Reutner-Fischer wrote:
> On Thu, 20 Sep 2018 07:41:08 -0700
> Cesar Philippidis wrote:
>
>> On 09/19/2018 03:27 PM, Bernhard Reutner-Fischer wrote:
>>> On Wed, 5 Sep 2018 12:52:03 -0700
>>> Cesar Philippidis wrote:
>
>>
nvptx offloading.
Cesar
[OpenACC] update fortran nested parallelism error messages
2018-09-24 Bernhard Reutner-Fischer
Cesar Philippidis
gcc/fortran/
* openmp.c (resolve_oacc_loop_blocks):
gcc/testsuite/
* gfortran.dg/goacc/nested-parallelism.f90: New test.
---
gcc/fortran/ope
On 09/25/2018 05:55 PM, Julian Brown wrote:
> On Tue, 7 Aug 2018 15:09:38 -0700
> Cesar Philippidis wrote:
>
>> I had previously posted this patch as part of a monster deviceptr
>> patch here
>> <https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01911.html>. Thi
x offloading.
Thanks,
Cesar
[OpenACC] C, C++ OpenACC wait diagnostic change
2018-XX-YY James Norris
Cesar Philippidis
gcc/c/
* c-parser.c (c_parser_oacc_wait_list): Change error message.
gcc/cp/
* parser.c (cp_parser_oacc_wait_list): Change error message.
gcc/testsuite/
This is an old gomp4 patch that updates the location of the clause for
acc enter/exit data. Apparently, it didn't impact any test cases. Is
this OK for trunk or should we drop it from OG8?
I bootstrapped and regtested it for x86_64 Linux with nvptx offloading.
Thanks,
Cesar
[OpenACC
hat do you want to do with this patch Thomas? I bootstrapped and
regtested it for x86_64 Linux with nvptx offloading.
Cesar
[OpenACC] Don't gimplify in ssa mode if seen_error in oacc_xform_loop
2018-XX-YY Tom de Vries
Cesar Philippidis
gcc/
PR tree-optimization/68977
* omp-offloa
tx BE. Therefore, I'm not sure if the nvptx port still needs support
for atomic fetch_and_*.
Tom and Thomas, do either of you have any thoughts on this? Should I
commit it to trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading.
Thanks,
Cesar
nvptx libgcc atomic routines
it for x86_64
Linux with nvptx offloading.
Cesar
[OpenACC] Use correct location information for OpenACC shape and simple
clauses in C/C++
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/c/
* c-parser.c (c_parser_oacc_shape_clause)
(c_parser_oacc_simple_clause): Add loc formal param
call random_number(fptr1)
!Test pointer reshape II
fptr3(1:2,1:2,1:2) => fptr1(4:)
end program
Note how fptr1 doesn't have a contiguous attribute. Does anyone have
thoughts on this? Maybe the ScaTeLib code needs to be updated.
Thanks,
Cesar
Disable "Assignment to contiguou
On 09/26/2018 01:49 PM, Thomas Koenig wrote:
> Hi Cesar,
>
>> As of GCC 8, gfortran now errors when a pointer with a contiguous
>> attribute is set to point to a target without a contiguous attribute. I
>> think this is overly strict, and should probably be demoted to
On 09/26/2018 12:50 PM, Joseph Myers wrote:
> On Wed, 26 Sep 2018, Cesar Philippidis wrote:
>
>> Attached is an old patch which updated the C and C++ FEs to use %<)%>
>> for the right ')' symbol. It's mostly a cosmetic change. All of the
>> change
OK for trunk? I bootstrapped and regression tested it for x86_64
Linux with nvptx offloading. This only touches the OpenACC code path.
Cesar
[OpenACC] Use oacc_verify_routine_clauses for C/C++
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/
* omp-general.c (oacc_build_routine_dims): M
r trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading.
Thanks
Cesar
[OpenACC] Add support for OpenACC routine nohost clause
(was OpenACC bind, nohost changes)
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/
* tree-core.h (omp_clause_code): Add OMP_CLA
ch too large.
Is this patch OK for trunk? I bootstrapped and regtested it for x86_64
Linux with nvptx offloading. This patch is also self-contained to the
OpenACC code path.
Thanks,
Cesar
[OpenACC] Repeated use of the OpenACC routine directive
2018-XX-YY Thomas Schwinge
Cesar Philippidis
this. Maybe certain intrinsic functions should default to
having an implied acc routine directive. But I suppose that's something
for another patch.
Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading.
Thanks,
Cesar
[PR fortran/72741] Check clauses
tries to use an acc routine with insufficient parallelism,
e.g., calling a gang routine from a vector loop.
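The patch in this message targets Fortran; the sketch below is a C analog of a routine call that does leave enough parallelism, assuming a vector routine called from a gang-partitioned loop. The names row_sum and all_row_sums are hypothetical. Marking row_sum as a gang routine and calling it from inside a vector loop would be the kind of mismatch the message describes. Without -fopenacc the pragmas are ignored and the code runs sequentially.

```c
#include <assert.h>

/* Vector-level routine: its loop uses vector parallelism only.  */
#pragma acc routine vector
int
row_sum (const int *row, int m)
{
  int s = 0;
#pragma acc loop vector reduction(+:s)
  for (int j = 0; j < m; j++)
    s += row[j];
  return s;
}

/* The caller partitions rows across gangs, so vector parallelism
   is still available inside each call to row_sum.  */
void
all_row_sums (const int *a, int n, int m, int *out)
{
  assert (n >= 0 && m >= 0);
#pragma acc parallel loop gang copyin(a[0:n*m]) copyout(out[0:n])
  for (int i = 0; i < n; i++)
    out[i] = row_sum (&a[i * m], m);
}
```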
Is this patch OK for trunk? I bootstrapped and regtested it for x86_64
Linux with nvptx offloading.
Thanks,
Cesar
[OpenACC] Check for sufficient parallelism when calling acc routines in fo
(although certain Fortran routines fall through to this).
Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading. This patch only touches the OpenACC code path.
Thanks,
Cesar
[OpenACC] Add warning for unused acc routine parallelism
(was [OpenACC] Don't err
is all an early work in progress. I'm still experimenting with some
other functionality. If you check out that branch, beware that it may be rebased.
Cesar
[OpenACC] Initial Manual Deep Copy
2018-10-02 Cesar Philippidis
gcc/c/
* c-typeck.c (handle_omp_array_sections_1): Enable structs
This patch updates the install documentation to point to the upstream
newlib sources instead of the Mentor Embedded github mirror. I don't see
tarballs for any point releases on newlib's website, so I added a
reference to the git revision containing the nvptx port.
Is this OK for trunk?
sk. But in the meantime, having tarballs for the
build dependencies would be nice.
> Otherwise OK.
>
> Btw, can you also update the GCC wiki with regard to this change?
Done. I added a new 'Build Dependencies' section to the nvptx wiki:
https://gcc.gnu.org/wiki/nvptx
Cesar
er by introducing a new gomp_coalesce_chunk structure with
explicit start and end members. Beyond that, there are no functional
changes to this patch.
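A minimal sketch of what a chunk record with explicit start and end members could look like, together with a helper that merges nearby ranges. Only the structure name gomp_coalesce_chunk and its start and end members come from this message; the helper add_chunk, its max_gap parameter, and the merge rule are assumptions for illustration and are not libgomp's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in: a half-open byte range [start, end).  */
struct gomp_coalesce_chunk
{
  size_t start;
  size_t end;
};

/* Hypothetical helper: record the range [start, start + len).  If it
   begins within MAX_GAP bytes after the previous chunk ends, extend
   that chunk; otherwise append a new one.  Ranges must be added in
   increasing order and CHUNKS must have room for one more entry.
   Returns the new chunk count.  */
size_t
add_chunk (struct gomp_coalesce_chunk *chunks, size_t count,
           size_t start, size_t len, size_t max_gap)
{
  assert (chunks != NULL);
  if (count > 0
      && start >= chunks[count - 1].end
      && start - chunks[count - 1].end <= max_gap)
    {
      chunks[count - 1].end = start + len;
      return count;
    }
  chunks[count].start = start;
  chunks[count].end = start + len;
  return count + 1;
}
```

Keeping both bounds in the record lets a merge decision be made from the previous chunk alone, without recomputing an end from a start and a separate length.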
Is it OK for trunk? I tested it against x86_64-linux with nvptx
acceleration.
Thanks,
Cesar
2018-05-02 Cesar Philippidis
libgomp/
* target
ls on at least
one legacy driver.
Cesar
2018-05-07 Cesar Philippidis
gcc/fortran/
* trans-openmp.c (gfc_omp_finish_clause): Don't create pointer data
mappings for deviceptr clauses.
(gfc_trans_omp_clauses_1): Likewise.
gcc/
* gimplify.c (enum gimplify_omp_var_data): Add GOVD_DEVICETP
This patch backports Jakub's gomp_copy_host2dev optimization from
<https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01800.html>. There were a
couple of changes required due to the new async infrastructure in og7.
I've applied this patch to og7.
Cesar
2018-05-07 Thomas Schw
c-c++-common/goacc/deviceptr-4.c -std=c++98 (test for excess
> errors)
I forgot to update the expected data mapping in deviceptr-4.c. Now,
instead of implicitly adding a 'copy' clause for known deviceptr
variables, the gimplifier will assign a force_deviceptr clause.
I've ap
cause cp_genericize_r uses if
statements to check for statement types instead of a huge switch statement.
Cesar
2018-05-15 Cesar Philippidis
PR c++/85782
gcc/cp/
* cp-gimplify.c (cp_genericize_r): Call genericize_omp_for_stmt for
OACC_LOOPs.
gcc/testsuite/
* c-c++-common/goacc/pr85782.c
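A minimal sketch of the construct involved in PR85782 as this thread describes it: an OpenACC loop whose body contains a continue statement. This is not the actual pr85782.c test; the function name count_odd and the clauses are hypothetical. Without -fopenacc the pragma is ignored and the loop runs sequentially.

```c
#include <assert.h>

/* Count the odd elements of A, skipping even ones with continue.  */
int
count_odd (const int *a, int n)
{
  assert (n >= 0);
  int count = 0;

#pragma acc parallel loop copyin(a[0:n]) reduction(+:count)
  for (int i = 0; i < n; i++)
    {
      if (a[i] % 2 == 0)
        continue;  /* The statement kind at issue in the PR.  */
      count++;
    }
  return count;
}
```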
Ping.
For reference, I've attached the patch for gcc7.
Cesar
On 05/15/2018 07:11 AM, Cesar Philippidis wrote:
> This patch resolves the issue in PR85782, which involves a C++ ICE
> caused by OpenACC loops which contain continue statements. The problem
> is that genericize_continu
also appear inside a reduction clause. I've also included a
fix for this in this patch.
Is this OK for gomp-4_0-branch?
Thanks,
Cesar
2014-08-11 Cesar Philippidis
gcc/fortran/
* openmp.c (oacc_compatible_clauses): New function.
(resolve_omp_clauses): Use it.
(oacc_current_ctx): Move it