On 10/5/18 07:07, Tom de Vries wrote:
> On 6/29/18 8:19 PM, Cesar Philippidis wrote:
>> The attached patch includes the nvptx and GCC ME reductions enhancements.
>>
>> Is this patch OK for trunk? It bootstrapped / regression tested cleanly
>> for x86_64 with nvptx off
This patch introduces a couple of compiler tests for the OpenACC
attach and detach clauses.
I've committed it to openacc-gcc-8-branch.
Cesar
2018-10-30 Cesar Philippidis
gcc/testsuite/
* c-c++-common/goacc/mdc-1.c: New test.
* c-c++-common/goacc/mdc-2.c: New test.
* g++.dg/goacc/
to get libgomp.oacc-c++/this.C to work.
I've committed this patch to openacc-gcc-8-branch.
Cesar
2018-10-30 Cesar Philippidis
gcc/cp/
* parser.c (cp_parser_omp_clause_name): Scan for attach and detach.
(cp_parser_oacc_data_clause): Handle PRAGMA_OACC_CLAUSE_{ATTA
anly. Other than
that, these patches are identical.
I've committed this patch to openacc-gcc-8-branch.
Cesar
2018-10-30 Cesar Philippidis
gcc/c/
* c-parser.c (c_parser_omp_clause_name): Scan for attach and detach.
(c_parser_oacc_data_clause): Handle PRAGMA_OACC_CLAUSE_
patch tweaks GOMP_MAP_DEEP_COPY because OG8 has a lot of
other map types for acc declare and dynamic arrays. I suspect that
change would be required for trunk too, eventually.
I've committed this patch to openacc-gcc-8-branch.
Cesar
2018-10-30 Cesar Philippidis
gcc/
*
On 10/5/18 23:22, Tom de Vries wrote:
> On 9/18/18 10:04 PM, Cesar Philippidis wrote:
>> 591973d3c3a [nvptx] use user-defined vectors when possible
>
> If I drop this patch, I get the same test results. Can you find a
> testcase for which this patch has an effect?
I just re
This patch introduces a couple of compiler tests for the OpenACC
attach and detach clauses.
Is this OK for trunk after the other patches get approved?
Thanks,
Cesar
2018-XX-YY Cesar Philippidis
gcc/testsuite/
* c-c++-common/goacc/mdc-1.c: New test.
* c-c++-common/goacc/mdc-2.c: New test
ed in the future to support more complicated C++
functionality.
Is this patch OK for trunk? I bootstrapped and regression tested it
for x86_64 Linux with nvptx offloading.
Thanks,
Cesar
2018-XX-YY Cesar Philippidis
gcc/cp/
* parser.c (cp_parser_omp_clause_name): Scan for attach and d
for attach or detach. Likewise, c_finish_omp_clauses
calls c_oacc_check_attachments to ensure that the variable is a
pointer.
Is this patch OK for trunk? I bootstrapped and regression tested it
for x86_64 Linux with nvptx offloading.
Thanks,
Cesar
2018-XX-YY Cesar Philippidis
gcc/c/
* c
.
Is this patch OK for trunk? I bootstrapped and regression tested it
for x86_64 Linux with nvptx offloading.
Thanks,
Cesar
2018-XX-YY Cesar Philippidis
gcc/
* gimplify.c (gimplify_adjust_omp_clauses): Filter out
GOMP_MAP_STRUCT for acc exit data.
(gimplify_omp_target_update): Promote GOMP_
is all an early work in progress. I'm still experimenting with some
other functionality. If you checkout that branch, beware it may be rebased.
Cesar
[OpenACC] Initial Manual Deep Copy
2018-10-02 Cesar Philippidis
gcc/c/
* c-typeck.c (handle_omp_array_sections_1): Enable structs
(although certain Fortran routines fall through to this).
Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading. This patch only touches the OpenACC code path.
Thanks,
Cesar
[OpenACC] Add warning for unused acc routine parallelism
(was [OpenACC] Don't err
tries to use an acc routine with insufficient parallelism,
e.g., calling a gang routine from a vector loop.
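To make the "insufficient parallelism" case concrete, here is a small standalone sketch. It is not the code from the patch: the enum and the function name are hypothetical, and it assumes the simplified rule that a routine may only be called from a loop that is partitioned at a strictly coarser level than the one the routine itself needs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ordering of OpenACC parallelism levels, coarsest first.  */
enum par_level { PAR_GANG, PAR_WORKER, PAR_VECTOR, PAR_SEQ };

/* Return true if a routine declared with ROUTINE parallelism may be called
   from inside a loop partitioned at LOOP level (gang, worker or vector).
   Simplified assumption: the routine needs a level that is strictly finer
   than the one already consumed by the enclosing loop.  */
static bool
routine_call_ok (enum par_level loop, enum par_level routine)
{
  assert (loop <= PAR_VECTOR && routine <= PAR_SEQ);
  return routine > loop;
}
```

With this rule, routine_call_ok (PAR_VECTOR, PAR_GANG) is false, which is the gang-routine-in-a-vector-loop case mentioned above, while a seq routine is accepted from any partitioned loop.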
Is this patch OK for trunk? I bootstrapped and regtested it for x86_64
Linux with nvptx offloading.
Thanks,
Cesar
[OpenACC] Check for sufficient parallelism when calling acc routines in fo
this. Maybe certain intrinsic functions should default to
having an implied acc routine directive. But I suppose that's something
for another patch.
Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading.
Thanks,
Cesar
[PR fortran/72741] Check clauses
ch too large.
Is this patch OK for trunk? I bootstrapped and regtested it for x86_64
Linux with nvptx offloading. This patch is also self-contained to the
OpenACC code path.
Thanks,
Cesar
[OpenACC] Repeated use of the OpenACC routine directive
2018-XX-YY Thomas Schwinge
Cesar Philippidis
r trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading.
Thanks
Cesar
[OpenACC] Add support for OpenACC routine nohost clause
(was OpenACC bind, nohost changes)
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/
* tree-core.h (omp_clause_code): Add OMP_CLA
OK for trunk? I bootstrapped and regression tested it for x86_64
Linux with nvptx offloading. This only touches the OpenACC code path.
Cesar
[OpenACC] Use oacc_verify_routine_clauses for C/C++
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/
* omp-general.c (oacc_build_routine_dims): M
On 09/26/2018 12:50 PM, Joseph Myers wrote:
> On Wed, 26 Sep 2018, Cesar Philippidis wrote:
>
>> Attached is an old patch which updated the C and C++ FEs to use %<)%>
>> for the right ')' symbol. It's mostly a cosmetic change. All of the
>> change
On 09/26/2018 01:49 PM, Thomas Koenig wrote:
> Hi Cesar,
>
>> As of GCC 8, gfortran now errors when a pointer with a contiguous
>> attribute is set to point to a target without a contiguous attribute. I
>> think this is overly strict, and should probably be demoted to
call random_number(fptr1)
!Test pointer reshape II
fptr3(1:2,1:2,1:2) => fptr1(4:)
end program
Note how fptr1 doesn't have a contiguous attribute. Does anyone have
thoughts on this? Maybe the ScaTeLib code needs to be updated.
Thanks,
Cesar
Disable "Assignment to contiguou
it for x86_64
Linux with nvptx offloading.
Cesar
[OpenACC] Use correct location information for OpenACC shape and simple
clauses in C/C++
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/c/
* c-parser.c (c_parser_oacc_shape_clause)
(c_parser_oacc_simple_clause): Add loc formal param
tx BE. Therefore, I'm not sure if the nvptx port still needs support
for atomic fetch_and_*.
Tom and Thomas, do either of you have any thoughts on this? Should I
commit it to trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading.
Thanks,
Cesar
nvptx libgcc atomic routines
What do you want to do with this patch, Thomas? I bootstrapped and
regtested it for x86_64 Linux with nvptx offloading.
Cesar
[OpenACC] Don't gimplify in ssa mode if seen_error in oacc_xform_loop
2018-XX-YY Tom de Vries
Cesar Philippidis
gcc/
PR tree-optimization/68977
* omp-offloa
This is an old gomp4 patch that updates the location of the clause for
acc enter/exit data. Apparently, it didn't impact any test cases. Is
this OK for trunk or should we drop it from OG8?
I bootstrapped and regtested it for x86_64 Linux with nvptx offloading.
Thanks,
Cesar
[OpenACC
x offloading.
Thanks,
Cesar
[OpenACC] C, C++ OpenACC wait diagnostic change
2018-XX-YY James Norris
Cesar Philippidis
gcc/c/
* c-parser.c (c_parser_oacc_wait_list): Change error message.
gcc/cp/
* parser.c (cp_parser_oacc_wait_list): Change error message.
gcc/testsuite/
On 09/25/2018 05:55 PM, Julian Brown wrote:
> On Tue, 7 Aug 2018 15:09:38 -0700
> Cesar Philippidis wrote:
>
>> I had previously posted this patch as part of a monster deviceptr
>> patch here
>> <https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01911.html>. Thi
nvptx offloading.
Cesar
[OpenACC] update fortran nested parallelism error messages
2018-09-24 Bernhard Reutner-Fischer
Cesar Philippidis
gcc/fortran/
* openmp.c (resolve_oacc_loop_blocks):
gcc/testsuite/
* gfortran.dg/goacc/nested-parallelism.f90: New test.
---
gcc/fortran/ope
On 09/20/2018 09:10 AM, Bernhard Reutner-Fischer wrote:
> On Thu, 20 Sep 2018 07:41:08 -0700
> Cesar Philippidis wrote:
>
>> On 09/19/2018 03:27 PM, Bernhard Reutner-Fischer wrote:
>>> On Wed, 5 Sep 2018 12:52:03 -0700
>>> Cesar Philippidis wrote:
>
>>
On 09/20/2018 10:14 AM, Cesar Philippidis wrote:
> As Chung-Lin noted here
> <https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01079.html>:
>
> This patch adjusts omp-low.c:expand_omp_for_generic() to expand to a
> "sequential" loop form (without the OM
On 09/20/2018 11:22 AM, Paul Richard Thomas wrote:
> Hi Cesar,
>
> It looks OK to me.
>
> Thanks for the patch.
>
> Paul
Thanks! Committed in r264446.
Cesar
> On 20 September 2018 at 18:21, Cesar Philippidis
> wrote:
>> This patch updates Fortran's pa
nvptx offloading.
Thanks,
Cesar
[OpenACC] Update _OPENACC value and documentation for OpenACC 2.5
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update "_OPENACC" to "201510".
gcc/fortran/
* cpp.c (cpp_define_builtins): Upd
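For readers who want to see what the new value means for user code, a minimal sketch follows. The helper names are hypothetical; the only facts taken from the entry above are the macro name _OPENACC and the value 201510 for OpenACC 2.5. When the compiler is run without OpenACC support the macro is undefined and the sketch reports 0.

```c
#include <assert.h>

/* Return the value of _OPENACC, or 0 when OpenACC support is not enabled.  */
static long
openacc_version (void)
{
#ifdef _OPENACC
  return _OPENACC;
#else
  return 0;
#endif
}

/* Return nonzero if the compiler advertises at least OpenACC 2.5 (201510).  */
static int
have_openacc_2_5 (void)
{
  long v = openacc_version ();
  assert (v >= 0);
  return v >= 201510L;
}
```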
.
The original discussion for this patch can be found here
<https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01872.html>.
Is this patch OK for trunk? I bootstrapped and regtested it for x86_64
Linux with nvptx offloading.
Thanks,
Cesar
[OpenACC] Set safelen to INT_MAX for oacc independent
t introduce any regressions. We do have a couple
of other standalone kernels patches in og8, but those depend on other
patches.
Thanks,
Cesar
[OpenACC] Propagate independent clause for OpenACC kernels pass
2018-XX-YY Chung-Lin Tang
Cesar Philippidis
gcc/
* cfgloop.h (struct loop): Add 'boo
'll add some support for member data OpenACC 2.6, but some of
the OpenACC C++ semantics are still unclear.
Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading.
Thanks,
Cesar
[PR71959] lto dump of callee counts
2018-XX-YY Nathan Sidwell
Cesar P
this for x86_64 Linux
with nvptx offloading.
Thanks,
Cesar
[OpenACC] Fix hang when running oacc exec with CUDA 9.0 nvprof
2018-XX-YY Tom de Vries
Cesar Philippidis
libgomp/
* oacc-init.c (acc_init_state_lock, acc_init_state, acc_init_thread):
New variables.
(acc_init_1): Set ac
ted it for x86_64
Linux with nvptx offloading.
Thanks,
Cesar
[OpenACC] handle missing OMP_LIST_ clauses in fortran's parse tree debugger
2018-XX-YY Cesar Philippidis
gcc/fortran/
* dump-parse-tree.c (show_omp_clauses): Add missing omp list_types
and reorder the switch cases to ma
ntions that this
allows the kernels parallelization to work when '#pragma acc loop'
makes the front-ends create OMP_FOR, which the loop analysis phases
don't understand.
I bootstrapped and regtested it on x86_64 Linux with nvptx offloading.
Is this patch OK for trunk?
T
this patch OK for trunk? I bootstrapped and regtested it for x86_64
Linux with nvptx offloading.
Thanks,
Cesar
Fix infinite recursion in OMP clause pretty-printing, default label
Apparently, Tom ran into an ICE when we were adding support for new
clauses back in the gomp-4_0-branch days. This
strapped and regtested it for x86_64 Linux with nvptx
offloading and I didn't encounter any regressions.
Thanks,
Cesar
[OpenACC] Fix acc_shutdown issue
2018-XX-YY James Norris
Cesar Philippidis
libgomp/
* oacc-init.c (acc_shutdown_1): Replace use of gomp_free_memmap with
go
directives
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01552.html
Is this OK for trunk? I bootstrapped and regtested on x86_64 Linux with
nvptx offloading.
Thanks,
Cesar
[OpenACC] Don't mark OpenACC auto loops as independent inside acc parallel regions
2018-XX-YY Cesar Philippidis
gc
unk? I verified that libgomp.pdf looks ok.
Thanks,
Cesar
[OpenACC] Update _OPENACC value and documentation for OpenACC 2.5
2018-XX-YY Thomas Schwinge
Cesar Philippidis
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update "_OPENACC" to "201510".
gcc/fortran/
* cpp.c
On 09/19/2018 03:27 PM, Bernhard Reutner-Fischer wrote:
> On Wed, 5 Sep 2018 12:52:03 -0700
> Cesar Philippidis wrote:
>
>> At present, gfortran does not encode the gang, worker or vector
>> parallelism clauses when it creates acc routines dim attribute for
>> subro
x target with nvptx
offloading.
Thanks,
Cesar
On 08/01/2018 04:12 AM, Tom de Vries wrote:
> On 07/31/2018 05:27 PM, Cesar Philippidis wrote:
>>/* Copy the (device) pointers to arguments to the device (dp and hp might
>> in
>> fact have the same value on a unified-memory system). */
>
> This comment
named atomic-fetch-2.c
incorrectly; there should be an underscore between atomic and fetch.
This patch also fixes that.
I tested this patch using both a standalone nvptx compiler and x86_64
Linux with nvptx offloading.
Cesar
[nvptx] Add atomic_fetch* support for SImode arguments.
2018-09-17
On 09/10/2018 10:37 AM, Jason Merrill wrote:
> On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown wrote:
>> This patch (by Cesar) changes the way C++ references are mapped in
>> OpenACC regions, fixing an ICE in the non-scalar-data.C testcase.
>>
>> Post-patch, references
} } */
> +
> +#include <string.h>
> +#include <stdlib.h>
> +#include <openacc.h>
> +
> +int
> +main (int argc, char **argv)
> +{
> + const int N = 256;
> + int i;
> + int async = 8;
> + unsigned char *h;
> +
> + h = (unsigned char *) malloc (N);
> +
> + for (i = 0; i < N; i++)
> +{
> + h[i] = i;
> +}
> +
> + acc_copyin_async (h, N, async);
> +
> + memset (h, 0, N);
> +
> + acc_wait (async);
> +
> + acc_copyout_async (h, N, async + 1);
> +
> + acc_wait (async + 1);
> +
> + for (i = 0; i < N; i++)
> +{
> + if (h[i] != i)
> + abort ();
> +}
> +
> + free (h);
> +
> + return 0;
> +}
> Index: libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c
> ===
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c (nonexistent)
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c (working copy)
> @@ -0,0 +1,45 @@
> +/* { dg-do run } */
> +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
> +
> +#include <string.h>
> +#include <stdlib.h>
> +#include <openacc.h>
> +
> +int
> +main (int argc, char **argv)
> +{
> + const int N = 256;
> + int i, q = 5;
> + unsigned char *h, *g;
> + void *d;
> +
> + h = (unsigned char *) malloc (N);
> + g = (unsigned char *) malloc (N);
> + for (i = 0; i < N; i++)
> +{
> + g[i] = i;
> +}
> +
> + acc_create_async (h, N, q);
> +
> + acc_memcpy_to_device_async (acc_deviceptr (h), g, N, q);
> + memset (&h[0], 0, N);
> +
> + acc_wait (q);
> +
> + acc_update_self_async (h, N, q + 1);
> + acc_delete_async (h, N, q + 1);
> +
> + acc_wait (q + 1);
> +
> + for (i = 0; i < N; i++)
> +{
> + if (h[i] != i)
> + abort ();
> +}
> +
> + free (h);
> + free (g);
> +
> + return 0;
> +}
> Index: libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
> ===
> --- libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 (nonexistent)
> +++ libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 (working copy)
> @@ -0,0 +1,57 @@
> +! { dg-do run }
> +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
> +
> +program main
> + use openacc
> + implicit none
> +
> + integer, parameter :: N = 256
> + integer, allocatable :: h(:)
> + integer :: i
> + integer :: async = 5
> +
> + allocate (h(N))
> +
> + do i = 1, N
> +h(i) = i
> + end do
> +
> + call acc_copyin (h)
> +
> + do i = 1, N
> +h(i) = i + i
> + end do
> +
> + call acc_update_device_async (h, sizeof (h), async)
> +
> + if (acc_is_present (h) .neqv. .TRUE.) call abort
> +
> + h(:) = 0
> +
> + call acc_copyout_async (h, sizeof (h), async)
> +
> + call acc_wait (async)
> +
> + do i = 1, N
> +if (h(i) /= i + i) call abort
> + end do
> +
> + call acc_copyin (h, sizeof (h))
> +
> + h(:) = 0
> +
> + call acc_update_self_async (h, sizeof (h), async)
> +
> + if (acc_is_present (h) .neqv. .TRUE.) call abort
> +
> + do i = 1, N
> +if (h(i) /= i + i) call abort
> + end do
> +
> + call acc_delete_async (h, async)
> +
> + call acc_wait (async)
> +
> + if (acc_is_present (h) .neqv. .FALSE.) call abort
> +
> +end program
>
While I can't approve this patch, it seems reasonable to me. I like how
you cleaned up things from OG8 (e.g., replacing 'return (n ? 1 : 0)' with
'return n != NULL'). Are there any other OG8 async patches in your queue?
Thanks,
Cesar
On 09/05/2018 07:30 AM, Tom de Vries wrote:
> On 09/05/2018 12:19 AM, Cesar Philippidis wrote:
>> On 09/02/2018 07:57 AM, Cesar Philippidis wrote:
>>> On 09/01/2018 12:04 PM, Tom de Vries wrote:
>>>> On 08/31/2018 04:14 PM, Cesar Philippidis wrote:
>>
for (...)
{
#pragma acc loop reduction(+:s2)
Here s2 will be transferred into the accelerator as firstprivate instead
of copy.
Is this OK for trunk? I regtested and bootstrapped for x86_64 with nvptx
offloading.
Cesar
[OpenACC] Enable firstprivate OpenACC reductions
2018-XX-YY Cesar
vectors fit inside workers. The target hook
itself doesn't do anything for the host, but the nvptx BE will make use
of it.
Is this patch OK for trunk? I regtested and bootstrapped for x86_64 with
nvptx offloading.
Thanks,
Cesar
[openacc] Add target hook TARGET_GOACC_ADJUST_PARALLELISM
gcc/
,
Cesar
[openacc] Teach gfortran to lower OpenACC routine dims
gcc/fortran/
* gfortran.h (oacc_function): New enum.
(gfc_oacc_routine_name): Add locus loc field.
* openmp.c (gfc_oacc_routine_dims): Return oacc_function.
(gfc_match_oacc_routine): Update routine clause syntax checking.
Populate
On 09/02/2018 07:57 AM, Cesar Philippidis wrote:
> On 09/01/2018 12:04 PM, Tom de Vries wrote:
>> On 08/31/2018 04:14 PM, Cesar Philippidis wrote:
>
>>> Is this patch OK for trunk?
>>>
>>
>> Well, how did you test this (
>> https://gcc.gnu.
On 09/01/2018 12:04 PM, Tom de Vries wrote:
> On 08/31/2018 04:14 PM, Cesar Philippidis wrote:
>> Is this patch OK for trunk?
>>
>
> Well, how did you test this (
> https://gcc.gnu.org/contribute.html#patches : "Bootstrapping and
> testing. State the host and ta
instructions for __atomic_fetch_{add,and,or,xor} for DI
integers. Without -misa, GCC would use an atomic CAS loop for them. As
an aside, this patch also enables PTX atom instructions for those
aforementioned functions for SI integers.
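To illustrate the difference between the two strategies, here is a host-side sketch using GCC's __atomic builtins. It is not what the nvptx BE emits; the function names are made up for illustration. The first function spells out the compare-and-swap loop, the second uses the fetch-and-add builtin directly.

```c
#include <assert.h>
#include <stdint.h>

/* Emulate an atomic fetch-and-add on a 64-bit value with a
   compare-and-swap loop.  Returns the value seen before the addition.  */
static uint64_t
fetch_add_cas (uint64_t *ptr, uint64_t val)
{
  assert (ptr != 0);
  uint64_t old = __atomic_load_n (ptr, __ATOMIC_RELAXED);
  /* On failure the builtin stores the current contents of *PTR into OLD,
     so each retry works with a fresh value.  */
  while (!__atomic_compare_exchange_n (ptr, &old, old + val, 0,
				       __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
    ;
  return old;
}

/* The same operation expressed with the dedicated builtin, which a target
   with a native atomic add can expand to a single instruction.  */
static uint64_t
fetch_add_builtin (uint64_t *ptr, uint64_t val)
{
  return __atomic_fetch_add (ptr, val, __ATOMIC_SEQ_CST);
}
```

Both functions return the old value and leave *ptr incremented by val; the and/or/xor variants follow the same pattern with a different combining operation.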
Is this patch OK for trunk?
Thanks,
Cesar
Basic -misa support for nvptx
On 08/28/2018 02:32 PM, Julian Brown wrote:
> On Tue, 28 Aug 2018 12:23:22 -0700
> Cesar Philippidis wrote:
>> This is specific to OpenACC, and needs to be guarded as such.
>
> Are you sure that condition can be true for OpenMP? I'd assumed not...
My bad, you're co
> + if (OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_FORCE_PRESENT
> + && TREE_CODE (decl) == PARM_DECL
>&& GFC_ARRAY_TYPE_P (TREE_TYPE (decl))
>&& GFC_TYPE_ARRAY_AKIND (TREE_TYPE (decl)) == GFC_ARRAY_UNKNOWN
>&& GFC_TYPE_ARRAY_UBOUND (TREE_TYPE (decl),
This is specific to OpenACC, and needs to be guarded as such.
Cesar
On 08/13/2018 11:42 AM, Cesar Philippidis wrote:
> On 08/13/2018 09:21 AM, Julian Brown wrote:
>
>> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
>> b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
>> new file mode 100644
>> inde
. However, I see this regression on the host:
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/loop-gwv-2.c
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -O2 execution test
There could be other regressions, but I only tested the new tests
introduced by the patch so far.
Cesar
On 08/13/2018 08:08 AM, Tom de Vries wrote:
> On 08/13/2018 04:54 PM, Cesar Philippidis wrote:
>> Going
>> forward, how would you like to proceed with the nvptx BE vector length
>> changes?
>
> Do you have a branch available on github containing the patch series
>
On 08/13/2018 05:04 AM, Tom de Vries wrote:
> On 08/10/2018 08:39 PM, Cesar Philippidis wrote:
>> is that I modified the default value for vectors as follows
>>
>> +int vectors = default_dim_p[GOMP_DIM_VECTOR]
>> + ? 0 : dims[GOMP_DIM_VECTOR];
>
On 08/08/2018 08:19 AM, Tom de Vries wrote:
> On Wed, Aug 08, 2018 at 07:09:16AM -0700, Cesar Philippidis wrote:
>> On 08/07/2018 06:52 AM, Cesar Philippidis wrote:
Thanks for review. This version should address all of the following
remarks. However, one thing to note ...
>> [
On 08/07/2018 06:52 AM, Cesar Philippidis wrote:
> I attached an updated version of the CUDA driver patch, although I
> haven't rebased it against your changes yet. It still needs to be tested
> against CUDA 5.5 using the systems/Nvidia's cuda.h. But I wanted to give
> yo
using
GOMP_MAP_FORCE_DEVICEPTR.
Is this patch OK for trunk? It bootstrapped / regression tested cleanly
for x86_64 with nvptx offloading.
Thanks,
Cesar
From b5cf37b795ce78c78f3f434ac6999f7094bd86aa Mon Sep 17 00:00:00 2001
From: Cesar Philippidis
Date: Mon, 7 May 2018 08:23:48 -0700
Subject: [PATCH] [OpenAC
OK for trunk? I bootstrapped and regression tested it for
x86_64 with nvptx offloading.
Thanks,
Cesar
From 576b2a7d5574400f067ec309929b38b324d8c6f6 Mon Sep 17 00:00:00 2001
From: Cesar Philippidis
Date: Fri, 27 Jan 2017 14:58:16 +
Subject: [PATCH] [OpenACC] Don't error on implicitly priva
to be transferred in via firstprivate
because that would use up a lot of memory on the accelerator.
Is this OK for trunk? I bootstrapped and regtested it for x86_64 with
nvptx offloading.
Thanks,
Cesar
From b8fb83b36d0f96b12af9a1f5596f31b3c6b72ef0 Mon Sep 17 00:00:00 2001
From: Cesar Philippidis
Date
This patch updates how the OpenACC tile clause is handled in the Fortran
FE to match its behavior in C/C++. Specifically, the tile clause now
errors on negative integer arguments, instead of emitting a warning.
Is this OK for trunk?
Thanks,
Cesar
From af39a6d65cfb46397fa62c8852118900
This patch removes a stale reference to trans-openacc.c in
gcc/fortran/trans-statement.h. I'll apply it to trunk as obvious shortly.
Cesar
From a08fe168c3f3ca4d446915ad26027786cda58394 Mon Sep 17 00:00:00 2001
From: Cesar Philippidis
Date: Tue, 14 Mar 2017 22:33:00 +
Subject
your changes yet. It still needs to be tested
against CUDA 5.5 using the systems/Nvidia's cuda.h. But I wanted to give
you an update.
Does this patch look OK, at least after testing completes? I removed the
tests for CUDA_ONE_CALL_MAYBE_NULL, because the newer CUDA API isn't
supported in the o
On 08/03/2018 08:22 AM, Tom de Vries wrote:
> On 08/01/2018 09:11 PM, Cesar Philippidis wrote:
>> On 08/01/2018 07:12 AM, Tom de Vries wrote:
>>
>>>>>> + gangs = grids * (blocks / warp_size);
>>>>>
>>>>> So, we launch wit
lock, the driver
occupancy calculator ends up launching fewer gangs.
I don't have a firm position with this default behavior. Perhaps we
should just set
gang = grids
That's probably an improvement over what's there now.
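For comparison, the two defaults can be written side by side. This is only a sketch: the function names are hypothetical, and grids and blocks are assumed to be the grid and block sizes suggested by the driver's occupancy calculator, as in the quoted hunk.

```c
#include <assert.h>

/* Default from the quoted hunk: scale the grid count by the number of
   warps that fit in one block.  */
static int
default_gangs_scaled (int grids, int blocks, int warp_size)
{
  assert (warp_size > 0);
  return grids * (blocks / warp_size);
}

/* Simpler alternative floated above: one gang per suggested grid.  */
static int
default_gangs_simple (int grids)
{
  return grids;
}
```

For example, with grids = 40, blocks = 1024 and a warp size of 32, the scaled form yields 1280 gangs while the simple form yields 40.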
Cesar
On 08/01/2018 03:18 AM, Tom de Vries wrote:
> On 07/31/2018 04:58 PM, Cesar Philippidis wrote:
>> The attached patch teaches libgomp how to use the CUDA thread occupancy
>> calculator built into the CUDA driver. Despite both being based off the
>> CUDA thread occupancy spreads
On 08/01/2018 04:01 AM, Tom de Vries wrote:
> On 07/31/2018 05:12 PM, Cesar Philippidis wrote:
>> This is an old patch which removes the struct map from the nvptx plugin.
>> I believe at one point this was supposed to be used to manage async data
>> mappings, but in practice
'll need to fix PR86757 before we push the gangprivate
changes upstream.
Julian, I'm not sure if the GCN port supports gangprivate memory. If it
does, you might be hit by this failure at -O0. But those tests have
already been xfailed, so you should be OK.
Cesar
[og8] More goacc_parlevel enh
I've committed this patch to og8 which backports the first of Tom's
goacc_parlevel patches from mainline. I'll post a followup patch
which contains various bug fixes. I believe that this patch was
originally introduced in PR82428, or at least it resolves that PR.
bootstrapped and regtested for x86_64 with nvptx
offloading.
Thanks,
Cesar
[PATCH] [libgomp] Truncate config/nvptx/oacc-parallel.c
2018-XX-YY Cesar Philippidis
Thomas Schwinge
libgomp/
* config/nvptx/oacc-parallel.c: Truncate.
(cherry picked from gomp-4_0-branch r228836)
---
libgomp
s this patch OK for trunk? I bootstrapped and regression tested it for
x86_64 with nvptx offloading.
Thanks,
Cesar
[PATCH] [nvptx] Remove use of CUDA unified memory in libgomp
2018-XX-YY Cesar Philippidis
libgomp/
* plugin/plugin-nvptx.c (struct cuda_map): New.
(struct ptx_stream): Replac
This is an old patch which removes the struct map from the nvptx plugin.
I believe at one point this was supposed to be used to manage async data
mappings, but in practice that never worked out.
Is this OK for trunk? I bootstrapped and regtested on x86_64 with nvptx
offloading.
Thanks,
Cesar
existing defaults. Maybe the og8 thread
occupancy would make a better default for older versions of CUDA, but
that's a patch for another day.
Is this patch OK for trunk? I bootstrapped and regression tested it
using x86_64 with nvptx offloading.
Thanks,
Cesar
[nvptx] Use CUDA driver API to se
ng per-device default dimensions.
Neat, thanks!
I wonder if it's worthwhile to optimize the case where a system has more
than one identical GPU.
Cesar
be
your email address, however, it really wants it to be in the form
Full Name
This is not a huge deal because the email went through, but it was
something that wasn't immediately obvious to me.
Cesar
Hi Tom,
I see that you're reviewing the libgomp changes. Please disregard the
following hunk:
On 07/11/2018 12:13 PM, Cesar Philippidis wrote:
> @@ -1199,12 +1202,59 @@ nvptx_exec (void (*fn), size_t mapnum, void
> **hostaddrs, void **devaddrs,
>
On 07/26/2018 01:33 AM, Richard Biener wrote:
> On Wed, Jul 25, 2018 at 5:30 PM Cesar Philippidis
> wrote:
>>
>> This patch teaches GCC to inform the user how it assigned parallelism
>> to each OpenACC loop at compile time using the -fopt-info-note-omp
>> fla
On 07/24/2018 01:47 PM, ce...@codesourcery.com wrote:
> From: Cesar Philippidis
>
> This patch series contains various cleanups and structural
> reorganizations to the NVPTX BE in preparation for the forthcoming
> variable length vector length enhancements. Tom, in order to make
On 07/25/2018 08:32 AM, Marek Polacek wrote:
> On Wed, Jul 25, 2018 at 08:29:17AM -0700, Cesar Philippidis wrote:
>> The fortran FE incorrectly records the line locations of combined acc
>> loop directives when it lowers the construct to gimple. Usually this
>> isn't a p
distinguish which parallelism was specified by the user and which was
assigned by the compiler. But that can be added in a follow up patch.
Is this patch OK for trunk? I bootstrapped and regtested it for x86_64
with nvptx offloading.
Thanks,
Cesar
2018-XX-YY Cesar Philippidis
gcc
able to resolve a couple of
long standing diagnostics discrepancies between the c/c++ FEs in the
test suite.
Is this patch OK for trunk? I bootstrapped and regtested using x86_64
with nvptx offloading.
Thanks,
Cesar
2018-XX-YY Cesar Philippidis
gcc/cp/
* par
nvptx offloading.
Thanks,
Cesar
2018-XX-YY Cesar Philippidis
gcc/fortran/
* trans-openmp.c (gfc_trans_oacc_combined_directive): Set the
location of combined acc loops.
(cherry picked from gomp-4_0-branch r245653)
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-ope
in this series are independent from one
another. Patches 1 and 2 fix diagnostics bugs involving incorrect line
numbers. Patch 3 is responsible for generating the actual diagnostics.
Cesar
From: Cesar Philippidis
Chung-Lin had originally defined TARGET_SET_CURRENT_FUNCTION as part
of his gang-local variable patch. But I intend to introduce those
changes at a later time. Eventually the state propagation code will
utilize nvptx_set_current_function to reset the reduction buffer
Tom de Vries
Cesar Philippidis
gcc/
* config/nvptx/nvptx.c (oacc_bcast_partition): Declare.
(nvptx_option_override): Init oacc_bcast_partition.
(nvptx_init_oacc_workers): New function.
(nvptx_declare_function_name): Call
From: Tom de Vries
This patch replaces the confusing, in-lined min, max and rounding code
sprinkled throughout the nvptx BE with calls to MIN, MAX, and ROUND_UP
macros.
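As a reminder of what these macros do, here is a standalone sketch. The definitions below are written in the usual style and are only assumed to match the ones GCC provides; clamp_and_align is a hypothetical helper, not a function from nvptx.c.

```c
#include <assert.h>

/* Sketch definitions in the conventional style.  Each argument may be
   evaluated more than once, so avoid side effects in the operands.  */
#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))
#define CEIL(X, Y) (((X) + (Y) - 1) / (Y))
#define ROUND_UP(X, F) (CEIL (X, F) * (F))

/* Hypothetical helper: clamp SIZE to the range [LO, HI], then round the
   result up to the next multiple of ALIGN.  */
static unsigned
clamp_and_align (unsigned size, unsigned lo, unsigned hi, unsigned align)
{
  assert (align > 0 && lo <= hi);
  unsigned clamped = MIN (MAX (size, lo), hi);
  return ROUND_UP (clamped, align);
}
```

For instance, clamp_and_align (70, 32, 1024, 32) gives 96: 70 is already within the range, and 96 is the next multiple of 32.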
2018-XX-YY Tom de Vries
Cesar Philippidis
gcc/
* config/nvptx/nvptx.c
From: Tom de Vries
This patch introduces an axis_dim member to the machine_function
struct. The launch geometry will be queried frequently enough so that
it's justified to store that information with each cfun.
2018-XX-YY Tom de Vries
Cesar Philippidis
gcc
From: Cesar Philippidis
This patch teaches nvptx_single to always use barrier '0' for CTA
synchronization. This started off as a cosmetic change, but later on
each large vector (i.e. one that is larger than a PTX warp) will need to
use its own unique thread barrier to avoid thread
Cesar Philippidis
gcc/
* config/nvptx/nvptx.md (nvptx_barsync): Add and handle operand.
* config/nvptx/nvptx.c (nvptx_cta_sync): Change arguments to take in a
lock and thread count. Update call to gen_nvptx_barsync.
(nvptx_single, nvptx_process_pars): Update
From: Tom de Vries
This patch only adjusts white space.
2018-XX-YY Cesar Philippidis
gcc/
* config/nvptx/nvptx.c (nvptx_single): Fix whitespace.
(nvptx_neuter_pars): Likewise.
(cherry picked from openacc-gcc-7-branch commit
10f697dfcdaa77b842de6e9a62c68b33a49d3c16
From: Cesar Philippidis
This patch introduces a new struct offload_attrs, which contains the
details regarding the offload function launch geometry. In addition to
its current usage to neuter worker and vector threads, it will
eventually be used to validate the compile-time launch geometry
From: Cesar Philippidis
This patch renames various state propagation functions to something
that reflects their usage in generic worker and vector contexts. E.g.,
whereas before nvptx_wpropagate used to be used exclusively for worker
state propagation, it will eventually be used for any state
From: Cesar Philippidis
Eventually, we want the nvptx BE to use a common shared memory buffer
for both worker and vector state propagation (albeit using different
partitions of shared memory for each logical thread). This patch
renames the worker_bcast variables into a more generic oacc_bcast
From: Cesar Philippidis
This patch series contains various cleanups and structural
reorganizations to the NVPTX BE in preparation for the forthcoming
variable length vector length enhancements. Tom, in order to make
these changes easier for you to review, I broke these patches into
logical
From: Cesar Philippidis
Besides updating the macros for the NVPTX OpenACC dims, this patch
also renames PTX_GANG_DEFAULT to PTX_DEFAULT_RUNTIME_DIM. I had
originally included the PTX_GANG_DEFAULT hunk in an earlier libgomp
patch, but going forward it makes sense to isolate the nvptx and