On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
(nvptx_declare_function_name): Emit a .maxntid directive hint and
call nvptx_init_oacc_workers.
+
+ /* Emit a .maxntid hint to help the PTX JIT emit SYNC branches. */
+ if (lookup_attribute ("omp target entrypoint", DECL_ATTR
On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
The attached patch generalizes the worker state propagation and
synchronization code to handle large vectors. When the vector_length is
larger than a CUDA warp, the nvptx BE will now use shared-memory to
spill-and-fill vector state when transitioni
On 03/21/2018 04:43 PM, Richard Biener wrote:
On Wed, 21 Mar 2018, Tom de Vries wrote:
On 03/12/2018 01:14 PM, Richard Biener wrote:
On Thu, 22 Feb 2018, Tom de Vries wrote:
Hi,
this patch fixes an ICE in the parloops pass.
The ICE (when compiling the test-case in attached patch) follows
On 03/22/2018 04:11 PM, Cesar Philippidis wrote:
On 03/22/2018 07:23 AM, Tom de Vries wrote:
On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
(nvptx_declare_function_name): Emit a .maxntid directive hint and
call nvptx_init_oacc_workers.
+
+ /* Emit a .maxntid hint to help the PTX
On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
+ rtx red_partition; /* Similar to bcast_partition, except for vector
+ reductions. */
Shouldn't this be in "[og7] vector_length extension part 3: reductions"?
Thanks,
- Tom
On 03/22/2018 06:24 PM, Cesar Philippidis wrote:
On 03/22/2018 09:18 AM, Tom de Vries wrote:
That's obviously not good enough.
When I compile this test-case:
...
int
main (void)
{
int a[10];
#pragma acc parallel num_workers (16)
#pragma acc loop worker
for (int i = 0; i &l
On 03/22/2018 06:47 PM, Cesar Philippidis wrote:
On 03/22/2018 10:39 AM, Tom de Vries wrote:
On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
+ rtx red_partition; /* Similar to bcast_partition, except for vector
+ reductions. */
Shouldn't this be in "[og7] vec
On 03/22/2018 08:04 PM, Cesar Philippidis wrote:
I'm going to retest the variable vector length changes without it and
see if it's still necessary. On one hand, maxntid should be fairly
innocuous, but I don't like how it can mask other PTX JIT bugs. At this
point, I'm leaning towards dropping it
ne to clean up whitespace, but please do that in separate patches.
Committed.
Thanks,
- Tom
[nvptx] Fix whitespace in nvptx_single
2018-03-23 Tom de Vries
* config/nvptx/nvptx.c (nvptx_single): Fix whitespace.
---
gcc/config/nvptx/nvptx.c | 2 +-
1 file changed, 1 insertion(+), 1 deleti
iable names".
It's good to add it back, but that needs to be a separate patch.
Committed.
Thanks,
- Tom
[nvptx] Re-add removed struct parallel comment
2018-03-23 Tom de Vries
* config/nvptx/nvptx.c (struct parallel): Re-add comment.
---
gcc/config/nvptx/nvptx.c | 3 +++
1
This is wrong. The first operand can be a register or a constant, and
the second operand is independent. Whether or not we print the second
operand is independent of whether the first is a register.
In this patch I've reserved INTVAL (operands[1]) == 0 for the "no second
operand" case.
On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
+ if (cfun->machine->sync_bar)
+fprintf (file, "\t\tadd.u32\t\t%%r%d, %%tidy, 1; "
+"// vector synchronization barrier\n",
+REGNO (cfun->machine->sync_bar));
I realize that atm we don't support large vector length whe
f this problem is filed as PR82806 - Stabilize paths
in assembler and dumps ]
Committed.
Thanks,
- Tom
[testsuite] Make scan pattern more precise in vrp104.c
2018-03-24 Tom de Vries
* gcc.dg/tree-ssa/vrp104.c: Make scan-tree-dump-times pattern more
precise.
---
gcc/testsuite/gcc.dg/tree-ssa/vrp1
x86_64.
Build x86_64 with nvptx accelerator and reg-tested libgomp.
OK for stage4 or stage1?
Thanks,
- Tom
Fix switch conversion in offloading functions
2018-03-25 Tom de Vries
PR tree-optimization/85063
* omp-general.c (offloading_function_p): New function. Factor out
of ...
*
On 03/07/2018 04:01 PM, Richard Biener wrote:
On Wed, 7 Mar 2018, Tom de Vries wrote:
On 03/07/2018 02:29 PM, Richard Biener wrote:
On Wed, 7 Mar 2018, Jakub Jelinek wrote:
On Wed, Mar 07, 2018 at 02:20:26PM +0100, Tom de Vries wrote:
Fix ICE for static vars in offloaded functions
2018-03
On 03/02/2018 08:18 PM, Cesar Philippidis wrote:
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index ba3f4317f4e..f15ce6b8f8d 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -626,7 +626,8 @@ oacc_parse_default_dims (const char *dims)
function. */
static void
-oacc_valid
On 03/02/2018 08:18 PM, Cesar Philippidis wrote:
introduces a new goacc adjust_parallelism target hook.
That's another separate patch.
Committed.
Thanks,
- Tom
[openacc] Add target hook TARGET_GOACC_ADJUST_PARALLELISM
2018-03-26 Cesar Philippidis
Tom de Vries
* doc/tm.te
On 03/02/2018 08:18 PM, Cesar Philippidis wrote:
The attached patch adjusts the existing goacc validate_dims target hook
This is overkill. All we need is a function
"int oacc_get_default_dim (int dim)".
Thanks,
- Tom
On 03/26/2018 11:57 PM, Cesar Philippidis wrote:
As noted in PR85056, the nvptx BE isn't declaring external arrays using
PTX array notation. Specifically, it's emitting code that's missing the
empty angle brackets '[]'.
[ FYI, see https://en.wikipedia.org/wiki/Bracket
For '[]' I find "square
On 03/26/2018 06:33 PM, Tom de Vries wrote:
+ loop->mask = targetm.goacc.adjust_parallelism (loop->mask, outer_mask);
loop->mask |= this_mask;
I committed the above, but the original:
...
@@ -1397,6 +1407,8 @@ oacc_loop_auto_partitions (oacc_loop *loop, unsigned
o
On 03/02/2018 09:47 PM, Cesar Philippidis wrote:
two test cases.
Committed as separate patch, while ignoring the warnings "using
vector_length \\(32\\), ignoring 128".
Thanks,
- Tom
[openacc] Add vector_length 128 testcases
2018-03-27 Cesar Philippidis
Tom de Vries
*
On 03/28/2018 03:43 PM, Cesar Philippidis wrote:
OK for stage4 trunk.
Can I backport this patch to GCC 6 and 7?
Yes please.
Thanks,
- Tom
Hi,
Consider an lto multi-source test-case main.c and foo.c:
..
$ cat main.c
extern int foo (void);
int
main ()
{
return foo () + 1;
}
$ cat foo.c
int __attribute__((noinline, noclone))
foo (void)
{
return 2;
}
...
When compiling the test-case like this:
...
$ gcc main.c foo.c -O2 -flto -s
On 03/29/2018 11:11 AM, Tom de Vries wrote:
Hi,
Consider an lto multi-source test-case main.c and foo.c:
..
$ cat main.c
extern int foo (void);
int
main ()
{
return foo () + 1;
}
$ cat foo.c
int __attribute__((noinline, noclone))
foo (void)
{
return 2;
}
...
When compiling the test
Hi,
when we compile a function with attributes:
...
int __attribute__((noinline, noclone))
foo (void)
{
return 2;
}
...
like this:
...
gcc main.c -fdump-tree-all -fdump-rtl-all
...
we find the function attributes starting from foo.c.004t.gimple:
...
__attribute__((noclone, noinline))
foo ()
{
[ Fix ENOPATCH ]
On 03/29/2018 12:17 PM, Tom de Vries wrote:
Hi,
when we compile a function with attributes:
...
int __attribute__((noinline, noclone))
foo (void)
{
return 2;
}
...
like this:
...
gcc main.c -fdump-tree-all -fdump-rtl-all
...
we find the function attributes starting from
On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
As a follow up patch will show, the nvptx BE falls back to using
vector_length = 32 when a vector loop is nested inside a worker loop.
I disabled the fallback, and analyzed the vred2d-128.c illegal memory
access execution failure.
I minimized
d by the
offloading lto1 invocation.
Tested libgomp on x86_64 build with nvptx accelerator.
Committed.
Thanks,
- Tom
[testsuite] Add scan-offload-tree-dump
2018-03-28 Tom de Vries
PR testsuite/85106
* lib/scanoffloadtree.exp: New file.
* testsuite/lib/libgomp-dg.exp (libgomp-dg-test
ot 128 vector_length.
Tested libgomp on x86_64 build with nvptx accelerator.
Committed.
Thanks,
- Tom
[openacc] Add vector-length-128-{1,2,3}.c test-cases
2018-03-30 Tom de Vries
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/vector
On 03/30/2018 03:07 AM, Tom de Vries wrote:
On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
As a follow up patch will show, the nvptx BE falls back to using
vector_length = 32 when a vector loop is nested inside a worker loop.
I disabled the fallback, and analyzed the vred2d-128.c illegal
On 03/30/2018 05:00 PM, Cesar Philippidis wrote:
I should
have checked that patch with the vector length fallback disabled.
Right. The patch series introduces a lot of code that is not exercised.
I've added an -mlong-vector-in-workers option in my local branch and
added 3 test-cases to exerci
On 03/25/2018 04:30 PM, Thomas Koenig wrote:
[This is take two, the first one was rejected due to size].
Hello world,
the does what the ChangeLog and the Subject say. Regression-tested
on x86_64-pc-linux-gnu.
FTR, this caused PR85166 - "[nvptx, libgfortran] Libgomp fortran tests
using stop
lent to:
...
psize = ROUND_UP (psize, oacc_bcast_align);
...
This patch also replaces all such occurrences with ROUND_UP.
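For readers unfamiliar with the macro, here is a minimal sketch of the rounding idiom. The CEIL and ROUND_UP definitions below are local stand-ins written in the usual ceiling-division form, not copied from the GCC headers. The open-coded mask variant is an assumption for illustration, since the message above is truncated before it shows which form the patch replaced.

```c
#include <assert.h>

/* Local stand-ins, assumed to behave like the macros of the same name;
   defined here only so the sketch is self-contained.  */
#define CEIL(x, y) (((x) + (y) - 1) / (y))
#define ROUND_UP(x, f) (CEIL (x, f) * (f))

/* A common open-coded way to round PSIZE up to a multiple of ALIGN.
   Only valid when ALIGN is a power of two.  */
static unsigned int
round_up_open_coded (unsigned int psize, unsigned int align)
{
  /* A power of two has exactly one bit set.  */
  assert (align != 0 && (align & (align - 1)) == 0);
  return (psize + align - 1) & ~(align - 1);
}

/* The same computation expressed with ROUND_UP.  This form works for any
   nonzero ALIGN, not just powers of two.  */
static unsigned int
round_up_macro (unsigned int psize, unsigned int align)
{
  assert (align != 0);
  return ROUND_UP (psize, align);
}
```

For a power-of-two alignment both functions give the same result, which is what makes replacing one with the other a pure cleanup.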
Build on x86_64 with nvptx accelerator and reg-tested libgomp.
Committed.
Thanks,
- Tom
[nvptx] Use MAX, MIN, ROUND_UP macros
2018-04-03 Tom de Vries
* config/nvptx/
Build on x86_64 with nvptx accelerator and tested libgomp.
Committed.
Thanks,
- Tom
[nvptx] Generalize state propagation and synchronization
2018-04-03 Cesar Philippidis
Tom de Vries
* config/nvptx/nvptx.c (oacc_bcast_partition): Declare.
(nvptx_option_override): Init
On 04/03/2018 07:49 PM, Bernhard Reutner-Fischer wrote:
This patch adds scan-ltrans-tree-dump.
Please check all error calls to talk about the correct function -- at least
scan-ltrans-tree-dump-times is wrong.
Hi,
thanks for noticing that. I'll update the patches to fix that.
But I wonder
Tom
[nvptx] Fix neutering of bb with only cond jump
2018-04-05 Tom de Vries
PR target/85204
* config/nvptx/nvptx.c (nvptx_single): Fix neutering of bb with only
cond jump.
* testsuite/libgomp.oacc-c-c++-common/broadcast-1.c: New test.
---
gcc/config/nvptx/nvptx.c
On 04/03/2018 05:00 PM, Tom de Vries wrote:
On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
* config/nvptx/nvptx.c (oacc_bcast_partition): Declare.
One last thing: this variable needs to be reset to zero for every function.
Without this reset, we can generate different code for a
On 03/02/2018 06:51 PM, Cesar Philippidis wrote:
This patch teaches the nvptx BE how to process vector reductions with
large vector lengths.
Committed test-case exercising large vector length with reductions.
Thanks,
- Tom
[openacc] Add vector-length-128-10.c
2018-04-05 Tom de Vries
On 04/03/2018 05:00 PM, Tom de Vries wrote:
+ unsigned int psize = ROUND_UP (data.offset, oacc_bcast_align);
+ unsigned int pnum = (nvptx_mach_vector_length () > PTX_WARP_SIZE
+ ? nvptx_mach_max_workers () + 1
+ : 1);
This claims
sar Philippidis
Tom de Vries
* config/nvptx/nvptx-protos.h (nvptx_output_red_partition): Declare.
* config/nvptx/nvptx.c (vector_red_size, vector_red_align,
vector_red_partition, vector_red_sym): New global variables.
(nvptx_option_override): Initialize vector_red_sym.
(nvptx_declar
test-cases to start using the
feature.
Build x86_64 with nvptx accelerator and tested libgomp.
Committed.
Thanks,
- Tom
[nvptx] Enable large vectors
2018-04-05 Cesar Philippidis
Tom de Vries
* omp-offload.c (oacc_get_default_dim): New function.
* omp-offload.h (oacc_get_default_dim
On 03/30/2018 05:14 PM, Tom de Vries wrote:
On 03/30/2018 05:00 PM, Cesar Philippidis wrote:
I should
have checked that patch with the vector length fallback disabled.
Right. The patch series introduces a lot of code that is not exercised.
I've added an -mlong-vector-in-workers option
.
Committed.
Thanks,
- Tom
[nvptx] Handle large vectors in libgomp
2018-04-05 Cesar Philippidis
Tom de Vries
* plugin/plugin-nvptx.c (nvptx_exec): Adjust calculations of
workers and vectors.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: New test.
---
libgomp/plugin
or and reg-tested libgomp.
Committed to stage4 trunk.
Thanks,
- Tom
[nvptx] Add memory_barrier insn
2018-04-09 Tom de Vries
PR target/84041
* config/nvptx/nvptx.md (define_c_enum "unspecv"): Add UNSPECV_MEMBAR.
(define_expand "*memory_barrier"): New define_expand.
(defin
On 04/09/2018 03:19 PM, Tom de Vries wrote:
Hi,
we've been having hanging OpenMP tests for nvptx offloading:
for-{3,5,6}.c and the corresponding C++ test-cases.
The failures have now been analyzed down to gomp_ptrlock_get in
libgomp/config/nvptx/ptrlock.h:
...
static inline
On 10/27/2016 12:29 AM, Cesar Philippidis wrote:
Currently, the nvptx backend is only neutering the worker axis when
propagating variables used in conditional expressions across the worker
and vector axes. That's a problem with the worker-state spill and fill
propagation implementation because al
On 01/30/2017 07:54 PM, Torvald Riegel wrote:
This patch fixes the __atomic builtins to not implement supposedly
lock-free atomic loads based on just a compare-and-swap operation.
Hi,
The internals doc still lists CAS (
https://gcc.gnu.org/onlinedocs/gccint/Standard-Names.html#index-atomic_00
ector partitioned loops. More details regarding this patch can be
found here <https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02187.html>
I've reverted this patch on og7, and backported the fix for PR85204.
Thanks,
- Tom
Backport "[nvptx] Fix neutering of bb with only cond jump"
the type
we arrive at a size of 2.
The patch fixes this by declaring extern structs which have a flexible
array member as an array without given dimension.
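As background, a plain C sketch of why the declared type alone does not give the size of such an object. The struct and helper names are hypothetical and chosen for illustration only; nothing here is taken from the patch or from the nvptx backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical struct with a flexible array member.  */
struct example_s
{
  int len;
  int data[];	/* Contributes no elements to sizeof.  */
};

/* Size as seen from the type alone: the trailing array is not counted.  */
static size_t
example_type_size (void)
{
  return sizeof (struct example_s);
}

/* Size of an actual object holding N elements.  */
static size_t
example_object_size (int n)
{
  assert (n >= 0);
  return sizeof (struct example_s) + (size_t) n * sizeof (int);
}

/* Allocate an object with room for N elements; returns NULL on failure.  */
static struct example_s *
example_alloc (int n)
{
  struct example_s *p = malloc (example_object_size (n));
  if (p)
    p->len = n;
  return p;
}
```

A translation unit that only sees an extern declaration knows example_type_size () but cannot know n, so a size derived from the type can be smaller than the object defined elsewhere.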
Build and tested on nvptx.
Committed to stage4 trunk.
Thanks,
- Tom
[nvptx] Fix handling of extern var with flexible array member
201
Hi,
this patch simplifies the logic in nvptx_single.
Build x86_64 with nvptx accelerator and tested libgomp.
Thanks,
- Tom
[nvptx] Simplify logic in nvptx_single
2018-04-12 Tom de Vries
* config/nvptx/nvptx.c (nvptx_single): Simplify init of vector variable.
Add and use variable
propagation of branch cond in vw-neutered code
2018-04-12 Tom de Vries
PR target/85246
* config/nvptx/nvptx.c (nvptx_single): Don't use partitioning when
propagating branch condition calculated in vector-worker-neutered code.
* testsuite/libgomp.oacc-fortran/gemm.f90: Use
-foffload=-
On 09/05/2018 12:19 AM, Cesar Philippidis wrote:
> On 09/02/2018 07:57 AM, Cesar Philippidis wrote:
>> On 09/01/2018 12:04 PM, Tom de Vries wrote:
>>> On 08/31/2018 04:14 PM, Cesar Philippidis wrote:
>>
>>>> Is this patch OK for trunk?
>>>>
On 09/06/2018 12:24 AM, Cesar Philippidis wrote:
> I'll commit the attached patch shortly. x86_64 with nvptx offloading
> regression testing didn't yield any new failures, nor did the standalone
> nvptx testing. I'll follow up with an SImode patch later.
I'm sorry, I guess I was not clear enough h
On 09/06/2018 12:24 AM, Cesar Philippidis wrote:
> This is ok (with, as I mentioned above, the SI part split off into a
> separate patch), on the condition that you test libgomp with
> -foffload=-misa=sm_35.
>>> Adding -foffload=misa=sm_35 didn't work because the host gcc doesn't
>>> s
On 09/01/2018 08:10 PM, Tom de Vries wrote:
>> Please add more of this description to the one-line documentation
>> patch you have now;
> Done.
>
>> there are many DIEs that have no name because they
>> don't need one, and this patch doesn't add n
On 9/4/18 5:59 PM, Tom de Vries wrote:
> [ Adding Jason as addressee ]
>
> On 08/28/2018 08:20 PM, Omar Sandoval wrote:
>> On Fri, Aug 17, 2018 at 12:16:07AM -0700, Omar Sandoval wrote:
>>> On Thu, Aug 16, 2018 at 11:54:53PM -0700, Omar Sandoval wrote:
>>>>
On 8/17/18 6:29 AM, Omar Sandoval wrote:
> I don't have commit rights (first time contributor), so if this change is okay
> could it please be applied?
Hi,
thanks for the patch, I've committed the approved version.
[ In case you don't have one already ... ] if you want to continue
contributing,
On 9/26/18 8:33 PM, Cesar Philippidis wrote:
> This patch adds nvptx support for the atomic FETCH_AND_OP functions. I
> recall that this used to be important for OpenACC reductions back in the
> GCC 5.0 days before Nathan split reductions into four phases. Nowadays,
> atomic reductions use a spin l
On 9/25/18 3:11 PM, Chung-Lin Tang wrote:
> Hi Tom,
> this patch removes large portions of plugin/plugin-nvptx.c, since a lot
> of it is now in oacc-async.c.
Yay!
> The new code is essentially an NVPTX/CUDA-specific implementation
> of the new-style goacc_asyncqueues.
>
> Also, some neede
On 8/16/18 5:46 PM, Julian Brown wrote:
> On Wed, 15 Aug 2018 21:56:54 +0200
> Bernhard Reutner-Fischer wrote:
>
>> On 15 August 2018 18:46:37 CEST, Julian Brown
>> wrote:
>>> On Mon, 13 Aug 2018 12:06:21 -0700
>>> Cesar Philippidis wrote:
>>
>> atttribute has more t than strictly necessary.
On 6/29/18 8:19 PM, Cesar Philippidis wrote:
> The attached patch includes the nvptx and GCC ME reductions enhancements.
>
> Is this patch OK for trunk? It bootstrapped / regression tested cleanly
> for x86_64 with nvptx offloading.
>
These need fixing:
...
=== ERROR type #5: trailing whitespace
On 9/18/18 10:04 PM, Cesar Philippidis wrote:
> 591973d3c3a [nvptx] use user-defined vectors when possible
If I drop this patch, I get the same test results. Can you find a
testcase for which this patch has an effect?
Thanks,
- Tom
parsing, rather than having each target plugin duplicate it.
Build on x86_64 with nvptx accelerator and reg-tested libgomp.
OK for stage1?
Thanks,
- Tom
[openacc] Move GOMP_OPENACC_DIM parsing out of nvptx plugin
2018-04-15 Tom de Vries
PR libgomp/85411
* plugin/plugin-nvptx.c (notify_var
t in the reverted patch the
problematic fix was actually not exercised by the test-cases. ]
Thanks,
- Tom
[openacc] Fix ICE when compiling tile loop containing infinite loop
2018-04-16 Cesar Philippidis
Tom de Vries
PR middle-end/84955
* omp-expand.c (expand_oacc_for): Add dummy fals
to og7 as well, but given the extra
symbol added to the plugin interface, I'm not sure about timing. ]
Thanks,
- Tom
[libgomp, testsuite] Use dg-set-target-env-var instead of setenv
2018-04-16 Tom de Vries
* testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c: Use
dg-set-
Hi,
while investigating PR85381 - "[og7, nvptx, openacc] parallel-loop-1.c
fails with default vector length 128", I ran into PR 80035/81069.
I've backported the fix to the og7 branch.
Thanks,
- Tom
Backport "[nvptx] Add exit after call to noreturn function"
On 04/17/2018 03:59 PM, Ian Lance Taylor wrote:
The bug report https://github.com/ianlancetaylor/libbacktrace/issues/13
points out that when backtrace_full checks whether memory is
available, it doesn't necessarily release that memory. It will stay
on the free list, so libbacktrace will use more
);
+
+ assert(r32o = r32i);
+ assert(r64o = r64i);
+
+ assert(cio = cii);
+ assert(cfo = cfi);
+ assert(cdo = cdi);
These asserts have assignments in them.
Fixed in attached patch, committed.
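To illustrate the problem, a small standalone sketch of how a check written with '=' differs from one written with '=='. The function names are hypothetical and the code is not taken from the test-case; it only mirrors the shape of the quoted asserts.

```c
#include <assert.h>

/* Mirrors "assert (out = expected)": the assignment overwrites *OUT and
   the result is the assigned value, so the check passes for any nonzero
   EXPECTED no matter what *OUT held before.  */
static int
buggy_check (int *out, int expected)
{
  return (*out = expected) != 0;
}

/* Mirrors "assert (out == expected)": compares without modifying *OUT.  */
static int
fixed_check (const int *out, int expected)
{
  return *out == expected;
}
```

With *out holding 1 and expected being 5, buggy_check returns true and leaves 5 in *out, while fixed_check returns false and leaves *out untouched. GCC's -Wparentheses (part of -Wall) can warn about an assignment used as a truth value, but whether the warning fires through the assert macro depends on how the C library defines it.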
Thanks,
- Tom
[openacc, libgomp, testsuite] Fix asserts in firstprivate-int.{c,C}
2018-04-18 Tom de Vries
d, x);
+ assert (d.v = x);
+
+ x = 400;
+ parallel_implicit (d, x);
+ assert (d.v = x);
+
+ reference_data (d, x);
+
+ return 0;
+}
Some of these asserts have assignments in them.
Fixed in attached patch, committed.
Thanks,
- Tom
[openacc, libgomp, testsuite] Fix asserts in non-scalar-data.C
2018-04-18 T
thread.
The patch (r239736 in og7) fixes this by broadcasting the stack from
W0V0 to WAVA before the call.
Build x86_64 with nvptx accelerator and reg-tested libgomp.
Committed to stage4 trunk.
Thanks,
- Tom
[nvptx] Fix calls to vector and worker routines
2019-04-20 Nathan Sidwell
Tom de
nd
don't emit the barriers.
Build x86_64 with nvptx accelerator and tested libgomp.
Committed to og7 branch.
Thanks,
- Tom
[nvptx, openacc] Don't emit barriers for empty loops
2018-04-21 Tom de Vries
PR target/85381
* config/nvptx/nvptx.c (nvptx_process_pars): Don'
-partitionable routines.
Build x86_64 with nvptx accelerator, tested libgomp.
Committed to og7.
Thanks,
- Tom
[nvptx] Force vl32 if calling vector-partitionable routines
2018-04-23 Tom de Vries
PR target/85486
* omp-offload.c (oacc_fn_attrib_level): Remove static.
* omp-offload.h (oacc_fn_a
[ was: Re: [PATCH] Handle empty infinite loops in OpenACC for PR84955 ]
On 04/16/2018 08:13 PM, Tom de Vries wrote:
On 04/12/2018 08:58 PM, Jakub Jelinek wrote:
On Thu, Apr 12, 2018 at 11:39:43AM -0700, Cesar Philippidis wrote:
Strange. I didn't observe any regressions when I tested it
gomp, testsuite] Reduce recursion depth in declare_target-{1,2}.f90
2018-04-25 Tom de Vries
PR target/85519
* testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Reduce
recursion depth from 25 to 23.
* testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.
---
li
atomic_capture-1.f90
2018-04-28 Tom de Vries
PR testsuite/85527
* testsuite/libgomp.oacc-fortran/atomic_capture-1.f90 (main): Store
atomic capture results obtained in parallel loop to an array, instead of
to a scalar.
---
.../libgomp.oacc-fortran/atomic_capture-1.f90 | 244
om
[openacc, testsuite] Fix undefined behaviour in atomic_capture-1.c
2018-04-29 Julian Brown
Tom de Vries
PR testsuite/85527
* testsuite/libgomp.oacc-c-c++-common/atomic_capture-1.c: Allow
arbitrary order for iterations of atomic subtract check.
---
.../libgomp.oacc-c-c++-common/ato
addresses these issues.
Committed to og7.
Thanks,
- Tom
[libgomp, nvptx] Fix too-many-resources fatal error condition and message
2018-04-30 Tom de Vries
* plugin/plugin-nvptx.c (nvptx_exec): Fix
insufficient-resources-to-launch fatal error condition and message.
---
libgomp/plugin/plugin
resource usage for Titan V in parallel-dims.c
2018-04-30 Tom de Vries
* testsuite/libgomp.oacc-c-c++-common/parallel-dims-compile.c: New test,
factored out of ...
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c (main): ... here.
Limit num_workers to avoid insufficient-resources-to
On 04/16/2018 08:13 PM, Tom de Vries wrote:
On 04/12/2018 08:58 PM, Jakub Jelinek wrote:
On Thu, Apr 12, 2018 at 11:39:43AM -0700, Cesar Philippidis wrote:
Strange. I didn't observe any regressions when I tested it. But, then
again, I was testing against revision
r259092 | jason | 2018-
sable (var);
}
}
...
Fixed by skipping over the non VAR_DECLs in the loop.
Build x86_64 with nvptx accelerator, ran libgomp testsuite.
Committed to og7 branch.
Thanks,
- Tom
[c, openacc] Handle non-var-decl in mark_vars_oacc_gangprivate
2018-05-01 Tom de Vries
PR target/85465
*
Hi,
this patch improves the "offload compiler not found" error message in
nvptx's mkoffload, by suggesting to use '-B' to fix the error.
Committed to trunk.
Thanks,
- Tom
[nvptx] Improve "offload compiler not found" message in mkoffload
2018-05-01 Tom
or message to lto-wrapper
2018-05-01 Tom de Vries
PR lto/85451
* lto-wrapper.c (compile_offload_image): Add "could not find mkoffload"
error message.
---
gcc/lto-wrapper.c | 66 ---
1 file changed, 34 insertions(+), 32 deletions(-
On 11/17/2017 02:18 PM, Tom de Vries wrote:
Hi,
I've factored out 3 new functions to test properties of enum acc_async_t:
...
typedef enum acc_async_t {
/* Keep in sync with include/gomp-constants.h. */
acc_async_noval = -1,
acc_async_sync = -2
} acc_async_t;
...
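As an illustration of what property tests on this enum could look like, a minimal sketch follows. The three helper names are hypothetical and are not necessarily the functions from the patch. The sketch also assumes that any non-negative value names an explicit async queue, which is not stated in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the enum quoted above.  */
typedef enum acc_async_t {
  acc_async_noval = -1,
  acc_async_sync = -2
} acc_async_t;

/* Hypothetical: true if ASYNC requests synchronous execution.  */
static bool
example_async_synchronous_p (int async)
{
  return async == acc_async_sync;
}

/* Hypothetical: true if ASYNC names an explicit queue.
   Assumption: queue ids are the non-negative values.  */
static bool
example_async_queue_id_p (int async)
{
  return async >= 0;
}

/* Hypothetical: true if ASYNC is either a special value or a queue id.  */
static bool
example_async_valid_p (int async)
{
  return async == acc_async_noval
	 || async == acc_async_sync
	 || example_async_queue_id_p (async);
}
```

Keeping such tests in one place means callers do not need to hard-code the numeric values of the special enumerators.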
In ord
On 03/29/2018 11:16 AM, Tom de Vries wrote:
On 03/29/2018 11:11 AM, Tom de Vries wrote:
Hi,
Consider an lto multi-source test-case main.c and foo.c:
..
$ cat main.c
extern int foo (void);
int
main ()
{
return foo () + 1;
}
$ cat foo.c
int __attribute__((noinline, noclone))
foo (void
-ltrans-tree-dump
2018-03-28 Tom de Vries
PR testsuite/85106
* gcc.dg/ipa/ipa-icf-38.c: Use scan-ltrans-tree-dump.
* lib/scanltranstree.exp: New file.
* lib/target-supports.exp (scan-ltrans-tree-dump_required_options)
(scan-ltrans-tree-dump-times_required_options)
(scan-ltrans-tree-dump
/msg00319.html ). ]
OK for trunk?
Thanks,
- Tom
[testsuite] Add scan-offload-tree-dump
2018-03-28 Tom de Vries
PR testsuite/85106
* lib/scanoffloadtree.exp: New file.
* testsuite/lib/libgomp-dg.exp (libgomp-dg-test): Add save-temps to
extra_tool_flags if it contains an -foffload=-fdump
On 01/18/2018 09:55 AM, Tom de Vries wrote:
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-static-2.c
b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-static-2.c
index 6de739a..e273a79 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-static-2.c
+++ b/libgomp
Hi,
I'm posting this patch for the record.
I wrote it but haven't found a use for it yet. I find it easier to write
asm scans for nvptx than rtl ones.
Thanks,
- Tom
[testsuite] Add scan-offload-rtl-dump
2018-03-28 Tom de Vries
* lib/scanoffloadrtl.exp: New fil
[ was: Re: [PATCH, PR82428] Add
__builtin_goacc_{gang,worker,vector}_{id,size} ]
On 01/18/2018 09:55 AM, Tom de Vries wrote:
On 01/17/2018 06:51 PM, Jakub Jelinek wrote:
On Wed, Jan 17, 2018 at 06:42:33PM +0100, Tom de Vries wrote:
@@ -6602,6 +6604,71 @@ expand_stack_save (void
ild x86_64 with nvptx accelerator, tested libgomp.
Committed to og7 branch.
Thanks,
- Tom
[libgomp, openacc, nvptx] Don't select too many workers
2018-05-04 Tom de Vries
PR libgomp/85649
* plugin/plugin-nvptx.c (MIN, MAX): Redefine.
(nvptx_exec): Choose num_workers such that device has suffi
ce, and
reverted) with x86_64 with nvptx accelerator and tested libgomp.
Committed to trunk.
Thanks,
- Tom
[nvptx] Add workaround for subsequent bar.syncs
2018-05-04 Tom de Vries
PR target/85653
* config/nvptx/nvptx.c (WORKAROUND_PTXJIT_BUG_3): Define.
(workaround_barsyncs): New functi
On 04/21/2018 07:36 PM, Jakub Jelinek wrote:
* gcc.dg/nextafter-2.c: New test.
Hi,
FTR, I ran into a link error "unresolved symbol nexttowardf" using the
standalone nvptx toolchain:
...
PASS: gcc.dg/nextafter-1.c (test for excess errors)
PASS: gcc.dg/nextafter-1.c execution test
PASS
[ was: Re: [PATCH, PR82428] Add
__builtin_goacc_{gang,worker,vector}_{id,size} ]
On 05/03/2018 12:36 PM, Tom de Vries wrote:
On 01/18/2018 09:55 AM, Tom de Vries wrote:
diff --git
a/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-static-2.c
b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang
On 05/07/2018 03:41 PM, Christophe Lyon wrote:
On 7 May 2018 at 12:04, Tom de Vries wrote:
On 04/21/2018 07:36 PM, Jakub Jelinek wrote:
* gcc.dg/nextafter-2.c: New test.
Hi,
FTR, I ran into a link error "unresolved symbol nexttowardf" using the
standalone nvptx
ed to trunk.
Thanks,
- Tom
[nvptx] Make trap insn noreturn
2018-05-09 Tom de Vries
PR target/85626
* config/nvptx/nvptx.md (define_insn "trap", define_insn "trap_if_true")
(define_insn "trap_if_false"): Add exit after trap.
---
gcc/config/nvptx/nvptx.md | 6 +++
On 05/01/2018 10:50 PM, Tom de Vries wrote:
On 11/17/2017 02:18 PM, Tom de Vries wrote:
Hi,
I've factored out 3 new functions to test properties of enum acc_async_t:
...
typedef enum acc_async_t {
/* Keep in sync with include/gomp-constants.h. */
acc_async_noval = -1,
acc_async
On 11/17/2017 09:45 AM, Tom de Vries wrote:
Hi,
GOACC_enter_exit_data has this prototype:
...
void
GOACC_enter_exit_data (int device, size_t mapnum,
void **hostaddrs, size_t *sizes,
unsigned short *kinds,
int async, int