Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Tom de Vries
On 03/02/2018 05:55 PM, Cesar Philippidis wrote: (nvptx_declare_function_name): Emit a .maxntid directive hint and call nvptx_init_oacc_workers. + + /* Emit a .maxntid hint to help the PTX JIT emit SYNC branches. */ + if (lookup_attribute ("omp target entrypoint", DECL_ATTR

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Tom de Vries
On 03/02/2018 05:55 PM, Cesar Philippidis wrote: The attached patch generalizes the worker state propagation and synchronization code to handle large vectors. When the vector_length is larger than a CUDA warp, the nvptx BE will now use shared-memory to spill-and-fill vector state when transitioni

Re: [parloops, PR83126], Use cached affine_ivs canonicalize_loop_ivs

2018-03-22 Thread Tom de Vries
On 03/21/2018 04:43 PM, Richard Biener wrote: On Wed, 21 Mar 2018, Tom de Vries wrote: On 03/12/2018 01:14 PM, Richard Biener wrote: On Thu, 22 Feb 2018, Tom de Vries wrote: Hi, this patch fixes an ICE in the parloops pass. The ICE (when compiling the test-case in attached patch) follows

Re: [parloops, PR83126], Use cached affine_ivs canonicalize_loop_ivs

2018-03-22 Thread Tom de Vries
On 03/21/2018 04:43 PM, Richard Biener wrote: On Wed, 21 Mar 2018, Tom de Vries wrote: On 03/12/2018 01:14 PM, Richard Biener wrote: On Thu, 22 Feb 2018, Tom de Vries wrote: Hi, this patch fixes an ICE in the parloops pass. The ICE (when compiling the test-case in attached patch) follows

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Tom de Vries
On 03/22/2018 04:11 PM, Cesar Philippidis wrote: On 03/22/2018 07:23 AM, Tom de Vries wrote: On 03/02/2018 05:55 PM, Cesar Philippidis wrote: (nvptx_declare_function_name): Emit a .maxntid directive hint and call nvptx_init_oacc_workers. + +  /* Emit a .maxntid hint to help the PTX

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Tom de Vries
On 03/02/2018 05:55 PM, Cesar Philippidis wrote: + rtx red_partition; /* Similar to bcast_partition, except for vector + reductions. */ Shouldn't this be in "[og7] vector_length extension part 3: reductions"? Thanks, - Tom

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Tom de Vries
On 03/22/2018 06:24 PM, Cesar Philippidis wrote: On 03/22/2018 09:18 AM, Tom de Vries wrote: That's obviously not good enough. When I compile this test-case: ... int main (void) {   int a[10]; #pragma acc parallel num_workers (16) #pragma acc loop worker   for (int i = 0; i &l

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-22 Thread Tom de Vries
On 03/22/2018 06:47 PM, Cesar Philippidis wrote: On 03/22/2018 10:39 AM, Tom de Vries wrote: On 03/02/2018 05:55 PM, Cesar Philippidis wrote: +  rtx red_partition; /* Similar to bcast_partition, except for vector +    reductions.  */ Shouldn't this be in "[og7] vec

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-23 Thread Tom de Vries
On 03/22/2018 08:04 PM, Cesar Philippidis wrote: I'm going to retest the variable vector length changes without it and see if it's still necessary. On one hand, maxntid should be fairly innocuous, but I don't like how it can mask other PTX JIT bugs. At this point, I'm leaning towards dropping it

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-23 Thread Tom de Vries
ne to clean up whitespace, but please do that in separate patches. Committed. Thanks, - Tom [nvptx] Fix whitespace in nvptx_single 2018-03-23 Tom de Vries * config/nvptx/nvptx.c (nvptx_single): Fix whitespace. --- gcc/config/nvptx/nvptx.c | 2 +- 1 file changed, 1 insertion(+), 1 deleti

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-23 Thread Tom de Vries
iable names". It's good to add it back, but that needs to be a separate patch. Committed. Thanks, - Tom [nvptx] Re-add removed struct parallel comment 2018-03-23 Tom de Vries * config/nvptx/nvptx.c (struct parallel): Re-add comment. --- gcc/config/nvptx/nvptx.c | 3 +++ 1

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-23 Thread Tom de Vries
This is wrong. The first operand can be a register or a constant, and the second operand is independent. Whether or not we print the second operand is independent of whether the first is a register. In this patch I've reserved INTVAL (operands[1]) == 0 for the "no second operand" case.

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-23 Thread Tom de Vries
On 03/22/2018 06:24 PM, Cesar Philippidis wrote: On 03/22/2018 09:18 AM, Tom de Vries wrote: That's obviously not good enough. When I compile this test-case: ... int main (void) {   int a[10]; #pragma acc parallel num_workers (16) #pragma acc loop worker   for (int i = 0; i &l

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-23 Thread Tom de Vries
On 03/02/2018 05:55 PM, Cesar Philippidis wrote: + if (cfun->machine->sync_bar) +fprintf (file, "\t\tadd.u32\t\t%%r%d, %%tidy, 1; " +"// vector synchronization barrier\n", +REGNO (cfun->machine->sync_bar)); I realize that atm we don't support large vector length whe

[testsuite, committed] Make scan pattern more precise in vrp104.c

2018-03-24 Thread Tom de Vries
f this problem is filed as PR82806 - Stabilize paths in assembler and dumps ] Committed. Thanks, - Tom [testsuite] Make scan pattern more precise in vrp104.c 2018-03-24 Tom de Vries * gcc.dg/tree-ssa/vrp104.c: Make scan-tree-dump-times pattern more precise. --- gcc/testsuite/gcc.dg/tree-ssa/vrp1

[PATCH, PR85063] Fix switch conversion in offloading functions

2018-03-26 Thread Tom de Vries
x86_64. Build x86_64 with nvptx accelerator and reg-tested libgomp. OK for stage4 or stage1? Thanks, - Tom Fix switch conversion in offloading functions 2018-03-25 Tom de Vries PR tree-optimization/85063 * omp-general.c (offloading_function_p): New function. Factor out of ... *

Re: [PATCH] Fix ICE for static vars in offloaded functions

2018-03-26 Thread Tom de Vries
On 03/07/2018 04:01 PM, Richard Biener wrote: On Wed, 7 Mar 2018, Tom de Vries wrote: On 03/07/2018 02:29 PM, Richard Biener wrote: On Wed, 7 Mar 2018, Jakub Jelinek wrote: On Wed, Mar 07, 2018 at 02:20:26PM +0100, Tom de Vries wrote: Fix ICE for static vars in offloaded functions 2018-03

Re: [og7] vector_length extension part 4: target hooks and automatic parallelism

2018-03-26 Thread Tom de Vries
On 03/02/2018 08:18 PM, Cesar Philippidis wrote: diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c index ba3f4317f4e..f15ce6b8f8d 100644 --- a/gcc/omp-offload.c +++ b/gcc/omp-offload.c @@ -626,7 +626,8 @@ oacc_parse_default_dims (const char *dims) function. */ static void -oacc_valid

Re: [og7] vector_length extension part 4: target hooks and automatic parallelism

2018-03-26 Thread Tom de Vries
On 03/02/2018 08:18 PM, Cesar Philippidis wrote: introduces a new goacc adjust_parallelism target hook. That's another separate patch. Committed. Thanks, - Tom [openacc] Add target hook TARGET_GOACC_ADJUST_PARALLELISM 2018-03-26 Cesar Philippidis Tom de Vries * doc/tm.te

Re: [og7] vector_length extension part 4: target hooks and automatic parallelism

2018-03-26 Thread Tom de Vries
On 03/02/2018 08:18 PM, Cesar Philippidis wrote: The attached patch adjusts the existing goacc validate_dims target hook This is overkill. All we need is a function "int oacc_get_default_dim (int dim)". Thanks, - Tom

Re: [PATCH,nvptx] Fix PR85056

2018-03-27 Thread Tom de Vries
On 03/26/2018 11:57 PM, Cesar Philippidis wrote: As noted in PR85056, the nvptx BE isn't declaring external arrays using PTX array notation. Specifically, it's emitting code that's missing the empty angle brackets '[]'. [ FYI, see https://en.wikipedia.org/wiki/Bracket For '[]' I find "square

Re: [og7] vector_length extension part 4: target hooks and automatic parallelism

2018-03-27 Thread Tom de Vries
On 03/26/2018 06:33 PM, Tom de Vries wrote: + loop->mask = targetm.goacc.adjust_parallelism (loop->mask, outer_mask); loop->mask |= this_mask; I committed the above, but the original: ... @@ -1397,6 +1407,8 @@ oacc_loop_auto_partitions (oacc_loop *loop, unsigned o

Re: [og7] vector_length extension part 5: libgomp and tests

2018-03-27 Thread Tom de Vries
On 03/02/2018 09:47 PM, Cesar Philippidis wrote: two test cases. Committed as separate patch, while ignoring the warnings "using vector_length \\(32\\), ignoring 128". Thanks, - Tom [openacc] Add vector_length 128 testcases 2018-03-27 Cesar Philippidis Tom de Vries *

Re: [PATCH,nvptx] Fix PR85056

2018-03-28 Thread Tom de Vries
On 03/28/2018 03:43 PM, Cesar Philippidis wrote: OK for stage4 trunk. Can I backport this patch to GCC 6 and 7? Yes please. Thanks, - Tom

[PATCH, 0/2] Add scan-ltrans-tree-dump and scan-wpa-ipa-dump

2018-03-29 Thread Tom de Vries
Hi, Consider an lto multi-source test-case main.c and foo.c: .. $ cat main.c extern int foo (void); int main () { return foo () + 1; } $ cat foo.c int __attribute__((noinline, noclone)) foo (void) { return 2; } ... When compiling the test-case like this: ... $ gcc main.c foo.c -O2 -flto -s

[PATCH, testsuite, 1/2] Add scan-wpa-ipa-dump

2018-03-29 Thread Tom de Vries
On 03/29/2018 11:11 AM, Tom de Vries wrote: Hi, Consider an lto multi-source test-case main.c and foo.c: .. $ cat main.c extern int foo (void); int main () {   return foo () + 1; } $ cat foo.c int __attribute__((noinline, noclone)) foo (void) {   return 2; } ... When compiling the test

[PATCH, testsuite, 2/2] Add scan-ltrans-tree-dump

2018-03-29 Thread Tom de Vries
On 03/29/2018 11:11 AM, Tom de Vries wrote: Hi, Consider an lto multi-source test-case main.c and foo.c: .. $ cat main.c extern int foo (void); int main () {   return foo () + 1; } $ cat foo.c int __attribute__((noinline, noclone)) foo (void) {   return 2; } ... When compiling the test

[PATCH] Print function attributes in rtl dumps

2018-03-29 Thread Tom de Vries
Hi, when we compile a function with attributes: ... int __attribute__((noinline, noclone)) foo (void) { return 2; } ... like this: ... gcc main.c -fdump-tree-all -fdump-rtl-all ... we find the function attributes starting from foo.c.004t.gimple: ... __attribute__((noclone, noinline)) foo () {

Re: [PATCH] Print function attributes in rtl dumps

2018-03-29 Thread Tom de Vries
[ Fix ENOPATCH ] On 03/29/2018 12:17 PM, Tom de Vries wrote: Hi, when we compile a function with attributes: ... int __attribute__((noinline, noclone)) foo (void) {   return 2; } ... like this: ... gcc main.c -fdump-tree-all -fdump-rtl-all ... we find the function attributes starting from

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-29 Thread Tom de Vries
On 03/02/2018 05:55 PM, Cesar Philippidis wrote: As a follow up patch will show, the nvptx BE falls back to using vector_length = 32 when a vector loop is nested inside a worker loop. I disabled the fallback, and analyzed the vred2d-128.c illegal memory access execution failure. I minimized

[og7, testsuite, committed] Add scan-offload-tree-dump

2018-03-30 Thread Tom de Vries
d by the offloading lto1 invocation. Tested libgomp on x86_64 build with nvptx accelerator. Committed. Thanks, - Tom [testsuite] Add scan-offload-tree-dump 2018-03-28 Tom de Vries PR testsuite/85106 * lib/scanoffloadtree.exp: New file. * testsuite/lib/libgomp-dg.exp (libgomp-dg-test

[og7, openacc, committed] Add vector-length-128-{1,2,3}.c test-cases

2018-03-30 Thread Tom de Vries
ot 128 vector_length. Tested libgomp on x86_64 build with nvptx accelerator. Committed. Thanks, - Tom [openacc] Add vector-length-128-{1,2,3}.c test-cases 2018-03-30 Tom de Vries * testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/vector

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-30 Thread Tom de Vries
On 03/30/2018 03:07 AM, Tom de Vries wrote: On 03/02/2018 05:55 PM, Cesar Philippidis wrote: As a follow up patch will show, the nvptx BE falls back to using vector_length = 32 when a vector loop is nested inside a worker loop. I disabled the fallback, and analyzed the vred2d-128.c illegal

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-03-30 Thread Tom de Vries
On 03/30/2018 05:00 PM, Cesar Philippidis wrote: I should have checked that patch with the vector length fallback disabled. Right. The patch series introduces a lot of code that is not exercised. I've added an -mlong-vector-in-workers option in my local branch and added 3 test-cases to exerci

Re: [patch, libgomp testsuite] Replace non-standard call abort by STOP n

2018-04-03 Thread Tom de Vries
On 03/25/2018 04:30 PM, Thomas Koenig wrote: [This is take two, the first one was rejected due to size]. Hello world, the does what the ChangeLog and the Subject say.  Regression-tested on x86_64-pc-linux-gnu. FTR, this caused PR85166 - "[nvptx, libgfortran] Libgomp fortran tests using stop

[nvptx] Use MAX, MIN, ROUND_UP macros

2018-04-03 Thread Tom de Vries
lent to: ... psize = ROUND_UP (psize, oacc_bcast_align); ... This patch also replaces all such occurrences with ROUND_UP. Build on x86_64 with nvptx accelerator and reg-tested libgomp. Committed. Thanks, - Tom [nvptx] Use MAX, MIN, ROUND_UP macros 2018-04-03 Tom de Vries * config/nvptx/
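A minimal sketch of the kind of equivalence this message describes, assuming the open-coded form was the usual power-of-two masking idiom (the snippet above is truncated, so that is an assumption). `SKETCH_ROUND_UP`, `round_up_mask` and `round_up_macro` are illustrative names, not GCC's own definitions:

```c
#include <assert.h>

/* Illustrative stand-in for a ROUND_UP-style macro; not copied from GCC.
   Rounds X up to the next multiple of ALIGN (ALIGN must be nonzero).  */
#define SKETCH_ROUND_UP(x, align) ((((x) + (align) - 1) / (align)) * (align))

/* Open-coded rounding idiom.  Only correct when ALIGN is a power of two.  */
static unsigned int
round_up_mask (unsigned int x, unsigned int align)
{
  return (x + align - 1) & ~(align - 1);
}

/* The same computation expressed through the macro.  Works for any
   nonzero ALIGN, which is one reason the macro form is easier to read
   and harder to misuse.  */
static unsigned int
round_up_macro (unsigned int x, unsigned int align)
{
  return SKETCH_ROUND_UP (x, align);
}
```

For power-of-two alignments both helpers agree, so replacing one form with the other should not change behaviour.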

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-04-03 Thread Tom de Vries
Build on x86_64 with nvptx accelerator and tested libgomp. Committed. Thanks, - Tom [nvptx] Generalize state propagation and synchronization 2018-04-03 Cesar Philippidis Tom de Vries * config/nvptx/nvptx.c (oacc_bcast_partition): Declare. (nvptx_option_override): Init

Re: [PATCH, testsuite, 2/2] Add scan-ltrans-tree-dump

2018-04-04 Thread Tom de Vries
On 04/03/2018 07:49 PM, Bernhard Reutner-Fischer wrote: This patch adds scan-ltrans-tree-dump. Please check all error calls to talk about the correct function -- at least scan-ltrans-tree-dump-times is wrong. Hi, thanks for noticing that. I'll update the patches to fix that. But I wonder

[nvptx, PR85204] Fix neutering of bb with only cond jump

2018-04-05 Thread Tom de Vries
Tom [nvptx] Fix neutering of bb with only cond jump 2018-04-05 Tom de Vries PR target/85204 * config/nvptx/nvptx.c (nvptx_single): Fix neutering of bb with only cond jump. * testsuite/libgomp.oacc-c-c++-common/broadcast-1.c: New test. --- gcc/config/nvptx/nvptx.c

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-04-05 Thread Tom de Vries
On 04/03/2018 05:00 PM, Tom de Vries wrote: On 03/02/2018 05:55 PM, Cesar Philippidis wrote: * config/nvptx/nvptx.c (oacc_bcast_partition): Declare. One last thing: this variable needs to be reset to zero for every function. Without this reset, we can generate different code for a

Re: [og7] vector_length extension part 3: reductions

2018-04-05 Thread Tom de Vries
On 03/02/2018 06:51 PM, Cesar Philippidis wrote: This patch teaches the nvptx BE how to process vector reductions with large vector lengths. Committed test-case exercising large vector length with reductions. Thanks, - Tom [openacc] Add vector-length-128-10.c 2018-04-05 Tom de Vries

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-04-05 Thread Tom de Vries
On 04/03/2018 05:00 PM, Tom de Vries wrote: + unsigned int psize = ROUND_UP (data.offset, oacc_bcast_align); + unsigned int pnum = (nvptx_mach_vector_length () > PTX_WARP_SIZE + ? nvptx_mach_max_workers () + 1 + : 1); This claims

Re: [og7] vector_length extension part 3: reductions

2018-04-05 Thread Tom de Vries
sar Philippidis Tom de Vries * config/nvptx/nvptx-protos.h (nvptx_output_red_partition): Declare. * config/nvptx/nvptx.c (vector_red_size, vector_red_align, vector_red_partition, vector_red_sym): New global variables. (nvptx_option_override): Initialize vector_red_sym. (nvptx_declar

Re: [og7] vector_length extension part 4: target hooks and automatic parallelism

2018-04-05 Thread Tom de Vries
test-cases to start using the feature. Build x86_64 with nvptx accelerator and tested libgomp. Committed. Thanks, - Tom [nvptx] Enable large vectors 2018-04-05 Cesar Philippidis Tom de Vries * omp-offload.c (oacc_get_default_dim): New function. * omp-offload.h (oacc_get_default_dim

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-04-05 Thread Tom de Vries
On 03/30/2018 05:14 PM, Tom de Vries wrote: On 03/30/2018 05:00 PM, Cesar Philippidis wrote: I should have checked that patch with the vector length fallback disabled. Right. The patch series introduces a lot of code that is not exercised. I've added an -mlong-vector-in-workers option

Re: [og7] vector_length extension part 5: libgomp and tests

2018-04-05 Thread Tom de Vries
. Committed. Thanks, - Tom [nvptx] Handle large vectors in libgomp 2018-04-05 Cesar Philippidis Tom de Vries * plugin/plugin-nvptx.c (nvptx_exec): Adjust calculations of workers and vectors. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: New test. --- libgomp/plugin

[nvptx, PR84041] Add memory_barrier insn

2018-04-09 Thread Tom de Vries
or and reg-tested libgomp. Committed to stage4 trunk. Thanks, - Tom [nvptx] Add memory_barrier insn 2018-04-09 Tom de Vries PR target/84041 * config/nvptx/nvptx.md (define_c_enum "unspecv"): Add UNSPECV_MEMBAR. (define_expand "*memory_barrier"): New define_expand. (defin

[og7] backported "[nvptx, PR84041] Add memory_barrier insn"

2018-04-11 Thread Tom de Vries
On 04/09/2018 03:19 PM, Tom de Vries wrote: Hi, we've been having hanging OpenMP tests for nvptx offloading: for-{3,5,6}.c and the corresponding C++ test-cases. The failures have now been analyzed down to gomp_ptrlock_get in libgomp/config/nvptx/ptrlock.h: ...  static inline

Re: [nvptx] propagating conditionals in worker-vector partitioned loops

2018-04-11 Thread Tom de Vries
On 10/27/2016 12:29 AM, Cesar Philippidis wrote: Currently, the nvptx backend is only neutering the worker axis when propagating variables used in conditional expressions across the worker and vector axes. That's a problem with the worker-state spill and fill propagation implementation because al

Re: [PATCH] Fix __atomic to not implement atomic loads with CAS.

2018-04-11 Thread Tom de Vries
On 01/30/2017 07:54 PM, Torvald Riegel wrote: This patch fixes the __atomic builtins to not implement supposedly lock-free atomic loads based on just a compare-and-swap operation. Hi, The internals doc still lists CAS ( https://gcc.gnu.org/onlinedocs/gccint/Standard-Names.html#index-atomic_00

[og7] Backport "[nvptx] Fix neutering of bb with only cond jump"

2018-04-11 Thread Tom de Vries
ector partitioned loops. More details regarding this patch can be found here<https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02187.html> I've reverted this patch on og7, and backported the fix for PR85204. Thanks, - Tom Backport "[nvptx] Fix neutering of bb with only cond jump"

[nvptx, PR85296] Fix handling of extern var with flexible array member

2018-04-12 Thread Tom de Vries
the type we arrive at at size of 2. The patch fixes this by declaring extern structs which have a flexible array member as an array without given dimension. Build and tested on nvptx. Committed to stage4 trunk. Thanks, - Tom [nvptx] Fix handling of extern var with flexible array member 201

[og7, nvptx] Simplify logic in nvptx_single

2018-04-12 Thread Tom de Vries
Hi, this patch simplifies the logic in nvptx_single. Build x86_64 with nvptx accelerator and tested libgomp. Thanks, - Tom [nvptx] Simplify logic in nvptx_single 2018-04-12 Tom de Vries * config/nvptx/nvptx.c (nvptx_single): Simplify init of vector variable. Add and use variable

[og7, nvptx, committed] Fix propagation of branch cond in vw-neutered code

2018-04-12 Thread Tom de Vries
propagation of branch cond in vw-neutered code 2018-04-12 Tom de Vries PR target/85246 * config/nvptx/nvptx.c (nvptx_single): Don't use partitioning when propagating branch condition calculated in vector-worker-neutered code. * testsuite/libgomp.oacc-fortran/gemm.f90: Use -foffload=-

Re: [patch,nvptx] Basic -misa support for nvptx

2018-09-05 Thread Tom de Vries
On 09/05/2018 12:19 AM, Cesar Philippidis wrote: > On 09/02/2018 07:57 AM, Cesar Philippidis wrote: >> On 09/01/2018 12:04 PM, Tom de Vries wrote: >>> On 08/31/2018 04:14 PM, Cesar Philippidis wrote: >> >>>> Is this patch OK for trunk? >>>>

Re: [patch,nvptx] Basic -misa support for nvptx

2018-09-05 Thread Tom de Vries
On 09/06/2018 12:24 AM, Cesar Philippidis wrote: > I'll commit the attached patch shortly. x86_64 with nvptx offloading > regression testing didn't yield any new failures, nor did the standalone > nvptx testing. I'll follow up with an SImode patch later. I'm sorry, I guess I was not clear enough h

Re: [patch,nvptx] Basic -misa support for nvptx

2018-09-05 Thread Tom de Vries
On 09/06/2018 12:24 AM, Cesar Philippidis wrote: > This is ok (with, as I mentioned above, the SI part split off into a > separate patch), on the condition that you test libgomp with > -foffload=-misa=sm_35. >>> Adding -foffload=misa=sm_35 didn't work because the host gcc doesn't >>> s

[PING][PATCH][debug] Add -gdescribe-dies

2018-09-11 Thread Tom de Vries
On 09/01/2018 08:10 PM, Tom de Vries wrote: >> Please add more of this description to the one-line documentation >> patch you have now; > Done. > >> there are many DIEs that have no name because they >> don't need one, and this patch doesn't add n

[PING][PATCH] DWARF: add DW_AT_count to zero-length arrays

2018-09-13 Thread Tom de Vries
On 9/4/18 5:59 PM, Tom de Vries wrote: > [ Adding Jason as addressee ] > > On 08/28/2018 08:20 PM, Omar Sandoval wrote: >> On Fri, Aug 17, 2018 at 12:16:07AM -0700, Omar Sandoval wrote: >>> On Thu, Aug 16, 2018 at 11:54:53PM -0700, Omar Sandoval wrote: >>>>

Re: [PATCH] DWARF: add DW_AT_count to zero-length arrays

2018-09-13 Thread Tom de Vries
On 8/17/18 6:29 AM, Omar Sandoval wrote: > I don't have commit rights (first time contributor), so if this change is okay > could it please be applied? Hi, thanks for the patch, I've committed the approved version. [ In case you don't have one already ... ] if you want to continue contributing,

Re: [patch] nvptx libgcc atomic routines

2018-10-05 Thread Tom de Vries
On 9/26/18 8:33 PM, Cesar Philippidis wrote: > This patch adds nvptx support for the atomic FETCH_AND_OP functions. I > recall that this used to be important for OpenACC reductions back in the > GCC 5.0 days before Nathan split reductions into four phases. Nowadays, > atomic reductions use a spin l

Re: [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes

2018-10-05 Thread Tom de Vries
On 9/25/18 3:11 PM, Chung-Lin Tang wrote: > Hi Tom, > this patch removes large portions of plugin/plugin-nvptx.c, since a lot > of it is > now in oacc-async.c now. Yay! > The new code is essentially a > NVPTX/CUDA-specific implementation > of the new-style goacc_asyncqueues. > > Also, some neede

Re: [PATCH, OpenACC] Add support for gang local storage allocation in shared memory

2018-10-05 Thread Tom de Vries
On 8/16/18 5:46 PM, Julian Brown wrote: > On Wed, 15 Aug 2018 21:56:54 +0200 > Bernhard Reutner-Fischer wrote: > >> On 15 August 2018 18:46:37 CEST, Julian Brown >> wrote: >>> On Mon, 13 Aug 2018 12:06:21 -0700 >>> Cesar Philippidis wrote: >> >> atttribute has more t than strictly necessary.

Re: [patch] various OpenACC reduction enhancements - ME and nvptx changes

2018-10-05 Thread Tom de Vries
On 6/29/18 8:19 PM, Cesar Philippidis wrote: > The attached patch includes the nvptx and GCC ME reductions enhancements. > > Is this patch OK for trunk? It bootstrapped / regression tested cleanly > for x86_64 with nvptx offloading. > These need fixing: ... === ERROR type #5: trailing whitespace

Re: [nvptx] vector length patch series

2018-10-05 Thread Tom de Vries
On 9/18/18 10:04 PM, Cesar Philippidis wrote: > 591973d3c3a [nvptx] use user-defined vectors when possible If I drop this patch, I get the same test results. Can you find a testcase for which this patch has an effect? Thanks, - Tom

[PATCH, openacc, PR85411] Move GOMP_OPENACC_DIM parsing out of nvptx plugin

2018-04-16 Thread Tom de Vries
parsing, rather than having each target plugin duplicate it. Build on x86_64 with nvptx accelerator and reg-tested libgomp. OK for stage1? Thanks, - Tom [openacc] Move GOMP_OPENACC_DIM parsing out of nvptx plugin 2018-04-15 Tom de Vries PR libgomp/85411 * plugin/plugin-nvptx.c (notify_var

Re: [PATCH] Handle empty infinite loops in OpenACC for PR84955

2018-04-16 Thread Tom de Vries
t in the reverted patch the problematic fix was actually not exercised by the test-cases. ] Thanks, - Tom [openacc] Fix ICE when compiling tile loop containing infinite loop 2018-04-16 Cesar Philippidis Tom de Vries PR middle-end/84955 * omp-expand.c (expand_oacc_for): Add dummy fals

[libgomp, testsuite] Use dg-set-target-env-var instead of setenv

2018-04-16 Thread Tom de Vries
to og7 as well, but given the extra symbol added to the plugin interface, I'm not sure about timing. ] Thanks, - Tom [libgomp, testsuite] Use dg-set-target-env-var instead of setenv 2018-04-16 Tom de Vries * testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c: Use dg-set-

[og7] Backport "[nvptx] Add exit after call to noreturn function"

2018-04-16 Thread Tom de Vries
Hi, while investigating PR85381 - "[og7, nvptx, openacc] parallel-loop-1.c fails with default vector length 128", I ran into PR 80035/81069. I've backported the fix to the og7 branch. Thanks, - Tom Backport "[nvptx] Add exit after call to noreturn function"

Re: libbacktrace patch committed: Call munmap after memory test

2018-04-17 Thread Tom de Vries
On 04/17/2018 03:59 PM, Ian Lance Taylor wrote: The bug report https://github.com/ianlancetaylor/libbacktrace/issues/13 points out that when backtrace_full checks whether memory is available, it doesn't necessarily release that memory. It will stay on the free list, so libbacktrace will use more

[og7, openacc, libgomp, testsuite] Fix asserts in firstprivate-int.{c,C}

2018-04-18 Thread Tom de Vries
); + + assert(r32o = r32i); + assert(r64o = r64i); + + assert(cio = cii); + assert(cfo = cfi); + assert(cdo = cdi); These asserts have assigns in them. Fixed in attached patch, committed. Thanks, - Tom [openacc, libgomp, testsuite] Fix asserts in firstprivate-int.{c,C} 2018-04-18 Tom de Vries
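A small sketch of why an assignment inside an assert is a bug, under the assumption that the intended check was equality. The helper names are hypothetical and only model the expression that assert would evaluate:

```c
#include <assert.h>

/* Buggy form: '=' assigns IN to OUT, so the "check" passes whenever IN
   is nonzero, whatever OUT held before.  Returns 1 if it would pass.  */
static int
check_with_assignment (int out, int in)
{
  return (out = in) != 0;
}

/* Fixed form: '==' compares the two values and modifies nothing.
   Returns 1 if the check would pass.  */
static int
check_with_comparison (int out, int in)
{
  return out == in;
}
```

With differing nonzero values the assignment form still reports success, which is how such a test can pass without verifying anything.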

[og7, openacc, libgomp, testsuite] Fix asserts in non-scalar-data.C

2018-04-18 Thread Tom de Vries
d, x); + assert (d.v = x); + + x = 400; + parallel_implicit (d, x); + assert (d.v = x); + + reference_data (d, x); + + return 0; +} Some of these asserts have assigns in them. Fixed in attached patch, committed. Thanks, - Tom [openacc, libgomp, testsuite] Fix asserts in non-scalar-data.C 2018-04-18 T

[nvptx, PR85445, committed] Fix calls to vector and worker routines

2018-04-20 Thread Tom de Vries
thread. The patch (r239736 in og7) fixes this by broadcasting the stack from W0V0 to WAVA before the call. Build x86_64 with nvptx accelerator and reg-tested libgomp. Committed to stage4 trunk. Thanks, - Tom [nvptx] Fix calls to vector and worker routines 2018-04-20 Nathan Sidwell Tom de

[og7, nvptx, openacc, PR85381, committed] Don't emit barriers for empty loops

2018-04-21 Thread Tom de Vries
nd don't emit the barriers. Build x86_64 with nvptx accelerator and tested libgomp. Committed to og7 branch. Thanks, - Tom [nvptx, openacc] Don't emit barriers for empty loops 2018-04-21 Tom de Vries PR target/85381 * config/nvptx/nvptx.c (nvptx_process_pars): Don'

[og7, nvptx, PR85486, committed] Force vl32 if calling vector-partitionable routines

2018-04-23 Thread Tom de Vries
-partitionable routines. Build x86_64 with nvptx accelerator, tested libgomp. Committed to og7. Thanks, - Tom [nvptx] Force vl32 if calling vector-partitionable routines 2018-04-23 Tom de Vries PR target/85486 * omp-offload.c (oacc_fn_attrib_level): Remove static. * omp-offload.h (oacc_fn_a

[PATCH, lto, PR85422] Fixup loops before lto write-out

2018-04-23 Thread Tom de Vries
[ was: Re: [PATCH] Handle empty infinite loops in OpenACC for PR84955 ] On 04/16/2018 08:13 PM, Tom de Vries wrote: On 04/12/2018 08:58 PM, Jakub Jelinek wrote: On Thu, Apr 12, 2018 at 11:39:43AM -0700, Cesar Philippidis wrote: Strange. I didn't observe any regressions when I tested it

[nvptx, libgomp, testsuite, PR85519] Reduce recursion depth in declare_target-{1,2}.f90

2018-04-25 Thread Tom de Vries
gomp, testsuite] Reduce recursion depth in declare_target-{1,2}.f90 2018-04-25 Tom de Vries PR target/85519 * testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Reduce recursion depth from 25 to 23. * testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same. --- li

[openacc, testsuite, PR85527, committed] Fix undefined behaviour in atomic_capture-1.f90

2018-04-27 Thread Tom de Vries
atomic_capture-1.f90 2018-04-28 Tom de Vries PR testsuite/85527 * testsuite/libgomp.oacc-fortran/atomic_capture-1.f90 (main): Store atomic capture results obtained in parallel loop to an array, instead of to a scalar. --- .../libgomp.oacc-fortran/atomic_capture-1.f90 | 244

[openacc, testsuite, PR85527, committed] Fix undefined behaviour in atomic_capture-1.c

2018-04-29 Thread Tom de Vries
om [openacc, testsuite] Fix undefined behaviour in atomic_capture-1.c 2018-04-29 Julian Brown Tom de Vries PR testsuite/85527 * testsuite/libgomp.oacc-c-c++-common/atomic_capture-1.c: Allow arbitrary order for iterations of atomic subtract check. --- .../libgomp.oacc-c-c++-common/ato

[og7, libgomp, nvptx, committed] Fix too-many-resources fatal error condition and message

2018-04-30 Thread Tom de Vries
addresses these issues. Committed to og7. Thanks, - Tom [libgomp, nvptx] Fix too-many-resources fatal error condition and message 2018-04-30 Tom de Vries * plugin/plugin-nvptx.c (nvptx_exec): Fix insufficient-resources-to-launch fatal error condition and message. --- libgomp/plugin/plugin

[og7, openacc, testsuite, committed] Reduce resource usage for Titan V in parallel-dims.c

2018-04-30 Thread Tom de Vries
resource usage for Titan V in parallel-dims.c 2018-04-30 Tom de Vries * testsuite/libgomp.oacc-c-c++-common/parallel-dims-compile.c: New test, factored out of ... * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c (main): ... here. Limit num_workers to avoid insufficient-resources-to

Re: [PATCH] Handle empty infinite loops in OpenACC for PR84955

2018-05-01 Thread Tom de Vries
On 04/16/2018 08:13 PM, Tom de Vries wrote: On 04/12/2018 08:58 PM, Jakub Jelinek wrote: On Thu, Apr 12, 2018 at 11:39:43AM -0700, Cesar Philippidis wrote: Strange. I didn't observe any regressions when I tested it. But, then again, I was testing against revision r259092 | jason | 2018-

[og7, c, openacc, PR85465, committed] Handle non-var-decl in mark_vars_oacc_gangprivate

2018-05-01 Thread Tom de Vries
sable (var); } } ... Fixed by skipping over the non VAR_DECLs in the loop. Build x86_64 with nvptx accelerator, ran libgomp testsuite. Committed to og7 branch. Thanks, - Tom [c, openacc] Handle non-var-decl in mark_vars_oacc_gangprivate 2018-05-01 Tom de Vries PR target/85465 *

[nvptx, PR85451, committed] Improve "offload compiler not found" message in mkoffload

2018-05-01 Thread Tom de Vries
Hi, this patch improves the "offload compiler not found" error message in nvptx's mkoffload, by suggesting to use '-B' to fix the error. Committed to trunk. Thanks, - Tom [nvptx] Improve "offload compiler not found" message in mkoffload 2018-05-01 Tom

[PATCH, lto, PR85451] Add "could not find mkoffload" error message to lto-wrapper

2018-05-01 Thread Tom de Vries
or message to lto-wrapper 2018-05-01 Tom de Vries PR lto/85451 * lto-wrapper.c (compile_offload_image): Add "could not find mkoffload" error message. --- gcc/lto-wrapper.c | 66 --- 1 file changed, 34 insertions(+), 32 deletions(-

[PING] [PATCH, libgomp, openacc] Factor out async argument utility functions

2018-05-01 Thread Tom de Vries
On 11/17/2017 02:18 PM, Tom de Vries wrote: Hi, I've factored out 3 new functions to test properties of enum acc_async_t: ... typedef enum acc_async_t {   /* Keep in sync with include/gomp-constants.h.  */   acc_async_noval = -1,   acc_async_sync  = -2 } acc_async_t; ... In ord
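
[ Illustration, not part of the quoted message. The snippet above is cut off before it names the three factored-out functions, so the following is only a minimal sketch of what predicates over an async argument could look like. The function names, and the choice to treat out-of-range values as synchronous, are assumptions made for this example; only the two enum values come from the quoted text. ]

```c
#include <stdbool.h>

/* Enum values as quoted in the message above.  */
typedef enum acc_async_t {
  acc_async_noval = -1,
  acc_async_sync = -2
} acc_async_t;

/* Hypothetical helper: true if ASYNC names an explicit queue,
   assumed here to be any non-negative value.  */
static inline bool
async_valid_stream_id_p (int async)
{
  return async >= 0;
}

/* Hypothetical helper: true if ASYNC is one of the two special
   enum values or an explicit queue id.  */
static inline bool
async_valid_p (int async)
{
  return (async == acc_async_noval
          || async == acc_async_sync
          || async_valid_stream_id_p (async));
}

/* Hypothetical helper: true if an operation with this ASYNC argument
   should run synchronously.  Assumption: invalid values fall back to
   synchronous behaviour.  */
static inline bool
async_synchronous_p (int async)
{
  if (!async_valid_p (async))
    return true;
  return async == acc_async_sync;
}
```

[ With helpers of this shape, a caller can write `if (async_synchronous_p (async))` instead of comparing against the enum constants directly, which keeps the interpretation of the special values in one place. ]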

Re: [PATCH, testsuite, 1/2] Add scan-wpa-ipa-dump

2018-05-02 Thread Tom de Vries
On 03/29/2018 11:16 AM, Tom de Vries wrote: On 03/29/2018 11:11 AM, Tom de Vries wrote: Hi, Consider an lto multi-source test-case main.c and foo.c: .. $ cat main.c extern int foo (void); int main () {    return foo () + 1; } $ cat foo.c int __attribute__((noinline, noclone)) foo (void

Re: [PATCH, testsuite, 2/2] Add scan-ltrans-tree-dump

2018-05-02 Thread Tom de Vries
-ltrans-tree-dump 2018-03-28 Tom de Vries PR testsuite/85106 * gcc.dg/ipa/ipa-icf-38.c: Use scan-ltrans-tree-dump. * lib/scanltranstree.exp: New file. * lib/target-supports.exp (scan-ltrans-tree-dump_required_options) (scan-ltrans-tree-dump-times_required_options) (scan-ltrans-tree-dump

[testsuite] Add scan-offload-tree-dump

2018-05-02 Thread Tom de Vries
/msg00319.html ). ] OK for trunk? Thanks, - Tom [testsuite] Add scan-offload-tree-dump 2018-03-28 Tom de Vries PR testsuite/85106 * lib/scanoffloadtree.exp: New file. * testsuite/lib/libgomp-dg.exp (libgomp-dg-test): Add save-temps to extra_tool_flags if it contains an -foffload=-fdump

Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}

2018-05-03 Thread Tom de Vries
On 01/18/2018 09:55 AM, Tom de Vries wrote: diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-static-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-static-2.c index 6de739a..e273a79 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-static-2.c +++ b/libgomp

[testsuite] Add scan-offload-rtl-dump

2018-05-03 Thread Tom de Vries
Hi, I'm posting this patch for the record. I wrote it but haven't found a use for it yet. I find it easier to write asm scans for nvptx than rtl ones. Thanks, - Tom [testsuite] Add scan-offload-rtl-dump 2018-03-28 Tom de Vries * lib/scanoffloadrtl.exp: New fil

[PATCH, expand, PR85639] Handle null target in expand_builtin_goacc_parlevel_id_size

2018-05-04 Thread Tom de Vries
[ was: Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size} ] On 01/18/2018 09:55 AM, Tom de Vries wrote: On 01/17/2018 06:51 PM, Jakub Jelinek wrote: On Wed, Jan 17, 2018 at 06:42:33PM +0100, Tom de Vries wrote: @@ -6602,6 +6604,71 @@ expand_stack_save (void

[og7, libgomp, openacc, nvptx, committed] Don't select too many workers

2018-05-04 Thread Tom de Vries
ild x86_64 with nvptx accelerator, tested libgomp. Committed to og7 branch. Thanks, - Tom [libgomp, openacc, nvptx] Don't select too many workers 2018-05-04 Tom de Vries PR libgomp/85649 * plugin/plugin-nvptx.c (MIN, MAX): Redefine. (nvptx_exec): Choose num_workers such that device has suffi

[nvptx, PR85653, committed] Add workaround for subsequent bar.syncs

2018-05-05 Thread Tom de Vries
ce, and reverted) with x86_64 with nvptx accelerator and tested libgomp. Committed to trunk. Thanks, - Tom [nvptx] Add workaround for subsequent bar.syncs 2018-05-04 Tom de Vries PR target/85653 * config/nvptx/nvptx.c (WORKAROUND_PTXJIT_BUG_3): Define. (workaround_barsyncs): New functi

Re: [PATCH] Add constant folding support for next{after,toward}{,f,l} (PR libstdc++/85466)

2018-05-07 Thread Tom de Vries
On 04/21/2018 07:36 PM, Jakub Jelinek wrote: * gcc.dg/nextafter-2.c: New test. Hi, FTR, I ran into a link error "unresolved symbol nexttowardf" using the standalone nvptx toolchain: ... PASS: gcc.dg/nextafter-1.c (test for excess errors) PASS: gcc.dg/nextafter-1.c execution test PASS

[openacc, testsuite] Allow installed testing of libgomp to find gomp-constants.h

2018-05-07 Thread Tom de Vries
[ was: Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size} ] On 05/03/2018 12:36 PM, Tom de Vries wrote: On 01/18/2018 09:55 AM, Tom de Vries wrote: diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-static-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang

Re: [PATCH] Add constant folding support for next{after,toward}{,f,l} (PR libstdc++/85466)

2018-05-08 Thread Tom de Vries
On 05/07/2018 03:41 PM, Christophe Lyon wrote: On 7 May 2018 at 12:04, Tom de Vries wrote: On 04/21/2018 07:36 PM, Jakub Jelinek wrote: * gcc.dg/nextafter-2.c: New test. Hi, FTR, I ran into a link error "unresolved symbol nexttowardf" using the standalone nvptx

[nvptx, PR85626, committed] Make trap insn noreturn

2018-05-09 Thread Tom de Vries
ed to trunk. Thanks, - Tom [nvptx] Make trap insn noreturn 2018-05-09 Tom de Vries PR target/85626 * config/nvptx/nvptx.md (define_insn "trap", define_insn "trap_if_true") (define_insn "trap_if_false"): Add exit after trap. --- gcc/config/nvptx/nvptx.md | 6 +++

Re: [PING] [PATCH, libgomp, openacc] Factor out async argument utility functions

2018-05-09 Thread Tom de Vries
On 05/01/2018 10:50 PM, Tom de Vries wrote: On 11/17/2017 02:18 PM, Tom de Vries wrote: Hi, I've factored out 3 new functions to test properties of enum acc_async_t: ... typedef enum acc_async_t {    /* Keep in sync with include/gomp-constants.h.  */    acc_async_noval = -1,    acc_async

Re: [PATCH, libgomp, openacc] Use GOMP_ASYNC_SYNC in GOACC_declare

2018-05-09 Thread Tom de Vries
On 11/17/2017 09:45 AM, Tom de Vries wrote: Hi, GOACC_enter_exit_data has this prototype: ... void GOACC_enter_exit_data (int device, size_t mapnum,    void **hostaddrs, size_t *sizes,    unsigned short *kinds,    int async, int
