Re: [C++] Coding rule enforcement

2015-09-16 Thread Nathan Sidwell
On 09/16/15 10:23, Jason Merrill wrote: On 09/16/2015 08:02 AM, Nathan Sidwell wrote: + else if (warn_multiple_inheritance) +warning (OPT_Wmultiple_inheritance, + "%qT defined with multiple direct bases", ref); You don't need to guard the warning with a check

Re: Openacc launch API

2015-09-16 Thread Nathan Sidwell
Ping? On 09/11/15 11:50, Nathan Sidwell wrote: Ping? https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01498.html On 09/07/15 08:48, Nathan Sidwell wrote: On 08/25/15 09:29, Nathan Sidwell wrote: Jakub, This patch changes the launch API for openacc parallels. The current scheme passes the

Re: Openacc launch API

2015-09-17 Thread Nathan Sidwell
On 09/17/15 05:36, Bernd Schmidt wrote: Fail how? Jakub has requested that it works but falls back to unaccelerated execution, can you confirm this is what you expect to happen with this patch? Yes, that is the failure mode. - if (num_waits) + va_start (ap, kinds); + /* TODO: This will ne

Re: Openacc launch API

2015-09-17 Thread Nathan Sidwell
existing code path and have a tagging flag, rather than duplicate it. nathan 2015-09-17 Nathan Sidwell include/ * gomp-constants.h (GOMP_VERSION_NVIDIA_PTX): Increment. (GOMP_DIM_GANG, GOMP_DIM_WORKER, GOMP_DIM_VECTOR, GOMP_DIM_MAX, GOMP_DIM_MASK): New. (GOMP_LAUNCH_DIM

Re: [C++] Coding rule enforcement

2015-09-17 Thread Nathan Sidwell
On 09/16/15 10:23, Jason Merrill wrote: On 09/16/2015 08:02 AM, Nathan Sidwell wrote: + else if (warn_multiple_inheritance) +warning (OPT_Wmultiple_inheritance, + "%qT defined with multiple direct bases", ref); You don't need to guard the warning with a check

[gomp4] default reduction expansion

2015-09-18 Thread Nathan Sidwell
The default reduction expander was confusingly not placed with the other openacc default hooks, it also indirected to a bunch of worker functions all doing essentially the same thing, which obscured what was happening. Reimplemented thusly. nathan 2015-09-18 Nathan Sidwell * omp-low.c

Re: Openacc launch API

2015-09-18 Thread Nathan Sidwell
On 09/18/15 05:13, Bernd Schmidt wrote: Is that so difficult though? See if nvptx ignores (let's say) intelmic arguments in favour of the default and accepts nvptx ones. I'm sorry, I think it is unreasonable to require support in this patch for something that's not yet implemented in the rest

[gomp4] ptx reduction simplification

2015-09-18 Thread Nathan Sidwell
hanges are straightforward. The vector initializer didn't need to create 2 new BBs -- just one for the initialization path. nathan 2015-09-18 Nathan Sidwell * omp-low.h (omp_reduction_init_op): Declare. * omp-low.c (omp_reduction_init_op): New, broken out of ... (omp_reduction_init): ... here. C

Re: Openacc launch API

2015-09-21 Thread Nathan Sidwell
Jakub? https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01287.html On 09/17/15 10:40, Nathan Sidwell wrote: Updated patch addressing your points. Some further comments though ... + while (GOMP_LAUNCH_PACK (GOMP_LAUNCH_END, 0, 0) + != (tag = va_arg (ap, unsigned))) That's a som
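The loop quoted in this snippet reads packed launch tags from a variable argument list until it meets an end marker. A minimal sketch of that pattern follows. The DEMO_* names, the bit layout and the choice to skip unknown tags are assumptions made for illustration; none of them are taken from gomp-constants.h or from the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical layout: 4-bit code, 12-bit device, 16-bit operand.  */
#define DEMO_LAUNCH_PACK(CODE, DEVICE, OP) \
  (((unsigned) (CODE) << 28) | ((unsigned) (DEVICE) << 16) | (unsigned) (OP))
#define DEMO_LAUNCH_CODE(X) (((X) >> 28) & 0xfu)
#define DEMO_LAUNCH_OP(X) ((X) & 0xffffu)

/* Hypothetical tag codes; END must pack to the terminator value.  */
enum { DEMO_LAUNCH_END = 0, DEMO_LAUNCH_DIM = 1, DEMO_LAUNCH_ASYNC = 2 };

struct demo_launch
{
  unsigned dims;
  int async;
};

/* Walk the tags that follow OUT until the END tag is seen.
   Returns the number of tags consumed, not counting END.  */
static int
demo_parse_launch (struct demo_launch *out, ...)
{
  va_list ap;
  unsigned tag;
  int seen = 0;

  assert (out != NULL);
  out->dims = 0;
  out->async = -1;

  va_start (ap, out);
  while (DEMO_LAUNCH_PACK (DEMO_LAUNCH_END, 0, 0)
         != (tag = va_arg (ap, unsigned)))
    {
      switch (DEMO_LAUNCH_CODE (tag))
        {
        case DEMO_LAUNCH_DIM:
          out->dims = DEMO_LAUNCH_OP (tag);
          break;
        case DEMO_LAUNCH_ASYNC:
          out->async = (int) DEMO_LAUNCH_OP (tag);
          break;
        default:
          /* This sketch simply skips tags it does not know.  */
          break;
        }
      seen++;
    }
  va_end (ap);
  return seen;
}
```

A caller would pass any number of packed tags and finish with DEMO_LAUNCH_PACK (DEMO_LAUNCH_END, 0, 0), so new tag kinds can be added without changing the function's signature.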

New post-LTO OpenACC pass

2015-09-21 Thread Nathan Sidwell
lly to omp-low.c in this patch, it ends up being more widely used. ok for trunk? nathan 2015-09-21 Nathan Sidwell Cesar Philippidis * omp-low.h (get_oacc_fn_attrib): Declare. * omp-low.c (get_oacc_fn_attrib): New. (oacc_xform_on_device): New. (execute_oacc_transform)

Re: [C++] Coding rule enforcement

2015-09-21 Thread Nathan Sidwell
On 09/21/15 12:23, Jason Merrill wrote: On 09/21/2015 10:01 AM, Manuel López-Ibáñez wrote: On 21 September 2015 at 15:46, Daniel Gutson wrote: FWIW, we could make this plugin in 2 weeks (we already have static checkers as plugins for our customers). I understand Nathan that you may have some d

Re: New post-LTO OpenACC pass

2015-09-21 Thread Nathan Sidwell
On 09/21/15 16:30, Cesar Philippidis wrote: On 09/21/2015 09:30 AM, Nathan Sidwell wrote: +const pass_data pass_data_oacc_transform = +{ + GIMPLE_PASS, /* type */ + "fold_oacc_transform", /* name */ Want to rename the tree dump file to oacc_xforms like I did in the

Re: [gomp4] ptx reduction simplification

2015-09-22 Thread Nathan Sidwell
On 09/22/15 11:10, Thomas Schwinge wrote: Hi! On Fri, 18 Sep 2015 20:05:48 -0400, Nathan Sidwell wrote: I've committed this patch to rework and simplify [...] the reduction lowering hooks. The current implementation [...] [was] overcomplicated in a number of ways. * omp-

Re: New post-LTO OpenACC pass

2015-09-22 Thread Nathan Sidwell
On 09/21/15 16:39, Nathan Sidwell wrote: On 09/21/15 16:30, Cesar Philippidis wrote: On 09/21/2015 09:30 AM, Nathan Sidwell wrote: +const pass_data pass_data_oacc_transform = +{ + GIMPLE_PASS, /* type */ + "fold_oacc_transform", /* name */ Want to rename the tree dump file to o

[gomp4] Another oacc reduction simplification

2015-09-22 Thread Nathan Sidwell
r for vector and worker loops. 2) Create a local private instance for all cases of reference var reductions, not just those in vector & worker loops 3) Generate the sequences of reduction functions in one go, rather than multiple scans of the reduction clauses. nathan 2015-09-22 Na

Re: New post-LTO OpenACC pass

2015-09-23 Thread Nathan Sidwell
On 09/23/15 06:59, Bernd Schmidt wrote: On 09/22/2015 05:16 PM, Nathan Sidwell wrote: +if (gimple_call_builtin_p (call, BUILT_IN_ACC_ON_DEVICE)) + /* acc_on_device must be evaluated at compile time for + constant arguments. */ + { +oacc_xform_on_device (call

Re: [gomp4] Another oacc reduction simplification

2015-09-23 Thread Nathan Sidwell
On 09/23/15 04:02, Thomas Schwinge wrote: Hi! On Tue, 22 Sep 2015 11:29:37 -0400, Nathan Sidwell wrote: I've committed this patch, which simplifies the generation of openacc reduction code. Aside from the progression mentioned in <http://news.gmane.org/find-root.php?message_id=%3C87

Re: [gomp4] lock/unlock internal fn

2015-09-23 Thread Nathan Sidwell
On 09/23/15 05:27, Thomas Schwinge wrote: Hi Nathan! On Mon, 17 Aug 2015 15:30:16 -0400, Nathan Sidwell wrote: I've committed this patch to add a new pair of internal functions. These will be used in implementing reductions. They'll be emitted around reduction finalization, and

[gomp4] vector reductions

2015-09-23 Thread Nathan Sidwell
I've committed this reimplementation of the vector shuffling code. In preparing a fix for the worker reductions (to use a lockless scheme), I wanted to check VIEW_CONVERT_EXPR DTRT. Use of gimplify_assign also reduces the code size. nathan 2015-09-23 Nathan Sidwell * config/

Re: [gomp4] lock/unlock internal fn

2015-09-23 Thread Nathan Sidwell
On 09/23/15 10:16, Thomas Schwinge wrote: Hi Nathan! On Wed, 23 Sep 2015 08:40:51 -0400, Nathan Sidwell wrote: On 09/23/15 05:27, Thomas Schwinge wrote: On Mon, 17 Aug 2015 15:30:16 -0400, Nathan Sidwell wrote: I've committed this patch to add a new pair of internal functions. These

[gomp4] oacc xform updates

2015-09-23 Thread Nathan Sidwell
I've committed this patch to change all the OACC hooks to take a gcall * rather than 'gimple'. Mainline has changed the type of 'gimple', and we know we're passing a call anyway. Also updated the rescanning to be more straightforward. nathan 2015-09-23 N

Re: New post-LTO OpenACC pass

2015-09-23 Thread Nathan Sidwell
On 09/23/15 08:58, Bernd Schmidt wrote: On 09/23/2015 02:14 PM, Nathan Sidwell wrote: On 09/23/15 06:59, Bernd Schmidt wrote: On 09/22/2015 05:16 PM, Nathan Sidwell wrote: +if (gimple_call_builtin_p (call, BUILT_IN_ACC_ON_DEVICE)) + /* acc_on_device must be evaluated at compile time

Re: New post-LTO OpenACC pass

2015-09-23 Thread Nathan Sidwell
On 09/23/15 14:51, Bernd Schmidt wrote: On 09/23/2015 08:42 PM, Nathan Sidwell wrote: As I feared, builtin folding occurs in several places. In particular its first call is very early on in the host compiler, which is far too soon. We have to defer folding until we know whether we're

Re: [gomp4 0/8] NVPTX: initial OpenMP offloading

2015-09-24 Thread Nathan Sidwell
On 09/24/15 03:21, Jakub Jelinek wrote: So I'd like to ask Thomas/Nathan if they are ok with this stuff being on the gomp-4_0-branch for now, once all the prerequisites it needs are on the trunk, it can go into its own branch. Let Thomas & I think about it. Now that the new launch API is app

[gomp4] adjust worker reduction allocation

2015-09-24 Thread Nathan Sidwell
I've committed this patch to reduce the number of worker reduction allocation builtins. We now pass in the (constant) allocation size and alignment and return a void ptr. nathan 2015-09-24 Nathan Sidwell * config/nvptx/nvptx.c (nvptx_expand_work_red_addr): Args 0 & 1 are

Re: New post-LTO OpenACC pass

2015-09-24 Thread Nathan Sidwell
On 09/23/15 14:58, Nathan Sidwell wrote: On 09/23/15 14:51, Bernd Schmidt wrote: On 09/23/2015 08:42 PM, Nathan Sidwell wrote: As I feared, builtin folding occurs in several places. In particular its first call is very early on in the host compiler, which is far too soon. We have to defer

[gomp4] rework ptx builtins ... again

2015-09-24 Thread Nathan Sidwell
rtly. nathan 2015-09-24 Nathan Sidwell * config/nvptx/nvptx.c (struct builtin_description): Delete. (nvptx_expand_shuffle_down): Rename to ... (nvptx_expand_shuffle): ... here. add additional arg for type of shuffle. (nvptx_expand_work_red_addr): Rename to ... (nvptx_expand_worker_

Re: New post-LTO OpenACC pass

2015-09-25 Thread Nathan Sidwell
On 09/25/15 06:28, Bernd Schmidt wrote: This is the c-c++-common/goacc/acc_on_device-2.c testcase. Is that expected to be handled? If I change it to use __builtin_acc_on_device, I can step right into Breakpoint 8, fold_call_stmt (stmt=0x70736e10, ignore=false) at ../../git/gcc/builtins.c:1

Re: [gomp4] Another oacc reduction simplification

2015-09-25 Thread Nathan Sidwell
On 09/24/15 16:32, Cesar Philippidis wrote: On 09/22/2015 08:29 AM, Nathan Sidwell wrote: 1) Don't have a fake gang reduction outside of worker & vector loops. Deal with the receiver object directly. I.e. 'ref_to_res' need not be a null pointer for vector and worker loops.

Re: New post-LTO OpenACC pass

2015-09-28 Thread Nathan Sidwell
On 09/25/15 09:19, Bernd Schmidt wrote: On 09/25/2015 03:03 PM, Bernd Schmidt wrote: 182 else if (acc_device_type (acc_dev->type) == acc_device_host) (gdb) p acc_dev->type $1 = OFFLOAD_TARGET_TYPE_HOST (gdb) next 184 fn (hostaddrs); It's not running the offloaded version, so the t

[gomp4] lockless reductions

2015-09-28 Thread Nathan Sidwell
ry barriers, which would be worse). Why not just use the atomic cmp&swp later to get an initial value. initval(OP) is more than likely to be a correct guess for the first thread reaching here, so we save one memory access. 2015-09-28 Nathan Sidwell * config/nvptx/nvptx.md (atomic
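The idea in this snippet, guessing the operator's initial value and letting an atomic compare-and-swap correct the guess, can be sketched on the host with C11 atomics. This is an illustration under assumptions: demo_reduce_add is a hypothetical name, the operator is fixed to '+', and the code is not the nvptx implementation from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Add LOCAL into *TARGET without taking a lock.  */
static void
demo_reduce_add (_Atomic int *target, int local)
{
  /* Guess that the target still holds the initial value for '+'.
     If the guess is right, no separate load is needed.  */
  int expected = 0;

  assert (target != NULL);
  /* On failure, EXPECTED is refreshed with the value actually found,
     so the next attempt recomputes the sum from that value.  */
  while (!atomic_compare_exchange_weak (target, &expected, expected + local))
    ;
}
```

The first thread to arrive usually succeeds on its first attempt; later threads pay one failed exchange to learn the current value and then retry.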

[gomp4] remove goacc locking

2015-09-28 Thread Nathan Sidwell
I've committed this to remove the now no longer needed lock and unlock builtins and related infrastructure. nathan 2015-09-28 Nathan Sidwell * target.def (GOACC_LOCK): Delete hook. * doc/tm.texi.in (TARGET_GOACC_LOCK): Delete. * doc/tm.texi: Rebuilt. * targhooks.h (default_goacc

Re: Openacc launch API

2015-09-28 Thread Nathan Sidwell
itted the attached. Thanks for the review. nathan 2015-09-28 Nathan Sidwell include/ * gomp-constants.h (GOMP_VERSION_NVIDIA_PTX): Increment. (GOMP_DIM_GANG, GOMP_DIM_WORKER, GOMP_DIM_VECTOR, GOMP_DIM_MAX, GOMP_DIM_MASK): New. (GOMP_LAUNCH_DIM, GOMP_LAUNCH_ASYNC, GOMP_LAUNCH_WAIT)

[gomp4] fold acc_on_device

2015-09-29 Thread Nathan Sidwell
than 2015-09-29 Nathan Sidwell gcc/ * omp-low.c (oacc_xform_on_device): Delete. (oacc_xform_dim): Return bool. (execute_oacc_transform): Don't handle acc_on_device here. Adjust rescan logic. * builtins.c (expand_builtin_acc_on_device): Delete. (expand_builtin): Do not call it. (f

[gomp4] Rename oacc_transform pass

2015-09-29 Thread Nathan Sidwell
I've committed this to gomp4 branch. It renames the oacc_transform pass to oacc_device_lower, in line with the (now withdrawn) patch for mainline. I'm preparing a version of the pass for mainline with a different initial use than acc_on_device folding. nathan 2015-09-29 Nath

Fold acc_on_device

2015-09-29 Thread Nathan Sidwell
n the host-side libgomp piece. Ok for trunk? nathan 2015-09-29 Nathan Sidwell gcc/ * builtins.c (expand_builtin_acc_on_device): Delete. (expand_builtin): Don't call it. (fold_builtin_1): Fold acc_on_device. libgomp/ * oacc-init.c (acc_on_device): Force optimization level. Inde

New OpenACC pass and Target Hook

2015-09-29 Thread Nathan Sidwell
hook, but currently does no validation. When the partitioned execution patch(es) are ready, it will make sense for the backend to validate -- this is already working on the branch, FWIW. ok for trunk? nathan 2015-09-29 Nathan Sidwell Cesar Philippidis gcc/ * config/nvp

[openacc] use cuda error routine

2015-09-29 Thread Nathan Sidwell
The cuda library has provided cuGetErrorString since at least 5.5, along with documentation of same. What's been missing until cuda 7.0 is a declaration in the cuda header file. I've merged this patch from the gomp4 branch to the nvptx libgomp plugin. nathan 2015-09-29 Nath

Re: Fold acc_on_device

2015-09-29 Thread Nathan Sidwell
On 09/29/15 15:52, Bernd Schmidt wrote: Ok, although I really don't quite see the need to drop the expander. Unnecessary code duplication. It's better to say something once in one place, than try and say it twice in two different places. nathan

Re: Fold acc_on_device

2015-09-30 Thread Nathan Sidwell
On 09/30/15 04:07, Richard Biener wrote: On Tue, Sep 29, 2015 at 8:21 PM, Nathan Sidwell wrote: This patch folds acc_on_device as a regular builtin, but postponed until we know which compiler we're in. As suggested by Bernd, we use the existing builtin folding machinery. Trunk is still

Re: Openacc launch API

2015-09-30 Thread Nathan Sidwell
On 09/30/15 08:37, Matthias Klose wrote: On 25.08.2015 15:29, Nathan Sidwell wrote: Jakub, This patch changes the launch API for openacc parallels. this broke the jit build. The following patch fixes the build for me. Ok to commit? Matthias 2015-09-30 Matthias Klose * jit

Re: Fold acc_on_device

2015-09-30 Thread Nathan Sidwell
On 09/30/15 08:46, Richard Biener wrote: I'll add a comment to builtins.c (not that I expect anyone sees it ;)) Put one instance at the default: label in expand_builtin? nathan

Re: New OpenACC pass and Target Hook

2015-09-30 Thread Nathan Sidwell
ooks ok to me. For avoidance of doubt, is this approval, or 'LGTM, but needs Jakub's approval'? nathan 2015-09-30 Nathan Sidwell Cesar Philippidis gcc/ * config/nvptx/nvptx.c (nvptx_goacc_validate_dims): New. (TARGET_GOACC_VALIDATE_DIMS): Override. * target.def (TA

Re: New OpenACC pass and Target Hook

2015-09-30 Thread Nathan Sidwell
On 09/30/15 11:52, Nathan Sidwell wrote: For avoidance of doubt, is this approval, or 'LGTM, but needs Jakub's approval'? Just noticed unnecessary white space change that patch contained. Updated here. nathan 2015-09-30 Nathan Sidwell Cesar Philippidis gcc/

Re: Fold acc_on_device

2015-09-30 Thread Nathan Sidwell
On 09/30/15 08:46, Richard Biener wrote: Please don't add any new GENERIC based builtin folders. Instead add to gimple-fold.c:gimple_fold_builtin Is this patch ok? nathan 2015-09-30 Nathan Sidwell * builtins.c: Don't include gomp-constants.h. (fold_builtin_1): Don't fol

ptx offload data format

2015-09-30 Thread Nathan Sidwell
f the changes to link_ptx were done by Bernd a while back. No change to the PTX ABI version number, as that just got incremented last week with the launch API change -- it's in a state of flux right now. nathan 2015-09-30 Nathan Sidwell gcc/ * config/nvptx/mkoffload.c (process): Chan

Re: [gomp4] remove goacc locking

2015-10-01 Thread Nathan Sidwell
On 10/01/15 04:14, Thomas Schwinge wrote: Hi Nathan! On Mon, 28 Sep 2015 11:56:09 -0400, Nathan Sidwell wrote: I've committed this to remove the now no longer needed lock and unlock builtins and related infrastructure. If I understand correctly, it is an implementation detail of the

Re: Fold acc_on_device

2015-10-01 Thread Nathan Sidwell
arg0, build_int_cst (integer_type_node, val_host)); gsi_insert_before (gsi, g); ... Like this? nathan 2015-10-01 Nathan Sidwell * builtins.c: Don't include gomp-constants.h. (fold_builtin_1): Don't fold acc_on_device here. * gimple-fold.c: Include g

[gomp4] backport some changes

2015-10-01 Thread Nathan Sidwell
I've applied this to gomp4 to apply some changes to these areas that occurred on merging to trunk. nathan 2015-10-01 Nathan Sidwell * config/nvptx/nvptx.c (nvptx_validate_dims): Rename to ... (nvptx_goacc_validate_dims): ... here. (TARGET_GOACC_VALIDATE_DIMS): Update. * targe

Re: Fold acc_on_device

2015-10-01 Thread Nathan Sidwell
On 10/01/15 08:46, Richard Biener wrote: On Thu, Oct 1, 2015 at 2:33 PM, Nathan Sidwell wrote: use TREE_TYPE (arg0) for the integer cst. Otherwise looks good to me. thanks, fixed up and applied (also noticed a copy & paste malfunction setting the location) nathan 2015-10-01 Na

[gomp4] gimple fold acc_on_device

2015-10-01 Thread Nathan Sidwell
I've applied this version of the acc_on_device folding to gomp4. See https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00074.html for the trunk discussion. nathan 2015-10-01 Nathan Sidwell * builtins.c: Don't include gomp-constants.h. (fold_builtin_1): Don't fold acc

Re: Fold acc_on_device

2015-10-01 Thread Nathan Sidwell
On 10/01/15 13:00, Andrew MacLeod wrote: btw, not that it's necessarily important, but I'm about to submit the include reduction patches today, and it turns out this line is the first use of anything from cgraph.h in builtins.c. So if this is "the way" of doing the test, be aware it adds a dep

Re: Fold acc_on_device

2015-10-06 Thread Nathan Sidwell
On 10/06/15 02:12, Segher Boessenkool wrote: On Thu, Oct 01, 2015 at 08:33:07AM -0400, Nathan Sidwell wrote: 2015-10-01 Nathan Sidwell * builtins.c: Don't include gomp-constants.h. (fold_builtin_1): Don't fold acc_on_device here. * gimple-fold.c: In

[PR 67861] fix printf_chk folding

2015-10-06 Thread Nathan Sidwell
I've committed this obvious fix. Sorry for the breakage. nathan 2015-10-06 Nathan Sidwell PR 67861 * gimple-fold.c (gimple_fold_builtin): Add break after BUILT_IN_PRINTF_CHK, BUILT_IN_VPRINTF_CHK folding. Index: gimple-f
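The bug class behind this fix is a switch case that lacks a break and so falls through into the case after it. A minimal sketch follows. The DEMO_* names are hypothetical, and which case follows the printf_chk cases in gimple-fold.c is an assumption, since the snippet does not show it.

```c
#include <assert.h>

enum demo_builtin
{
  DEMO_PRINTF_CHK,
  DEMO_VPRINTF_CHK,
  DEMO_ACC_ON_DEVICE
};

/* Return a code naming which folder handled FN.  */
static int
demo_fold (enum demo_builtin fn)
{
  int handled_by = 0;

  switch (fn)
    {
    case DEMO_PRINTF_CHK:
    case DEMO_VPRINTF_CHK:
      handled_by = 1;
      break; /* Without this break, control falls into the next case.  */
    case DEMO_ACC_ON_DEVICE:
      handled_by = 2;
      break;
    }

  assert (handled_by != 0);
  return handled_by;
}
```

If the first break is removed, demo_fold (DEMO_PRINTF_CHK) returns 2 instead of 1, because the assignment in the following case overwrites the result.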

[nvptx] fix some c++ tests

2015-10-08 Thread Nathan Sidwell
I've committed this to trunk. The C++ ABI now returns a pointer to the passed-in artificial arg that points to the return area. Consequently return-in-mem and type_mode(return_type) == VOIDmode are not tautologies. nathan 2015-10-08 Nathan Sidwell * config/nvptx/nvptx.h (s

Re: [PR c/64765, c/64880] Support OpenACC Combined Directives in C, C++

2015-10-09 Thread Nathan Sidwell
On 10/08/15 12:39, Thomas Schwinge wrote: Hi! Some bits extracted out of gomp-4_0-branch, and some other bits rewritten; here is a patch to support OpenACC Combined Directives in C, C++. (The Fortran front end already does support these.) As far as I know, Jakub is not available at this time,

Re: [PR c/64765, c/64880] Support OpenACC Combined Directives in C, C++

2015-10-09 Thread Nathan Sidwell
On 10/09/15 09:26, Thomas Schwinge wrote: Hi! You mean the cp_parser_oacc_loop and cp_parser_oacc_kernels_parallel functions need documentation? I agree it's a bit terse, but documenting these by just listing the accepted parsing tokens "# pragma acc loop" etc., followed by the *_CLAUSE_MASKs

[gomp4]

2015-10-09 Thread Nathan Sidwell
I've applied this to gomp4 branch. 1) ports the break fix in gimple-fold from trunk 2) fixes missing tab in ptx output. nathan 2015-10-09 Nathan Sidwell * config/nvptx/nvptx.c (nvptx_init_axis_predicate): Fix output formatting. PR 67861 * gimple-fold.c (gimple_fold_builtin): Add

[gomp4] remove bogus tests

2015-10-10 Thread Nathan Sidwell
the erroneous case for the moment. If anyone's wondering, a patch I'm working on blew up on these two cases because it tried to manipulate the loop and then discovered it wasn't in an offloaded function. nathan 2015-10-10 Nathan Sidwell * c-c++-common/goacc-gomp/nestin

Re: [gomp4] remove bogus tests

2015-10-11 Thread Nathan Sidwell
On 10/10/15 11:00, Nathan Sidwell wrote: I've committed this to gomp4 branch. Both these tests are trying an 'acc loop' outside of an offload region. That's an error. Missed that goacc/nesting-1 had 2 bogus loops. nathan 2015-10-11 Nathan Sidwell * c-c++-comm

Re: [PR c/64765, c/64880] Support OpenACC Combined Directives in C, C++

2015-10-11 Thread Nathan Sidwell
On 10/09/15 09:59, Thomas Schwinge wrote: It's a string describing the pragma as parsed thus far. Again, not documenting that as well as our usage of it is totally "standard", see OpenMP's cp_parser_omp_parallel, cp_parser_omp_for, and many more. Ok, I'm not going to hold this to a higher th

[gomp4] internal fn cleanup

2015-10-11 Thread Nathan Sidwell
I've committed this to gomp4 1) move IFN_UNIQUE constants to the IFN_UNIQUE function definition 2) Update IFN_GOACC_REDUCTION_* comments to match the renamed oacc_device_lower pass. nathan 2015-10-11 Nathan Sidwell * internal-fn.def (IFN_UNIQUE_UNSPEC, IFN_UNIQUE_OACC

[gomp4] OpenACC loop expand reorg

2015-10-12 Thread Nathan Sidwell
and optimize (b) the implementation will be device_type friendly, as device-specific choices will all have been moved to the target compiler. nathan 2015-10-12 Nathan Sidwell * omp-low.c (expand_omp_for_static_nochunk): Remove OpenACC pieces. (expand_omp_for_static_chunk): Likewise, (struct

[gomp4] More openacc loop indirection

2015-10-13 Thread Nathan Sidwell
I've committed this next patch in my series to move loop partitioning decisions to the target compiler. It introduces 2 more IFN_UNIQUE cases, marking the head and tail sequences of an openACC loop. These are added around the reduction and fork/join regions. In the oacc_device_lower pass we

[gomp4] More deferral of partitioning to target

2015-10-14 Thread Nathan Sidwell
cing the dummy axis argument of the appropriate builtins with the specific chosen axis. The next step is to iterate over the body doing the same for the loop abstraction builtin. nathan 2015-10-14 Nathan Sidwell * omp-low.c (struct oacc_loop): Add more fields. (enum oacc_loop_fl

[gomp4] remove dead code

2015-10-14 Thread Nathan Sidwell
I've committed this to gomp4 branch. It removes some now unreachable code and removes the now bogus description about OpenACC. nathan 2015-10-14 Nathan Sidwell * omp-low.c (lower_reduction_clauses): Correct comment, remove unreachable code. Index: gcc/omp-

[gomp4]

2015-10-15 Thread Nathan Sidwell
block(s) just after the header marker looking for these functions, and set the determined partitioning mask and chunking. The next patch will complete this transition. nathan 2015-10-15 Nathan Sidwell * omp-low.c (struct oacc_loop): Add chunk_size and head_end fields. (extract_omp_for_data)

[gomp4] fix routine-7 test

2015-10-15 Thread Nathan Sidwell
I've committed this to gomp4 branch. It fixes the routine-7 regression I caused when reworking the reduction machinery. nathan 2015-10-15 Nathan Sidwell * omp-low.c (lower_oacc_reductions): Check outer context is a target before lookup. Index: gcc/omp-

[gomp4] reorder functions

2015-10-15 Thread Nathan Sidwell
I've applied this to move the execute_oacc_device_lower function later in the file. I'll shortly be changing it to explicitly call a default oacc handler, and reordering makes the diff confusing (diff chooses to make this diff confusing enough). nathan 2015-10-15 Nathan Sidwell

[gomp4] small oacc cleanup

2015-10-16 Thread Nathan Sidwell
(b) I'm going to shortly be emitting diagnostics from the device compiler, and we don't want to only deliver ones from the first offloaded function. nathan 2015-10-16 Nathan Sidwell * omp-low.c (build_outer_var_ref): Just check for openacc function attrib. (pass_oacc_device_lowe

[gomp4] break some tests apart

2015-10-18 Thread Nathan Sidwell
ather unwieldy to work with, and check for what will become later-checked diagnostics, as well as earlier ones. So this patch simply breaks the test cases apart, to reduce this interaction. Committed to gomp4 branch. nathan 2015-10-18 Nathan Sidwell * c-c++-common/goacc/loop-2.c: Break apa

[gomp4] fortran testcase

2015-10-18 Thread Nathan Sidwell
series I'm working on (which exposed this problem). I've adjusted the testcase to specify a partitioning, and marked the test as xfailing. nathan 2015-10-18 Nathan Sidwell * gfortran.dg/goacc/reduction-2.f95: Force loop partitioning and xfail. Index: gcc/testsuite/gfortran.dg/goacc

[gomp4] loop partitioning

2015-10-18 Thread Nathan Sidwell
achinery. The changes to the testcases update the expected diagnostic text and expect more information, such as indicating between which two loops conflicts are occurring. nathan 2015-10-19 Nathan Sidwell gcc/ * omp-low.c (struct omp_region): Remove gwv_this field. (struct omp_c

[gomp4] auto partitioning

2015-10-19 Thread Nathan Sidwell
'seq' loop. nathan 2015-10-19 Nathan Sidwell gcc/ * omp-low.c (oacc_loop_auto_partitions): New. (oacc_loop_partition): Call it. gcc/testsuite/ * gfortran.dg/goacc/routine-4.f90: Add diagnostic. * gfortran.dg/goacc/routine-5.f90: Add diagnostic. * c-c++-common/goacc-gomp/nest

[gomp4] loop cleanup

2015-10-19 Thread Nathan Sidwell
I've committed this to gomp4. 1) small cleanup combining the bodies of two identical conditionals. 2) replace and move the OpenACC thread numbering expanders to be nearer the now sole user. nathan 2015-10-19 Nathan Sidwell * omp-low.c (scan_omp_for): Combine OpenACC condit

[gomp4] Update openacc loop iteration partitioning

2015-10-20 Thread Nathan Sidwell
r the latter we want this to expand to a regular loop iterator. Applied to gomp4 branch. nathan 2015-10-20 Nathan Sidwell gcc/ * omp-low.c (expand_oacc_for): Use -1 for unspecified static chunking. Remove unnecessary gimple forcing. (oacc_xform_loop): Adjust chunk size calculation.

[gomp4] lto error message

2015-10-20 Thread Nathan Sidwell
Another small cleanup I noticed. We can use %qD to print a decl name. Applied to gomp4 branch. nathan 2015-10-20 Nathan Sidwell * lto-cgraph.c (input_overwrite_node): Cleanup openacc diagnostic emission. Index: gcc/lto-cgraph.c

Re: [gomp4] lto error message

2015-10-20 Thread Nathan Sidwell
On 10/20/15 16:20, Ilya Verbin wrote: On Tue, Oct 20, 2015 at 15:54:45 -0400, Nathan Sidwell wrote: There might be a situation when some func or var is lost during regular LTO, even if flag_openacc is present. In this case "missing OpenACC ..." message would be wrong. And if flag_

[OpenACC] fix a couple of tests

2015-10-20 Thread Nathan Sidwell
gnostic on such a large vector length. But that's a check for a different testcase. nathan 2015-10-20 Nathan Sidwell * testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Set sane vector_length. * testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise. Index: libgomp/testsuite/l

[OpenACC] loop nesting check

2015-10-21 Thread Nathan Sidwell
they are immediately inside one of: 1) another openacc loop 2) an openacc offload region 3) an openacc routine The broken tests are amended with the now expected diagnostic. tested on x86_64-linux-gnu with nvptx accelerator. ok for trunk? nathan 2015-10-21 Nathan Sidwell gcc/ * omp-low.c

Re: [OpenACC] loop nesting check

2015-10-21 Thread Nathan Sidwell
On 10/21/15 12:06, Bernd Schmidt wrote: Were they just compile tests? Yes, some of the tests already expected errors, but missed some. I think one test didn't expect an error, but is a clearly bogus test. nathan

[gomp4] loop nesting check

2015-10-21 Thread Nathan Sidwell
This is the gomp4-branch variant of the loop nesting patch I just committed to trunk. The gomp4 branch had some checking, but a) it didn't catch all erroneous cases b) gave an ambiguous error, by not mentioning 'OpenACC' committed to gomp4 nathan 2015-10-21 Nathan Sidwell

Re: Constify host-side offload data`

2015-10-21 Thread Nathan Sidwell
On 10/21/15 13:33, Ilya Verbin wrote: Hi! This happens because .gnu.offload_{funcs,vars} sections in crtoffload{begin,end}.o now doesn't have WRITE flag, but the same sections produced by omp_finish_file has it. When linker joins writable + nonwritable sections from several objects, it insert

[OpenACC 0/11] execution model

2015-10-21 Thread Nathan Sidwell
I'll be posting a patch series for trunk, which implements the core of the OpenACC execution model. This is split into the following patches: 01-trunk-unique.patch Internal function with a 'uniqueness' property 02-trunk-nvptx-partition.patch NVPTX backend patch set for partitioned execution

Re: [OpenACC 1/11] UNIQUE internal function

2015-10-21 Thread Nathan Sidwell
l fns, all with the unique property, as the latter would need (at least) a range check in gimple_call_internal_unique_p rather than a simple equality. Jakub, IYR I originally had IFN_FORK and IFN_JOIN as such distinct internal fns. This replaces that scheme. ok? nathan 2015-10-20 N

Re: [OpenACC 3/11] new target hook

2015-10-21 Thread Nathan Sidwell
he size of that dimension is 1. The default implementation of the hook never only cares if the oacc_fork and oacc_join RTL expanders exist (and they don't on the host compiler). nathan 2015-10-20 Nathan Sidwell * target.def (fork_join): New GOACC hook. * targhooks.h (default_

Re: [OpenACC 2/11] PTX backend changes

2015-10-21 Thread Nathan Sidwell
fore the fork and then fill from that buffer just after the fork. For the worker axis, explicit sync instructions are needed before and after accessing the shared memory state. Bernd, any comments? nathan 2015-10-20 Nathan Sidwell * config/nvptx/nvptx.h (struct machine_function): Add axis

Re: [OpenACC 4/11] C FE changes

2015-10-21 Thread Nathan Sidwell
This patch implements changes to the C parser to deal with the 'gang', 'worker', 'vector', 'seq' and 'auto' clauses on an OpenACC loop directive. The first 3 can take a numeric argument, which is used within a kernels offload region and the gang clause can take an additional 'static' argument,

Re: [OpenACC 5/11] C++ FE changes

2015-10-21 Thread Nathan Sidwell
the clause name. nathan 2015-10-20 Cesar Philippidis Thomas Schwinge James Norris Joseph Myers Julian Brown Nathan Sidwell * parser.c (cp_parser_omp_clause_name): Add auto, gang, seq, vector, worker. (cp_parser_oacc_simple_clause): New. (cp_parse

Re: [OpenACC 6/11] Reduction initialization

2015-10-21 Thread Nathan Sidwell
ading and then copying it to the device. Thus at the end of the region, any slots that weren't used have a sensible initial value which will not destroy the reduction result. This code should be short lived ... nathan 2015-10-20 Nathan Sidwell * omp-low.c (o

Re: [OpenACC 7/11] execution model

2015-10-21 Thread Nathan Sidwell
atch introduces 4 variants of the IFN_UNIQUE function along with a new IFN_LOOP function. I see I also included IFN_OACC_DIM_POS and IFN_OACC_DIM_SIZE which provide the size of a compute axis and a position along it. Those are actually used in the next patch and not here. nathan 2015-10-20

Re: [OpenACC 9/11] oacc_device_lower pass gate

2015-10-21 Thread Nathan Sidwell
This patch is obvious, but included for completeness. We always want to run the device lowering pass (when openacc is enabled), in order to delete the marker and loop functions that should never be seen after this point. nathan 2015-10-20 Nathan Sidwell * omp-low.c

Re: [OpenACC 10/11] remove plugin restriction

2015-10-21 Thread Nathan Sidwell
Here's another obvious patch. The ptx plugin no longer needs to barf on gang or worker dimensions of non-unity. nathan 2015-10-20 Nathan Sidwell * plugin/plugin-nvptx.c (nvptx_exec): Remove check on compute dimensions. Index: libgomp/plugin/plugin-nv

Re: [OpenACC 11/11] execution tests

2015-10-21 Thread Nathan Sidwell
This patch has some new execution tests, verifying loop partitioning is behaving as expected. There are more execution tests on the gomp4 branch, but many of them use reductions. We'll merge those once reductions are merged. nathan 2015-10-20 Nathan Sidwell * testsuite/libgomp.oac

Re: [OpenACC 8/11] device-specific lowering

2015-10-21 Thread Nathan Sidwell
7. nathan 2015-10-20 Nathan Sidwell * omp-low.c: Include gimple-pretty-print.h. (struct oacc_loop): New. (oacc_thread_numbers): New. (oacc_xform_loop): New. (new_oacc_loop_raw, new_oacc_loop_outer, new_oacc_loop, new_oacc_loop_routine, finish_oacc_loop, free_oacc_loop): New,

Re: [OpenACC 11/11] execution tests

2015-10-21 Thread Nathan Sidwell
On 10/21/15 16:14, Ilya Verbin wrote: <11-trunk-tests.patch> Does the testcase with offload IR appear here accidentally? D'oh! yup, fixed. nathan 2015-10-20 Nathan Sidwell * testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: New. * testsuite/libgomp.oacc-c-c++-common/loop

Re: [OpenACC 7/11] execution model

2015-10-22 Thread Nathan Sidwell
On 10/22/15 05:23, Jakub Jelinek wrote: On Wed, Oct 21, 2015 at 03:42:26PM -0400, Nathan Sidwell wrote: +/* Flags for an OpenACC loop. */ + +enum oacc_loop_flags + { Weird formatting. I see either Blame emacs (I thought it was configured for GNU formatting ...) + expr = build2

Re: [OpenACC 8/11] device-specific lowering

2015-10-22 Thread Nathan Sidwell
On 10/22/15 05:31, Jakub Jelinek wrote: On Wed, Oct 21, 2015 at 03:49:08PM -0400, Nathan Sidwell wrote: So, how do you expand the OACC loops on non-PTX devices (host, or say XeonPhi)? Do you drop the IFNs and replace stuff with normal loops? On a non ptx target (canonical example being the

Re: [OpenACC 1/11] UNIQUE internal function

2015-10-22 Thread Nathan Sidwell
On 10/22/15 04:07, Richard Biener wrote: On Thu, Oct 22, 2015 at 10:04 AM, Jakub Jelinek wrote: Do you have to scan the whole bb? E.g. don't or should not those unique IFNs force end of bb? Yeah, please make them either end or start a BB so we have to check at most a single stmt. ECF_RETU
