On 09/16/15 10:23, Jason Merrill wrote:
On 09/16/2015 08:02 AM, Nathan Sidwell wrote:
+ else if (warn_multiple_inheritance)
+warning (OPT_Wmultiple_inheritance,
+ "%qT defined with multiple direct bases", ref);
You don't need to guard the warning with a check
Ping?
On 09/11/15 11:50, Nathan Sidwell wrote:
Ping?
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01498.html
On 09/07/15 08:48, Nathan Sidwell wrote:
On 08/25/15 09:29, Nathan Sidwell wrote:
Jakub,
This patch changes the launch API for openacc parallels. The current scheme
passes the
On 09/17/15 05:36, Bernd Schmidt wrote:
Fail how? Jakub has requested that it works but falls back to unaccelerated
execution, can you confirm this is what you expect to happen with this patch?
Yes, that is the failure mode.
- if (num_waits)
+ va_start (ap, kinds);
+ /* TODO: This will ne
existing code path and have a tagging flag, rather
than duplicate it.
nathan
o2015-09-17 Nathan Sidwell
include/
* gomp-constants.h (GOMP_VERSION_NVIDIA_PTX): Increment.
(GOMP_DIM_GANG, GOMP_DIM_WORKER, GOMP_DIM_VECTOR, GOMP_DIM_MAX,
GOMP_DIM_MASK): New.
(GOMP_LAUNCH_DIM
The default reduction expander was confusingly not placed with the other openacc
default hooks, it also indirected to a bunch of worker functions all doing
essentially the same thing, which obscured what was happening.
Reimplemented thusly.
nathan
2015-09-18 Nathan Sidwell
* omp-low.c
On 09/18/15 05:13, Bernd Schmidt wrote:
Is that so difficult though? See if nvptx ignores (let's say) intelmic arguments
in favour of the default and accepts nvptx ones.
I'm sorry, I think it is unreasonable to require support in this patch for
something that's not yet implemented in the rest
hanges are straightforward. The vector initializer didn't need to
create 2 new BBs -- just one for the initialization path.
nathan
2015-09-18 Nathan Sidwell
* omp-low.h (omp_reduction_init_op): Declare.
* omp-low.c (omp_reduction_init_op): New, broken out of ...
(omp_reduction_init): ... here. C
Jakub?
https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01287.html
On 09/17/15 10:40, Nathan Sidwell wrote:
Updated patch addressing your points. Some further comments though ...
+ while (GOMP_LAUNCH_PACK (GOMP_LAUNCH_END, 0, 0)
+ != (tag = va_arg (ap, unsigned)))
That's a som
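For readers unfamiliar with the kind of tagged varargs list the quoted loop walks, here is a minimal sketch. The names (TAG_*, PACK, parse_launch, struct launch) and the bit layout are assumptions made up for illustration; they are not the definitions in gomp-constants.h.

```c
#include <stdarg.h>

/* Hypothetical tag codes; TAG_END terminates the argument list.  */
enum { TAG_END = 0, TAG_DIM = 1, TAG_ASYNC = 2 };

/* Hypothetical packing: code in the top bits, device in the middle,
   operand in the low 16 bits.  */
#define PACK(code, dev, op) \
  (((unsigned) (code) << 28) | ((unsigned) (dev) << 16) | (unsigned) (op))
#define CODE(x) ((x) >> 28)
#define OP(x) ((x) & 0xffffu)

struct launch
{
  unsigned dims[3]; /* one size per axis */
  int async;        /* async queue, or -1 for none */
};

/* Read tags until TAG_END.  A TAG_DIM operand is a mask of axes; one
   unsigned size follows for each bit set.  A TAG_ASYNC operand is the
   queue number itself.  Unknown tags are skipped.  */
static void
parse_launch (struct launch *out, ...)
{
  va_list ap;
  unsigned tag;

  va_start (ap, out);
  while (PACK (TAG_END, 0, 0) != (tag = va_arg (ap, unsigned)))
    switch (CODE (tag))
      {
      case TAG_DIM:
        for (int i = 0; i < 3; i++)
          if (OP (tag) & (1u << i))
            out->dims[i] = va_arg (ap, unsigned);
        break;
      case TAG_ASYNC:
        out->async = (int) OP (tag);
        break;
      default:
        break;
      }
  va_end (ap);
}
```

A caller would pass, say, PACK (TAG_DIM, 0, 5) followed by two sizes (axes 0 and 2), then any further tags, then PACK (TAG_END, 0, 0). The appeal of this shape is that new tags can be added later without changing the function's signature.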
lly to omp-low.c in this patch, it ends up being more widely used.
ok for trunk?
nathan
2015-09-21 Nathan Sidwell
Cesar Philippidis
* omp-low.h (get_oacc_fn_attrib): Declare.
* omp-low.c (get_oacc_fn_attrib): New.
(oacc_xform_on_device): New.
(execute_oacc_transform)
On 09/21/15 12:23, Jason Merrill wrote:
On 09/21/2015 10:01 AM, Manuel López-Ibáñez wrote:
On 21 September 2015 at 15:46, Daniel Gutson
wrote:
FWIW, we could make this plugin in 2 weeks (we already have static
checkers as plugins for our customers). I understand Nathan that you
may have some d
On 09/21/15 16:30, Cesar Philippidis wrote:
On 09/21/2015 09:30 AM, Nathan Sidwell wrote:
+const pass_data pass_data_oacc_transform =
+{
+ GIMPLE_PASS, /* type */
+ "fold_oacc_transform", /* name */
Want to rename the tree dump file to oacc_xforms like I did in the
On 09/22/15 11:10, Thomas Schwinge wrote:
Hi!
On Fri, 18 Sep 2015 20:05:48 -0400, Nathan Sidwell wrote:
I've committed this patch to rework and simplify [...]
the reduction lowering hooks.
The current implementation [...]
[was] overcomplicated in a number of ways.
* omp-
On 09/21/15 16:39, Nathan Sidwell wrote:
On 09/21/15 16:30, Cesar Philippidis wrote:
On 09/21/2015 09:30 AM, Nathan Sidwell wrote:
+const pass_data pass_data_oacc_transform =
+{
+ GIMPLE_PASS, /* type */
+ "fold_oacc_transform", /* name */
Want to rename the tree dump file to o
r for
vector and worker loops.
2) Create a local private instance for all cases of reference var reductions,
not just those in vector & worker loops
3) Generate the sequences of reduction functions in one go, rather than multiple
scans of the reduction clauses.
nathan
2015-09-22 Na
On 09/23/15 06:59, Bernd Schmidt wrote:
On 09/22/2015 05:16 PM, Nathan Sidwell wrote:
+if (gimple_call_builtin_p (call, BUILT_IN_ACC_ON_DEVICE))
+ /* acc_on_device must be evaluated at compile time for
+ constant arguments. */
+ {
+oacc_xform_on_device (call
On 09/23/15 04:02, Thomas Schwinge wrote:
Hi!
On Tue, 22 Sep 2015 11:29:37 -0400, Nathan Sidwell wrote:
I've committed this patch, which simplifies the generation of openacc reduction
code.
Aside from the progression mentioned in
<http://news.gmane.org/find-root.php?message_id=%3C87
On 09/23/15 05:27, Thomas Schwinge wrote:
Hi Nathan!
On Mon, 17 Aug 2015 15:30:16 -0400, Nathan Sidwell wrote:
I've committed this patch to add a new pair of internal functions. These will
be used in implementing reductions.
They'll be emitted around reduction finalization, and
I've committed this reimplementation of the vector shuffling code. In preparing
a fix for the worker reductions (to use a lockless scheme), I wanted to check
VIEW_CONVERT_EXPR DTRT. Use of gimplify_assign also reduces the code size.
nathan
2015-09-23 Nathan Sidwell
* config/
On 09/23/15 10:16, Thomas Schwinge wrote:
Hi Nathan!
On Wed, 23 Sep 2015 08:40:51 -0400, Nathan Sidwell
wrote:
On 09/23/15 05:27, Thomas Schwinge wrote:
On Mon, 17 Aug 2015 15:30:16 -0400, Nathan Sidwell wrote:
I've committed this patch to add a new pair of internal functions. These
I've committed this patch to change all the OACC hooks to take a gcall * rather
than 'gimple'. Mainline has changed the type of 'gimple', and we know we're
passing a call anyway. Also updated the rescanning to be more straightforward.
nathan
2015-09-23 N
On 09/23/15 08:58, Bernd Schmidt wrote:
On 09/23/2015 02:14 PM, Nathan Sidwell wrote:
On 09/23/15 06:59, Bernd Schmidt wrote:
On 09/22/2015 05:16 PM, Nathan Sidwell wrote:
+if (gimple_call_builtin_p (call, BUILT_IN_ACC_ON_DEVICE))
+ /* acc_on_device must be evaluated at compile time
On 09/23/15 14:51, Bernd Schmidt wrote:
On 09/23/2015 08:42 PM, Nathan Sidwell wrote:
As I feared, builtin folding occurs in several places. In particular
its first call is very early on in the host compiler, which is far too
soon.
We have to defer folding until we know whether we're
On 09/24/15 03:21, Jakub Jelinek wrote:
So I'd like to ask Thomas/Nathan if they are ok with this stuff being on
the gomp-4_0-branch for now, once all the prerequisites it needs are on the
trunk, it can go into its own branch.
Let Thomas & I think about it. Now that the new launch API is app
I've committed this patch to reduce the number of worker reduction allocation
builtins. We now pass in the (constant) allocation size and alignment and
return a void ptr.
nathan
2015-09-24 Nathan Sidwell
* config/nvptx/nvptx.c (nvptx_expand_work_red_addr): Args 0 & 1
are
On 09/23/15 14:58, Nathan Sidwell wrote:
On 09/23/15 14:51, Bernd Schmidt wrote:
On 09/23/2015 08:42 PM, Nathan Sidwell wrote:
As I feared, builtin folding occurs in several places. In particular
its first call is very early on in the host compiler, which is far too
soon.
We have to defer
rtly.
nathan
2015-09-24 Nathan Sidwell
* config/nvptx/nvptx.c (struct builtin_description): Delete.
(nvptx_expand_shuffle_down): Rename to ...
(nvptx_expand_shuffle): ... here. add additional arg for type of
shuffle.
(nvptx_expand_work_red_addr): Rename to ...
(nvptx_expand_worker_
On 09/25/15 06:28, Bernd Schmidt wrote:
This is the c-c++-common/goacc/acc_on_device-2.c testcase. Is that expected to
be handled? If I change it to use __builtin_acc_on_device, I can step right into
Breakpoint 8, fold_call_stmt (stmt=0x70736e10, ignore=false) at
../../git/gcc/builtins.c:1
On 09/24/15 16:32, Cesar Philippidis wrote:
On 09/22/2015 08:29 AM, Nathan Sidwell wrote:
1) Don't have a fake gang reduction outside of worker & vector loops.
Deal with the receiver object directly. I.e. 'ref_to_res' need not be a
null pointer for vector and worker loops.
On 09/25/15 09:19, Bernd Schmidt wrote:
On 09/25/2015 03:03 PM, Bernd Schmidt wrote:
182 else if (acc_device_type (acc_dev->type) == acc_device_host)
(gdb) p acc_dev->type
$1 = OFFLOAD_TARGET_TYPE_HOST
(gdb) next
184 fn (hostaddrs);
It's not running the offloaded version, so the t
ry barriers, which would be worse). Why not just use the atomic cmp&swp
later to get an initial value. initval(OP) is more than likely to be a correct
guess for the first thread reaching here, so we save one memory access.
2015-09-28 Nathan Sidwell
* config/nvptx/nvptx.md (atomic
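The idea of seeding a compare-and-swap loop with initval(OP) rather than loading the current value first can be sketched in portable C11. This is only an illustration under assumptions: it uses <stdatomic.h> on the host and a '+' reduction, and reduce_add is a hypothetical name, not the nvptx expansion from the patch.

```c
#include <stdatomic.h>

/* Combine LOCAL into *TARGET with '+', without taking a lock.  */
static int
reduce_add (_Atomic int *target, int local)
{
  /* initval (+) is 0.  Guess that value instead of loading *TARGET;
     the guess is right for the first thread to arrive.  */
  int guess = 0;

  /* On failure the compare-exchange stores the value it actually saw
     into GUESS, so the retry recomputes the sum from fresh data.  */
  while (!atomic_compare_exchange_weak (target, &guess, guess + local))
    ;

  return guess + local; /* the value this thread stored */
}
```

If the guess is right, the first attempt succeeds and no separate load was needed. If it is wrong, the failed attempt itself serves as the load.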
I've committed this to remove the now no longer needed lock and unlock builtins
and related infrastructure.
nathan
2015-09-28 Nathan Sidwell
* target.def (GOACC_LOCK): Delete hook.
* doc/tm.texi.in (TARGET_GOACC_LOCK): Delete.
* doc/tm.texi: Rebuilt.
* targhooks.h (default_goacc
itted the attached. Thanks for the review.
nathan
2015-09-28 Nathan Sidwell
include/
* gomp-constants.h (GOMP_VERSION_NVIDIA_PTX): Increment.
(GOMP_DIM_GANG, GOMP_DIM_WORKER, GOMP_DIM_VECTOR, GOMP_DIM_MAX,
GOMP_DIM_MASK): New.
(GOMP_LAUNCH_DIM, GOMP_LAUNCH_ASYNC, GOMP_LAUNCH_WAIT)
than
2015-09-29 Nathan Sidwell
gcc/
* omp-low.c (oacc_xform_on_device): Delete.
(oacc_xform_dim): Return bool.
(execute_oacc_transform): Don't handle acc_on_device here. Adjust
rescan logic.
* builtins.c (expand_builtin_acc_on_device): Delete.
(expand_builtin): Do not call it.
(f
I've committed this to gomp4 branch. It renames the oacc_transform pass to
oacc_device_lower, in line with the (now withdrawn) patch for mainline.
I'm preparing a version of the pass for mainline with a different initial use
than acc_on_device folding.
nathan
2015-09-29 Nath
n the host-side libgomp piece.
Ok for trunk?
nathan
2015-09-29 Nathan Sidwell
gcc/
* builtins.c (expand_builtin_acc_on_device): Delete.
(expand_builtin): Don't call it.
(fold_builtin_1): Fold acc_on_device.
libgomp/
* oacc-init.c (acc_on_device): Force optimization level.
Inde
hook, but
currently does no validation. When the partitioned execution patch(es) are
ready, it will make sense for the backend to validate -- this is already working
on the branch, FWIW.
ok for trunk?
nathan
2015-09-29 Nathan Sidwell
Cesar Philippidis
gcc/
* config/nvp
The cuda library has provided cuGetErrorString since at least 5.5, along with
documentation of same. What's been missing until cuda 7.0 is a declaration in
the cuda header file.
I've merged this patch from the gomp4 branch to the nvptx libgomp plugin.
nathan
2015-09-29 Nath
On 09/29/15 15:52, Bernd Schmidt wrote:
Ok, although I really don't quite see the need to drop the expander.
Unnecessary code duplication. It's better to say something once in one place,
than try and say it twice in two different places.
nathan
On 09/30/15 04:07, Richard Biener wrote:
On Tue, Sep 29, 2015 at 8:21 PM, Nathan Sidwell wrote:
This patch folds acc_on_device as a regular builtin, but postponed until we
know which compiler we're in. As suggested by Bernd, we use the existing
builtin folding machinery.
Trunk is still
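To make "postponed until we know which compiler we're in" concrete, here is a small sketch of what such a fold could compute. The enum values, the function name fold_on_device and the exact pairing of codes are assumptions for illustration; the real device codes live in gomp-constants.h and the real fold operates on trees, not ints.

```c
#include <stdbool.h>

/* Hypothetical device codes, standing in for the GOMP_DEVICE_* values.  */
enum dev { DEV_NONE, DEV_HOST, DEV_NOT_HOST, DEV_ACCEL };

/* What acc_on_device (arg) could reduce to once it is known whether
   this is the host compiler or the accelerator compiler.  Each
   compiler answers "true" for exactly two codes.  */
static bool
fold_on_device (bool accel_compiler, int arg)
{
  int val_host = accel_compiler ? DEV_NOT_HOST : DEV_HOST;
  int val_dev = accel_compiler ? DEV_ACCEL : DEV_NONE;

  return (arg == val_host) | (arg == val_dev);
}
```

With a constant argument the whole expression folds to a constant. Folding before the host/accelerator split is known would bake the wrong answer into code that is later compiled for the other side.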
On 09/30/15 08:37, Matthias Klose wrote:
On 25.08.2015 15:29, Nathan Sidwell wrote:
Jakub,
This patch changes the launch API for openacc parallels.
this broke the jit build.
The following patch fixes the build for me. Ok to commit?
Matthias
2015-09-30 Matthias Klose
* jit
On 09/30/15 08:46, Richard Biener wrote:
I'll add a comment to builtins.c
(not that I expect anyone sees it ;))
Put one instance at the default: label in expand_builtin?
nathan
ooks ok to me.
For avoidance of doubt, is this approval, or 'LGTM, but needs Jakub's approval'?
nathan
2015-09-30 Nathan Sidwell
Cesar Philippidis
gcc/
* config/nvptx/nvptx.c (nvptx_goacc_validate_dims): New.
(TARGET_GOACC_VALIDATE_DIMS): Override.
* target.def (TA
On 09/30/15 11:52, Nathan Sidwell wrote:
For avoidance of doubt, is this approval, or 'LGTM, but needs Jakub's approval'?
Just noticed unnecessary white space change that patch contained. Updated here.
nathan
2015-09-30 Nathan Sidwell
Cesar Philippidis
gcc/
On 09/30/15 08:46, Richard Biener wrote:
Please don't add any new GENERIC based builtin folders. Instead add to
gimple-fold.c:gimple_fold_builtin
Is this patch ok?
nathan
2015-09-30 Nathan Sidwell
* builtins.c: Don't include gomp-constants.h.
(fold_builtin_1): Don't fol
f the changes to link_ptx were done by Bernd a while back.
No change to the PTX ABI version number, as that just got incremented last week
with the launch API change -- it's in a state of flux right now.
nathan
2015-09-30 Nathan Sidwell
gcc/
* config/nvptx/mkoffload.c (process): Chan
On 10/01/15 04:14, Thomas Schwinge wrote:
Hi Nathan!
On Mon, 28 Sep 2015 11:56:09 -0400, Nathan Sidwell wrote:
I've committed this to remove the now no longer needed lock and unlock builtins
and related infrastructure.
If I understand correctly, it is an implementation detail of the
arg0,
build_int_cst (integer_type_node, val_host));
gsi_insert_before (gsi, g);
...
Like this?
nathan
2015-10-01 Nathan Sidwell
* builtins.c: Don't include gomp-constants.h.
(fold_builtin_1): Don't fold acc_on_device here.
* gimple-fold.c: Include g
I've applied this to gomp4 to apply some changes to these areas that occurred on
merging to trunk.
nathan
2015-10-01 Nathan Sidwell
* config/nvptx/nvptx.c (nvptx_validate_dims): Rename to ...
(nvptx_goacc_validate_dims): ... here.
(TARGET_GOACC_VALIDATE_DIMS): Update.
* targe
On 10/01/15 08:46, Richard Biener wrote:
On Thu, Oct 1, 2015 at 2:33 PM, Nathan Sidwell wrote:
use TREE_TYPE (arg0) for the integer cst.
Otherwise looks good to me.
thanks,
fixed up and applied (also noticed a copy & paste malfunction setting the
location)
nathan
2015-10-01 Na
I've applied this version of the acc_on_device folding to gomp4.
See https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00074.html for the trunk
discussion.
nathan
2015-10-01 Nathan Sidwell
* builtins.c: Don't include gomp-constants.h.
(fold_builtin_1): Don't fold acc
On 10/01/15 13:00, Andrew MacLeod wrote:
btw, not that it's necessarily important, but I'm about to submit the include
reduction patches today, and it turns out this line is the first use of
anything from cgraph.h in builtins.c.
So if this is "the way" of doing the test, be aware it adds a dep
On 10/06/15 02:12, Segher Boessenkool wrote:
On Thu, Oct 01, 2015 at 08:33:07AM -0400, Nathan Sidwell wrote:
2015-10-01 Nathan Sidwell
* builtins.c: Don't include gomp-constants.h.
(fold_builtin_1): Don't fold acc_on_device here.
* gimple-fold.c: In
I've committed this obvious fix. Sorry for the breakage.
nathan
2015-10-06 Nathan Sidwell
PR 67861
* gimple-fold.c (gimple_fold_builtin): Add break after
BUILT_IN_PRINTF_CHK, BUILT_IN_VPRINTF_CHK folding.
Index: gimple-f
I've committed this to trunk. The C++ ABI now returns a pointer to the
passed-in artificial arg that points to the return area. Consequently,
return-in-mem and type_mode(return_type) == VOIDmode are not tautologies.
nathan
2015-10-08 Nathan Sidwell
* config/nvptx/nvptx.h (s
On 10/08/15 12:39, Thomas Schwinge wrote:
Hi!
Some bits extracted out of gomp-4_0-branch, and some other bits
rewritten; here is a patch to support OpenACC Combined Directives in C,
C++. (The Fortran front end already does support these.)
As far as I know, Jakub is not available at this time,
On 10/09/15 09:26, Thomas Schwinge wrote:
Hi!
You mean the cp_parser_oacc_loop and cp_parser_oacc_kernels_parallel
functions need documentation? I agree it's a bit terse, but documenting
these by just listing the accepted parsing tokens "# pragma acc loop"
etc., followed by the *_CLAUSE_MASKs
I've applied this to gomp4 branch.
1) ports the break fix in gimple-fold from trunk
2) fixes missing tab in ptx output.
nathan
2015-10-09 Nathan Sidwell
* config/nvptx/nvptx.c (nvptx_init_axis_predicate): Fix output
formatting.
PR 67861
* gimple-fold.c (gimple_fold_builtin): Add
the
erroneous case for the moment.
If anyone's wondering, a patch I'm working on blew up on these two cases
because it tried to manipulate the loop and then discovered it wasn't in an
offloaded function.
nathan
2015-10-10 Nathan Sidwell
* c-c++-common/goacc-gomp/nestin
On 10/10/15 11:00, Nathan Sidwell wrote:
I've committed this to gomp4 branch. Both these tests are trying an 'acc loop'
outside of an offload region. That's an error.
Missed that goacc/nesting-1 had 2 bogus loops.
nathan
2015-10-11 Nathan Sidwell
* c-c++-comm
On 10/09/15 09:59, Thomas Schwinge wrote:
It's a string describing the pragma as parsed thus far. Again, not
documenting that as well as our usage of it is totally "standard", see
OpenMP's cp_parser_omp_parallel, cp_parser_omp_for, and many more.
Ok, I'm not going to hold this to a higher th
I've committed this to gomp4
1) move IFN_UNIQUE constants to the IFN_UNIQUE function definition
2) Update IFN_GOACC_REDUCTION_* comments to match the renamed oacc_device_lower
pass.
nathan
2015-10-11 Nathan Sidwell
* internal-fn.def (IFN_UNIQUE_UNSPEC, IFN_UNIQUE_OACC
and optimize
(b) the implementation will be device_type friendly, as device-specific choices
will all have been moved to the target compiler.
nathan
2015-10-12 Nathan Sidwell
* omp-low.c (expand_omp_for_static_nochunk): Remove OpenACC
pieces.
(expand_omp_for_static_chunk): Likewise,
(struct
I've committed this next patch in my series to move loop partitioning decisions
to the target compiler.
It introduces 2 more IFN_UNIQUE cases, marking the head and tail sequences of an
openACC loop. These are added around the reduction and fork/join regions. In
the oacc_device_lower pass we
cing
the dummy axis argument of the appropriate builtins with the specific chosen axis.
The next step is to iterate over the body doing the same for the loop
abstraction builtin.
nathan
2015-10-14 Nathan Sidwell
* omp-low.c (struct oacc_loop): Add more fields.
(enum oacc_loop_fl
I've committed this to gomp4 branch. It removes some now unreachable code and
removes the now bogus description about OpenACC.
nathan
2015-10-14 Nathan Sidwell
* omp-low.c (lower_reduction_clauses): Correct comment, remove
unreachable code.
Index: gcc/omp-
block(s) just after the header marker looking for these
functions, and set the determined partitioning mask and chunking.
The next patch will complete this transition.
nathan
2015-10-15 Nathan Sidwell
* omp-low.c (struct oacc_loop): Add chunk_size and head_end
fields.
(extract_omp_for_data)
I've committed this to gomp4 branch. It fixes the routine-7 regression I
caused when reworking the reduction machinery.
nathan
2015-10-15 Nathan Sidwell
* omp-low.c (lower_oacc_reductions): Check outer context is a
target before lookup.
Index: gcc/omp-
I've applied this to move the execute_oacc_device_lower function later in the
file. I'll shortly be changing it to explicitly call a default oacc handler,
and reordering makes the diff confusing (diff chooses to make this diff confusing
enough).
nathan
2015-10-15 Nathan Sidwell
(b) I'm going to shortly be emitting diagnostics from the device compiler,
and we don't want to only deliver ones from the first offloaded function.
nathan
2015-10-16 Nathan Sidwell
* omp-low.c (build_outer_var_ref): Just check for openacc function
attrib.
(pass_oacc_device_lowe
ather unwieldy to work with, and check for what will
become later-checked diagnostics, as well as earlier ones. So this patch simply
breaks the test cases apart, to reduce this interaction.
Committed to gomp4 branch.
nathan
2015-10-18 Nathan Sidwell
* c-c++-common/goacc/loop-2.c: Break apa
series I'm working on (which exposed this problem).
I've adjusted the testcase to specify a partitioning, and marked the test as
xfailing.
nathan
2015-10-18 Nathan Sidwell
* gfortran.dg/goacc/reduction-2.f95: Force loop partitioning and
xfail.
Index: gcc/testsuite/gfortran.dg/goacc
achinery.
The changes to the testcases update the expected diagnostic text, and
expect more information, such as indicating between which two loops conflicts
are occurring.
nathan
2015-10-19 Nathan Sidwell
gcc/
* omp-low.c (struct omp_region): Remove gwv_this field.
(struct omp_c
'seq'
loop.
nathan
2015-10-19 Nathan Sidwell
gcc/
* omp-low.c (oacc_loop_auto_partitions): New.
(oacc_loop_partition): Call it.
gcc/testsuite/
* gfortran.dg/goacc/routine-4.f90: Add diagnostic.
* gfortran.dg/goacc/routine-5.f90: Add diagnostic.
* c-c++-common/goacc-gomp/nest
I've committed this to gomp4.
1) small cleanup combining the bodies of two identical conditionals.
2) replace and move the OpenACC thread numbering expanders to be nearer the now
sole user.
nathan
2015-10-19 Nathan Sidwell
* omp-low.c (scan_omp_for): Combine OpenACC condit
r the latter we want this to
expand to a regular loop iterator.
Applied to gomp4 branch.
nathan
2015-10-20 Nathan Sidwell
gcc/
* omp-low.c (expand_oacc_for): Use -1 for unspecified static
chunking. Remove unnecessary gimple forcing.
(oacc_xform_loop): Adjust chunk size calculation.
Another small cleanup I noticed. We can use %qD to print a decl name.
Applied to gomp4 branch.
nathan
2015-10-20 Nathan Sidwell
* lto-cgraph.c (input_overwrite_node): Cleanup openacc diagnostic
emission.
Index: gcc/lto-cgraph.c
On 10/20/15 16:20, Ilya Verbin wrote:
On Tue, Oct 20, 2015 at 15:54:45 -0400, Nathan Sidwell wrote:
There might be a situation when some func or var is lost during regular LTO,
even if flag_openacc is present. In this case "missing OpenACC ..." message
would be wrong. And if flag_
gnostic on such a large vector length. But that's a check for a
different testcase.
nathan
2015-10-20 Nathan Sidwell
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Set sane
vector_length.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise.
Index: libgomp/testsuite/l
they
are immediately inside one of:
1) another openacc loop
2) an openacc offload region
3) an openacc routine
The broken tests are amended with the now expected diagnostic.
tested on x86_64-linux-gnu with nvptx accelerator.
ok for trunk?
nathan
2015-10-21 Nathan Sidwell
gcc/
* omp-low.c
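The three permitted placements listed above amount to a check on the immediately enclosing context. A minimal sketch, assuming a made-up context representation (enum ctx_kind, struct ctx and acc_loop_placement_ok are hypothetical names, not the omp-low.c data structures):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical kinds of enclosing context.  */
enum ctx_kind
{
  CTX_HOST_CODE,   /* ordinary code, not offloaded */
  CTX_ACC_LOOP,    /* another OpenACC loop */
  CTX_ACC_OFFLOAD, /* an OpenACC offload region */
  CTX_ACC_ROUTINE  /* an OpenACC routine */
};

struct ctx
{
  enum ctx_kind kind;
};

/* Return true if an OpenACC loop may appear immediately inside
   ENCLOSING; NULL means there is no enclosing context at all.  */
static bool
acc_loop_placement_ok (const struct ctx *enclosing)
{
  if (enclosing == NULL)
    return false;

  switch (enclosing->kind)
    {
    case CTX_ACC_LOOP:
    case CTX_ACC_OFFLOAD:
    case CTX_ACC_ROUTINE:
      return true;
    default:
      return false;
    }
}
```

A front end or lowering pass would emit the diagnostic when this returns false. The sketch looks only at the immediate parent, matching the "immediately inside" wording.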
On 10/21/15 12:06, Bernd Schmidt wrote:
Were they just compile tests?
Yes, some of the tests already expected errors, but missed some. I think one
test didn't expect an error, but is a clearly bogus test.
nathan
This is the gomp4-branch variant of the loop nesting patch I just committed to
trunk. The gomp4 branch had some checking, but
a) it didn't catch all erroneous cases
b) gave an ambiguous error, by not mentioning 'OpenACC'
committed to gomp4
nathan
2015-10-21 Nathan Sidwell
On 10/21/15 13:33, Ilya Verbin wrote:
Hi!
This happens because the .gnu.offload_{funcs,vars} sections in
crtoffload{begin,end}.o now don't have the WRITE flag, but the same sections
produced by omp_finish_file have it. When the linker joins writable + nonwritable
sections from several objects, it insert
I'll be posting a patch series for trunk, which implements the core of the
OpenACC execution model. This is split into the following patches:
01-trunk-unique.patch
Internal function with a 'uniqueness' property
02-trunk-nvptx-partition.patch
NVPTX backend patch set for partitioned execution
l fns, all with the unique property, as the latter would
need (at least) a range check in gimple_call_internal_unique_p rather than a
simple equality.
Jakub, IYR I originally had IFN_FORK and IFN_JOIN as such distinct internal fns.
This replaces that scheme.
ok?
nathan
2015-10-20 N
he size of that dimension is 1.
The default implementation of the hook only cares if the oacc_fork and
oacc_join RTL expanders exist (and they don't on the host compiler).
nathan
2015-10-20 Nathan Sidwell
* target.def (fork_join): New GOACC hook.
* targhooks.h (default_
fore the fork and then fill
from that buffer just after the fork.
For the worker axis, explicit sync instructions are needed before and after
accessing the shared memory state.
Bernd, any comments?
nathan
2015-10-20 Nathan Sidwell
* config/nvptx/nvptx.h (struct machine_function): Add
axis
This patch implements changes to the C parser to deal with the 'gang',
'worker', 'vector', 'seq' and 'auto' clauses on an OpenACC loop directive.
The first 3 can take a numeric argument, which is used within a kernels offload
region and the gang clause can take an additional 'static' argument,
the clause name.
nathan
2015-10-20 Cesar Philippidis
Thomas Schwinge
James Norris
Joseph Myers
Julian Brown
Nathan Sidwell
* parser.c (cp_parser_omp_clause_name): Add auto, gang, seq,
vector, worker.
(cp_parser_oacc_simple_clause): New.
(cp_parse
ading and then copying it to the device. Thus at the
end of the region, any slots that weren't used have a sensible initial value
which will not destroy the reduction result.
This code should be short lived ...
nathan
2015-10-20 Nathan Sidwell
* omp-low.c (o
atch introduces 4 variants of the IFN_UNIQUE function along with a new
IFN_LOOP function. I see I also included IFN_OACC_DIM_POS and IFN_OACC_DIM_SIZE
which provide the size of a compute axis and a position along it. Those are
actually used in the next patch and not here.
nathan
2015-10-20
This patch is obvious, but included for completeness. We always want to run the
device lowering pass (when openacc is enabled), in order to delete the marker
and loop functions that should never be seen after this point.
nathan
2015-10-20 Nathan Sidwell
* omp-low.c
Here's another obvious patch. The ptx plugin no longer needs to barf on gang or
worker dimensions of non-unity.
nathan
2015-10-20 Nathan Sidwell
* plugin/plugin-nvptx.c (nvptx_exec): Remove check on compute
dimensions.
Index: libgomp/plugin/plugin-nv
This patch has some new execution tests, verifying loop partitioning is behaving
as expected.
There are more execution tests on the gomp4 branch, but many of them use
reductions. We'll merge those once reductions are merged.
nathan
2015-10-20 Nathan Sidwell
* testsuite/libgomp.oac
7.
nathan
2015-10-20 Nathan Sidwell
* omp-low.c: Include gimple-pretty-print.h.
(struct oacc_loop): New.
(oacc_thread_numbers): New.
(oacc_xform_loop): New.
(new_oacc_loop_raw, new_oacc_loop_outer, new_oacc_loop,
new_oacc_loop_routine, finish_oacc_loop, free_oacc_loop): New,
On 10/21/15 16:14, Ilya Verbin wrote:
<11-trunk-tests.patch>
Does the testcase with offload IR appear here accidentally?
D'oh! yup, fixed.
nathan
2015-10-20 Nathan Sidwell
* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: New.
* testsuite/libgomp.oacc-c-c++-common/loop
On 10/22/15 05:23, Jakub Jelinek wrote:
On Wed, Oct 21, 2015 at 03:42:26PM -0400, Nathan Sidwell wrote:
+/* Flags for an OpenACC loop. */
+
+enum oacc_loop_flags
+ {
Weird formatting. I see either
Blame emacs (I thought it was configured for GNU formatting ...)
+ expr = build2
On 10/22/15 05:31, Jakub Jelinek wrote:
On Wed, Oct 21, 2015 at 03:49:08PM -0400, Nathan Sidwell wrote:
So, how do you expand the OACC loops on non-PTX devices (host, or say
XeonPhi)? Do you drop the IFNs and replace stuff with normal loops?
On a non ptx target (canonical example being the
On 10/22/15 04:07, Richard Biener wrote:
On Thu, Oct 22, 2015 at 10:04 AM, Jakub Jelinek wrote:
Do you have to scan the whole bb? E.g. don't or should not those
unique IFNs force end of bb?
Yeah, please make them either end or start a BB so we have to check
at most a single stmt. ECF_RETU