Re: [bfin] Avoid genrecog warning

2013-11-22 Thread Bernd Schmidt
On 11/22/2013 12:50 PM, Richard Sandiford wrote: > Bernd Schmidt writes: >> On 11/22/2013 09:15 AM, Richard Sandiford wrote: >>> +extern int push_multiple_operation_p (rtx); >>> +extern int pop_multiple_operation_p (rtx); >> >> I was once told that function

Re: [PATCH][1/3] Re-submission of Altera Nios II port, gcc parts

2013-11-22 Thread Bernd Schmidt
On 11/16/2013 11:01 AM, Chung-Lin Tang wrote: > My response to the various issues you raised are below. The new patch > has been re-tested. Please see if you can approve for committing now. I agree with all the comments Richard has been making, so I'll just add a few other points. > If you don't

Re: wide-int, bfin

2013-11-25 Thread Bernd Schmidt
On 11/23/2013 08:20 PM, Mike Stump wrote: > Richi has asked that we break the wide-int patch so that the individual port > and front end maintainers can review their parts without having to go through > the entire patch. This patch covers the bfin port. > > Ok? I haven't seen any updates on the

Re: [gomp4] Offload option handling

2014-10-06 Thread Bernd Schmidt
On 10/03/2014 04:31 PM, Andrey Turetskiy wrote: I've applied your option patch on our offload branch (w/o '-ftarget-options' switch yet) and it seems to be working fine. However the patch looks a bit unfinished: @@ -440,7 +554,11 @@ access_check (const char *name, int mode static char* pre

Re: [PATCH 4/n] OpenMP 4.0 offloading infrastructure: lto-wrapper

2014-10-09 Thread Bernd Schmidt
On 10/09/2014 02:07 PM, Ilya Verbin wrote: +#ifndef ACCEL_COMPILER /* We need to check standard_exec_prefix/just_machine_suffix/specs for any override of as, ld and libraries. */ specs_file = (char *) alloca (strlen (standard_exec_prefix) + strlen (just_mach

Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2014-10-13 Thread Bernd Schmidt
On 10/13/2014 12:33 PM, Ilya Verbin wrote: On 13 Oct 12:19, Jakub Jelinek wrote: But I'd like to understand why is this one needed. Why should the compilers care? Aggregates layout and alignment of integral/floating types must match between host and offload compilers, sure, but isn't that somet

Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2014-10-14 Thread Bernd Schmidt
On 10/14/2014 09:25 AM, Richard Biener wrote: On Mon, 13 Oct 2014, Bernd Schmidt wrote: On 10/13/2014 12:33 PM, Ilya Verbin wrote: On 13 Oct 12:19, Jakub Jelinek wrote: But I'd like to understand why is this one needed. Why should the compilers care? Aggregates layout and alignme

The nvptx port [0/11+]

2014-10-20 Thread Bernd Schmidt
This is a patch kit that adds the nvptx port to gcc. It contains preliminary patches to add needed functionality, the target files, and one somewhat optional patch with additional target tools. There'll be more patch series, one for the testsuite, and one to make the offload functionality work

The nvptx port [2/11+] No register allocation

2014-10-20 Thread Bernd Schmidt
Since it's a virtual target, I've chosen not to run register allocation. This is one of the patches necessary to make that work, it primarily adds a target hook to disable it and fixes some of the fallout. Bernd

The nvptx port [1/11+] indirect jumps

2014-10-20 Thread Bernd Schmidt
ptx doesn't have indirect jumps, so CODE_FOR_indirect_jump may not be defined. Add a sorry. Bernd gcc/ * optabs.c (emit_indirect_jump): Test HAVE_indirect_jump and emit a sorry if necessary. Index: gcc/optabs.c ===
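
The pattern described in this message can be illustrated with a small standalone sketch. It is not the actual optabs.c change: `HAVE_indirect_jump` is modeled here as an ordinary macro that a target may or may not define, and `report_sorry`, `sorry_count` and `emit_indirect_jump_sketch` are hypothetical names standing in for GCC's diagnostic machinery and the real emitter.

```c
#include <stdio.h>

/* Assumption: a target with indirect jumps would define
   HAVE_indirect_jump; a ptx-like target leaves it undefined.  */

/* Hypothetical counter so callers can see that a diagnostic fired.  */
int sorry_count = 0;

/* Hypothetical stand-in for GCC's sorry () diagnostic.  */
void report_sorry(const char *msg)
{
    fprintf(stderr, "sorry, unimplemented: %s\n", msg);
    sorry_count++;
}

/* Returns 1 if a jump would have been emitted, 0 if the target
   cannot express one and a diagnostic was issued instead.  */
int emit_indirect_jump_sketch(void)
{
#ifdef HAVE_indirect_jump
    /* A real port would generate the jump instruction here.  */
    return 1;
#else
    report_sorry("indirect jumps are not available on this target");
    return 0;
#endif
}
```

Compiled without `-DHAVE_indirect_jump`, the function reports the missing feature instead of referring to a pattern that does not exist, which is the behaviour the ChangeLog entry describes.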

The nvptx port [2/11+] No register allocation

2014-10-20 Thread Bernd Schmidt
Since it's a virtual target, I've chosen not to run register allocation. This is one of the patches necessary to make that work, it primarily adds a target hook to disable it and fixes some of the fallout. Bernd gcc/ * target.def (no_register_allocation): New data hook. * doc/tm.texi.in: A

Re: The nvptx port [3/11+] Struct returns

2014-10-20 Thread Bernd Schmidt
Even when returning a structure by passing an invisible reference, gcc still likes to set the return register to the address of the struct. This is undesirable on ptx where things like the return register have to be declared, and the function really returns void at ptx level. I've added a targe

The nvptx port [4/11+] Post-RA pipeline

2014-10-20 Thread Bernd Schmidt
This stops most of the post-regalloc passes from being run if the target doesn't want register allocation. I'd previously moved them all out of postreload to the toplevel, but Jakub (I think) pointed out that the idea is not to run them to avoid crashes if reload fails e.g. for an invalid asm. So I'

The nvptx port [5/11+] Variable declarations

2014-10-20 Thread Bernd Schmidt
ptx assembly follows rather different rules than what's typical elsewhere. We need a new hook to add a " };" string when we are finished outputting a variable with an initializer. Bernd gcc/ * target.def (decl_end): New hook. * varasm.c (assemble_variable_contents, assemble_constant_conten
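
Why an "end of declaration" step is needed can be seen in a rough sketch. Everything below is hypothetical and only loosely modeled on ptx syntax: `struct asm_out`, `begin_decl`, `emit_element` and `end_decl` are illustrative names, not GCC hooks. The point is only that the closing " };" has to be written after the last initializer element, which a conventional assembler output routine has no reason to do.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical output buffer standing in for the assembler file.  */
struct asm_out {
    char text[256];
    size_t len;
    int elements;   /* initializer elements written so far */
};

/* Append S to the buffer if it fits; silently drop it otherwise.  */
static void out_append(struct asm_out *o, const char *s)
{
    size_t n = strlen(s);
    if (o->len + n < sizeof o->text) {
        memcpy(o->text + o->len, s, n + 1);
        o->len += n;
    }
}

/* Start a declaration with an initializer: this opens the brace.  */
void begin_decl(struct asm_out *o, const char *name, int count)
{
    char head[128];
    o->len = 0;
    o->text[0] = '\0';
    o->elements = 0;
    snprintf(head, sizeof head, ".global .u32 %s[%d] = {", name, count);
    out_append(o, head);
}

/* Write one initializer element, comma-separated after the first.  */
void emit_element(struct asm_out *o, unsigned value)
{
    char item[32];
    snprintf(item, sizeof item, "%s %u", o->elements ? "," : "", value);
    out_append(o, item);
    o->elements++;
}

/* The step a "decl end" hook would perform: close the initializer.  */
void end_decl(struct asm_out *o)
{
    out_append(o, " };");
}
```

Calling `begin_decl`, three `emit_element` calls and `end_decl` in that order yields a single self-contained declaration; leaving out the last call leaves the brace open.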

The nvptx port [6/11+] Pseudo call args

2014-10-20 Thread Bernd Schmidt
On ptx, we'll be using pseudos to pass function args as well, and there's one assert that needs to be toned down to make that work. Bernd gcc/ * expr.c (use_reg_mode): Just return for pseudo registers. Index: gcc/expr.

The nvptx port [7/11+] Inform the port about call arguments

2014-10-20 Thread Bernd Schmidt
In ptx assembly we need to decorate call insns with the arguments that are being passed. We also need to know the exact function type. This is kind of hard to do with the existing infrastructure since things like function_arg are called at other times rather than just when emitting a call, so t

The nvptx port [8/11+] Write undefined decls.

2014-10-20 Thread Bernd Schmidt
ptx assembly requires that declarations are written for undefined variables. This adds that functionality. Bernd gcc/ * target.def (assemble_undefined_decl): New hooks. * hooks.c (hook_void_FILEptr_constcharptr_const_tree): New function. * hooks.h (hook_void_FILEptr_constcharptr_const_tree

The nvptx port [9/11+] Epilogues

2014-10-20 Thread Bernd Schmidt
We skip the late compilation passes on ptx, but there's one piece we do need - fixing up the function so that we get return insns in the right places. This patch just makes thread_prologue_and_epilogue_insns callable from the reorg pass. Bernd gcc/ * function.c (thread_prologue_and_epilogue

The nvptx port [10/11+] Target files

2014-10-20 Thread Bernd Schmidt
@@ +/* NVPTX common hooks. + Copyright (C) 2014 Free Software Foundation, Inc. + Contributed by Bernd Schmidt + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software

The nvptx port [11/11] More tools.

2014-10-20 Thread Bernd Schmidt
tm_file="${tm_file} newlib-stdint.h" Index: git/gcc/config/nvptx/nvptx-as.c === --- /dev/null +++ git/gcc/config/nvptx/nvptx-as.c @@ -0,0 +1,961 @@ +/* An "assembler" for ptx. + Copyright (C) 2014 Free Software Foundation, Inc. + Contributed

Re: The nvptx port [0/11+]

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 10:18 AM, Richard Biener wrote: So with this restriction I wonder why it didn't make sense to go the HSA "backend" route emitting PTX from a GIMPLE SSA pass. This would have avoided the LTO dance as well ... Quite simple - there isn't an established way to do this. If I'd known

Re: The nvptx port [0/11+]

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 10:42 AM, Jakub Jelinek wrote: On Mon, Oct 20, 2014 at 04:17:56PM +0200, Bernd Schmidt wrote: * Can't emit initializers referring to their variable's address since you can't write forward declarations for variables. Can't that be handled by emitting the

[0/6] nvptx testsuite patches

2014-10-21 Thread Bernd Schmidt
This series modifies a large number of tests in order to clean up testsuite results on nvptx. The goal here was never really to get an entirely clean run - the target is just too different from conventional ones - but to be able to test the compiler sufficiently to be sure that it's in good sha

[1/6] nvptx testsuite patches: alloca

2014-10-21 Thread Bernd Schmidt
This deals with uses of alloca in the testsuite. Some tests require it outright, others only at -O0, and others require it implicitly by requiring an alignment for stack variables bigger than the target's STACK_BOUNDARY. For the latter I've added explicit xfails. Bernd gcc/testsuite/ * lib

[2/6] nvptx testsuite patches: typed assembly

2014-10-21 Thread Bernd Schmidt
Since everything in ptx assembly is typed, K&R C is problematic. There are a number of testcases that call functions with the wrong number of arguments, or arguments of the wrong type. I've added a new feature, untyped_assembly, which these tests now require. I've also used this for tests using

[3/6] nvptx testsuite patches: stdio

2014-10-21 Thread Bernd Schmidt
Some tests use stdio functions which are unavailable with the cut-down newlib I'm using for ptx testing. I'm somewhat uncertain what to do with these; they are by no means the only unavailable library functions the testsuite tries to use (signal is another example). Here's a patch which deals wi

[4/6] nvptx testsuite patches: xfails and skips

2014-10-21 Thread Bernd Schmidt
Some things don't fit into nice categories that apply to a larger set of tests, or which are somewhat random like ptxas tool failures. For these I've added xfails and skips. Bernd gcc/testsuite/ * lib/target-supports.exp (check_effective_target_trampolines, check_profiling_available, check

[5/6] nvptx testsuite patches: jumps and labels

2014-10-21 Thread Bernd Schmidt
This deals with tests requiring indirect jumps (including tests using setjmp), label values, and nonlocal goto. A subset of these tests uses the NO_LABEL_VALUES macro, but it's not consistent across the testsuite. The feature test I wrote tests whether that is defined and returns false for lab

[6/7] Random tweaks

2014-10-21 Thread Bernd Schmidt
This tweaks a few tests so that we don't have to skip them. This is mostly concerned with declaring main properly, or changing other declarations where the test does not seem to rely on the type mismatches. I've also included one example of changing a function name to not be "call", ptxas see

[7/7] nvptx testsuite patches: Return addresses

2014-10-21 Thread Bernd Schmidt
This tests for availability of return addresses in a number of tests. Bernd gcc/testsuite/ * lib/target-supports.exp (check_effective_target_return_address): New function. * gcc.c-torture/execute/20010122-1.c: Require return_address. * gcc.c-torture/execute/20030323-1.c: Likewise. * gcc.c

Re: [4/6] nvptx testsuite patches: xfails and skips

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 05:27 PM, Jeff Law wrote: More ptx tooling failures than I'd expect. I'll leave it up to you whether or not to push on NVidia to fix some of those failures. The timeouts seem particularly troublesome. All I can say is that we've reported them. Bernd

Re: [2/6] nvptx testsuite patches: typed assembly

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 05:16 PM, Jeff Law wrote: On 10/21/14 14:15, Bernd Schmidt wrote: Since everything in ptx assembly is typed, K&R C is problematic. There are a number of testcases that call functions with the wrong number of arguments, or arguments of the wrong type. I've added a ne

Re: The nvptx port [1/11+] indirect jumps

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 08:26 PM, Jeff Law wrote: * optabs.c (emit_indirect_jump): Test HAVE_indirect_jump and emit a sorry if necessary. So doesn't this imply no hot-cold partitioning since we use indirect jumps to get across the partition? Similarly doesn't this imply other missing features (se

Re: [3/6] nvptx testsuite patches: stdio

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 09:01 PM, Mike Stump wrote: On Oct 21, 2014, at 7:17 AM, Bernd Schmidt wrote: Some tests use stdio functions which are unavailable with the cut-down newlib I'm using for ptx testing. I'm somewhat uncertain what to do with these; they are by no means the only unavailab

Re: The nvptx port [7/11+] Inform the port about call arguments

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 11:11 PM, Jeff Law wrote: On 10/20/14 14:29, Bernd Schmidt wrote: In ptx assembly we need to decorate call insns with the arguments that are being passed. We also need to know the exact function type. This is kind of hard to do with the existing infrastructure since things like

Re: The nvptx port [1/11+] indirect jumps

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 11:30 PM, Jakub Jelinek wrote: At least for OpenMP, the best would be if the #pragma omp target regions and/or #pragma omp declare target functions contain anything a particular offloading accelerator can't handle, instead of failing the whole compilation perhaps just emit some at l

Avoid calls to realloc for nvptx

2014-10-21 Thread Bernd Schmidt
This is a followup patch for the nvptx port. Since malloc and free are magically provided by the ptx environment, but realloc is missing, it's nontrivial to provide an implementation for it. The Fortran frontend likes to generate calls to realloc, but in one case it seems like we can compute th
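
The difficulty mentioned here can be sketched in plain C. With only malloc and free available, a replacement for realloc cannot find out how large the old block was, so the old size has to come from the caller. `realloc_known_size` below is a hypothetical helper written for illustration, not code from the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: realloc built from malloc and free only.
   Unlike realloc, the caller must supply the old size, because
   malloc and free alone give no way to query it.  */
void *realloc_known_size(void *old, size_t old_size, size_t new_size)
{
    void *fresh;

    if (new_size == 0) {
        free(old);
        return NULL;
    }
    fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;            /* the old block is left untouched */
    if (old != NULL) {
        /* Copy only as many bytes as both blocks can hold.  */
        memcpy(fresh, old, old_size < new_size ? old_size : new_size);
        free(old);
    }
    return fresh;
}
```

This only works at call sites where the old size is already known or can be computed, which is why it cannot serve as a general drop-in for realloc.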

Re: The nvptx port [7/11+] Inform the port about call arguments

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 11:53 PM, Jeff Law wrote: So, in the end I'm torn. I don't like adding new hooks when they're not needed, but I have some reservations about relying on the order of stuff in CALL_INSN_FUNCTION_USAGE and I worry a bit that you might end up with stuff other than arguments on that li

Re: The nvptx port [8/11+] Write undefined decls.

2014-10-21 Thread Bernd Schmidt
On 10/22/2014 12:05 AM, Jeff Law wrote: On 10/20/14 14:30, Bernd Schmidt wrote: ptx assembly requires that declarations are written for undefined variables. This adds that functionality. Does this need to happen at the use site, or can it be deferred? This is independent of use sites. The

Re: The nvptx port [11/11] More tools.

2014-10-22 Thread Bernd Schmidt
On 10/22/2014 10:31 PM, Jeff Law wrote: These tools currently require GNU extensions - something I probably ought to fix if we decide to add them to the gcc build itself. Would these be more appropriate in binutils? I don't think so, given that we don't need any piece of regular binutils. The

Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2014-10-27 Thread Bernd Schmidt
On 10/15/2014 03:52 PM, Richard Biener wrote: I'd say that we eventually should have a type flag that says "this is a va-list type". If we really need to know that - because I don't understand why we need to do this - the context should tell us exactly whether we deal with a va_list object or n

Re: Fix a function decl in gfortran

2014-06-03 Thread Bernd Schmidt
On 05/27/2014 04:01 PM, Tobias Burnus wrote: Bernd Schmidt wrote: Compiling Fortran code with the ptx backend I'm working on results in assembler warnings about mismatch between function calls and function decls. Bootstrapped and tested on x86_64-linux. Ok? OK. The change/bug is due

Re: Fix a function decl in gfortran

2014-06-04 Thread Bernd Schmidt
On 06/04/2014 09:40 AM, Tobias Burnus wrote: Still untested patch, but I cannot resist pointing out stupid typos by myself. I intend to test the build and test the patch - and then to commit it as obvious. If you see problems with this approach please scream now. I have no idea about the appr

Re: Fix a function decl in gfortran

2014-06-04 Thread Bernd Schmidt
On 06/04/2014 09:40 AM, Tobias Burnus wrote: Still untested patch, but I cannot resist pointing out stupid typos by myself. I intend to test the build and test the patch - and then to commit it as obvious. If you see problems with this approach please scream now. Even with this applied, I'm s

Re: Fix a function decl in gfortran

2014-06-05 Thread Bernd Schmidt
On 06/04/2014 10:36 PM, Tobias Burnus wrote: Bernd Schmidt wrote: Even with this applied, I'm still seeing similar failures. I didn't claim that the patch would fix everything – nor that it was well tested. Just wanted to report back since the problem doesn't really sho

Re: Create a library for tools like collect2 and lto-wrapper (2/2)

2014-06-06 Thread Bernd Schmidt
e gcc111 machine). Ok? Bernd commit af9bca9c6439e3f8f31b40d5813a3d016b1f21e5 Author: Bernd Schmidt Date: Wed May 21 12:18:17 2014 +0200 Make a collect-utils library for use by tools like collect2 and lto-wrapper. * Makefile.in (ALL_HOST_BACKEND_OBJS): Add collect-utils.o. (lto-wrap

[gomp4] Offload option handling

2014-06-06 Thread Bernd Schmidt
There's a problem when offloading from a compiler for one target machine to another: the machine specific options don't necessarily match. This patch tries to address this. The idea is that since we have two options sections anyway, with different section name prefixes, we can arrange to pass

Re: [gomp4] Add tables generation

2014-06-10 Thread Bernd Schmidt
On 04/17/2014 08:33 PM, Ilya Verbin wrote: Could you please take a look at this patch? It fixes the ordering issue in the tables stated above, and passes all the tests that I have. But I'm not sure about its correctness from the architectural point of view. I'm still skeptical relying on orde

Re: RFA: speeding up dg-extract-results.sh

2014-06-12 Thread Bernd Schmidt
On 05/25/2014 11:35 AM, Richard Sandiford wrote: Bernd Schmidt writes: On 02/13/2014 10:18 AM, Richard Sandiford wrote: contrib/ * dg-extract-results.py: New file. * dg-extract-results.sh: Use it if the environment seems suitable. I'm now seeing the following: Trac

Regimplification enhancements 1/3

2014-06-16 Thread Bernd Schmidt
9d0 Author: Bernd Schmidt Date: Wed Jun 11 18:41:09 2014 +0200 Fix an issue with regimplification. This is in preparation for the lower-address-spaces pass for the ptx port. We need to teach the regimplifier how to handle the case when an ADDR_EXPR turns into something els

Regimplification enhancements 3/3

2014-06-16 Thread Bernd Schmidt
There's code in regimplification that makes us use an extra temporary when we encounter a call returning a non-BLKmode structure. This seems somewhat inefficient and unnecessary, and when used from the lower-addr-spaces pass I'm working on it leads to problems further down that look like tree-ssa

Regimplification enhancements 2/3

2014-06-16 Thread Bernd Schmidt
icate: gimplify_arg uses is_gimple_lvalue in some cases instead of is_gimple_val, and regimplification needs to match that. Bootstrapped and tested on x86_64-linux, ok? Bernd commit c1296ac4f4e7e8f0fb9c87d71ca8194a8eac0067 Author: Bernd Schmidt Date: Wed Jun 11 18:41:09 2014 +0200 Fix an

Re: Regimplification enhancements 1/3

2014-06-16 Thread Bernd Schmidt
On 06/16/2014 01:24 PM, Richard Biener wrote: On Mon, Jun 16, 2014 at 12:56 PM, Bernd Schmidt wrote: For the ptx port, I've needed to write a new pass which ensures all objects go into address spaces as required by the machine. This uses the regimplification code in gimplify-me.c, and

Re: Regimplification enhancements 1/3

2014-06-16 Thread Bernd Schmidt
On 06/16/2014 07:26 PM, Mike Stump wrote: On Jun 16, 2014, at 3:56 AM, Bernd Schmidt wrote: For the ptx port, I've needed to write a new pass which ensures all objects go into address spaces as required by the machine. I have such a machine and I’ve always approached the problem fro

Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-06-17 Thread Bernd Schmidt
On 06/17/2014 08:20 PM, Ilya Verbin wrote: Hello Bernd, On 28 Feb 17:21, Bernd Schmidt wrote: For your use case, I'd imagine the offload compiler would be built relatively normally as a full build with "--enable-as-accelerator-for=x86_64-linux", which would install it into loca

Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-06-18 Thread Bernd Schmidt
On 06/18/2014 04:13 PM, Ilya Verbin wrote: On 17 Jun 21:22, Bernd Schmidt wrote: On 06/17/2014 08:20 PM, Ilya Verbin wrote: I don't get this part of the plan. Where a host compiler will look for mkoffloads? E.g., first I configure/make/install the target gcc and corresponding mkof

Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-06-27 Thread Bernd Schmidt
On 06/19/2014 12:19 PM, Ilya Verbin wrote: On 18 Jun 16:22, Bernd Schmidt wrote: What I think you need to do is For the first compiler: --enable-as-accelerator-for=x86_64-pc-linux-gnu --target=x86_64-intelmic-linux-gnu --prefix=/somewhere No --enable-accelerator options at all. This should

Don't use create_tmp_var for static vars

2014-06-27 Thread Bernd Schmidt
I discovered that create_tmp_var is used in the gfortran frontend to create static variables. IMO the function is not intended to do this, and it causes problems for a modification I need to make to it which assumes that it only creates local variables. So I've made a patch to make fortran dire

Re: Regimplification enhancements 3/3

2014-06-30 Thread Bernd Schmidt
On 06/17/2014 04:54 PM, Martin Jambor wrote: Weird... does the following (untested) patch help? diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 0afa197..747b1b6 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -3277,6 +3277,8 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)

Re: [patch 1/4] change specific int128 -> generic intN

2014-07-03 Thread Bernd Schmidt
On 07/03/2014 06:12 PM, DJ Delorie wrote: The hardware transfers data in and out of byte-oriented memory in TYPE_SIZE_UNITS chunks. Once in a hardware register, all operations are either 8, 16, or 20 bits (TYPE_SIZE) in size. So yes, values are padded in memory, but no, they are not padded in r

Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-07-07 Thread Bernd Schmidt
On 07/07/2014 04:50 PM, Ilya Verbin wrote: 2) Or should I build accel compiler as a cross from x86_64-pc-linux-gnu to x86_64-intelmic-linux-gnu? Yes, that's the general idea. Will it help to distinguish the libs? Is libgomp the only problematic one? (Does the accel compiler even need one?)

Re: [PATCH 4/n] OpenMP 4.0 offloading infrastructure: lto-wrapper

2015-05-12 Thread Bernd Schmidt
On 05/12/2015 06:27 PM, Thomas Schwinge wrote: Patch variant 1: @@ -4266,7 +4266,7 @@ process_command (unsigned int decoded_op } gcc_assert (!IS_ABSOLUTE_PATH (tooldir_base_prefix)); - tooldir_prefix2 = concat (tooldir_base_prefix, spec_host_machine, + tooldir_prefix2 = concat (too

[gomp4] New builtins, preparation for oacc vector-single

2015-05-20 Thread Bernd Schmidt
-19 Bernd Schmidt + + * omp-builtins.def (GOACC_thread_broadcast, + GOACC_thread_broadcast_ll): New builtins. + * optabs.def (oacc_thread_broadcast_optab): New optab. + * builtins.c (expand_builtin_oacc_thread_broadcast): New function. + (expand_builtin): Use it. + * config/nvptx/nvptx.c

[gomp4] Unidirectional branches for nvptx

2015-05-20 Thread Bernd Schmidt
New field + warp_equal_pseudos. + * config/nvptx/nvptx.md (br_true, br_false): Add %U modifier. + 2015-05-19 Bernd Schmidt * omp-builtins.def (GOACC_thread_broadcast, Index: gcc/config/nvptx/nvptx.c === --- gcc/config/nvptx/nvptx.c (r

Re: [gomp4] New builtins, preparation for oacc vector-single

2015-05-20 Thread Bernd Schmidt
On 05/20/2015 02:39 PM, Jakub Jelinek wrote: On Wed, May 20, 2015 at 02:01:44PM +0200, Bernd Schmidt wrote: To implement OpenACC vector-single mode, we need to ensure that only one thread out of the group representing a worker executes. The others skip computations but follow along the CFG, so

Re: [nvptx] Re: Mostly rewrite genrecog

2015-05-21 Thread Bernd Schmidt
On 05/21/2015 09:12 AM, Thomas Schwinge wrote: OK to commit? gcc/ * config/nvptx/nvptx.md (allocate_stack): Rename to... (allocate_stack_): ... this, and add :P on both match_operand and unspec. (allocate_stack): New expander. If you really want to. It

[gomp4] Vector-single predication

2015-05-21 Thread Bernd Schmidt
23444) +++ gcc/ChangeLog.gomp (working copy) @@ -1,5 +1,15 @@ 2015-05-20 Bernd Schmidt + * omp-low.c (struct omp_region): Add a gwv_this field. + (bb_region_map): New variable. + (find_omp_for_region_data, find_omp_target_region_data): New static + functions. + (build_omp_regions_1): Call them. Buil

[gomp4] Avoiding predication for certain blocks

2015-05-29 Thread Bernd Schmidt
y) @@ -1,3 +1,18 @@ +2015-05-29 Bernd Schmidt + + * gimple.def (GIMPLE_OMP_ENTRY_END): New code. + * gimple.h (gimple_build_omp_entry_end): Declare. + (CASE_GIMPLE_OMP): Add GIMPLE_OMP_ENTRY_END. + * gimple.c (gimple_build_omp_entry_end): New function. + * gimple-low.c (lower_stm

[gomp4] A thread barrier builtin

2015-05-29 Thread Bernd Schmidt
(working copy) @@ -1,5 +1,13 @@ 2015-05-29 Bernd Schmidt + * config/nvptx/nvptx.md (UNSPECV_BARSYNC): New constant. + (oacc_threadbarrier): New expander. + (threadbarrier_insn): New pattern. + * config/nvptx/nvptx.c (nvptx_cannot_copy_insn_p): + * omp-builtins.def (BUILT_IN_GOACC_THREADBARRIER

[gomp4] Initialize some extra variables at the entry to an OpenACC offloaded region

2015-05-29 Thread Bernd Schmidt
=== --- gcc/ChangeLog.gomp (revision 223870) +++ gcc/ChangeLog.gomp (working copy) @@ -1,5 +1,10 @@ 2015-05-29 Bernd Schmidt + * omp-low.c (struct omp_context): Add worker_var and worker_count + fields. + (oacc_init_count_vars): New function. + (lower_omp_target): Call it. + * config

Re: [gomp4] Avoiding predication for certain blocks

2015-06-01 Thread Bernd Schmidt
On 06/01/2015 12:10 PM, Tom de Vries wrote: On 29/05/15 18:23, Bernd Schmidt wrote: When predicating the code for OpenACC, we should avoid the entry block in an offloaded region, which contains setup code that should be run in every thread. The following patch adds a new marker statement that

[gomp4] Worker-single predication

2015-06-01 Thread Bernd Schmidt
Index: gcc/ChangeLog.gomp === --- gcc/ChangeLog.gomp (revision 223974) +++ gcc/ChangeLog.gomp (working copy) @@ -1,3 +1,29 @@ +2015-06-01 Bernd Schmidt + + * gimple.h (struct gimple_statement_omp_parallel_layout): Add a + broadcast_a

Re: [gomp4] Worker-single predication

2015-06-03 Thread Bernd Schmidt
On 06/02/2015 01:06 PM, Thomas Schwinge wrote: On Mon, 1 Jun 2015 17:58:51 +0200, Bernd Schmidt wrote: This extends the previous vector-single support to also handle worker-level predication. [...] This causes the following regressions; would you please have a look? I committed the

Re: [WIP] OpenMP 4 NVPTX support

2015-04-22 Thread Bernd Schmidt
On 04/21/2015 05:58 PM, Jakub Jelinek wrote: suggests that while it is nice that when building nvptx accel compiler we build libgcc.a, libc.a, libm.a, libgfortran.a (and in the future hopefully libgomp.a), nothing attempts to link those in :(. I have that fixed; I expect I'll get around to po

Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2015-04-28 Thread Bernd Schmidt
On 04/27/2015 06:08 PM, Thomas Schwinge wrote: OK to do the following instead? (Coding style/code copied from gcc/config/i386/intelmic-mkoffload.c for consistency.) Err, was this a question for me? I'm fine with that too. Bernd

Re: The nvptx port

2014-11-14 Thread Bernd Schmidt
Hi Jakub, I have some questions about nvptx: 1) you've said that alloca isn't supported, but it seems to be wired up and uses the %alloca documented in the PTX manual, what is the issue with that? %alloca not being actually implemented by the current PTX assembler or translator? Y

Re: The nvptx port

2014-11-14 Thread Bernd Schmidt
On 11/14/2014 11:01 AM, Jakub Jelinek wrote: On Fri, Nov 14, 2014 at 09:29:48AM +0100, Jakub Jelinek wrote: I have some questions about nvptx: Oh, and 5) I have noticed gcc doesn't generate the .uni suffixes anywhere, while llvm generates them; are those appropriate only when a function

Re: system.h vs. C++ STL headers again

2014-11-14 Thread Bernd Schmidt
On 11/14/2014 12:03 PM, Jakub Jelinek wrote: On Fri, Nov 14, 2014 at 11:57:57AM +0100, Richard Biener wrote: ? There are also some comments about stdarg.h and stdio.h ordering, dunno what it comes from and if it is still relevant when we require C++ compiler. I think we should simply discoura

Re: The nvptx port

2014-11-14 Thread Bernd Schmidt
I'm adding Thomas and Cesar to the Cc list, they may have more insight into CUDA library questions as I haven't really looked into that part all that much. On 11/14/2014 12:39 PM, Jakub Jelinek wrote: On Fri, Nov 14, 2014 at 12:09:03PM +0100, Bernd Schmidt wrote: I have some quest

Re: The nvptx port

2014-11-14 Thread Bernd Schmidt
On 11/14/2014 01:36 PM, Jakub Jelinek wrote: Any way to query those limits? Size of .shared memory, number of threads in warp, number of warps, etc.? I'd have to google most of that. There seems to be a WARP_SZ constant available in ptx to get the size of the warp. In OpenACC, are all work

Re: RFC: Building a minimal libgfortran for nvptx

2014-11-14 Thread Bernd Schmidt
Hi Tobias, Does printf work? I thought I/O is not supported? Or does it just accept it for linking and drop it? I think Janne's patch has already dealt with the issue of stack allocation. printf (or more accurately vprintf) is supported by ptx as a magic builtin function. We have a printf wra

ptx debugging patch

2014-11-14 Thread Bernd Schmidt
The situation with debugging on ptx is a little strange - it allows .file and .loc directives for line numbers, and it provides a way to define dwarf2 debug sections - but as far as I can tell, there's no way of putting useful or accurate information into the latter. There's also the slight pro

Re: nvptx offloading patches [3/n], i386 bits RFD

2014-11-14 Thread Bernd Schmidt
On 11/05/2014 01:19 AM, Bernd Schmidt wrote: On 11/04/2014 10:50 PM, Jeff Law wrote: No, I don't think it's terminology. It's really that in effect we have two targets. One is a normal CPU, the other is a GPU. ie, there's nothing that says we won't have a GPU that

Re: nvptx offloading patches [2/n]

2014-11-14 Thread Bernd Schmidt
On 11/03/2014 11:23 PM, Jeff Law wrote: On 11/01/14 05:51, Bernd Schmidt wrote: LTO has a mechanism not to stream out common nodes that are expected to be identical on each run. When using LTO to communicate between compilers for different targets, the va_list_type_node and related ones must be

Re: nvptx offloading patches [1/n]

2014-11-14 Thread Bernd Schmidt
On 11/05/2014 12:17 AM, Jeff Law wrote: On 11/04/14 14:08, Bernd Schmidt wrote: On 11/04/2014 10:01 PM, Jeff Law wrote: Communication between host and GPU is all done via some form of memcpy, so I wouldn't expect this to be a problem. They still need to agree on the layout of the stru

Re: [PATCH 2/5] combine: handle I2 a parallel of two SETs

2014-11-14 Thread Bernd Schmidt
On 11/14/2014 08:19 PM, Segher Boessenkool wrote: + /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs), + make those two SETs separate I1 and I2 insns, and make an I0 that is + the original I1. */ + if (i0 == 0 + && GET_CODE (PATTERN (i2)) == PARALLEL + &&

Re: OpenACC middle end changes

2014-11-19 Thread Bernd Schmidt
On 11/19/2014 02:50 AM, Bernd Schmidt wrote: @@ -8417,6 +8926,9 @@ expand_omp_target (struct omp_region *region) /* Add the new function to the offload table. */ vec_safe_push (offload_funcs, child_fn); + /* Add the new function to the offload table

Re: OpenACC middle end changes

2014-11-19 Thread Bernd Schmidt
Another change that's required is (something like) the following. For ptx, we need to know whether to output something as a .func (callable from ptx code) or a .kernel (callable from the host). That means we need to mark the kernel functions somehow in omp-low.c, and the following does that by

Re: LTO streaming of TARGET_OPTIMIZE_NODE

2014-11-20 Thread Bernd Schmidt
On 11/13/2014 05:06 AM, Jan Hubicka wrote: this patch adds infrastructure for proper streaming and merging of TREE_TARGET_OPTION. This breaks the offloading path via LTO since it introduces an incompatibility in LTO format between host and offload machine. A very quick patch to fix it is bel

Another ptx offloading patch

2014-11-20 Thread Bernd Schmidt
te this to use a dollar sign. The patch below does this at the lto-read stage. Bootstrapped on x86_64-linux, ok if testing is successful? Bernd commit 26b41de43c6db6e2368a9511c589c433b1e49c96 Author: Bernd Schmidt Date: Wed Nov 19 21:47:59 2014 +0100 Renaming for invalid symbols w

Re: OpenACC middle end changes

2014-11-20 Thread Bernd Schmidt
On 11/20/2014 07:52 AM, Jakub Jelinek wrote: On Thu, Nov 20, 2014 at 03:19:11AM +0100, Bernd Schmidt wrote: Thomas had apparently already pointed out an issue with the new gomp_target class (there are multiple similar types of statements we want to handle with OpenACC, they have different codes

Re: LTO streaming of TARGET_OPTIMIZE_NODE

2014-11-20 Thread Bernd Schmidt
On 11/20/2014 02:20 PM, Richard Biener wrote: On Thu, 20 Nov 2014, Bernd Schmidt wrote: On 11/13/2014 05:06 AM, Jan Hubicka wrote: this patch adds infrastructure for proper streaming and merging of TREE_TARGET_OPTION. This breaks the offloading path via LTO since it introduces an

Re: RFC: Building a minimal libgfortran for nvptx

2014-11-28 Thread Bernd Schmidt
On 11/14/2014 10:28 PM, Tobias Burnus wrote: All in all: Okay when testing succeeded. I still prefer some words what's excluded (or included) in minimal as comment in configure.ac, but the patch is also okay without. I thought you meant something more than adding a comment. I've added this in t

Re: Option overriding in the offloading code path

2015-02-25 Thread Bernd Schmidt
On 02/25/2015 11:28 AM, Thomas Schwinge wrote: Am I on the right track with my assumption that it is correct that nvptx.c:nvptx_option_override is not invoked in the offloading code path, so we'd need a new target hook (?) to consolidate/override the options in this scenario? I'm surprised by

Re: [PATCH] gcc/config/c6x/c6x.md: Remove "clobber (match_scratch ...)" in "movmisalign_store".

2015-03-27 Thread Bernd Schmidt
ree situations and it didn't go through). Bernd diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 8e4b6c1..d5535f9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,8 @@ 2015-03-27 Bernd Schmidt + * config/c6x/c6x.md (movmisalign): Use MEM_P, not + memory_operand. + PR targ

Fix PR65052

2015-03-27 Thread Bernd Schmidt
Bernd commit 432e3b7c5e3e47fdc9232805519d54f516c18008 Author: Bernd Schmidt Date: Fri Mar 27 13:32:31 2015 +0100 Fix c6x-uclinux build failure. * config/c6x/constraints.md (S3): New constraint. * config/c6x/c6x.md (real_jump): Use it. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 5cee0a5.

Re: [gomp4] Preserve NVPTX "reconvergence" points

2015-06-19 Thread Bernd Schmidt
On 05/28/2015 05:08 PM, Jakub Jelinek wrote: I understand it is more work, I'd just like to ask that when designing stuff for the OpenACC offloading you (plural) try to take the other offloading devices and host fallback into account. The problem is that many of the transformations we need to

Re: [gomp4] Preserve NVPTX "reconvergence" points

2015-06-19 Thread Bernd Schmidt
On 06/19/2015 02:25 PM, Jakub Jelinek wrote: Emitting PTX specific code from current ompexp is highly undesirable of course, but I must say I'm not a big fan of keeping the GOMP_* gimple trees around for too long either, they were never meant to be used in low gimple, and even all the early optimiz

Re: [gomp4] Preserve NVPTX "reconvergence" points

2015-06-22 Thread Bernd Schmidt
On 06/19/2015 03:45 PM, Jakub Jelinek wrote: I actually believe having some optimization passes in between the ompexp and the lowering of the IR into the form PTX wants is highly desirable, the form with the worker-single or vector-single mode lowered will contain too complex CFG for many optimiz

Re: [gomp4] Preserve NVPTX "reconvergence" points

2015-06-22 Thread Bernd Schmidt
On 06/22/2015 04:24 PM, Jakub Jelinek wrote: I don't understand why lowering the way you suggest helps here at all. In the proposed scheme, you essentially have whole function in e.g. worker-single or vector-single mode, which you need to be able to handle properly in any case, because users can
