Re: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439

2016-10-03 Thread Christophe Lyon
On 2 October 2016 at 23:05, Doug Gilmore  wrote:
> Hi Christophe,
>
>> From: Christophe Lyon [christophe.l...@linaro.org]
>> Sent: Saturday, October 01, 2016 7:57 AM
>> To: Doug Gilmore
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: Fix PR tree-optimization/77808, ICE in 
>> duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>>
>> Hi Doug,
>>
>> ...
>> I can confirm that your patch fixes the ICE I was seeing.
>>
>> However, the new testcase does not pass on low end
>> architectures:
>> cc1: warning: -fprefetch-loop-arrays not supported for this target
>> (try -march switches)
>>
>> Can you add a guard?
>>
>> Thanks,
>>
>> Christophe
> I updated the test to only run on X86, MIPS and AARCH64.  Is that OK?
>

I'm afraid not.

The ICE occurred on some arm targets. By "low end" I meant armv5t for
example, as opposed to armv7t.
Is there a suitable effective target?

Thanks,

Christophe

> Thanks,
>
> Doug


Re: [Patch 3/11] Implement TARGET_C_EXCESS_PRECISION for s390

2016-10-03 Thread James Greenhalgh
On Fri, Sep 30, 2016 at 05:57:45PM +, Joseph Myers wrote:
> On Fri, 30 Sep 2016, Jeff Law wrote:
> 
> > On 09/30/2016 11:34 AM, Joseph Myers wrote:
> > > On Fri, 30 Sep 2016, James Greenhalgh wrote:
> > > 
> > > > +  case EXCESS_PRECISION_TYPE_STANDARD:
> > > > +  case EXCESS_PRECISION_TYPE_IMPLICIT:
> > > > +   /* Otherwise, the excess precision we want when we are
> > > > +  in a standards compliant mode, and the implicit precision we
> > > > +  provide can be identical.  */
> > > > +   return FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE;
> > > 
> > > That's wrong for EXCESS_PRECISION_TYPE_IMPLICIT.  There is no implicit
> > > promotion in the back end (and really there shouldn't be any excess
> > > precision here at all, and double_t in glibc should be fixed along with a
> > > GCC change to remove this mistake).
> > Sorry, change to a NAK.
> > 
> > Joseph, what's the right thing to do here?
> 
> (a) The present patch would keep the existing value of FLT_EVAL_METHOD.  
> But the existing value is inaccurate for the default compilation mode, 
> when there is no implicit promotion in the back end, and doing so means 
> suboptimal code in libgcc and glibc because it does things to handle 
> excess precision that isn't actually there (and quite possibly in code 
> elsewhere that looks at FLT_EVAL_METHOD).
> 
> (b) Handling EXCESS_PRECISION_TYPE_IMPLICIT like 
> EXCESS_PRECISION_TYPE_FAST would accurately describe what the back end 
> does.  It would mean that the default FLT_EVAL_METHOD is 0, which is a 
> more accurate description of how the compiler actually behaves, and would 
> avoid the suboptimal code in libgcc and glibc.  It would however mean that 
> unless -fexcess-precision=standard is used, FLT_EVAL_METHOD (accurate) is 
> out of sync with float_t in math.h (inaccurate).
> 
> (c) Removing all special excess precision for S/390 from GCC, and changing 
> float_t to float in glibc, is logically correct and produces optimal code.  
> float_t does not appear in the ABI of any glibc function; in principle it 
> could affect the ABIs of other libraries, but I don't think that's 
> particularly likely.
> 
> The only argument for (a) is that it's semantics-preserving - it's just 
> that the preserved semantics are nonsensical and involve an inaccurate 
> value of FLT_EVAL_METHOD in the default compilation mode.

I'm happy progressing whichever of a) or b) would be preferred by the
s390 maintainers. But I'd be uncomfortable making the wider changes
in c) as I've got no access to an s390 build and test environment in which
I have any confidence, nor do I know the s390 port history that led to the
'typedef double float_t' in glibc.

Regardless of which approach is chosen, I'll be sure to update the patch
with a comment paraphrasing your suggestions above.

Thanks,
James
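
For readers following options (a) to (c): C99 7.12 ties float_t to
FLT_EVAL_METHOD (0 means float, 1 means double, 2 means long double), so
"out of sync" can be checked mechanically. The sketch below is only an
illustration of that check. The helper names are hypothetical, and it
compares sizes, which is merely a proxy for comparing the types themselves.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstddef>

// Size that float_t is specified to have for a given FLT_EVAL_METHOD
// (C99 7.12): 0 -> float, 1 -> double, 2 -> long double.
// Returns 0 for any other value, where the type is implementation-defined.
// Hypothetical helper, not part of any patch in this thread.
inline std::size_t expected_float_t_size(int method)
{
  switch (method)
    {
    case 0: return sizeof(float);
    case 1: return sizeof(double);
    case 2: return sizeof(long double);
    default: return 0;
    }
}

// True when the float_t from <cmath> has the size implied by the
// FLT_EVAL_METHOD in effect for this translation unit.
inline bool float_t_in_sync()
{
  const std::size_t want = expected_float_t_size(FLT_EVAL_METHOD);
  return want == 0 || want == sizeof(std::float_t);
}
```

Under option (b) as described above, one would expect such a check to fail
on s390 unless -fexcess-precision=standard is used, since FLT_EVAL_METHOD
would be 0 while float_t stays double.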



Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-10-03 Thread Martin Liška
On 08/18/2016 03:36 PM, Jakub Jelinek wrote:
> On Thu, May 12, 2016 at 04:12:21PM +0200, Martin Liška wrote:
>> --- a/gcc/asan.c
>> +++ b/gcc/asan.c
>> @@ -243,6 +243,11 @@ static unsigned HOST_WIDE_INT asan_shadow_offset_value;
>>  static bool asan_shadow_offset_computed;
>>  static vec sanitized_sections;
>>  
>> +/* Set of variable declarations that are going to be guarded by
>> +   use-after-scope sanitizer.  */
>> +
>> +static hash_set asan_handled_variables (13);
> 
> Doesn't this introduce yet another global ctor?  If yes (and we
> unfortunately have already way too many), I'd strongly prefer to avoid that,
> use pointer to hash_set or something similar.

Hello

It does; I switched to a pointer in the second version of the patch.

> 
>> +/* Depending on POISON flag, emit a call to poison (or unpoison) stack memory
>> +   allocated for local variables, located in OFFSETS.  LENGTH is number
>> +   of OFFSETS, BASE is the register holding the stack base,
>> +   against which OFFSETS array offsets are relative to.  BASE_OFFSET represents
>> +   an offset requested by alignment and similar stuff.  */
>> +
>> +static void
>> +asan_poison_stack_variables (rtx shadow_base, rtx base,
>> + HOST_WIDE_INT base_offset,
>> + HOST_WIDE_INT *offsets, int length,
>> + tree *decls, bool poison)
>> +{
>> +  if (asan_sanitize_use_after_scope ())
>> +for (int l = length - 2; l > 0; l -= 2)
>> +  {
> 
> I think this is unfortunate, it leads to:
> movl$-235802127, 2147450880(%rax)
> movl$-185335552, 2147450884(%rax)
> movl$-202116109, 2147450888(%rax)
> movb$-8, 2147450884(%rax)
> movb$-8, 2147450885(%rax)
> (e.g. on use-after-scope-1.c).
> The asan_emit_stack_protection function already walks all the
> entries in the offsets array in both of the
>   for (l = length; l; l -= 2)
> loops, so please handle the initial poisoning and final unpoisoning there
> as well.  The goal is that for variables that you want poison-after-scope
> at the start of the function (btw, I've noticed that current SVN LLVM
> doesn't bother with it and thus doesn't track "use before scope" (before the
> scope is entered for the first time, maybe we shouldn't either, that would
> catch only compiler bugs rather than user code bugs, right?)) have 0xf8
> on all corresponding bytes including the one that would otherwise have 0x01
> through 0x07.  When unpoisoning at the end of the function, again you should
> combine that with unpoisoning of the red zone and partial zone bytes plus
> the last 0x01 through 0x07, etc.

I also decided not to handle "use before scope" issues, and thus I do not poison
stack variables at the very beginning of a function.

As you noticed, the former stack poisoning/unpoisoning code was kind of ugly.
The current unpoisoning code (trunk version) basically clears the whole shadow memory
for a stack frame, except for local variables that are not touched by the
use-after-scope machinery.
That eventually leads to somewhat simpler code for emitting the shadow clearing.

> 
> Plus, as I've mentioned before, it would be nice to optimize - for ASAN_MARK
> unpoison appearing strictly before (i.e. dominating) the first (non-shadow) 
> memory read
> or write in the function (explicit or possible through function calls etc.)
> you really don't need to unpoison (depending on whether we follow LLVM as
> mentioned above then it can be removed without anything, or the decl needs
> to be somehow marked and tell asan_emit_stack_protection it shouldn't poison
> it at the start), and for ASAN_MARK poisoning appearing after the last
> load/store in the function (post dominating those, you don't care about
> noreturn though) you can combine that (remove the ASAN_MARK) with letting
> asan_emit_stack_protection know it doesn't need to unpoison.

Fully agree with that approach; however, I would be happy to do that as a follow-up,
as it's not going to be so trivial.

> 
>> +char c = poison ? ASAN_STACK_MAGIC_USE_AFTER_SCOPE : 0;
>> +for (unsigned i = 0; i < shadow_size; ++i)
>> +  {
>> +emit_move_insn (var_mem, gen_int_mode (c, QImode));
>> +var_mem = adjust_address (var_mem, QImode, 1);
> 
> When you combine it with the loop, you can also use the infrastructure to
> handle it 4 bytes at a time.

The current implementation can handle up to 4 bytes at a time. I'm wondering whether
we can do even better for targets with 64-bit memory stores. How can one get such
info about a target?

> 
> Another thing I've noticed is that the inline expansion of
> __asan_unpoison_stack_memory you emit looks buggy.
> In use-after-scope-1.c I see:
>   _9 = (unsigned long) &my_char;
>   _10 = _9 >> 3;
>   _11 = _10 + 2147450880;
>   _12 = (signed char *) _11;
>   MEM[(short int *)_12] = 0;
> 
> That would be fine only for 16 byte long my_char, but we have instead 9 byte
> one.  So I believe in that case we need to 

[PATCH, 02/N] Introduce tests for -fsanitize-address-use-after-scope

2016-10-03 Thread Martin Liška
Following patch adjusts expected test dumps and also introduces various
new tests.

Martin
From 4ddafab1e533a1d3580d2f883955d61fe23aa353 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 19 Sep 2016 17:39:29 +0200
Subject: [PATCH 3/3] Introduce tests for -fsanitize-address-use-after-scope

gcc/testsuite/ChangeLog:

2016-09-26  Martin Liska  

	* c-c++-common/asan/force-inline-opt0-1.c: Disable
	-fsanitize-address-use-after-scope.
	* c-c++-common/asan/inc.c: Change number of expected ASAN_CHECK
	internal fn calls.
	* g++.dg/asan/use-after-scope-1.C: New test.
	* g++.dg/asan/use-after-scope-2.C: Likewise.
	* g++.dg/asan/use-after-scope-3.C: Likewise.
	* g++.dg/asan/use-after-scope-types-1.C: Likewise.
	* g++.dg/asan/use-after-scope-types-2.C: Likewise.
	* g++.dg/asan/use-after-scope-types-3.C: Likewise.
	* g++.dg/asan/use-after-scope-types-4.C: Likewise.
	* g++.dg/asan/use-after-scope-types-5.C: Likewise.
	* g++.dg/asan/use-after-scope-types.h: Likewise.
	* gcc.dg/asan/use-after-scope-1.c: Likewise.
	* gcc.dg/asan/use-after-scope-2.c: Likewise.
	* gcc.dg/asan/use-after-scope-3.c: Likewise.
	* gcc.dg/asan/use-after-scope-4.c: Likewise.
	* gcc.dg/asan/use-after-scope-5.c: Likewise.
	* gcc.dg/asan/use-after-scope-6.c: Likewise.
	* gcc.dg/asan/use-after-scope-7.c: Likewise.
	* gcc.dg/asan/use-after-scope-8.c: Likewise.
	* gcc.dg/asan/use-after-scope-goto-1.c: Likewise.
	* gcc.dg/asan/use-after-scope-goto-2.c: Likewise.
---
 .../c-c++-common/asan/force-inline-opt0-1.c|  1 +
 gcc/testsuite/c-c++-common/asan/inc.c  |  3 +-
 gcc/testsuite/g++.dg/asan/use-after-scope-1.C  | 21 ++
 gcc/testsuite/g++.dg/asan/use-after-scope-2.C  | 40 ++
 gcc/testsuite/g++.dg/asan/use-after-scope-3.C  | 22 ++
 .../g++.dg/asan/use-after-scope-types-1.C  | 17 
 .../g++.dg/asan/use-after-scope-types-2.C  | 17 
 .../g++.dg/asan/use-after-scope-types-3.C  | 17 
 .../g++.dg/asan/use-after-scope-types-4.C  | 17 
 .../g++.dg/asan/use-after-scope-types-5.C  | 17 
 gcc/testsuite/g++.dg/asan/use-after-scope-types.h  | 30 ++
 gcc/testsuite/gcc.dg/asan/use-after-scope-1.c  | 18 +
 gcc/testsuite/gcc.dg/asan/use-after-scope-2.c  | 47 ++
 gcc/testsuite/gcc.dg/asan/use-after-scope-3.c  | 20 +
 gcc/testsuite/gcc.dg/asan/use-after-scope-4.c  | 16 
 gcc/testsuite/gcc.dg/asan/use-after-scope-5.c  | 27 +
 gcc/testsuite/gcc.dg/asan/use-after-scope-6.c  | 15 +++
 gcc/testsuite/gcc.dg/asan/use-after-scope-7.c  | 15 +++
 gcc/testsuite/gcc.dg/asan/use-after-scope-8.c  | 14 +++
 gcc/testsuite/gcc.dg/asan/use-after-scope-goto-1.c | 47 ++
 gcc/testsuite/gcc.dg/asan/use-after-scope-goto-2.c | 25 
 21 files changed, 445 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-1.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-2.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-3.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-1.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-2.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-3.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-4.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-5.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types.h
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-3.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-4.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-5.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-6.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-7.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-8.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-goto-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-goto-2.c

diff --git a/gcc/testsuite/c-c++-common/asan/force-inline-opt0-1.c b/gcc/testsuite/c-c++-common/asan/force-inline-opt0-1.c
index 0576155..2e156f7 100644
--- a/gcc/testsuite/c-c++-common/asan/force-inline-opt0-1.c
+++ b/gcc/testsuite/c-c++-common/asan/force-inline-opt0-1.c
@@ -2,6 +2,7 @@
(before and after inlining) */
 
 /* { dg-do compile } */
+/* { dg-options "-fno-sanitize-address-use-after-scope" } */
 /* { dg-final { scan-assembler-not "__asan_report_load" } } */
 
 __attribute__((always_inline))
diff --git a/gcc/testsuite/c-c++-common/asan/inc.c b/gcc/testsuite/c-c++-common/asan/inc.c
index 5abf373..98121d2 100644
--- a/gcc/testsuite/c-c++-common/asan/inc.c
+++ b/gcc/testsuite/c-c++-common/asan/inc.c
@@ -16,5 +16,6 @@ main ()
   return 0;
 }
 
-/* { dg-fin

Re: [PATCH][RTL ifcvt] Transform (X == CST) ? -CST : Y into (X == CST) ? -X : Y when conditional negation is available

2016-10-03 Thread Kyrill Tkachov


On 02/10/16 20:03, Andrew Pinski wrote:
> On Sun, Oct 2, 2016 at 7:50 AM, Jeff Law  wrote:
>> On 10/02/2016 04:48 AM, Andreas Schwab wrote:
>>> This miscompiles the stage2 ada compiler.
>>
>> No target identified.
>
> He reported it in a bug report, aarch64-linux-gnu.

As I mentioned in PR 77816 I can't reproduce the Fortran failures reported
and it will take me a while to set up an Ada bootstrap environment.
So I have reverted the patch in the interest of not blocking folks while I try
to reproduce/fix.

Thanks,
Kyrill

> Thanks,
> Andrew
>
>> jeff



Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-10-03 Thread Jakub Jelinek
On Mon, Oct 03, 2016 at 11:27:38AM +0200, Martin Liška wrote:
> > Plus, as I've mentioned before, it would be nice to optimize - for ASAN_MARK
> > unpoison appearing strictly before (i.e. dominating) the first (non-shadow) 
> > memory read
> > or write in the function (explicit or possible through function calls etc.)
> > you really don't need to unpoison (depending on whether we follow LLVM as
> > mentioned above then it can be removed without anything, or the decl needs
> > to be somehow marked and tell asan_emit_stack_protection it shouldn't poison
> > it at the start), and for ASAN_MARK poisoning appearing after the last
> > load/store in the function (post dominating those, you don't care about
> > noreturn though) you can combine that (remove the ASAN_MARK) with letting
> > asan_emit_stack_protection know it doesn't need to unpoison.
> 
> Fully agree with that approach; however, I would be happy to do that as a
> follow-up, as it's not going to be so trivial.

Ok.

> >> +  char c = poison ? ASAN_STACK_MAGIC_USE_AFTER_SCOPE : 0;
> >> +  for (unsigned i = 0; i < shadow_size; ++i)
> >> +{
> >> +  emit_move_insn (var_mem, gen_int_mode (c, QImode));
> >> +  var_mem = adjust_address (var_mem, QImode, 1);
> > 
> > When you combine it with the loop, you can also use the infrastructure to
> > handle it 4 bytes at a time.
> 
> The current implementation can handle up to 4 bytes at a time. I'm wondering
> whether we can do even better for targets with 64-bit memory stores. How can
> one get such info about a target?

It is not just the question of whether the target has fast 64-bit memory
stores, but also whether the constants you want to store are reasonably
cheap.  E.g. on x86_64, movabsq is kind of expensive, so storing 64-bit 0
is cheap, but storing 64-bit 0xfdfdfdfdfdfdfdfdULL might be better done as 2
32-bit stores, perhaps both for speed and size.

> > 
> > Another thing I've noticed is that the inline expansion of
> > __asan_unpoison_stack_memory you emit looks buggy.
> > In use-after-scope-1.c I see:
> >   _9 = (unsigned long) &my_char;
> >   _10 = _9 >> 3;
> >   _11 = _10 + 2147450880;
> >   _12 = (signed char *) _11;
> >   MEM[(short int *)_12] = 0;
> > 
> > That would be fine only for 16 byte long my_char, but we have instead 9 byte
> > one.  So I believe in that case we need to store
> > 0x00, 0x01 bytes, for little endian thus 0x0100.  You could use for it
> > a function similarly to asan_shadow_cst, just build INTEGER_CST rather than
> > CONST_INT out of it.  In general, poisoning is storing 0xf8 to all affected
> > shadow bytes; unpoisoning should restore the state that we would emit
> > without use-after-scope sanitization, which is all but the last byte 0, and
> > the last byte 0 only if the var size is a multiple of 8, otherwise number
> > of valid bytes (1-7).
> 
> Fixed in the newer patch.
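
The shadow encoding described in the quoted paragraph can be written out as
a small model. This is only a sketch under stated assumptions: it works on
a plain byte array rather than real shadow memory, the names in
shadow_sketch are hypothetical, and the 8-byte granule is inferred from the
">> 3" in the dump above. It reproduces the 9-byte my_char case, where
unpoisoning must store the bytes 0x00, 0x01.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace shadow_sketch {

// One shadow byte describes 8 bytes of stack (address >> 3 in the dump).
constexpr std::size_t granule = 8;
// Value stored for poisoned (out-of-scope) variables, per the thread.
constexpr std::uint8_t use_after_scope_magic = 0xf8;

// Number of shadow bytes covering a variable of VAR_SIZE bytes.
inline std::size_t shadow_bytes_for(std::size_t var_size)
{
  return (var_size + granule - 1) / granule;
}

// Poisoning: store 0xf8 to every affected shadow byte.
inline void poison(std::uint8_t* shadow, std::size_t var_size)
{
  assert(shadow != nullptr || var_size == 0);
  const std::size_t n = shadow_bytes_for(var_size);
  for (std::size_t i = 0; i < n; ++i)
    shadow[i] = use_after_scope_magic;
}

// Unpoisoning: all shadow bytes become 0, except that the last one holds
// the number of valid bytes (1-7) when the size is not a multiple of 8.
inline void unpoison(std::uint8_t* shadow, std::size_t var_size)
{
  assert(shadow != nullptr || var_size == 0);
  const std::size_t n = shadow_bytes_for(var_size);
  for (std::size_t i = 0; i < n; ++i)
    shadow[i] = 0;
  const std::size_t tail = var_size % granule;
  if (tail != 0)
    shadow[n - 1] = static_cast<std::uint8_t>(tail);
}

} // namespace shadow_sketch
```

For a 9-byte variable this leaves the two shadow bytes as 0x00, 0x01, which
read as a little-endian 16-bit value is 0x0100 rather than the 0 stored by
the buggy expansion.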
> 
> > 
> > As for the option, it seems clang uses now
> > -fsanitize-address-use-after-scope option, while I don't like that much, if
> > they have already released some version with that option or if they are
> > unwilling to change, I'd go with their option.
> 
> I also do not like the option, but 3.9.0 has already the functionality. Thus,
> I'm copying LLVM behavior.
> 
> > 
> >> + if (flag_stack_reuse != SR_NONE
> >> + && flag_openacc
> >> + && oacc_declare_returns != NULL)
> > 
> > This actually looks like preexisting OpenACC bug, I doubt the OpenACC
> > behavior should depend on -fstack-reuse= setting.
> 
> The generated diff for this hunk is a bit misleading, I simplified that
> in the second version.
> 
> > 
> > +  bool unpoison_var = asan_poisoned_variables.contains (t);
> > +  if (asan_sanitize_use_after_scope ()
> > + && unpoison_var)
> > +   asan_poisoned_variables.remove (t);
> > 
> > Similarly to asan_handled_variables, I'd prefer it to be a pointer to
> > hash_set or something similar, so that it costs as few as possible for the
> > general case (no sanitization).  Similarly, querying the hash_set even for
> > no use-after-scope sanitization looks wrong.
> 
> Sure, fixed.
> 
> > 
> > + if ((asan_sanitize_stack_p () || asan_sanitize_use_after_scope ())
> > 
> > I would say if asan_sanitize_stack_p () is false, then we should not be
> > doing use-after-scope sanitization (error if user requested that
> > explicitly).
> 
> Done by adding '&& ASAN_STACK' to asan_sanitize_use_after_scope.
> 
> > 
> > Don't remember if I've mentioned it earlier, but for vars that are
> > TREE_ADDRESSABLE only because of ASAN_MARK calls, we should probably turn
> > them non-addressable and remove those ASAN_MARK calls, those shouldn't leak.
> > You can have a look at the r237814 change for how similarly
> > compare and exchange is special cased for the
> > addressables discovery (though, the ASAN_MARK case would be easier, just
> > drop it rather than turn it into something different).
> 
> I like the approach to not to handle loca

Re: [PATCH v2] add -fprolog-pad=N option to c-family

2016-10-03 Thread AKASHI Takahiro
On Fri, Sep 30, 2016 at 12:01:47PM +0200, Jose E. Marchesi wrote:
> 
> In case anybody missed it, the Linux kernel side to make use
> of this has also been finished meanwhile. Of course it can not
> be accepted without compiler support; and this feature patch
> is much more versatile than just Linux kernel live patching
> on a single architecture.
> 
> How is this supposed to be exploited atomically in RISC arches such as
> sparc?  In such architectures you usually need to patch several
> instructions to load an absolute address into a register.

We had some discussions in the context of arm64:
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01093.html

But I don't think that we reached a final consensus at that time.

Thanks,
-Takahiro AKASHI

> If a general mechanism is what is intended I would suggest to offer the
> possibility of extending the nops _before_ the function entry point,
> like in:
> 
> (a) nop   ! Load address
> nop   ! Load address
> nop   ! Load address
> nop   ! Load address
> nop   ! Jump to loaded address.
> entry:
> (b) nop   ! PC-relative jump to (a)
> save %sp, bleh, %sp
> ...
> 
> So after the live-patcher patches the loading of the destination address
> and the jump, it can atomically patch (b) to effectively replace the
> implementation of `entry'.
> 
> Wdyt?
> 


[PATCH, configure]: Merge two checks for warning options

2016-10-03 Thread Uros Bizjak
Hello!

I plan to commit the attached patch later today.

2016-10-03  Uros Bizjak  

* configure.ac (strict_warn): Merge -Wmissing-format-attribute and
-Woverloaded-virtual checks for warning options.
* configure: Regenerate.


Bootstrapped on x86_64-linux-gnu.

Uros.
diff --git a/gcc/configure b/gcc/configure
index 2503ba9..80fc5c7 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -6758,63 +6758,7 @@ ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
 
 strict_warn=
 save_CXXFLAGS="$CXXFLAGS"
-for real_option in -Wmissing-format-attribute; do
-  # Do the check with the no- prefix removed since gcc silently
-  # accepts any -Wno-* option on purpose
-  case $real_option in
--Wno-*) option=-W`expr x$real_option : 'x-Wno-\(.*\)'` ;;
-*) option=$real_option ;;
-  esac
-  as_acx_Woption=`$as_echo "acx_cv_prog_cc_warning_$option" | $as_tr_sh`
-
-  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CXX supports $option" >&5
-$as_echo_n "checking whether $CXX supports $option... " >&6; }
-if { as_var=$as_acx_Woption; eval "test \"\${$as_var+set}\" = set"; }; then :
-  $as_echo_n "(cached) " >&6
-else
-  CXXFLAGS="$option"
-cat confdefs.h - <<_ACEOF >conftest.$ac_ext
-/* end confdefs.h.  */
-
-int
-main ()
-{
-
-  ;
-  return 0;
-}
-_ACEOF
-if ac_fn_cxx_try_compile "$LINENO"; then :
-  eval "$as_acx_Woption=yes"
-else
-  eval "$as_acx_Woption=no"
-fi
-rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
-
-fi
-eval ac_res=\$$as_acx_Woption
-  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
-$as_echo "$ac_res" >&6; }
-  if test `eval 'as_val=${'$as_acx_Woption'};$as_echo "$as_val"'` = yes; then :
-  strict_warn="$strict_warn${strict_warn:+ }$real_option"
-fi
-  done
-CXXFLAGS="$save_CXXFLAGS"
-ac_ext=cpp
-ac_cpp='$CXXCPP $CPPFLAGS'
-ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext >&5'
-ac_link='$CXX -o conftest$ac_exeext $CXXFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
-ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
-
-
-ac_ext=cpp
-ac_cpp='$CXXCPP $CPPFLAGS'
-ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext >&5'
-ac_link='$CXX -o conftest$ac_exeext $CXXFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
-ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
-
-save_CXXFLAGS="$CXXFLAGS"
-for real_option in -Woverloaded-virtual; do
+for real_option in -Wmissing-format-attribute -Woverloaded-virtual; do
   # Do the check with the no- prefix removed since gcc silently
   # accepts any -Wno-* option on purpose
   case $real_option in
@@ -18479,7 +18423,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18482 "configure"
+#line 18426 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18585,7 +18529,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18588 "configure"
+#line 18532 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index fa789d5..338956f 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -476,14 +476,14 @@ AC_ARG_ENABLE(build-format-warnings,
 AS_IF([test $enable_build_format_warnings = no],
   [wf_opt=-Wno-format],[wf_opt=])
 ACX_PROG_CXX_WARNING_OPTS(
-   m4_quote(m4_do([-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual $wf_opt])), [loose_warn])
+   m4_quote(m4_do([-W -Wall -Wno-narrowing -Wwrite-strings ],
+  [-Wcast-qual $wf_opt])), [loose_warn])
 ACX_PROG_CC_WARNING_OPTS(
m4_quote(m4_do([-Wstrict-prototypes -Wmissing-prototypes])),
[c_loose_warn])
 ACX_PROG_CXX_WARNING_OPTS(
-   m4_quote(m4_do([-Wmissing-format-attribute])), [strict_warn])
-ACX_PROG_CXX_WARNING_OPTS(
-   m4_quote(m4_do([-Woverloaded-virtual])), [strict_warn])
+   m4_quote(m4_do([-Wmissing-format-attribute ],
+  [-Woverloaded-virtual])), [strict_warn])
 ACX_PROG_CC_WARNING_OPTS(
m4_quote(m4_do([-Wold-style-definition -Wc++-compat])), [c_strict_warn])
 ACX_PROG_CXX_WARNING_ALMOST_PEDANTIC(


Re: [v3 PATCH] PR libstdc++/77802

2016-10-03 Thread Jonathan Wakely

On 01/10/16 00:12 +0300, Ville Voutilainen wrote:
> I do this with a rather heavy heart, but since gcc6 compiles boost 1.62,
> I'll rather have gcc7 do so as well, and I'll throw the tuple fix for lwg2729
> to the wolves not because I want to, but because I have to.
>
> Tested on Linux-x64.
>
> 2016-10-01  Ville Voutilainen
>
>    PR libstdc++/77802
>
>    * testsuite/20_util/tuple/77802.cc: New.

Could you please add a note to this new test saying it's undefined
behaviour to instantiate std::tuple with an incomplete type (but that
we try to support it anyway).

OK with that change.




Re: Shared mutex pool

2016-10-03 Thread Jonathan Wakely

On 28/09/16 21:34 +0200, François Dumont wrote:
> Hi
>
>    Here is the patch to share a mutex pool between debug mode and
> shared_ptr implementation. It saves 392 bytes on generated .so and
> will make sure that fixing false sharing will impact both usages.
>
>    I preferred to leave implementation in shared_ptr.cc to avoid
> introducing another translation unit.
>
>    * src/c++11/shared_ptr.cc (mask, invalid, get_mutex): Move
>    declaration...
>    * src/c++11/mutex_pool.h: ... here. New.
>    * src/c++11/debug.cc: Use latter.
>
>    Tested under Linux x86_64, normal and debug modes.
>
> Ok to commit ?

OK, thanks.
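
For context, a shared mutex pool of this kind usually hashes an object's
address into a small fixed array of mutexes, so unrelated users contend on
only a handful of locks. The sketch below is not the libstdc++ code: the
names mask and get_mutex are taken from the ChangeLog above, while the pool
size, the key function, mutex_for, and the use of std::mutex and std::hash
are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>

namespace pool_sketch {

// Assumed pool size; a power of two so that masking selects a slot.
constexpr std::size_t pool_size = 16;
constexpr std::size_t mask = pool_size - 1;

// Map an address to a slot index (hypothetical helper).
inline std::size_t key(const void* addr)
{
  return std::hash<const void*>{}(addr) & mask;
}

// Return slot I of the single pool shared by every user of this header.
inline std::mutex& get_mutex(std::size_t i)
{
  assert(i < pool_size);
  static std::mutex pool[pool_size];
  return pool[i];
}

// Convenience wrapper: the mutex guarding the object at ADDR.
inline std::mutex& mutex_for(const void* addr)
{
  return get_mutex(key(addr));
}

} // namespace pool_sketch
```

A caller would typically take std::lock_guard<std::mutex> on
mutex_for(&object) around the critical section. Because both users go
through the same array, a later fix for false sharing (for example, padding
each slot) would only need to be made once.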



[PATCH] Ensure "C++" language linkage for std::abs overloads

2016-10-03 Thread Jonathan Wakely

PR libstdc++/77814
* include/bits/std_abs.h: Use "C++" language linkage.
* testsuite/17_intro/headers/c++2011/linkage.cc: Move <complex.h> to
the end. Add <stdalign.h>.

I'll commit to trunk when testing finishes.
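
To illustrate why the linkage matters: at most one function of a given name
may have C language linkage, so a set of overloads cannot be declared when
a header ends up included inside an extern "C" block. Nesting extern "C++"
restores C++ linkage for the enclosed declarations. The sketch below uses a
hypothetical abs_of in a hypothetical namespace and is not the std_abs.h
code.

```cpp
#include <cassert>

extern "C" {

// Declarations at this level get C language linkage, so several functions
// named abs_of could not coexist here. The nested block below switches
// back to C++ linkage, which makes the overload set legal again.
extern "C++" {
namespace linkage_sketch {

inline int    abs_of(int x)    { return x < 0 ? -x : x; }
inline long   abs_of(long x)   { return x < 0 ? -x : x; }
inline double abs_of(double x) { return x < 0 ? -x : x; }

} // namespace linkage_sketch
} // extern "C++"

} // extern "C"
```

Removing the inner extern "C++" block should make the compiler reject the
second and third overloads, which is the kind of failure the linkage test
is meant to catch.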

commit 2dc6b0497b7d0ec0cb298f749419d70a43c2ab70
Author: Jonathan Wakely 
Date:   Mon Oct 3 12:26:55 2016 +0100

Ensure "C++" language linkage for std::abs overloads

PR libstdc++/77814
* include/bits/std_abs.h: Use "C++" language linkage.
* testsuite/17_intro/headers/c++2011/linkage.cc: Move <complex.h> to
the end. Add <stdalign.h>.

diff --git a/libstdc++-v3/include/bits/std_abs.h b/libstdc++-v3/include/bits/std_abs.h
index ab0f980..732b81a3 100644
--- a/libstdc++-v3/include/bits/std_abs.h
+++ b/libstdc++-v3/include/bits/std_abs.h
@@ -43,6 +43,8 @@
 
 #undef abs
 
+extern "C++"
+{
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -103,5 +105,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
+}
 
 #endif // _GLIBCXX_BITS_STD_ABS_H
diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++2011/linkage.cc b/libstdc++-v3/testsuite/17_intro/headers/c++2011/linkage.cc
index 67c384b..bb56dbf 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++2011/linkage.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++2011/linkage.cc
@@ -25,9 +25,7 @@
 extern "C"
 {
 #include 
-#ifdef _GLIBCXX_HAVE_COMPLEX_H
-#include 
-#endif
+// See below for 
 #include 
 #include 
 #ifdef _GLIBCXX_HAVE_FENV_H
@@ -43,6 +41,9 @@ extern "C"
 #include 
 #include 
 #include 
+#if _GLIBCXX_HAVE_STDALIGN_H
+#include 
+#endif
 #include 
 #ifdef _GLIBCXX_HAVE_STDBOOL_H
 #include 
@@ -67,4 +68,10 @@ extern "C"
 #ifdef _GLIBCXX_HAVE_WCTYPE_H
 #include 
 #endif
+
+// Include this last, because it adds extern "C++" and so hides problems in
+// other headers if included first (e.g. PR libstdc++/77814).
+#ifdef _GLIBCXX_HAVE_COMPLEX_H
+#include 
+#endif
 }


[Patch, testsuite] Add ffat-lto-objects to gcc.target/avr/torture/builtins_error.c

2016-10-03 Thread Senthil Kumar Selvaraj
Hi,

  This patch adds -ffat-lto-objects option to an avr target testcase.

  The compiler defaults to thin LTO objects if built with linker plugin
  support, and the error expected by the testcase appears only at link
  time, if at all. Forcing fat LTO object file creation generates the
  error consistently at compile time, as expected.

  Committed to trunk.

Regards
Senthil

gcc/testsuite/ChangeLog:

2016-10-03  Senthil Kumar Selvaraj  

* gcc.target/avr/torture/builtins-error.c: Add -ffat-lto-objects
  option.


Index: gcc/testsuite/gcc.target/avr/torture/builtins-error.c
===
--- gcc/testsuite/gcc.target/avr/torture/builtins-error.c   (revision 240709)
+++ gcc/testsuite/gcc.target/avr/torture/builtins-error.c   (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do assemble } */
+/* { dg-options "-ffat-lto-objects" } */
 
 char insert (long a)
 {


Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-10-03 Thread Martin Liška
On 08/18/2016 05:53 PM, Jeff Law wrote:
> On 08/18/2016 09:51 AM, Andi Kleen wrote:
>>> I'd prefer to make updates atomic in multi-threaded applications.
>>> The best proxy we have for that is -pthread.
>>>
>>> Is it slower, most definitely, but odds are we're giving folks
>>> garbage data otherwise, which in many ways is even worse.
>>
>> It will likely be catastrophically slower in some cases.
>>
>> Catastrophically as in too slow to be usable.
>>
>> An atomic instruction is a lot more expensive than a single increment. Also
>> they sometimes are really slow depending on the state of the machine.
> And for those cases there's a way to override.
> 
> The default should be set for correctness.
> 
> jeff

I would like to somehow resolve the discussion about the default value selection.
Is the prevailing consensus that we should set -fprofile-update=atomic when
-pthread is set? If so, I'll prepare a patch. I tend to do it this way.

Moreover, I also have a patch that provides a warning, which can also be useful
even if we change the default behavior:

$ ./xgcc -B. /tmp/a.c -fprofile-update=single -pthread -fprofile-generate
xgcc: warning: -profile-update=atomic should be used to generate a valid profile for a multithreaded application

Ideas?
Martin
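
As background for the "garbage data" concern quoted above: a non-atomic
counter update is a load, an add and a store, and two threads can both load
before either stores. The sketch below replays that interleaving
deterministically in a single thread, so it is a model of the race rather
than a real one. All names are hypothetical, and the relaxed fetch_add only
approximates what an atomic profile update would do.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

namespace profile_sketch {

// The value a thread has read but not yet written back.
struct pending_update { std::int64_t loaded; };

inline pending_update counter_load(const std::int64_t& counter)
{
  return pending_update{counter};
}

inline void counter_store(std::int64_t& counter, pending_update p)
{
  counter = p.loaded + 1;
}

// Both "threads" load before either stores: one increment is lost.
inline std::int64_t racy_two_increments()
{
  std::int64_t counter = 0;
  const pending_update a = counter_load(counter);
  const pending_update b = counter_load(counter);
  counter_store(counter, a);
  counter_store(counter, b);
  return counter;
}

// Each load/store pair completes before the next one starts.
inline std::int64_t serialized_two_increments()
{
  std::int64_t counter = 0;
  counter_store(counter, counter_load(counter));
  counter_store(counter, counter_load(counter));
  return counter;
}

// An atomic read-modify-write cannot be split by another thread.
inline std::int64_t atomic_two_increments()
{
  std::atomic<std::int64_t> counter{0};
  counter.fetch_add(1, std::memory_order_relaxed);
  counter.fetch_add(1, std::memory_order_relaxed);
  return counter.load(std::memory_order_relaxed);
}

} // namespace profile_sketch
```

The racy variant ends with a count of 1 instead of 2, which is how arc
counters drift in a multithreaded program profiled with
-fprofile-update=single. The atomic variant is also where the slowdown
discussed above comes from.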

From d5a8097dd07d1a3f4263da7ccad970543d92f3e9 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 3 Oct 2016 14:02:14 +0200
Subject: [PATCH] Warn about -fprofile-update=single and -pthread

gcc/ChangeLog:

2016-10-03  Martin Liska  

	* common.opt: Mark couple of flags with 'Driver' keyword.
	* gcc.c (driver_handle_option): Handle these options.
	(process_command): Generate the warning.
---
 gcc/common.opt |  8 
 gcc/gcc.c  | 31 +++
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 0e01577..3af9c64 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1920,7 +1920,7 @@ Common Report Var(profile_flag)
 Enable basic program profiling code.
 
 fprofile-arcs
-Common Report Var(profile_arc_flag)
+Common Driver Report Var(profile_arc_flag)
 Insert arc-based program profiling code.
 
 fprofile-dir=
@@ -1933,7 +1933,7 @@ Common Report Var(flag_profile_correction)
 Enable correction of flow inconsistent profile data input.
 
 fprofile-update=
-Common Joined RejectNegative Enum(profile_update) Var(flag_profile_update) Init(PROFILE_UPDATE_SINGLE)
+Common Driver Joined RejectNegative Enum(profile_update) Var(flag_profile_update) Init(PROFILE_UPDATE_SINGLE)
 -fprofile-update=[single|atomic]	Set the profile update method.
 
 Enum
@@ -1946,11 +1946,11 @@ EnumValue
 Enum(profile_update) String(atomic) Value(PROFILE_UPDATE_ATOMIC)
 
 fprofile-generate
-Common
+Common Driver
 Enable common options for generating profile info for profile feedback directed optimizations.
 
 fprofile-generate=
-Common Joined RejectNegative
+Common Driver Joined RejectNegative
 Enable common options for generating profile info for profile feedback directed optimizations, and set -fprofile-dir=.
 
 fprofile-use
diff --git a/gcc/gcc.c b/gcc/gcc.c
index d3e8c88..b023013 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -233,6 +233,16 @@ static int print_subprocess_help;
 /* Linker suffix passed to -fuse-ld=... */
 static const char *use_ld;
 
+/* Flag indicating whether pthread is provided as a command line option.  */
+static bool pthread_set = false;
+
+/* Flag indicating whether profiling is enabled by an option.  */
+static bool profiling_enabled = false;
+
+/* Flag indicating whether profile-update=atomic is provided as a command
+   line option.  */
+static bool profile_update_atomic = false;
+
 /* Whether we should report subprocess execution times to a file.  */
 
 FILE *report_times_to_file = NULL;
@@ -4112,6 +4122,22 @@ driver_handle_option (struct gcc_options *opts,
   handle_foffload_option (arg);
   break;
 
+case OPT_fprofile_update_:
+  if ((profile_update)value == PROFILE_UPDATE_ATOMIC)
+	profile_update_atomic = true;
+  break;
+
+case OPT_pthread:
+  pthread_set = true;
+  break;
+
+case OPT_fprofile_generate:
+case OPT_fprofile_generate_:
+case OPT_fprofile_arcs:
+case OPT_coverage:
+  profiling_enabled = true;
+  break;
+
 default:
   /* Various driver options need no special processing at this
 	 point, having been handled in a prescan above or being
@@ -4580,6 +4606,11 @@ process_command (unsigned int decoded_options_count,
   add_infile ("help-dummy", "c");
 }
 
+  /* Warn about multi-threaded programs that do not use
+     -fprofile-update=atomic.  */
+  if (profiling_enabled && pthread_set && !profile_update_atomic)
+warning (0, "-profile-update=atomic should be used to generate a valid"
+	 " profile for a multithreaded application");
+
   /* Decide if undefined variable references are allowed in specs.  */
 
   /* --version and --help alone or together are safe.  Note that -v would
-- 
2.9.2



Re: gcc build problem (i386.c) -- missing declaration

2016-10-03 Thread Gerald Pfeifer
On Thu, 29 Sep 2016, Louis Krupp wrote:
> My target was gfortran.
> 
> In any case, someone else fixed this problem.

Good.

Note that by target we are referring to the platform (processor plus
operating system).  You can see this by looking for a line starting
with "Target:" in the output of `gcc -v`.

On one of my machines this says "Target: x86_64-suse-linux", on
another one "Target: i386-unknown-freebsd10.3", for example.

Gerald


Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-10-03 Thread Nathan Sidwell

On 10/03/16 08:13, Martin Liška wrote:

On 08/18/2016 05:53 PM, Jeff Law wrote:

On 08/18/2016 09:51 AM, Andi Kleen wrote:

I'd prefer to make updates atomic in multi-threaded applications.
The best proxy we have for that is -pthread.

Is it slower, most definitely, but odds are we're giving folks
garbage data otherwise, which in many ways is even worse.


It will likely be catastrophically slower in some cases.

Catastrophically as in too slow to be usable.

An atomic instruction is a lot more expensive than a single increment. Also
they sometimes are really slow depending on the state of the machine.

And for those cases there's a way to override.

The default should be set for correctness.

jeff


I would like to somehow resolve the discussion related to default value selection.
Is the prevailing consensus that we should set -fprofile-update=atomic when
-pthread is set? If so, I'll prepare a patch. I tend to do it this way.


This is my preference.

nathan
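
For readers following the trade-off in this thread, here is a minimal C sketch of the two update styles. It is only an illustration, not the code GCC emits for either -fprofile-update mode; `bump_single` and `bump_atomic` are hypothetical names. A plain increment is a separate load, add and store, so two threads can overwrite each other's update; an atomic read-modify-write cannot lose updates, but it costs more per update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: plain load/add/store.  Cheap, but concurrent
   callers updating the same counter can lose increments.  */
long
bump_single (long *counter)
{
  assert (counter != NULL);
  *counter = *counter + 1;
  return *counter;
}

/* Hypothetical helper: atomic read-modify-write.  No increment is lost,
   at a higher per-update cost.  Relaxed ordering is assumed to be enough
   here because only the final count matters.  */
long
bump_atomic (atomic_long *counter)
{
  assert (counter != NULL);
  return atomic_fetch_add_explicit (counter, 1, memory_order_relaxed) + 1;
}
```

Called from a single thread the two helpers give identical counts; they only diverge when several threads update the same counter, which is the case -pthread is being used as a proxy for.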


Re: [PATCH] Machine-readable RTL dumps: print_rtx_function

2016-10-03 Thread David Malcolm
On Sun, 2016-10-02 at 07:04 -0500, Segher Boessenkool wrote:
> On Thu, Sep 29, 2016 at 11:36:29AM -0600, Jeff Law wrote:
> > On 09/29/2016 11:25 AM, Bernd Schmidt wrote:
> > > On 09/29/2016 07:47 PM, David Malcolm wrote:
> > > > This patch adds a new function, print_rtx_function, intended
> > > > for use
> > > > for generating function dumps suitable for parsing by the RTL
> > > > frontend,
> > > > but also intended to be human-readable, and human-authorable.
> > > 
> > > > (note 1 0 4 (nil) NOTE_INSN_DELETED)
> > > > (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> > > > (insn 2 4 3 2 (set (mem/c:SI (plus:DI (reg/f:DI 82
> > > > virtual-stack-vars)
> > > > (const_int -4 [0xfffc])) [1
> > > > i+0
> > > > S4 A32])
> > > > (reg:SI 5 di [ i ])) t.c:2 -1
> > > >  (nil))
> > > 
> > > I think it might be a good idea to get rid of redundant
> > > information like
> > > insn numbers for such a dump format. But that can be left for
> > > followup
> > > patches.
> > I would make the same suggestion.  The insn # and backend pattern
> > name 
> > (if any) should be omitted in machine-readable dump format.  I'm
> > fine 
> > with that as a follow-up as well.
> 
> You need the insn id for (at least) code_label.

I think that Bernd is referring to the INSN_CODE, rather than
INSN_UID.


Re: [PATCH] Fix bootstrap with --enable-languages=all,go

2016-10-03 Thread Rainer Orth
Andrew Haley  writes:

> On 30/09/16 23:16, Rainer Orth wrote:
>> me too, though mostly to have maximum test coverage (primarily on
>> Solaris).  As expected, a x86_64-apple-darwin16 bootstrap with
>> --enable-objc-gc just failed for me.  I'm testing the following patch
>> (on top of Jakub's).
>> 
>>  Rainer
>> 
>> 
>> 2016-10-01  Rainer Orth  
>> 
>>  * configure.ac (target_libraries): Readd target-boehm-gc.
>>  Restore --enable-objc-gc handling.
>>  * configure: Regenerate.
>
> Thanks everybody.  My apologies.

The bootstrap completed successfully now.  Ok for mainline?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH][v4] GIMPLE store merging pass

2016-10-03 Thread Kyrill Tkachov

Hi Richard,
another question as I'm working through your comments...

On 29/09/16 11:45, Richard Biener wrote:



+  /* The region from the byte array that we're inserting into.  */
+  tree ptr_wide_int
+   = native_interpret_expr (dest_int_type, ptr + first_byte,
+total_bytes);
+
+  gcc_assert (ptr_wide_int);
+  wide_int dest_wide_int
+   = wi::to_wide (ptr_wide_int, TYPE_PRECISION (dest_int_type));
+  wide_int expr_wide_int
+   = wi::to_wide (tmp_int, byte_size * BITS_PER_UNIT);
+  if (BYTES_BIG_ENDIAN)
+   {
+ unsigned int insert_pos
+   = byte_size * BITS_PER_UNIT - bitlen - (bitpos % BITS_PER_UNIT);
+ dest_wide_int
+   = wi::insert (dest_wide_int, expr_wide_int, insert_pos, bitlen);
+   }
+  else
+   dest_wide_int = wi::insert (dest_wide_int, expr_wide_int,
+   bitpos % BITS_PER_UNIT, bitlen);
+
+  tree res = wide_int_to_tree (dest_int_type, dest_wide_int);
+  native_encode_expr (res, ptr + first_byte, total_bytes, 0);
+

OTOH this whole dance looks as complicated and way more expensive than
using native_encode_expr into a temporary buffer and then a
manually implemented "bit-merging" of it at ptr + first_byte + bitpos.
AFAICS that operation is even endianness agnostic.


If the quantity we're inserting at a non-byte boundary
is more than a byte wide we still have to shift the value
to position properly across the bytes it straddles, so I don't
see how we can avoid creating a wide_int here.
Consider inserting a 10-bit value at bit position 3 (I hope the mailer
doesn't screw up the indentation):
value:  xxxxxxxxxx
before: |--------||--------|
        | byte 1 || byte 2 |
after:  |---xxxxx||xxxxx---|

We'll native_encode_expr the value into a two-byte buffer but then we can't
just shift each byte by 3 to insert it into the destination buffer, we need
to form the whole 10-bit value and shift it as a whole to not lose any bits.

And if a value crosses bytes then we need to care about BYTES_BIG_ENDIAN when
writing the bytes back into the buffer, no?

Thanks,
Kyrill
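
The point about shifting the value as a whole can be sketched in plain C. This is a hypothetical helper (`insert_bits` is not a GCC function) and it assumes a little-endian layout in which bit 0 is the least significant bit of the first byte; the BYTES_BIG_ENDIAN case raised above is deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: insert the low BITLEN bits of VALUE
   into BUF starting at bit BITPOS.  Bit 0 is the least significant bit of
   BUF[0] (little-endian layout assumed); BITLEN is between 1 and 32.  */
void
insert_bits (unsigned char *buf, size_t bitpos, unsigned bitlen,
	     uint32_t value)
{
  assert (buf != NULL && bitlen >= 1 && bitlen <= 32);

  unsigned shift = bitpos % 8;
  /* Form the whole field and shift it as one quantity, so that bits
     crossing a byte boundary carry into the next byte.  */
  uint64_t mask = ((UINT64_C (1) << bitlen) - 1) << shift;
  uint64_t field = ((uint64_t) value << shift) & mask;

  /* Merge the shifted field into every byte it straddles.  */
  unsigned char *p = buf + bitpos / 8;
  for (; mask != 0; mask >>= 8, field >>= 8, p++)
    *p = (unsigned char) ((*p & ~mask) | (field & 0xff));
}
```

With a zeroed two-byte buffer, `insert_bits (buf, 3, 10, 0x3ff)` sets the top five bits of the first byte and the low five bits of the second, matching the picture above.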


Re: [PATCH, RFC] gcov: dump in a static dtor instead of in an atexit handler

2016-10-03 Thread Rainer Orth
Hi Martin,

> On 09/30/2016 02:31 PM, Rainer Orth wrote:
>> this would be i386-pc-solaris2.12.  I'm not sure if the constructor
>> priority detection works in a cross scenario.
>> 
>> I'm attaching the resulting assembly (although for Solaris as, the gas
>> build is still running).
>
> Hi. Sorry, I made a stupid mistake in the dtor priority
> (I used 65534 instead of the desired 99). Please try to test it on Solaris 12
> with the attached patch. I'll send the patch to ML soon.

unfortunately, the patch makes no difference on Solaris 12.  The test
even FAILs when using gas/gld, which is a different/independent
implementation of constructor priority.

> Can you please test whether it makes any change on a Solaris target w/o
> prioritized ctors/dtors?

It doesn't: the test PASSes on Solaris 10 and 11 with and without your
patch.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCHv2] Cleanup of input.c

2016-10-03 Thread David Malcolm
On Sun, 2016-10-02 at 13:07 +, Bernd Edlinger wrote:
> Hi Dave,
> 
> here is the new version of the input.c patch:
> 
> I have updated the comments, and revised the test case as requested.
> I have additionally done a bootstrap with build config=bootstrap
> -asan.

Thanks.   A couple of nits inline below...

> Bootstrap and reg-testing on x86_64-pc-linux-gnu.
> Is it OK for trunk?

> 2016-09-26  Bernd Edlinger  
> 
>   PR preprocessor/77699
>   * input.c (maybe_grow): Don't allocate one byte extra headroom.
>   (get_next_line): Return false on error.
>   (read_next_line): Removed, use get_next_line instead.
>   (read_line_num): Don't copy the line.
>   (location_get_source_line): Don't use static data.
>   (test_reading_source_line): Add more test cases.

FWIW I've been adding selftest:: to those symbols within the namespace
in ChangeLog entries, so I would have written this last one as:
(selftest::test_reading_source_line): Add more test cases.
(mostly out of wanting to emphasize the "real" code vs test code
split).

That said, I don't think we have any official policy on this.

> Index: gcc/input.c
> ===
> --- gcc/input.c   (revision 240693)
> +++ gcc/input.c   (working copy)

[...snip...]

> @@ -643,15 +612,15 @@ goto_next_line (fcache *cache)
>  }
>  
>  /* Read an arbitrary line number LINE_NUM from the file cached in C.
> -   The line is copied into *LINE.  *LINE_LEN must have been set to the
> -   length of *LINE.  If *LINE is too small (or NULL) it's extended (or
> -   allocated) and *LINE_LEN is adjusted accordingly.  *LINE ends up
> -   with a terminal zero byte and can contain additional zero bytes.
> +   If the line was read successfully, *LINE points to the beginning
> +   of the line in the file cache and *LINE_LEN is the length of the
> +   line.  *LINE is not nul-terminated, but may contain zero bytes.
> +   *LINE is only valid until the next call of read_line_num.
> This function returns bool if a line was read.  */
>  
>  static bool
>  read_line_num (fcache *c, size_t line_num,
> -char ** line, ssize_t *line_len)
> +char **line, ssize_t *line_len)
>  {
>gcc_assert (line_num > 0);
>  
> @@ -705,12 +674,8 @@ read_line_num (fcache *c, size_t line_num,
>   {
> /* We have the start/end of the line.  Let's just copy
>it again and we are done.  */

The reference to a "copy" in this comment is now invalid.  Maybe the
comment should now simply read:

  /* We have the start/end of the line.  */

or somesuch.

> -   ssize_t len = i->end_pos - i->start_pos + 1;
> -   if (*line_len < len)
> - *line = XRESIZEVEC (char, *line, len);
> -   memmove (*line, c->data + i->start_pos, len);
> -   (*line)[len - 1] = '\0';
> -   *line_len = --len;
> +   *line = c->data + i->start_pos;
> +   *line_len = i->end_pos - i->start_pos;
> return true;
>   }
>  

[...snip...]

OK for trunk with the above comment nit fixed.

Dave
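
As an illustration of the revised contract (a pointer into the cached data plus a length, with no copy and no terminating nul), here is a small C sketch. `find_line` is a hypothetical stand-in and does not reproduce the fcache logic in input.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the input.c code: find line LINE_NUM (1-based)
   in DATA of SIZE bytes.  On success return 1, make *LINE point into DATA
   (no copy, no terminating nul) and set *LINE_LEN to the length without
   the newline.  Return 0 if there is no such line.  */
int
find_line (const char *data, size_t size, size_t line_num,
	   const char **line, size_t *line_len)
{
  assert (data != NULL && line_num > 0);

  const char *p = data;
  const char *end = data + size;
  for (size_t n = 1; p < end; n++)
    {
      const char *nl = memchr (p, '\n', (size_t) (end - p));
      const char *stop = nl ? nl : end;
      if (n == line_num)
	{
	  *line = p;
	  *line_len = (size_t) (stop - p);
	  return 1;
	}
      p = nl ? nl + 1 : end;
    }
  return 0;
}
```

Because the result is not nul-terminated, a caller has to carry the length along, for example by printing with `printf ("%.*s", (int) len, line)`.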


[PATCH 5/6] rs6000: Separate shrink-wrapping

2016-10-03 Thread Segher Boessenkool
This implements the hooks for separate shrink-wrapping for rs6000.
It handles GPRs and LR.  The GPRs get a component number corresponding
to their register number; LR gets component number 0.
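
The numbering can be modelled with a plain 32-bit set. The sketch below is a simplified, hypothetical model (`toy_separate_components` and its parameters are made up for illustration); it keeps only the signed 16-bit offset check and leaves out the liveness, frame pointer and TOC register conditions that the patch applies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the numbering, not the rs6000.c code: return a
   32-bit set in which bit N stands for GPR N and bit 0 stands for LR.
   FIRST_SAVED is the first GPR that needs saving, OFFSET the frame offset
   of its save slot, and REG_SIZE the slot size in bytes.  */
uint32_t
toy_separate_components (unsigned first_saved, long offset, int reg_size,
			 int lr_saved, long lr_offset)
{
  assert (first_saved >= 1 && first_saved <= 32);

  uint32_t components = 0;
  for (unsigned regno = first_saved; regno < 32; regno++)
    {
      /* Only slots reachable with a signed 16-bit displacement qualify.  */
      if (offset >= -0x8000 && offset <= 0x7fff)
	components |= UINT32_C (1) << regno;
      offset += reg_size;
    }

  /* LR is component 0.  FIRST_SAVED is at least 1 in this sketch, so
     bit 0 is never used for a GPR.  */
  if (lr_saved && lr_offset >= -0x8000 && lr_offset <= 0x7fff)
    components |= UINT32_C (1);

  return components;
}
```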


2016-06-07  Segher Boessenkool  

* config/rs6000/rs6000.c (machine_function): Add new fields
gpr_is_wrapped_separately and lr_is_wrapped_separately.
(TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS,
TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB,
TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS,
TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS,
TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS,
TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Define.
(rs6000_get_separate_components): New function.
(rs6000_components_for_bb): New function.
(rs6000_disqualify_components): New function.
(rs6000_emit_prologue_components): New function.
(rs6000_emit_epilogue_components): New function.
(rs6000_set_handled_components): New function.
(rs6000_emit_prologue): Don't emit LR save if lr_is_wrapped_separately.
Don't emit GPR saves if gpr_is_wrapped_separately for that register.
(restore_saved_lr): Don't restore LR if lr_is_wrapped_separately.
(rs6000_emit_epilogue): Don't emit GPR restores if
gpr_is_wrapped_separately for that register.  Don't make a
REG_CFA_RESTORE note for registers we did not restore, either.
---
 gcc/config/rs6000/rs6000.c | 269 ++---
 1 file changed, 253 insertions(+), 16 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6897b5c..ff606c9 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -153,6 +153,10 @@ typedef struct GTY(()) machine_function
   bool split_stack_argp_used;
   /* Flag if r2 setup is needed with ELFv2 ABI.  */
   bool r2_setup_needed;
+  /* The components already handled by separate shrink-wrapping, which should
+ not be considered by the prologue and epilogue.  */
+  bool gpr_is_wrapped_separately[32];
+  bool lr_is_wrapped_separately;
 } machine_function;
 
 /* Support targetm.vectorize.builtin_mask_for_load.  */
@@ -1514,6 +1518,19 @@ static const struct attribute_spec rs6000_attribute_table[] =
 #undef TARGET_SET_UP_BY_PROLOGUE
 #define TARGET_SET_UP_BY_PROLOGUE rs6000_set_up_by_prologue
 
+#undef TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS
+#define TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS rs6000_get_separate_components
+#undef TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB
+#define TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB rs6000_components_for_bb
+#undef TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS
+#define TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS rs6000_disqualify_components
+#undef TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS
+#define TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS rs6000_emit_prologue_components
+#undef TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS
+#define TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS rs6000_emit_epilogue_components
+#undef TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS
+#define TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS rs6000_set_handled_components
+
 #undef TARGET_EXTRA_LIVE_ON_ENTRY
 #define TARGET_EXTRA_LIVE_ON_ENTRY rs6000_live_on_entry
 
@@ -27285,6 +27302,212 @@ rs6000_global_entry_point_needed_p (void)
   return cfun->machine->r2_setup_needed;
 }
 
+/* Implement TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS.  */
+static sbitmap
+rs6000_get_separate_components (void)
+{
+  rs6000_stack_t *info = rs6000_stack_info ();
+
+  if (!(info->savres_strategy & SAVE_INLINE_GPRS)
+  || !(info->savres_strategy & REST_INLINE_GPRS)
+  || WORLD_SAVE_P (info))
+return NULL;
+
+  sbitmap components = sbitmap_alloc (32);
+  bitmap_clear (components);
+
+  /* The GPRs we need saved to the frame.  */
+  int reg_size = TARGET_32BIT ? 4 : 8;
+  int offset = info->gp_save_offset;
+  if (info->push_p)
+offset += info->total_size;
+
+  for (unsigned regno = info->first_gp_reg_save; regno < 32; regno++)
+{
+  if (IN_RANGE (offset, -0x8000, 0x7fff)
+ && rs6000_reg_live_or_pic_offset_p (regno))
+   bitmap_set_bit (components, regno);
+
+  offset += reg_size;
+}
+
+  /* Don't mess with the hard frame pointer.  */
+  if (frame_pointer_needed)
+bitmap_clear_bit (components, HARD_FRAME_POINTER_REGNUM);
+
+  /* Don't mess with the fixed TOC register.  */
+  if ((TARGET_TOC && TARGET_MINIMAL_TOC)
+  || (flag_pic == 1 && DEFAULT_ABI == ABI_V4)
+  || (flag_pic && DEFAULT_ABI == ABI_DARWIN))
+bitmap_clear_bit (components, RS6000_PIC_OFFSET_TABLE_REGNUM);
+
+  /* Optimize LR save and restore if we can.  This is component 0.  */
+  if (info->lr_save_p
+  && !(flag_pic && (DEFAULT_ABI == ABI_V4 || DEFAULT_ABI == ABI_DARWIN)))
+{
+  offset = info->lr_save_offset;
+  if (info->push_p)
+   offset += info->total_size;
+  if (IN_RANGE (offset, -0x8000, 0x7fff))
+   bitmap_set_bit (components, 0);
+}

[PATCH 4/6] shrink-wrap: Shrink-wrapping for separate components

2016-10-03 Thread Segher Boessenkool
This is the main substance of this patch series.

Instead of doing all of the prologue and epilogue in one spot, it often
is better to do components of it at different places, so that they are
executed less frequently.

What exactly is a component is completely up to the target; this code
treats it all abstractly, and uses hooks for the target to handle the
more concrete things.  Commonly there is one component for each callee-
saved register, for example.

Components can be executed more than once per function execution.  This
pass makes sure that a component's epilogue is not called more often
than the corresponding prologue has been, at any point in time; that the
prologue is called more often, wherever the prologue's effect is needed;
and that the epilogue is called as often as the prologue has been, when
the function exits.  It does this by first deciding which blocks need
which components active, and then placing prologue and epilogue
components to make that exactly true.

Deciding what blocks should run with a certain component active so that
the total cost of executing the prologues (and epilogues) is optimal, is
not a computationally feasible problem.  Instead, for each basic block,
we estimate the cost of putting a prologue right before the block, and
if that is cheaper than the total cost of putting prologues optimally
(according to the estimated cost) in the dominator subtrees strictly
dominated by this first block, place it at the first block instead.
This simple procedure places the components optimally for any dominator
subtree where the root node's cost does not depend on anything outside
its subtree.

The cost is the execution frequency of all edges into the block coming
from blocks that do not have this component active.  The estimated cost
is the execution frequency of the block, minus the execution frequency
of any backedges (which by definition are coming from subtrees, so if
the "head" block gets a prologue, the source block of any backedge has
that component active as well).
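
The placement rule described above can be sketched on a toy dominator tree. This is a hypothetical model, not the code in shrink-wrap.c: `struct toy_block` and `toy_place` are made-up names, and `own_cost` is assumed to already hold the estimated cost (block frequency minus backedge frequency).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KIDS 4

/* Hypothetical toy node of a dominator tree; not GCC's basic_block.  */
struct toy_block
{
  long own_cost;	/* Estimated cost of a prologue right before it.  */
  int needs_component;	/* Nonzero if the block itself uses the component.  */
  int place_here;	/* Output: nonzero if the prologue is placed here.  */
  size_t n_kids;
  struct toy_block *kids[MAX_KIDS];	/* Immediately dominated blocks.  */
};

/* Return the estimated cost of the cheapest placement in B's subtree.
   If the prologue is placed at B, placements recorded deeper in the
   subtree become redundant, since the component is then active there.  */
long
toy_place (struct toy_block *b)
{
  assert (b != NULL && b->n_kids <= MAX_KIDS);

  long kids_cost = 0;
  for (size_t i = 0; i < b->n_kids; i++)
    kids_cost += toy_place (b->kids[i]);

  /* Place at B if B needs the component itself, or if doing so is
     cheaper than the best placement inside the subtrees.  */
  if (b->needs_component || b->own_cost < kids_cost)
    {
      b->place_here = 1;
      return b->own_cost;
    }
  b->place_here = 0;
  return kids_cost;
}
```

For a root with estimated cost 100 and two children with costs 10 and 20 that both need the component, the sketch keeps the prologues in the children (total 30); if the root's cost drops to 25, it places a single prologue at the root instead.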

Currently, the epilogues are placed as late as possible, given the
constraints.  This does not matter for execution cost, but we could
save a little bit of code size by placing the epilogues in a smarter
way.  This is a possible future optimisation.

Now all that is left is inserting prologues and epilogues on all edges
that jump into resp. out of the "active" set of blocks.  Often we need
to insert some components' prologues (or epilogues) on all edges into
(or out of) a block.  In theory cross-jumping can unify all such, but
in practice that often fails; besides, that is a lot of work.  So in
this case we insert the prologue and epilogue components at the "head"
or "tail" of a block, instead.

As a final optimisation, if a block needs a prologue and its immediate
dominator has the block as a post-dominator, that immediate dominator
gets the prologue as well.


2016-06-07  Segher Boessenkool  

* function.c (thread_prologue_and_epilogue_insns): Recompute the
live info.  Call try_shrink_wrapping_separate.  Compute the
prologue_seq afterwards, if it has possibly changed.  Compute the
split_prologue_seq and epilogue_seq later, too.
* shrink-wrap.c: #include cfgbuild.h.
(dump_components): New function.
(struct sw): New struct.
(SW): New function.
(init_separate_shrink_wrap): New function.
(fini_separate_shrink_wrap): New function.
(place_prologue_for_one_component): New function.
(spread_components): New function.
(disqualify_problematic_components): New function.
(emit_common_heads_for_components): New function.
(emit_common_tails_for_components): New function.
(insert_prologue_epilogue_for_components): New function.
(try_shrink_wrapping_separate): New function.
* shrink-wrap.h: Declare try_shrink_wrapping_separate.

---
 gcc/function.c|  15 +-
 gcc/shrink-wrap.c | 741 ++
 gcc/shrink-wrap.h |   1 +
 3 files changed, 754 insertions(+), 3 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index 94ed786..6d2a079 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5920,16 +5920,25 @@ thread_prologue_and_epilogue_insns (void)
   edge entry_edge = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
   edge orig_entry_edge = entry_edge;
 
-  rtx_insn *split_prologue_seq = make_split_prologue_seq ();
   rtx_insn *prologue_seq = make_prologue_seq ();
-  rtx_insn *epilogue_seq = make_epilogue_seq ();
 
   /* Try to perform a kind of shrink-wrapping, making sure the
  prologue/epilogue is emitted only around those parts of the
  function that require it.  */
-
   try_shrink_wrapping (&entry_edge, prologue_seq);
 
+  /* If the target can handle splitting the prologue/epilogue into separate
+ components, try to shrink-wrap these components separately.  */
+  try_shrink_wrapping_separate (entry_edge->dest);
+
+ 

[PATCH 3/6] regrename: Don't rename restores

2016-10-03 Thread Segher Boessenkool
A restore is supposed to restore some certain register.  Restoring it
into some other register will not work.  Don't.


2016-06-07  Segher Boessenkool  

* regrename.c (build_def_use): Invalidate chains that have a
REG_CFA_RESTORE on some instruction.

---
 gcc/regrename.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/regrename.c b/gcc/regrename.c
index 3509e8b..e0d2dd1 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -1655,6 +1655,7 @@ build_def_use (basic_block bb)
 (6) For any non-earlyclobber write we find in an operand, make
 a new chain or mark the hard register as live.
 (7) For any REG_UNUSED, close any chains we just opened.
+(8) For any REG_CFA_RESTORE, kill any chain containing it.
 
 We cannot deal with situations where we track a reg in one mode
 and see a reference in another mode; these will cause the chain
@@ -1867,6 +1868,12 @@ build_def_use (basic_block bb)
scan_rtx (insn, &XEXP (note, 0), NO_REGS, terminate_dead,
  OP_IN);
  }
+
+ /* Step 8: Kill the chains involving register restores.  Those
+should restore _that_ register.  */
+ for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
+   if (REG_NOTE_KIND (note) == REG_CFA_RESTORE)
+ scan_rtx (insn, &XEXP (note, 0), NO_REGS, mark_all_read, OP_IN);
}
   else if (DEBUG_INSN_P (insn)
   && !VAR_LOC_UNKNOWN_P (INSN_VAR_LOCATION_LOC (insn)))
-- 
1.9.3



[PATCH 6/6] shrink-wrap: Testcases for separate shrink-wrapping

2016-10-03 Thread Segher Boessenkool
A few testcases for separate shrink-wrapping: test whether it works in a
trivial case; whether it creates more than one prologue where that is
useful; whether it puts prologues inside a loop if that is cheaper.


2016-10-03  Segher Boessenkool  

gcc/testsuite/
* gcc.target/powerpc/shrink-wrap-separate-0.c: New testcase.
* gcc.target/powerpc/shrink-wrap-separate-1.c: New testcase.
* gcc.target/powerpc/shrink-wrap-separate-2.c: New testcase.

---
 .../gcc.target/powerpc/shrink-wrap-separate-0.c| 22 ++
 .../gcc.target/powerpc/shrink-wrap-separate-1.c| 18 +++
 .../gcc.target/powerpc/shrink-wrap-separate-2.c| 26 ++
 3 files changed, 66 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c

diff --git a/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
new file mode 100644
index 000..dea0611
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler {#before\M.*\mmflr\M} } } */
+
+/* This tests if shrink-wrapping for separate components works.
+
+   r20 (a callee-saved register) is forced live at the start, so that we
+   get it saved in a prologue at the start of the function.
+   The link register only needs to be saved if x is non-zero; without
+   separate shrink-wrapping it would however be saved in the one prologue.
+   The test tests if the mflr insn ends up behind the prologue.  */
+
+void g(void);
+
+void f(int x)
+{
+   register int r20 asm("20") = x;
+   asm("#before" : : "r"(r20));
+   if (x)
+   g();
+   asm(""); // no tailcall of g
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
new file mode 100644
index 000..735b606
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler {\mmflr\M.*\mbl\M.*\mmflr\M.*\mbl\M} } } */
+
+/* This tests if shrink-wrapping for separate components creates more
+   than one prologue when that is useful.  In this case, it saves the
+   link register before both the call to g and the call to h.  */
+
+void g(void) __attribute__((noreturn));
+void h(void) __attribute__((noreturn));
+
+void f(int x)
+{
+   if (x == 42)
+   g();
+   if (x == 31)
+   h();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c
new file mode 100644
index 000..b22564a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler {\mmflr\M.*\mbl\M.*\mmflr\M.*\mbl\M} } } */
+
+/* This tests if shrink-wrapping for separate components puts a prologue
+   inside a loop when that is useful.  In this case, it saves the link
+   register before each call: both calls happen with probability .10,
+   so saving the link register happens with .80 per execution of f on
+   average, which is smaller than 1 which you would get if you saved
+   it outside the loop.  */
+
+int *a;
+void g(void);
+
+void f(int x)
+{
+   int j;
+   for (j = 0; j < 4; j++) {
+   if (__builtin_expect(a[j], 0))
+   g();
+   asm("#" : : : "memory");
+   if (__builtin_expect(a[j], 0))
+   g();
+   a[j]++;
+   }
+}
-- 
1.9.3



[PATCH v4 0/6] Separate shrink-wrapping

2016-10-03 Thread Segher Boessenkool
I updated according to Jeff's latest comments (importantly, we cannot
move a *logue in front of a move in general), and added some testcases.

Bootstrapping is in progress on today's trunk, powerpc64-linux and
powerpc64le-linux.

Is this okay to commit now?


Segher


Segher Boessenkool (6):
  separate shrink-wrap: New command-line flag, status flag, hooks, and
doc
  dce: Don't dead-code delete separately wrapped restores
  regrename: Don't rename restores
  shrink-wrap: Shrink-wrapping for separate components
  rs6000: Separate shrink-wrapping
  shrink-wrap: Testcases for separate shrink-wrapping

 gcc/common.opt |   4 +
 gcc/config/rs6000/rs6000.c | 269 +++-
 gcc/dce.c  |   9 +
 gcc/doc/invoke.texi|  11 +-
 gcc/doc/tm.texi|  63 ++
 gcc/doc/tm.texi.in |  38 ++
 gcc/emit-rtl.h |   4 +
 gcc/function.c |  15 +-
 gcc/regrename.c|   7 +
 gcc/shrink-wrap.c  | 741 +
 gcc/shrink-wrap.h  |   1 +
 gcc/target.def |  57 ++
 .../gcc.target/powerpc/shrink-wrap-separate-0.c|  22 +
 .../gcc.target/powerpc/shrink-wrap-separate-1.c|  18 +
 .../gcc.target/powerpc/shrink-wrap-separate-2.c|  26 +
 15 files changed, 1265 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c

-- 
1.9.3



[PATCH 2/6] dce: Don't dead-code delete separately wrapped restores

2016-10-03 Thread Segher Boessenkool
If there is a separately wrapped register restore on some path that
is dead (say, control goes into an endless loop after it), then we
cannot delete that restore because that would confuse the DWARF CFI
(if there is another path joining).
This happens with gcc.dg/torture/pr53168.c, for example.


2016-06-07  Segher Boessenkool  

* dce.c (delete_unmarked_insns): Don't delete instructions with
a REG_CFA_RESTORE note.

---
 gcc/dce.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/dce.c b/gcc/dce.c
index ea3fb00..d510287 100644
--- a/gcc/dce.c
+++ b/gcc/dce.c
@@ -587,6 +587,15 @@ delete_unmarked_insns (void)
  if (!dbg_cnt (dce))
continue;
 
+ if (crtl->shrink_wrapped_separate
+ && find_reg_note (insn, REG_CFA_RESTORE, NULL))
+   {
+ if (dump_file)
+   fprintf (dump_file, "DCE: NOT deleting insn %d, it's a "
+   "callee-save restore\n", INSN_UID (insn));
+ continue;
+   }
+
  if (dump_file)
fprintf (dump_file, "DCE: Deleting insn %d\n", INSN_UID (insn));
 
-- 
1.9.3



[PATCH 1/6] separate shrink-wrap: New command-line flag, status flag, hooks, and doc

2016-10-03 Thread Segher Boessenkool
This patch adds a new command-line flag "-fshrink-wrap-separate", a status
flag "shrink_wrapped_separate", hooks for abstracting the target components,
and documentation for all those.


2016-06-07  Segher Boessenkool  

* common.opt (-fshrink-wrap-separate): New flag.
* doc/invoke.texi: Document it.
* doc/tm.texi.in (Shrink-wrapping separate components): New subsection.
* doc/tm.texi: Regenerate.
* emit-rtl.h (struct rtl_data): New field shrink_wrapped_separate.
* target.def (shrink_wrap): New hook vector.
(get_separate_components, components_for_bb, disqualify_components,
emit_prologue_components, emit_epilogue_components,
set_handled_components): New hooks.

---
 gcc/common.opt  |  4 
 gcc/doc/invoke.texi | 11 +-
 gcc/doc/tm.texi | 63 +
 gcc/doc/tm.texi.in  | 38 
 gcc/emit-rtl.h  |  4 
 gcc/target.def  | 57 
 6 files changed, 176 insertions(+), 1 deletion(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 0e01577..971f296 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2197,6 +2197,10 @@ Common Report Var(flag_shrink_wrap) Optimization
 Emit function prologues only before parts of the function that need it,
 rather than at the top of the function.
 
+fshrink-wrap-separate
+Common Report Var(flag_shrink_wrap_separate) Init(1) Optimization
+Shrink-wrap parts of the prologue and epilogue separately.
+
 fsignaling-nans
 Common Report Var(flag_signaling_nans) Optimization SetByCombined
 Disable optimizations observable by IEEE signaling NaNs.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6767462..7a167a64 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -399,7 +399,8 @@ Objective-C and Objective-C++ Dialects}.
 -fschedule-insns -fschedule-insns2 -fsection-anchors @gol
 -fselective-scheduling -fselective-scheduling2 @gol
 -fsel-sched-pipelining -fsel-sched-pipelining-outer-loops @gol
--fsemantic-interposition -fshrink-wrap -fsignaling-nans @gol
+-fsemantic-interposition -fshrink-wrap -fshrink-wrap-separate @gol
+-fsignaling-nans @gol
 -fsingle-precision-constant -fsplit-ivs-in-unroller @gol
 -fsplit-paths @gol
 -fsplit-wide-types -fssa-backprop -fssa-phiopt @gol
@@ -6590,6 +6591,7 @@ compilation time.
 -fmove-loop-invariants @gol
 -freorder-blocks @gol
 -fshrink-wrap @gol
+-fshrink-wrap-separate @gol
 -fsplit-wide-types @gol
 -fssa-backprop @gol
 -fssa-phiopt @gol
@@ -7500,6 +7502,13 @@ Emit function prologues only before parts of the function that need it,
 rather than at the top of the function.  This flag is enabled by default at
 @option{-O} and higher.
 
+@item -fshrink-wrap-separate
+@opindex fshrink-wrap-separate
+Shrink-wrap separate parts of the prologue and epilogue separately, so that
+those parts are only executed when needed.
+This option is on by default, but has no effect unless @option{-fshrink-wrap}
+is also turned on and the target supports this.
+
 @item -fcaller-saves
 @opindex fcaller-saves
 Enable allocation of values to registers that are clobbered by
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8a98ba4..e74ae47 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2924,6 +2924,7 @@ This describes the stack layout and calling conventions.
 * Function Entry::
 * Profiling::
 * Tail Calls::
+* Shrink-wrapping separate components::
 * Stack Smashing Protection::
 * Miscellaneous Register Hooks::
 @end menu
@@ -4853,6 +4854,68 @@ This hook should add additional registers that are computed by the prologue to t
 True if a function's return statements should be checked for matching the function's return type.  This includes checking for falling off the end of a non-void function.  Return false if no such check should be made.
 @end deftypefn
 
+@node Shrink-wrapping separate components
+@subsection Shrink-wrapping separate components
+@cindex shrink-wrapping separate components
+
+The prologue may perform a variety of target dependent tasks such as
+saving callee-saved registers, saving the return address, aligning the
+stack, creating a stack frame, initializing the PIC register, setting
+up the static chain, etc.
+
+On some targets some of these tasks may be independent of others and
+thus may be shrink-wrapped separately.  These independent tasks are
+referred to as components and are handled generically by the target
+independent parts of GCC.
+
+Using the following hooks those prologue or epilogue components can be
+shrink-wrapped separately, so that the initialization (and possibly
+teardown) those components do is not done as frequently on execution
+paths where this would be unnecessary.
+
+What exactly those components are is up to the target code; the generic
+code treats them abstractly, as a bit in an @code{sbitmap}.  These
+@code{sbitmap}s are allocated by the @code{shrink_wrap.get_separate_components}

[gomp4] update gfortran's tile clause error handling

2016-10-03 Thread Cesar Philippidis
This patch updates the fortran FE to generate errors, rather than
warnings, for non-positive integer tile clause arguments. I noticed this
problem when I ported over the C/C++ compile time test cases to fortran.
In addition to the two new test files, a couple of other existing tests
needed to be updated to accommodate this new behavior. I've applied it
to gomp-4_0-branch.

Nathan, I haven't looked too deeply into your tile changes yet. Do you
know if the fortran FE is doing anything wrong? I haven't checked if
it's lowering the tile clause in the proper format yet.

Cesar
2016-10-03  Cesar Philippidis  

	gcc/fortran/
	* openmp.c (resolve_oacc_positive_int_expr): Promote the
	warning to an error.

	gcc/testsuite/
	* gfortran.dg/goacc/loop-2.f95: Change expected tile clause
	warnings to errors. 
	* gfortran.dg/goacc/loop-5.f95: Likewise.
	* gfortran.dg/goacc/sie.f95: Likewise.
	* gfortran.dg/goacc/tile-1.f90: New test.
	* gfortran.dg/goacc/tile-2.f90: New test.


diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 92b9afe..399b5d1 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -3266,8 +3266,8 @@ resolve_oacc_positive_int_expr (gfc_expr *expr, const char *clause)
   resolve_oacc_scalar_int_expr (expr, clause);
   if (expr->expr_type == EXPR_CONSTANT && expr->ts.type == BT_INTEGER
   && mpz_sgn(expr->value.integer) <= 0)
-gfc_warning (0, "INTEGER expression of %s clause at %L must be positive",
-		 clause, &expr->where);
+gfc_error ("INTEGER expression of %s clause at %L must be positive",
+	   clause, &expr->where);
 }
 
 /* Emits error when symbol is pointer, cray pointer or cray pointee
diff --git a/gcc/testsuite/gfortran.dg/goacc/loop-2.f95 b/gcc/testsuite/gfortran.dg/goacc/loop-2.f95
index 0c902b2..d4c6273 100644
--- a/gcc/testsuite/gfortran.dg/goacc/loop-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/loop-2.f95
@@ -143,7 +143,7 @@ program test
   DO j = 1,10
   ENDDO
 ENDDO
-!$acc loop tile(-1) ! { dg-warning "must be positive" }
+!$acc loop tile(-1) ! { dg-error "must be positive" }
 do i = 1,10
 enddo
 !$acc loop tile(i) ! { dg-error "constant expression" }
@@ -307,7 +307,7 @@ program test
   DO j = 1,10
   ENDDO
 ENDDO
-!$acc loop tile(-1) ! { dg-warning "must be positive" }
+!$acc loop tile(-1) ! { dg-error "must be positive" }
 do i = 1,10
 enddo
 !$acc loop tile(i) ! { dg-error "constant expression" }
@@ -460,7 +460,7 @@ program test
 DO j = 1,10
 ENDDO
   ENDDO
-  !$acc kernels loop tile(-1) ! { dg-warning "must be positive" }
+  !$acc kernels loop tile(-1) ! { dg-error "must be positive" }
   do i = 1,10
   enddo
   !$acc kernels loop tile(i) ! { dg-error "constant expression" }
@@ -612,7 +612,7 @@ program test
 DO j = 1,10
 ENDDO
   ENDDO
-  !$acc parallel loop tile(-1) ! { dg-warning "must be positive" }
+  !$acc parallel loop tile(-1) ! { dg-error "must be positive" }
   do i = 1,10
   enddo
   !$acc parallel loop tile(i) ! { dg-error "constant expression" }
diff --git a/gcc/testsuite/gfortran.dg/goacc/loop-5.f95 b/gcc/testsuite/gfortran.dg/goacc/loop-5.f95
index d059cf7..fe137d5 100644
--- a/gcc/testsuite/gfortran.dg/goacc/loop-5.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/loop-5.f95
@@ -93,9 +93,6 @@ program test
   DO j = 1,10
   ENDDO
 ENDDO
-!$acc loop tile(-1) ! { dg-warning "must be positive" }
-do i = 1,10
-enddo
 !$acc loop vector tile(*)
 DO i = 1,10
 ENDDO
@@ -129,9 +126,6 @@ program test
   DO j = 1,10
   ENDDO
 ENDDO
-!$acc loop tile(-1) ! { dg-warning "must be positive" }
-do i = 1,10
-enddo
 !$acc loop vector tile(*)
 DO i = 1,10
 ENDDO
@@ -242,9 +236,6 @@ program test
 DO j = 1,10
 ENDDO
   ENDDO
-  !$acc kernels loop tile(-1) ! { dg-warning "must be positive" }
-  do i = 1,10
-  enddo
   !$acc kernels loop vector tile(*)
   DO i = 1,10
   ENDDO
@@ -333,9 +324,6 @@ program test
 DO j = 1,10
 ENDDO
   ENDDO
-  !$acc parallel loop tile(-1) ! { dg-warning "must be positive" }
-  do i = 1,10
-  enddo
   !$acc parallel loop vector tile(*)
   DO i = 1,10
   ENDDO
diff --git a/gcc/testsuite/gfortran.dg/goacc/sie.f95 b/gcc/testsuite/gfortran.dg/goacc/sie.f95
index 2d66026..b4dd9ed 100644
--- a/gcc/testsuite/gfortran.dg/goacc/sie.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/sie.f95
@@ -78,10 +78,10 @@ program test
   !$acc parallel num_gangs(i+1)
   !$acc end parallel
 
-  !$acc parallel num_gangs(-1) ! { dg-warning "must be positive" }
+  !$acc parallel num_gangs(-1) ! { dg-error "must be positive" }
   !$acc end parallel
 
-  !$acc parallel num_gangs(0) ! { dg-warning "must be positive" }
+  !$acc parallel num_gangs(0) ! { dg-error "must be positive" }
   !$acc end parallel
 
   !$acc parallel num_gangs() ! { dg-error "Invalid character in name" }
@@ -107,10 +107,10 @@ program test
   !$acc parallel num_workers(i+1)
   !$acc end parallel
 
-  !$acc parallel num_workers(-1

Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-10-03 Thread Tom Tromey
> "Eric" == Eric Botcazou  writes:

Eric> So, because of its excessive pickiness, the warning ends up making the user
Eric> butcher informative comments.  How is that helpful?

Those comments are not informative.  In most cases I kept the original
text just to forestall complaints.  But really if you read those
comments they are pointless.

That said, it would be better by far to have a mode where only the
attribute is accepted, and where comments aren't parsed.  Then gcc could
also warn when the attribute is used incorrectly.  The reason this is
preferable is that it helps protect against more errors, say those
introduced by merge mistakes.
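
To make the attribute-only style concrete, here is a minimal sketch in C. The `FALLTHROUGH` macro and the `classify` function are hypothetical names chosen for illustration; the sketch assumes a compiler that provides `__has_attribute` and the `fallthrough` statement attribute (GCC 7 or later), and falls back to a no-op elsewhere.

```c
#include <assert.h>  /* only needed for the checks in the usage note */

/* Hypothetical wrapper: use the statement attribute when the compiler
   advertises it, otherwise expand to a harmless no-op.  */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define FALLTHROUGH __attribute__((fallthrough))
# endif
#endif
#ifndef FALLTHROUGH
# define FALLTHROUGH ((void) 0)
#endif

/* Hypothetical example: case 0 deliberately continues into case 1.  */
int
classify (int kind)
{
  int score = 0;

  switch (kind)
    {
    case 0:
      score += 1;
      FALLTHROUGH;  /* marks the fall through without relying on a comment */
    case 1:
      score += 10;
      break;
    default:
      score = -1;
      break;
    }
  return score;
}
```

With this definition, `assert (classify (0) == 11)` holds, because case 0 runs both increments. The marker is part of the code rather than of a comment, so it does not depend on the compiler recognising any particular comment wording.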

Tom


Re: [gomp4] update gfortran's tile clause error handling

2016-10-03 Thread Nathan Sidwell

On 10/03/16 10:07, Cesar Philippidis wrote:


Nathan, I haven't looked too deeply into your tile changes yet. Do you
know if the fortran FE is doing anything wrong? I haven't checked if
it's lowering the tile clause in the proper format yet.


Thanks for working on this.  The problems I noticed (& fixed) in the C/C++
frontends were:


1) map '*' onto integer_zero_node -- this makes my changes cleaner.

2) should only accept integer constant expressions (whatever the fortran 
equivalent of that is).  While runtime values could be made to work, the std 
doesn't require that, and it would perform quite badly due to the lack of 
constant folding


3) failing to parse nested loops correctly.  It only parsed the outermost loop 
as a parallel loop.  Tile in many ways looks like collapse


If those could be addressed that'd be great -- it doesn't need my tile WIP to do 
that.
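
For readers less familiar with the clause, the points above can be illustrated with a small C sketch. The function name and the array sizes are hypothetical, and the code is not taken from the testsuite. The tile sizes are positive integer constants, and because two sizes are given, the clause covers the two tightly nested loops, much as `collapse(2)` would. Without `-fopenacc` the pragma is ignored and the loops run serially.

```c
#include <assert.h>  /* only needed for the checks in the usage note */

#define ROWS 4
#define COLS 6

/* Hypothetical example: fill a small grid in a tiled loop nest, then
   return the sum of its elements.  */
int
tiled_fill_sum (void)
{
  int grid[ROWS][COLS];

  /* tile(2, 3) names two constant, positive tile sizes, so it applies
     to both loops of the nest below.  */
#pragma acc parallel loop tile(2, 3)
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      grid[i][j] = i * COLS + j;

  /* Serial reduction on the host, kept outside the tiled region.  */
  int sum = 0;
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      sum += grid[i][j];
  return sum;
}
```

The grid holds the values 0 through 23, so `assert (tiled_fill_sum () == 276)` holds whether or not the pragma is honoured.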


--
Nathan Sidwell


Re: [PATCH, RFC] gcov: dump in a static dtor instead of in an atexit handler

2016-10-03 Thread Martin Liška
On 10/03/2016 03:03 PM, Rainer Orth wrote:
> Hi Martin,
> 
>> On 09/30/2016 02:31 PM, Rainer Orth wrote:
>>> this would be i386-pc-solaris2.12.  I'm not sure if the constructor
>>> priority detection works in a cross scenario.
>>>
>>> I'm attaching the resulting assembly (although for Solaris as, the gas
>>> build is still running).
>>
>> Hi. Sorry, I have a stupid mistake in dtor priority
>> (I used 65534 instead of desired 99). Please try to test it on Solaris 12
>> with the attached patch. I'll send the patch to ML soon.
> 
> unfortunately, the patch makes no difference on Solaris 12.  The test
> even FAILs when using gas/gld, which is a different/independent
> implementation of constructor priority.

Ok, can you please send me x.S file for Solaris 12?

> 
>> Can you please test whether it makes any change on a solaris target w/o
>> prioritized ctors/dtors?
> 
> It doesn't: the test PASSes on Solaris 10 and 11 with and without your
> patch.

I see, that would require the former approach using atexit, which would be
chosen depending on whether target supports prioritized dtors or not.

Martin

> 
>   Rainer
> 



[PATCH] Fix libstdc++ versioned namespace build

2016-10-03 Thread Jonathan Wakely

The versioned namespace build has been broken on all branches for some
time. It's due to new code that doesn't use the namespace macros in
the right places. This fixes all issues.

Rather than declaring the std::experimental::* namespaces in
 I've added a new file that declares them and is
only included by LFTS headers. That allows the new test to pass, which
verifies that the std::experimental namespace doesn't exist when no TS
headers are included.

PR libstdc++/68323
PR libstdc++/77794
* config/abi/pre/gnu-versioned-namespace.ver: Add exports for
__cxa_thread_atexit and __gnu_cxx::__freeres.
* include/Makefile.am: Add 
* include/Makefile.in: Regenerate.
* include/bits/basic_string.h: Fix nesting of versioned namespaces.
* include/bits/c++config: Declare versioned namespaces for literals.
* include/bits/regex.h (basic_regex, match_results): Add workarounds
for PR c++/59256.
* include/bits/uniform_int_dist.h: Fix nesting of versioned namespace.
* include/std/chrono: Likewise.
* include/std/complex: Likewise.
* include/std/string_view: Likewise.
* include/std/variant: Likewise. Add workaround for PR c++/59256.
* include/experimental/bits/fs_fwd.h: Declare versioned namespace.
* include/experimental/bits/lfts_config.h: Declare versioned
namespaces.
* include/experimental/algorithm: Include
.
* include/experimental/any: Likewise.
* include/experimental/bits/erase_if.h: Likewise.
* include/experimental/chrono: Likewise.
* include/experimental/functional: Likewise.
* include/experimental/memory_resource: Likewise.
* include/experimental/optional: Likewise.
* include/experimental/propagate_const: Likewise.
* include/experimental/random: Likewise.
* include/experimental/ratio: Likewise.
* include/experimental/system_error: Likewise.
* include/experimental/tuple: Likewise.
* include/experimental/type_traits: Likewise.
* include/experimental/utility: Likewise.
* include/experimental/string_view: Likewise. Fix nesting of
versioned namespaces.
* include/experimental/bits/string_view.tcc: Reopen inline namespace
for non-inline function definitions.
* testsuite/17_intro/using_namespace_std_exp_neg.cc: New test.
* testsuite/20_util/duration/literals/range.cc: Adjust dg-error line.
* testsuite/experimental/any/misc/any_cast_neg.cc: Likewise.
* testsuite/experimental/propagate_const/assignment/move_neg.cc:
Likewise.
* testsuite/experimental/propagate_const/cons/move_neg.cc: Likewise.
* testsuite/experimental/propagate_const/requirements2.cc: Likewise.
* testsuite/experimental/propagate_const/requirements3.cc: Likewise.
* testsuite/experimental/propagate_const/requirements4.cc: Likewise.
* testsuite/experimental/propagate_const/requirements5.cc: Likewise.
* testsuite/ext/profile/mutex_extensions_neg.cc: Likewise.

Tested x86_64-linux, with --enable-symvers=gnu-versioned-namespace and
--enable-symvers=gnu, on trunk and gcc-6 and gcc-5 branches.

The only failures are in synopsis.cc tests which expect to be able to
redeclare names in namespace std (which is ambiguous if they're really
declared in std::__7) or in tests that use scan-assembler or GDB and
the expected strings are different due to the __7 namespace. I will
probably add an effective target for the versioned namespace so we can
disable those tests when they're going to fail.

Committed to trunk and gcc-6 and gcc-5 branches.

commit 7a3e391a33130d8cee8d763978b6fdc7b0ffd8ea
Author: redi 
Date:   Mon Oct 3 14:35:28 2016 +

Fix libstdc++ versioned namespace build

PR libstdc++/68323
PR libstdc++/77794
* config/abi/pre/gnu-versioned-namespace.ver: Add exports for
__cxa_thread_atexit and __gnu_cxx::__freeres.
* include/Makefile.am: Add 
* include/Makefile.in: Regenerate.
* include/bits/basic_string.h: Fix nesting of versioned namespaces.
* include/bits/c++config: Declare versioned namespaces for literals.
* include/bits/regex.h (basic_regex, match_results): Add workarounds
for PR c++/59256.
* include/bits/uniform_int_dist.h: Fix nesting of versioned namespace.
* include/std/chrono: Likewise.
* include/std/complex: Likewise.
* include/std/string_view: Likewise.
* include/std/variant: Likewise. Add workaround for PR c++/59256.
* include/experimental/bits/fs_fwd.h: Declare versioned namespace.
* include/experimental/bits/lfts_config.h: Declare versioned
namespaces.
* include/experimental/algorithm: Include
.
* include/experimental/any: Likewise.
* include/experimental/bits/erase_if.h: Likewise.
* inclu

Re: [PATCH, OpenACC, Fortran] Fix PR77371, ICE on allocatable

2016-10-03 Thread Jakub Jelinek
On Sun, Oct 02, 2016 at 06:15:18PM +0800, Chung-Lin Tang wrote:
> This patch fixes the two ICEs listed on PR77371.
> One is due to the Fortran omp_privatize_by_reference hook returning true
> for types like 'character(kind=1)[1:XX] *', causing them to be processed
> by the path intended for C++ reference types.

The path isn't something intended for C++ reference types, but for all the
vars where whatever they point to should be privatized rather than just
their value.  Consider

program p
   integer, allocatable :: n
   integer :: m
   allocate (n)
   n = 6
!$acc parallel firstprivate(n) private(m)
   m = n
!$acc end parallel
end

testcase which with -fopenacc ICEs the same way, and then look carefully
what is done on

program p
   integer, allocatable :: n
   integer :: m
   allocate (n)
   n = 6
!$omp parallel firstprivate(n) private(m)
   m = n
!$omp end parallel
end

with -fopenmp.  The var is actually properly allocatable in the latter case,
while it is not with your patch on the first testcase: you just copy over the
host pointer, which is definitely not going to work on non-shared memory offloading.
There is nothing special about references that use POINTER_TYPE as opposed
to REFERENCE_TYPE.  So, please first get this working with firstprivate on
allocatables and only then start to play with reductions.

> The other one is simply not setting 'remove = true' while error_at() was 
> already called.

The gimplify.c change is ok for trunk.

> Tested without regressions, committed on gomp-4_0-branch,
> is this okay for trunk as well?
> 
> Thanks,
> Chung-Lin
> 
>   PR fortran/77371
>   * omp-low.c (lower_omp_target): Avoid reference-type processing
>   on pointers for firstprivate clause.
>   * gimplify.c (gimplify_adjust_omp_clauses): Add 'remove = true'
>   when emitting error on private/firstprivate reductions.
> 
>   testsuite/
>   * gfortran.dg/goacc/pr77371-1.f90: New test.
>   * gfortran.dg/goacc/pr77371-2.f90: New test.

Jakub


Re: [PATCH] Remove .jcr registry from the crtfiles

2016-10-03 Thread Joseph Myers
As usual when removing target macros they should be poisoned in system.h.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Remove x86 pcommit instruction

2016-10-03 Thread Andrew Senkevich
Hi,

This patch removes the PCOMMIT instruction, since it has been deprecated.

Please see
https://software.intel.com/en-us/blogs/2016/09/12/deprecate-pcommit-instruction
for details.

Regtested on x86_64.  Is it Ok for trunk?

2016-10-03  Andrew Senkevich  

gcc/

* common/config/i386/i386-common.c (OPTION_MASK_ISA_PCOMMIT_UNSET,
OPTION_MASK_ISA_PCOMMIT_SET): Deleted definitions.
(ix86_handle_option): Deleted handle of OPT_mpcommit.
* config.gcc: Deleted pcommitintrin.h
* config/i386/pcommitintrin.h: Deleted file.
* config/i386/cpuid.h (bit_PCOMMIT): Deleted.
* config/i386/driver-i386.c (host_detect_local_cpu): Deleted pcommit
detection.
* config/i386/i386-c.c (ix86_target_macros_internal): Deleted define
__PCOMMIT__.
* config/i386/i386.c (ix86_target_string): Deleted -mpcommit.
(PTA_PCOMMIT): Deleted define.
(ix86_option_override_internal): Deleted handle of option.
(ix86_valid_target_attribute_inner_p): Deleted pcommit.
* config/i386/i386-builtin.def (IX86_BUILTIN_PCOMMIT,
__builtin_ia32_pcommit): Deleted.
* config/i386/i386.h (TARGET_PCOMMIT, TARGET_PCOMMIT_P): Deleted.
* config/i386/i386.md (unspecv): Deleted UNSPECV_PCOMMIT.
(pcommit): Deleted instruction.
* config/i386/i386.opt: Deleted mpcommit.
* config/i386/x86intrin.h: Deleted inclusion of pcommitintrin.h.

gcc/testsuite/

* gcc.target/i386/pcommit-1.c: Deleted.
* gcc.target/i386/sse-12.c: Deleted -mpcommit option.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* g++.dg/other/i386-2.C: Ditto.
* g++.dg/other/i386-3.C: Ditto.


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2b771d1..0728a9d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,35 @@
+2016-10-03  Andrew Senkevich  
+
+   * common/config/i386/i386-common.c (OPTION_MASK_ISA_PCOMMIT_UNSET,
+   OPTION_MASK_ISA_PCOMMIT_SET): Deleted definitions.
+   (ix86_handle_option): Deleted handle of OPT_mpcommit.
+   * config.gcc: Deleted pcommitintrin.h
+   * config/i386/pcommitintrin.h: Deleted.
+   * config/i386/cpuid.h (bit_PCOMMIT): Deleted.
+   * config/i386/driver-i386.c (host_detect_local_cpu): Deleted pcommit
+   detection.
+   * config/i386/i386-c.c (ix86_target_macros_internal): Deleted define
+   __PCOMMIT__.
+   * config/i386/i386.c (ix86_target_string): Deleted -mpcommit.
+   (PTA_PCOMMIT): Deleted define.
+   (ix86_option_override_internal): Deleted handle of option.
+   (ix86_valid_target_attribute_inner_p): Deleted pcommit.
+   * config/i386/i386-builtin.def (IX86_BUILTIN_PCOMMIT,
+   __builtin_ia32_pcommit): Deleted.
+   * config/i386/i386.h (TARGET_PCOMMIT, TARGET_PCOMMIT_P): Deleted.
+   * config/i386/i386.md (unspecv): Deleted UNSPECV_PCOMMIT.
+   (pcommit): Deleted instruction.
+   * config/i386/i386.opt: Deleted mpcommit.
+   * config/i386/x86intrin.h: Delete inclusion of pcommitintrin.h.
+   * testsuite/gcc.target/i386/pcommit-1.c: Deleted.
+   * gcc/testsuite/gcc.target/i386/sse-12.c: Deleted -mpcommit option.
+   * gcc/testsuite/gcc.target/i386/sse-13.c: Ditto.
+   * gcc/testsuite/gcc.target/i386/sse-14.c: Ditto.
+   * gcc/testsuite/gcc.target/i386/sse-22.c: Ditto.
+   * gcc/testsuite/gcc.target/i386/sse-23.c: Ditto.
+   * gcc/testsuite/g++.dg/other/i386-2.C: Ditto.
+   * gcc/testsuite/g++.dg/other/i386-3.C: Ditto.
+
 2016-10-03  Kyrylo Tkachov  

Revert
diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 4f0a55f..ce1b5f7 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -86,7 +86,6 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_XSAVEC_SET \
   (OPTION_MASK_ISA_XSAVEC | OPTION_MASK_ISA_XSAVE)
 #define OPTION_MASK_ISA_CLWB_SET OPTION_MASK_ISA_CLWB
-#define OPTION_MASK_ISA_PCOMMIT_SET OPTION_MASK_ISA_PCOMMIT

 /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
as -msse4.2.  */
@@ -187,7 +186,6 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_CLFLUSHOPT_UNSET OPTION_MASK_ISA_CLFLUSHOPT
 #define OPTION_MASK_ISA_XSAVEC_UNSET OPTION_MASK_ISA_XSAVEC
 #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
-#define OPTION_MASK_ISA_PCOMMIT_UNSET OPTION_MASK_ISA_PCOMMIT
 #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
 #define OPTION_MASK_ISA_MWAITX_UNSET OPTION_MASK_ISA_MWAITX
 #define OPTION_MASK_ISA_CLZERO_UNSET OPTION_MASK_ISA_CLZERO
@@ -933,19 +931,6 @@ ix86_handle_option (struct gcc_options *opts,
}
   return true;

-case OPT_mpcommit:
-  if (value)
-   {
- opts->x_ix86_isa_flags |= OPTION_MASK_ISA_PCOMMIT_SET;
- o

Re: [patch] Fix ICE on ACATS test for Aarch64 at -O

2016-10-03 Thread Eric Botcazou
Ping: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01781.html

> 2016-09-26  Eric Botcazou  
> 
>   * expmed.c (expand_shift_1): Add MAY_FAIL parameter and do not assert
>   that the result is non-zero if it is true.
>   (maybe_expand_shift): New wrapper around expand_shift_1.
>   (emit_store_flag): Call maybe_expand_shift in lieu of expand_shift.

-- 
Eric Botcazou


RE: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439

2016-10-03 Thread Doug Gilmore
>From: Christophe Lyon [christophe.l...@linaro.org]
>Sent: Monday, October 03, 2016 12:05 AM
>To: Doug Gilmore
>Cc: gcc-patches@gcc.gnu.org
>Subject: Re: Fix PR tree-optimization/77808, ICE in 
>duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>
>On 2 October 2016 at 23:05, Doug Gilmore  wrote:
>> Hi Christophe,
>>
>>> From: Christophe Lyon [christophe.l...@linaro.org]
>>> Sent: Saturday, October 01, 2016 7:57 AM
>>> To: Doug Gilmore
>>> Cc: gcc-patches@gcc.gnu.org
>>> Subject: Re: Fix PR tree-optimization/77808, ICE in 
>>> duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>>>
>>> Hi Doug,
>>>
>>> ...
>>> I can confirm that your patch fixes the ICE I was seeing.
>>>
>>> However, the new testcase does not pass on low end
>>> architectures:
>>> cc1: warning: -fprefetch-loop-arrays not supported for this target
>>> (try -march switches)
>>>
>>> Can you add a guard?
>>>
>>> Thanks,
>>>
>>> Christophe
>> I updated the test to only run on X86, MIPS and AARCH64.  Is that OK?
>>
>
>I'm afraid not.
>
>The ICE occurred on some arm targets. By "low end" I meant armv5t for
>example, as opposed to armv7t.
>Is there a suitable effective target?
I'll need to investigate that.  BTW, gcc.dg/pr53550.c contains:
/* PR tree-optimization/53550 */
/* { dg-do compile } */
/* { dg-options "-O2 -fprefetch-loop-arrays -w" } */

int *
foo (int *x)
{
  int *a = x + 10, *b = x, *c = a;
  while (b != c)
    *--c = *b++;
  return x;
}

Is it also failing on armv5t?  I suppose it would.

Thanks,

Doug
>
>Thanks,
>
>Christophe
>
>> Thanks,
>>
>> Doug


Re: [PATCH, ARM 2/7, ping] Adapt atomic and exclusive load and store to ARMv8-M Baseline

2016-10-03 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On 22/09/16 14:41, Thomas Preudhomme wrote:

Hi,

This patch is part of a patch series to add support for atomic operations on
ARMv8-M Baseline targets in GCC. This specific patch adapts atomic and exclusive
load and store patterns to the constraints of ARMv8-M Baseline. It consists of
two sets of changes:

- add non-predicated output templates, because ARMv8-M Baseline does not have
the IT instruction
- use low registers for ldr/str

Together these changes require creating two new alternatives for atomic_load and
atomic_store: (i) one for the relaxed, consume and release memory models (the new Pf
constraint), where ldr/str are used and thus low registers must be used, and (ii)
another one for the other memory models, where lda/stl are used. These are
separate from the constraint for 32-bit targets, whose output templates expect
predication.
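
As an illustration of the source-level operations these patterns implement, here is a minimal sketch using GCC's `__atomic` builtins. The wrapper names are hypothetical. The comments only restate the split described above (ldr/str for the weaker memory models, lda/stl for the others); which instruction is actually emitted depends on the target and has not been checked here.

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <stdint.h>

/* Hypothetical wrappers around the GCC __atomic builtins.  */

/* Relaxed load: per the description above, expected to use a plain ldr.  */
uint32_t
load_relaxed (uint32_t *p)
{
  return __atomic_load_n (p, __ATOMIC_RELAXED);
}

/* Sequentially consistent load: expected to use lda.  */
uint32_t
load_seq_cst (uint32_t *p)
{
  return __atomic_load_n (p, __ATOMIC_SEQ_CST);
}

/* Relaxed store: expected to use a plain str.  */
void
store_relaxed (uint32_t *p, uint32_t v)
{
  __atomic_store_n (p, v, __ATOMIC_RELAXED);
}

/* Sequentially consistent store: expected to use stl.  */
void
store_seq_cst (uint32_t *p, uint32_t v)
{
  __atomic_store_n (p, v, __ATOMIC_SEQ_CST);
}
```

In a single thread the wrappers behave like ordinary loads and stores, for example `store_seq_cst (&x, 42); assert (load_relaxed (&x) == 42);`. The memory model argument only matters for ordering against other threads.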

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2016-07-05  Thomas Preud'homme  

* config/arm/constraints.md (Q constraint): Document its use for
Thumb-1.
(Pf constraint): New constraint for relaxed, consume or release memory
models.
* config/arm/sync.md (atomic_load): Add new ARMv8-M Baseline only
alternatives to allow any register when memory model matches Pf and
thus lda is used, but only low registers otherwise.  Use unpredicated
output template for Thumb-1 targets.
(atomic_store): Likewise for stl.
(arm_load_exclusive): Add new ARMv8-M Baseline only alternative
whose output template does not have predication.
(arm_load_acquire_exclusive): Likewise.
(arm_load_exclusivesi): Likewise.
(arm_load_acquire_exclusivesi): Likewise.
(arm_store_release_exclusive): Likewise.
(arm_store_exclusive): Use unpredicated output template for
Thumb-1 targets.


Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
atomic and synchronization testcases in the testsuite [2]. Patchset was also
bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
optimization level -O1 and above [1] without any regression in the testsuite and
no code generation difference in libitm and libgomp.

Code generation for ARMv8-M Baseline has been manually examined and compared
against ARMv8-A Thumb-2 for the following configurations without finding any issue:

gcc.dg/atomic-op-2.c at -Os
gcc.dg/atomic-compare-exchange-2.c at -Os
gcc.dg/atomic-compare-exchange-3.c at -O3


Is this ok for trunk?

Best regards,

Thomas

[1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
undefined ("-O2 -g")
[2] The exact list is:

gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
gcc/testsuite/gcc.dg/atomic-exchange-1.c
gcc/testsuite/gcc.dg/atomic-exchange-2.c
gcc/testsuite/gcc.dg/atomic-exchange-3.c
gcc/testsuite/gcc.dg/atomic-fence.c
gcc/testsuite/gcc.dg/atomic-flag.c
gcc/testsuite/gcc.dg/atomic-generic.c
gcc/testsuite/gcc.dg/atomic-generic-aux.c
gcc/testsuite/gcc.dg/atomic-invalid-2.c
gcc/testsuite/gcc.dg/atomic-load-1.c
gcc/testsuite/gcc.dg/atomic-load-2.c
gcc/testsuite/gcc.dg/atomic-load-3.c
gcc/testsuite/gcc.dg/atomic-lockfree.c
gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
gcc/testsuite/gcc.dg/atomic-noinline.c
gcc/testsuite/gcc.dg/atomic-noinline-aux.c
gcc/testsuite/gcc.dg/atomic-op-1.c
gcc/testsuite/gcc.dg/atomic-op-2.c
gcc/testsuite/gcc.dg/atomic-op-3.c
gcc/testsuite/gcc.dg/atomic-op-6.c
gcc/testsuite/gcc.dg/atomic-store-1.c
gcc/testsuite/gcc.dg/atomic-store-2.c
gcc/testsuite/gcc.dg/atomic-store-3.c
gcc/testsuite/g++.dg/ext/atomic-1.C
gcc/testsuite/g++.dg/ext/atomic-2.C
gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-char.c
gcc/testsuite/gcc.target/arm/atomic-op-consume.c
gcc/testsuite/gcc.target/arm/atomic-op-int.c
gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
gcc/testsuite/gcc.target/arm/atomic-op-release.c
gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
gcc/testsuite/gcc.target/arm/atomic-op-short.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
gcc/testsuite/gcc.target/arm/sync-1.c
gcc/testsuite/gcc.target/arm/synchronize.c
gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
libstdc++-v3/testsuite/29_atomics/a

Re: [PATCH][v4] GIMPLE store merging pass

2016-10-03 Thread Richard Biener
On October 3, 2016 3:02:04 PM GMT+02:00, Kyrill Tkachov 
 wrote:
>Hi Richard,
>another question as I'm working through your comments...
>
>On 29/09/16 11:45, Richard Biener wrote:
>>
>>> +  /* The region from the byte array that we're inserting into. 
>*/
>>> +  tree ptr_wide_int
>>> +   = native_interpret_expr (dest_int_type, ptr + first_byte,
>>> +total_bytes);
>>> +
>>> +  gcc_assert (ptr_wide_int);
>>> +  wide_int dest_wide_int
>>> +   = wi::to_wide (ptr_wide_int, TYPE_PRECISION (dest_int_type));
>>> +  wide_int expr_wide_int
>>> +   = wi::to_wide (tmp_int, byte_size * BITS_PER_UNIT);
>>> +  if (BYTES_BIG_ENDIAN)
>>> +   {
>>> + unsigned int insert_pos
>>> +   = byte_size * BITS_PER_UNIT - bitlen - (bitpos %
>BITS_PER_UNIT);
>>> + dest_wide_int
>>> +   = wi::insert (dest_wide_int, expr_wide_int, insert_pos,
>bitlen);
>>> +   }
>>> +  else
>>> +   dest_wide_int = wi::insert (dest_wide_int, expr_wide_int,
>>> +   bitpos % BITS_PER_UNIT, bitlen);
>>> +
>>> +  tree res = wide_int_to_tree (dest_int_type, dest_wide_int);
>>> +  native_encode_expr (res, ptr + first_byte, total_bytes, 0);
>>> +
>> OTOH this whole dance looks as complicated and way more expensive than
>> using native_encode_expr into a temporary buffer and then a
>> manually implemented "bit-merging" of it at ptr + first_byte + bitpos.
>> AFAICS that operation is even endianness agnostic.
>
>If the quantity we're inserting at a non-byte boundary
>is more than a byte wide we still have to shift the value
>to position properly across the bytes it straddles, so I don't
>see how we can avoid creating a wide_int here.
>Consider inserting a 10-bit value at bitposition 3 (I hope the mailer
>doesn't screw up the indentation):
>value:  xx
>before: ||||
> | byte 1 || byte 2 |
>after:  |---x||x---|
>
>We'll native_encode_expr the value into a two-byte buffer but then we can't
>just shift each byte by 3 to insert it into the destination buffer; we need
>to form the whole 10-bit value and shift it as a whole to not lose any bits.

Native encode will encode into a byte array in target representation / endianness.

I think you can work byte-wise by properly merging 'lost' bits from adjacent
bytes.  And you need at most 2 of them per 'target byte'.

>And if a value crosses bytes then we need to care about
>BYTES_BIG_ENDIAN when
>writing the bytes back into the buffer, no?

If you shift a > byte size quantity on the host (wide-ints are in host
endianness) then you indeed need to watch out for endianness.

But as we deal with target memory representation plus bit offsets into memory I
think it's natural to work with bytes.
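
The byte-wise merging suggested here can be sketched as follows. This is a hypothetical helper, not the code from the patch. It assumes little-endian bit numbering (bit 0 is the least significant bit of byte 0), so a big-endian target would need the mirrored arrangement. Each destination byte combines bits from at most two source bytes, and a mask preserves the destination bits outside the inserted range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code.  Merge the low BITLEN bits of SRC
   into DST, starting BITPOS bits into DST.  Both buffers are treated as
   little-endian bit strings: bit 0 is the least significant bit of
   byte 0.  Bits of DST outside the inserted range are preserved.  */
void
merge_bits_le (uint8_t *dst, const uint8_t *src,
               unsigned bitpos, unsigned bitlen)
{
  assert (bitlen > 0);

  size_t first = bitpos / 8;      /* first destination byte touched */
  unsigned shift = bitpos % 8;    /* bit offset inside that byte */
  size_t src_bytes = (bitlen + 7) / 8;
  size_t dst_bytes = (shift + bitlen + 7) / 8;

  for (size_t i = 0; i < dst_bytes; i++)
    {
      /* Bits for this byte: source byte I shifted up, plus the bits
         that were shifted out of source byte I - 1.  */
      unsigned val = 0;
      if (i < src_bytes)
        val |= (unsigned) src[i] << shift;
      if (i > 0)
        val |= (unsigned) src[i - 1] >> (8 - shift);

      /* Mask of the bits in this byte that belong to the inserted
         range [shift, shift + bitlen).  */
      unsigned lo = (i == 0) ? shift : 0;
      unsigned end = shift + bitlen - 8 * (unsigned) i;
      unsigned hi = end < 8 ? end : 8;
      unsigned mask = ((1u << hi) - 1u) & ~((1u << lo) - 1u);

      dst[first + i] = (uint8_t) ((dst[first + i] & ~mask) | (val & mask));
    }
}
```

For the case discussed above, a 10-bit value of all ones inserted at bit position 3 of a zeroed two-byte buffer leaves the bytes as 0xF8 and 0x1F. No quantity wider than one byte is ever shifted as a whole.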

Richard.

>Thanks,
>Kyrill




Re: [PATCH, ARM 3/7, ping] Refactor atomic compare_and_swap to make it fit for ARMv8-M Baseline

2016-10-03 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On 22/09/16 14:44, Thomas Preudhomme wrote:

Hi,

This patch is part of a patch series to add support for atomic operations on
ARMv8-M Baseline targets in GCC. This specific patch refactors the expander and
splitter for atomics to make the logic work with ARMv8-M Baseline, which has
the limitations of Thumb-1 in terms of CC flag setting and different conditional
compare insn patterns.
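
For context, the operation being expanded is the one behind GCC's `__atomic_compare_exchange_n` builtin, whose boolean result corresponds to the return value discussed in the ChangeLog below. A minimal sketch, with a hypothetical wrapper name and the strong variant chosen so that the single-threaded behaviour is deterministic:

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrapper: strong compare-and-swap with sequentially
   consistent ordering.  Returns true if *PTR equalled *EXPECTED and was
   replaced by DESIRED; otherwise copies the current value of *PTR into
   *EXPECTED and returns false.  */
bool
cas_u32 (uint32_t *ptr, uint32_t *expected, uint32_t desired)
{
  return __atomic_compare_exchange_n (ptr, expected, desired,
                                      false /* strong, not weak */,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

With `x == 5` and `expected == 5`, the first call `cas_u32 (&x, &expected, 7)` returns true and sets `x` to 7. Repeating the call with `expected` still 5 returns false and updates `expected` to 7.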

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2016-09-02  Thomas Preud'homme  

* config/arm/arm.c (arm_expand_compare_and_swap): Add new bdst local
variable.  Add the new parameter to the insn generator.  Set that
parameter to be CC flag for 32-bit targets, bval otherwise.  Set the
return value from the negation of that parameter for Thumb-1, keeping
the logic unchanged otherwise except for using bdst as the destination
register of the compare_and_swap insn.
(arm_split_compare_and_swap): Add explanation about how the value is
returned to the function comment.  Rename scratch variable to
neg_bval.  Adapt initialization of variables holding operands to the
new operand numbers.  Use return register to hold result of store
exclusive for Thumb-1, scratch register otherwise.  Construct the
appropriate cbranch for Thumb-1 targets, keeping the logic unchanged
for 32-bit targets.  Guard Z flag setting to restrict to 32bit targets.
Use gen_cbranchsi4 rather than hand-written conditional branch to loop
for strongly ordered compare_and_swap.
* config/arm/predicates.md (cc_register_operand): New predicate.
* config/arm/sync.md (atomic_compare_and_swap_1): Use a
match_operand with the new predicate to accept either the CC flag or a
destination register for the boolean return value, restricting it to
CC flag only via constraint.  Adapt operand numbers accordingly.


Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
atomic and synchronization testcases in the testsuite [2]. Patchset was also
bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
optimization level -O1 and above [1] without any regression in the testsuite and
no code generation difference in libitm and libgomp.

Code generation for ARMv8-M Baseline has been manually examined and compared
against ARMv8-A Thumb-2 for the following configuration without finding any 
issue:

gcc.dg/atomic-op-2.c at -Os
gcc.dg/atomic-compare-exchange-2.c at -Os
gcc.dg/atomic-compare-exchange-3.c at -O3


Is this ok for trunk?

Best regards,

Thomas

[1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
undefined ("-O2 -g")
[2] The exact list is:

gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
gcc/testsuite/gcc.dg/atomic-exchange-1.c
gcc/testsuite/gcc.dg/atomic-exchange-2.c
gcc/testsuite/gcc.dg/atomic-exchange-3.c
gcc/testsuite/gcc.dg/atomic-fence.c
gcc/testsuite/gcc.dg/atomic-flag.c
gcc/testsuite/gcc.dg/atomic-generic.c
gcc/testsuite/gcc.dg/atomic-generic-aux.c
gcc/testsuite/gcc.dg/atomic-invalid-2.c
gcc/testsuite/gcc.dg/atomic-load-1.c
gcc/testsuite/gcc.dg/atomic-load-2.c
gcc/testsuite/gcc.dg/atomic-load-3.c
gcc/testsuite/gcc.dg/atomic-lockfree.c
gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
gcc/testsuite/gcc.dg/atomic-noinline.c
gcc/testsuite/gcc.dg/atomic-noinline-aux.c
gcc/testsuite/gcc.dg/atomic-op-1.c
gcc/testsuite/gcc.dg/atomic-op-2.c
gcc/testsuite/gcc.dg/atomic-op-3.c
gcc/testsuite/gcc.dg/atomic-op-6.c
gcc/testsuite/gcc.dg/atomic-store-1.c
gcc/testsuite/gcc.dg/atomic-store-2.c
gcc/testsuite/gcc.dg/atomic-store-3.c
gcc/testsuite/g++.dg/ext/atomic-1.C
gcc/testsuite/g++.dg/ext/atomic-2.C
gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-char.c
gcc/testsuite/gcc.target/arm/atomic-op-consume.c
gcc/testsuite/gcc.target/arm/atomic-op-int.c
gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
gcc/testsuite/gcc.target/arm/atomic-op-release.c
gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
gcc/testsuite/gcc.target/arm/atomic-op-short.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
gcc/testsuite/gcc.target/arm/sync-1.c
gcc/testsuite/gcc.target/arm/synchronize.c
gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
gcc/

Re: [PATCH, ARM 4/7, ping] Adapt atomic compare and swap to ARMv8-M Baseline

2016-10-03 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On 22/09/16 14:46, Thomas Preudhomme wrote:

Hi,

This patch is part of a patch series to add support for atomic operations on
ARMv8-M Baseline targets in GCC. This specific patch makes the necessary change
for compare and swap to work for ARMv8-M Baseline, doubleword integers excepted.
Namely, it adds Thumb-1 specific constraints to compare_and_swap. The
constraints are chosen so that once the pattern is split, the individual
instructions have their constraints respected. In particular, the constraints
for the cbranchsi4_* pattern must be duplicated here, which explains the use of
several alternatives.

Note: changes to enable other atomic operations are in the next patch of the 
series.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2016-07-05  Thomas Preud'homme  

* config/arm/sync.md (atomic_compare_and_swap_1): Add new ARMv8-M
Baseline only alternatives to (i) hold store atomic success value in a
return register rather than a scratch register, (ii) use a low register
for it and to (iii) ensure the cbranchsi insn generated by the split
respects the constraints of Thumb-1 cbranchsi4_insn and
cbranchsi4_scratch.
* config/arm/thumb1.md (cbranchsi4_insn): Add comment to indicate
constraints must match those in atomic_compare_and_swap.
(cbranchsi4_scratch): Likewise.


Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
atomic and synchronization testcases in the testsuite [2]. Patchset was also
bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
optimization level -O1 and above [1] without any regression in the testsuite and
no code generation difference in libitm and libgomp.

Code generation for ARMv8-M Baseline has been manually examined and compared
against ARMv8-A Thumb-2 for the following configuration without finding any 
issue:

gcc.dg/atomic-op-2.c at -Os
gcc.dg/atomic-compare-exchange-2.c at -Os
gcc.dg/atomic-compare-exchange-3.c at -O3


Is this ok for trunk?

Best regards,

Thomas

[1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
undefined ("-O2 -g")
[2] The exact list is:

gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
gcc/testsuite/gcc.dg/atomic-exchange-1.c
gcc/testsuite/gcc.dg/atomic-exchange-2.c
gcc/testsuite/gcc.dg/atomic-exchange-3.c
gcc/testsuite/gcc.dg/atomic-fence.c
gcc/testsuite/gcc.dg/atomic-flag.c
gcc/testsuite/gcc.dg/atomic-generic.c
gcc/testsuite/gcc.dg/atomic-generic-aux.c
gcc/testsuite/gcc.dg/atomic-invalid-2.c
gcc/testsuite/gcc.dg/atomic-load-1.c
gcc/testsuite/gcc.dg/atomic-load-2.c
gcc/testsuite/gcc.dg/atomic-load-3.c
gcc/testsuite/gcc.dg/atomic-lockfree.c
gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
gcc/testsuite/gcc.dg/atomic-noinline.c
gcc/testsuite/gcc.dg/atomic-noinline-aux.c
gcc/testsuite/gcc.dg/atomic-op-1.c
gcc/testsuite/gcc.dg/atomic-op-2.c
gcc/testsuite/gcc.dg/atomic-op-3.c
gcc/testsuite/gcc.dg/atomic-op-6.c
gcc/testsuite/gcc.dg/atomic-store-1.c
gcc/testsuite/gcc.dg/atomic-store-2.c
gcc/testsuite/gcc.dg/atomic-store-3.c
gcc/testsuite/g++.dg/ext/atomic-1.C
gcc/testsuite/g++.dg/ext/atomic-2.C
gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-char.c
gcc/testsuite/gcc.target/arm/atomic-op-consume.c
gcc/testsuite/gcc.target/arm/atomic-op-int.c
gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
gcc/testsuite/gcc.target/arm/atomic-op-release.c
gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
gcc/testsuite/gcc.target/arm/atomic-op-short.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
gcc/testsuite/gcc.target/arm/sync-1.c
gcc/testsuite/gcc.target/arm/synchronize.c
gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/copy_list.cc
libstdc++-v3/testsuite/29_atomics

Re: [PATCH, ARM 5/7, ping] Adapt other atomic operations to ARMv8-M Baseline

2016-10-03 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On 22/09/16 14:47, Thomas Preudhomme wrote:

Hi,

This patch is part of a patch series to add support for atomic operations on
ARMv8-M Baseline targets in GCC. This specific patch adds support for remaining
atomic operations (exchange, addition, subtraction, bitwise AND, OR, XOR and
NAND) to ARMv8-M Baseline, doubleword integers excepted. As with the previous
patch in the patch series, this mostly consists of adding Thumb-1 specific
constraints to atomic_* patterns to match those in thumb1.md for the non-atomic
operation.
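
As a rough illustration of what such an atomic read-modify-write operation means at the source level, here is a NAND built from a compare-and-swap retry loop. This is an assumption-laden sketch, not the patch's code: fetch_nand_sketch is a made-up name, and a load-exclusive/store-exclusive loop emitted by the compiler only follows the same general shape.

```cpp
#include <atomic>

// Hypothetical helper: atomically replace OBJ with ~(OBJ & MASK) and return
// the previous value, retrying if another thread changed OBJ in between.
inline unsigned
fetch_nand_sketch (std::atomic<unsigned> &obj, unsigned mask)
{
  unsigned old_val = obj.load (std::memory_order_relaxed);
  // On failure compare_exchange_weak reloads old_val, so the new value is
  // recomputed from fresh data on every iteration.
  while (!obj.compare_exchange_weak (old_val, ~(old_val & mask)))
    {
    }
  return old_val;
}
```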

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2016-09-02  Thomas Preud'homme  

* config/arm/arm.c (arm_split_atomic_op): Add function comment.  Add
logic to decide whether to copy over old value to register for new
value.
* config/arm/sync.md: Add comments explaining why mode and code
attribute are not defined in iterators.md
(thumb1_atomic_op_str): New code attribute.
(thumb1_atomic_newop_str): Likewise.
(thumb1_atomic_fetch_op_str): Likewise.
(thumb1_atomic_fetch_newop_str): Likewise.
(thumb1_atomic_fetch_oldop_str): Likewise.
(atomic_exchange): Add new ARMv8-M Baseline only alternatives to
mirror the more restrictive constraints of the Thumb-1 insns after
split compared to Thumb-2 counterpart insns.
(atomic_): Likewise.  Add comment to keep constraints
in sync with non atomic version.
(atomic_nand): Likewise.
(atomic_fetch_): Likewise.
(atomic_fetch_nand): Likewise.
(atomic__fetch): Likewise.
(atomic_nand_fetch): Likewise.
* config/arm/thumb1.md (thumb1_addsi3): Add comment to keep constraint
in sync with atomic version.
(thumb1_subsi3_insn): Likewise.
(thumb1_andsi3_insn): Likewise.
(thumb1_iorsi3_insn): Likewise.
(thumb1_xorsi3_insn): Likewise.


Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
atomic and synchronization testcases in the testsuite [2]. Patchset was also
bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
optimization level -O1 and above [1] without any regression in the testsuite and
no code generation difference in libitm and libgomp.

Code generation for ARMv8-M Baseline has been manually examined and compared
against ARMv8-A Thumb-2 for the following configuration without finding any 
issue:

gcc.dg/atomic-op-2.c at -Os
gcc.dg/atomic-compare-exchange-2.c at -Os
gcc.dg/atomic-compare-exchange-3.c at -O3


Is this ok for trunk?

Best regards,

Thomas

[1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
undefined ("-O2 -g")
[2] The exact list is:

gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
gcc/testsuite/gcc.dg/atomic-exchange-1.c
gcc/testsuite/gcc.dg/atomic-exchange-2.c
gcc/testsuite/gcc.dg/atomic-exchange-3.c
gcc/testsuite/gcc.dg/atomic-fence.c
gcc/testsuite/gcc.dg/atomic-flag.c
gcc/testsuite/gcc.dg/atomic-generic.c
gcc/testsuite/gcc.dg/atomic-generic-aux.c
gcc/testsuite/gcc.dg/atomic-invalid-2.c
gcc/testsuite/gcc.dg/atomic-load-1.c
gcc/testsuite/gcc.dg/atomic-load-2.c
gcc/testsuite/gcc.dg/atomic-load-3.c
gcc/testsuite/gcc.dg/atomic-lockfree.c
gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
gcc/testsuite/gcc.dg/atomic-noinline.c
gcc/testsuite/gcc.dg/atomic-noinline-aux.c
gcc/testsuite/gcc.dg/atomic-op-1.c
gcc/testsuite/gcc.dg/atomic-op-2.c
gcc/testsuite/gcc.dg/atomic-op-3.c
gcc/testsuite/gcc.dg/atomic-op-6.c
gcc/testsuite/gcc.dg/atomic-store-1.c
gcc/testsuite/gcc.dg/atomic-store-2.c
gcc/testsuite/gcc.dg/atomic-store-3.c
gcc/testsuite/g++.dg/ext/atomic-1.C
gcc/testsuite/g++.dg/ext/atomic-2.C
gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-char.c
gcc/testsuite/gcc.target/arm/atomic-op-consume.c
gcc/testsuite/gcc.target/arm/atomic-op-int.c
gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
gcc/testsuite/gcc.target/arm/atomic-op-release.c
gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
gcc/testsuite/gcc.target/arm/atomic-op-short.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
gcc/testsuite/gcc.target/arm/sync-1.c
gcc/testsuite/gcc.target/arm/synchronize.c
gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
gcc/testsuite/gc

Re: [PATCH, ARM/testsuite 6/7, ping] Force soft float in ARMv6-M and ARMv8-M Baseline options

2016-10-03 Thread Thomas Preudhomme

On 22/09/16 17:15, Thomas Preudhomme wrote:

On 22/09/16 16:47, Richard Earnshaw (lists) wrote:

On 22/09/16 15:51, Thomas Preudhomme wrote:

Sorry, noticed an error in the patch. It was not caught during testing
because GCC was built with --with-mode=thumb. Correct patch attached.

Best regards,

Thomas

On 22/09/16 14:49, Thomas Preudhomme wrote:

Hi,

ARMv6-M and ARMv8-M Baseline only support the soft float ABI. Therefore, the
arm_arch_v8m_base add option should pass -mfloat-abi=soft, much like -mthumb is
passed for architectures that only support the Thumb instruction set. This patch
adds -mfloat-abi=soft to both arm_arch_v6m and arm_arch_v8m_base add options.
Patch is in attachment.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2016-07-15  Thomas Preud'homme  

* lib/target-supports.exp (add_options_for_arm_arch_v6m): Add
-mfloat-abi=soft option.
(add_options_for_arm_arch_v8m_base): Likewise.


Is this ok for trunk?

Best regards,

Thomas


6_softfloat_testing_v6m_v8m_baseline.patch


diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 0dabea0850124947a7fe333e0b94c4077434f278..b5d72f1283be6a6e4736a1d20936e169c1384398 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3540,24 +3540,25 @@ proc check_effective_target_arm_fp16_hw { } {
 # Usage: /* { dg-require-effective-target arm_arch_v5_ok } */
 #/* { dg-add-options arm_arch_v5 } */
 # /* { dg-require-effective-target arm_arch_v5_multilib } */
-foreach { armfunc armflag armdef } { v4 "-march=armv4 -marm" __ARM_ARCH_4__
- v4t "-march=armv4t" __ARM_ARCH_4T__
- v5 "-march=armv5 -marm" __ARM_ARCH_5__
- v5t "-march=armv5t" __ARM_ARCH_5T__
- v5te "-march=armv5te" __ARM_ARCH_5TE__
- v6 "-march=armv6" __ARM_ARCH_6__
- v6k "-march=armv6k" __ARM_ARCH_6K__
- v6t2 "-march=armv6t2" __ARM_ARCH_6T2__
- v6z "-march=armv6z" __ARM_ARCH_6Z__
- v6m "-march=armv6-m -mthumb" __ARM_ARCH_6M__
- v7a "-march=armv7-a" __ARM_ARCH_7A__
- v7r "-march=armv7-r" __ARM_ARCH_7R__
- v7m "-march=armv7-m -mthumb" __ARM_ARCH_7M__
- v7em "-march=armv7e-m -mthumb" __ARM_ARCH_7EM__
- v8a "-march=armv8-a" __ARM_ARCH_8A__
- v8_1a "-march=armv8.1a" __ARM_ARCH_8A__
- v8m_base "-march=armv8-m.base -mthumb" __ARM_ARCH_8M_BASE__
- v8m_main "-march=armv8-m.main -mthumb" __ARM_ARCH_8M_MAIN__ } {
+foreach { armfunc armflag armdef } {
+v4 "-march=armv4 -marm" __ARM_ARCH_4__
+v4t "-march=armv4t" __ARM_ARCH_4T__
+v5 "-march=armv5 -marm" __ARM_ARCH_5__
+v5t "-march=armv5t" __ARM_ARCH_5T__
+v5te "-march=armv5te" __ARM_ARCH_5TE__
+v6 "-march=armv6" __ARM_ARCH_6__
+v6k "-march=armv6k" __ARM_ARCH_6K__
+v6t2 "-march=armv6t2" __ARM_ARCH_6T2__
+v6z "-march=armv6z" __ARM_ARCH_6Z__
+v6m "-march=armv6-m -mthumb -mfloat-abi=soft" __ARM_ARCH_6M__
+v7a "-march=armv7-a" __ARM_ARCH_7A__
+v7r "-march=armv7-r" __ARM_ARCH_7R__
+v7m "-march=armv7-m -mthumb" __ARM_ARCH_7M__
+v7em "-march=armv7e-m -mthumb" __ARM_ARCH_7EM__
+v8a "-march=armv8-a" __ARM_ARCH_8A__
+v8_1a "-march=armv8.1a" __ARM_ARCH_8A__
+v8m_base "-march=armv8-m.base -mthumb -mfloat-abi=soft" __ARM_ARCH_8M_BASE__
+v8m_main "-march=armv8-m.main -mthumb" __ARM_ARCH_8M_MAIN__ } {
 eval [string map [list FUNC $armfunc FLAG $armflag DEF $armdef ] {
 proc check_effective_target_arm_arch_FUNC_ok { } {
 if { [ string match "*-marm*" "FLAG" ] &&



I think if you're going to do this you need to also check that changing
the ABI in this way isn't incompatible with other aspects of how the
user has invoked dejagnu.


So should this check also be done for all the targets for which -mthumb is passed
or is there a difference between the two situations?


Ping?

Best regards,

Thomas


Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-10-03 Thread Jeff Law

On 10/03/2016 06:26 AM, Nathan Sidwell wrote:

On 10/03/16 08:13, Martin Liška wrote:

On 08/18/2016 05:53 PM, Jeff Law wrote:

On 08/18/2016 09:51 AM, Andi Kleen wrote:

I'd prefer to make updates atomic in multi-threaded applications.
The best proxy we have for that is -pthread.

Is it slower, most definitely, but odds are we're giving folks
garbage data otherwise, which in many ways is even worse.


It will likely be catastrophically slower in some cases.

Catastrophically as in too slow to be usable.

An atomic instruction is a lot more expensive than a single increment. Also
they sometimes are really slow depending on the state of the machine.

And for those cases there's a way to override.

The default should be set for correctness.

jeff


I would like to somehow resolve the discussion related to default value
selection.
Is the prevailing consensus that we should set -fprofile-update=atomic when
-pthread is set? If so, I'll prepare a patch. I tend to do it this way.


This is my preference.

Likewise.
jeff
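
The correctness argument in this thread can be illustrated with a small sketch. It is not GCC's instrumentation code: count_with_atomic_updates is a made-up function that stands in for a single profile counter bumped by several threads. With atomic increments the final count is exact. With plain increments concurrent updates could be lost, which is the "garbage data" concern raised above.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for one profile counter updated by NTHREADS threads,
// each performing ITERS increments.
inline std::uint64_t
count_with_atomic_updates (unsigned nthreads, unsigned iters)
{
  std::atomic<std::uint64_t> counter{0};
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < nthreads; ++t)
    workers.emplace_back ([&counter, iters] {
      for (unsigned i = 0; i < iters; ++i)
        // Relaxed ordering is enough for a pure event count.
        counter.fetch_add (1, std::memory_order_relaxed);
    });
  for (auto &w : workers)
    w.join ();
  return counter.load ();
}
```

The cost side of the argument is not shown here: each fetch_add is an atomic read-modify-write, which is what makes the instrumented code slower.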


Re: [PATCH, ARM 7/7, ping] Enable ARMv8-M atomic and synchronization support for ARMv8-M Baseline

2016-10-03 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On 22/09/16 14:50, Thomas Preudhomme wrote:

Hi,

This patch is part of a patch series to add support for atomic operations on
ARMv8-M Baseline targets in GCC. This specific patch enables atomic and
synchronization support added in previous patches of the series and adds tests.
Enabling is done at the end of the patch series to ensure that no ICE is seen
when in the middle of the patch series (e.g. while doing a bisect). Enabling is
done by enabling the exclusive and atomic loads and stores needed to implement
all synchronization and atomic operations.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2016-07-05  Thomas Preud'homme  

* config/arm/arm.h (TARGET_HAVE_LDREX): Define for ARMv8-M Baseline.
(TARGET_HAVE_LDREXBH): Likewise.
(TARGET_HAVE_LDACQ): Likewise.


*** gcc/testsuite/ChangeLog ***

2016-07-05  Thomas Preud'homme  

* gcc.target/arm/atomic-comp-swap-release-acquire-3.c: New test.
* gcc.target/arm/atomic-op-acq_rel-3.c: Likewise.
* gcc.target/arm/atomic-op-acquire-3.c: Likewise.
* gcc.target/arm/atomic-op-char-3.c: Likewise.
* gcc.target/arm/atomic-op-consume-3.c: Likewise.
* gcc.target/arm/atomic-op-int-3.c: Likewise.
* gcc.target/arm/atomic-op-relaxed-3.c: Likewise.
* gcc.target/arm/atomic-op-release-3.c: Likewise.
* gcc.target/arm/atomic-op-seq_cst-3.c: Likewise.
* gcc.target/arm/atomic-op-short-3.c: Likewise.


Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
atomic and synchronization testcases in the testsuite [2]. Patchset was also
bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
optimization level -O1 and above [1] without any regression in the testsuite and
no code generation difference in libitm and libgomp.

Code generation for ARMv8-M Baseline has been manually examined and compared
against ARMv8-A Thumb-2 for the following configuration without finding any 
issue:

gcc.dg/atomic-op-2.c at -Os
gcc.dg/atomic-compare-exchange-2.c at -Os
gcc.dg/atomic-compare-exchange-3.c at -O3


Is this ok for trunk?

Best regards,

Thomas

[1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
undefined ("-O2 -g")
[2] The exact list is:

gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
gcc/testsuite/gcc.dg/atomic-exchange-1.c
gcc/testsuite/gcc.dg/atomic-exchange-2.c
gcc/testsuite/gcc.dg/atomic-exchange-3.c
gcc/testsuite/gcc.dg/atomic-fence.c
gcc/testsuite/gcc.dg/atomic-flag.c
gcc/testsuite/gcc.dg/atomic-generic.c
gcc/testsuite/gcc.dg/atomic-generic-aux.c
gcc/testsuite/gcc.dg/atomic-invalid-2.c
gcc/testsuite/gcc.dg/atomic-load-1.c
gcc/testsuite/gcc.dg/atomic-load-2.c
gcc/testsuite/gcc.dg/atomic-load-3.c
gcc/testsuite/gcc.dg/atomic-lockfree.c
gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
gcc/testsuite/gcc.dg/atomic-noinline.c
gcc/testsuite/gcc.dg/atomic-noinline-aux.c
gcc/testsuite/gcc.dg/atomic-op-1.c
gcc/testsuite/gcc.dg/atomic-op-2.c
gcc/testsuite/gcc.dg/atomic-op-3.c
gcc/testsuite/gcc.dg/atomic-op-6.c
gcc/testsuite/gcc.dg/atomic-store-1.c
gcc/testsuite/gcc.dg/atomic-store-2.c
gcc/testsuite/gcc.dg/atomic-store-3.c
gcc/testsuite/g++.dg/ext/atomic-1.C
gcc/testsuite/g++.dg/ext/atomic-2.C
gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-char.c
gcc/testsuite/gcc.target/arm/atomic-op-consume.c
gcc/testsuite/gcc.target/arm/atomic-op-int.c
gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
gcc/testsuite/gcc.target/arm/atomic-op-release.c
gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
gcc/testsuite/gcc.target/arm/atomic-op-short.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
gcc/testsuite/gcc.target/arm/sync-1.c
gcc/testsuite/gcc.target/arm/synchronize.c
gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc

[PATCH] Define std::gcd and std::lcm for C++17

2016-10-03 Thread Jonathan Wakely

This shares the code for std::gcd and std::experimental::gcd, and
similarly for lcm.

I realised I'd mixed up the gcd.cc and lcm.cc tests so this patch also
swaps them around.

* doc/xml/manual/status_cxx2017.xml: Update gcd/lcm status.
* doc/html/*: Regenerate.
* include/experimental/numeric (__abs): Move to <numeric>.
(gcd, lcm): Use __detail::gcd and __detail::lcm.
* include/std/numeric (__detail::__abs_integral)
(__detail::__gcd, __detail::__lcm): Define.
(gcd, lcm): Define for C++17.
* testsuite/26_numerics/gcd/1.cc: New test.
* testsuite/26_numerics/lcm/1.cc: New test.
* testsuite/experimental/numeric/gcd.cc: Swap contents with ...
* testsuite/experimental/numeric/lcm.cc: ... this.

Tested powerpc64le-linux, committed to trunk.
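
As a simplified sketch of the recursion used by the shared helpers in the patch below, restricted to long long instead of being a template: the names abs_ll, gcd_sketch and lcm_sketch are made up for illustration and are not the libstdc++ identifiers.

```cpp
// std::abs is not constexpr, so a small constexpr replacement is used here.
constexpr long long
abs_ll (long long v)
{ return v < 0 ? -v : v; }

// Euclid's algorithm written as a single constexpr expression.
constexpr long long
gcd_sketch (long long m, long long n)
{
  return m == 0 ? abs_ll (n)
       : n == 0 ? abs_ll (m)
       : gcd_sketch (n, m % n);
}

// Divide before multiplying to keep the intermediate value small.
constexpr long long
lcm_sketch (long long m, long long n)
{
  return (m != 0 && n != 0)
       ? (abs_ll (m) / gcd_sketch (m, n)) * abs_ll (n)
       : 0;
}

static_assert (gcd_sketch (12, 18) == 6, "gcd of 12 and 18");
static_assert (lcm_sketch (4, 6) == 12, "lcm of 4 and 6");
```

Because everything is constexpr, the results can be checked at compile time, as the static_asserts do.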

commit 56efd86de7a7bbcfe43fe1c20979e06eb5e49802
Author: Jonathan Wakely 
Date:   Mon Oct 3 17:09:10 2016 +0100

Define std::gcd and std::lcm for C++17

* doc/xml/manual/status_cxx2017.xml: Update gcd/lcm status.
* doc/html/*: Regenerate.
* include/experimental/numeric (__abs): Move to <numeric>.
(gcd, lcm): Use __detail::gcd and __detail::lcm.
* include/std/numeric (__detail::__abs_integral)
(__detail::__gcd, __detail::__lcm): Define.
(gcd, lcm): Define for C++17.
* testsuite/26_numerics/gcd/1.cc: New test.
* testsuite/26_numerics/lcm/1.cc: New test.
* testsuite/experimental/numeric/gcd.cc: Swap contents with ...
* testsuite/experimental/numeric/lcm.cc: ... this.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index feed085..9f47b349 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -615,14 +615,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
Adopt Selected Library Fundamentals V2 Components for C++17 

   
http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0295r0.pdf";>
P0295R0

   
-   No 
+   7 
__cpp_lib_gcd >= 201606 ,
  __cpp_lib_lcm >= 201606 
   
diff --git a/libstdc++-v3/include/experimental/numeric 
b/libstdc++-v3/include/experimental/numeric
index 5089772..6d1dc21 100644
--- a/libstdc++-v3/include/experimental/numeric
+++ b/libstdc++-v3/include/experimental/numeric
@@ -52,44 +52,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #define __cpp_lib_experimental_gcd_lcm 201411
 
-  // std::abs is not constexpr and doesn't support unsigned integers.
-  template<typename _Tp>
-constexpr
-enable_if_t<__and_<is_integral<_Tp>, is_signed<_Tp>>::value, _Tp>
-__abs(_Tp __val)
-{ return __val < 0 ? -__val : __val; }
-
-  template<typename _Tp>
-constexpr
-enable_if_t<__and_<is_integral<_Tp>, is_unsigned<_Tp>>::value, _Tp>
-__abs(_Tp __val)
-{ return __val; }
-
-  // Greatest common divisor
+  /// Greatest common divisor
   template<typename _Mn, typename _Nn>
 constexpr common_type_t<_Mn, _Nn>
 gcd(_Mn __m, _Nn __n)
 {
   static_assert(is_integral<_Mn>::value, "arguments to gcd are integers");
   static_assert(is_integral<_Nn>::value, "arguments to gcd are integers");
-
-  return __m == 0 ? fundamentals_v2::__abs(__n)
-   : __n == 0 ? fundamentals_v2::__abs(__m)
-   : fundamentals_v2::gcd(__n, __m % __n);
+  return std::__detail::__gcd(__m, __n);
 }
 
-  // Least common multiple
+  /// Least common multiple
   template<typename _Mn, typename _Nn>
 constexpr common_type_t<_Mn, _Nn>
 lcm(_Mn __m, _Nn __n)
 {
   static_assert(is_integral<_Mn>::value, "arguments to lcm are integers");
   static_assert(is_integral<_Nn>::value, "arguments to lcm are integers");
-
-  return (__m != 0 && __n != 0)
-   ? (fundamentals_v2::__abs(__m) / fundamentals_v2::gcd(__m, __n))
- * fundamentals_v2::__abs(__n)
-   : 0;
+  return std::__detail::__lcm(__m, __n);
 }
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/include/std/numeric b/libstdc++-v3/include/std/numeric
index 47a7cb8..7b1ab98 100644
--- a/libstdc++-v3/include/std/numeric
+++ b/libstdc++-v3/include/std/numeric
@@ -74,4 +74,83 @@
  * math functions.
  */
 
+#if __cplusplus >= 201402L
+#include <type_traits>
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+namespace __detail
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  // std::abs is not constexpr and doesn't support unsigned integers.
+  template<typename _Tp>
+constexpr
+enable_if_t<__and_<is_integral<_Tp>, is_signed<_Tp>>::value, _Tp>
+__abs_integral(_Tp __val)
+{ return __val < 0 ? -__val : __val; }
+
+  template<typename _Tp>
+constexpr
+enable_if_t<__and_<is_integral<_Tp>, is_unsigned<_Tp>>::value, _Tp>
+__abs_integral(_Tp __val)
+{ return __val; }
+
+  template<typename _Mn, typename _Nn>
+constexpr common_type_t<_Mn, _Nn>
+__gcd(_Mn __m, _Nn __n)
+{
+  return __m == 0 ? __detail::__abs_integral(__n)
+   : __n == 0 ? __detail::__abs_integral(__m)
+   : __detail::__gcd(__n, __m % __n);
+}
+
+ 

[PATCH] Extend -Wint-in-bool-context to more conditional expressions

2016-10-03 Thread Bernd Edlinger
Hi!

This is a next step in extending the -Wint-in-bool-context warning
to cover the case when a conditional expression has only
one arm which evaluates to a non-boolean integer value.

With a previous version of this warning, we found PR 77574,
along with several more or less false positives, but meanwhile,
mostly due to excluding conditional expressions that originate
from macro expansion, there are no false positives any more,
so I think this is fine now with -Wall.
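
To see why such an expression is suspicious, here is a small sketch modeled on the line whose expectation the testsuite change in this patch updates. The three function names are made up for illustration. Because ?: binds more loosely than && and <=, the whole comparison becomes the condition, and the arms 0 and 2 are what end up in the boolean context.

```cpp
// The expression as written in the test.
constexpr int
as_written (int a, int b)
{ return a > 0 && a <= (b == 3) ? 0 : 2; }

// How the compiler actually groups it.
constexpr int
explicit_grouping (int a, int b)
{ return (a > 0 && a <= (b == 3)) ? 0 : 2; }

// What the author of such code may have meant (an assumption).
constexpr bool
probably_intended (int a, int b)
{ return a > 0 && a <= ((b == 3) ? 0 : 2); }

static_assert (as_written (5, 0) == explicit_grouping (5, 0), "same parse");
static_assert (as_written (5, 0) == 2, "non-boolean arm selected");
static_assert (!probably_intended (5, 0), "intended reading differs");
```

Used as an if condition, as_written (5, 0) would be true because 2 is nonzero, while the reading with the conditional on the right-hand side is false.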


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.

c-family:
2016-10-03  Bernd Edlinger  

	* c-common.c (c_common_truthvalue_conversion): Warn also for suspicious
	conditional expression in boolean context when only one arm is
	non-boolean.

testsuite:
2016-10-03  Bernd Edlinger  

	* c-c++-common/Wint-in-bool-context.c: Update test.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 240713)
+++ gcc/c-family/c-common.c	(working copy)
@@ -4675,6 +4675,14 @@ c_common_truthvalue_conversion (location_t locatio
 	warning_at (EXPR_LOCATION (expr), OPT_Wint_in_bool_context,
 			"?: using integer constants in boolean context, "
 			"the expression will always evaluate to %<true%>");
+	  else if ((TREE_CODE (val1) == INTEGER_CST
+		&& !integer_zerop (val1)
+		&& !integer_onep (val1))
+		   || (TREE_CODE (val2) == INTEGER_CST
+		   && !integer_zerop (val2)
+		   && !integer_onep (val2)))
+	warning_at (EXPR_LOCATION (expr), OPT_Wint_in_bool_context,
+			"?: using integer constants in boolean context");
 	}
   /* Distribute the conversion into the arms of a COND_EXPR.  */
   if (c_dialect_cxx ())
Index: gcc/testsuite/c-c++-common/Wint-in-bool-context.c
===
--- gcc/testsuite/c-c++-common/Wint-in-bool-context.c	(revision 240713)
+++ gcc/testsuite/c-c++-common/Wint-in-bool-context.c	(working copy)
@@ -10,7 +10,7 @@ int foo (int a, int b)
   if (a > 0 && a <= (b == 2) ? 1 : 1) /* { dg-bogus "boolean context" } */
 return 2;
 
-  if (a > 0 && a <= (b == 3) ? 0 : 2) /* { dg-bogus "boolean context" } */
+  if (a > 0 && a <= (b == 3) ? 0 : 2) /* { dg-warning "boolean context" } */
 return 3;
 
   if (a == b ? 0 : 0) /* { dg-bogus "boolean context" } */


Re: [PATCH] Remove .jcr registry from the crtfiles

2016-10-03 Thread Jakub Jelinek
On Mon, Oct 03, 2016 at 03:26:10PM +, Joseph Myers wrote:
> As usual when removing target macros they should be poisoned in system.h.

Here is the patch with that poisoning.  Bootstrapped/regtested on
x86_64-linux and i686-linux again, ok for trunk?
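
For context, poisoning relies on the preprocessor's poison pragma. A minimal sketch follows, with a made-up macro name rather than the ones removed here: after the pragma, any later use of the identifier is a hard error, which catches stale definitions in target headers.

```cpp
// OLD_TARGET_MACRO is a hypothetical name used only for this illustration.
#pragma GCC poison OLD_TARGET_MACRO

// Code that does not mention the poisoned identifier is unaffected.
constexpr int
still_compiles ()
{ return 1; }

// Uncommenting the next line would make GCC reject the translation unit:
// #define OLD_TARGET_MACRO 1
```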

2016-10-03  Jakub Jelinek  

gcc/
* defaults.h (JCR_SECTION_NAME, TARGET_USE_JCR_SECTION): Remove.
* system.h (JCR_SECTION_NAME, TARGET_USE_JCR_SECTION): Poison.
* doc/tm.texi.in (TARGET_USE_JCR_SECTION): Remove.
* doc/tm.texi: Regenerated.
* config/i386/mingw32.h (TARGET_USE_JCR_SECTION): Remove.
* config/i386/cygming.h (TARGET_USE_JCR_SECTION): Remove.
* config/darwin.h (JCR_SECTION_NAME): Remove.
* config/pa/pa64-hpux.h (JCR_SECTION_NAME): Remove.
* config/rs6000/aix71.h (TARGET_USE_JCR_SECTION): Remove.
* config/rs6000/aix51.h (TARGET_USE_JCR_SECTION): Remove.
* config/rs6000/aix52.h (TARGET_USE_JCR_SECTION): Remove.
* config/rs6000/aix53.h (TARGET_USE_JCR_SECTION): Remove.
* config/rs6000/aix61.h (TARGET_USE_JCR_SECTION): Remove.
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Don't define
__LIBGCC_JCR_SECTION_NAME__.
libgcc/
* config/i386/cygming-crtbegin.c (_Jv_RegisterClasses): Remove.
(__JCR_LIST__): Remove.
(__gcc_register_frame): Don't attempt to _Jv_RegisterClasses.
* config/i386/cygming-crtend.c (__JCR_END__): Remove.
* config/ia64/crtbegin.S (__JCR_LIST__): Remove.
* config/ia64/crtend.S (__JCR_END__): Remove.
* crtstuff.c: Remove __LIBGCC_JCR_SECTION_NAME__ from preprocessor
conditionals.
(__JCR_LIST__, __JCR_END__): Remove.
(frame_dummy): Don't attempt to _Jv_RegisterClasses.
(__do_global_ctors_1): Likewise.

--- gcc/config/i386/mingw32.h.jj2016-05-20 09:05:08.836063467 +0200
+++ gcc/config/i386/mingw32.h   2016-10-01 18:55:14.646199686 +0200
@@ -239,9 +239,6 @@ do {
 \
 #undef TARGET_N_FORMAT_TYPES
 #define TARGET_N_FORMAT_TYPES 3
 
-/* Let defaults.h definition of TARGET_USE_JCR_SECTION apply. */
-#undef TARGET_USE_JCR_SECTION
-
 #define HAVE_ENABLE_EXECUTE_STACK
 #undef  CHECK_EXECUTE_STACK_ENABLED
 #define CHECK_EXECUTE_STACK_ENABLED flag_setstackexecutable
--- gcc/config/i386/cygming.h.jj2016-09-27 09:46:13.0 +0200
+++ gcc/config/i386/cygming.h   2016-10-01 18:56:16.133441952 +0200
@@ -443,11 +443,6 @@ do {   \
 
 #endif /* HAVE_GAS_WEAK */
 
-/* FIXME: SUPPORTS_WEAK && TARGET_HAVE_NAMED_SECTIONS is true,
-   but for .jcr section to work we also need crtbegin and crtend
-   objects.  */
-#define TARGET_USE_JCR_SECTION 1
-
 /* Decide whether it is safe to use a local alias for a virtual function
when constructing thunks.  */
 #undef TARGET_USE_LOCAL_THUNK_ALIAS_P
--- gcc/config/darwin.h.jj  2016-09-15 13:39:14.518013115 +0200
+++ gcc/config/darwin.h 2016-10-01 18:55:40.056886539 +0200
@@ -825,9 +825,6 @@ enum machopic_addr_class {
 #define EH_FRAME_SECTION_NAME   "__TEXT"
 #define EH_FRAME_SECTION_ATTR 
",coalesced,no_toc+strip_static_syms+live_support"
 
-/* Java runtime class list.  */
-#define JCR_SECTION_NAME "__DATA,jcr,regular,no_dead_strip"
-
 #undef ASM_PREFERRED_EH_DATA_FORMAT
 #define ASM_PREFERRED_EH_DATA_FORMAT(CODE,GLOBAL)  \
   (((CODE) == 2 && (GLOBAL) == 1) \
--- gcc/config/pa/pa64-hpux.h.jj2016-04-08 19:19:23.894042211 +0200
+++ gcc/config/pa/pa64-hpux.h   2016-10-01 18:55:35.171946738 +0200
@@ -170,8 +170,6 @@ along with GCC; see the file COPYING3.
 #define DATA_SECTION_ASM_OP"\t.data"
 #define BSS_SECTION_ASM_OP "\t.section\t.bss"
 
-#define JCR_SECTION_NAME   ".jcr"
-
 #define HP_INIT_ARRAY_SECTION_ASM_OP   "\t.section\t.init"
 #define GNU_INIT_ARRAY_SECTION_ASM_OP  "\t.section\t.init_array"
 #define HP_FINI_ARRAY_SECTION_ASM_OP   "\t.section\t.fini"
@@ -382,8 +380,8 @@ do {
\
initializers specified here.  */
 
 /* We need to add frame_dummy to the initializer list if EH_FRAME_SECTION_NAME
-   or JCR_SECTION_NAME is defined.  */
-#if defined(EH_FRAME_SECTION_NAME) || defined(JCR_SECTION_NAME)
+   is defined.  */
+#if defined(EH_FRAME_SECTION_NAME)
 #define PA_INIT_FRAME_DUMMY_ASM_OP ".dword P%frame_dummy"
 #else
 #define PA_INIT_FRAME_DUMMY_ASM_OP ""
--- gcc/config/rs6000/aix71.h.jj2016-01-21 21:28:01.218834652 +0100
+++ gcc/config/rs6000/aix71.h   2016-10-01 18:55:49.667768100 +0200
@@ -210,8 +210,6 @@ extern long long intatoll(const char
 /* This target defines SUPPORTS_WEAK and TARGET_ASM_NAMED_SECTION,
but does not have crtbegin/end.  */
 
-#define TARGET_USE_JCR_SECTION 0
-
 #define TARGET_AIX_VERSION 71
 
 /* AIX 7.1 supports DWARF3 debugging, but XCOFF remains the default.  */
--- gcc/config/rs6000/aix51.h.jj2016-06-20 10:30:35.629607920 +0200
+++ gcc/config/rs6000/

[PATCH] Fix ubsan ICE on vector shift (PR sanitizer/77823)

2016-10-03 Thread Jakub Jelinek
Hi!

libsanitizer isn't right now prepared to handle vector types, and we don't
instrument vector additions/multiplications etc. for overflow etc. either,
so this patch just turns off the single case that slipped through.

As I wrote in the PR, in the future we should probably change libubsan to
handle them and start instrumenting those.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-10-03  Jakub Jelinek  

PR sanitizer/77823
* c-ubsan.c (ubsan_instrument_shift): Return NULL_TREE if type0
is not integral.

* c-c++-common/ubsan/shift-9.c: New test.

--- gcc/c-family/c-ubsan.c.jj   2016-01-04 14:55:58.0 +0100
+++ gcc/c-family/c-ubsan.c  2016-10-03 13:49:49.423318587 +0200
@@ -114,6 +114,9 @@ ubsan_instrument_shift (location_t loc,
   tree t, tt = NULL_TREE;
   tree type0 = TREE_TYPE (op0);
   tree type1 = TREE_TYPE (op1);
+  if (!INTEGRAL_TYPE_P (type0))
+return NULL_TREE;
+
   tree op1_utype = unsigned_type_for (type1);
   HOST_WIDE_INT op0_prec = TYPE_PRECISION (type0);
   tree uprecm1 = build_int_cst (op1_utype, op0_prec - 1);
@@ -126,8 +129,7 @@ ubsan_instrument_shift (location_t loc,
 
   /* If this is not a signed operation, don't perform overflow checks.
  Also punt on bit-fields.  */
-  if (!INTEGRAL_TYPE_P (type0)
-  || TYPE_OVERFLOW_WRAPS (type0)
+  if (TYPE_OVERFLOW_WRAPS (type0)
   || GET_MODE_BITSIZE (TYPE_MODE (type0)) != TYPE_PRECISION (type0))
 ;
 
--- gcc/testsuite/c-c++-common/ubsan/shift-9.c.jj   2016-10-03 
14:23:54.301711636 +0200
+++ gcc/testsuite/c-c++-common/ubsan/shift-9.c  2016-10-03 13:54:50.0 
+0200
@@ -0,0 +1,30 @@
+/* PR sanitizer/77823 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-fsanitize=undefined -Wno-psabi -w" } */
+
+typedef unsigned V __attribute__((vector_size(32)));
+typedef unsigned __int128 W __attribute__((vector_size(32)));
+
+V
+foo (V v)
+{
+  return v << 30;
+}
+
+V
+bar (V v, V w)
+{
+  return v << w;
+}
+
+W
+baz (W v)
+{
+  return v << 30;
+}
+
+W
+boo (W v, W w)
+{
+  return v << w;
+}

Jakub


[C++ PATCH] Fix ICE during C++11 lambda error recovery (PR c++/77791)

2016-10-03 Thread Jakub Jelinek
Hi!

In param_list some entries could be error_mark_node, we should just ignore
those.  Also, this patch optimizes by testing cxx_dialect < cxx14 just once.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-10-03  Jakub Jelinek  

PR c++/77791
* parser.c (cp_parser_lambda_declarator_opt): Only pedwarn
for C++11 on decls in the param_list.  Test cxx_dialect < cxx14 before
the loop just once.

* g++.dg/cpp0x/lambda/lambda-77791.C: New test.

--- gcc/cp/parser.c.jj  2016-09-27 21:09:59.0 +0200
+++ gcc/cp/parser.c 2016-10-03 15:00:31.759317804 +0200
@@ -10114,10 +10114,11 @@ cp_parser_lambda_declarator_opt (cp_pars
 
   /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
-  for (tree t = param_list; t; t = TREE_CHAIN (t))
-   if (TREE_PURPOSE (t) && cxx_dialect < cxx14)
- pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
-  "default argument specified for lambda parameter");
+  if (cxx_dialect < cxx14)
+   for (tree t = param_list; t; t = TREE_CHAIN (t))
+ if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
+   pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
+"default argument specified for lambda parameter");
 
   cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN);
 
--- gcc/testsuite/g++.dg/cpp0x/lambda/lambda-77791.C.jj 2016-10-03 
15:01:21.831694292 +0200
+++ gcc/testsuite/g++.dg/cpp0x/lambda/lambda-77791.C2016-10-03 
14:58:22.0 +0200
@@ -0,0 +1,4 @@
+// PR c++/77791
+// { dg-do compile { target c++11 } }
+
+auto a = [] (int i, int i = 0) {}; // { dg-error "redefinition of" }

Jakub


Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-10-03 Thread Andi Kleen
> >>I would like to somehow resolve the discussion related to default value
> >>selection.
> >>Is the prevailing consensus that we should set -fprofile-update=atomic
> >>when
> >>-pthread is set? If so, I'll prepare a patch. I tend to do it this way.
> >
> >This is my preference.
> Likewise.

I still think it shouldn't be default even with -pthread because it could
dramatically degrade performance in these cases. People likely have -pthread
in their Makefiles without realizing it. Such changes should be explicit opt-in.

Often severe performance decreases lead to incorrectness in practice
("is now too slow to finish training workload in rebuild cycle") 
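To make the cost concrete, here is a minimal sketch, assuming C11 <stdatomic.h> is available.  The counter names and bump_counters are made up for illustration; this is not the code GCC emits for -fprofile-update=atomic, only the general difference between a plain and an atomic counter update:

```c
#include <stdatomic.h>

/* Hypothetical counters standing in for per-edge profile counters.  */
static long plain_counter;
static atomic_long atomic_counter;

/* Bump both counters N times.  The plain increment is an ordinary
   load/add/store; the atomic one is a read-modify-write that the
   hardware has to serialize when several threads hit the same
   counter.  */
static void
bump_counters (long n)
{
  for (long i = 0; i < n; i++)
    {
      plain_counter++;
      atomic_fetch_add_explicit (&atomic_counter, 1, memory_order_relaxed);
    }
}
```

Single-threaded, both counters end up with the same value; the difference only shows up as run time, and as lost updates on the plain counter once threads race on it.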

-Andi


Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-10-03 Thread Martin Sebor

+FAIL: gcc.dg/tree-ssa/builtin-sprintf.c execution test

FAIL: test_a_double:364: "%a" expected result for "0x0.0p+0"
doesn't match function call return value: 20 != 6
FAIL: test_a_double:365: "%a" expected result for "0x1.0p+0"
doesn't match function call return value: 20 != 6
FAIL: test_a_double:366: "%a" expected result for "0x1.0p+1"
doesn't match function call return value: 20 != 6

FAIL: test_a_long_double:375: "%La" expected result for
"0x0.p+0" doesn't match function call return
value: 35 != 6
FAIL: test_a_long_double:376: "%La" expected result for
"0x1.p+0" doesn't match function call return
value: 35 != 6
FAIL: test_a_long_double:377: "%La" expected result for
"0x1.p+1" doesn't match function call return
value: 35 != 6


I don't know about these.  It looks like the Solaris printf doesn't
handle the %a directive correctly and the tests (and the related
checks/optimization) might need to be disabled, which in turn might
involve extending the existing printf hook or adding a new one.


I've found the following in Solaris 10 (and up) printf(3C):

 a, A   A  double  argument  representing  a  floating-point
 number  is  converted in the style "[-]0xh.p+d",
 where the single  hexadecimal  digit  preceding  the
 radix  point is 0 if the value converted is zero and
 1 otherwise and the  number  of  hexadecimal  digits
 after it is equal to the precision; if the precision
 is missing, the number of digits printed  after  the
 radix  point  is  13  for the conversion of a double
 value, 16 for the conversion of a long double  value
 on  x86,  and 28 for the conversion of a long double
 value on SPARC; if the precision is zero and the '#'
 flag  is  not  specified, no decimal-point character
 will appear. The letters "abcdef"  are  used  for  a
 conversion  and  the  letters "ABCDEF" for A conver-
 sion. The A conversion specifier produces  a  number
 with  'X'  and  'P'  instead  of  'x'  and  'p'. The
 exponent will always contain at least one digit, and
 only  as  many more digits as necessary to represent
 the decimal exponent of 2. If the value is zero, the
 exponent is zero.

 The converted value is rounded to fit the  specified
 output  format  according to the prevailing floating
 point rounding direction mode. If the conversion  is
 not exact, an inexact exception is raised.

 A double argument representing an infinity or NaN is
 converted in the SUSv3 style of an e or E conversion
 specifier.

I tried to check the relevant sections of the latest C99 and C11 drafts
to check if this handling of missing precision is allowed by the
standard, but I couldn't even fully parse the language there.


I don't have access to Solaris to fully debug and test this there.
Would you mind helping with it?


Not at all: if it turns out that Solaris has bugs in this area, I can
easily file them, too.


I think it's actually a defect in the C standard.  It doesn't
specify how many decimal digits an implementation must produce
on output for a plain %a directive (i.e., when precision isn't
explicitly specified).  With Glibc, for instance, printf("%a",
1.0) prints 0x8p-3 while on Solaris it prints  0x8.00p-3.
Both seem reasonable but neither is actually specified.  In
theory, an implementation is allowed to print any number of zeros
after the decimal point, which the standard should (IMO) not
permit.  There should be a cap (e.g., of at most 6 decimal
digits when precision is not specified with %a, just like
there is with %e).  I'll propose to change the standard and
forward it to the C committee.  Until then, I've worked
around it in the patch for pr77735 (under review).  If you
have a moment and could try it out on Solaris and let me
know how it goes I'd be grateful.
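
For anyone who wants to check their own libc, here is a minimal sketch.  The helper names plain_a_length and precise_a_length are made up for illustration; they just ask snprintf how many characters "%a" would produce, without and with an explicit precision:

```c
#include <stdio.h>

/* Number of characters the local C library produces for "%a" with
   no explicit precision, or a negative value on error.  The result
   is expected to differ between implementations.  */
static int
plain_a_length (double value)
{
  return snprintf (NULL, 0, "%a", value);
}

/* Same, but with an explicit precision, which pins down the number
   of digits after the radix point.  */
static int
precise_a_length (double value, int precision)
{
  return snprintf (NULL, 0, "%.*a", precision, value);
}
```

Comparing plain_a_length (1.0) across systems should show the kind of difference reported in the failures above; I deliberately give no expected values since they depend on the implementation.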

Thanks
Martin


Re: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439

2016-10-03 Thread Christophe Lyon
On 3 October 2016 at 18:07, Doug Gilmore  wrote:
>>From: Christophe Lyon [christophe.l...@linaro.org]
>>Sent: Monday, October 03, 2016 12:05 AM
>>To: Doug Gilmore
>>Cc: gcc-patches@gcc.gnu.org
>>Subject: Re: Fix PR tree-optimization/77808, ICE in 
>>duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>>
>>On 2 October 2016 at 23:05, Doug Gilmore  wrote:
>>> Hi Christophe,
>>>
>>>> From: Christophe Lyon [christophe.l...@linaro.org]
>>>> Sent: Saturday, October 01, 2016 7:57 AM
>>>> To: Doug Gilmore
>>>> Cc: gcc-patches@gcc.gnu.org
>>>> Subject: Re: Fix PR tree-optimization/77808, ICE in 
>>>> duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>>>>
>>>> Hi Doug,
>>>>
>>>> ...
>>>> I can confirm that your patch fixes the ICE I was seeing.
>>>>
>>>> However, the new testcase does not pass on low end
>>>> architectures:
>>>> cc1: warning: -fprefetch-loop-arrays not supported for this target
>>>> (try -march switches)
>>>>
>>>> Can you add a guard?
>>>>
>>>> Thanks,
>>>>
>>>> Christophe
>>> I updated the test to only run on X86, MIPS and AARCH64.  Is that OK?
>>>
>>
>>I'm afraid not.
>>
>>The ICE occurred on some arm targets. By "low end" I meant armv5t for
>>example, as opposed to armv7t.
>>Is there a suitable effective target?
> I'll need to investigate that.  BTW, gcc.dg/pr53550.c contains:
> /* PR tree-optimization/53550 */
> /* { dg-do compile } */
> /* { dg-options "-O2 -fprefetch-loop-arrays -w" } */
>
> int *
> foo (int *x)
> {
>   int *a = x + 10, *b = x, *c = a;
>   while (b != c)
> *--c = *b++;
>   return x;
> }
>
> Is it also failing on armv5t?  I suppose it would.
>
It doesn't, but that's probably thanks to -w

Christophe

> Thanks,
>
> Doug
>>
>>Thanks,
>>
>>Christophe
>>
>>> Thanks,
>>>
>>> Doug


RE: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439

2016-10-03 Thread Doug Gilmore
>From: Christophe Lyon [christophe.l...@linaro.org]
>Sent: Monday, October 03, 2016 11:23 AM
>To: Doug Gilmore
>Cc: gcc-patches@gcc.gnu.org
>Subject: Re: Fix PR tree-optimization/77808, ICE in 
>duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>
>On 3 October 2016 at 18:07, Doug Gilmore  wrote:
>>>From: Christophe Lyon [christophe.l...@linaro.org]
>>>Sent: Monday, October 03, 2016 12:05 AM
>>>To: Doug Gilmore
>>>Cc: gcc-patches@gcc.gnu.org
>>>Subject: Re: Fix PR tree-optimization/77808, ICE in 
>>>duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>>>
>>>On 2 October 2016 at 23:05, Doug Gilmore  wrote:
>>>> Hi Christophe,
>>>>
>>>>> From: Christophe Lyon [christophe.l...@linaro.org]
>>>>> Sent: Saturday, October 01, 2016 7:57 AM
>>>>> To: Doug Gilmore
>>>>> Cc: gcc-patches@gcc.gnu.org
>>>>> Subject: Re: Fix PR tree-optimization/77808, ICE in 
>>>>> duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>>>>>
>>>>> Hi Doug,
>>>>>
>>>>> ...
>>>>> I can confirm that your patch fixes the ICE I was seeing.
>>>>>
>>>>> However, the new testcase does not pass on low end
>>>>> architectures:
>>>>> cc1: warning: -fprefetch-loop-arrays not supported for this target
>>>>> (try -march switches)
>>>>>
>>>>> Can you add a guard?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Christophe
>>>> I updated the test to only run on X86, MIPS and AARCH64.  Is that OK?
>>>>
>>>
>>>I'm afraid not.
>>>
>>>The ICE occurred on some arm targets. By "low end" I meant armv5t for
>>>example, as opposed to armv7t.
>>>Is there a suitable effective target?
>> I'll need to investigate that.  BTW, gcc.dg/pr53550.c contains:
>> /* PR tree-optimization/53550 */
>> /* { dg-do compile } */
>> /* { dg-options "-O2 -fprefetch-loop-arrays -w" } */
>>
>> int *
>> foo (int *x)
>> {
>>   int *a = x + 10, *b = x, *c = a;
>>   while (b != c)
>> *--c = *b++;
>>   return x;
>> }
>>
>> Is it also failing on armv5t?  I suppose it would.
>>
>It doesn't, but that's probably thanks to -w
Sounds like we don't need to add guards then; it is just a matter
of adding -w to the command line.

Does that work for you?

Thanks,

Doug
>
>Christophe
>
>> Thanks,
>>
>> Doug
>>>
>>>Thanks,
>>>
>>>Christophe
>>>
>>>> Thanks,
>>>>
>>>> Doug


libgo patch committed: strip most C macros from runtime.inc

2016-10-03 Thread Ian Lance Taylor
The Go runtime package in libgo is picking up C macros from
runtime_sysinfo.go and then re-exporting them to runtime.inc.  This
can cause name conflicts.  Change the Makefile so that we only put the
macros we need into runtime.inc.  These are the constants that are
actually defined by Go code, not runtime_sysinfo.go.  There are only a
few, so we can pattern match.

This is an additional hack on runtime.inc.  The long term goal is to
convert the runtime package to Go and eliminate runtime.inc entirely,
so a few hacks seem acceptable.

This fixes GCC PR 77809.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 240667)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-f3fb9bf2d5a009a707962a416fcd1a8435756218
+325f8074c5224ae537f8e00aede5c780b70f914c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 240667)
+++ libgo/Makefile.am   (working copy)
@@ -1284,8 +1284,13 @@ runtime_go_lo_GOCFLAGS = -fgo-c-header=r
 runtime-go.lo:
$(BUILDPACKAGE)
 runtime.inc: s-runtime-inc; @true
-s-runtime-inc: runtime-go.lo
-   $(SHELL) $(srcdir)/mvifdiff.sh runtime.inc.tmp runtime.inc
+s-runtime-inc: runtime-go.lo Makefile
+   rm -f runtime.inc.tmp2
+   grep -v "#define _" runtime.inc.tmp > runtime.inc.tmp2
+   for pattern in '_G[a-z]' '_P[a-z]' _Max _Lock _Sig _Trace _MHeap _Num; 
do \
+ grep "#define $$pattern" runtime.inc.tmp >> runtime.inc.tmp2; \
+   done
+   $(SHELL) $(srcdir)/mvifdiff.sh runtime.inc.tmp2 runtime.inc
$(STAMP) $@
 runtime_check_GOCFLAGS = -fgo-compiling-runtime
 runtime/check: $(CHECK_DEPS)


[gomp4] gimple prettypint

2016-10-03 Thread Nathan Sidwell

Committed this port from trunk to gomp4

nathan
Index: ChangeLog.gomp
===
--- ChangeLog.gomp	(revision 240724)
+++ ChangeLog.gomp	(working copy)
@@ -1,3 +1,8 @@
+2016-10-03  Nathan Sidwell  
+
+	* gimple-pretty-print.c (dump_gimple_call_args): Simplify "' "
+	printing.
+
 2016-10-02  Chung-Lin Tang  
 
 	PR fortran/77371
Index: gimple-pretty-print.c
===
--- gimple-pretty-print.c	(revision 240724)
+++ gimple-pretty-print.c	(working copy)
@@ -629,26 +629,21 @@ dump_gimple_call_args (pretty_printer *b
 	{
 	  i++;
 	  pp_string (buffer, enums[v]);
-	  if (i < gimple_call_num_args (gs))
-		pp_string (buffer, ", ");
 	}
 	}
 }
 
   for (; i < gimple_call_num_args (gs); i++)
 {
-  dump_generic_node (buffer, gimple_call_arg (gs, i), 0, flags, false);
-  if (i < gimple_call_num_args (gs) - 1)
+  if (i)
 	pp_string (buffer, ", ");
+  dump_generic_node (buffer, gimple_call_arg (gs, i), 0, flags, false);
 }
 
   if (gimple_call_va_arg_pack_p (gs))
 {
-  if (gimple_call_num_args (gs) > 0)
-{
-  pp_comma (buffer);
-  pp_space (buffer);
-}
+  if (i)
+	pp_string (buffer, ", ");
 
   pp_string (buffer, "__builtin_va_arg_pack ()");
 }
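
As I read it, the simplification above boils down to emitting the separator before every item except the first, rather than after every item except the last.  A minimal standalone sketch of that idiom follows; join_args is a made-up helper for illustration and is not part of gimple-pretty-print.c:

```c
#include <stdio.h>

/* Hypothetical helper: join COUNT strings from ARGS into BUF (of
   SIZE bytes), writing ", " before every item except the first.  */
static void
join_args (char *buf, size_t size, const char *const *args, unsigned count)
{
  size_t used = 0;

  if (size == 0)
    return;
  buf[0] = '\0';
  for (unsigned i = 0; i < count; i++)
    {
      /* A nonzero index means something was already printed.  */
      int n = snprintf (buf + used, size - used, "%s%s",
			i ? ", " : "", args[i]);
      if (n < 0 || (size_t) n >= size - used)
	break;	/* Stop on error or truncation.  */
      used += (size_t) n;
    }
}
```

The loop never needs to know how many items are still to come, which is why a trailing element such as __builtin_va_arg_pack () can reuse the same test.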


Re: [PATCH, Fortran] Fix ICE due to comparison between UNION components

2016-10-03 Thread Fritz Reese
On Sun, Oct 2, 2016 at 6:27 PM, Fritz Reese  wrote:
> All,
>
> The attached fixes an[other] ICE in the comparison between UNIONs.
> This time the ICE is due to a BT_UNION component comparing itself to a
> BT_DERIVED component, thus considering their FL_STRUCT and FL_UNION
> typenodes to be equal. This is very similar to PR fortran/77782,
> except it is an error in the comparison of _components_ from
> gfc_compare_types, instead of an error comparing the _type symbols_
> from gfc_compare_derived. The patch makes sure that BT_UNION compared
> to anything other than BT_UNION is _not_ equal, while still comparing
> two union components with gfc_compare_union_types.
>
> Will commit soon with no complaints. Maybe this patch will finally get
> type comparison right for unions.
>

Meant to include a null-guard as part of the patch, see below.


Fritz Reese

2016-10-03  Fritz Reese  

   Fix ICE due to comparison between UNION components.

gcc/fortran/
* interface.c (gfc_compare_types): Don't compare BT_UNION
components until we know they're both UNIONs.
* interface.c (gfc_compare_union_types): Guard against empty
components.

gcc/testsuite/gfortran.dg/
* dec_union_9.f90, dec_union_10.f90: New testcases.


union_ice.patch2
Description: Binary data


Re: [RFC] Extend ipa-bitwise-cp with pointer alignment propagation

2016-10-03 Thread Prathamesh Kulkarni
On 22 September 2016 at 17:26, Jan Hubicka  wrote:
>> Hi,
>> The attached patch tries to extend ipa bits propagation to handle
>> pointer alignment propagation.
>> The patch just disables ipa-cp-alignment pass, I suppose we want to
>> eventually remove it ?
>
> Yes, can you please verify that alignments it computes are monotonously
> worse than those your new code computes and include the removal in the
> next iteration of the patch?
>>
>> Bootstrap+tested on x86_64-unknown-linux-gnu.
>> Cross-tested on arm*-*-*, aarch64*-*-*.
>> Does the patch look OK ?
>>
>> Thanks,
>> Prathamesh
>> @@ -2258,8 +2271,8 @@ propagate_constants_accross_call (struct cgraph_edge 
>> *cs)
>>&dest_plats->itself);
>> ret |= propagate_context_accross_jump_function (cs, jump_func, i,
>> &dest_plats->ctxlat);
>> -   ret |= propagate_alignment_accross_jump_function (cs, jump_func,
>> -  
>> &dest_plats->alignment);
>> +// ret |= propagate_alignment_accross_jump_function (cs, jump_func,
>> +//
>> &dest_plats->alignment);
>
> obviously we do not want commented out code..
>
>> ret |= propagate_bits_accross_jump_function (cs, i, jump_func,
>>  
>> &dest_plats->bits_lattice);
>> ret |= propagate_aggs_accross_jump_function (cs, jump_func,
>> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
>> index 1629781..5cee27b 100644
>> --- a/gcc/ipa-prop.c
>> +++ b/gcc/ipa-prop.c
>> @@ -1701,6 +1701,16 @@ ipa_compute_jump_functions_for_edge (struct 
>> ipa_func_body_info *fbi,
>> jfunc->bits.mask = 0;
>>   }
>>   }
>> +  else if (POINTER_TYPE_P (TREE_TYPE (arg)))
>> + {
>> +   unsigned HOST_WIDE_INT bitpos;
>> +   unsigned align;
>> +
>> +   jfunc->bits.known = true;
>> +   get_pointer_alignment_1 (arg, &align, &bitpos);
>> +   jfunc->bits.mask = wi::mask(TYPE_PRECISION (TREE_TYPE 
>> (arg)), false).and_not (align / BITS_PER_UNIT - 1);
>
> ... and long lines :)
>
>> +   jfunc->bits.value = bitpos / BITS_PER_UNIT;
>> + }
>>else
>>   gcc_assert (!jfunc->bits.known);
>>
>> @@ -5534,7 +5544,7 @@ ipcp_update_bits (struct cgraph_node *node)
>>next_parm = DECL_CHAIN (parm);
>>
>>if (!bits[i].known
>> -   || !INTEGRAL_TYPE_P (TREE_TYPE (parm))
>> +   || !(INTEGRAL_TYPE_P (TREE_TYPE (parm)) || POINTER_TYPE_P (TREE_TYPE 
>> (parm)))
>
> I suppose eventually we may want to enable other types, too.
> It does even make sense to propagate this on aggregates, but definitely on
> vectors and complex numbers.
>
> Otherwise the patch seems fine to me (modulo Richard's comments)
Hi,
Sorry for late response, I was travelling.
I tried to verify the alignments are monotonously worse with the
attached patch (verify.diff),
which asserts that alignment lattice is not better than bits lattice
during each propagation
step in propagate_constants_accross_call().
Does that look OK ?

ipa-cp-alignment has better alignments than ipa-bit-cp in following cases:

a) ipa_get_type() returns NULL: ipa-bits-cp sets lattice to bottom if
ipa_get_type (param) returns NULL,
for instance in case of K&R function, while ipa-cp-alignment doesn't
look at param types,
and can propagate alignments.
The following assert:
if (bits_lattice.bottom_p ())
  gcc_assert (align_lattice.bottom_p())

triggered for 400.perlbench, 403.gcc, 456.hmmer and 481.wrf due to
ipa_get_type()
returning NULL. I am not really sure how to handle this case, since we
need to know parameter's
type during bits propagation for obtaining precision.

b) This happens for attached test-case (test.i),
which is a reduced (and slightly modified) test-case from 458.sjeng.
Bits propagation sets lattice to bottom, while alignment propagation
propagates .

In bits_lattice::meet_with_1
m_mask = other_mask = 0x0fff0
m_value = 0x7
other_value = 0x8

In this case meet operation sets m_mask to:
(m_mask | mask) | (m_value ^ other_value) = 0x0fff0 |
(0xf) == 0x0
and hence the bits lattice is set to bottom.
I suppose it doesn't matter for this case if bits propagation sets
lattice to bottom,
since propagating  isn't really useful ?
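
To spell out the arithmetic, here is a minimal sketch of the meet, assuming a 64-bit precision and a simplified lattice made of plain integers.  The struct and function names are made up for illustration and only mirror the formula quoted above; the mask used in the example (all bits unknown except the low four) is likewise an assumption:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a bits lattice at 64-bit precision.  A set
   bit in MASK means that bit is unknown; VALUE holds the known bits.  */
struct bits
{
  uint64_t value;
  uint64_t mask;
};

/* Meet of two lattice values: a bit stays known only if it is known
   on both sides and both sides agree on it.  */
static struct bits
bits_meet (struct bits a, struct bits b)
{
  struct bits r;

  r.mask = (a.mask | b.mask) | (a.value ^ b.value);
  r.value = a.value & ~r.mask;
  return r;
}

/* Bottom means that no bit is known any more.  */
static bool
bits_bottom_p (struct bits b)
{
  return b.mask == UINT64_MAX;
}
```

With both masks leaving only the low four bits known and values 0x7 and 0x8, the xor is 0xf, so every bit becomes unknown and the result is bottom, matching what the dump shows.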

The attached patch (alignprop-4.diff) removes ipa-cp-alignment, and
checks for misalign against old_misalign and prints a message in the dump file
if they mismatch. Testing in progress.

Thanks,
Prathamesh
> Honza
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 77da489..fee530e 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -1910,6 +1910,28 @@ propagate_context_accross_jump_function (cgraph_edge *cs,
   return ret;
 }
 
+static void
+verify_align_worse_p (ipcp_param_lattices *dest_plats)
+{
+  ipcp_alignment_lattice align_lattice = dest_plats->alignment;
+  ipcp_bits_lattice bits_lattice = dest_plat

[gomp4] tile pre patch

2016-10-03 Thread Nathan Sidwell

I've committed this to gomp4.  It gets a few tile-related things out of the way.

1) we were asserting we never saw tile clauses in a few places.  That'll change 
soon, and the processing required of them is nothing, so just accept them.  We 
don't need to gimplify the operands, as they have to be INTEGER_CSTs


2) Broke out OACC_DIM_{SIZE,POS} internal function generation to a worker 
function, as I need that soon.


3) Remove a stale comment and note OACC_DIM_POS's non-constness might be overly 
conservative.


nathan
2016-10-03  Nathan Sidwell  

	* gimplify.c (gimplify_scan_omp_clauses): No special handling for
	OMP_CLAUSE_TILE.
	* omp-low.c (scan_sharing_clauses): Allow OMP_CLAUSE_TILE.
	(expand_oacc_for): Remove out of date note.  Fix whitespace.
	(oacc_dim_call): New.
	(oacc_thread_numbers): Use it.
	(oacc_loop_fixed_partitions): Dump partitioning.
	* tree-nested.c (convert_nonlocal_omp_clauses): Allow OMP_CLAUSE_TILE.
	* internal-fn.def (GOACC_DIM_POS): Comment may be overly conservative.

Index: gimplify.c
===
--- gimplify.c	(revision 240724)
+++ gimplify.c	(working copy)
@@ -7555,16 +7555,6 @@ gimplify_scan_omp_clauses (tree *list_p,
 	remove = true;
 	  break;
 
-	case OMP_CLAUSE_TILE:
-	  for (tree list = OMP_CLAUSE_TILE_LIST (c); !remove && list;
-	   list = TREE_CHAIN (list))
-	{
-	  if (gimplify_expr (&TREE_VALUE (list), pre_p, NULL,
- is_gimple_val, fb_rvalue) == GS_ERROR)
-		remove = true;
-	}
-	  break;
-
 	case OMP_CLAUSE_DEVICE_RESIDENT:
 	  remove = true;
 	  break;
@@ -7573,6 +7563,7 @@ gimplify_scan_omp_clauses (tree *list_p,
 	case OMP_CLAUSE_ORDERED:
 	case OMP_CLAUSE_UNTIED:
 	case OMP_CLAUSE_COLLAPSE:
+	case OMP_CLAUSE_TILE:
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_SEQ:
 	case OMP_CLAUSE_INDEPENDENT:
Index: internal-fn.def
===
--- internal-fn.def	(revision 240724)
+++ internal-fn.def	(working copy)
@@ -175,7 +175,7 @@ DEF_INTERNAL_FN (UNIQUE, ECF_NOTHROW, NU
dimension.  DIM_POS is pure (and not const) so that it isn't
thought to clobber memory and can be gcse'd within a single
parallel region, but not across FORK/JOIN boundaries.  They take a
-   single INTEGER_CST argument.  */
+   single INTEGER_CST argument.  This might be overly conservative.  */
 DEF_INTERNAL_FN (GOACC_DIM_SIZE, ECF_CONST | ECF_NOTHROW | ECF_LEAF, ".")
 DEF_INTERNAL_FN (GOACC_DIM_POS, ECF_PURE | ECF_NOTHROW | ECF_LEAF, ".")
 
Index: omp-low.c
===
--- omp-low.c	(revision 240724)
+++ omp-low.c	(working copy)
@@ -2221,6 +2221,7 @@ scan_sharing_clauses (tree clauses, omp_
 	case OMP_CLAUSE_INDEPENDENT:
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_SEQ:
+	case OMP_CLAUSE_TILE:
 	case OMP_CLAUSE_DEVICE_TYPE:
 	  break;
 
@@ -2234,7 +2235,6 @@ scan_sharing_clauses (tree clauses, omp_
 	case OMP_CLAUSE_BIND:
 	case OMP_CLAUSE_DEVICE_RESIDENT:
 	case OMP_CLAUSE_NOHOST:
-	case OMP_CLAUSE_TILE:
 	case OMP_CLAUSE__CACHE_:
 	default:
 	  gcc_unreachable ();
@@ -2395,6 +2395,7 @@ scan_sharing_clauses (tree clauses, omp_
 	case OMP_CLAUSE_INDEPENDENT:
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_SEQ:
+	case OMP_CLAUSE_TILE:
 	case OMP_CLAUSE__GRIDDIM_:
 	case OMP_CLAUSE_DEVICE_TYPE:
 	  break;
@@ -2402,7 +2403,6 @@ scan_sharing_clauses (tree clauses, omp_
 	case OMP_CLAUSE_BIND:
 	case OMP_CLAUSE_DEVICE_RESIDENT:
 	case OMP_CLAUSE_NOHOST:
-	case OMP_CLAUSE_TILE:
 	case OMP_CLAUSE__CACHE_:
 	default:
 	  gcc_unreachable ();
@@ -11244,11 +11244,7 @@ expand_omp_taskloop_for_inner (struct om
 [incoming]
  V = B + ((range -/+ 1) / S +/- 1) * S [*]
 
-   [*] Needed if V live at end of loop
-
-   Note: CHUNKING & GWV mask are specified explicitly here.  This is a
-   transition, and will be specified by a more general mechanism shortly.
- */
+   [*] Needed if V live at end of loop.  */
 
 static void
 expand_oacc_for (struct omp_region *region, struct omp_for_data *fd)
@@ -11357,7 +11353,6 @@ expand_oacc_for (struct omp_region *regi
 	  ass = gimple_build_assign (fd->loop.n2, total);
 	  gsi_insert_before (&gsi, ass, GSI_SAME_STMT);
 	}
-  
 }
 
   tree b = fd->loop.n1;
@@ -18906,6 +18901,23 @@ omp_finish_file (void)
 }
 }
 
+/* Call dim_pos (POS == true) or dim_size (POS == false) builtins for
+   axis DIM.  Return a tmp var holding the result.  */
+
+static tree
+oacc_dim_call (bool pos, int dim, gimple_seq *seq)
+{
+  tree arg = build_int_cst (unsigned_type_node, dim);
+  tree size = create_tmp_var (integer_type_node);
+  enum internal_fn fn = pos ? IFN_GOACC_DIM_POS : IFN_GOACC_DIM_SIZE;
+  gimple *call = gimple_build_call_internal (fn, 1, arg);
+
+  gimple_call_set_lhs (call, size);
+  gimple_seq_add_stmt (seq, call);
+
+  return size;
+}
+
 /* Find the number of threads (POS = false), or thread number (POS =
true) for an OpenACC region partitioned as MASK.  Setup code
required 

[EVRP] Fold stmts with vrp_fold_stmt

2016-10-03 Thread kugan

Hi,

This patch improves Early VRP by folding stmts using vrp_fold_stmt as it 
is done in ssa_propagate for VRP.


I have also changed EVRP to handle POINTER_TYPE_P. I will send follow up 
patches to use this in IPA-VRP.


Bootstrapped and regression tested on x86_64-linux-gnu with no new 
regressions. Is this OK for trunk?


Thanks,
Kugan

gcc/testsuite/ChangeLog:

2016-10-03  Kugan Vivekanandarajah  

* gcc.dg/pr68217.c: Adjust testcase as more cases are now handled in
  evrp.
* gcc.dg/predict-1.c: Likewise.
* gcc.dg/predict-9.c: Likewise.
* gcc.dg/tree-ssa/pr20318.c: Likewise.
* gcc.dg/tree-ssa/pr21001.c: Likewise.
* gcc.dg/tree-ssa/pr21090.c: Likewise.
* gcc.dg/tree-ssa/pr21294.c: Likewise.
* gcc.dg/tree-ssa/pr21559.c: Likewise.
* gcc.dg/tree-ssa/pr21563.c: Likewise.
* gcc.dg/tree-ssa/pr23744.c: Likewise.
* gcc.dg/tree-ssa/pr25382.c: Likewise.
* gcc.dg/tree-ssa/pr61839_1.c: Likewise.
* gcc.dg/tree-ssa/pr68431.c: Likewise.
* gcc.dg/tree-ssa/vrp03.c: Likewise.
* gcc.dg/tree-ssa/vrp07.c: Likewise.
* gcc.dg/tree-ssa/vrp09.c: Likewise.
* gcc.dg/tree-ssa/vrp17.c: Likewise.
* gcc.dg/tree-ssa/vrp18.c: Likewise.
* gcc.dg/tree-ssa/vrp19.c: Likewise.
* gcc.dg/tree-ssa/vrp20.c: Likewise.
* gcc.dg/tree-ssa/vrp23.c: Likewise.
* gcc.dg/tree-ssa/vrp24.c: Likewise.
* gcc.dg/tree-ssa/vrp58.c: Likewise.
* gcc.dg/tree-ssa/vrp92.c: Likewise.
* gcc.dg/tree-ssa/vrp98.c: Likewise.
* gcc.dg/vrp-min-max-1.c: Likewise.

gcc/ChangeLog:

2016-10-03  Kugan Vivekanandarajah  

* tree-vrp.c (evrp_dom_walker::before_dom_children): Handle
  POINTER_TYPE_P. Also fold stmts with vrp_fold_stmt.
>From 4bb16e7d01674461a47e6b6488b04fb1907234ea Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Mon, 3 Oct 2016 06:12:05 +1100
Subject: [PATCH 1/5] Fold stmts using vrp_fold in evrp

---
 gcc/testsuite/gcc.dg/pr68217.c|  4 ++--
 gcc/testsuite/gcc.dg/predict-1.c  |  2 +-
 gcc/testsuite/gcc.dg/predict-9.c  |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr20318.c   |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr21001.c   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21090.c   |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr21294.c   |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr21559.c   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21563.c   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr23744.c   |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr25382.c   |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c |  6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/pr68431.c   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp03.c |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/vrp07.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp09.c |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/vrp17.c |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/vrp18.c |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/vrp19.c |  6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/vrp20.c |  6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/vrp23.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp24.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp58.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp92.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp98.c |  2 +-
 gcc/testsuite/gcc.dg/vrp-min-max-1.c  |  2 +-
 gcc/tree-vrp.c| 27 ---
 27 files changed, 62 insertions(+), 49 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr68217.c b/gcc/testsuite/gcc.dg/pr68217.c
index 426a99a..fbe4627 100644
--- a/gcc/testsuite/gcc.dg/pr68217.c
+++ b/gcc/testsuite/gcc.dg/pr68217.c
@@ -1,6 +1,6 @@
 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
 
 int foo (void)
 {
@@ -11,4 +11,4 @@ int foo (void)
 return 0;
 }
 
-/* { dg-final { scan-tree-dump "\\\[-INF, 0\\\]" "vrp1" } } */
+/* { dg-final { scan-tree-dump "\\\[-INF, 0\\\]" "evrp" } } */
diff --git a/gcc/testsuite/gcc.dg/predict-1.c b/gcc/testsuite/gcc.dg/predict-1.c
index 10d62ba..0d14802 100644
--- a/gcc/testsuite/gcc.dg/predict-1.c
+++ b/gcc/testsuite/gcc.dg/predict-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-profile_estimate" } */
 
 extern int global;
 
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index 196e31c..8833cb3 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-profile_estimate" } */
 
 extern int global;
 extern int global2;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr20318.c b/gcc/testsuite/gcc.dg/tree-ssa/pr20318.c
index 41f569e..11d4f0d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr20318.c
+++ b/gcc/testsuite/g

[PR tree-optimization/71550 ] Drop cached loop iteration information as needed due to threading

2016-10-03 Thread Jeff Law
As noted in BZs 71550 and 71403 (and possibly others; I'm going to have 
to do some searching), jump threading can sometimes fuse two loops, in 
the process creating an irreducible loop and invalidating the cached 
iteration information.


The no-longer-valid cached iteration information can cause the unroller 
to make some unpleasant transformations and generate incorrect code.


I believe it was Jan that pointed me at vect_free_loop_info_assumption. 
It's somewhat poorly named, but does exactly what we need.


I looked at renaming it and putting it elsewhere, but it seems to straddle 
scev, vectorization, the generic loop infrastructure, etc.  So I just 
left it as-is.


Bootstrapped and regression tested on x86_64-linux-gnu.  Also tested by 
reverting the changes on the trunk which make 71550 latent, adding this 
patch and verifying 71550 works correctly.


Installed on the trunk.

Jeff
commit 88cd085912752f971aa2e6aee2ed2d05dd2c5ca7
Author: law 
Date:   Mon Oct 3 19:28:24 2016 +

PR tree-optimization/71550
PR tree-optimization/71403
* tree-ssa-threadbackward.c: Include tree-vectorizer.h
(profitable_jump_thread_path): Also return boolean indicating if
the realized path will create an irreducible loop.
Remove loop depth tests from 71403.
(fsm_find_control_statement_thread_paths): Remove loop depth tests
from 71403.  If threading will create an irreducible loop, then
throw away loop iteration and related information.

PR tree-optimization/71550
PR tree-optimization/71403
* gcc.c-torture/execute/pr71550.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@240727 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7f4e311..ba56e63 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,15 @@
+2016-10-03  Jeff Law  
+
+   PR tree-optimization/71550
+   PR tree-optimization/71403
+   * tree-ssa-threadbackward.c: Include tree-vectorizer.h
+   (profitable_jump_thread_path): Also return boolean indicating if
+   the realized path will create an irreducible loop.
+   Remove loop depth tests from 71403.
+   (fsm_find_control_statement_thread_paths): Remove loop depth tests
+   from 71403.  If threading will create an irreducible loop, then
+   throw away loop iteration and related information.
+
 2016-10-03  Uros Bizjak  
 
* configure.ac (strict_warn): Merge -Wmissing-format-attribute and
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index d0ed6a6..9e62464 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2016-09-26  Jeff Law  
+
+   PR tree-optimization/71550
+   PR tree-optimization/71403
+   * gcc.c-torture/execute/pr71550.c: New test.
+
 2016-10-03  Senthil Kumar Selvaraj  
 
* gcc.target/avr/torture/builtins-error.c: Add -ffat-lto-objects
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr71550.c b/gcc/testsuite/gcc.c-torture/execute/pr71550.c
new file mode 100644
index 000..8d1ecda
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr71550.c
@@ -0,0 +1,26 @@
+
+extern void exit (int);
+
+int a = 3, b, c, f, g, h;
+unsigned d;
+char *e;
+
+int
+main ()
+{
+  for (; a; a--)
+{
+  int i;
+  if (h && i)
+   __builtin_printf ("%d%d", c, f);
+  i = 0;
+  for (; i < 2; i++)
+   if (g)
+ for (; d < 10; d++)
+   b = *e;
+  i = 0;
+  for (; i < 1; i++)
+   ;
+}
+  exit (0);
+}
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 6b522ad..fd7d855 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-ssa.h"
 #include "tree-phinodes.h"
 #include "tree-inline.h"
+#include "tree-vectorizer.h"
 
 static int max_threaded_paths;
 
@@ -110,7 +111,8 @@ fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
 
 static edge
 profitable_jump_thread_path (vec *&path,
-basic_block bbi, tree name, tree arg, bool speed_p)
+basic_block bbi, tree name, tree arg, bool speed_p,
+bool *creates_irreducible_loop)
 {
   /* Note BBI is not in the path yet, hence the +1 in the test below
  to make sure BBI is accounted for in the path length test.  */
@@ -296,12 +298,12 @@ profitable_jump_thread_path (vec *&path,
   return NULL;
 }
 
-  bool creates_irreducible_loop = false;
+  *creates_irreducible_loop = false;
   if (threaded_through_latch
   && loop == taken_edge->dest->loop_father
   && (determine_bb_domination_status (loop, taken_edge->dest)
  == DOMST_NONDOMINATING))
-creates_irreducible_loop = true;
+*creates_irreducible_loop = true;
 
   if (path_crosses_loops)
 {
@@ -343,7 +345,7 @@ profitable_jump_thread_path (vec *&path,
 

[ipa-cp] add space in dump message

2016-10-03 Thread Prathamesh Kulkarni
Committed as obvious (r240730).

Thanks,
Prathamesh
2016-10-03  Prathamesh Kulkarni  

* ipa-cp.c (propagate_bits_accross_jump_function): Introduce space
between callee name and param in dump message in call to fprintf.

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 88baf69..1dc5cb6 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -1775,7 +1775,7 @@ propagate_bits_accross_jump_function (cgraph_edge *cs, int idx, ipa_jump_func *j
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Setting dest_lattice to bottom, because"
-   "param %i type is NULL for %s\n", idx,
+   " param %i type is NULL for %s\n", idx,
cs->callee->name ());
 
   return dest_lattice->set_to_bottom ();


[ipa-prop] set m_vr and bits to NULL in ipcp_transform_function

2016-10-03 Thread Prathamesh Kulkarni
Bootstrap+test in progress on x86_64-unknown-linux-gnu.
OK to commit if it passes?

Thanks,
Prathamesh
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 5ed9bbf..d71ffcf 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -5667,6 +5667,9 @@ ipcp_transform_function (struct cgraph_node *node)
   fbi.bb_infos.release ();
   free_dominance_info (CDI_DOMINATORS);
   (*ipcp_transformations)[node->uid].agg_values = NULL;
+  (*ipcp_transformations)[node->uid].bits = NULL;
+  (*ipcp_transformations)[node->uid].m_vr = NULL;
+
   descriptors.release ();
 
   if (!something_changed)


[PATCH] fix PR c++/77804 - ICE on placement VLA new

2016-10-03 Thread Martin Sebor

The attached patch removes an assumption from the implementation
of the -Wplacement-new warning that the size of the array type
enclosed in parentheses and accepted by G++ as an extension is
constant.  The assumption causes an ICE in 6.2.0 and 7.0.
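
The guard being added follows a common pattern: check that a size is actually a known constant before using it as one. A minimal sketch of that pattern, with hypothetical names (TypeSize and placement_new_too_small here are illustrative only and are not the functions in init.c):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a type's size: a byte count is only
// meaningful when the size is a compile-time constant (i.e. not a VLA).
struct TypeSize {
  bool is_constant;
  std::size_t bytes;
};

// Returns true when a "buffer too small" diagnostic would be warranted.
bool placement_new_too_small(std::size_t bytes_avail, const TypeSize &type)
{
  if (!type.is_constant)
    return false;                    // VLA: no constant size to compare against
  return bytes_avail < type.bytes;   // constant size: plain comparison
}

// Small self-check of the three cases.
void check_placement_examples()
{
  assert(placement_new_too_small(4, TypeSize{true, 8}));
  assert(!placement_new_too_small(256, TypeSize{true, 40}));
  assert(!placement_new_too_small(256, TypeSize{false, 0}));
}
```

Returning early for the non-constant case mirrors the patch's choice to skip the warning rather than guess at a size.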

Is the patch good to commit to both 7.0 and the 6 branch?

Thanks
Martin
PR c++/77804 - Internal compiler error on incorrect initialization of new-d array

gcc/cp/ChangeLog:
2016-10-03  Martin Sebor  

	PR c++/77804
	* init.c (warn_placement_new_too_small): Avoid assuming an array type
	has a constant size.

gcc/testsuite/ChangeLog:
2016-10-03  Martin Sebor  

	PR c++/77804
	* g++.dg/warn/Wplacement-new-size-4.C: New test.

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 798de08..30957f1 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -2504,7 +2504,7 @@ warn_placement_new_too_small (tree type, tree nelts, tree size, tree oper)
 	  && warn_placement_new < 2)
 	return;
 	}
-	  
+
   /* The size of the buffer can only be adjusted down but not up.  */
   gcc_checking_assert (0 <= adjust);
 
@@ -2526,8 +2526,13 @@ warn_placement_new_too_small (tree type, tree nelts, tree size, tree oper)
   else if (nelts && CONSTANT_CLASS_P (nelts))
 	  bytes_need = tree_to_uhwi (nelts)
 	* tree_to_uhwi (TYPE_SIZE_UNIT (type));
-  else
+  else if (tree_fits_uhwi_p (TYPE_SIZE_UNIT (type)))
 	bytes_need = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+  else
+	{
+	  /* The type is a VLA.  */
+	  return;
+	}
 
   if (bytes_avail < bytes_need)
 	{
diff --git a/gcc/testsuite/g++.dg/warn/Wplacement-new-size-4.C b/gcc/testsuite/g++.dg/warn/Wplacement-new-size-4.C
new file mode 100644
index 000..da9b1ab
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wplacement-new-size-4.C
@@ -0,0 +1,14 @@
+// PR c++/77804 - Internal compiler error on incorrect initialization of
+// new-d array
+// { dg-do compile }
+// { dg-additional-options "-Wplacement-new -Wvla -Wno-error=vla" }
+
+void* operator new[] (__SIZE_TYPE__ n, void *p) { return p; }
+
+int main()
+{
+char buf[256];
+unsigned n = 10;
+int* p = new (buf) (int[n]);  // { dg-warning "non-constant array new length must be specified without parentheses around the type-id" }
+//  { dg-warning "ISO C\\+\\+ forbids variable length array" "vla warning" { target *-*-* } .-1 }
+}


Re: [PATCH] Fix libstdc++ versioned namespace build

2016-10-03 Thread Jonathan Wakely

On 03/10/16 15:41 +0100, Jonathan Wakely wrote:

The versioned namespace build has been broken on all branches for some
time. It's due to new code that doesn't use the namespace macros in
the right places. This fixes all issues.

Rather than declaring the std::experimental::* namespaces in
 I've added a new file that declares them and is
only included by LFTS headers. That allows the new test to pass, which
verifies that the std::experimental namespace doesn't exist when no TS
headers are included.
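
The language feature involved can be shown with a small standalone sketch. It assumes nothing about the libstdc++ macros: mylib and v7 are hypothetical stand-ins for std and std::__7, chosen only to show how names declared inside an inline namespace (including a nested namespace) are found through the enclosing one.

```cpp
#include <cassert>

namespace mylib {
inline namespace v7 {         // hypothetical versioned (inline) namespace
  int answer() { return 42; }

  namespace experimental {    // nested inside the versioned namespace
    int draft() { return 1; }
  }
}  // inline namespace v7
}  // namespace mylib

// Names are reachable with or without the version component.
void check_namespace_examples()
{
  assert(mylib::answer() == 42);
  assert(mylib::v7::answer() == 42);
  assert(mylib::experimental::draft() == 1);
  assert(&mylib::answer == &mylib::v7::answer);
}
```

A declaration that is closed and reopened outside the inline namespace would introduce a second, distinct entity, which is the kind of nesting mistake the ChangeLog entries refer to.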

PR libstdc++/68323
PR libstdc++/77794
* config/abi/pre/gnu-versioned-namespace.ver: Add exports for
__cxa_thread_atexit and __gnu_cxx::__freeres.
* include/Makefile.am: Add 
* include/Makefile.in: Regenerate.
* include/bits/basic_string.h: Fix nesting of versioned namespaces.
* include/bits/c++config: Declare versioned namespaces for literals.
* include/bits/regex.h (basic_regex, match_results): Add workarounds
for PR c++/59256.
* include/bits/uniform_int_dist.h: Fix nesting of versioned namespace.
* include/std/chrono: Likewise.
* include/std/complex: Likewise.
* include/std/string_view: Likewise.
* include/std/variant: Likewise. Add workaround for PR c++/59256.
* include/experimental/bits/fs_fwd.h: Declare versioned namespace.
* include/experimental/bits/lfts_config.h: Declare versioned
namespaces.
* include/experimental/algorithm: Include
.
* include/experimental/any: Likewise.
* include/experimental/bits/erase_if.h: Likewise.
* include/experimental/chrono: Likewise.
* include/experimental/functional: Likewise.
* include/experimental/memory_resource: Likewise.
* include/experimental/optional: Likewise.
* include/experimental/propagate_const: Likewise.
* include/experimental/random: Likewise.
* include/experimental/ratio: Likewise.
* include/experimental/system_error: Likewise.
* include/experimental/tuple: Likewise.
* include/experimental/type_traits: Likewise.
* include/experimental/utility: Likewise.
* include/experimental/string_view: Likewise. Fix nesting of
versioned namespaces.
* include/experimental/bits/string_view.tcc: Reopen inline namespace
for non-inline function definitions.
* testsuite/17_intro/using_namespace_std_exp_neg.cc: New test.
* testsuite/20_util/duration/literals/range.cc: Adjust dg-error line.
* testsuite/experimental/any/misc/any_cast_neg.cc: Likewise.
* testsuite/experimental/propagate_const/assignment/move_neg.cc:
Likewise.
* testsuite/experimental/propagate_const/cons/move_neg.cc: Likewise.
* testsuite/experimental/propagate_const/requirements2.cc: Likewise.
* testsuite/experimental/propagate_const/requirements3.cc: Likewise.
* testsuite/experimental/propagate_const/requirements4.cc: Likewise.
* testsuite/experimental/propagate_const/requirements5.cc: Likewise.
* testsuite/ext/profile/mutex_extensions_neg.cc: Likewise.

Tested x86_64-linux, with --enable-symvers=gnu-versioned-namespace and
--enable-symvers=gnu, on trunk and gcc-6 and gcc-5 branches.

The only failures are in synopsis.cc tests which expect to be able to
redeclare names in namespace std (which is ambiguous if they're really
declared in std::__7) or in tests that use scan-assembler or GDB and
the expected strings are different due to the __7 namespace. I will
probably add an effective target for the versioned namespace so we can
disable those tests when they're going to fail.

Committed to trunk and gcc-6 and gcc-5 branches.


It appears that I failed to squash my work-in-progress commits on the
gcc-6-branch, sorry.

The following commits (r240715-r240718) should have been the same
commit as r240719, and won't build for the default config. They work
for --enable-symvers=gnu-versioned-namespace, but commit r240719 has a
fix to make it also build for --enable-symvers=gnu, which is why it
all needed to be squashed into a single commit.

Apologies for messing up the branch.


commit b78f70a1eeac5da59f977fa58c332490e05c14b5
Author: redi 
Date:   Mon Oct 3 14:36:18 2016 +

   add exports
   
   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-6-branch@240718 138bc75d-0d04-0410-961f-82ee72b054a4


commit e5e83353c99c7cabf8d7cdcdf7ee750e5ec7f129
Author: redi 
Date:   Mon Oct 3 14:36:13 2016 +

   Fix misuse of versioned namespace for LFTS
   
   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-6-branch@240717 138bc75d-0d04-0410-961f-82ee72b054a4


commit a2121229a7e8405c7722a8b872d6ad8981c4b2d4
Author: redi 
Date:   Mon Oct 3 14:36:06 2016 +

   Declare inline namespaces for filesystem
   
   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-6-branch@240716 138bc75d-0d04-0410-961f-82ee72b054a4


commit 0215eefdfd5d872

[gomp4] map the '*' tile argument onto integer_zero_node in fortran

2016-10-03 Thread Cesar Philippidis
As the subject states, this patch maps the '*' tile clause arguments
onto integer_zero_node. Previously the Fortran FE was mapping them onto
-1. This patch should bring the clause parsing in Fortran on par with the
C and C++ FEs.

This patch has been applied to gomp-4_0-branch.

Cesar
2016-10-03  Cesar Philippidis  

	gcc/fortran/	
	* openmp.c (resolve_oacc_loop_blocks): Use integer zero to
represent the '*' tile argument.

	gcc/testsuite/
	* gfortran.dg/goacc/tile-lowering.f95: New test.


diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 399b5d1..df489ba 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -5183,11 +5183,11 @@ resolve_oacc_loop_blocks (gfc_code *code)
 	  if (el->expr == NULL)
 	{
 	  /* NULL expressions are used to represent '*' arguments.
-		 Convert those to a -1 expressions.  */
+		 Convert those to a 0 expressions.  */
 	  el->expr = gfc_get_constant_expr (BT_INTEGER,
 		gfc_default_integer_kind,
 		&code->loc);
-	  mpz_set_si (el->expr->value.integer, -1);
+	  mpz_set_si (el->expr->value.integer, 0);
 	}
 	  else
 	{
diff --git a/gcc/testsuite/gfortran.dg/goacc/tile-lowering.f95 b/gcc/testsuite/gfortran.dg/goacc/tile-lowering.f95
new file mode 100644
index 000..b36cdc7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/tile-lowering.f95
@@ -0,0 +1,85 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-original" }
+
+subroutine test
+  integer i, j, k
+
+  !$acc parallel
+  !$acc loop tile (1)
+  do i = 1, 10
+  end do
+
+  !$acc loop tile (*)
+  do i = 1, 10
+  end do
+  
+  !$acc loop tile (1,2)
+  do i = 1, 10
+ do j = 1, 10
+ end do
+  end do
+
+  !$acc loop tile (*,2)
+  do i = 1, 10
+ do j = 1, 10
+ end do
+  end do
+
+  !$acc loop tile (1,*)
+  do i = 1, 10
+ do j = 1, 10
+ end do
+  end do
+
+  !$acc loop tile (*,*)
+  do i = 1, 10
+ do j = 1, 10
+ end do
+  end do
+
+  
+  !$acc loop tile (1,2,3)
+  do i = 1, 10
+ do j = 1, 10
+do k = 1, 10
+end do
+ end do
+  end do
+
+  !$acc loop tile (*,2,3)
+  do i = 1, 10
+ do j = 1, 10
+do k = 1, 10
+end do
+ end do
+  end do
+
+  !$acc loop tile (1,*,3)
+  do i = 1, 10
+ do j = 1, 10
+do k = 1, 10
+end do
+ end do
+  end do
+
+  !$acc loop tile (1,2,*)
+  do i = 1, 10
+ do j = 1, 10
+do k = 1, 10
+end do
+ end do
+  end do
+  !$acc end parallel
+end subroutine test
+
+! { dg-final { scan-tree-dump-times "tile\\(1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "tile\\(0\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "tile\\(1, 2\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "tile\\(0, 2\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "tile\\(1, 0\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "tile\\(0, 0\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "tile\\(1, 2, 3\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "tile\\(0, 2, 3\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "tile\\(1, 0, 3\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "tile\\(1, 2, 0\\)" 1 "original" } }
+


Re: Re: [PATCH, OpenACC, Fortran] Fix PR77371, ICE on allocatable

2016-10-03 Thread Cesar Philippidis
On 10/03/2016 07:59 AM, Jakub Jelinek wrote:

> with -fopenmp.  The var is actually properly allocatable in the latter case,
> while it is not with your patch on the first testcase, you just copy over the 
> host pointer, that
> is definitely not going to work on non-shared memory offloading.

I think that's the expected behavior in OpenACC. Basically, unless
pointers have explicit data clauses with subarray arguments, the
compiler is supposed to treat those pointers as "scalars" and not remap
the contents of the pointer.

Chung-Lin, maybe this issue with allocatable data along with a different
void will persuade the OpenACC technical committee to update the implicit
data mapping behavior of pointers. Can you raise this issue with the
OpenACC technical committee?

> There is nothing special about references that use POINTER_TYPE as opposed
> to REFERENCE_TYPE.  So, please first get this working with firstprivate on
> allocatables and only then start to play with reductions.

I agree something like that would be better. Is OpenMP supposed to
implicitly map the allocated data on the accelerator too?

Cesar



Re: [PATCH] Delete GCJ

2016-10-03 Thread Matthias Klose
On 05.09.2016 17:13, Andrew Haley wrote:
> As discussed.  I think I should ask a Global reviewer to approve this
> one.  For obvious reasons I haven't included the diffs to the deleted
> gcc/java and libjava directories.  The whole tree, post GCJ-deletion,
> is at svn+ssh://gcc.gnu.org/svn/gcc/branches/gcj/gcj-deletion-branch
> if anyone would like to try it.
> 
> Andrew.
> 

This still breaks bootstrap when configured with --enable-objc-gc.

The immediate step should be to fix the bootstrap failure; an additional step
would be to remove boehm-gc from the gcc sources and be able to use an external
boehm-gc.

Thanks, Matthias



[PATCH] define TARGET_PRINTF_POINTER_FORMAT for powerpc-linux (77837)

2016-10-03 Thread Martin Sebor

The attached patch adds definitions of TARGET_PRINTF_POINTER_FORMAT
to the rs6000 pair of linux.h and linux64.h headers, analogous to
the config/linux.h header.  This appears to be necessary since
unlike most other back ends, the rs6000 back end doesn't include
the latter linux.h.

The patch fixes bug 77837 - missing -Wformat-length warning for %p
with null argument on powerpc64.
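
For context on why the target's "%p" format matters to a length warning: the text printf produces for a pointer is implementation-defined, so its length (for a null pointer in particular) depends on the C library. The sketch below only measures that length at run time; pointer_format_length is a hypothetical helper written for this example and is unrelated to the GCC macro.

```cpp
#include <cassert>
#include <cstdio>

// Number of characters "%p" produces for the given pointer on this C library.
// snprintf with a null buffer and size 0 only computes the length.
int pointer_format_length(const void *p)
{
  return std::snprintf(nullptr, 0, "%p", const_cast<void *>(p));
}

// Small self-check: some non-empty text is produced in both cases.
void check_pointer_format_examples()
{
  int object = 0;
  assert(pointer_format_length(nullptr) > 0);
  assert(pointer_format_length(&object) > 0);
}
```

The exact output is deliberately not asserted, since it differs between C libraries; glibc, for example, prints a null pointer as "(nil)".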

Thanks
Martin
PR target/77837 - missing -Wformat-length warning for %p with null argument on powerpc64

gcc/ChangeLog:
2016-10-03  Martin Sebor  

	PR target/77837
	* config/rs6000/linux.h (TARGET_PRINTF_POINTER_FORMAT): Define.
	* config/rs6000/linux64.h (TARGET_PRINTF_POINTER_FORMAT): Likewise.

diff --git a/gcc/config/rs6000/linux.h b/gcc/config/rs6000/linux.h
index ac9296d..e70fa02 100644
--- a/gcc/config/rs6000/linux.h
+++ b/gcc/config/rs6000/linux.h
@@ -138,3 +138,7 @@
   || (TARGET_GLIBC_MAJOR == 2 && TARGET_GLIBC_MINOR >= 19)
 #define RS6000_GLIBC_ATOMIC_FENV 1
 #endif
+
+/* The format string to which "%p" corresponds.  */
+#undef TARGET_PRINTF_POINTER_FORMAT
+#define TARGET_PRINTF_POINTER_FORMAT gnu_libc_printf_pointer_format
diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index e86b5d5..fa75bce 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -634,3 +634,7 @@ extern int dot_symbols;
   || (TARGET_GLIBC_MAJOR == 2 && TARGET_GLIBC_MINOR >= 19)
 #define RS6000_GLIBC_ATOMIC_FENV 1
 #endif
+
+/* The format string to which "%p" corresponds.  */
+#undef TARGET_PRINTF_POINTER_FORMAT
+#define TARGET_PRINTF_POINTER_FORMAT gnu_libc_printf_pointer_format


Re: [PATCH] define TARGET_PRINTF_POINTER_FORMAT for powerpc-linux (77837)

2016-10-03 Thread Segher Boessenkool
On Mon, Oct 03, 2016 at 05:30:35PM -0600, Martin Sebor wrote:
> The attached patch adds definitions of TARGET_PRINTF_POINTER_FORMAT
> to the rs6000 pair of linux.h and linux64.h headers, analogous to
> the config/linux.h header.  This appears to be necessary since
> unlike most other back ends, the rs6000 back end doesn't include
> the latter linux.h.
> 
> The patch fixes bug 77837 - missing -Wformat-length warning for %p
> with null argument on powerpc64.
> 
> Thanks
> Martin

> PR target/77837 - missing -Wformat-length warning for %p with null argument 
> on powerpc64
> 
> gcc/ChangeLog:
> 2016-10-03  Martin Sebor  
> 
>   PR target/77837
>   * config/rs6000/linux.h (TARGET_PRINTF_POINTER_FORMAT): Define.
>   * config/rs6000/linux64.h (TARGET_PRINTF_POINTER_FORMAT): Likewise.

Okay for trunk, thanks!


Segher


p.s. You forgot to cc: the maintainers, and the email subject doesn't
start with "rs6000" or similar either, I found this mail by accident...


Re: [PATCH] define TARGET_PRINTF_POINTER_FORMAT for powerpc-linux (77837)

2016-10-03 Thread Martin Sebor

On 10/03/2016 07:10 PM, Segher Boessenkool wrote:

> On Mon, Oct 03, 2016 at 05:30:35PM -0600, Martin Sebor wrote:
>> The attached patch adds definitions of TARGET_PRINTF_POINTER_FORMAT
>> to the rs6000 pair of linux.h and linux64.h headers, analogous to
>> the config/linux.h header.  This appears to be necessary since
>> unlike most other back ends, the rs6000 back end doesn't include
>> the latter linux.h.
>>
>> The patch fixes bug 77837 - missing -Wformat-length warning for %p
>> with null argument on powerpc64.
>>
>> Thanks
>> Martin
>
>> PR target/77837 - missing -Wformat-length warning for %p with null argument
>> on powerpc64
>>
>> gcc/ChangeLog:
>> 2016-10-03  Martin Sebor  
>>
>>   PR target/77837
>>   * config/rs6000/linux.h (TARGET_PRINTF_POINTER_FORMAT): Define.
>>   * config/rs6000/linux64.h (TARGET_PRINTF_POINTER_FORMAT): Likewise.
>
> Okay for trunk, thanks!
>
>
> Segher
>
>
> p.s. You forgot to cc: the maintainers, and the email subject doesn't
> start with "rs6000" or similar either, I found this mail by accident...


Thanks and my bad.  I thought Bill was one of them/you.  I'll
remember to CC you and David in the future.  Better yet, with
my increasingly volatile memory, I might write a script to do
it for me.

Martin


Re: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439

2016-10-03 Thread Christophe Lyon
On 3 October 2016 at 20:36, Doug Gilmore  wrote:
>>From: Christophe Lyon [christophe.l...@linaro.org]
>>Sent: Monday, October 03, 2016 11:23 AM
>>To: Doug Gilmore
>>Cc: gcc-patches@gcc.gnu.org
>>Subject: Re: Fix PR tree-optimization/77808, ICE in 
>>duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>>
>>On 3 October 2016 at 18:07, Doug Gilmore  wrote:
>>>>From: Christophe Lyon [christophe.l...@linaro.org]
>>>>Sent: Monday, October 03, 2016 12:05 AM
>>>>To: Doug Gilmore
>>>>Cc: gcc-patches@gcc.gnu.org
>>>>Subject: Re: Fix PR tree-optimization/77808, ICE in 
>>>>duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>>>>
>>>>On 2 October 2016 at 23:05, Doug Gilmore  wrote:
>>>>> Hi Christophe,
>>>>>
>>>>>> From: Christophe Lyon [christophe.l...@linaro.org]
>>>>>> Sent: Saturday, October 01, 2016 7:57 AM
>>>>>> To: Doug Gilmore
>>>>>> Cc: gcc-patches@gcc.gnu.org
>>>>>> Subject: Re: Fix PR tree-optimization/77808, ICE in 
>>>>>> duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>>>>>>
>>>>>> Hi Doug,
>>>>>>
>>>>>> ...
>>>>>> I can confirm that your patch fixes the ICE I was seeing.
>>>>>>
>>>>>> However, the new testcase does not pass on low end
>>>>>> architectures:
>>>>>> cc1: warning: -fprefetch-loop-arrays not supported for this target
>>>>>> (try -march switches)
>>>>>>
>>>>>> Can you add a guard?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Christophe
>>>>> I updated the test to only run on X86, MIPS and AARCH64.  Is that OK?
>>>>>
>>>>
>>>>I'm afraid not.
>>>>
>>>>The ICE occurred on some arm targets. By "low end" I meant armv5t for
>>>>example, as opposed to armv7t.
>>>>Is there a suitable effective target?
>>> I'll need to investigate that.  BTW, gcc.dg/pr53550.c contains:
>>> /* PR tree-optimization/53550 */
>>> /* { dg-do compile } */
>>> /* { dg-options "-O2 -fprefetch-loop-arrays -w" } */
>>>
>>> int *
>>> foo (int *x)
>>> {
>>>   int *a = x + 10, *b = x, *c = a;
>>>   while (b != c)
>>> *--c = *b++;
>>>   return x;
>>> }
>>>
>>> Is it also failing on armv5t?  I suppose it would.
>>>
>>It doesn't, but that's probably thanks to -w
> Sounds like we don't need to add guards then; it is just a matter
> of adding -w to the command line.
>
> Does that work for you?
>

Yes, it does, I verified all the configurations I normally validate.
Adding "-w" to the testcase does the trick.

Thanks,

Christophe

> Thanks,
>
> Doug
>>
>>Christophe
>>
>>> Thanks,
>>>
>>> Doug

>>>>Thanks,
>>>>
>>>>Christophe
>>>>
>>>>> Thanks,
>>>>>
>>>>> Doug