[SPARC] Fix PR target/69072

2016-01-04 Thread Eric Botcazou
This fixes an ICE on SPARC 64-bit in a corner case where a struct containing a 
nested packed struct is passed beyond the 6th position to a function.  The 
various routines of the back-end implementing the complex calling convention 
disagree on the passing mechanism, leading to an assertion failure.

Tested (incl. binary compatibility) on SPARC/Solaris, applied on the mainline.
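
As a rough illustration of the predicate change, here is a small stand-alone
C model of the scan.  It is not the GCC code: toy_field, toy_scan and the
PROPAGATE switch are made-up names, and it assumes that the packed attribute
on the inner struct marks the inner field itself as packed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a FIELD_DECL; every name here is hypothetical.  */
struct toy_field
{
  int is_float;                    /* field would go in an FP register */
  int is_packed;                   /* models DECL_PACKED (field) */
  const struct toy_field *members; /* non-NULL for a nested record */
  size_t n_members;
};

/* Scan FIELDS and set the three predicates.  If PROPAGATE is zero, nested
   records are scanned with a null PACKED_P, as before the patch.  */
static void
toy_scan (const struct toy_field *fields, size_t n, int *intregs_p,
          int *fpregs_p, int *packed_p, int propagate)
{
  assert (intregs_p && fpregs_p);

  for (size_t i = 0; i < n; i++)
    {
      const struct toy_field *f = &fields[i];

      if (f->members)
        toy_scan (f->members, f->n_members, intregs_p, fpregs_p,
                  propagate ? packed_p : NULL, propagate);
      else if (f->is_float)
        *fpregs_p = 1;
      else
        *intregs_p = 1;

      if (packed_p && f->is_packed)
        *packed_p = 1;
    }
}

/* Return PACKED_P for the shape of S in the new test: an outer record
   whose only member is a record containing one packed double.  */
static int
toy_packed_p_for_test_struct (int propagate)
{
  static const struct toy_field inner[] = { { 1, 1, NULL, 0 } };
  static const struct toy_field outer[] = { { 0, 0, inner, 1 } };
  int intregs = 0, fpregs = 0, packed = 0;

  toy_scan (outer, 1, &intregs, &fpregs, &packed, propagate);
  return packed;
}

/* The old scan misses the nested packed field, the new one sees it.  */
void
toy_scan_checks (void)
{
  assert (toy_packed_p_for_test_struct (0) == 0);
  assert (toy_packed_p_for_test_struct (1) == 1);
}
```

In this model the outer record is reported as packed only when the flag is
passed down to nested records, which is the disagreement the patch removes.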


2016-01-04  Eric Botcazou  

PR target/69072
* config/sparc/sparc.c (scan_record_type): Take into account subfields
to compute the PACKED_P predicate.
(function_arg_record_value): Minor tweaks.


2016-01-04  Eric Botcazou  

* gcc.target/sparc/20160104-1.c: New test.

-- 
Eric Botcazou

Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 231971)
+++ config/sparc/sparc.c	(working copy)
@@ -6140,30 +6140,28 @@ sparc_strict_argument_naming (cumulative_args_t ca
   that is eligible for promotion in integer registers.
 - FP_REGS_P: the record contains at least one field or sub-field
   that is eligible for promotion in floating-point registers.
-- PACKED_P: the record contains at least one field that is packed.
+- PACKED_P: the record contains at least one field that is packed.  */
 
-   Sub-fields are not taken into account for the PACKED_P predicate.  */
-
 static void
 scan_record_type (const_tree type, int *intregs_p, int *fpregs_p,
 		  int *packed_p)
 {
-  tree field;
-
-  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+  for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
 {
   if (TREE_CODE (field) == FIELD_DECL)
 	{
-	  if (TREE_CODE (TREE_TYPE (field)) == RECORD_TYPE)
-	scan_record_type (TREE_TYPE (field), intregs_p, fpregs_p, 0);
-	  else if ((FLOAT_TYPE_P (TREE_TYPE (field))
-		   || TREE_CODE (TREE_TYPE (field)) == VECTOR_TYPE)
+	  tree field_type = TREE_TYPE (field);
+
+	  if (TREE_CODE (field_type) == RECORD_TYPE)
+	scan_record_type (field_type, intregs_p, fpregs_p, packed_p);
+	  else if ((FLOAT_TYPE_P (field_type)
+		   || TREE_CODE (field_type) == VECTOR_TYPE)
 		  && TARGET_FPU)
 	*fpregs_p = 1;
 	  else
 	*intregs_p = 1;
 
-	  if (packed_p && DECL_PACKED (field))
+	  if (DECL_PACKED (field))
 	*packed_p = 1;
 	}
 }
@@ -6647,9 +6645,10 @@ function_arg_record_value (const_tree type, machin
 
   parms.nregs += intslots;
 }
+
+  /* Allocate the vector and handle some annoying special cases.  */
   nregs = parms.nregs;
 
-  /* Allocate the vector and handle some annoying special cases.  */
   if (nregs == 0)
 {
   /* ??? Empty structure has no value?  Duh?  */
@@ -6661,17 +6660,16 @@ function_arg_record_value (const_tree type, machin
 	 load.  */
 	  return gen_rtx_REG (mode, regbase);
 	}
-  else
-	{
-	  /* ??? C++ has structures with no fields, and yet a size.  Give up
-	 for now and pass everything back in integer registers.  */
-	  nregs = (typesize + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
-	}
+
+  /* ??? C++ has structures with no fields, and yet a size.  Give up
+	 for now and pass everything back in integer registers.  */
+  nregs = (typesize + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
   if (nregs + slotno > SPARC_INT_ARG_MAX)
 	nregs = SPARC_INT_ARG_MAX - slotno;
 }
-  gcc_assert (nregs != 0);
 
+  gcc_assert (nregs > 0);
+
   parms.ret = gen_rtx_PARALLEL (mode, rtvec_alloc (parms.stack + nregs));
 
   /* If at least one field must be passed on the stack, generate
/* PR target/69072 */
/* Reported by Zdenek Sojka  */

/* { dg-do compile } */

typedef struct
{
  struct
  {
double d;
  } __attribute__((packed)) a;
} S;

void
foo (S s1, S s2, S s3, S s4, S s5, S s6, S s7)
{}


[SPARC] Fix PR target/69100

2016-01-04 Thread Eric Botcazou
This fixes another ICE on SPARC 64-bit in a corner case where __builtin_apply 
is compiled with -mno-fpu/-msoft-float.

Tested (incl. binary compatibility) on SPARC/Solaris, applied on the mainline.
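
To make the macro change easier to follow, here is a minimal sketch of the old
and new FUNCTION_ARG_REGNO_P logic as plain C functions.  The function names
and the two boolean parameters (standing in for TARGET_ARCH64 and TARGET_FPU)
are assumptions made for the example; the register ranges are taken from the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Before the patch: hard registers 32..63 (%f0-%f31) count as argument
   registers whenever the target is 64-bit.  */
static bool
old_arg_regno_p (bool arch64, int n)
{
  return arch64
         ? ((n >= 8 && n <= 13) || (n >= 32 && n <= 63))
         : (n >= 8 && n <= 13);
}

/* After the patch: the FP registers additionally require an FPU.  */
static bool
new_arg_regno_p (bool arch64, bool fpu, int n)
{
  assert (n >= 0);
  return (n >= 8 && n <= 13)
         || (arch64 && fpu && n >= 32 && n <= 63);
}

/* A few spot checks of the difference between the two versions.  */
void
toy_regno_checks (void)
{
  /* 64-bit without FPU: register 32 was accepted, now it is not.  */
  assert (old_arg_regno_p (true, 32));
  assert (!new_arg_regno_p (true, false, 32));

  /* 64-bit with FPU and 32-bit behave as before.  */
  assert (new_arg_regno_p (true, true, 32));
  assert (!new_arg_regno_p (false, true, 32));
  assert (new_arg_regno_p (false, false, 8));
}
```

The only inputs whose answer changes are the FP registers on a 64-bit target
without FPU, which is the -mno-fpu case exercised by the new test.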


2016-01-04  Eric Botcazou  

PR target/69100
* config/sparc/sparc.h (FUNCTION_ARG_REGNO_P): Return true in 64-bit
mode for %f0-%f31 only if TARGET_FPU.


2016-01-04  Eric Botcazou  

* gcc.target/sparc/20160104-2.c: New test.

-- 
Eric Botcazou

/* PR target/69100 */
/* Reported by Zdenek Sojka  */

/* { dg-do compile } */
/* { dg-options "-mno-fpu" } */

void
foo (void)
{
  __builtin_apply (0, 0, 0);
}
Index: config/sparc/sparc.h
===
--- config/sparc/sparc.h	(revision 231971)
+++ config/sparc/sparc.h	(working copy)
@@ -1176,9 +1176,8 @@ extern char leaf_reg_remap[];
On SPARC, these are the "output" registers.  v9 also uses %f0-%f31.  */
 
 #define FUNCTION_ARG_REGNO_P(N) \
-(TARGET_ARCH64 \
- ? (((N) >= 8 && (N) <= 13) || ((N) >= 32 && (N) <= 63)) \
- : ((N) >= 8 && (N) <= 13))
+  (((N) >= 8 && (N) <= 13)	\
+   || (TARGET_ARCH64 && TARGET_FPU && (N) >= 32 && (N) <= 63))
 
 /* Define a data type for recording info about an argument list
during the scan of that argument list.  This data type should


Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint

2016-01-04 Thread Jakub Jelinek
On Sun, Jan 03, 2016 at 07:11:58PM -0800, H.J. Lu wrote:
> --- a/gcc/config/i386/predicates.md
> +++ b/gcc/config/i386/predicates.md
> @@ -951,6 +951,13 @@
> (match_test "INTEGRAL_MODE_P (GET_MODE (op))")
> (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
>  
> +; Return true when OP is operand acceptable for vector memory operand.
> +; Only AVX can have misaligned memory operand.
> +(define_predicate "vector_memory_operand"
> +  (and (match_operand 0 "memory_operand")
> +   (ior (match_test "TARGET_AVX")
> + (match_test "MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (mode)"

Shouldn't this take into account the ssememalign attribute too?
I mean, various instructions have some ssememalign > 8, which means they
can't accept any alignment, but happily accept say >= 32-bit alignment
or >= 64-bit alignment.  Though, ssememalign is an instruction attribute
and the predicates/constraints don't have access to the current instruction.
So maybe we need more constraints and more predicates, the ones you've added
for ssememalign == 0 instructions, don't change anything in instructions
with ssememalign == 8 (you've clearly changed some of them, and patch 3
shows you've tried to partially undo it afterwards, but only the constraint,
not the predicate, and only in one instruction), and use different
predicates/constraints for ssememalign == {16,32,64} instructions.
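
A small C model may make the concern concrete.  It is only a sketch: the
function names are invented, alignments are in bits, and it assumes (from the
discussion above) that ssememalign == 0 means "full mode alignment required"
while a non-zero value is the minimum alignment the instruction accepts.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the proposed predicate: AVX accepts anything, otherwise the
   memory must be aligned to the mode.  */
static bool
toy_vector_memory_operand (bool target_avx, unsigned mem_align,
                           unsigned mode_align)
{
  return target_avx || mem_align >= mode_align;
}

/* Hypothetical variant that knows the instruction's ssememalign value.  */
static bool
toy_memory_operand_for_ssememalign (bool target_avx, unsigned mem_align,
                                    unsigned mode_align,
                                    unsigned ssememalign)
{
  unsigned required = ssememalign ? ssememalign : mode_align;

  return target_avx || mem_align >= required;
}

/* A 64-bit aligned operand in a 128-bit mode, without AVX.  */
void
toy_predicate_checks (void)
{
  /* The single predicate rejects it.  */
  assert (!toy_vector_memory_operand (false, 64, 128));

  /* An instruction with ssememalign == 64 could have accepted it.  */
  assert (toy_memory_operand_for_ssememalign (false, 64, 128, 64));

  /* With ssememalign == 0 both models agree.  */
  assert (!toy_memory_operand_for_ssememalign (false, 64, 128, 0));
}
```

The first check is the case in question: one predicate tied to the mode
alignment is stricter than an instruction with ssememalign == 64 would need.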

Jakub


[ping] Enable -mstackrealign with SSE on 32-bit Windows

2016-01-04 Thread Eric Botcazou
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01458.html

Thanks in advance.

-- 
Eric Botcazou


Re: [ping] Enable -mstackrealign with SSE on 32-bit Windows

2016-01-04 Thread Uros Bizjak
On Mon, Jan 4, 2016 at 9:26 AM, Eric Botcazou  wrote:
> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01458.html
>
> Thanks in advance.

This is really a Windows-specific setting, so a Windows maintainer should
OK the patch.

Uros.


Re: [ping] Enable -mstackrealign with SSE on 32-bit Windows

2016-01-04 Thread Eric Botcazou
> This is really Windows specific setting, so Windows maintainer should
> OK the patch.

Makes sense, both maintainers now CCed.

https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01458.html

-- 
Eric Botcazou


Re: [PATCH] PR/69089: C++-11: Ignore "alignas(0)".

2016-01-04 Thread Dominik Vogt
On Fri, Jan 01, 2016 at 05:53:08PM -0700, Martin Sebor wrote:
> On 12/31/2015 04:50 AM, Dominik Vogt wrote:
> >The attached patch fixes C++-11 handling of "alignas(0)" which
> >should be ignored but currently generates an error message.  A
> >test case is included; the patch has been tested on S390x.  Since
> >it's a language issue it should be independent of the backend
> >used.
> 
> The patch doesn't handle value-dependent expressions(*).

> It
> seems that the problem is in handle_aligned_attribute() calling
> check_user_alignment() with the second argument (ALLOW_ZERO)
> set to false.  Calling it with true fixes the problem and handles
> value-dependent expressions (I haven't done any more testing beyond
> that).

Like the attached patch?  (Passes the testsuite on s390x.)

But wouldn't an "aligned" attribute be added, allowing the backend
to possibly generate an error or a warning?
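
For readers who do not have the function in front of them, here is a
simplified model of an alignment check with an ALLOW_ZERO flag.  It is not
GCC's check_user_alignment: the name, the long argument and the two negative
return codes are assumptions chosen for the sketch, and how GCC reports an
accepted zero to its caller is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes; anything >= 0 is log2 of the alignment.  */
enum { TOY_ALIGN_ERROR = -2, TOY_ALIGN_IGNORED = -1 };

/* Validate a user-supplied alignment.  Zero is either silently ignored
   or rejected, depending on ALLOW_ZERO.  */
static int
toy_check_user_alignment (long align, bool allow_zero)
{
  if (align == 0)
    return allow_zero ? TOY_ALIGN_IGNORED : TOY_ALIGN_ERROR;

  /* Negative values and non-powers of 2 are always errors.  */
  if (align < 0 || (align & (align - 1)) != 0)
    return TOY_ALIGN_ERROR;

  int shift = 0;
  while ((align >>= 1) != 0)
    shift++;
  return shift;
}

/* The cases from the new test, expressed against the model.  */
void
toy_alignment_model_checks (void)
{
  assert (toy_check_user_alignment (0, false) == TOY_ALIGN_ERROR);
  assert (toy_check_user_alignment (0, true) == TOY_ALIGN_IGNORED);
  assert (toy_check_user_alignment (-1, true) == TOY_ALIGN_ERROR);
  assert (toy_check_user_alignment (16, true) == 4);
}
```

In this model an ignored zero is distinguishable from an error, so a caller
could drop the attribute without emitting a diagnostic.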

> Also, in the test, I noticed the definition of the first struct
> is missing the terminating semicolon.

Yeah.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/c-family/ChangeLog

PR/69089
* c-common.c (handle_aligned_attribute): Allow 0 as an argument to the
"aligned" attribute.

gcc/testsuite/ChangeLog

PR/69089
* g++.dg/cpp0x/alignas5.C: New test.
From 2461293b9070da74950fd0ae055d1239cc69ce67 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 30 Dec 2015 15:08:52 +0100
Subject: [PATCH] C++-11: Ignore "alignas(0)" instead of generating an
 error message.

This is required by the C++-11 standard.
---
 gcc/c-family/c-common.c   |  2 +-
 gcc/testsuite/g++.dg/cpp0x/alignas5.C | 29 +
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas5.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 653d1dc..9eb25a9 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -7804,7 +7804,7 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
   else if (TYPE_P (*node))
 type = node, is_type = 1;
 
-  if ((i = check_user_alignment (align_expr, false)) == -1
+  if ((i = check_user_alignment (align_expr, true)) == -1
   || !check_cxx_fundamental_alignment_constraints (*node, i, flags))
 *no_add_attrs = true;
   else if (is_type)
diff --git a/gcc/testsuite/g++.dg/cpp0x/alignas5.C b/gcc/testsuite/g++.dg/cpp0x/alignas5.C
new file mode 100644
index 000..f3252a9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alignas5.C
@@ -0,0 +1,29 @@
+// PR c++/69089
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wno-attributes" }
+
+alignas (0) int valid1;
+alignas (1 - 1) int valid2;
+struct Tvalid
+{
+  alignas (0) int i;
+  alignas (2 * 0) int j;
+};
+
+alignas (-1) int invalid1; /* { dg-error "not a positive power of 2" } */
+alignas (1 - 2) int invalid2; /* { dg-error "not a positive power of 2" } */
+struct Tinvalid
+{
+  alignas (-1) int i; /* { dg-error "not a positive power of 2" } */
+  alignas (2 * 0 - 1) int j; /* { dg-error "not a positive power of 2" } */
+};
+
+template <int N> struct TNvalid1 { alignas (N) int i; };
+TNvalid1<0> SNvalid1;
+template <int N> struct TNvalid2 { alignas (N) int i; };
+TNvalid2<1 - 1> SNvalid2;
+
+template <int N> struct TNinvalid1 { alignas (N) int i; }; /* { dg-error "not a positive power of 2" } */
+TNinvalid1<-1> SNinvalid1;
+template <int N> struct TNinvalid2 { alignas (N) int i; }; /* { dg-error "not a positive power of 2" } */
+TNinvalid2<1 - 2> SNinvalid2;
-- 
2.3.0



Re: [PATCH 4/4] Un-XFAIL ssa-dom-cse-2.c for most platforms

2016-01-04 Thread Alan Lawrence

On 24/12/15 19:59, Mike Stump wrote:

On Dec 22, 2015, at 8:00 AM, Alan Lawrence  wrote:

On 21/12/15 15:33, Bill Schmidt wrote:


Not on a stage1 compiler - check_p8vector_hw_available itself requires being
able to run executables - I'll check on gcc112. However, both look like they're
really about the host (ability to execute an asm instruction), not the target
(/ability for gcc to output such an instruction)


Hm, that looks like a pervasive problem for powerpc.  There are a number
of things that are supposed to be testing effective target but rely on
check_p8vector_hw_available, which as you note requires executing an
instruction and is really about the host.  We need to clean that up; I
should probably open a bug.  Kind of amazed this has gotten past us for
a couple of years.


Well, I was about to apologize for making a bogus remark. A really "proper" 
setup, would be to tell dejagnu to run your execution tests in some kind of 
emulator/simulator (on your host, perhaps one kind of powerpc) that only/additionally 
runs instructions for the other, _target_, kind of powerpc...and whatever setup you'd 
need for all that probably does not live in the GCC repository!


I’m not following.  dejagnu can already run tests on the target to make 
decisions on which tests to run and what to expect from them, if it wants.  
Some ports already do this.  Further, this is pretty typical and standard and 
easy to do.

You confuse the issue by mentioning host, but this I think is wrong.  These 
decisions have nothing to do with the host.  They are properties of the target 
execution environment.

I’d be happy to help if you’d like.  I’d just need the details of what you’d 
like help with.


You're right, which is why I described my first (wrong) remark as bogus. That 
is, check_p8vector_hw_available is executing an assembly instruction, and on a 
well-configured test setup, that would potentially invoke an emulator etc. - 
whereas I am just doing 'native' testing on gcc110/gcc112 on the compile farm.


So (as Mike says) there is no bug here, but one just needs to be aware that 
passing -mcpu=power7 (say) is not sufficient to make check_p8vector_hw_available 
return false when executing on a power8 host; you would also need to set up some 
kind of power7 emulator/simulator.


Hope that's clear!

Thanks,
Alan


Re: [PATCH 1/4] Make SRA scalarize constant-pool loads

2016-01-04 Thread Alan Lawrence

On 24/12/15 11:53, Alan Lawrence wrote:

Here's a new version that fixes the gcc.dg/guality/pr54970.c failures seen on
aarch64 and powerpc64.

[snip]

This also fixes a bunch of other guality tests on AArch64 that were failing
prior to the patch series, and another bunch on PowerPC64 (bigendian -m32), 
listed below.


Ach, sorry, not quite. That version avoids any regressions (e.g. in pr54970.c), 
but does not fix all those other tests, unless you also have this hunk 
(https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01483.html):


diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index a3ff2df..2a741b8 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2651,7 +2651,8 @@ analyze_all_variable_accesses (void)
&& scalarizable_type_p (TREE_TYPE (var)))
  {
if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var)))
-   <= max_scalarization_size)
+ <= max_scalarization_size
+   || DECL_IN_CONSTANT_POOL (var))
  {
create_total_scalarization_access (var);
completely_scalarize (var, TREE_TYPE (var), 0, var);


...which I was using to increase test coverage of the SRA changes. 
(Alternatively, you can "fix" the tests by running the testsuite with a forced 
--param sra-max-scalarization-size. But this is only saying that the dwarf info 
now generated by scalarizing constant-pools, is better than whatever dwarf was 
being generated by whatever other part of the compiler before.)
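
The effect of that hunk on the scalarization decision can be sketched as a
plain C predicate.  This is a toy model only: the function name and its
parameters are invented, and sizes are treated as opaque unsigned values.

```c
#include <assert.h>
#include <stdbool.h>

/* Without the hunk: only variables within the size limit qualify.  */
static bool
toy_scalarize_p (unsigned long size, unsigned long max_size)
{
  return size <= max_size;
}

/* With the hunk: constant-pool entries qualify regardless of size.  */
static bool
toy_scalarize_with_hunk_p (unsigned long size, unsigned long max_size,
                           bool in_constant_pool)
{
  return size <= max_size || in_constant_pool;
}

/* An oversized constant-pool entry is the only case that changes.  */
void
toy_sra_checks (void)
{
  assert (!toy_scalarize_p (1024, 256));
  assert (toy_scalarize_with_hunk_p (1024, 256, true));
  assert (!toy_scalarize_with_hunk_p (1024, 256, false));
  assert (toy_scalarize_with_hunk_p (128, 256, false));
}
```

Raising the size limit through the --param has the same effect for the tests
in question, since the first operand of the "or" then succeeds by itself.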


--Alan


Re: [PATCH][Testsuite]Cleanup logs from gdb tests by adding newlines

2016-01-04 Thread Alan Lawrence

Ping.

--Alan

On 10/12/15 10:31, Alan Lawrence wrote:

Runs of the guality testsuite can sometimes end up with gcc.log containing
malformed lines like:

A debugging session is active.PASS: gcc.dg/guality/pr36728-1.c   -O2  line 18 
arg4 == 4
A debugging session is active.PASS: gcc.dg/guality/restrict.c   -O2  line 30 
type:ip == int *
Inferior 1 [process 27054] will be killed.PASS: 
gcc.dg/guality/restrict.c   -O2  line 30 type:cicrp == const int * const 
restrict
Inferior 1 [process 27160] will be killed.PASS: 
gcc.dg/guality/restrict.c   -O2  line 30 type:cvirp == int * const volatile 
restrict

This patch just makes sure the PASS/FAIL comes at the beginning of a line.  (At
the slight cost of adding some extra newlines not in the actual test output.)

I moved the remote_close target calls earlier, to avoid any possible race
condition of extra output being generated after the newline - this may not be
strictly necessary.

Tested on aarch64-none-linux-gnu and x86_64-none-linux-gnu.

I think this is reasonable for stage 3 - OK for trunk?

gcc/testsuite/ChangeLog:
* lib/gcc-gdb-test.exp (gdb-test): call remote_close earlier, and send
newline to log, before calling pass/fail/unsupported.
* lib/gcc-simulate-thread.exp (simulate-thread): Likewise.
---
  gcc/testsuite/lib/gcc-gdb-test.exp| 15 ++-
  gcc/testsuite/lib/gcc-simulate-thread.exp | 10 +++---
  2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/lib/gcc-gdb-test.exp 
b/gcc/testsuite/lib/gcc-gdb-test.exp
index d3ba6e4..f60cabf 100644
--- a/gcc/testsuite/lib/gcc-gdb-test.exp
+++ b/gcc/testsuite/lib/gcc-gdb-test.exp
@@ -84,8 +84,9 @@ proc gdb-test { args } {
  remote_expect target [timeout_value] {
# Too old GDB
-re "Unhandled dwarf expression|Error in sourced command file|



[committed] Update copyright years, part 1

2016-01-04 Thread Jakub Jelinek
Hi!

I've committed following patch to update the user visible copyright years
(and rolled new year of gcc/fortran and libjava ChangeLogs, plus added
Copyright boilerplate at the end of libitm, libgomp and libquadmath
ChangeLog files).

2016-01-04  Jakub Jelinek  

gcc/
* gcc.c (process_command): Update copyright notice dates.
* gcov-dump.c (print_version): Ditto.
* gcov.c (print_version): Ditto.
* gcov-tool.c (print_version): Ditto.
* gengtype.c (create_file): Ditto.
* doc/cpp.texi: Bump @copying's copyright year.
* doc/cppinternals.texi: Ditto.
* doc/gcc.texi: Ditto.
* doc/gccint.texi: Ditto.
* doc/gcov.texi: Ditto.
* doc/install.texi: Ditto.
* doc/invoke.texi: Ditto.
gcc/ada/
* gnat_ugn.texi: Bump @copying's copyright year.
* gnat_rm.texi: Likewise.
gcc/fortran/
* gfortranspec.c (lang_specific_driver): Update copyright notice
dates.
* gfc-internals.texi: Bump @copying's copyright year.
* gfortran.texi: Ditto.
* intrinsic.texi: Ditto.
* invoke.texi: Ditto.
gcc/go/
* gccgo.texi: Bump @copyrights-go year.
gcc/java/
* jcf-dump.c (version): Update copyright notice dates.
libgomp/
* libgomp.texi: Bump @copying's copyright year.
libitm/
* libitm.texi: Bump @copying's copyright year.
libjava/
* classpath/gnu/java/rmi/registry/RegistryImpl.java (version): Update
copyright notice dates.
* classpath/tools/gnu/classpath/tools/orbd/Main.java (run): Ditto.
* gnu/gcj/convert/Convert.java (version): Update copyright notice
dates.
* gnu/gcj/tools/gcj_dbtool/Main.java (main): Ditto.
libquadmath/
* libquadmath.texi: Bump @copying's copyright year.

--- gcc/ada/gnat_rm.texi(revision 232052)
+++ gcc/ada/gnat_rm.texi(working copy)
@@ -25,7 +25,7 @@ GNAT Reference Manual , November 18, 201
 
 AdaCore
 
-Copyright @copyright{} 2008-2015, Free Software Foundation
+Copyright @copyright{} 2008-2016, Free Software Foundation
 @end quotation
 
 @end copying
--- gcc/ada/gnat_ugn.texi   (revision 232052)
+++ gcc/ada/gnat_ugn.texi   (working copy)
@@ -25,7 +25,7 @@ GNAT User's Guide for Native Platforms ,
 
 AdaCore
 
-Copyright @copyright{} 2008-2015, Free Software Foundation
+Copyright @copyright{} 2008-2016, Free Software Foundation
 @end quotation
 
 @end copying
--- gcc/doc/cpp.texi(revision 232052)
+++ gcc/doc/cpp.texi(working copy)
@@ -10,7 +10,7 @@
 
 @copying
 @c man begin COPYRIGHT
-Copyright @copyright{} 1987-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 1987-2016 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/cppinternals.texi   (revision 232052)
+++ gcc/doc/cppinternals.texi   (working copy)
@@ -18,7 +18,7 @@
 @ifinfo
 This file documents the internals of the GNU C Preprocessor.
 
-Copyright (C) 2000-2015 Free Software Foundation, Inc.
+Copyright (C) 2000-2016 Free Software Foundation, Inc.
 
 Permission is granted to make and distribute verbatim copies of
 this manual provided the copyright notice and this permission notice
@@ -47,7 +47,7 @@ into another language, under the above c
 @page
 @vskip 0pt plus 1filll
 @c man begin COPYRIGHT
-Copyright @copyright{} 2000-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 2000-2016 Free Software Foundation, Inc.
 
 Permission is granted to make and distribute verbatim copies of
 this manual provided the copyright notice and this permission notice
--- gcc/doc/gcc.texi(revision 232052)
+++ gcc/doc/gcc.texi(working copy)
@@ -40,7 +40,7 @@
 @c %**end of header
 
 @copying
-Copyright @copyright{} 1988-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 1988-2016 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/gccint.texi (revision 232052)
+++ gcc/doc/gccint.texi (working copy)
@@ -26,7 +26,7 @@
 @c %**end of header
 
 @copying
-Copyright @copyright{} 1988-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 1988-2016 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/gcov.texi   (revision 232052)
+++ gcc/doc/gcov.texi   (working copy)
@@ -4,7 +4,7 @@
 
 @ignore
 @c man begin COPYRIGHT
-Copyright @copyright{} 1996-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 1996-2016 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/install.texi(revision 232052)
+++ gcc/doc/install.texi(working copy)
@@ -44,7

Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint

2016-01-04 Thread H.J. Lu
On Mon, Jan 4, 2016 at 12:21 AM, Jakub Jelinek  wrote:
> On Sun, Jan 03, 2016 at 07:11:58PM -0800, H.J. Lu wrote:
>> --- a/gcc/config/i386/predicates.md
>> +++ b/gcc/config/i386/predicates.md
>> @@ -951,6 +951,13 @@
>> (match_test "INTEGRAL_MODE_P (GET_MODE (op))")
>> (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
>>
>> +; Return true when OP is operand acceptable for vector memory operand.
>> +; Only AVX can have misaligned memory operand.
>> +(define_predicate "vector_memory_operand"
>> +  (and (match_operand 0 "memory_operand")
>> +   (ior (match_test "TARGET_AVX")
>> + (match_test "MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (mode)"
>
> Shouldn't this take into account the ssememalign attribute too?
> I mean, various instructions have some ssememalign > 8, which means they
> can't accept any alignment, but happily accept say >= 32-bit alignment
> or >= 64-bit alignment.  Though, ssememalign is an instruction attribute
> and the predicates/constraints don't have access to the current instruction.
> So maybe we need more constraints and more predicates, the ones you've added
> for ssememalign == 0 instructions, don't change anything in instructions
> with ssememalign == 8 (you've clearly changed some of them, and patch 3
> shows you've tried to partially undo it afterwards, but only the constraint,
> not the predicate, and only in one instruction), and use different
> predicates/constraints for ssememalign == {16,32,64} instructions.
>
> Jakub

From the INSTRUCTION EXCEPTION SPECIFICATION section in the Intel SDM,
volume 2, only legacy SSE instructions with a memory operand that is not
16-byte aligned get a General Protection fault.  There is no need to check
1, 2, 4, 8 byte alignments. Since x86 backend has accurate constraints
and predicates for 16-byte alignment after my patches, there is no need for
ix86_legitimate_combined_insn nor ssememalign.  My followup patch will
remove them.  I have tested it without regressions.  I will submit it after my
patches have been checked in.
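
For illustration, the 16-byte rule can be written down as a small C check.
This is a deliberately simplified model with invented names: it treats every
legacy SSE memory operand as requiring 16-byte alignment and every VEX-encoded
one as not requiring it, which ignores instructions with their own rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if P is aligned on a 16-byte boundary.  */
static bool
toy_is_aligned_16 (const void *p)
{
  return ((uintptr_t) p & 15u) == 0;
}

/* Simplified fault model: only a legacy (non-VEX) encoding with a
   misaligned operand is a problem.  */
static bool
toy_may_fault (bool vex_encoded, const void *p)
{
  return !vex_encoded && !toy_is_aligned_16 (p);
}

/* Exercise the model on an aligned buffer and an offset into it.  */
void
toy_alignment_checks (void)
{
  _Alignas (16) unsigned char buf[32];

  assert (toy_is_aligned_16 (buf));
  assert (!toy_is_aligned_16 (buf + 4));
  assert (!toy_may_fault (false, buf));
  assert (toy_may_fault (false, buf + 4));
  assert (!toy_may_fault (true, buf + 4));
}
```

Note that 4-byte or 8-byte alignment makes no difference in this model; only
the 16-byte boundary is tested.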


-- 
H.J.


Re: [testsuite][ARM target attributes] Fix effective_target tests

2016-01-04 Thread Christophe Lyon
On 18 December 2015 at 15:16, Kyrill Tkachov
 wrote:
> Hi Christophe,
>
>
> On 17/12/15 22:17, Christophe Lyon wrote:
>>
>> Hi,
>>
>> Here is an updated version of this patch.
>> I did test it with
>> -mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard in
>> addition to my usual set of options.
>>
>> Compared to the previous version:
>> - I added some doc in sourcebuild.texi
>> - I no longer modify arm_vfp_ok...
>> - I replaced all uses of arm_vfp with the new arm_fp because I found
>> that the existing tests do not actually need to pass -mfpu=vfp: this
>> is implicitly set as the default when using -mfloat-abi={softfp|hard}
>> - I chose not to remove arm_vfp_ok because we may need it in the
>> future, if a test really needs vfp (as opposed to neon for instance)
>> - in gcc.target/arm/attr-crypto.c I force the initial fpu to be vfp
>> via pragma instead, so that the next pragma fpu
>> fpu=crypto-neon-fp-armv8 is always compatible, regardless of the
>> command-line options/default fpu
>> - same for attr-neon2.c and attr-neon3.c
>> - I updated cmp-2.c, unsigned-float.c, vfp-1.c, vfp-ldmdbd.c,
>> vfp-ldmdbs.c, vfp-ldmiad.c, vfp-ldmias.c, vfp-stmdbd.c, vfp-stmdbs.c,
>> vfp-stmiad.c, vfp-stmias.c, vnmul-[1234].c to use the new arm_fp
>> effective target instead of arm_vfp. This is so that they don't need
>> to use -mfpu=vfp and can use the new dg-add-options arm_fp
>>
>> The validation results show (in addition to what I originally reported):
>> - attr-crypto.c and attr-neon3.c now ICE in some cases. This is PR68895.
>> - depending on the GCC configuration (e.g. --with-fpu=neon)
>> attr-neon3.c may fail. This is PR68896.
>>
>> OK?
>
>
> Thanks for following up on this.
> I think you also need to document the new arm_crypto_pragma_ok.
>
Indeed, I forgot it.

Here is a new version of the patch with a few words added to document
this function.
I did not modify the testcase after Christian's comments and
PR68934: my understanding is that the testcases are valid after
all and Christian is working on fixing the ICE.

2016-01-04  Christophe Lyon  

* doc/sourcebuild.texi (arm_crypto_pragma_ok): Document new entry.
(arm_fp_ok): Likewise.
(arm_fp): Likewise.
(arm_crypto): Likewise.
* lib/target-supports.exp
(check_effective_target_arm_fp_ok_nocache): New.
(check_effective_target_arm_fp_ok): New.
(add_options_for_arm_fp): New.
(check_effective_target_arm_crypto_ok_nocache): Require
target_arm_v8_neon_ok instead of arm32.
(check_effective_target_arm_crypto_pragma_ok_nocache): New.
(check_effective_target_arm_crypto_pragma_ok): New.
(add_options_for_arm_vfp): New.
* gcc.target/arm/attr-crypto.c: Use arm_crypto_pragma_ok effective
target. Do not force -mfloat-abi=softfp, use arm_fp_ok effective
target instead. Force initial fpu to vfp.
* gcc.target/arm/attr-neon-builtin-fail.c: Do not force
-mfloat-abi=softfp, use arm_fp_ok effective target instead.
* gcc.target/arm/attr-neon-fp16.c: Likewise. Remove arm_neon_ok
dependency.
* gcc.target/arm/attr-neon2.c: Do not force -mfloat-abi=softfp,
use arm_vfp effective target instead. Force initial fpu to vfp.
* gcc.target/arm/attr-neon3.c: Likewise.
* gcc.target/arm/cmp-2.c: Use arm_fp_ok effective target instead of
arm_vfp_ok.
* gcc.target/arm/unsigned-float.c: Likewise.
* gcc.target/arm/vfp-1.c: Likewise.
* gcc.target/arm/vfp-ldmdbd.c: Likewise.
* gcc.target/arm/vfp-ldmdbs.c: Likewise.
* gcc.target/arm/vfp-ldmiad.c: Likewise.
* gcc.target/arm/vfp-ldmias.c: Likewise.
* gcc.target/arm/vfp-stmdbd.c: Likewise.
* gcc.target/arm/vfp-stmdbs.c: Likewise.
* gcc.target/arm/vfp-stmiad.c: Likewise.
* gcc.target/arm/vfp-stmias.c: Likewise.
* gcc.target/arm/vnmul-1.c: Likewise.
* gcc.target/arm/vnmul-2.c: Likewise.
* gcc.target/arm/vnmul-3.c: Likewise.
* gcc.target/arm/vnmul-4.c: Likewise.

OK?

Christophe.


> Kyrill
>
>
>> Christophe
>>
>> 2015-12-17  Christophe Lyon  
>>
>>  * doc/sourcebuild.texi (arm_fp_ok): Document new entry.
>>  (arm_fp): Likewise.
>>  * lib/target-supports.exp
>>  (check_effective_target_arm_fp_ok_nocache): New.
>>  (check_effective_target_arm_fp_ok): New.
>>  (add_options_for_arm_fp): New.
>>  (check_effective_target_arm_crypto_ok_nocache): Require
>>  target_arm_v8_neon_ok instead of arm32.
>>  (check_effective_target_arm_crypto_pragma_ok_nocache): New.
>>  (check_effective_target_arm_crypto_pragma_ok): New.
>>  (add_options_for_arm_vfp): New.
>>  * gcc.target/arm/attr-crypto.c: Use arm_crypto_pragma_ok effective
>>  target. Do not force -mfloat-abi=softfp, use arm_fp_ok effective
>>  target instead. Force initial fpu to vfp.
>>  * gcc.target/arm/attr-neon-builtin-fail.c: Do not force
>>  -mfloat-abi=softfp, use arm_fp_ok effective target instead.
>>  * gcc.target/arm/attr-neon-fp16.c: Likewise. Remove arm_neon_ok
>>  dependency.

Re: [testsuite][ARM target attributes] Fix effective_target tests

2016-01-04 Thread Christophe Lyon
On 4 January 2016 at 15:20, Christophe Lyon  wrote:
> On 18 December 2015 at 15:16, Kyrill Tkachov
>  wrote:
>> Hi Christophe,
>>
>>
>> On 17/12/15 22:17, Christophe Lyon wrote:
>>>
>>> Hi,
>>>
>>> Here is an updated version of this patch.
>>> I did test it with
>>> -mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard in
>>> addition to my usual set of options.
>>>
>>> Compared to the previous version:
>>> - I added some doc in sourcebuild.texi
>>> - I no longer modify arm_vfp_ok...
>>> - I replaced all uses of arm_vfp with the new arm_fp because I found
>>> that the existing tests do not actually need to pass -mfpu=vfp: this
>>> is implicitly set as the default when using -mfloat-abi={softfp|hard}
>>> - I chose not to remove arm_vfp_ok because we may need it in the
>>> future, if a test really needs vfp (as opposed to neon for instance)
>>> - in gcc.target/arm/attr-crypto.c I force the initial fpu to be vfp
>>> via pragma instead, so that the next pragma fpu
>>> fpu=crypto-neon-fp-armv8 is always compatible, regardless of the
>>> command-line options/default fpu
>>> - same for attr-neon2.c and attr-neon3.c
>>> - I updated cmp-2.c, unsigned-float.c, vfp-1.c, vfp-ldmdbd.c,
>>> vfp-ldmdbs.c, vfp-ldmiad.c, vfp-ldmias.c, vfp-stmdbd.c, vfp-stmdbs.c,
>>> vfp-stmiad.c, vfp-stmias.c, vnmul-[1234].c to use the new arm_fp
>>> effective target instead of arm_vfp. This is so that they don't need
>>> to use -mfpu=vfp and can use the new dg-add-options arm_fp
>>>
>>> The validation results show (in addition to what I originally reported):
>>> - attr-crypto.c and attr-neon3.c now ICE in some cases. This is PR68895.
>>> - depending on the GCC configuration (e.g. --with-fpu=neon)
>>> attr-neon3.c may fail. This is PR68896.
>>>
>>> OK?
>>
>>
>> Thanks for following up on this.
>> I think you also need to document the new arm_crypto_pragma_ok.
>>
> Indeed, I forgot it.
>
> Here is a new version of the patch with a few words added to document
> this function.
> I did not modify the testcase after Christian's comments and
> PR68934: my understanding is that the testcases are valid after
> all and Christian is working on fixing the ICE.
>
With the attachment, this time...


> 2016-01-04  Christophe Lyon  
>
> * doc/sourcebuild.texi (arm_crypto_pragma_ok): Document new entry.
> (arm_fp_ok): Likewise.
> (arm_fp): Likewise.
> (arm_crypto): Likewise.
> * lib/target-supports.exp
> (check_effective_target_arm_fp_ok_nocache): New.
> (check_effective_target_arm_fp_ok): New.
> (add_options_for_arm_fp): New.
> (check_effective_target_arm_crypto_ok_nocache): Require
> target_arm_v8_neon_ok instead of arm32.
> (check_effective_target_arm_crypto_pragma_ok_nocache): New.
> (check_effective_target_arm_crypto_pragma_ok): New.
> (add_options_for_arm_vfp): New.
> * gcc.target/arm/attr-crypto.c: Use arm_crypto_pragma_ok effective
> target. Do not force -mfloat-abi=softfp, use arm_fp_ok effective
> target instead. Force initial fpu to vfp.
> * gcc.target/arm/attr-neon-builtin-fail.c: Do not force
> -mfloat-abi=softfp, use arm_fp_ok effective target instead.
> * gcc.target/arm/attr-neon-fp16.c: Likewise. Remove arm_neon_ok
> dependency.
> * gcc.target/arm/attr-neon2.c: Do not force -mfloat-abi=softfp,
> use arm_vfp effective target instead. Force initial fpu to vfp.
> * gcc.target/arm/attr-neon3.c: Likewise.
> * gcc.target/arm/cmp-2.c: Use arm_fp_ok effective target instead of
> arm_vfp_ok.
> * gcc.target/arm/unsigned-float.c: Likewise.
> * gcc.target/arm/vfp-1.c: Likewise.
> * gcc.target/arm/vfp-ldmdbd.c: Likewise.
> * gcc.target/arm/vfp-ldmdbs.c: Likewise.
> * gcc.target/arm/vfp-ldmiad.c: Likewise.
> * gcc.target/arm/vfp-ldmias.c: Likewise.
> * gcc.target/arm/vfp-stmdbd.c: Likewise.
> * gcc.target/arm/vfp-stmdbs.c: Likewise.
> * gcc.target/arm/vfp-stmiad.c: Likewise.
> * gcc.target/arm/vfp-stmias.c: Likewise.
> * gcc.target/arm/vnmul-1.c: Likewise.
> * gcc.target/arm/vnmul-2.c: Likewise.
> * gcc.target/arm/vnmul-3.c: Likewise.
> * gcc.target/arm/vnmul-4.c: Likewise.
>
> OK?
>
> Christophe.
>
>
>> Kyrill
>>
>>
>>> Christophe
>>>
>>> 2015-12-17  Christophe Lyon  
>>>
>>>  * doc/sourcebuild.texi (arm_fp_ok): Document new entry.
>>>  (arm_fp): Likewise.
>>>  * lib/target-supports.exp
>>>  (check_effective_target_arm_fp_ok_nocache): New.
>>>  (check_effective_target_arm_fp_ok): New.
>>>  (add_options_for_arm_fp): New.
>>>  (check_effective_target_arm_crypto_ok_nocache): Require
>>>  target_arm_v8_neon_ok instead of arm32.
>>>  (check_effective_target_arm_crypto_pragma_ok_nocache): New.
>>>  (check_effective_target_arm_crypto_pragma_ok): New.
>>>  (add_options_for_arm_vfp): New.
>>>  * gcc.target/arm/attr-crypto.c: Use arm_crypto_pragma_ok effective
>>>  target. Do not force -mfloat-abi=softfp, use arm_fp_ok effective
>>>  target 

Re: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909

2016-01-04 Thread Bernd Schmidt

On 12/20/2015 05:27 PM, Segher Boessenkool wrote:

On Fri, Dec 18, 2015 at 02:19:37AM +0100, Bernd Schmidt wrote:

On 12/17/2015 10:07 PM, Segher Boessenkool wrote:

It turns out v4 wasn't quite complete anyway; so here "v5".

If a candidate PRE cannot get the prologue because a block BB is
reachable from it, but PRE does not dominate BB, we try again with the
dominators of PRE.  That "try again" needs to again consider BB though,
we aren't done with it.

This fixes this problem.  Tested on the 68909 testcase, and bootstrapped
and regression checked on powerpc64-linux.  Is this okay for trunk?


This code is getting really quite confusing, and at the least I think we
need more documentation of what exactly vec is supposed to contain at
the entry to the inner while loop here.


Same as in the other loop: vec is a stack of blocks that still need to
be looked at.  I can duplicate the comment if you want?


No, I think more is needed. The inner loop looks like it should be 
emptying the vec, but this is not true if we break out of it, and your 
patch now even adds an explicit push. It also looks like it wants to use 
the bb_tmp bitmap to cache results for future iterations of the outer 
loop, but I'm not convinced this is actually correct. I can't follow 
this behaviour anymore without a clear description of intent.


Also, it might be clearer to not modify "pro" in this loop - use a 
"cand" variable, and modify "pro" instead of last_ok, getting rid of the 
latter.
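
For reference, the "stack of blocks that still need to be looked at" described above is the usual worklist pattern. The following is only a minimal sketch of that pattern, not the shrink-wrap code under review: the graph representation, MAX_BLOCKS and reachable_from are made-up names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_BLOCKS 16

/* Hypothetical CFG: succ[b][i] lists the successors of block b.  */
struct cfg
{
  int nblocks;
  int nsucc[MAX_BLOCKS];
  int succ[MAX_BLOCKS][MAX_BLOCKS];
};

/* Mark every block reachable from START in SEEN.  The stack holds the
   blocks that still need to be looked at.  A block is marked in SEEN
   when it is pushed, so it is pushed at most once and the stack cannot
   overflow.  */
static void
reachable_from (const struct cfg *g, int start, bool seen[MAX_BLOCKS])
{
  int stack[MAX_BLOCKS];
  int top = 0;

  assert (start >= 0 && start < g->nblocks);
  memset (seen, 0, MAX_BLOCKS * sizeof (bool));

  seen[start] = true;
  stack[top++] = start;

  while (top > 0)
    {
      int bb = stack[--top];
      for (int i = 0; i < g->nsucc[bb]; i++)
        {
          int s = g->succ[bb][i];
          if (!seen[s])
            {
              seen[s] = true;
              stack[top++] = s;
            }
        }
    }
}
```

In this sketch the stack is always empty when the loop finishes. A loop that breaks out early leaves entries behind, which is the kind of state a comment on the real code would have to describe.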



That would be a regression (from GCC 5); but I understand your worry.
How about we disable it if any further problems show up?


Let's see whether we can make sense of this code and decide then.


bernd



RE: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

2016-01-04 Thread Ajit Kumar Agarwal


-Original Message-
From: Jeff Law [mailto:l...@redhat.com] 
Sent: Wednesday, December 23, 2015 12:06 PM
To: Ajit Kumar Agarwal; Richard Biener
Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; 
Nagaraju Mekala
Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa 
representation

On 12/11/2015 02:11 AM, Ajit Kumar Agarwal wrote:
>
> Mibench/EEMBC benchmarks (Target Microblaze)
>
> Automotive_qsort1(4.03%), Office_ispell(4.29%), Office_stringsearch1(3.5%). 
> Telecom_adpcm_d( 1.37%), ospfv2_lite(1.35%).
>>I'm having a real tough time reproducing any of these results.  In fact, I'm 
>>having a tough time seeing cases where path splitting even applies to the 
>>Mibench/EEMBC benchmarks mentioned above.

>>In the very few cases where split-paths might apply, the net resulting 
>>assembly code I get is the same with and without split-paths.

>>How consistent are these results?

I am consistently getting the gains for office_ispell, office_stringsearch1 and
telecom_adpcm_d. I ran it again today and we see gains in the same benchmark
tests with the split path changes.

>>What functions are being affected that in turn impact performance?

For office_ispell, the functions are:
   "Function linit (linit, funcdef_no=0, decl_uid=2535, cgraph_uid=0, symbol_order=2)"
   for the lookup.c file, and
   "Function checkfile (checkfile, funcdef_no=1, decl_uid=2478, cgraph_uid=1, symbol_order=4)"
   "Function correct (correct, funcdef_no=2, decl_uid=2503, cgraph_uid=2, symbol_order=5)"
   "Function askmode (askmode, funcdef_no=24, decl_uid=2464, cgraph_uid=24, symbol_order=27)"
   for the correct.c file.

For office_stringsearch1, the function is
   "Function bmhi_search (bmhi_search, funcdef_no=1, decl_uid=2178, cgraph_uid=1, symbol_order=5)"
   for the bmhisrch.c file.

>>What options are you using to compile the benchmarks?  I'm trying with
>>-O2 -fsplit-paths and -O3 in my attempts to trigger the transformation so 
>>that I can look more closely at possible heuristics.

I am using the following flags.

-O3 -mlittle-endian -mxl-barrel-shift -mno-xl-soft-div -mhard-float 
-mxl-float-convert -mxl-float-sqrt   -mno-xl-soft-mul -mxl-multiply-high 
-mxl-pattern-compare.

To disable split paths -fno-split-paths is used on top of the above flags.

>>Is this with the standard microblaze-elf target?  Or with some other target?

I am using the --target=microblaze-xilinx-elf to build the microblaze target.

Thanks & Regards
Ajit

jeff




[PATCH] Adjust contrib/update-copyright.py

2016-01-04 Thread Jakub Jelinek
Hi!

One of the gfortran.dg/ tests has NVidia copyright, which made
update-copyright.py stop changing anything further.

Committed to trunk as obvious.

2016-01-04  Jakub Jelinek  

* update-copyright.py (GCCCopyright): Add NVIDIA Corporation
as external author.

--- contrib/update-copyright.py (revision 232054)
+++ contrib/update-copyright.py (working copy)
@@ -1,6 +1,6 @@
 #!/usr/bin/python
 #
-# Copyright (C) 2013 Free Software Foundation, Inc.
+# Copyright (C) 2013-2016 Free Software Foundation, Inc.
 #
 # This script is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -696,6 +696,7 @@ class GCCCopyright (Copyright):
 self.add_external_author ('James Theiler, Brian Gough')
 self.add_external_author ('Makoto Matsumoto and Takuji Nishimura,')
 self.add_external_author ('National Research Council of Canada.')
+self.add_external_author ('NVIDIA Corporation')
 self.add_external_author ('Peter Dimov and Multi Media Ltd.')
 self.add_external_author ('Peter Dimov')
 self.add_external_author ('Pipeline Associates, Inc.')

Property changes on: contrib/update-copyright.py
___
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property

Jakub


Re: [PATCH, PR69043, fortran] Trying to include a directory causes an infinite loop

2016-01-04 Thread Jim MacArthur

On 24/12/15 16:38, Jim MacArthur wrote:

 Bootstrapped and tested for regressions on x86_64-pc-linux-gnu. There is
a test case for the bug included.


I missed out the test case when creating the first patch. This one 
should have it.


PR fortran/69043
   * scanner.c (load_file): Abort and show an error if stat() shows the 
path is a directory.
Index: gcc/fortran/scanner.c
===
--- gcc/fortran/scanner.c   (revision 231945)
+++ gcc/fortran/scanner.c   (working copy)
@@ -2200,6 +2200,8 @@ load_file (const char *realfilename, const char *d
   FILE *input;
   int len, line_len;
   bool first_line;
+  struct stat st;
+  int stat_result;
   const char *filename;
   /* If realfilename and displayedname are different and non-null then
  surely realfilename is the preprocessed form of
@@ -2242,6 +2244,16 @@ load_file (const char *realfilename, const char *d
   current_file->filename, current_file->line, filename);
  return false;
}
+
+  stat_result = stat (realfilename, &st);
+  if (stat_result == 0 && st.st_mode & S_IFDIR)
+   {
+ fprintf (stderr, "%s:%d: Error: Included path '%s'"
+  " is a directory.\n",
+  current_file->filename, current_file->line, filename);
+ fclose (input);
+ return false;
+   }
 }
 
   /* Load the file.
Index: gcc/testsuite/gfortran.dg/include_9.f
===
--- gcc/testsuite/gfortran.dg/include_9.f   (revision 0)
+++ gcc/testsuite/gfortran.dg/include_9.f   (working copy)
@@ -0,0 +1,7 @@
+! { dg-do compile }
+
+  include '/'
+  program main
+  end program
+
+! { dg-error "is a directory" " " { target *-*-* } 3 }
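
A note on the check itself: the hunk above tests st.st_mode & S_IFDIR. On POSIX systems the file type is a multi-bit field, and other file types can share bits with S_IFDIR (on Linux, for instance, a block device's mode also has that bit set), so the S_ISDIR macro is the more robust test. The following is a minimal standalone sketch under that assumption, not the code from the patch; path_is_directory is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Return true if PATH names an existing directory.  A path that
   cannot be stat'ed at all is reported as "not a directory".  */
static bool
path_is_directory (const char *path)
{
  struct stat st;

  assert (path != NULL);
  if (stat (path, &st) != 0)
    return false;

  /* S_ISDIR compares the whole file-type field, not a single bit.  */
  return S_ISDIR (st.st_mode);
}
```

With a helper of this shape, the new test in load_file could be written as a single call on realfilename, leaving the error reporting unchanged.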


Re: cilkplus fails without pthreads for me

2016-01-04 Thread Bernd Schmidt

On 01/01/2016 07:13 PM, Mike Stump wrote:

cilkplus fails without pthreads for me:

xg++: error: unrecognized command line option '-pthread' compiler
exited with status 1 output is: xg++: error: unrecognized command
line option '-pthread'


> @@ -1450,6 +1450,10 @@ proc check_effective_target_cilkplus { } {
>  return 0;
>   }
>
> +if { ! [check_effective_target_pthread] } {
> +   return 0;
> +}
> +

I think you'll also want to revert Nathan's earlier change that adds 
just nvptx for the same reason. Ok with that change.



Bernd


Re: [PATCH], PowerPC, add ISA 3.0 xxperm (power9 patch #12)

2016-01-04 Thread David Edelsohn
On Thu, Dec 31, 2015 at 1:30 PM, Michael Meissner
 wrote:
> This patch adds support for the ISA 3.0 XXPERM instruction, which is like
> VPERM, except it can operate on any VSX register.  Since the instruction is a
> 3 operand instruction (RT and RA must be the same), I made it so VPERM was
> preferred.  I also added XXPERM fusion support where a XXLOR move instruction
> immediately before the XXPERM instruction is fused together.
>
> I have bootstrapped and done make check on a big endian power7 and a little
> endian power8 system.  In addition, I built all of Spec 2006 with power9
> support enabled, and all of the tests that previously built now build with
> XXPERM being generated (the OMNETPP benchmark currently does not build on
> little endian for either power8 or power9).  Are these patches ok to check in?
>
> [gcc]
> 2015-12-31  Michael Meissner  
>
> * config/rs6000/constraints.md (wo constraint): New constraint for
> ISA 3.0 (power9).
>
> * config/rs6000/rs6000.c (rs6000_debug_reg_global): Add support
> for wo constraint.
> (rs6000_init_hard_regno_mode_ok): Likewise.
>
> * config/rs6000/rs6000.h (r6000_reg_class_enum): Add support for
> wo constraint.
>
> * config/rs6000/altivec.md (altivec_vperm_): Clean up vperm
> expanders not to have constraints.  Add support for ISA 3.0 xxperm
> instruction.  Add support for fusing xxlor with xxperm.
> (altivec_vperm__internal): Likewise.
> (altivec_vperm_v8hiv16qi): Likewise.
> (altivec_vperm_v16q): Likewise.
> (altivec_vperm__uns): Likewise.
> (vperm_v8hiv4si): Likewise.
> (vperm_v16qiv8hi): Likewise.
>
> * doc/md.texi (RS/6000 constraints): Document wo constraint.
>
> [gcc/testsuite]
> 2015-12-31  Michael Meissner  
>
> * gcc.target/powerpc/p9-permute.c: New test for xxperm code
> generation.

This is okay.

Thanks, David


Re: [PATCH], PowerPC, Add -mpower9-dform to switches turned on with -mcpu=power9

2016-01-04 Thread David Edelsohn
On Thu, Dec 31, 2015 at 3:41 PM, Michael Meissner
 wrote:
> When I did the initial d-form support for ISA 3.0 (power9) for loading scalar
> SF/DF values into Altivec registers, I did not enable -mpower9-dform with the
> other ISA 3.0 switches when you used -mcpu=power9.  This was during the
> initial development, when I had some bugs.  I fixed the bugs, but I forgot to
> enable the d-form addressing support.  This patch enables that default.
>
> I have built all of Spec 2006 with this option, and there were no failures.  I
> did not do the full bootstrap/make check right now, but I have done it in the
> past with no regressions.  Is it ok to install this patch?
>
> 2015-12-31  Michael Meissner  
>
> * config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_SERVER): Add
> OPTION_MASK_P9_DFORM.
>
> (Note, at some point there will be patches to enable using d-form addressing
> with 128-bit vector types, but those patches aren't ready yet).

This is okay.

Thanks, David


Re: cilkplus fails without pthreads for me

2016-01-04 Thread Nathan Sidwell

On 01/01/16 13:13, Mike Stump wrote:

cilkplus fails without pthreads for me:

xg++: error: unrecognized command line option '-pthread'
compiler exited with status 1
output is:
xg++: error: unrecognized command line option '-pthread'

FAIL: c-c++-common/attr-simd-3.c  -std=gnu++14 PR68158 (test for errors, line 5)

I suspect pthreads is a fairly hard requirement.  Either a test compile and 
link needs to be done, or we need to be able to whack out the tests on 
non-pthread systems.

Ok?


Probably not.  See the discussion at
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01882.html  Admittedly, that was
annotating the test directly, but Rainer's comment suggests to me that
requiring pthreads would be too great a hammer.


You don't say what target -- is it a system where a target triplet is 
insufficient for this check?


nathan


Re: [PATCH] c/68966 - atomic_fetch_* on atomic_bool not diagnosed

2016-01-04 Thread Marek Polacek
Hi Martin,

On Sun, Jan 03, 2016 at 08:03:20PM -0700, Martin Sebor wrote:
> Index: gcc/doc/extend.texi
> ===
> --- gcc/doc/extend.texi   (revision 232047)
> +++ gcc/doc/extend.texi   (working copy)
> @@ -9238,6 +9238,8 @@
>  @{ tmp = *ptr; *ptr = ~(tmp & value); return tmp; @}   // nand
>  @end smallexample
>  
> +The object pointed to by the first argument must of integer or pointer type. 
>  It must not be a Boolean type.

Too long line and missing "be " after "must"?

> +The same constraints on arguments apply as for the corresponding 
> @code{__sync_op_and_fetch} built-in functions.
> +

Too long line.

> -All memory orders are valid.
> +The object pointed to by the first argument must of integer or pointer type. 
>  It must not be a Boolean type.  All memory orders are valid.

Too long line and missing "be " after "must"?

> +The same constraints on arguments apply as for the corresponding 
> @code{__atomic_op_fetch} built-in functions.  All memory orders are valid.

Too long line.

> @@ -10686,12 +10691,16 @@
>if (!INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type))
>  goto incompatible;
>  
> +  if (fetch && TREE_CODE (type) == BOOLEAN_TYPE)
> +  goto incompatible;

This goto is indented two more spaces than it should be.

> @@ -11250,6 +11259,11 @@
>   vec *params)
>  {
>enum built_in_function orig_code = DECL_FUNCTION_CODE (function);
> +
> +  /* Is function is one of the _FETCH_OP_ or _OP_FETCH_ built-ins?

I think drop the second "is".

> @@ -11325,6 +11339,9 @@
>  case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_N:
>  case BUILT_IN_ATOMIC_LOAD_N:
>  case BUILT_IN_ATOMIC_STORE_N:
> +  {
> + fetch_op = false;
> +  }

Let's either remove those {} or add a fallthrough comment as done above.

> @@ -11358,7 +11375,16 @@
>  case BUILT_IN_SYNC_LOCK_TEST_AND_SET_N:
>  case BUILT_IN_SYNC_LOCK_RELEASE_N:
>{
> - int n = sync_resolve_size (function, params);
> + /* The following are not _FETCH_OPs and must be accepted with
> +pointers to _Bool (or C++ bool).  */
> + if (fetch_op)
> +   fetch_op =
> + orig_code != BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_N
> + && orig_code != BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_N
> + && orig_code != BUILT_IN_SYNC_LOCK_TEST_AND_SET_N
> + && orig_code != BUILT_IN_SYNC_LOCK_RELEASE_N;
> + 

Trailing whitespaces on this line.  And I think add () around the RHS
of the assignment to fetch_op.

> Index: gcc/testsuite/gcc.dg/atomic-fetch-bool.c
> ===
> --- gcc/testsuite/gcc.dg/atomic-fetch-bool.c  (revision 0)
> +++ gcc/testsuite/gcc.dg/atomic-fetch-bool.c  (working copy)
> @@ -0,0 +1,64 @@
> +/* PR c/68966 - atomic_fetch_* on atomic_bool not diagnosed
> +   Test to verify that calls to __atomic_fetch_op funcions with a _Bool
> +   argument are rejected.  This is necessary because GCC expects that
> +   all initialized _Bool objects have a specific representation and
> +   allowing atomic operations to change it would break the invariant.  */
> +/* { dg-do compile } */
> +/* { dg-options "-std=c11" } */

Doesn't matter here, but probably add -pedantic-errors.

> Index: gcc/testsuite/gcc.dg/sync-fetch-bool.c
> ===
> --- gcc/testsuite/gcc.dg/sync-fetch-bool.c(revision 0)
> +++ gcc/testsuite/gcc.dg/sync-fetch-bool.c(working copy)
> @@ -0,0 +1,54 @@
> +/* PR c/68966 - atomic_fetch_* on atomic_bool not diagnosed
> +   Test to verify that calls to __sync_fetch_op funcions with a _Bool
> +   argument are rejected.  This is necessary because GCC expects that
> +   all initialized _Bool objects have a specific representation and
> +   allowing atomic operations to change it would break the invariant.  */
> +/* { dg-do compile } */
> +/* { dg-options "-std=c99" } */

As the testcase uses _Atomic, I wonder why there's -std=c99.  I'd use
-std=c11 -pedantic-errors.
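
To illustrate the distinction the patch draws: operations that store a whole value (load, store, exchange, compare-exchange) can never leave a _Bool holding anything but 0 or 1, while a fetch-and-op computes on the old representation. Below is a small sketch of setting a flag with an exchange instead of a fetch-and-op. It is not taken from the new tests; claim_once, release_claim and claimed are illustrative names, and it assumes __atomic_exchange_n stays accepted for _Bool in the same way as the load/store/compare-exchange cases shown in the hunk.

```c
#include <assert.h>
#include <stdbool.h>

static bool claimed;

/* Atomically set the flag and report whether this caller was the
   first to set it.  The exchange stores exactly 'true', so the
   object always holds a valid _Bool value.  */
static bool
claim_once (void)
{
  bool was_claimed = __atomic_exchange_n (&claimed, true, __ATOMIC_SEQ_CST);
  return !was_claimed;
}

/* Clear the flag again.  It must currently be set.  */
static void
release_claim (void)
{
  assert (__atomic_load_n (&claimed, __ATOMIC_SEQ_CST));
  __atomic_store_n (&claimed, false, __ATOMIC_SEQ_CST);
}
```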

Thanks,

Marek


Re: [Patch ifcvt] Add a new parameter to limit if-conversion

2016-01-04 Thread Bernd Schmidt

On 12/31/2015 10:21 AM, Yuri Rumyantsev wrote:

Here is slightly modified patch which limits a number of conditional
moves instead of changing conditional branch cost. This is in fact a
work-around for very poor cost model which needs to be enhanced to
evaluate cost of conditional move that could be greater than cost of
ordinary move (for some targets). This fix did not show any
performance regressions on different x86 platforms in comparison with
James patch.


I think this is OK. In the future, when attaching patches, please make 
sure they are text/plain so they are displayed by mail readers and can 
be quoted.



Bernd



Mark oacc kernels fns

2016-01-04 Thread Nathan Sidwell
There's currently no robust predicate to determine whether an oacc offload 
function is for a kernels region (as opposed to a parallel region).  The test in 
tree-ssa-loop.c uses the heuristic of seeing if all the dimensions are defaulted
(which can easily be true for parallel offloads at that point).


This patch marks TREE_PUBLIC on the offload attribute values, to note kernels 
regions, and adds a predicate to check that.  I also broke out the function 
level determination from oacc_validate_dims, as there it was only laziness on my 
part to have not done that earlier.


Using these predicates improves the dump output of the openacc device lowering 
pass too.
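
As a rough picture of the two predicates, here is a toy model in plain C. It is only a sketch of the idea, not GCC code: struct dim_node stands in for a TREE_LIST node of the attribute value, public_flag models TREE_PUBLIC, has_purpose/purpose model TREE_PURPOSE, and DIM_MAX stands in for GOMP_DIM_MAX. All of these names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DIM_MAX 3   /* stand-in for GOMP_DIM_MAX */

/* Toy stand-in for one node of the attribute's dimension list.  */
struct dim_node
{
  bool has_purpose;       /* false for offload regions */
  int purpose;            /* non-zero at the routine's level */
  bool public_flag;       /* set on kernels regions */
  struct dim_node *next;
};

/* True if the attribute describes a kernels region.  Only the first
   node needs to be checked.  */
static bool
toy_kernels_p (const struct dim_node *dims)
{
  assert (dims != NULL);
  return dims->public_flag;
}

/* Level at which a routine may spawn a partitioned loop, or -1 if
   this is an offload region rather than a routine.  */
static int
toy_level (const struct dim_node *dims)
{
  assert (dims != NULL);
  if (!dims->has_purpose)
    return -1;

  int ix;
  for (ix = 0; ix != DIM_MAX && dims != NULL; ix++, dims = dims->next)
    if (dims->purpose != 0)
      break;
  return ix;
}
```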


ok?

nathan
2016-01-04  Nathan Sidwell  

	* omp-low.h (oacc_fn_attrib_kernels_p): Declare.
	* omp-low.c (set_oacc_fn_attrib): Add IS_KERNEL arg.
	(oacc_fn_attrib_kernels_p, oacc_fn_attrib_level): New.
	(expand_omp_target): Pass is_kernel to set_oacc_fn_attrib.
	(oacc_validate_dims): Add LEVEL arg, don't return level.
	(new_oacc_loop_routine): Use oacc_fn_attrib_level, not
	oacc_validate_dims.
	(execute_oacc_device_lower): Adjust, add more dump output.
	* tree-ssa-loop.c (gate_oacc_kernels): Use oacc_fn_attrib_kernels_p.

Index: gcc/omp-low.c
===
--- gcc/omp-low.c	(revision 232057)
+++ gcc/omp-low.c	(working copy)
@@ -12395,10 +12395,11 @@ replace_oacc_fn_attrib (tree fn, tree di
 
 /* Scan CLAUSES for launch dimensions and attach them to the oacc
function attribute.  Push any that are non-constant onto the ARGS
-   list, along with an appropriate GOMP_LAUNCH_DIM tag.  */
+   list, along with an appropriate GOMP_LAUNCH_DIM tag.  IS_KERNEL is
+   true, if these are for a kernels region offload function.  */
 
 static void
-set_oacc_fn_attrib (tree fn, tree clauses, vec *args)
+set_oacc_fn_attrib (tree fn, tree clauses, bool is_kernel, vec *args)
 {
   /* Must match GOMP_DIM ordering.  */
   static const omp_clause_code ids[]
@@ -12423,6 +12424,9 @@ set_oacc_fn_attrib (tree fn, tree clause
 	  non_const |= GOMP_DIM_MASK (ix);
 	}
   attr = tree_cons (NULL_TREE, dim, attr);
+  /* Note kernelness with TREE_PUBLIC.  */
+  if (is_kernel)
+	TREE_PUBLIC (attr) = 1;
 }
 
   replace_oacc_fn_attrib (fn, attr);
@@ -12491,6 +12495,36 @@ get_oacc_fn_attrib (tree fn)
   return lookup_attribute (OACC_FN_ATTRIB, DECL_ATTRIBUTES (fn));
 }
 
+/* Return true if this oacc fn attrib is for a kernels offload
+   region.  We use the TREE_PUBLIC flag of each dimension -- only
+   need to check the first one.  */
+
+bool
+oacc_fn_attrib_kernels_p (tree attr)
+{
+  return TREE_PUBLIC (TREE_VALUE (attr));
+}
+
+/* Return level at which oacc routine may spawn a partitioned loop, or
+   -1 if it is not a routine (i.e. is an offload fn).  */
+
+static int
+oacc_fn_attrib_level (tree attr)
+{
+  tree pos = TREE_VALUE (attr);
+
+  if (!TREE_PURPOSE (pos))
+return -1;
+  
+  int ix = 0;
+  for (ix = 0; ix != GOMP_DIM_MAX;
+   ix++, pos = TREE_CHAIN (pos))
+if (!integer_zerop (TREE_PURPOSE (pos)))
+  break;
+
+  return ix;
+}
+
 /* Extract an oacc execution dimension from FN.  FN must be an
offloaded function or routine that has already had its execution
dimensions lowered to the target-specific values.  */
@@ -12808,6 +12842,7 @@ expand_omp_target (struct omp_region *re
   enum built_in_function start_ix;
   location_t clause_loc;
   unsigned int flags_i = 0;
+  bool oacc_kernels_p = false;
 
   switch (gimple_omp_target_kind (entry_stmt))
 {
@@ -12827,8 +12862,10 @@ expand_omp_target (struct omp_region *re
   start_ix = BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA;
   flags_i |= GOMP_TARGET_FLAG_EXIT_DATA;
   break;
-case GF_OMP_TARGET_KIND_OACC_PARALLEL:
 case GF_OMP_TARGET_KIND_OACC_KERNELS:
+  oacc_kernels_p = true;
+  /* FALLTHROUGH */
+case GF_OMP_TARGET_KIND_OACC_PARALLEL:
   start_ix = BUILT_IN_GOACC_PARALLEL;
   break;
 case GF_OMP_TARGET_KIND_OACC_DATA:
@@ -13010,7 +13047,7 @@ expand_omp_target (struct omp_region *re
   break;
 case BUILT_IN_GOACC_PARALLEL:
   {
-	set_oacc_fn_attrib (child_fn, clauses, &args);
+	set_oacc_fn_attrib (child_fn, clauses, oacc_kernels_p, &args);
 	tagging = true;
   }
   /* FALLTHRU */
@@ -18929,17 +18966,17 @@ oacc_xform_loop (gcall *call)
 }
 
 /* Validate and update the dimensions for offloaded FN.  ATTRS is the
-   raw attribute.  DIMS is an array of dimensions, which is returned.
-   Returns the function level dimensionality --  the level at which an
-   offload routine wishes to partition a loop.  */
+   raw attribute.  DIMS is an array of dimensions, which is filled in.
+   LEVEL is the partitioning level of a routine, or -1 for an offload
+   region itself.  */
 
-static int
-oacc_validate_dims (tree fn, tree attrs, int *dims)
+static void
+oacc_validate_dims (tree fn, tree attrs, int *dims, int level)
 {
   tree purpose[GOMP_DIM_MAX];
   unsigned ix;
   tree pos = 

Re: varpool/constpool bug

2016-01-04 Thread Nathan Sidwell
My patch to stop constant pool objects accidentally ending up in the varpool 
caused problems with (at least) powerpc. 
(https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02100.html) Hence reverted.


This patch changes compare_base_decls to simply use the varpool getter, rather 
than get_create.  We still need the preceding decl_in_symtab_p to filter out 
decls that should never be in the varpool (the getter has an assert to check 
you're not trying to abuse it).
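
The difference between the two accessors can be shown with a toy table. This is a sketch of the general "look up without inserting" pattern only, not of the symbol table itself; struct table, table_get, table_get_create and same_entry are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_MAX 8

struct node { const char *name; };
struct table { struct node nodes[TABLE_MAX]; size_t count; };

/* Pure lookup: returns NULL when NAME is absent.  */
static struct node *
table_get (struct table *t, const char *name)
{
  for (size_t i = 0; i < t->count; i++)
    if (strcmp (t->nodes[i].name, name) == 0)
      return &t->nodes[i];
  return NULL;
}

/* Lookup that inserts NAME on a miss.  */
static struct node *
table_get_create (struct table *t, const char *name)
{
  struct node *n = table_get (t, name);
  if (n)
    return n;
  assert (t->count < TABLE_MAX);
  t->nodes[t->count].name = name;
  return &t->nodes[t->count++];
}

/* Return 1 if A and B resolve to the same entry, 0 otherwise.
   Uses only table_get, so a query never grows the table.  */
static int
same_entry (struct table *t, const char *a, const char *b)
{
  struct node *na = table_get (t, a);
  if (!na)
    return 0;
  struct node *nb = table_get (t, b);
  if (!nb)
    return 0;
  return na == nb;
}
```

Had same_entry used table_get_create, merely asking about an unknown name would have added it to the table, which is the side effect the patch removes from compare_base_decls.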


ok?

nathan
2016-01-04  Nathan Sidwell  

	gcc/
	* alias.c (compare_base_decls): Use symtab_node::get.

	gcc/testsuite/
	* gcc.dg/alias-15.c: New.

Index: alias.c
===
--- alias.c	(revision 232057)
+++ alias.c	(working copy)
@@ -2044,8 +2044,15 @@ compare_base_decls (tree base1, tree bas
   || !decl_in_symtab_p (base2))
 return 0;
 
-  ret = symtab_node::get_create (base1)->equal_address_to
-		 (symtab_node::get_create (base2), true);
+  /* Don't cause symbols to be inserted by the act of checking.  */
+  symtab_node *node1 = symtab_node::get (base1);
+  if (!node1)
+return 0;
+  symtab_node *node2 = symtab_node::get (base2);
+  if (!node2)
+return 0;
+  
+  ret = node1->equal_address_to (node2, true);
   if (ret == 2)
 return -1;
   return ret;
Index: testsuite/gcc.dg/alias-15.c
===
--- testsuite/gcc.dg/alias-15.c	(revision 0)
+++ testsuite/gcc.dg/alias-15.c	(working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-additional-options  "-O2 -fdump-ipa-cgraph" } */
+
+/* RTL-level CSE shouldn't introduce LC0 (for the string) into varpool */
+char *p;
+
+void foo ()
+{
+  p = "abc\n";
+
+  while (*p != '\n')
+p++;
+}
+
+/* { dg-final { scan-ipa-dump-not "LC0" "cgraph" } } */


Re: [PATCH] PR/68089: C++-11: Ignore "alignas(0)".

2016-01-04 Thread Martin Sebor

On 01/04/2016 04:33 AM, Dominik Vogt wrote:

On Fri, Jan 01, 2016 at 05:53:08PM -0700, Martin Sebor wrote:

On 12/31/2015 04:50 AM, Dominik Vogt wrote:

The attached patch fixes C++-11 handling of "alignas(0)" which
should be ignored but currently generates an error message.  A
test case is included; the patch has been tested on S390x.  Since
it's a language issue it should be independent of the backend
used.


The patch doesn't handle value-dependent expressions(*).



It seems that the problem is in handle_aligned_attribute() calling
check_user_alignment() with the second argument (ALLOW_ZERO)
set to false.  Calling it with true fixes the problem and handles
value-dependent expressions (I haven't done any more testing beyond
that).


Like the attached patch?  (Passes the testsuite on s390x.)


Yes, like that (though someone other than me needs to approve
your patch).



But wouldn't an "aligned" attribute be added, allowing the backend
to possibly generate an error or a warning?


AFAICS, both the C and C++ front ends ignore the attribute
when check_user_alignment() returns -1 (either on error or
when the requested alignment is zero and ALLOW_ZERO is true).
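
For comparison, C11 has the same rule for _Alignas (6.7.5: an alignment specification of zero has no effect). The sketch below is the C analogue of the behaviour discussed here, not the C++ test case from the patch; the struct and function names are made up for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

struct plain  { char c; int i; };
struct zeroed { char c; alignas (0) int i; };

/* alignas (0) must be ignored, so both layouts have to agree.  */
static_assert (alignof (struct zeroed) == alignof (struct plain),
               "alignas (0) changed the struct alignment");

/* Return non-zero if the member offsets also agree.  */
static int
zero_alignas_offset_matches (void)
{
  return offsetof (struct zeroed, i) == offsetof (struct plain, i);
}
```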

Martin

PS I wonder what it is about this thread that makes my email
client (Thunderbird) include only gcc-patches and krebbel
when I hit Reply All and not you.  (I had to manually add
your email.)  It looks like your reply back to me did the
same thing.

Martin



Re: cilkplus fails without pthreads for me

2016-01-04 Thread Nathan Sidwell

On 01/04/16 10:06, Bernd Schmidt wrote:

On 01/01/2016 07:13 PM, Mike Stump wrote:

cilkplus fails without pthreads for me:

xg++: error: unrecognized command line option '-pthread' compiler
exited with status 1 output is: xg++: error: unrecognized command
line option '-pthread'


 > @@ -1450,6 +1450,10 @@ proc check_effective_target_cilkplus { } {
 >  return 0;
 >   }
 >
 > +if { ! [check_effective_target_pthread] } {
 > +   return 0;
 > +}
 > +

I think you'll also want to revert Nathan's earlier change that adds just nvptx
for the same reason. Ok with that change.


Yes please.

nathan



guality test suite fix

2016-01-04 Thread Mike Stump
So, I’d like for the guality people to chime in.  I only kick in, if they fail 
to do so for any reason.  :-)

Either, the stuff downstream _must_ arrange for newline ended content, or this 
code has to do it, if they don’t.  My take, I think they are signing up for 
newline terminated content:

binutils/gdb$ grep 'will be killed' *.c
top.c:_("\tInferior %d [%s] will be killed.\n"), inf->num,

commit b8fa0bfa752bb672c66a1d6fdefcdf4cb308a712
Author: Pedro Alves 
Date:   Fri Aug 14 14:28:15 2009 +

2009-08-14  Pedro Alves  

gdb/
* top.c (any_thread_of): Delete.
(kill_or_detach): Use any_thread_of_process.
* top.c (print_inferior_quit_action): New.
(quit_confirm): Rewrite to print info about all inferiors.
* target.c (dispose_inferior): New.
(target_preopen): Use it.

2009-08-14  Pedro Alves  

gdb/testsuite/
* gdb.threads/killed.exp, gdb.threads/manythreads.exp,
gdb.threads/staticthreads.exp: Adjust to "quit" output changes.


+ _("\tInferior %d [%s] will be killed.\n"), inf->num,

So, the question is, what outputs this line, stripping that newline?  As far as 
I can tell, there is no gdb that won’t print it since 2009.

The other possible patch would be the routine that read that from gdb and 
printed it without the newline.  I’m not sure if that patch or your patch is 
better.

On Jan 4, 2016, at 4:16 AM, Alan Lawrence  wrote:
> On 10/12/15 10:31, Alan Lawrence wrote:
>> Runs of the guality testsuite can sometimes end up with gcc.log containing
>> malformed lines like:
>> 
>> A debugging session is active.PASS: gcc.dg/guality/pr36728-1.c   -O2  line 
>> 18 arg4 == 4
>> A debugging session is active.PASS: gcc.dg/guality/restrict.c   -O2  line 30 
>> type:ip == int *
>>  Inferior 1 [process 27054] will be killed.PASS: 
>> gcc.dg/guality/restrict.c   -O2  line 30 type:cicrp == const int * const 
>> restrict
>>  Inferior 1 [process 27160] will be killed.PASS: 
>> gcc.dg/guality/restrict.c   -O2  line 30 type:cvirp == int * const volatile 
>> restrict
>> 
>> This patch just makes sure the PASS/FAIL comes at the beginning of a line.
>> (At the slight cost of adding some extra newlines not in the actual test output.)
>> 
>> I moved the remote_close target calls earlier, to avoid any possible race
>> condition of extra output being generated after the newline - this may not be
>> strictly necessary.
>> 
>> Tested on aarch64-none-linux-gnu and x86_64-none-linux-gnu.
>> 
>> I think this is reasonable for stage 3 - OK for trunk?
>> 
>> gcc/testsuite/ChangeLog:
>>  * lib/gcc-gdb-test.exp (gdb-test): call remote_close earlier, and send
>>  newline to log, before calling pass/fail/unsupported.
>>  * lib/gcc-simulate-thread.exp (simulate-thread): Likewise.
>> ---
>>  gcc/testsuite/lib/gcc-gdb-test.exp| 15 ++-
>>  gcc/testsuite/lib/gcc-simulate-thread.exp | 10 +++---
>>  2 files changed, 17 insertions(+), 8 deletions(-)
>> 
>> diff --git a/gcc/testsuite/lib/gcc-gdb-test.exp 
>> b/gcc/testsuite/lib/gcc-gdb-test.exp
>> index d3ba6e4..f60cabf 100644
>> --- a/gcc/testsuite/lib/gcc-gdb-test.exp
>> +++ b/gcc/testsuite/lib/gcc-gdb-test.exp
>> @@ -84,8 +84,9 @@ proc gdb-test { args } {
>>  remote_expect target [timeout_value] {
>>  # Too old GDB
>>  -re "Unhandled dwarf expression|Error in sourced command file|> type in " {
>> -unsupported "$testname"
>>  remote_close target
>> +send_log "\n"
>> +unsupported "$testname"
>>  file delete $cmd_file
>>  return
>>  }
>> @@ -93,7 +94,9 @@ proc gdb-test { args } {
>>  -re {[\n\r]\$1 = ([^\n\r]*)[\n\r]+\$2 = ([^\n\r]*)[\n\r]} {
>>  set first $expect_out(1,string)
>>  set second $expect_out(2,string)
>> +remote_close target
>>  if { $first == $second } {
>> +send_log "\n"
>>  pass "$testname"
>>  } else {
>>  # We need the -- to disambiguate $first from an option,
>> @@ -101,7 +104,6 @@ proc gdb-test { args } {
>>  send_log -- "$first != $second\n"
>>  fail "$testname"
>>  }
>> -remote_close target
>>  file delete $cmd_file
>>  return
>>  }
>> @@ -116,26 +118,29 @@ proc gdb-test { args } {
>>  regsub -all {\mlong int\M} $type "long" type
>>  regsub -all {\mshort int\M} $type "short" type
>>  set expected [lindex $args 2]
>> +remote_close target
>>  if { $type == $expected } {
>> +send_log "\n"
>>  pass "$testname"
>>  } else {
>>  send_log -- "$type != $expected\n"
>>  fail "$testname"
>>  }
>> -remote_close target
>>  file delete $cmd_file
>>  return
>>  }
>>  timeout {
>> -unsupported "$testname"
>>  remote_close target
>> +send_log "\n"
>> +  

[gomp4] Fix acc_on_device for C++

2016-01-04 Thread Nathan Sidwell
This patch fixes acc_on_device's C++ wrapper when compiling at -O0.  The wrapper 
isn't inlined, and we need to mark the function as needing emission by the 
device compiler too.


nathan
2016-01-04  Nathan Sidwell  

	* openacc.h (acc_on_device): Add routine pragma for C++ wrapper.
	* testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: New.

Index: libgomp/openacc.h
===
--- libgomp/openacc.h	(revision 232058)
+++ libgomp/openacc.h	(working copy)
@@ -121,6 +121,7 @@ int acc_set_cuda_stream (int, void *) __
 
 /* Forwarding function with correctly typed arg.  */
 
+#pragma acc routine seq
 inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
 {
   return acc_on_device ((int) __arg);
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c
===
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c	(working copy)
@@ -0,0 +1,23 @@
+/* { dg-additional-options "-O0" } */
+
+#include <openacc.h>
+
+/* acc_on_device might not be folded at -O0, but it should work. */
+
+int main ()
+{
+  int dev;
+  
+#pragma acc parallel copyout (dev)
+  {
+dev = acc_on_device (acc_device_not_host);
+  }
+
+  int expect = 1;
+  
+#if  ACC_DEVICE_TYPE_host
+  expect = 0;
+#endif
+  
+  return dev != expect;
+}


Re: [PATCH 1/3] Fix logic bug in Cilk Plus array expansion

2016-01-04 Thread Jeff Law

On 01/02/2016 04:26 PM, Patrick Palka wrote:

On Sat, Jan 2, 2016 at 3:21 AM, Jakub Jelinek  wrote:

On Fri, Jan 01, 2016 at 10:06:34PM -0700, Jeff Law wrote:

gcc/cp/ChangeLog:

 * cp-array-notation.c (cp_expand_cond_array_notations): Return
 error_mark_node only if find_rank failed, not if it was
 successful.

Can you use -fdump-tree-original in the testcase and verify there's no <<<
error >>> expressions in the resulting dump file?

With that change, this is OK.


I think the patch is incomplete.  Because, find_rank does not always emit
an error if it returns false, so we again have cases where we can get
error_mark_node in the code without error being emitted.
   else if (*rank != current_rank)
 {
   /* In this case, find rank is being recursed through a set of
  expression of the form A <OPERATION> B, where A and B both have
  array notations in them and the rank of A is not equal to rank of B.
  A simple example of such case is the following: X[:] + Y[:][:] */
   *rank = current_rank;
   return false;
 }
and other spots.  E.g.
   if (prev_arg && EXPR_HAS_LOCATION (prev_arg))
 error_at (EXPR_LOCATION (prev_arg),
   "rank mismatch between %qE and %qE", prev_arg,
   TREE_OPERAND (expr, ii));
looks very suspicious.


Hmm, good point. Here's a contrived test case that causes find_rank to
return false without emitting an error message thus we again end up
with an error_mark_node in the gimplifier:

/* { dg-do compile } */
/* { dg-options "-fcilkplus" } */

void foo() {}

#define ALEN 1024

int main(int argc, char* argv[])
{
   typedef void (*f) (void *);
   f b[ALEN], c[ALEN][ALEN];
   (b[:]) ((void *)c[:][:]);
   _Cilk_spawn foo();
   return 0;
}

But this patch was intended to only fix the testsuite fallout that
patch 3 would have otherwise caused, and not to e.g. fix all the bugs
with find_rank.

(BTW patch 3 also makes this test case trigger an ICE, instead of
being silently miscompiled.)
Can you please include this test (xfailed) when you commit patch #1.  I 
think you want the test to scan for error_mark_node in the gimplified dump.


Jeff



Re: varpool/constpool bug

2016-01-04 Thread Jeff Law

On 01/04/2016 08:57 AM, Nathan Sidwell wrote:

My patch to stop constant pool objects accidentally ending up in the
varpool caused problems with (at least) powerpc.
(https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02100.html) Hence reverted.

This patch changes compare_base_decls to simply use the varpool getter,
rather than get_create.  We still need the preceding decl_in_symtab_p to
filter out decls that should never be in the varpool (the getter has an
assert to check you're not trying to abuse it).

ok?

Once it passes the usual bootstrap & regression testing.

Looking at it again, it seems "obvious" now that the act of comparing 
things for alias analysis shouldn't be inserting new things into the tables.



jeff


Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c

2016-01-04 Thread Jeff Law

On 12/24/2015 04:55 AM, Alan Lawrence wrote:

This version changes the test cases to fix failures on some platforms, by
rewriting the initializers so that they aren't pushed out to the constant pool.

gcc/ChangeLog:

* tree-ssa-scopedtables.c (avail_expr_hash): Hash MEM_REF and ARRAY_REF
using get_ref_base_and_extent.
(equal_mem_array_ref_p): New.
(hashable_expr_equal_p): Add call to previous.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/ssa-dom-cse-5.c: New.
* gcc.dg/tree-ssa/ssa-dom-cse-6.c: New.
* gcc.dg/tree-ssa/ssa-dom-cse-7.c: New.

This is fine.

Thanks,
Jeff



Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c

2016-01-04 Thread Jeff Law

On 12/21/2015 06:13 AM, Alan Lawrence wrote:

This is a respin of patches
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03266.html and
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03267.html, which were
"too quickly" approved before concerns with efficiency were pointed out.

I tried to change the hashing just in tree-ssa-dom.c using C++ subclassing, but
couldn't cleanly separate this out from tree-ssa-scopedtables and
tree-ssa-threadedge.c due to use of avail_exprs_stack. So I figured it was
probably appropriate to use the equivalences in jump threading too. Also,
using get_ref_base_and_extent unifies handling of MEM_REFs and ARRAY_REFs
(hence only one patch rather than two).

It is appropriate.



I've added a couple of testcases that show the improvement in DOM, but in all
cases I had to disable FRE, even PRE, to get any improvement, apart from on
ssa-dom-cse-2.c itself (where on the affected platforms FRE still does not do
the optimization). This makes me wonder if this is the right approach or whether
changing the references output by SRA (as per
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01490.html , judged as a hack to
SRA to work around limitations in DOM - or is it?) would be better.

I just doubt it happens all that much.



Jeff


Re: cilkplus fails without pthreads for me

2016-01-04 Thread Mike Stump
On Jan 4, 2016, at 7:22 AM, Nathan Sidwell  wrote:
> On 01/01/16 13:13, Mike Stump wrote:
>> cilkplus fails without pthreads for me:
>> 
>> xg++: error: unrecognized command line option '-pthread'
>> compiler exited with status 1
>> output is:
>> xg++: error: unrecognized command line option '-pthread'
>> 
>> FAIL: c-c++-common/attr-simd-3.c  -std=gnu++14 PR68158 (test for errors, 
>> line 5)
>> 
>> I suspect pthreads is a fairly hard requirement.  Either a test compile and 
>> link needs to be done, or we need to be able to whack out the tests on 
>> non-pthread systems.
>> 
>> Ok?
> 
> Probably not.  See  the discussion at 
> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01882.html  Admittedly, that 
> was annotating the test directly,  but Rainer's comment suggests to me that 
> requiring pthreads would be too great a hammer.
> 
> You don't say what target -- is it a system where a target triplet is 
> insufficient for this check?

That was on purpose.  All non-pthreads targets.  One cannot ascertain if a 
system has pthreads by checking a target triplet.  This is the problem I want 
fixed.  Adding a clause for one such target doesn’t fix all such targets.  I 
didn’t read Rainer’s comments as authoritative for the design of cilk.  I also 
don’t read them as inconsistent with my proposed patch.

Since Bernd OK'd it, and that is consistent with the apparent design to me, I’m 
going with his approval.  Here is my take: the runtime is written to require 
pthreads; that’s just how it is.  Since it is, the testing for it is going to 
require pthreads.  That’s just how it is.  We gate off all tests that require 
cilk on systems that don’t have pthreads.  Special escapes from the general 
rule can happen before or after the newly added clause on a per target or some 
other metric.

The next proposed patch is:

Index: target-supports.exp
===
--- target-supports.exp (revision 232062)
+++ target-supports.exp (working copy)
@@ -1442,11 +1442,6 @@ proc check_effective_target_cilkplus { }
return 0;
 }
 
-# No pthreads on NVPTX
-if { [istarget nvptx-*-*] } {
-   return 0;
-}
-
 if { ! [check_effective_target_pthread] } {
return 0;
 }

I believe this is now neither required nor desirable.  The attr-simd-3.c test case 
on NVPTX should be able to show if this is on the right track.

Ok?

Re: cilkplus fails without pthreads for me

2016-01-04 Thread Mike Stump
On Jan 4, 2016, at 9:09 AM, Nathan Sidwell  wrote:
> On 01/04/16 10:06, Bernd Schmidt wrote:
>> On 01/01/2016 07:13 PM, Mike Stump wrote:
>>> cilkplus fails without pthreads for me:
>>> 
>>> xg++: error: unrecognized command line option '-pthread' compiler
>>> exited with status 1 output is: xg++: error: unrecognized command
>>> line option '-pthread'
>> 
>> > @@ -1450,6 +1450,10 @@ proc check_effective_target_cilkplus { } {
>> >  return 0;
>> >   }
>> >
>> > +if { ! [check_effective_target_pthread] } {
>> > +   return 0;
>> > +}
>> > +
>> 
>> I think you'll also want to revert Nathan's earlier change that adds just 
>> nvptx
>> for the same reason. Ok with that change.
> 
> Yes please.

I believe that patch has:

+/* { dg-do compile { target cilkplus } } */

in it, and this I believe is required for the test to be skipped on my target?

Re: cilkplus fails without pthreads for me

2016-01-04 Thread Nathan Sidwell

On 01/04/16 14:19, Mike Stump wrote:


I believe that patch has:

+/* { dg-do compile { target cilkplus } } */

in it, and this I believe is required for the test to be skipped on my target?


that bit is still necessary.  It's the bit in the .exp file testing nvptx-*-* 
that's no longer needed.


nathan


Re: cilkplus fails without pthreads for me

2016-01-04 Thread Nathan Sidwell

On 01/04/16 14:17, Mike Stump wrote:


The next proposed patch is:

Index: target-supports.exp
===
--- target-supports.exp (revision 232062)
+++ target-supports.exp (working copy)
@@ -1442,11 +1442,6 @@ proc check_effective_target_cilkplus { }
return 0;
  }

-# No pthreads on NVPTX
-if { [istarget nvptx-*-*] } {
-   return 0;
-}
-
  if { ! [check_effective_target_pthread] } {
return 0;
  }

I believe this is now neither required nor desirable.  The attr-simd-3.c test case 
on NVPTX should be able to show if this is on the right track.

Ok?


works for me, thanks.



[PATCH] Fix SLP ICE (PR tree-optimization/69083)

2016-01-04 Thread Jakub Jelinek
Hi!

The vec-cmp SLP patch added
+ if (VECTOR_BOOLEAN_TYPE_P (vector_type))
+   {
+ /* Can't use VIEW_CONVERT_EXPR for booleans because
+of possibly different sizes of scalar value and
+vector element.  */
...
+   }
hunk a few lines above this spot, but that only handles constants.
For non-constants the problem is similar: the boolean vector element type might
have a different size from the op's type, but op really should be fold
convertible to it, so while we can't use VCE, we can use a NOP_EXPR
instead.
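
As a rough analogy outside GCC (this is not GCC code; `reinterpret_bits` and `convert_value` are hypothetical names), a same-size memcpy can stand in for VIEW_CONVERT_EXPR and an ordinary cast for NOP_EXPR.  The sketch only illustrates why reinterpreting bits needs matching sizes while a value conversion does not:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Analogue of VIEW_CONVERT_EXPR: reinterpret the bits of F as an integer.
   This only makes sense when source and destination have the same size.  */
int32_t
reinterpret_bits (float f)
{
  int32_t out;
  assert (sizeof out == sizeof f);
  memcpy (&out, &f, sizeof out);
  return out;
}

/* Analogue of NOP_EXPR: convert the value of B, which is well defined even
   though _Bool and int32_t differ in size.  */
int32_t
convert_value (_Bool b)
{
  return (int32_t) b;
}
```

A one-byte boolean cannot be passed through the first function without changing its size first, whereas the second accepts it directly.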

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-01-04  Jakub Jelinek  

PR tree-optimization/69083
* tree-vect-slp.c (vect_get_constant_vectors): For
VECTOR_BOOLEAN_TYPE_P assert op is fold_convertible_p to vector_type's
element type.  If op is fold_convertible_p to vector_type's element
type, use NOP_EXPR instead of VCE.

* gcc.dg/vect/pr69083.c: New test.

--- gcc/tree-vect-slp.c.jj  2015-12-18 09:38:27.0 +0100
+++ gcc/tree-vect-slp.c 2016-01-04 12:56:20.800412147 +0100
@@ -2967,9 +2967,22 @@ vect_get_constant_vectors (tree op, slp_
{
  tree new_temp = make_ssa_name (TREE_TYPE (vector_type));
  gimple *init_stmt;
- op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (vector_type), op);
- init_stmt
-   = gimple_build_assign (new_temp, VIEW_CONVERT_EXPR, op);
+ if (VECTOR_BOOLEAN_TYPE_P (vector_type))
+   {
+ gcc_assert (fold_convertible_p (TREE_TYPE (vector_type),
+ op));
+ init_stmt = gimple_build_assign (new_temp, NOP_EXPR, op);
+   }
+ else if (fold_convertible_p (TREE_TYPE (vector_type), op))
+   init_stmt = gimple_build_assign (new_temp, NOP_EXPR, op);
+ else
+   {
+ op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (vector_type),
+  op);
+ init_stmt
+   = gimple_build_assign (new_temp, VIEW_CONVERT_EXPR,
+  op);
+   }
  gimple_seq_add_stmt (&ctor_seq, init_stmt);
  op = new_temp;
}
--- gcc/testsuite/gcc.dg/vect/pr69083.c.jj  2016-01-04 13:11:51.958279240 +0100
+++ gcc/testsuite/gcc.dg/vect/pr69083.c 2016-01-04 13:12:36.142663787 +0100
@@ -0,0 +1,20 @@
+/* PR tree-optimization/69083 */
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+int d;
+short f;
+
+void
+foo (int a, int b, int e, short c)
+{
+  for (; e; e++)
+{
+  int j;
+  for (j = 0; j < 3; j++)
+   {
+ f = 7 >> b ? a : b;
+ d |= c == 1 ^ 1 == f;
+   }
+}
+}

Jakub


Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint

2016-01-04 Thread Uros Bizjak
On Mon, Jan 4, 2016 at 4:11 AM, H.J. Lu  wrote:
> On Sat, Jan 2, 2016 at 10:26 AM, H.J. Lu  wrote:
>> On Sat, Jan 2, 2016 at 3:58 AM, Richard Biener
>>  wrote:
>>> On January 2, 2016 11:32:33 AM GMT+01:00, Uros Bizjak  
>>> wrote:
On Thu, Dec 31, 2015 at 4:29 PM, H.J. Lu  wrote:
> On Thu, Dec 31, 2015 at 1:14 AM, Uros Bizjak 
wrote:
>> On Wed, Dec 30, 2015 at 9:53 PM, H.J. Lu 
wrote:
>>> SSE vector arithmetic and logic instructions only accept aligned
memory
>>> operand.  This patch adds vector_memory_operand and "Bm" constraint
for
>>> aligned SSE memory operand.  They are applied to SSE any_logic
patterns.
>>>
>>> OK for trunk and release branches if there are no regressions?
>>
>> This patch is just papering over deeper problem, as Jakub said in
the PR [1]:
>>
>> --q--
>> GCC uses the ix86_legitimate_combined_insn target hook to disallow
>> misaligned memory into certain SSE instructions.
>> (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset
&)FeatureEntry_21 + 8] ]) 0)
>> is not misaligned memory, it is a subreg of a pseudo register, so it
is fine.
>> If the replacement of the pseudo register with memory happens in
some
>> other pass, then it probably either should use the
>> legitimate_combined_insn target hook or some other one.  I think we
>> have already a PR where that happens during live range shrinking.
>> --/q--
>>
>> Please figure out where memory replacement happens. There are
several
>> other SSE insns (please grep the .md for "ssememalign" attribute)
that
>> are affected by this problem, so fixing a couple of patterns won't
>> solve the problem completely.
>
> LRA turns
>
> insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ])
> (xor:V4SI (reg:V4SI 149)
> (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset
> &)FeatureEntry_2(D)] ]) 0))) foo.ii:26 3454 {*xorv4si3}
>  (expr_list:REG_DEAD (reg:V4SI 149)
> (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset
> &)FeatureEntry_2(D)] ])
> (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20
frame)
> (const_int -16 [0xfff0])) [3
> MEM[(unsigned int *)&D.2851]+0 S16 A128])
> (nil)
>
> into
>
> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ]
[148])
> (xor:V4SI (reg:V4SI 21 xmm0 [149])
> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ]
[117])
> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
> foo.ii:26 3454 {*xorv4si3}
>  (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
> (const_int -16 [0xfff0])) [3
MEM[(unsigned
> int *)&D.2851]+0 S16 A128])
> (nil)))
>
> since
>
> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117]) [6
> MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>
> satisfies the "m" constraint.  I don't think LRA should call
> ix86_legitimate_combined_insn to validate constraints on
> an instruction.

Hm...

if LRA doesn't assume that the generic "m" constraint implies at least
natural alignment of the propagated operand, then your patch is the way to
go.
>>>
>>> I don't think it even considers alignment. Archs where alignment validity 
>>> depends on the actual instruction should model this with proper constraints.
>>>
>>> But in this case, *every* SSE vector memory constraint should be
changed to Bm.
>>>
>>> I'd say so ...
>>
>> The "Bm" constraint should be applied only to non-move SSE
>> instructions with 16-byte memory operand.
>>
>
> Here are 3 patches which implement it.  There is one exception
> on SSE *mov_internal.  With Bm, LRA will crash, which
> may be an LRA bug.   I used m as workaround.
>
> Tested on x86-64 without regressions.  OK for trunk?

Looking at the comment in Patch 3, I'd say let's keep
*mov_internal constraints unchanged. But it looks to me that we
have to finally relax

  if ((TARGET_AVX || TARGET_IAMCU)
  && (misaligned_operand (operands[0], mode)
  || misaligned_operand (operands[1], mode)))

condition to allow unaligned moves for all targets, not only AVX and
IAMCU. The rationale for this decision is that if the RA won't be able
to satisfy Bm constraint, it can load the value into XMM register.
This will be done through SSE *mov internal, so unaligned move
has to be generated.

But please, double check the changes. In Patch 2, I have found:

@ -2041,10 +2041,10 @@
(set_attr "mode" "")])

 (define_insn "*ieee_smax3"
-  [(set (match_operand:VF 0 "register_operand" "=v,v")
+  [(set (match_operand:VF 0 "register_operand" "=x,v")
 (unspec:VF
   [(match_operand:VF 1 "register_operand" "0,v")
-   (match_operand:VF 2 "nonimmediat

Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint

2016-01-04 Thread H.J. Lu
On Mon, Jan 4, 2016 at 12:19 PM, Uros Bizjak  wrote:
> On Mon, Jan 4, 2016 at 4:11 AM, H.J. Lu  wrote:
>> On Sat, Jan 2, 2016 at 10:26 AM, H.J. Lu  wrote:
>>> On Sat, Jan 2, 2016 at 3:58 AM, Richard Biener
>>>  wrote:
 On January 2, 2016 11:32:33 AM GMT+01:00, Uros Bizjak  
 wrote:
>On Thu, Dec 31, 2015 at 4:29 PM, H.J. Lu  wrote:
>> On Thu, Dec 31, 2015 at 1:14 AM, Uros Bizjak 
>wrote:
>>> On Wed, Dec 30, 2015 at 9:53 PM, H.J. Lu 
>wrote:
 SSE vector arithmetic and logic instructions only accept aligned
>memory
 operand.  This patch adds vector_memory_operand and "Bm" constraint
>for
 aligned SSE memory operand.  They are applied to SSE any_logic
>patterns.

 OK for trunk and release branches if there are no regressions?
>>>
>>> This patch is just papering over deeper problem, as Jakub said in
>the PR [1]:
>>>
>>> --q--
>>> GCC uses the ix86_legitimate_combined_insn target hook to disallow
>>> misaligned memory into certain SSE instructions.
>>> (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset
>&)FeatureEntry_21 + 8] ]) 0)
>>> is not misaligned memory, it is a subreg of a pseudo register, so it
>is fine.
>>> If the replacement of the pseudo register with memory happens in
>some
>>> other pass, then it probably either should use the
>>> legitimate_combined_insn target hook or some other one.  I think we
>>> have already a PR where that happens during live range shrinking.
>>> --/q--
>>>
>>> Please figure out where memory replacement happens. There are
>several
>>> other SSE insns (please grep the .md for "ssememalign" attribute)
>that
>>> are affected by this problem, so fixing a couple of patterns won't
>>> solve the problem completely.
>>
>> LRA turns
>>
>> insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ])
>> (xor:V4SI (reg:V4SI 149)
>> (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset
>> &)FeatureEntry_2(D)] ]) 0))) foo.ii:26 3454 {*xorv4si3}
>>  (expr_list:REG_DEAD (reg:V4SI 149)
>> (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset
>> &)FeatureEntry_2(D)] ])
>> (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20
>frame)
>> (const_int -16 [0xfff0])) [3
>> MEM[(unsigned int *)&D.2851]+0 S16 A128])
>> (nil)
>>
>> into
>>
>> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ]
>[148])
>> (xor:V4SI (reg:V4SI 21 xmm0 [149])
>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ]
>[117])
>> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>> foo.ii:26 3454 {*xorv4si3}
>>  (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>> (const_int -16 [0xfff0])) [3
>MEM[(unsigned
>> int *)&D.2851]+0 S16 A128])
>> (nil)))
>>
>> since
>>
>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117]) [6
>> MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>>
>> satisfies the "m" constraint.  I don't think LRA should call
>> ix86_legitimate_combined_insn to validate constraints on
>> an instruction.
>
>Hm...
>
>if LRA doesn't assume that the generic "m" constraint implies at least
>natural alignment of the propagated operand, then your patch is the way to
>go.

 I don't think it even considers alignment. Archs where alignment validity 
 depends on the actual instruction should model this with proper 
 constraints.

 But in this case, *every* SSE vector memory constraint should be
>changed to Bm.

 I'd say so ...
>>>
>>> The "Bm" constraint should be applied only to non-move SSE
>>> instructions with 16-byte memory operand.
>>>
>>
>> Here are 3 patches which implement it.  There is one exception
>> on SSE *mov_internal.  With Bm, LRA will crash, which
>> may be an LRA bug.   I used m as workaround.
>>
>> Tested on x86-64 without regressions.  OK for trunk?
>
> Looking at the comment in Patch 3, I'd say let's keep
> *mov_internal constraints unchanged. But it looks to me that we
> have to finally relax
>
>   if ((TARGET_AVX || TARGET_IAMCU)
>   && (misaligned_operand (operands[0], mode)
>   || misaligned_operand (operands[1], mode)))
>
> condition to allow unaligned moves for all targets, not only AVX and
> IAMCU. The rationale for this decision is that if the RA won't be able
> to satisfy Bm constraint, it can load the value into XMM register.
> This will be done through SSE *mov internal, so unaligned move
> has to be generated.
>
> But please, double check the changes. In Patch 2, I have found:
>
> @ -2041,10 +2041,10 @@
> (set_attr "mode" "")])
>
>  (define_insn "*ieee_smax3"
> -  [(set (match_op

Re: [patch] ARM FreeBSD fix bootstrap

2016-01-04 Thread Andreas Tobler

Ping :)

TIA,
Andreas

On 23.12.15 20:28, Andreas Tobler wrote:

On 23.12.15 11:22, Richard Earnshaw (lists) wrote:

On 22/12/15 19:53, Andreas Tobler wrote:

Hi all,

the commit for PR68617 broke bootstrap on armv6*-*-freebsd*.

We still have unaligned_access = 0 on armv6 here on FreeBSD.

The commit from the above PR overrides my SUBTARGET_OVERRIDE_OPTIONS I
called in arm_option_override. And it sets the unaligned_access to 1.

The attached patch fixes this, bootstrap ongoing but passed the breaking
stage where genmddeps bus errored.

Is this patch ok for trunk once bootstrap completes?

TIA,
Andreas

2015-12-22  Andreas Tobler  

  * config/arm/freebsd.h (SUBTARGET_OVERRIDE_OPTIONS): Adjust to
  check unaligned_access on the gcc_options set.
  * config/arm/arm.c (arm_option_override): Move
  SUBTARGET_OVERRIDE_OPTIONS from here to
  (arm_option_override_internal).



Moving this hunk to a different place potentially affects VXWORKS (the
only other target that uses this hook).  I'd like to see confirmation
from the VxWorks maintainers (Nathan?) that this doesn't cause any
problems for them.  If it does, then I think you need to create a new
subtarget hook (SUBTARGET_OVERRIDE_INTERNAL_OPTIONS?) and change FreeBSD
to use that rather than the existing hook.


I noticed this morning that VxWorks might be affected. To be on the safe
side I'd like to propose the attached version since it makes clear where
the override belongs to and I don't think hijacking
SUBTARGET_OVERRIDE_OPTIONS is a good idea here.
I need the override in the arm_option_override_internal function after
the default has been set.

What do you think?

Thanks,

Andreas

2015-12-23  Andreas Tobler  

* config/arm/freebsd.h: Rename SUBTARGET_OVERRIDE_OPTIONS to
SUBTARGET_OVERRIDE_INTERNAL_OPTIONS. Adjust to check
unaligned_access on the gcc_options set.
* config/arm/arm.c (arm_option_override_internal): Use
SUBTARGET_OVERRIDE_INTERNAL_OPTIONS.
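
To illustrate why the placement of the hook matters, here is a toy sketch, not the actual arm.c or freebsd.h code.  `struct sketch_options`, the `SKETCH_` macro and the function name are all hypothetical, and the "ARMv6 and later" default is a simplifying assumption:

```c
#include <assert.h>

/* Toy stand-in for the one field of struct gcc_options that matters here.  */
struct sketch_options
{
  int arch_version;      /* for example 6 for ARMv6 */
  int unaligned_access;  /* 0 = off, 1 = on */
};

/* Hypothetical subtarget hook, defined here the way a subtarget without
   unaligned access support might define it.  */
#define SKETCH_SUBTARGET_OVERRIDE_INTERNAL_OPTIONS(opts) \
  do { (opts)->unaligned_access = 0; } while (0)

void
sketch_option_override_internal (struct sketch_options *opts)
{
  assert (opts != 0);

  /* Generic default (simplified assumption): on from ARMv6 onwards.  */
  opts->unaligned_access = opts->arch_version >= 6;

  /* The subtarget gets the last word, after the default has been set.  */
  SKETCH_SUBTARGET_OVERRIDE_INTERNAL_OPTIONS (opts);
}
```

If the hook ran in an earlier function instead, the default assignment above would overwrite whatever the subtarget had chosen, which is the failure described in this thread.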





Re: [PATCH], PowerPC IEEE 128-bit fp, #11 (enable libgcc conversions)

2016-01-04 Thread Michael Meissner
On Thu, Dec 31, 2015 at 08:29:58PM +, Joseph Myers wrote:
> On Tue, 29 Dec 2015, Michael Meissner wrote:
> 
> > +/* __eqkf2 returns 0 if equal, or 1 if not equal or NaN.  */
> > +CMPtype
> > +__eqkf2_hw (TFtype a, TFtype b)
> > +{
> > +  return (__builtin_isunordered (a, b) || (a != b)) ? 1 : 0;
> 
> This is more complicated than necessary.  "return a != b;" will suffice.

Ok.  I will change this.

> > +/* __gekf2 returns -1 if a < b, 0 if a == b, +1 if a > b, or -2 if NaN.  */
> > +CMPtype
> > +__gekf2_hw (TFtype a, TFtype b)
> > +{
> > +  if (__builtin_isunordered (a, b))
> > +return -2;
> > +
> > +  else if (a < b)
> > +return -1;
> 
> The __builtin_isunordered check should come after the < check, so that the 
> "invalid" exception gets raised for quiet NaN arguments.
> 
> > +/* __lekf2 returns -1 if a < b, 0 if a == b, +1 if a > b, or +2 if NaN.  */
> > +CMPtype
> > +__lekf2_hw (TFtype a, TFtype b)
> > +{
> > +  if (__builtin_isunordered (a, b))
> > +return 2;
> > +
> > +  else if (a < b)
> > +return -1;
> 
> Likewise.

Ok.  I will change these.
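
A minimal sketch of the two suggestions above, assuming `double` stands in for TFtype and `int` for CMPtype; the `sketch_` names are hypothetical and this is not the final libgcc code:

```c
#include <assert.h>
#include <math.h>

typedef double sketch_fp;   /* stand-in for TFtype */
typedef int sketch_cmp;     /* stand-in for CMPtype */

/* Returns 0 if equal, 1 if not equal or unordered.  The != operator is
   already true when either operand is a NaN.  */
sketch_cmp
sketch_eq (sketch_fp a, sketch_fp b)
{
  return a != b;
}

/* Returns -1 if a < b, 0 if a == b, +1 if a > b, or -2 if unordered.
   The ordered < comparison comes first, so it is the operation that sees
   NaN operands before the quiet isunordered test does.  */
sketch_cmp
sketch_ge (sketch_fp a, sketch_fp b)
{
  if (a < b)
    return -1;
  else if (isunordered (a, b))
    return -2;
  else if (a == b)
    return 0;

  /* Only the greater-than case is left at this point.  */
  assert (a > b);
  return 1;
}
```

The __lekf2 variant would differ only in returning +2 for the unordered case.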

> > +  char *p = (char *) getauxval (AT_PLATFORM);
> 
> glibc deliberately exports __getauxval at a public symbol version, so you 
> can do this in a namespace-clean way.

Ok.  I will change this.  The getauxval call by the way is only a temporary
measure until the support for __builtin_cpu_supports is added to the PowerPC.

> > +CMPtype __eqkf2 (TFtype, TFtype)
> > +  __attribute__ ((__ifunc__ ("__eqkf2_resolve")));
> > +
> > +CMPtype __gekf2 (TFtype, TFtype)
> > +  __attribute__ ((__ifunc__ ("__gekf2_resolve")));
> > +
> > +CMPtype __lekf2 (TFtype, TFtype)
> > +  __attribute__ ((__ifunc__ ("__lekf2_resolve")));
> 
> Don't you need to arrange __nekf2, __gtkf2, __ltkf2 aliases to these 
> resolvers (the semantics mean they don't need to be separate functions, 
> but the entry points need to be there given the optabs the back end sets 
> up)?

Because of default conversions we cannot allow the normal optab mechanism to be
used for IEEE 128-bit floating point emulation.  This is due to the fact that
if you have a __float128 comparison, the compiler will see if a larger type can
do the comparison, and in this case, the larger type is TFmode (i.e. IBM
extended double using the current defaults).

Instead rs6000_generate_compare generates the calls, and it does not use
the alternate names.  I can easily put in the resolver calls as well for the
alternate names just in case somebody hand crafts a call to __nekf2.

> 
> > +#ifdef _ARCH_PPC64
> > +TItype_ppc __fixkfti (TFtype)
> > +  __attribute__ ((__ifunc__ ("__fixkfti_resolve")));
> > +
> > +UTItype_ppc __fixunskfti (TFtype)
> > +  __attribute__ ((__ifunc__ ("__fixunskfti_resolve")));
> > +
> > +TFtype __floattikf (TItype_ppc)
> > +  __attribute__ ((__ifunc__ ("__floattikf_resolve")));
> > +
> > +TFtype __floatuntikf (UTItype_ppc)
> > +  __attribute__ ((__ifunc__ ("__floatuntikf_resolve")));
> > +#endif
> 
> I don't see the point of using ifuncs that just always return the software 
> version.  You might as well just give the software version the appropriate 
> function name directly, and add ifuncs later if adding a version using 
> hardware arithmetic (e.g. doing something like the libgcc2.c functions 
> with hardware conversions to/from DImode).

I'll think about it.  At some point, I was hoping to have implementations for
ISA 3.0.  However, there is not an ISA 3.0 instruction that converts from
128-bit integer to 128-bit floating point or vice versa.

> 
> > +#define ISA_BIT(x) (1 << (63 - x))
> 
> As far as I can see, my previous comment still applies: this part of the 
> sfp-machine.h changes needs to be under some appropriate conditional so 
> that it only applies when building the KFmode functions, not for 32-bit 
> soft-float / e500 libgcc builds.

Agreed.  I will fix this.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



[PATCH] document -Winvalid-memory-model

2016-01-04 Thread Martin Sebor

As discussed in c/69104, the -Winvalid-memory-model option is
not documented in the manual.  The attached patch rectifies that.

Martin
gcc/ChangeLog:
2016-01-04  Martin Sebor  

	* doc/invoke.texi (Warning Options): Document -Winvalid-memory-model.

Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 232047)
+++ doc/invoke.texi	(working copy)
@@ -263,7 +263,8 @@
 -Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol
 -Winvalid-pch -Wlarger-than=@var{len} @gol
 -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol
--Wmain -Wmaybe-uninitialized -Wmemset-transposed-args @gol
+-Wmain -Wmaybe-uninitialized -Winvalid-memory-model @gol
+-Wmemset-transposed-args @gol
 -Wmisleading-indentation -Wmissing-braces @gol
 -Wmissing-field-initializers -Wmissing-include-dirs @gol
 -Wno-multichar  -Wnonnull  -Wnormalized=@r{[}none@r{|}id@r{|}nfc@r{|}nfkc@r{]} @gol
@@ -4305,6 +4306,26 @@
 computations may be deleted by data flow analysis before the warnings
 are printed.
 
+@item -Winvalid-memory-model
+@opindex Winvalid-memory-model
+@opindex Wno-invalid-memory-model
+Warn for invocations of @ref{__atomic Builtins}, @ref{__sync Builtins},
+and the C11 atomic generic functions with a memory consistency argument
+that is either invalid for the operation or outside the range of values
+of the @code{memory_order} enumeration.  For example, since the
+@code{__atomic_store} and @code{__atomic_store_n} built-ins are only
+defined for the relaxed, release, and sequentially consistent memory
+orders, the following code is diagnosed:
+
+@smallexample
+void store (int *i)
+@{
+  __atomic_store_n (i, 0, memory_order_consume);
+@}
+@end smallexample
+
+@option{-Winvalid-memory-model} is enabled by default.
+
 @item -Wmaybe-uninitialized
 @opindex Wmaybe-uninitialized
 @opindex Wno-maybe-uninitialized

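For contrast with the diagnosed call in the proposed manual text, here is a minimal sketch of a store and a load that use memory orders valid for each operation.  It assumes GCC's predefined `__ATOMIC_*` macros; the function names are hypothetical:

```c
#include <assert.h>

/* A store may use the relaxed, release, or sequentially consistent order.  */
void
store_release (int *p, int value)
{
  assert (p != 0);
  __atomic_store_n (p, value, __ATOMIC_RELEASE);
}

/* A load may use the relaxed, consume, acquire, or sequentially
   consistent order.  */
int
load_acquire (int *p)
{
  assert (p != 0);
  return __atomic_load_n (p, __ATOMIC_ACQUIRE);
}
```

Neither call should be diagnosed, since each order is valid for its operation.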

[PATCH] libiberty: support demangling of rvalue reference typenames

2016-01-04 Thread Artemiy Volkov
This patch adds handling of 'O' (rvalue ref) type codes in the C++ demangling
code which is done similarly to the 'R' (regular references) case. It also adds
a few testcases for various demangling styles which are just mirrored versions
of the corresponding regular references demangling tests.
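
The idea can be sketched independently of libiberty.  The helpers below are hypothetical and are not the cplus-dem.c code; they only show the mapping from a type code to the declarator text that ends up in the demangled name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of libiberty: return the declarator text
   for a type code, or NULL if the code is not a pointer or reference.  */
const char *
sketch_declarator_for_code (char code)
{
  switch (code)
    {
    case 'P':
      return "*";
    case 'R':
      return "&";
    case 'O':
      return "&&";
    default:
      return NULL;
    }
}

/* Format "BASE DECLARATOR" into BUF.  Returns the length snprintf reports,
   or -1 if CODE has no declarator.  */
int
sketch_format_param (char code, const char *base, char *buf, size_t len)
{
  const char *decl = sketch_declarator_for_code (code);

  assert (base != NULL && buf != NULL);
  if (decl == NULL)
    return -1;
  return snprintf (buf, len, "%s %s", base, decl);
}
```

With code 'O' and base "BoxObj" this produces "BoxObj &&", the form used in the new expected outputs.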

libiberty/ChangeLog:

2016-01-04  Artemiy Volkov  

* cplus-dem.c (enum type_kind_t): Add tk_rvalue_reference
constant.
(demangle_template_value_parm): Handle tk_rvalue_reference
type kind.
(do_type): Support 'O' type id (rvalue references).

* testsuite/demangle-expected: Add tests.
---
 libiberty/cplus-dem.c |  13 +++-
 libiberty/testsuite/demangle-expected | 115 ++
 2 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/libiberty/cplus-dem.c b/libiberty/cplus-dem.c
index c68b981..122f05c 100644
--- a/libiberty/cplus-dem.c
+++ b/libiberty/cplus-dem.c
@@ -237,6 +237,7 @@ typedef enum type_kind_t
   tk_none,
   tk_pointer,
   tk_reference,
+  tk_rvalue_reference,
   tk_integral,
   tk_bool,
   tk_char,
@@ -2033,7 +2034,8 @@ demangle_template_value_parm (struct work_stuff *work, const char **mangled,
 }
   else if (tk == tk_real)
 success = demangle_real_value (work, mangled, s);
-  else if (tk == tk_pointer || tk == tk_reference)
+  else if (tk == tk_pointer || tk == tk_reference
+   || tk == tk_rvalue_reference)
 {
   if (**mangled == 'Q')
success = demangle_qualified (work, mangled, s,
@@ -3574,6 +3576,14 @@ do_type (struct work_stuff *work, const char **mangled, string *result)
tk = tk_reference;
  break;
 
+  /* An rvalue reference type */
+   case 'O':
+  (*mangled)++;
+  string_prepend (&decl, "&&");
+  if (tk == tk_none)
+tk = tk_rvalue_reference;
+  break;
+
  /* An array */
case 'A':
  {
@@ -3631,7 +3641,6 @@ do_type (struct work_stuff *work, const char **mangled, string *result)
  break;
 
case 'M':
-   case 'O':
  {
type_quals = TYPE_UNQUALIFIED;
 
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index aebf01b..f947de7 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -31,6 +31,11 @@ ArrowLine::ArrowheadIntersects(Arrowhead *, BoxObj &, Graphic *)
 ArrowLine::ArrowheadIntersects
 #
 --format=gnu --no-params
+ArrowheadIntersects__9ArrowLineP9ArrowheadO6BoxObjP7Graphic
+ArrowLine::ArrowheadIntersects(Arrowhead *, BoxObj &&, Graphic *)
+ArrowLine::ArrowheadIntersects
+#
+--format=gnu --no-params
 AtEnd__13ivRubberGroup
 ivRubberGroup::AtEnd(void)
 ivRubberGroup::AtEnd
@@ -51,6 +56,11 @@ TextCode::CoreConstDecls(ostream &)
 TextCode::CoreConstDecls
 #
 --format=gnu --no-params
+CoreConstDecls__8TextCodeO7ostream
+TextCode::CoreConstDecls(ostream &&)
+TextCode::CoreConstDecls
+#
+--format=gnu --no-params
 Detach__8StateVarP12StateVarView
 StateVar::Detach(StateVarView *)
 StateVar::Detach
@@ -66,21 +76,41 @@ RelateManip::Effect(ivEvent &)
 RelateManip::Effect
 #
 --format=gnu --no-params
+Effect__11RelateManipO7ivEvent
+RelateManip::Effect(ivEvent &&)
+RelateManip::Effect
+#
+--format=gnu --no-params
 FindFixed__FRP4CNetP4CNet
 FindFixed(CNet *&, CNet *)
 FindFixed
 #
 --format=gnu --no-params
+FindFixed__FOP4CNetP4CNet
+FindFixed(CNet *&&, CNet *)
+FindFixed
+#
+--format=gnu --no-params
 Fix48_abort__FR8twolongs
 Fix48_abort(twolongs &)
 Fix48_abort
 #
 --format=gnu --no-params
+Fix48_abort__FO8twolongs
+Fix48_abort(twolongs &&)
+Fix48_abort
+#
+--format=gnu --no-params
 GetBarInfo__15iv2_6_VScrollerP13ivPerspectiveRiT2
 iv2_6_VScroller::GetBarInfo(ivPerspective *, int &, int &)
 iv2_6_VScroller::GetBarInfo
 #
 --format=gnu --no-params
+GetBarInfo__15iv2_6_VScrollerP13ivPerspectiveOiT2
+iv2_6_VScroller::GetBarInfo(ivPerspective *, int &&, int &&)
+iv2_6_VScroller::GetBarInfo
+#
+--format=gnu --no-params
 GetBgColor__C9ivPainter
 ivPainter::GetBgColor(void) const
 ivPainter::GetBgColor
@@ -986,11 +1016,21 @@ List::Pix::Pix(List::Pix const &)
 List::Pix::Pix
 #
 --format=gnu --no-params
+__Q2t4List1Z10VHDLEntity3PixOCQ2t4List1Z10VHDLEntity3Pix
+List::Pix::Pix(List::Pix const &&)
+List::Pix::Pix
+#
+--format=gnu --no-params
 __Q2t4List1Z10VHDLEntity7elementRC10VHDLEntityPT0
 List::element::element(VHDLEntity const &, 
List::element *)
 List::element::element
 #
 --format=gnu --no-params
+__Q2t4List1Z10VHDLEntity7elementOC10VHDLEntityPT0
+List::element::element(VHDLEntity const &&, 
List::element *)
+List::element::element
+#
+--format=gnu --no-params
 __Q2t4List1Z10VHDLEntity7elementRCQ2t4List1Z10VHDLEntity7element
 List::element::element(List::element const &)
 List::element::element
@@ -1036,6 +1076,11 @@ PixX 
>::PixX(PixX >::PixX
 #
 --format=gnu --no-params
+__t4PixX3Z11VHDLLibraryZ14VHDLLibraryRepZt4List1Z10VHDLEntityOCt4PixX3Z11VHDLLibrary

Re: [PATCH] document -Winvalid-memory-model

2016-01-04 Thread Sandra Loosemore

On 01/04/2016 03:17 PM, Martin Sebor wrote:

As discussed in c/69104, the -Winvalid-memory-model option is
not documented in the manual.  The attached patch rectifies that.


Thanks for tackling this.


Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 232047)
+++ doc/invoke.texi (working copy)
@@ -263,7 +263,8 @@
 -Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol
 -Winvalid-pch -Wlarger-than=@var{len} @gol
 -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol
--Wmain -Wmaybe-uninitialized -Wmemset-transposed-args @gol
+-Wmain -Wmaybe-uninitialized -Winvalid-memory-model @gol
+-Wmemset-transposed-args @gol
 -Wmisleading-indentation -Wmissing-braces @gol
 -Wmissing-field-initializers -Wmissing-include-dirs @gol
 -Wno-multichar  -Wnonnull  -Wnormalized=@r{[}none@r{|}id@r{|}nfc@r{|}nfkc@r{]} 
@gol


We just had a patch a month or so ago (r231022) to sort this table into 
something approaching alphabetical order, modulo no- prefixes, I guess. 
 Can you please insert the new entry into a less random place?



@@ -4305,6 +4306,26 @@
 computations may be deleted by data flow analysis before the warnings
 are printed.

+@item -Winvalid-memory-model
+@opindex Winvalid-memory-model
+@opindex Wno-invalid-memory-model
+Warn for invocations of @ref{__atomic Builtins}, @ref{__sync Builtins},
+and the C11 atomic generic functions with a memory consistency argument
+that is either invalid for the operation or outside the range of values
+of the @code{memory_order} enumeration.  For example, since the
+@code{__atomic_store} and @code{__atomic_store_n} built-ins are only


s/built-ins/builtins/ (like in the @refs you used previously)


+defined for the relaxed, relase, and sequentially consistent memory


s/relase/release/


+orders the following code is diagnosed:
+
+@smallexample
+void store (int *i)
+@{
+  __atomic_store_n (i, 0, memory_order_consume);
+@}
+@end smallexample
+
+@option{-Winvalid-memory-model} is enabled by default.
+
 @item -Wmaybe-uninitialized
 @opindex Wmaybe-uninitialized
 @opindex Wno-maybe-uninitialized


OK with those changes.

-Sandra



Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint

2016-01-04 Thread H.J. Lu
On Mon, Jan 4, 2016 at 1:11 PM, H.J. Lu  wrote:
> On Mon, Jan 4, 2016 at 12:19 PM, Uros Bizjak  wrote:
>> On Mon, Jan 4, 2016 at 4:11 AM, H.J. Lu  wrote:
>>> On Sat, Jan 2, 2016 at 10:26 AM, H.J. Lu  wrote:
 On Sat, Jan 2, 2016 at 3:58 AM, Richard Biener
  wrote:
> On January 2, 2016 11:32:33 AM GMT+01:00, Uros Bizjak  
> wrote:
>>On Thu, Dec 31, 2015 at 4:29 PM, H.J. Lu  wrote:
>>> On Thu, Dec 31, 2015 at 1:14 AM, Uros Bizjak 
>>wrote:
 On Wed, Dec 30, 2015 at 9:53 PM, H.J. Lu 
>>wrote:
> SSE vector arithmetic and logic instructions only accept aligned
>>memory
> operand.  This patch adds vector_memory_operand and "Bm" constraint
>>for
> aligned SSE memory operand.  They are applied to SSE any_logic
>>patterns.
>
> OK for trunk and release branches if there are no regressions?

 This patch is just papering over deeper problem, as Jakub said in
>>the PR [1]:

 --q--
 GCC uses the ix86_legitimate_combined_insn target hook to disallow
 misaligned memory into certain SSE instructions.
 (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset
>>&)FeatureEntry_21 + 8] ]) 0)
 is not misaligned memory, it is a subreg of a pseudo register, so it
>>is fine.
 If the replacement of the pseudo register with memory happens in
>>some
 other pass, then it probably either should use the
 legitimate_combined_insn target hook or some other one.  I think we
 have already a PR where that happens during live range shrinking.
 --/q--

 Please figure out where memory replacement happens. There are
>>several
 other SSE insns (please grep the .md for "ssememalign" attribute)
>>that
 are affected by this problem, so fixing a couple of patterns won't
 solve the problem completely.
>>>
>>> LRA turns
>>>
>>> insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ])
>>> (xor:V4SI (reg:V4SI 149)
>>> (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset
>>> &)FeatureEntry_2(D)] ]) 0))) foo.ii:26 3454 {*xorv4si3}
>>>  (expr_list:REG_DEAD (reg:V4SI 149)
>>> (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset
>>> &)FeatureEntry_2(D)] ])
>>> (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20
>>frame)
>>> (const_int -16 [0xfff0])) [3
>>> MEM[(unsigned int *)&D.2851]+0 S16 A128])
>>> (nil)
>>>
>>> into
>>>
>>> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ]
>>[148])
>>> (xor:V4SI (reg:V4SI 21 xmm0 [149])
>>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ]
>>[117])
>>> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>>> foo.ii:26 3454 {*xorv4si3}
>>>  (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>> (const_int -16 [0xfff0])) [3
>>MEM[(unsigned
>>> int *)&D.2851]+0 S16 A128])
>>> (nil)))
>>>
>>> since
>>>
>>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117]) [6
>>> MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>>>
>>> satisfies the "m" constraint.  I don't think LRA should call
>>> ix86_legitimate_combined_insn to validate constraints on
>>> an instruction.
>>
>>Hm...
>>
>>if LRA doesn't assume that generic "m" constraint implies at least
>>natural alignment of propagated operand, then your patch is the way to
>>go.
>
> I don't think it even considers alignment. Archs where alignment validity 
> depends on the actual instruction should model this with proper 
> constraints.
>
> But in this case, *every* SSE vector memory constraint should be
>>changed to Bm.
>
> I'd say so ...

 The "Bm" constraint should be applied only to non-move SSE
 instructions with 16-byte memory operand.

>>>
>>> Here are 3 patches which implement it.  There is one exception
>>> on SSE *mov_internal.  With Bm, LRA will crash, which
>>> may be an LRA bug.   I used m as workaround.
>>>
>>> Tested on x86-64 without regressions.  OK for trunk?
>>
>> Looking at the comment in Patch 3, I'd say let's keep
>> *mov_internal constraints unchanged. But it looks to me that we
>> have to finally relax
>>
>>   if ((TARGET_AVX || TARGET_IAMCU)
>>   && (misaligned_operand (operands[0], mode)
>>   || misaligned_operand (operands[1], mode)))
>>
>> condition to allow unaligned moves for all targets, not only AVX and
>> IAMCU. The rationale for this decision is that if the RA won't be able
>> to satisfy Bm constraint, it can load the value into XMM register.
>> This will be done through SSE *mov internal, so unaligned move
>> has to be generated.
>>
>>

Re: [PATCH] document -Winvalid-memory-model

2016-01-04 Thread Martin Sebor

We just had a patch a month or so ago (r231022) to sort this table into
something approaching alphabetical order, modulo no- prefixes, I guess.
  Can you please insert the new entry into a less random place?


Sure.  It was meant to be inserted in the right place, my brain
just filtered out the "invalid" part in the name of the option.
Fixed in the updated patch.


@@ -4305,6 +4306,26 @@
 computations may be deleted by data flow analysis before the warnings
 are printed.

+@item -Winvalid-memory-model
+@opindex Winvalid-memory-model
+@opindex Wno-invalid-memory-model
+Warn for invocations of @ref{__atomic Builtins}, @ref{__sync Builtins},
+and the C11 atomic generic functions with a memory consistency argument
+that is either invalid for the operation or outside the range of values
+of the @code{memory_order} enumeration.  For example, since the
+@code{__atomic_store} and @code{__atomic_store_n} built-ins are only


s/built-ins/builtins/ (like in the @refs you used previously)


I thought the @refs were an inconsistency and built-in was
the preferred spelling.  That's what someone else pointed
out to me sometime ago and what I see documented in the GCC
Coding Conventions (and what I also noticed used elsewhere
in this section of the manual).

But looking more closely, there are quite a few uses of both
builtins and built-ins, on this manual page as well as on
others.  Which makes me wonder which of the two is prevalent.

I count 68 occurrences of the words builtin and builtins in
the manual (separated by space and ignoring capitalization)
and 481 occurrences of the words built-in and built-ins.

I also count 50 occurrences of built-in in the gcc.pot file
and 33 occurrences of builtin.

This seems to confirm my understanding of the recommended
convention (though it also shows how inconsistently it is
being followed).  Please let me know if I missed something.


+defined for the relaxed, relase, and sequentially consistent memory


s/relase/release/


Fixed, thanks.

I will go ahead and commit this version of the patch tomorrow
unless you have objections.

Martin
gcc/ChangeLog:
2016-01-04  Martin Sebor  

	* doc/invoke.texi (Warning Options): Document -Winvalid-memory-model.

Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 232047)
+++ doc/invoke.texi	(working copy)
@@ -260,7 +260,7 @@
 -Wignored-qualifiers  -Wincompatible-pointer-types @gol
 -Wimplicit  -Wimplicit-function-declaration  -Wimplicit-int @gol
 -Winit-self  -Winline  -Wno-int-conversion @gol
--Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol
+-Wno-int-to-pointer-cast -Winvalid-memory-model -Wno-invalid-offsetof @gol
 -Winvalid-pch -Wlarger-than=@var{len} @gol
 -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmemset-transposed-args @gol
@@ -4305,6 +4305,26 @@
 computations may be deleted by data flow analysis before the warnings
 are printed.
 
+@item -Winvalid-memory-model
+@opindex Winvalid-memory-model
+@opindex Wno-invalid-memory-model
+Warn for invocations of @ref{__atomic Builtins}, @ref{__sync Builtins},
+and the C11 atomic generic functions with a memory consistency argument
+that is either invalid for the operation or outside the range of values
+of the @code{memory_order} enumeration.  For example, since the
+@code{__atomic_store} and @code{__atomic_store_n} built-ins are only
+defined for the relaxed, release, and sequentially consistent memory
+orders the following code is diagnosed:
+
+@smallexample
+void store (int *i)
+@{
+  __atomic_store_n (i, 0, memory_order_consume);
+@}
+@end smallexample
+
+@option{-Winvalid-memory-model} is enabled by default.
+
 @item -Wmaybe-uninitialized
 @opindex Wmaybe-uninitialized
 @opindex Wno-maybe-uninitialized
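
For readers following the documentation text above, here is a minimal sketch of calls that use memory orders accepted for each operation. The helper names publish and observe are hypothetical and not part of the patch; the predefined __ATOMIC_* macros are used in place of the memory_order enumerators so that no header is needed.

```cpp
#include <cassert>

// Hypothetical helper: store with release ordering, one of the three
// orders (relaxed, release, sequentially consistent) the new text lists
// as defined for __atomic_store_n.
void publish(int *slot, int value) {
  assert(slot != nullptr);
  __atomic_store_n(slot, value, __ATOMIC_RELEASE);
}

// Hypothetical helper: load with acquire ordering, which is valid for
// __atomic_load_n.
int observe(int *slot) {
  assert(slot != nullptr);
  return __atomic_load_n(slot, __ATOMIC_ACQUIRE);
}

// Passing a consume order to __atomic_store_n instead, as in the manual's
// example, is the kind of call -Winvalid-memory-model is documented to
// diagnose.
```

Calling publish (&x, 7) followed by observe (&x) in a single thread returns 7.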


Re: [PATCH] document -Winvalid-memory-model

2016-01-04 Thread Sandra Loosemore

On 01/04/2016 05:15 PM, Martin Sebor wrote:


s/built-ins/builtins/ (like in the @refs you used previously)


I thought the @refs were an inconsistency and built-in was
the preferred spelling.  That's what someone else pointed
out to me sometime ago and what I see documented in the GCC
Coding Conventions (and what I also noticed used elsewhere
in this section of the manual).

But looking more closely, there are quite a few uses of both
builtins and built-ins, on this manual page as well as on
others.  Which makes me wonder which of the two is prevalent.

I count 68 occurrences of the words builtin and builtins in
the manual (separated by space and ignoring capitalization)
and 481 occurrences of the words built-in and built-ins.

I also count 50 occurrences of built-in in the gcc.pot file
and 33 occurrences of builtin.

This seems to confirm my understanding of the recommended
convention (though it also shows how inconsistently it is
being followed).  Please let me know if I missed something.


Sorry, my bad.  "Built-in", hyphenated, is correct as an adjective, as 
in "built-in function".  It's not clear what we're supposed to use as a 
noun, but it seems "builtin" isn't it, either.  :-S  I think that a 
couple years ago I changed a bunch of other instances to "built-in 
function" to avoid this trouble, but I won't insist on that here.



I will go ahead and commit this version of the patch tomorrow
unless you have objections.


Looks OK to me now.

-Sandra



Re: [PATCH] c/68966 - atomic_fetch_* on atomic_bool not diagnosed

2016-01-04 Thread Martin Sebor

On 01/04/2016 08:22 AM, Marek Polacek wrote:

Hi Martin,


...

Thanks for the careful review!

I've fixed the problems you pointed out in the attached patch.
The typos are my bad.  As for the whitespace, I have to confess
I'm finding all the rules tedious to follow without some sort
of automation.  Jason suggested some option to git but I don't
use git to commit (too many other problems).  I'm also not sure
the option makes Git replace 8 spaces with TABs.  I tried to
have my Emacs automatically strip trailing whitespace for me
but that was causing spurious changes on otherwise untouched
lines that already contain it (clearly, I'm not the only one who
struggles with whitespace).  I don't suppose everyone is
voluntarily subjecting themselves to this torture so there
must be a way to make it less onerous and painful.  What's
your secret?

Martin

gcc/ChangeLog:
2016-01-04  Martin Sebor  

	PR c/68966
	* doc/extend.texi (__atomic Builtins, __sync Builtins): Document
	constraint on the type of arguments.

gcc/c-family/ChangeLog:
2016-01-04  Martin Sebor  

	PR c/68966
	* c-common.c (sync_resolve_size): Reject first argument when it's
	a pointer to _Bool.

gcc/testsuite/ChangeLog:
2016-01-04  Martin Sebor  

	PR c/68966
	* gcc.dg/atomic-fetch-bool.c: New test.
	* gcc.dg/sync-fetch-bool.c: Same.
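
As an illustration of the constraint this patch documents, the sketch below uses a fetch-and-add on an integer object, and an exchange for a Boolean flag. The helper names bump and set_flag are hypothetical, and using an exchange for the flag is only one possible alternative, not something the patch prescribes.

```cpp
#include <cassert>

// Hypothetical helper: atomic fetch-and-add on an integer object, the
// kind of operand the new documentation text requires.  Returns the
// value held before the addition.
int bump(int *counter) {
  assert(counter != nullptr);
  return __atomic_fetch_add(counter, 1, __ATOMIC_SEQ_CST);
}

// Hypothetical helper: set a Boolean flag with an exchange rather than
// a fetch-op built-in.  Returns the previous value of the flag.
bool set_flag(bool *flag) {
  assert(flag != nullptr);
  return __atomic_exchange_n(flag, true, __ATOMIC_SEQ_CST);
}

// A call such as __atomic_fetch_or (flag, true, __ATOMIC_SEQ_CST) with a
// pointer to a Boolean is what the sync_resolve_size change is meant to
// reject.
```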

Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 232047)
+++ gcc/doc/extend.texi	(working copy)
@@ -9238,6 +9238,9 @@
 @{ tmp = *ptr; *ptr = ~(tmp & value); return tmp; @}   // nand
 @end smallexample
 
+The object pointed to by the first argument must be of integer or pointer
+type.  It must not be a Boolean type.
+
 @emph{Note:} GCC 4.4 and later implement @code{__sync_fetch_and_nand}
 as @code{*ptr = ~(tmp & value)} instead of @code{*ptr = ~tmp & value}.
 
@@ -9261,6 +9264,9 @@
 @{ *ptr = ~(*ptr & value); return *ptr; @}   // nand
 @end smallexample
 
+The same constraints on arguments apply as for the corresponding
+@code{__sync_op_and_fetch} built-in functions.
+
 @emph{Note:} GCC 4.4 and later implement @code{__sync_nand_and_fetch}
 as @code{*ptr = ~(*ptr & value)} instead of
 @code{*ptr = ~*ptr & value}.
@@ -9507,13 +9513,14 @@
 @deftypefnx {Built-in Function} @var{type} __atomic_or_fetch (@var{type} *ptr, @var{type} val, int memorder)
 @deftypefnx {Built-in Function} @var{type} __atomic_nand_fetch (@var{type} *ptr, @var{type} val, int memorder)
 These built-in functions perform the operation suggested by the name, and
-return the result of the operation. That is,
+return the result of the operation.  That is,
 
 @smallexample
 @{ *ptr @var{op}= val; return *ptr; @}
 @end smallexample
 
-All memory orders are valid.
+The object pointed to by the first argument must be of integer or pointer
+type.  It must not be a Boolean type.  All memory orders are valid.
 
 @end deftypefn
 
@@ -9530,7 +9537,8 @@
 @{ tmp = *ptr; *ptr @var{op}= val; return tmp; @}
 @end smallexample
 
-All memory orders are valid.
+The same constraints on arguments apply as for the corresponding
+@code{__atomic_op_fetch} built-in functions.  All memory orders are valid.
 
 @end deftypefn
 
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 232047)
+++ gcc/c-family/c-common.c	(working copy)
@@ -7804,7 +7804,7 @@
   else if (TYPE_P (*node))
 type = node, is_type = 1;
 
-  if ((i = check_user_alignment (align_expr, false)) == -1
+  if ((i = check_user_alignment (align_expr, true)) == -1
   || !check_cxx_fundamental_alignment_constraints (*node, i, flags))
 *no_add_attrs = true;
   else if (is_type)
@@ -10657,11 +10657,16 @@
 /* A helper function for resolve_overloaded_builtin in resolving the
overloaded __sync_ builtins.  Returns a positive power of 2 if the
first operand of PARAMS is a pointer to a supported data type.
-   Returns 0 if an error is encountered.  */
+   Returns 0 if an error is encountered.
+   FETCH is true when FUNCTION is one of the _FETCH_OP_ or _OP_FETCH_
+   built-ins.  */
 
 static int
-sync_resolve_size (tree function, vec *params)
+sync_resolve_size (tree function, vec *params, bool fetch)
 {
+  /* Type of the argument.  */
+  tree argtype;
+  /* Type the argument points to.  */
   tree type;
   int size;
 
@@ -10671,7 +10676,7 @@
   return 0;
 }
 
-  type = TREE_TYPE ((*params)[0]);
+  argtype = type = TREE_TYPE ((*params)[0]);
   if (TREE_CODE (type) == ARRAY_TYPE)
 {
   /* Force array-to-pointer decay for C++.  */
@@ -10686,12 +10691,16 @@
   if (!INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type))
 goto incompatible;
 
+  if (fetch && TREE_CODE (type) == BOOLEAN_TYPE)
+goto incompatible;
+
   size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
   if (size == 1 || size == 2 || size == 4 || size == 8 || size == 16)
 return size;
 
  incompatible:
-  error ("incompatible type for argument %d of %qE", 1, function);
+  error (

Re: [PATCH] c++/58109 - alignas() fails to compile with constant expression

2016-01-04 Thread Martin Sebor

Ping: looking for review/approval of the patch below:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02074.html

Thanks
Martin

On 12/22/2015 07:32 PM, Martin Sebor wrote:

The attached patch adds handling of dependent arguments to
attribute aligned and attribute vector_size, fixing c++/58109
and 69022 - attribute vector_size ignored with dependent bytes.

Tested on x86_64.

Martin




[PATCH, GCC] Fix PR67781: wrong code generation for partial load on big endian targets

2016-01-04 Thread Thomas Preud'homme
Hi,

The bswap optimization pass generates wrong code on big endian targets when the 
result of a bit operation it analyzed is a partial load of the range of memory 
accessed by the original expression (when one or more bytes at the lowest address 
were lost in the computation). This is due to the way cmpxchg and cmpnop are 
adjusted in find_bswap_or_nop before being compared to the result of the 
symbolic expression. Part of the adjustment is endian independent: it ignores 
the bytes that were not accessed by the original gimple expression. 
However, when the result has fewer bytes than that original expression, some 
more bytes need to be ignored and this is endian dependent.

The current code only supports loss of bytes at the highest addresses because 
there is no code to adjust the address of the load. However, for little and 
big endian targets the bytes at the highest address translate into different byte 
significance in the result. This patch first separates the cmpxchg and cmpnop 
adjustment into 2 steps and then deals with endianness correctly for the 
second step.
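
The two steps can be sketched in isolation as follows. This is a standalone illustration, not the code from the patch: the function name adjust_patterns, the Patterns struct, the 8-bit marker width and the two starting constants are assumptions made for the example, and the target endianness is passed as a plain parameter.

```cpp
#include <cassert>
#include <cstdint>

// Assumed for this sketch: 8-bit markers and these two starting patterns.
constexpr unsigned kBitsPerMarker = 8;
constexpr std::uint64_t kCmpNop = 0x0807060504030201ULL;
constexpr std::uint64_t kCmpXchg = 0x0102030405060708ULL;

struct Patterns {
  std::uint64_t cmpxchg;
  std::uint64_t cmpnop;
};

// range: number of bytes accessed by the original expression (1..8).
// rsize: number of bytes present in the final result (1..range).
Patterns adjust_patterns(unsigned range, unsigned rsize, bool big_endian) {
  assert(rsize >= 1 && rsize <= range && range <= 8);
  std::uint64_t cmpxchg = kCmpXchg;
  std::uint64_t cmpnop = kCmpNop;

  // Step 1, endian independent: drop the markers of bytes the original
  // expression never accessed.
  if (range < 8) {
    std::uint64_t mask = (std::uint64_t{1} << (range * kBitsPerMarker)) - 1;
    cmpxchg >>= (8 - range) * kBitsPerMarker;
    cmpnop &= mask;
  }

  // Step 2, endian dependent: drop the markers of bytes missing from the
  // result.  Which pattern is masked and which is shifted depends on the
  // target endianness.
  if (rsize < range) {
    std::uint64_t mask = (std::uint64_t{1} << (rsize * kBitsPerMarker)) - 1;
    if (big_endian) {
      cmpxchg &= mask;
      cmpnop >>= (range - rsize) * kBitsPerMarker;
    } else {
      cmpxchg >>= (range - rsize) * kBitsPerMarker;
      cmpnop &= mask;
    }
  }
  return {cmpxchg, cmpnop};
}
```

With a 5-byte range and a 4-byte result, the big endian branch gives cmpxchg 0x02030405 and cmpnop 0x05040302, while the little endian branch gives 0x01020304 and 0x04030201.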

ChangeLog entries are as follows:


*** gcc/ChangeLog ***

2015-12-16  Thomas Preud'homme  

PR tree-optimization/67781
* tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in cmpxchg
and cmpnop in two steps: first the ones not accessed in original
gimple expression in an endian-independent way and then the ones not
accessed in the final result in an endian-specific way.


*** gcc/testsuite/ChangeLog ***

2015-12-16  Thomas Preud'homme  

PR tree-optimization/67781
* gcc.c-torture/execute/pr67781.c: New file.


diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c b/gcc/testsuite/gcc.c-torture/execute/pr67781.c
new file mode 100644
index 000..bf50aa2
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c
@@ -0,0 +1,34 @@
+#ifdef __UINT32_TYPE__
+typedef __UINT32_TYPE__ uint32_t;
+#else
+typedef unsigned uint32_t;
+#endif
+
+#ifdef __UINT8_TYPE__
+typedef __UINT8_TYPE__ uint8_t;
+#else
+typedef unsigned char uint8_t;
+#endif
+
+struct
+{
+  uint32_t a;
+  uint8_t b;
+} s = { 0x123456, 0x78 };
+
+int pr67781()
+{
+  uint32_t c = (s.a << 8) | s.b;
+  return c;
+}
+
+int
+main ()
+{
+  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
+return 0;
+
+  if (pr67781 () != 0x12345678)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index b00f046..e5a185f 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct symbolic_number *n, int limit)
 static gimple *
 find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
 {
+  unsigned rsize;
+  uint64_t tmpn, mask;
 /* The number which the find_bswap_or_nop_1 result should match in order
to have a full byte swap.  The number is shifted to the right
according to the size of the symbolic number before using it.  */
@@ -2464,24 +2466,38 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
 
   /* Find real size of result (highest non-zero byte).  */
   if (n->base_addr)
-{
-  int rsize;
-  uint64_t tmpn;
-
-  for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++);
-  n->range = rsize;
-}
+for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++);
+  else
+rsize = n->range;
 
-  /* Zero out the extra bits of N and CMP*.  */
+  /* Zero out the bits corresponding to untouched bytes in original gimple
+ expression.  */
   if (n->range < (int) sizeof (int64_t))
 {
-  uint64_t mask;
-
   mask = ((uint64_t) 1 << (n->range * BITS_PER_MARKER)) - 1;
   cmpxchg >>= (64 / BITS_PER_MARKER - n->range) * BITS_PER_MARKER;
   cmpnop &= mask;
 }
 
+  /* Zero out the bits corresponding to unused bytes in the result of the
+ gimple expression.  */
+  if (rsize < n->range)
+{
+  if (BYTES_BIG_ENDIAN)
+   {
+ mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1;
+ cmpxchg &= mask;
+ cmpnop >>= (n->range - rsize) * BITS_PER_MARKER;
+   }
+  else
+   {
+ mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1;
+ cmpxchg >>= (n->range - rsize) * BITS_PER_MARKER;
+ cmpnop &= mask;
+   }
+  n->range = rsize;
+}
+
   /* A complete byte swap should make the symbolic number to start with
  the largest digit in the highest order byte. Unchanged symbolic
  number indicates a read with same endianness as target architecture.  */



Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC and 
on an arm-none-eabi GCC cross-compiler without any regression. I'm waiting for 
a slot on gcc110 to do a big endian bootstrap but at least the testcase works 
on mips-linux. I'll send an update once bootstrap is complete.

Is this ok for trunk and 5 branch in a week's time if no regression is reported?

Best r

Re: [PATCH] libiberty: support demangling of rvalue reference typenames

2016-01-04 Thread Ian Lance Taylor
Artemiy Volkov  writes:

> This patch adds handling of 'O' (rvalue ref) type codes in the C++ demangling
> code which is done similarly to the 'R' (regular references) case. It also
> adds a few testcases for various demangling styles which are just mirrored
> versions of the corresponding regular references demangling tests.
>
> libiberty/ChangeLog:
>
> 2016-01-04  Artemiy Volkov  
>
> * cplus-dem.c (enum type_kind_t): Add tk_rvalue_reference
> constant.
> (demangle_template_value_parm): Handle tk_rvalue_reference
> type kind.
> (do_type): Support 'O' type id (rvalue references).

Is there a compiler that actually generates these symbols?

Ian


Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c

2016-01-04 Thread Richard Biener
On January 4, 2016 8:08:17 PM GMT+01:00, Jeff Law  wrote:
>On 12/21/2015 06:13 AM, Alan Lawrence wrote:
>> This is a respin of patches
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03266.html and
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03267.html, which were
>> "too quickly" approved before concerns with efficiency were pointed
>out.
>>
>> I tried to change the hashing just in tree-ssa-dom.c using C++
>subclassing, but
>> couldn't cleanly separate this out from tree-ssa-scopedtables and
>> tree-ssa-threadedge.c due to use of avail_exprs_stack. So I figured
>it was
>> probably appropriate to use the equivalences in jump threading too.
>Also,
>> using get_ref_base_and_extent unifies handling of MEM_REFs and
>ARRAY_REFs

Without looking at the patch, ARRAY_REFs can have non-constant indices which 
get_ref_base_and_extent handles conservatively.  You should make sure not to 
regress here.

Richard.

>> (hence only one patch rather than two).
>It is appropriate.
>
>
>> I've added a couple of testcases that show the improvement in DOM,
>but in all
>> cases I had to disable FRE, even PRE, to get any improvement, apart
>from on
>> ssa-dom-cse-2.c itself (where on the affected platforms FRE still
>does not do
>> the optimization). This makes me wonder if this is the right approach
>or whether
>> changing the references output by SRA (as per
>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01490.html , judged as
>a hack to
>> SRA to work around limitations in DOM - or is it?) would be better.
>I just doubt it happens all that much.
>
>
>
>Jeff




Re: [PATCH, GCC] Fix PR67781: wrong code generation for partial load on big endian targets

2016-01-04 Thread Thomas Preud'homme
On Tuesday, January 05, 2016 01:53:37 PM you wrote:
> 
> Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC
> and on an arm-none-eabi GCC cross-compiler without any regression. I'm
> waiting for a slot on gcc110 to do a big endian bootstrap but at least the
> testcase works on mips-linux. I'll send an update once bootstrap is
> complete.

Bootstrap went fine on gcc110 with the following languages enabled: 
c,c++,objc,obj-c++,java,fortran,ada,go,lto.

Best regards,

Thomas


[PATCH, testsuite] Fix g++.dg/pr67989.C test failure when running with -march or -mcpu

2016-01-04 Thread Thomas Preud'homme
Hi,

g++.dg/pr67989.C passes -march=armv4t to gcc when compiling which fails if 
RUNTESTFLAGS passes -mcpu or -march with a different value. This patch adds a 
dg-skip-if directive to skip the test when such a thing happens.

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2015-12-31  Thomas Preud'homme  

* g++.dg/pr67989.C: Skip test if already running it with -mcpu or 
-march with different value.


diff --git a/gcc/testsuite/g++.dg/pr67989.C b/gcc/testsuite/g++.dg/pr67989.C
index 90261c450b4b9429fb989f7df62f3743017c7363..61be8e172a96df5bb76f7ecd8543dadf825e7dc7 100644
--- a/gcc/testsuite/g++.dg/pr67989.C
+++ b/gcc/testsuite/g++.dg/pr67989.C
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-std=c++11 -O2" } */
+/* { dg-skip-if "do not override -mcpu" { arm*-*-* } { "-march=*" "-mcpu=*" } { "-march=armv4t" } } */
 /* { dg-additional-options "-marm -march=armv4t" { target arm*-*-* } } */
 
 __extension__ typedef unsigned long long int uint64_t;


Is this ok for stage3?

Best regards,

Thomas


Re: [PATCH] libiberty: support demangling of rvalue reference typenames

2016-01-04 Thread Artemiy Volkov
On Mon, Jan 04, 2016 at 10:06:44PM -0800, Ian Lance Taylor wrote:
> Artemiy Volkov  writes:
> 
> > This patch adds handling of 'O' (rvalue ref) type codes in the C++
> > demangling code which is done similarly to the 'R' (regular references)
> > case. It also adds a few testcases for various demangling styles which
> > are just mirrored versions of the corresponding regular references
> > demangling tests.
> >
> > libiberty/ChangeLog:
> >
> > 2016-01-04  Artemiy Volkov  
> >
> > * cplus-dem.c (enum type_kind_t): Add tk_rvalue_reference
> > constant.
> > (demangle_template_value_parm): Handle tk_rvalue_reference
> > type kind.
> > (do_type): Support 'O' type id (rvalue references).
> 
> Is there a compiler that actually generates these symbols?

Sure, at least gcc and clang generate this. E.g. when compiling:

void f(int&& b) { }

you then have:

➜ nm 1.o 
 T _Z1fOi

> 
> Ian
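
For reference, the symbol shown above can be demangled programmatically. The sketch below is an assumption-laden illustration: it goes through the C++ runtime's abi::__cxa_demangle from <cxxabi.h>, which handles this _Z-style mangling, and so does not exercise the cplus-dem.c code touched by the patch. The wrapper name demangle is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical wrapper around abi::__cxa_demangle.  Returns the demangled
// name, or an empty string if the input could not be demangled.
std::string demangle(const char *mangled) {
  assert(mangled != nullptr);
  int status = 0;
  // With a null output buffer the runtime allocates the result with
  // malloc, so it has to be released with free.
  char *out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || out == nullptr)
    return std::string();
  std::string result(out);
  std::free(out);
  return result;
}
```

Here demangle ("_Z1fOi") yields "f(int&&)", and the lvalue reference counterpart "_Z1fRi" yields "f(int&)".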