[PATCH] Fix PR78189

2016-11-07 Thread Richard Biener

The following fixes an oversight when computing alignment in the
vectorizer.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-11-07  Richard Biener  

PR tree-optimization/78189
* tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
alignment computation.

* g++.dg/torture/pr78189.C: New testcase.

Index: gcc/testsuite/g++.dg/torture/pr78189.C
===
--- gcc/testsuite/g++.dg/torture/pr78189.C  (revision 0)
+++ gcc/testsuite/g++.dg/torture/pr78189.C  (working copy)
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
+
+#include <cstddef>
+
+struct A
+{
+  void * a;
+  void * b;
+};
+
+struct alignas(16) B
+{
+  void * pad;
+  void * misaligned;
+  void * pad2;
+
+  A a;
+
+  void Null();
+};
+
+void B::Null()
+{
+  a.a = nullptr;
+  a.b = nullptr;
+}
+
+void __attribute__((noinline,noclone))
+NullB(void * misalignedPtr)
+{
+  B* b = reinterpret_cast<B*>(reinterpret_cast<char*>(misalignedPtr) - offsetof(B, misaligned));
+  b->Null();
+}
+
+int main()
+{
+  B b;
+  NullB(&b.misaligned);
+  return 0;
+}
diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 9346cfe..b03cb1e 100644
--- gcc/tree-vect-data-refs.c
+++ gcc/tree-vect-data-refs.c
@@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
   base = ref;
   while (handled_component_p (base))
 base = TREE_OPERAND (base, 0);
+  unsigned int base_alignment;
+  unsigned HOST_WIDE_INT base_bitpos;
+  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
+  /* As data-ref analysis strips the MEM_REF down to its base operand
+ to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
+ adjust things to make base_alignment valid as the alignment of
+ DR_BASE_ADDRESS.  */
   if (TREE_CODE (base) == MEM_REF)
-base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
-  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
-  unsigned int base_alignment = get_object_alignment (base);
+{
+  base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
+  base_bitpos &= (base_alignment - 1);
+}
+  if (base_bitpos != 0)
+base_alignment = base_bitpos & -base_bitpos;
+  /* Also look at the alignment of the base address DR analysis
+ computed.  */
+  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
+  if (base_addr_alignment > base_alignment)
+base_alignment = base_addr_alignment;
 
   if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
 DR_VECT_AUX (dr)->base_element_aligned = true;
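
For readers following the arithmetic in the hunk above: the step
base_bitpos & -base_bitpos isolates the lowest set bit of the
misalignment, which is the largest power of two still guaranteed to
divide the address.  Below is a minimal sketch of that step outside of
GCC.  known_alignment is a hypothetical helper, not a function from
tree-vect-data-refs.c, and the 192-bit offset assumes 8-byte pointers in
the testcase's struct B (member a sits 24 bytes into a 16-byte-aligned
object).

```cpp
#include <cassert>
#include <cstdint>

// Largest power-of-two alignment (in bits) still guaranteed for an object
// that is ALIGN-bit aligned and then displaced by BITPOS bits.
// Hypothetical helper for illustration only; not part of GCC.
constexpr std::uint64_t
known_alignment (std::uint64_t align, std::uint64_t bitpos)
{
  // Only the displacement modulo the alignment matters.
  bitpos &= align - 1;
  // x & -x isolates the lowest set bit of x.
  return bitpos != 0 ? (bitpos & -bitpos) : align;
}

inline void
alignment_example ()
{
  // alignas(16) gives 128 bits; a 24-byte (192-bit) offset leaves only
  // 64-bit (8-byte) alignment, too little for a 16-byte vector store.
  assert (known_alignment (128, 192) == 64);
  // A zero displacement keeps the full alignment.
  assert (known_alignment (128, 0) == 128);
}
```

Under these assumptions the store to a.a and a.b in the testcase can only
be treated as 8-byte aligned, which is the case the old computation
appears to have overestimated.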


[PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko

Hi,

This patch set merges libsanitizer from upstream.

Patch 1 is the library merge itself.

Patch 2 is the reapplied change for SPARC by David S. Miller.

Patch 3 changes the heuristic for extracting the last PC from the stack 
frame for ARM in the fast unwind routine. More details can be found here 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61771).


Patch 4 replaces Jakub's fix for 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63888 and removes the 
CheckODRViolationViaPoisoning call from RegisterGlobal to avoid 
false-positive ODR violation reports.


Patch 5 combines the necessary compiler changes.

Patch 6 adds several new tests, backported from upstream.

Patch 7 adds support for ASan ODR indicators on the compiler side.

The whole patch set was regtested, bootstrapped and ASan-bootstrapped on 
x86_64-unknown-linux-gnu and i386-unknown-linux-gnu.
It also passed regression tests on arm-linux-gnueabi and aarch64-linux 
under QEMU.


-Maxim


[PATCH 2/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko

This is just the reapplied patch for SPARC by David S. Miller.

From 0ff8d1c408b076970c323361922c35033aaae245 Mon Sep 17 00:00:00 2001
From: Maxim Ostapenko 
Date: Tue, 25 Oct 2016 20:00:43 +0300
Subject: [PATCH 2/7] libsanitizer/

	PR sanitizer/63958
	Reapply:
	2014-10-14  David S. Miller  

	* sanitizer_common/sanitizer_platform_limits_linux.cc (time_t):
	Define as __kernel_time_t, as needed for sparc.
	(struct __old_kernel_stat): Don't check if __sparc__ is defined.
	* libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
	(__sanitizer): Define struct___old_kernel_stat_sz,
	struct_kernel_stat_sz, and struct_kernel_stat64_sz for sparc.
	(__sanitizer_ipc_perm): Adjust for sparc targets.
	(__sanitizer_shmid_ds): Likewise.
	(__sanitizer_sigaction): Likewise.
	(IOC_SIZE): Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229113 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libsanitizer/ChangeLog | 17 +++
 .../sanitizer_platform_limits_linux.cc |  4 +-
 .../sanitizer_platform_limits_posix.h  | 59 +-
 3 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/libsanitizer/ChangeLog b/libsanitizer/ChangeLog
index eaf907c..10b1207 100644
--- a/libsanitizer/ChangeLog
+++ b/libsanitizer/ChangeLog
@@ -1,5 +1,22 @@
 2016-11-07  Maxim Ostapenko  
 
+	PR sanitizer/63958
+	Reapply:
+	2014-10-14  David S. Miller  
+
+	* sanitizer_common/sanitizer_platform_limits_linux.cc (time_t):
+	Define as __kernel_time_t, as needed for sparc.
+	(struct __old_kernel_stat): Don't check if __sparc__ is defined.
+	* libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
+	(__sanitizer): Define struct___old_kernel_stat_sz,
+	struct_kernel_stat_sz, and struct_kernel_stat64_sz for sparc.
+	(__sanitizer_ipc_perm): Adjust for sparc targets.
+	(__sanitizer_shmid_ds): Likewise.
+	(__sanitizer_sigaction): Likewise.
+	(IOC_SIZE): Likewise.
+
+2016-11-07  Maxim Ostapenko  
+
 	* All source files: Merge from upstream 285547.
 	* configure.tgt (SANITIZER_COMMON_TARGET_DEPENDENT_OBJECTS): New
 	variable.
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc b/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
index edc6730..23a0148 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
@@ -36,6 +36,7 @@
 #define uid_t __kernel_uid_t
 #define gid_t __kernel_gid_t
 #define off_t __kernel_off_t
+#define time_t __kernel_time_t
 // This header seems to contain the definitions of _kernel_ stat* structs.
 #include <asm/stat.h>
 #undef ino_t
@@ -62,7 +63,8 @@ namespace __sanitizer {
 }  // namespace __sanitizer
 
 #if !defined(__powerpc64__) && !defined(__x86_64__) && !defined(__aarch64__)\
-&& !defined(__mips__) && !defined(__s390__)
+&& !defined(__mips__) && !defined(__s390__)\
+&& !defined(__sparc__)
 COMPILER_CHECK(struct___old_kernel_stat_sz == sizeof(struct __old_kernel_stat));
 #endif
 
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
index 17906d3..d1a3051 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -85,6 +85,14 @@ namespace __sanitizer {
 #elif defined(__s390x__)
   const unsigned struct_kernel_stat_sz = 144;
   const unsigned struct_kernel_stat64_sz = 0;
+#elif defined(__sparc__) && defined(__arch64__)
+  const unsigned struct___old_kernel_stat_sz = 0;
+  const unsigned struct_kernel_stat_sz = 104;
+  const unsigned struct_kernel_stat64_sz = 144;
+#elif defined(__sparc__) && !defined(__arch64__)
+  const unsigned struct___old_kernel_stat_sz = 0;
+  const unsigned struct_kernel_stat_sz = 64;
+  const unsigned struct_kernel_stat64_sz = 104;
 #endif
   struct __sanitizer_perf_event_attr {
 unsigned type;
@@ -107,7 +115,7 @@ namespace __sanitizer {
 
 #if defined(__powerpc64__) || defined(__s390__)
   const unsigned struct___old_kernel_stat_sz = 0;
-#else
+#elif !defined(__sparc__)
   const unsigned struct___old_kernel_stat_sz = 32;
 #endif
 
@@ -198,6 +206,18 @@ namespace __sanitizer {
 unsigned short __pad1;
 unsigned long __unused1;
 unsigned long __unused2;
+#elif defined(__sparc__)
+# if defined(__arch64__)
+unsigned mode;
+unsigned short __pad1;
+# else
+unsigned short __pad1;
+unsigned short mode;
+unsigned short __pad2;
+# endif
+unsigned short __seq;
+unsigned long long __unused1;
+unsigned long long __unused2;
 #else
 unsigned short mode;
 unsigned short __pad1;
@@ -215,6 +235,26 @@ namespace __sanitizer {
 
   struct __sanitizer_shmid_ds {
 __sanitizer_ipc_perm shm_perm;
+  #if defined(__sparc__)
+  # if !defined(__arch64__)
+u32 __pad1;
+  # endif
+long shm_atime;
+  # if !defi

[PATCH 3/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko
This patch adjusts the fix for 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61771 to extract the last 
PC from the stack frame if no valid FP is available for ARM.
From 6dc6e4f761080cf19a161fb0e27c1fd584688f40 Mon Sep 17 00:00:00 2001
From: Maxim Ostapenko 
Date: Tue, 25 Oct 2016 20:27:37 +0300
Subject: [PATCH 3/7] libsanitizer/

	* sanitizer_common/sanitizer_stacktrace.cc (GetCanonicFrame): Assume we
	compiled code with GCC when extracting the caller PC for ARM if no
	valid frame pointer is available.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229115 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libsanitizer/ChangeLog| 6 ++
 libsanitizer/sanitizer_common/sanitizer_stacktrace.cc | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/libsanitizer/ChangeLog b/libsanitizer/ChangeLog
index 10b1207..7e4f89f 100644
--- a/libsanitizer/ChangeLog
+++ b/libsanitizer/ChangeLog
@@ -1,5 +1,11 @@
 2016-11-07  Maxim Ostapenko  
 
+	* sanitizer_common/sanitizer_stacktrace.cc (GetCanonicFrame): Assume we
+	compiled code with GCC when extracting the caller PC for ARM if no
+	valid frame pointer is available.
+
+2016-11-07  Maxim Ostapenko  
+
 	PR sanitizer/63958
 	Reapply:
 	2014-10-14  David S. Miller  
diff --git a/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc b/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
index 531f256..cbb3af2 100644
--- a/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
@@ -55,8 +55,8 @@ static inline uhwptr *GetCanonicFrame(uptr bp,
   // Nope, this does not look right either. This means the frame after next does
   // not have a valid frame pointer, but we can still extract the caller PC.
   // Unfortunately, there is no way to decide between GCC and LLVM frame
-  // layouts. Assume LLVM.
-  return bp_prev;
+  // layouts. Assume GCC.
+  return bp_prev - 1;
 #else
   return (uhwptr*)bp;
 #endif
-- 
1.9.1



[PATCH 4/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko
This is a rewrite of Jakub's fix for 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63888. Upstream now 
supports a new approach to ODR violation detection: the compiler emits a 
new __odr_asan_XXX symbol for each instrumented global that indicates 
whether this global was already registered, and the library checks this 
indicator symbol at runtime.
However, to preserve compatibility, the library can still fall back to 
the old approach to ODR violation detection, which is incompatible with 
GCC (say, when the ODR indicator symbol wasn't emitted, e.g. for a static 
variable, libasan tries the old method). To avoid this, this patch 
removes the CheckODRViolationViaPoisoning call and leaves only 
CheckODRViolationViaIndicator.
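
The indicator scheme described above can be modelled roughly as follows.
This is only a sketch under stated assumptions, not libasan's code:
GlobalSketch, OdrCheck and check_odr_via_indicator are hypothetical names,
the indicator is modelled as one byte shared by every definition of the
same symbol, and a null pointer stands for "no indicator was emitted".

```cpp
#include <cassert>

// Hypothetical stand-in for the per-global descriptor passed to the runtime.
struct GlobalSketch
{
  const char *name;
  unsigned char *odr_indicator;   // nullptr: no indicator was emitted
};

enum class OdrCheck { Registered, Skipped, Violation };

// Check and update the indicator when a global is registered.
inline OdrCheck
check_odr_via_indicator (const GlobalSketch &g)
{
  if (g.odr_indicator == nullptr)
    return OdrCheck::Skipped;       // e.g. a static variable: nothing to check
  if (*g.odr_indicator != 0)
    return OdrCheck::Violation;     // another definition registered already
  *g.odr_indicator = 1;             // mark this symbol as registered
  return OdrCheck::Registered;
}

inline void
odr_example ()
{
  unsigned char indicator = 0;      // shared by both definitions of "x"
  GlobalSketch first = { "x", &indicator };
  GlobalSketch second = { "x", &indicator };
  GlobalSketch local = { "y", nullptr };
  assert (check_odr_via_indicator (first) == OdrCheck::Registered);
  assert (check_odr_via_indicator (second) == OdrCheck::Violation);
  assert (check_odr_via_indicator (local) == OdrCheck::Skipped);
}
```

In this model the "Skipped" branch is where the runtime would otherwise
fall back to the poisoning-based check, which is the fallback this patch
removes.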


From 5cd9a7cb1c2dd668e533bee1bc15e367d367d84f Mon Sep 17 00:00:00 2001
From: Maxim Ostapenko 
Date: Fri, 28 Oct 2016 10:22:35 +0300
Subject: [PATCH 4/7] libsanitizer/

	* asan/asan_globals.cc (RegisterGlobal): Do not call
	CheckODRViolationViaPoisoning.
	(CheckODRViolationViaPoisoning): Remove.
---
 libsanitizer/ChangeLog|  6 ++
 libsanitizer/asan/asan_globals.cc | 19 ---
 2 files changed, 6 insertions(+), 19 deletions(-)

diff --git a/libsanitizer/ChangeLog b/libsanitizer/ChangeLog
index 7e4f89f..d439f45 100644
--- a/libsanitizer/ChangeLog
+++ b/libsanitizer/ChangeLog
@@ -1,5 +1,11 @@
 2016-11-07  Maxim Ostapenko  
 
+	* asan/asan_globals.cc (RegisterGlobal): Do not call
+	CheckODRViolationViaPoisoning.
+	(CheckODRViolationViaPoisoning): Remove.
+
+2016-11-07  Maxim Ostapenko  
+
 	* sanitizer_common/sanitizer_stacktrace.cc (GetCanonicFrame): Assume we
 	compiled code with GCC when extracting the caller PC for ARM if no
 	valid frame pointer is available.
diff --git a/libsanitizer/asan/asan_globals.cc b/libsanitizer/asan/asan_globals.cc
index 007fce72..f229292 100644
--- a/libsanitizer/asan/asan_globals.cc
+++ b/libsanitizer/asan/asan_globals.cc
@@ -147,23 +147,6 @@ static void CheckODRViolationViaIndicator(const Global *g) {
   }
 }
 
-// Check ODR violation for given global G by checking if it's already poisoned.
-// We use this method in case compiler doesn't use private aliases for global
-// variables.
-static void CheckODRViolationViaPoisoning(const Global *g) {
-  if (__asan_region_is_poisoned(g->beg, g->size_with_redzone)) {
-// This check may not be enough: if the first global is much larger
-// the entire redzone of the second global may be within the first global.
-for (ListOfGlobals *l = list_of_all_globals; l; l = l->next) {
-  if (g->beg == l->g->beg &&
-  (flags()->detect_odr_violation >= 2 || g->size != l->g->size) &&
-  !IsODRViolationSuppressed(g->name))
-ReportODRViolation(g, FindRegistrationSite(g),
-   l->g, FindRegistrationSite(l->g));
-}
-  }
-}
-
 // Clang provides two different ways for global variables protection:
 // it can poison the global itself or its private alias. In former
 // case we may poison same symbol multiple times, that can help us to
@@ -211,8 +194,6 @@ static void RegisterGlobal(const Global *g) {
 // where two globals with the same name are defined in different modules.
 if (UseODRIndicator(g))
   CheckODRViolationViaIndicator(g);
-else
-  CheckODRViolationViaPoisoning(g);
   }
   if (CanPoisonMemory())
 PoisonRedZones(*g);
-- 
1.9.1



[PATCH 5/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko
This patch just combines the minimal changes necessary to support the new 
libasan ABI. It doesn't try to implement ODR indicators on the compiler 
side; it simply passes a zero stub to the runtime. The actual 
implementation of ODR indicators goes in patch 7.
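
As a rough picture of the ABI change, the per-global descriptor grows from
seven to eight pointer-sized fields, and the new last field is filled with
zero for now.  The sketch below mirrors the field names listed in
asan_global_struct with the leading underscores dropped.  The struct name,
the helper describe_global and the exact C++ field types are assumptions
made for illustration; they are not taken from GCC or libasan.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the descriptor built by asan_global_struct.
struct AsanGlobalSketch
{
  const void *beg;
  std::uintptr_t size;
  std::uintptr_t size_with_redzone;
  const void *name;
  const void *module_name;
  std::uintptr_t has_dynamic_init;
  const void *location;
  std::uintptr_t odr_indicator;   // new eighth field; 0 is the zero stub
};

// Build a descriptor the way this patch does: the ODR indicator slot
// is always zero.
inline AsanGlobalSketch
describe_global (const void *beg, std::uintptr_t size,
                 std::uintptr_t redzone, const char *name)
{
  return AsanGlobalSketch{ beg, size, size + redzone, name,
                           nullptr, 0, nullptr, 0 };
}
```

For example, describe_global (&x, sizeof x, 32, "x") yields a descriptor
whose odr_indicator is 0, which a runtime could read as "no indicator
available for this global".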
From 33f6f98faa86c61b9895db0d71e0e88a9ae4fa59 Mon Sep 17 00:00:00 2001
From: Maxim Ostapenko 
Date: Tue, 25 Oct 2016 20:34:23 +0300
Subject: [PATCH 5/7] libsanitizer merge from upstream r285547, compiler part.

gcc/

	* asan.h (ASAN_STACK_MAGIC_PARTIAL): Remove.
	* asan.c (ASAN_STACK_MAGIC_PARTIAL): Replace with
	ASAN_STACK_MAGIC_MIDDLE.
	(asan_global_struct): Increase the size of fields.
	(asan_add_global): Add new field constructor.
	* sanitizer.def (__asan_version_mismatch_check_v6): Replace with
	__asan_version_mismatch_check_v8.

gcc/testsuite/

	* c-c++-common/asan/null-deref-1.c: Adjust testcase.
---
 gcc/ChangeLog  | 10 ++
 gcc/asan.c | 13 -
 gcc/asan.h |  1 -
 gcc/sanitizer.def  |  2 +-
 gcc/testsuite/ChangeLog|  4 
 gcc/testsuite/c-c++-common/asan/null-deref-1.c |  4 ++--
 6 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f29b9b5..943e21c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2016-11-07  Maxim Ostapenko  
+
+	* asan.h (ASAN_STACK_MAGIC_PARTIAL): Remove.
+	* asan.c (ASAN_STACK_MAGIC_PARTIAL): Replace with
+	ASAN_STACK_MAGIC_MIDDLE.
+	(asan_global_struct): Increase the size of fields.
+	(asan_add_global): Add new field constructor.
+	* sanitizer.def (__asan_version_mismatch_check_v6): Replace with
+	__asan_version_mismatch_check_v8.
+
 2016-10-30  Bill Schmidt  
 
 	PR tree-optimization/71915
diff --git a/gcc/asan.c b/gcc/asan.c
index c6d9240..fdc84bd 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1214,7 +1214,7 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
 		  shadow_bytes[i] = offset - aoff;
 	  }
 	else
-	  shadow_bytes[i] = ASAN_STACK_MAGIC_PARTIAL;
+	  shadow_bytes[i] = ASAN_STACK_MAGIC_MIDDLE;
 	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
 	  offset = aoff;
 	}
@@ -2191,19 +2191,20 @@ asan_dynamic_init_call (bool after_p)
  const void *__module_name;
  uptr __has_dynamic_init;
  __asan_global_source_location *__location;
+ char *__odr_indicator;
} type.  */
 
 static tree
 asan_global_struct (void)
 {
-  static const char *field_names[7]
+  static const char *field_names[8]
 = { "__beg", "__size", "__size_with_redzone",
-	"__name", "__module_name", "__has_dynamic_init", "__location"};
-  tree fields[7], ret;
+	"__name", "__module_name", "__has_dynamic_init", "__location", "__odr_indicator"};
+  tree fields[8], ret;
   int i;
 
   ret = make_node (RECORD_TYPE);
-  for (i = 0; i < 7; i++)
+  for (i = 0; i < 8; i++)
 {
   fields[i]
 	= build_decl (UNKNOWN_LOCATION, FIELD_DECL,
@@ -2312,6 +2313,8 @@ asan_add_global (tree decl, tree type, vec *v)
   else
 locptr = build_int_cst (uptr, 0);
   CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, locptr);
+  /* TODO: support ODR indicators.  */
+  CONSTRUCTOR_APPEND_ELT(vinner, NULL_TREE, build_int_cst (uptr, 0));
   init = build_constructor (type, vinner);
   CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init);
 }
diff --git a/gcc/asan.h b/gcc/asan.h
index 7ec693f..a259b1a 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -53,7 +53,6 @@ extern alias_set_type asan_shadow_set;
 #define ASAN_STACK_MAGIC_LEFT		0xf1
 #define ASAN_STACK_MAGIC_MIDDLE		0xf2
 #define ASAN_STACK_MAGIC_RIGHT		0xf3
-#define ASAN_STACK_MAGIC_PARTIAL	0xf4
 #define ASAN_STACK_MAGIC_USE_AFTER_RET	0xf5
 
 #define ASAN_STACK_FRAME_MAGIC		0x41b58ab3
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index 303c1e4..ac85096 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -34,7 +34,7 @@ DEF_BUILTIN_STUB(BEGIN_SANITIZER_BUILTINS, (const char *)0)
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_INIT, "__asan_init",
 		  BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_VERSION_MISMATCH_CHECK,
-		  "__asan_version_mismatch_check_v6",
+		  "__asan_version_mismatch_check_v8",
 		  BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
 /* Do not reorder the BUILT_IN_ASAN_{REPORT,CHECK}* builtins, e.g. cfgcleanup.c
relies on this order.  */
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 051ae83..49fab6e 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2016-11-07  Maxim Ostapenko  
+
+	* c-c++-common/asan/null-deref-1.c: Adjust testcase.
+
 2016-10-30  Bill Schmidt  
 
 	PR tree-optimization/71915
diff --git a/gcc/testsuite/c-c++-common/asan/null-deref-1.c b/gcc/testsuite/c-c++-common/asan/null-deref-1.c
index 45d35ac..f4f8f37 100644
--- a/gcc/testsuite/c-c++-common/asan/null-deref-1.c
+++ b/gcc/testsuite/c-c++-common/asan/null-deref-1.c
@@ -17,6

[PATCH 6/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko

This patch just adds several tests backported from upstream.
From b4677ed64e7aee1af7772750e0b18ed8271f4757 Mon Sep 17 00:00:00 2001
From: Maxim Ostapenko 
Date: Tue, 1 Nov 2016 16:52:13 +0300
Subject: [PATCH 6/7] Backport several testcases for ASan from upstream.

gcc/

	* asan.h (asan_intercepted_p): Handle BUILT_IN_STRCSPN,
	BUILT_IN_STRPBRK, BUILT_IN_STRSPN and BUILT_IN_STRSTR.

gcc/testsuite/

	* c-c++-common/asan/default_options.h: New file.
	* c-c++-common/asan/strcasestr-1.c: New test.
	* c-c++-common/asan/strcasestr-2.c: Likewise.
	* c-c++-common/asan/strcspn-1.c: Likewise.
	* c-c++-common/asan/strcspn-2.c: Likewise.
	* c-c++-common/asan/strpbrk-1.c: Likewise.
	* c-c++-common/asan/strpbrk-2.c: Likewise.
	* c-c++-common/asan/strspn-1.c: Likewise.
	* c-c++-common/asan/strspn-2.c: Likewise.
	* c-c++-common/asan/strstr-1.c: Likewise.
	* c-c++-common/asan/strstr-2.c: Likewise.
	* c-c++-common/asan/halt_on_error_suppress_equal_pcs-1.c: Likewise.
---
 gcc/ChangeLog  |  5 +++
 gcc/asan.h |  4 +++
 gcc/testsuite/ChangeLog| 15 +
 gcc/testsuite/c-c++-common/asan/default_options.h  |  9 +
 .../asan/halt_on_error_suppress_equal_pcs-1.c  | 38 ++
 gcc/testsuite/c-c++-common/asan/strcasestr-1.c | 32 ++
 gcc/testsuite/c-c++-common/asan/strcasestr-2.c | 32 ++
 gcc/testsuite/c-c++-common/asan/strcspn-1.c| 31 ++
 gcc/testsuite/c-c++-common/asan/strcspn-2.c| 31 ++
 gcc/testsuite/c-c++-common/asan/strpbrk-1.c| 31 ++
 gcc/testsuite/c-c++-common/asan/strpbrk-2.c| 31 ++
 gcc/testsuite/c-c++-common/asan/strspn-1.c | 31 ++
 gcc/testsuite/c-c++-common/asan/strspn-2.c | 31 ++
 gcc/testsuite/c-c++-common/asan/strstr-1.c | 31 ++
 gcc/testsuite/c-c++-common/asan/strstr-2.c | 31 ++
 15 files changed, 383 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/asan/default_options.h
 create mode 100644 gcc/testsuite/c-c++-common/asan/halt_on_error_suppress_equal_pcs-1.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strcasestr-1.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strcasestr-2.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strcspn-1.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strcspn-2.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strpbrk-1.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strpbrk-2.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strspn-1.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strspn-2.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strstr-1.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/strstr-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 943e21c..1da0ef9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,10 @@
 2016-11-07  Maxim Ostapenko  
 
+	* asan.h (asan_intercepted_p): Handle BUILT_IN_STRCSPN,
+	BUILT_IN_STRPBRK, BUILT_IN_STRSPN and BUILT_IN_STRSTR.
+
+2016-11-07  Maxim Ostapenko  
+
 	* asan.h (ASAN_STACK_MAGIC_PARTIAL): Remove.
 	* asan.c (ASAN_STACK_MAGIC_PARTIAL): Replace with
 	ASAN_STACK_MAGIC_MIDDLE.
diff --git a/gcc/asan.h b/gcc/asan.h
index a259b1a..b96395b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -102,6 +102,10 @@ asan_intercepted_p (enum built_in_function fcode)
 	 || fcode == BUILT_IN_STRNCASECMP
 	 || fcode == BUILT_IN_STRNCAT
 	 || fcode == BUILT_IN_STRNCMP
+	 || fcode == BUILT_IN_STRCSPN
+	 || fcode == BUILT_IN_STRPBRK
+	 || fcode == BUILT_IN_STRSPN
+	 || fcode == BUILT_IN_STRSTR
 	 || fcode == BUILT_IN_STRNCPY;
 }
 #endif /* TREE_ASAN */
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 49fab6e..afa77a8 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,20 @@
 2016-11-07  Maxim Ostapenko  
 
+	* c-c++-common/asan/default_options.h: New file.
+	* c-c++-common/asan/strcasestr-1.c: New test.
+	* c-c++-common/asan/strcasestr-2.c: Likewise.
+	* c-c++-common/asan/strcspn-1.c: Likewise.
+	* c-c++-common/asan/strcspn-2.c: Likewise.
+	* c-c++-common/asan/strpbrk-1.c: Likewise.
+	* c-c++-common/asan/strpbrk-2.c: Likewise.
+	* c-c++-common/asan/strspn-1.c: Likewise.
+	* c-c++-common/asan/strspn-2.c: Likewise.
+	* c-c++-common/asan/strstr-1.c: Likewise.
+	* c-c++-common/asan/strstr-2.c: Likewise.
+	* c-c++-common/asan/halt_on_error_suppress_equal_pcs-1.c: Likewise.
+
+2016-11-07  Maxim Ostapenko  
+
 	* c-c++-common/asan/null-deref-1.c: Adjust testcase.
 
 2016-10-30  Bill Schmidt  
diff --git a/gcc/testsuite/c-c++-common/asan/default_options.h b/gcc/testsuite/c-c++-common/asan/default_options.h
new file mode 100644
index 000..1e5c486
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/default_options.h
@@ -0,0 +1,9 @@
+#ifdef __cplusplus
+extern "C"
+#endif
+const char *
+__asan_d

[PATCH 7/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko
This patch tries to implement ODR indicator functionality on the compiler 
side.
We emit a new __odr_asan_XXX symbol for each instrumented global that 
indicates whether this global was already registered, and the library 
checks this indicator symbol at runtime. For some globals (e.g. static 
or hidden ones) the ODR indicator is not needed, so we can skip the 
indicator for them and pass zero to the runtime.
If this patch is undesirable at this stage, we can probably postpone it 
until GCC 8.


From 137f139972a89259b9d8521e13ecb76fd2cef433 Mon Sep 17 00:00:00 2001
From: Maxim Ostapenko 
Date: Fri, 28 Oct 2016 10:22:03 +0300
Subject: [PATCH 7/7] Add support for ASan odr_indicator.

config/

	* bootstrap-asan.mk: Replace LSAN_OPTIONS=detect_leaks=0 with
	ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1.

gcc/

	* asan.c (asan_global_struct): Refactor.
	(create_odr_indicator): New function.
	(asan_needs_odr_indicator_p): Likewise.
	(is_odr_indicator): Likewise.
	(asan_add_global): Introduce odr_indicator_ptr. Pass it into global's
	constructor.
	(asan_protect_global): Do not protect odr indicators.

gcc/testsuite/

	* c-c++-common/asan/no-redundant-odr-indicators-1.c: New test.
---
 config/ChangeLog   |  5 ++
 config/bootstrap-asan.mk   |  2 +-
 gcc/ChangeLog  | 10 +++
 gcc/asan.c | 76 +++---
 gcc/testsuite/ChangeLog|  4 ++
 .../asan/no-redundant-odr-indicators-1.c   | 17 +
 6 files changed, 105 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/no-redundant-odr-indicators-1.c

diff --git a/config/ChangeLog b/config/ChangeLog
index 3b0092b..0c75185 100644
--- a/config/ChangeLog
+++ b/config/ChangeLog
@@ -1,3 +1,8 @@
+2016-11-07  Maxim Ostapenko  
+
+	* bootstrap-asan.mk: Replace LSAN_OPTIONS=detect_leaks=0 with
+	ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1.
+
 2016-06-21  Trevor Saunders  
 
 	* elf.m4: Remove interix support.
diff --git a/config/bootstrap-asan.mk b/config/bootstrap-asan.mk
index 70baaf9..e73d4c2 100644
--- a/config/bootstrap-asan.mk
+++ b/config/bootstrap-asan.mk
@@ -1,7 +1,7 @@
 # This option enables -fsanitize=address for stage2 and stage3.
 
 # Suppress LeakSanitizer in bootstrap.
-export LSAN_OPTIONS="detect_leaks=0"
+export ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1
 
 STAGE2_CFLAGS += -fsanitize=address
 STAGE3_CFLAGS += -fsanitize=address
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1da0ef9..527cafa 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,15 @@
 2016-11-07  Maxim Ostapenko  
 
+	* asan.c (asan_global_struct): Refactor.
+	(create_odr_indicator): New function.
+	(asan_needs_odr_indicator_p): Likewise.
+	(is_odr_indicator): Likewise.
+	(asan_add_global): Introduce odr_indicator_ptr. Pass it into global's
+	constructor.
+	(asan_protect_global): Do not protect odr indicators.
+
+2016-11-07  Maxim Ostapenko  
+
 	* asan.h (asan_intercepted_p): Handle BUILT_IN_STRCSPN,
 	BUILT_IN_STRPBRK, BUILT_IN_STRSPN and BUILT_IN_STRSTR.
 
diff --git a/gcc/asan.c b/gcc/asan.c
index fdc84bd..b54110a 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1329,6 +1329,16 @@ asan_needs_local_alias (tree decl)
   return DECL_WEAK (decl) || !targetm.binds_local_p (decl);
 }
 
+/* Return true if DECL, a global var, is an artificial ODR indicator symbol
+   therefore doesn't need protection.  */
+
+static bool
+is_odr_indicator (tree decl)
+{
+  const char *sym_name = IDENTIFIER_POINTER (DECL_NAME (decl));
+  return strstr(sym_name, "_.__odr_asan_") == sym_name;
+}
+
 /* Return true if DECL is a VAR_DECL that should be protected
by Address Sanitizer, by appending a red zone with protected
shadow memory after it and aligning it to at least
@@ -1377,7 +1387,8 @@ asan_protect_global (tree decl)
   || ASAN_RED_ZONE_SIZE * BITS_PER_UNIT > MAX_OFILE_ALIGNMENT
   || !valid_constant_size_p (DECL_SIZE_UNIT (decl))
   || DECL_ALIGN_UNIT (decl) > 2 * ASAN_RED_ZONE_SIZE
-  || TREE_TYPE (decl) == ubsan_get_source_location_type ())
+  || TREE_TYPE (decl) == ubsan_get_source_location_type ()
+  || is_odr_indicator (decl))
 return false;
 
   rtl = DECL_RTL (decl);
@@ -2197,14 +2208,15 @@ asan_dynamic_init_call (bool after_p)
 static tree
 asan_global_struct (void)
 {
-  static const char *field_names[8]
+  static const char *field_names[]
 = { "__beg", "__size", "__size_with_redzone",
-	"__name", "__module_name", "__has_dynamic_init", "__location", "__odr_indicator"};
-  tree fields[8], ret;
-  int i;
+	"__name", "__module_name", "__has_dynamic_init", "__location",
+	"__odr_indicator"};
+  tree fields[ARRAY_SIZE (field_names)], ret;
+  unsigned i;
 
   ret = make_node (RECORD_TYPE);
-  for (i = 0; i < 8; i++)
+  for (i = 0; i < ARRAY_SIZE (field_names); i++)
 {
   fields[i]
 	= build_decl (UNKNOWN_LOCATION, FIELD_DECL,
@@ -2226,6 +2238,52 @@ asan_glo

Re: [PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Jakub Jelinek
On Mon, Nov 07, 2016 at 11:22:28AM +0300, Maxim Ostapenko wrote:
> this patch set performs libsanitizer merge from upstream.
> 
> Patch 1 is the library merge itself.
> 
> Patch 2 is the reapplied change for SPARC by David S. Miller.
> 
> Patch 3 changes heuristic for extracting last PC from stack frame for ARM in
> fast unwind routine. More details can be found here
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61771).
> 
> Patch 4 replaces Jakub's fix for
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63888 and removes
> CheckODRViolationViaPoisoning call from RegisterGlobal to avoid false
> positive odr violation reports.
> 
> Patch 5 combines necessary compiler changes.
> 
> Patch 6 adds several new tests, backported from upstream.
> 
> Patch 7 adds support for ASan odr indicators at compiler side.
> 
> The whole patch set was regtested/bootstrapped/ASan bootstrapped on
> x86_64-unknown-linux-gnu and i386-unknown-linux-gnu.
> Also, passed regression tests on arm-linux-gnueabi and aarch64-linux under
> QEMU.

So, libasan.so.* is again ABI incompatible, but libtsan and libubsan stay
(hopefully) backwards ABI compatible?

Jakub


Re: [PATCH 7/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Jakub Jelinek
On Mon, Nov 07, 2016 at 11:31:18AM +0300, Maxim Ostapenko wrote:
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -1329,6 +1329,16 @@ asan_needs_local_alias (tree decl)
>return DECL_WEAK (decl) || !targetm.binds_local_p (decl);
>  }
>  
> +/* Return true if DECL, a global var, is an artificial ODR indicator symbol
> +   therefore doesn't need protection.  */
> +
> +static bool
> +is_odr_indicator (tree decl)
> +{
> +  const char *sym_name = IDENTIFIER_POINTER (DECL_NAME (decl));
> +  return strstr(sym_name, "_.__odr_asan_") == sym_name;

Formatting, missing space before (.
Plus strstr (x, y) == x is very inefficient, strncmp would be cheaper.
But more importantly, you are relying on what exactly
ASM_GENERATE_INTERNAL_LABEL does, and that differs between targets: not all
of them allow . in symbol names, some targets use $, others can only use _,
etc.  I think you'd better just add an "asan odr indicator" attribute
(including the spaces, so it isn't something users can add to their
variables) to the artificial vars and use lookup_attribute in the
is_odr_indicator predicate (after testing some cheap flags like
DECL_ARTIFICIAL).
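
To illustrate only the efficiency remark: strstr (x, y) == x may scan the
whole of x looking for y anywhere, while strncmp compares at most
strlen (y) characters at the start.  A minimal sketch follows; has_prefix
is a hypothetical helper, not an existing GCC function, and it does not
address the target-dependence of the label prefix raised above.

```cpp
#include <cassert>
#include <cstring>

// True if STR begins with PREFIX.  Compares at most strlen (PREFIX)
// characters, unlike strstr (STR, PREFIX) == STR.
inline bool
has_prefix (const char *str, const char *prefix)
{
  return std::strncmp (str, prefix, std::strlen (prefix)) == 0;
}

inline void
prefix_example ()
{
  assert (has_prefix ("_.__odr_asan_foo", "_.__odr_asan_"));
  // A match later in the string is not a prefix match.
  assert (!has_prefix ("foo_.__odr_asan_", "_.__odr_asan_"));
}
```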

> +  tree var = build_decl (UNKNOWN_LOCATION, VAR_DECL, get_identifier (sym_name),
> +  char_type_node);
> +  TREE_ADDRESSABLE (var) = TREE_ADDRESSABLE (decl);

How is addressability of the original decl related to addressability of the
indicator?  If you take address of the indicator (it is stored in the
structure), it should be just 1.

> +  TREE_READONLY (var) = 0;
> +  TREE_THIS_VOLATILE (var) = 1;
> +  DECL_GIMPLE_REG_P (var) = DECL_GIMPLE_REG_P (decl);

Again, how is this related?  Just store 0.

> +  DECL_ARTIFICIAL (var) = 1;
> +  DECL_IGNORED_P (var) = DECL_IGNORED_P (decl);

The indicators should surely not be recorded in debug info, so DECL_IGNORED_P
should be 1.

> +  TREE_STATIC (var) = 1;
> +  TREE_PUBLIC (var) = 1;
> +  DECL_VISIBILITY (var) = DECL_VISIBILITY (decl);

Are they meant to have the same visibility and be exported from DSOs if the
original var is?

> @@ -2313,8 +2374,7 @@ asan_add_global (tree decl, tree type, vec *v)
>else
>  locptr = build_int_cst (uptr, 0);
>CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, locptr);
> -  /* TODO: support ODR indicators.  */
> -  CONSTRUCTOR_APPEND_ELT(vinner, NULL_TREE, build_int_cst (uptr, 0));
> +  CONSTRUCTOR_APPEND_ELT(vinner, NULL_TREE, odr_indicator_ptr);

Formatting, missing space before (, both in this patch and in the previous
one.

Jakub


Re: [PATCH] Make direct emission of time profiler counter

2016-11-07 Thread Martin Liška
On 11/05/2016 09:38 AM, Jan Hubicka wrote:
> Looks OK if it passes.
> 
> Honza

Thanks, fixed on trunk as r241894.
Martin


Re: [PATCH] combine lhs zero_extract fix (PR78186)

2016-11-07 Thread Segher Boessenkool
Hi Christophe,

On Fri, Nov 04, 2016 at 02:31:28PM +0100, Christophe Lyon wrote:
> Since this commit I have noticed execution failures on "old" arm targets:
> 
>   gcc.dg/torture/pr48124-4.c   -O1  execution test
>   gcc.dg/torture/pr48124-4.c   -O2  execution test
>   gcc.dg/torture/pr48124-4.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
>   gcc.dg/torture/pr48124-4.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
>   gcc.dg/torture/pr48124-4.c   -O3 -g  execution test
>   gcc.dg/torture/pr48124-4.c   -Os  execution test
> 
> For instance on target arm-none-linux-gnueabi --with-cpu=cortex-a9
> --with-mode=arm and running the tests with -march=armv5t

Confirmed.  What a nasty, nasty bug, and it has been here for decades
it seems.  Could you please open a PR?


Segher


Re: [PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko



On 07/11/16 11:39, Jakub Jelinek wrote:
> On Mon, Nov 07, 2016 at 11:22:28AM +0300, Maxim Ostapenko wrote:
>> this patch set performs libsanitizer merge from upstream.
>>
>> Patch 1 is the library merge itself.
>>
>> Patch 2 is the reapplied change for SPARC by David S. Miller.
>>
>> Patch 3 changes heuristic for extracting last PC from stack frame for ARM in
>> fast unwind routine. More details can be found here
>> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61771).
>>
>> Patch 4 replaces Jakub's fix for
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63888 and removes
>> CheckODRViolationViaPoisoning call from RegisterGlobal to avoid false
>> positive odr violation reports.
>>
>> Patch 5 combines necessary compiler changes.
>>
>> Patch 6 adds several new tests, backported from upstream.
>>
>> Patch 7 adds support for ASan odr indicators at compiler side.
>>
>> The whole patch set was regtested/bootstrapped/ASan bootstrapped on
>> x86_64-unknown-linux-gnu and i386-unknown-linux-gnu.
>> Also, passed regression tests on arm-linux-gnueabi and aarch64-linux under
>> QEMU.
>
> So, libasan.so.* is again ABI incompatible, but libtsan and libubsan stay
> (hopefully) backwards ABI compatible?

libubsan is definitely compatible.
For libtsan we have several changes:

1) Several interceptors (34 of them) were added and __interceptor_lstat{64}
were removed.

2) __interceptor_strchr has change in its parameters type:
__interceptor_strchr(char*, int) -> __interceptor_strchr(const char*, int)

3) tsan's internal type __tsan::ReportDesc has several changes, but it seems
that this doesn't introduce ABI incompatibility with compiler side.

Full abidiff listing is attached.

So, I suppose libtsan is also compatible.

-Maxim

> Jakub


Functions changes summary: 4 Removed, 3 Changed (70 filtered out), 34 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
Function symbols changes summary: 0 Removed, 10 Added function symbols not referenced by debug info
Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info

4 Removed functions:

  'function int __interceptor_lstat(const char*, void*)'{lstat, aliases __interceptor_lstat}
  'function int __interceptor_lstat64(const char*, void*)'{lstat64, aliases __interceptor_lstat64}
  'function int __interceptor_stat(const char*, void*)'{__interceptor_stat, aliases stat}
  'function int __interceptor_stat64(const char*, void*)'{stat64, aliases __interceptor_stat64}

34 Added functions:

  'function char* __interceptor_ctermid(char*)'{__interceptor_ctermid, aliases ctermid}
  'function int __interceptor_epoll_pwait(int, void*, int, int, void*)'{epoll_pwait, aliases __interceptor_epoll_pwait}
  'function int __interceptor_eventfd_read(int, __sanitizer::u64*)'{eventfd_read, aliases __interceptor_eventfd_read}
  'function int __interceptor_eventfd_write(int, __sanitizer::u64)'{__interceptor_eventfd_write, aliases eventfd_write}
  'function void* __interceptor_memmem(SIZE_T, SIZE_T)'{__interceptor_memmem, aliases memmem}
  'function int __interceptor_pthread_sigmask(int, const __sanitizer::__sanitizer_sigset_t*, __sanitizer::__sanitizer_sigset_t*)'{__interceptor_pthread_sigmask, aliases pthread_sigmask}
  'function SSIZE_T __interceptor_recvfrom(int, void*, SIZE_T, int, void*, int*)'{__interceptor_recvfrom, aliases recvfrom}
  'function SSIZE_T __interceptor_sendto(int, void*, SIZE_T, int, void*, int)'{__interceptor_sendto, aliases sendto}
  'function int __interceptor_sigblock(int)'{sigblock, aliases __interceptor_sigblock}
  'function int __interceptor_sigsetmask(int)'{sigsetmask, aliases __interceptor_sigsetmask}
  'function SIZE_T __interceptor_strnlen(const char*, SIZE_T)'{__interceptor_strnlen, aliases strnlen}
  'function int __interceptor_ttyname_r(int, char*, SIZE_T)'{__interceptor_ttyname_r, aliases ttyname_r}
  'function void __sanitizer_cov_trace_pc_guard_init()'{__sanitizer_cov_trace_pc_guard_init}
  'function int __sanitizer_install_malloc_and_free_hooks(void (typedef __sanitizer::uptr)*, void ()*)'{__sanitizer_install_malloc_and_free_hooks}
  'function void __sanitizer_set_report_fd(void*)'{__sanitizer_set_report_fd}
  'function void __sanitizer_symbolize_global(__sanitizer::uptr, const char*, char*, __sanitizer::uptr)'{__sanitizer_symbolize_global}
  'function void __sanitizer_symbolize_pc(__sanitizer::uptr, const char*, char*, __sanitizer::uptr)'{__sanitizer_symbolize_pc}
  'function void __sanitizer_syscall_post_impl_rt_sigaction(long int, long int, const __sanitizer::__sanitizer_kernel_sigaction_t*, __sanitizer::__sanitizer_kernel_sigaction_t*, SIZE_T)'{__sanitizer_syscall_post_impl_rt_sigaction}
  'function void __sanitizer_syscall_post_impl_sigaction(long int, long int, const __sanitizer::__sanitizer_kernel_sigaction_t*, __sanitizer::__sanitizer_kernel_sigaction_t*)'{__sanitizer_syscall_post_impl_sigaction}
  'function void __sanitizer_syscall_pre_impl_rt_sigactio

Re: [PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Jakub Jelinek
On Mon, Nov 07, 2016 at 12:14:39PM +0300, Maxim Ostapenko wrote:
> libubsan is definitely compatible.

Nice.

> For libtsan we have several changes:
> 
> 1) Several interceptors (34 of them) were added and __interceptor_lstat{64}
> were removed.

That is bad, I think we need to readd those and perhaps just do what
lstat*/stat* do.  Weren't we solving the same thing a year ago on some other
symbol?

> 2) __interceptor_strchr has change in its parameters type:
> __interceptor_strchr(char*, int) -> __interceptor_strchr(const char*, int)

That is not a big deal, the function is extern "C".

> 3) tsan's internal type __tsan::ReportDesc has several changes, but it seems
> that this doesn't introduce ABI incompatibility with compiler side.

If __tsan::ReportDesc is not defined in publicly installed headers, I think
we are fine.

Jakub


Re: [PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Yuri Gribov
On Mon, Nov 7, 2016 at 9:20 AM, Jakub Jelinek  wrote:
> On Mon, Nov 07, 2016 at 12:14:39PM +0300, Maxim Ostapenko wrote:
>> libubsan is definitely compatible.
>
> Nice.
>
>> For libtsan we have several changes:
>>
>> 1) Several interceptors (34 of them) were added and __interceptor_lstat{64}
>> were removed.
>
> That is bad, I think we need to readd those and perhaps just do what
> lstat*/stat* do.  Weren't we solving the same thing a year ago on some other
> symbol?
>
>> 2) __interceptor_strchr has change in its parameters type:
>> __interceptor_strchr(char*, int) -> __interceptor_strchr(const char*, int)
>
> That is not a big deal, the function is extern "C".
>
>> 3) tsan's internal type __tsan::ReportDesc has several changes, but it seems
>> that this doesn't introduce ABI incompatibility with compiler side.
>
> If __tsan::ReportDesc is not defined in publicly installed headers, I think
> we are fine.

As a side note, why is it in the list of exported symbols?

-I


[PATCH] rs6000: Do swdiv at expand time

2016-11-07 Thread Segher Boessenkool
We transform floating point divide instructions to a faster series of
simple instructions, "swdiv".  Currently we do not do that until the
first splitter pass, which is much too late for most optimisations
that can happen on those new instructions, e.g. the constant loads
are not CSEd inside an unrolled loop.  This patch changes things so
those divide instructions are expanded during expand already.
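
For readers who have not looked at this kind of expansion: the general idea
is to replace the divide with a reciprocal estimate that is refined by
Newton-Raphson steps and then multiplied by the numerator.  The sketch below
only illustrates that technique in plain C++; it is not what
rs6000_emit_swdiv emits, the name swdiv_sketch is made up, and the linear
first guess merely stands in for a hardware estimate instruction.

```cpp
#include <cassert>
#include <cmath>

// Illustrative only: compute N / D with multiplies and adds, assuming a
// finite, non-zero D (compare the flag_finite_math_only and
// !flag_trapping_math conditions in the patch).
static double
swdiv_sketch (double n, double d)
{
  assert (std::isfinite (d) && d != 0.0);

  // Split |d| into m * 2^e with m in [0.5, 1).
  int e;
  double m = std::frexp (std::fabs (d), &e);

  // Linear first guess for 1/m: 48/17 - (32/17) * m, with the two
  // constants written out so that no division is needed.
  double x = 2.8235294117647058 - 1.8823529411764706 * m;

  // Newton-Raphson: each step roughly squares the relative error.
  for (int i = 0; i < 4; i++)
    x = x * (2.0 - m * x);

  // 1/d = (1/m) * 2^-e, with the sign of d restored.
  double recip = std::copysign (std::ldexp (x, -e), d);
  return n * recip;
}
```

The refinement steps are exactly the "new instructions" that later passes can
optimise once they are visible from expand onwards, e.g. the constant 2.0
above is the sort of load one would like CSEd in an unrolled loop.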

Bootstrapped and tested on powerpc64-linux; Bill has run SPEC on it,
and if anything it shows a slight improvement.

Is this okay for trunk?


Segher


---
 gcc/config/rs6000/rs6000.md | 10 +-
 gcc/config/rs6000/vector.md | 10 +-
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index e432a5a..e08f120 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4457,7 +4457,15 @@ (define_expand "div3"
(div:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")
  (match_operand:SFDF 2 "gpc_reg_operand" "")))]
   "TARGET__INSN && !TARGET_SIMPLE_FPU"
-  "")
+{
+  if (RS6000_RECIP_AUTO_RE_P (mode)
+  && can_create_pseudo_p () && flag_finite_math_only
+  && !flag_trapping_math && flag_reciprocal_math)
+{
+  rs6000_emit_swdiv (operands[0], operands[1], operands[2], true);
+  DONE;
+}
+})
 
 (define_insn "*div3_fpr"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 7240345..05f3bdb 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -248,7 +248,15 @@ (define_expand "div3"
(div:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
   (match_operand:VEC_F 2 "vfloat_operand" "")))]
   "VECTOR_UNIT_VSX_P (mode)"
-  "")
+{
+  if (RS6000_RECIP_AUTO_RE_P (mode)
+  && can_create_pseudo_p () && flag_finite_math_only
+  && !flag_trapping_math && flag_reciprocal_math)
+{
+  rs6000_emit_swdiv (operands[0], operands[1], operands[2], true);
+  DONE;
+}
+})
 
 (define_expand "neg2"
   [(set (match_operand:VEC_F 0 "vfloat_operand" "")
-- 
1.9.3



Re: [PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko

On 07/11/16 12:28, Yuri Gribov wrote:
> On Mon, Nov 7, 2016 at 9:20 AM, Jakub Jelinek  wrote:
>> On Mon, Nov 07, 2016 at 12:14:39PM +0300, Maxim Ostapenko wrote:
>>> libubsan is definitely compatible.
>>
>> Nice.
>>
>>> For libtsan we have several changes:
>>>
>>> 1) Several interceptors (34 of them) were added and __interceptor_lstat{64}
>>> were removed.
>>
>> That is bad, I think we need to readd those and perhaps just do what
>> lstat*/stat* do.  Weren't we solving the same thing a year ago on some other
>> symbol?
>>
>>> 2) __interceptor_strchr has change in its parameters type:
>>> __interceptor_strchr(char*, int) -> __interceptor_strchr(const char*, int)
>>
>> That is not a big deal, the function is extern "C".
>>
>>> 3) tsan's internal type __tsan::ReportDesc has several changes, but it seems
>>> that this doesn't introduce ABI incompatibility with compiler side.
>>
>> If __tsan::ReportDesc is not defined in publicly installed headers, I think
>> we are fine.
>
> As a side note, why is it in the list of exported symbols?

Because it appears as the type of a parameter of the exported
__tsan::OnReport function:

// Can be overriden by an application/test to intercept reports.
#ifdef TSAN_EXTERNAL_HOOKS
bool OnReport(const ReportDesc *rep, bool suppressed);
#else
SANITIZER_WEAK_CXX_DEFAULT_IMPL
bool OnReport(const ReportDesc *rep, bool suppressed) {
  (void)rep;
  return suppressed;
}
#endif

This function can be overridden by an application for debugging purposes,
though.

> -I


Re: [rs6000] Fix reload failures in 64-bit mode with no special constant pool

2016-11-07 Thread Eric Botcazou
> Now you don't need to have a special pool to call create_TOC_reference, you
> can call it for regular TOC references as well, as done a few lines above:
> 
>   /* If this is a SYMBOL_REF that refers to a constant pool entry,
>and we have put it in the TOC, we just need to make a TOC-relative
>reference to it.  */
>   if (TARGET_TOC
> && GET_CODE (operands[1]) == SYMBOL_REF
> && use_toc_relative_ref (operands[1], mode))
>   operands[1] = create_TOC_reference (operands[1], operands[0]);
> 
> So the attached patch does it there too.
> 
> Tested on PowerPC64/Linux (LRA) and VxWorks (reload), OK for the mainline?

Revised version attached, with Pmode formally changed to mode (but
mode == Pmode here so no functional change whatsoever).

Tested on PowerPC64/Linux, OK for the mainline?


   * config/rs6000/rs6000.c (rs6000_emit_move): Also use a TOC reference
after forcing to constant memory when the code model is medium.

-- 
Eric Botcazou
Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c	(revision 241856)
+++ config/rs6000/rs6000.c	(working copy)
@@ -10673,10 +10673,7 @@ rs6000_emit_move (rtx dest, rtx source,
 
 	  if (TARGET_TOC
 	  && GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF
-	  && constant_pool_expr_p (XEXP (operands[1], 0))
-	  && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (
-			get_pool_constant (XEXP (operands[1], 0)),
-			get_pool_mode (XEXP (operands[1], 0
+	  && use_toc_relative_ref (XEXP (operands[1], 0), mode))
 	{
 	  rtx tocref = create_TOC_reference (XEXP (operands[1], 0),
 		 operands[0]);


Re: [PATCH] fix a few minor nits in -Walloca documentation

2016-11-07 Thread Richard Biener
On Sat, Nov 5, 2016 at 3:25 AM, Jeff Law  wrote:
> On 11/04/2016 06:26 PM, Martin Sebor wrote:
>>
>> While experimenting with -Walloca and cross-referencing the manual
>> I noticed a few minor nits that I thought could stand to be corrected
>> and/or clarified.  Attached is a patch.
>>
>> In the update I mentioned that the alloca argument must have integer
>> type for the bounds checking to be recognized to make it clear that
>> for example floating point arguments are not considered to be bounded
>> even if they are constrained.  (Apparently VRP doesn't handle those.)
>
> Right.  VRP doesn't handle floating point.  There's been some talk of
> starting to track a few key values so we can say things like "this is not a
> NaN".

Yup.  Basically add sth along SSA_NAME_RANGE_INFO for floats
and track answers to isnan, isnormal, etc. -- basically record
fpclassify () for each SSA name.

I'd do this conveniently in tree-ssa-forwprop.c which iterates in RPO
order, folding all stmts.  The actual worker would be a

int
gimple_fpclassify (gimple *stmt)

function classifying the result of stmt (using that SSA info on arguments).
Or if you want it really fancy do it decomposed,

int op_fpclassify (enum tree_code code, tree arg1 [, tree arg2 [, tree arg3]])
int op_fpclassify (enum built_in_function, tree arg1 [, tree arg2 [, tree arg3]])

Wherever we test stuff like HONOR_NANS we can replace it with sth
operand specific that also evaluates the SSA info.

It shouldn't be much work to start sth along this line.
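
Very roughly, and only as a sketch of the idea rather than a proposal for the
actual interface, the decomposed worker could operate on a bitmask of
possible fpclassify () classes per operand.  Every name below
(fp_class_mask, fp_op, op_fpclassify_sketch) is made up for illustration and
works on plain integers instead of trees and SSA info.

```cpp
#include <cassert>

// Hypothetical lattice: a bitmask of the fpclassify () classes a value
// may belong to.  None of these names exist in GCC.
enum fp_class_mask : unsigned
{
  FPC_NAN = 1u << 0,
  FPC_INF = 1u << 1,
  FPC_ZERO = 1u << 2,
  FPC_SUBNORMAL = 1u << 3,
  FPC_NORMAL = 1u << 4,
  FPC_ANY = (1u << 5) - 1
};

enum fp_op { FP_ABS, FP_NEGATE, FP_MULT };

// Classify the result of CODE applied to operands whose possible
// classes are ARG1 and ARG2 (ARG2 is ignored for unary codes).
static unsigned
op_fpclassify_sketch (fp_op code, unsigned arg1, unsigned arg2 = FPC_ANY)
{
  assert ((arg1 & ~FPC_ANY) == 0 && (arg2 & ~FPC_ANY) == 0);
  switch (code)
    {
    case FP_ABS:
    case FP_NEGATE:
      // Sign changes never move a value to another class.
      return arg1;
    case FP_MULT:
      {
        // NaN can only come from a NaN operand or from Inf * 0.
        bool nan = ((arg1 | arg2) & FPC_NAN)
                   || ((arg1 & FPC_INF) && (arg2 & FPC_ZERO))
                   || ((arg1 & FPC_ZERO) && (arg2 & FPC_INF));
        // Be conservative about everything except NaN.
        return nan ? FPC_ANY : (FPC_ANY & ~FPC_NAN);
      }
    }
  return FPC_ANY;
}
```

A HONOR_NANS-style query would then become a test of the FPC_NAN bit on the
operand in question.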

Richard.

>
> The patch is OK for the trunk.
>
> Thanks,
> Jeff


Re: Simplify X / X, 0 / X and X % X

2016-11-07 Thread Richard Biener
On Fri, Nov 4, 2016 at 9:07 PM, Marc Glisse  wrote:
> Hello,
>
> since we were discussing this recently...
>
> The condition is copied from the existing 0 % X case, visible in the context
> of the diff.
>
> As far as I understand, the main case where we do not want to optimize is
> during constexpr evaluation in the C++ front-end (it wants to detect the
> undefined behavior), and with late folding I think this means we only need
> to care about an explicit 0/0, not about X/X where X would become 0 after
> the simplification.
>
> And later, if we do have something like X/0, we could handle it the same way
> as we currently handle *(char*)0, insert a trap after that instruction and
> clear the following code, which likely gives better code than replacing 0/0
> with 1.
>
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
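
For the archive, the rules as I read the description can be modelled on a toy
operand type as below.  This is an assumption-laden sketch, not the match.pd
patterns themselves: operand, same_operand_p, literal_zero_p and
simplify_div_sketch are all invented names, and the only point is that an
explicit zero divisor is left alone so that 0/0 can still be diagnosed.

```cpp
#include <cassert>

// Toy operand: either a named variable or an integer constant.
// Purely illustrative; match.pd works on GIMPLE/GENERIC trees.
struct operand
{
  bool is_const;
  long value;   // meaningful when is_const
  int var_id;   // meaningful when !is_const
};

static bool
same_operand_p (const operand &a, const operand &b)
{
  return a.is_const == b.is_const
         && (a.is_const ? a.value == b.value : a.var_id == b.var_id);
}

static bool
literal_zero_p (const operand &o)
{
  return o.is_const && o.value == 0;
}

enum div_op { OP_DIV, OP_MOD };

// Store the folded constant in *RESULT and return true, or return false
// if no rule applies.  An explicit zero divisor is never folded.
static bool
simplify_div_sketch (div_op op, const operand &lhs, const operand &rhs,
                     long *result)
{
  assert (result != nullptr);
  if (literal_zero_p (rhs))
    return false;
  if (same_operand_p (lhs, rhs))
    {
      *result = (op == OP_DIV) ? 1 : 0;   // X / X -> 1, X % X -> 0
      return true;
    }
  if (literal_zero_p (lhs))
    {
      *result = 0;                        // 0 / X -> 0, 0 % X -> 0
      return true;
    }
  return false;
}
```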

Ok.

Thanks,
Richard.

> 2016-11-07  Marc Glisse  
>
> gcc/
> * match.pd (0 / X, X / X, X % X): New simplifications.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/divide-5.c: New file.
>
> --
> Marc Glisse


Re: Simplify X / X, 0 / X and X % X

2016-11-07 Thread Richard Biener
On Sat, Nov 5, 2016 at 3:30 AM, Jeff Law  wrote:
> On 11/04/2016 02:07 PM, Marc Glisse wrote:
>>
>> Hello,
>>
>> since we were discussing this recently...
>>
>> The condition is copied from the existing 0 % X case, visible in the
>> context of the diff.
>>
>> As far as I understand, the main case where we do not want to optimize
>> is during constexpr evaluation in the C++ front-end (it wants to detect
>> the undefined behavior), and with late folding I think this means we
>> only need to care about an explicit 0/0, not about X/X where X would
>> become 0 after the simplification.
>>
>> And later, if we do have something like X/0, we could handle it the same
>> way as we currently handle *(char*)0, insert a trap after that
>> instruction and clear the following code, which likely gives better code
>> than replacing 0/0 with 1.
>
> Yup.  I'd prefer to insert a trap if we ultimately expose a division by zero
> -- including cases where that division occurs as a result of a PHI arg being
> zero and the PHI result being used as a denominator in a division
> expression.
>
> It ought to be extremely easy to detect & transform that case (and probably
> warn for it too).

We have gimple-ssa-isolate-paths.c for that, right?

Richard.

>
>
> I'm leaving the actual review to Richi.
> jeff
>


Re: [PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko

On 07/11/16 12:20, Jakub Jelinek wrote:
> On Mon, Nov 07, 2016 at 12:14:39PM +0300, Maxim Ostapenko wrote:
>> libubsan is definitely compatible.
>
> Nice.
>
>> For libtsan we have several changes:
>>
>> 1) Several interceptors (34 of them) were added and __interceptor_lstat{64}
>> were removed.
>
> That is bad, I think we need to readd those and perhaps just do what
> lstat*/stat* do.  Weren't we solving the same thing a year ago on some other
> symbol?

Yeah, that was __tls_get_addr. Actually, *stat interceptors were moved
from tsan to common, but it seems that lstat/lstat64 were missed. This
should be fixed upstream, I suppose.

>> 2) __interceptor_strchr has change in its parameters type:
>> __interceptor_strchr(char*, int) -> __interceptor_strchr(const char*, int)
>
> That is not a big deal, the function is extern "C".
>
>> 3) tsan's internal type __tsan::ReportDesc has several changes, but it seems
>> that this doesn't introduce ABI incompatibility with compiler side.
>
> If __tsan::ReportDesc is not defined in publicly installed headers, I think
> we are fine.

I don't see __tsan::ReportDesc in any tsan interface header:

$ grep -nr ReportDesc libsanitizer/tsan/tsan_interface*
$

But since tsan has weak

SANITIZER_WEAK_CXX_DEFAULT_IMPL
bool OnReport(const ReportDesc *rep, bool suppressed) {
...
}

that can be overridden by a C++ application (for debugging purposes,
though), is it OK not to change the libtsan version?

> Jakub


Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v3)

2016-11-07 Thread Martin Liška
Hello.

After discussion with Jakub, I'm resending a new version of the patch, where
I changed the following:
1) gimplify_ctxp->live_switch_vars is used to track variables introduced in
   switch_expr. Every time a case_label_expr is seen, these are unpoisoned.
   It's quite conservative, however it covers all corner cases one can come
   up with. Compared to clang, we are much more precise in switch statements
   where a variable's liveness crosses a label boundary.
2) I found a bug where ASAN_CHECK was optimized out due to a missing check of
   the IFN_ASAN_MARK internal fn. A test was added for that.
3) Multiple switch tests have been added, which are going to be sent in an
   upcoming email.

The patch can bootstrap on ppc64le-redhat-linux and survives regression tests
(+ asan bootstrap finishes successfully).

Martin
From 2b37a59dd639ad740fdbd49d57b9f1975fc35046 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 3 May 2016 15:35:22 +0200
Subject: [PATCH 1/2] Introduce -fsanitize-address-use-after-scope

gcc/c-family/ChangeLog:

2016-10-27  Martin Liska  

	* c-warn.c (warn_for_unused_label): Save all labels used
	in goto or in &label.

gcc/ChangeLog:

2016-10-27  Martin Liska  

	* asan.c (enum asan_check_flags): Move the enum to header file.
	(asan_init_shadow_ptr_types): Make type creation more generic.
	(shadow_mem_size): New function.
	(asan_emit_stack_protection): Use newly added ASAN_SHADOW_GRANULARITY.
	Rewritten stack unpoisoning code.
	(build_shadow_mem_access): Add new argument return_address.
	(instrument_derefs): Instrument local variables if use after scope
	sanitization is enabled.
	(asan_store_shadow_bytes): New function.
	(asan_expand_mark_ifn): Likewise.
	(asan_sanitize_stack_p): Moved from asan_sanitize_stack_p.
	* asan.h (enum asan_mark_flags): Moved here from asan.c
	(asan_protect_stack_decl): Protect all declaration that need
	to live in memory.
	(asan_sanitize_use_after_scope): New function.
	(asan_no_sanitize_address_p): Likewise.
	* cfgexpand.c (partition_stack_vars): Consider
	asan_sanitize_use_after_scope in condition.
	(expand_stack_vars): Likewise.
	* common.opt (-fsanitize-address-use-after-scope): New option.
	* doc/invoke.texi (use-after-scope-direct-emission-threshold):
	Explain the parameter.
	* flag-types.h (enum sanitize_code): Define SANITIZE_USE_AFTER_SCOPE.
	* gimplify.c (build_asan_poison_call_expr): New function.
	(asan_poison_variable): Likewise.
	(gimplify_bind_expr): Generate poisoning/unpoisoning for local
	variables that have address taken.
	(gimplify_decl_expr): Likewise.
	(gimplify_target_expr): Likewise for C++ temporaries.
	(sort_by_decl_uid): New function.
	(gimplify_expr): Unpoison all variables for a label we can jump
	from outside of a scope.
	(gimplify_switch_expr): Unpoison variables defined in the switch
	context.
	(gimplify_function_tree): Clear asan_poisoned_variables.
	(asan_poison_variables): New function.
	(warn_switch_unreachable_r): Handle IFN_ASAN_MARK.
	* internal-fn.c (expand_ASAN_MARK): New function.
	* internal-fn.def (ASAN_MARK): Declare.
	* opts.c (finish_options): Handle -fstack-reuse if
	-fsanitize-address-use-after-scope is enabled.
	(common_handle_option): Enable address sanitization if
	-fsanitize-address-use-after-scope is enabled.
	* params.def (PARAM_USE_AFTER_SCOPE_DIRECT_EMISSION_THRESHOLD):
	New parameter.
	* params.h: Likewise.
	* sancov.c (pass_sanopt::execute): Handle IFN_ASAN_MARK.
	* sanitizer.def: Define __asan_poison_stack_memory and
	__asan_unpoison_stack_memory functions.
	* asan.c (asan_mark_poison_p): New function.
	(transform_statements): Handle asan_mark_poison_p calls.
	* gimple.c (nonfreeing_call_p): Handle IFN_ASAN_MARK.
---
 gcc/asan.c| 302 +-
 gcc/asan.h|  66 +--
 gcc/c-family/c-warn.c |   9 +-
 gcc/cfgexpand.c   |  18 +--
 gcc/common.opt|   3 +
 gcc/doc/invoke.texi   |  15 ++-
 gcc/gimple.c  |   3 +
 gcc/gimplify.c| 234 +++---
 gcc/internal-fn.c |   9 ++
 gcc/internal-fn.def   |   1 +
 gcc/opts.c|  27 -
 gcc/params.def|   6 +
 gcc/params.h  |   2 +
 gcc/sanitizer.def |   4 +
 gcc/sanopt.c  |   3 +
 15 files changed, 603 insertions(+), 99 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index c6d9240..1e0ce8d 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -245,6 +245,22 @@ static unsigned HOST_WIDE_INT asan_shadow_offset_value;
 static bool asan_shadow_offset_computed;
 static vec sanitized_sections;
 
+/* Return true if STMT is ASAN_MARK poisoning internal function call.  */
+static inline bool
+asan_mark_poison_p (gimple *stmt)
+{
+  return (gimple_call_internal_p (stmt, IFN_ASAN_MARK)
+	  && tree_to_uhwi (gimple_call_arg (stmt, 0)) == ASAN_MARK_CLOBBER);
+
+}
+
+/* Set of variable declarations that are going to be guarded by
+   use-after-scope sanitizer.  */
+
+static hash_set *asan_handled_variables = NULL;
+
+hash_set  *asan_used_labels = N

Re: [PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Jakub Jelinek
On Mon, Nov 07, 2016 at 11:22:28AM +0300, Maxim Ostapenko wrote:
> Hi,
> 
> this patch set performs libsanitizer merge from upstream.
> 
> Patch 1 is the library merge itself.
> 
> Patch 2 is the reapplied change for SPARC by David S. Miller.
> 
> Patch 3 changes heuristic for extracting last PC from stack frame for ARM in
> fast unwind routine. More details can be found here
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61771).
> 
> Patch 4 replaces Jakub's fix for
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63888 and removes
> CheckODRViolationViaPoisoning call from RegisterGlobal to avoid false
> positive odr violation reports.
> 
> Patch 5 combines necessary compiler changes.
> 
> Patch 6 adds several new tests, backported from upstream.

The patches 1-6 are ok for trunk now, if you fix the missing space
before ( in patch 5.

> Patch 7 adds support for ASan odr indicators at compiler side.

This one can be applied incrementally once the issues reported in there
are resolved.

And the libtsan ABI stuff (__intercept*stat*) can be resolved incrementally
too.

Thanks.

Jakub


Re: [PATCH, 02/N] Introduce tests for -fsanitize-address-use-after-scope (v3)

2016-11-07 Thread Martin Liška
Third version of the patch.

Martin
From e790d926afd3d2d6ad41d14d1e91698bf651b41a Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 19 Sep 2016 17:39:29 +0200
Subject: [PATCH 2/2] Introduce tests for -fsanitize-address-use-after-scope

gcc/testsuite/ChangeLog:

2016-09-26  Martin Liska  

	* c-c++-common/asan/force-inline-opt0-1.c: Disable
	-f-sanitize-address-use-after-scope.
	* c-c++-common/asan/inc.c: Change number of expected ASAN_CHECK
	internal fn calls.
	* g++.dg/asan/use-after-scope-1.C: New test.
	* g++.dg/asan/use-after-scope-2.C: Likewise.
	* g++.dg/asan/use-after-scope-3.C: Likewise.
	* g++.dg/asan/use-after-scope-types-1.C: Likewise.
	* g++.dg/asan/use-after-scope-types-2.C: Likewise.
	* g++.dg/asan/use-after-scope-types-3.C: Likewise.
	* g++.dg/asan/use-after-scope-types-4.C: Likewise.
	* g++.dg/asan/use-after-scope-types-5.C: Likewise.
	* g++.dg/asan/use-after-scope-types.h: Likewise.
	* gcc.dg/asan/use-after-scope-1.c: Likewise.
	* gcc.dg/asan/use-after-scope-2.c: Likewise.
	* gcc.dg/asan/use-after-scope-3.c: Likewise.
	* gcc.dg/asan/use-after-scope-4.c: Likewise.
	* gcc.dg/asan/use-after-scope-5.c: Likewise.
	* gcc.dg/asan/use-after-scope-6.c: Likewise.
	* gcc.dg/asan/use-after-scope-7.c: Likewise.
	* gcc.dg/asan/use-after-scope-8.c: Likewise.
	* gcc.dg/asan/use-after-scope-9.c: Likewise.
	* gcc.dg/asan/use-after-scope-switch-1.c: Likewise.
	* gcc.dg/asan/use-after-scope-switch-2.c: Likewise.
	* gcc.dg/asan/use-after-scope-switch-3.c: Likewise.
	* gcc.dg/asan/use-after-scope-goto-1.c: Likewise.
	* gcc.dg/asan/use-after-scope-goto-2.c: Likewise.
---
 .../c-c++-common/asan/force-inline-opt0-1.c|  1 +
 gcc/testsuite/c-c++-common/asan/inc.c  |  3 +-
 gcc/testsuite/g++.dg/asan/use-after-scope-1.C  | 21 ++
 gcc/testsuite/g++.dg/asan/use-after-scope-2.C  | 40 ++
 gcc/testsuite/g++.dg/asan/use-after-scope-3.C  | 22 ++
 .../g++.dg/asan/use-after-scope-types-1.C  | 17 
 .../g++.dg/asan/use-after-scope-types-2.C  | 17 
 .../g++.dg/asan/use-after-scope-types-3.C  | 17 
 .../g++.dg/asan/use-after-scope-types-4.C  | 17 
 .../g++.dg/asan/use-after-scope-types-5.C  | 17 
 gcc/testsuite/g++.dg/asan/use-after-scope-types.h  | 30 ++
 gcc/testsuite/gcc.dg/asan/use-after-scope-1.c  | 18 +
 gcc/testsuite/gcc.dg/asan/use-after-scope-2.c  | 47 ++
 gcc/testsuite/gcc.dg/asan/use-after-scope-3.c  | 20 +
 gcc/testsuite/gcc.dg/asan/use-after-scope-4.c  | 16 
 gcc/testsuite/gcc.dg/asan/use-after-scope-5.c  | 27 +
 gcc/testsuite/gcc.dg/asan/use-after-scope-6.c  | 15 +++
 gcc/testsuite/gcc.dg/asan/use-after-scope-7.c  | 15 +++
 gcc/testsuite/gcc.dg/asan/use-after-scope-8.c  | 14 +++
 gcc/testsuite/gcc.dg/asan/use-after-scope-9.c  | 20 +
 gcc/testsuite/gcc.dg/asan/use-after-scope-goto-1.c | 47 ++
 gcc/testsuite/gcc.dg/asan/use-after-scope-goto-2.c | 25 
 .../gcc.dg/asan/use-after-scope-switch-1.c | 25 
 .../gcc.dg/asan/use-after-scope-switch-2.c | 33 +++
 .../gcc.dg/asan/use-after-scope-switch-3.c | 36 +
 25 files changed, 559 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-1.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-2.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-3.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-1.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-2.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-3.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-4.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types-5.C
 create mode 100644 gcc/testsuite/g++.dg/asan/use-after-scope-types.h
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-3.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-4.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-5.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-6.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-7.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-8.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-9.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-goto-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-goto-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-switch-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-switch-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/use-after-scope-switch-3.c

diff --git a/gcc/testsuite/c-c++-common/asan/force-inlin

Re: [PATCH] Fix PR driver/78206 by silently ignoring EPERM as well as ENOENT

2016-11-07 Thread Richard Biener
On Sun, Nov 6, 2016 at 2:36 PM, Jack Howarth  wrote:
> The use of an Apple sandbox with denied file access permissions into
> /usr/local exposed that cc1 fails on errors of...
>
> cc1: error: /usr/local/include: Operation not permitted
>
> The commonly suggested solution of using --with-local-prefix= set to something
> other than /usr/local is undesirable on darwin because that creates a compiler
> which retains library searches in /usr/local/lib despite no longer searching
> for headers in /usr/local/include (which makes it susceptible to header/library
> mismatches during builds).
>
> The following trivial fix solves the issue by silently ignoring errors from
> denied permissions as well as non-existent dirs from the stat (cur->name, &st)
> call in remove_dup() of gcc/incpath.c. Okay for gcc trunk and backports to
> gcc-5-branch and gcc-6-branch?
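
As a note on what the description amounts to, here is a minimal sketch of the
errno handling on a POSIX system.  It is not the code in gcc/incpath.c; the
enum dir_status and the helper classify_include_dir are hypothetical names
used only to show ENOENT and EPERM being treated alike.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/stat.h>

enum dir_status { DIR_USABLE, DIR_SILENTLY_SKIPPED, DIR_ERROR };

// Hypothetical helper, not the code in gcc/incpath.c: classify an
// include directory the way the description suggests.
static dir_status
classify_include_dir (const char *name)
{
  assert (name != nullptr);
  struct stat st;
  if (stat (name, &st) != 0)
    {
      // Missing or permission-denied directories are dropped quietly;
      // any other failure would still be reported.
      if (errno == ENOENT || errno == EPERM)
        return DIR_SILENTLY_SKIPPED;
      return DIR_ERROR;
    }
  return S_ISDIR (st.st_mode) ? DIR_USABLE : DIR_ERROR;
}
```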

I think the patch is reasonable, thus it is ok (also for backporting).

Thanks,
Richard.

>Jack Howarth


Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v3)

2016-11-07 Thread Jakub Jelinek
On Mon, Nov 07, 2016 at 11:03:11AM +0100, Martin Liška wrote:
> Hello.
> 
> After discussion with Jakub, I'm resending a new version of the patch, where
> I changed the following:
> 1) gimplify_ctxp->live_switch_vars is used to track variables introduced in
>    switch_expr. Every time a case_label_expr is seen, these are unpoisoned.
>    It's quite conservative, however it covers all corner cases one can come
>    up with. Compared to clang, we are much more precise in switch statements
>    where a variable's liveness crosses a label boundary.
> 2) I found a bug where ASAN_CHECK was optimized out due to a missing check of
>    the IFN_ASAN_MARK internal fn. A test was added for that.
> 3) Multiple switch tests have been added, which are going to be sent in an
>    upcoming email.
> 
> The patch can bootstrap on ppc64le-redhat-linux and survives regression tests
> (+ asan bootstrap finishes successfully).

Ok for trunk.  Hopefully we can resolve the most common cases for switch
incrementally, either still during stage1 or early in stage3.

Jakub


Re: [match.pd] Fix for PR35691

2016-11-07 Thread Richard Biener
On Fri, 4 Nov 2016, Prathamesh Kulkarni wrote:

> On 4 November 2016 at 13:41, Richard Biener  wrote:
> > On Thu, 3 Nov 2016, Marc Glisse wrote:
> >
> >> On Thu, 3 Nov 2016, Richard Biener wrote:
> >>
> >> > > > > The transform would also work for vectors (element_precision for
> >> > > > > the test but also a value-matching zero which should ensure the
> >> > > > > same number of elements).
> >> > > > Um sorry, I didn't get how to check vectors to be of equal length by
> >> > > > a matching zero.
> >> > > > Could you please elaborate on that ?
> >> > >
> >> > > He may have meant something like:
> >> > >
> >> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
> >> >
> >> > I meant with one being @@2 to allow signed vs. unsigned @0/@1 which was
> >> > the point of the pattern.
> >>
> >> Oops, that's what I had written first, and then I somehow managed to
> >> confuse myself enough to remove it so as to remove the call to
> >> types_match :-(
> >>
> >> > > So the last operand is checked with operand_equal_p instead of
> >> > > integer_zerop. But the fact that we could compute bit_ior on the
> >> > > comparison results should already imply that the number of elements
> >> > > is the same.
> >> >
> >> > Though for equality compares we also allow scalar results IIRC.
> >>
> >> Oh, right, I keep forgetting that :-( And I have no idea how to generate
> >> one for a testcase, at least until the GIMPLE FE lands...
> >>
> >> > > On platforms that have IOR on floats (at least x86 with SSE, maybe some
> >> > > vector mode on s390?), it would be cool to do the same for floats (most
> >> > > likely at the RTL level).
> >> >
> >> > On GIMPLE view-converts could come to the rescue here as well.  Or we can
> >> > just allow bit-and/or on floats as much as we allow them on pointers.
> >>
> >> Would that generate sensible code on targets that do not have logic insns
> >> for floats? Actually, even on x86_64 that generates inefficient code, so
> >> there would be some work (for instance grep finds no gen_iordf3, only
> >> gen_iorv2df3).
> >>
> >> I am also a bit wary of doing those obfuscating optimizations too early...
> >> a==0 is something that other optimizations might use. long
> >> c=(long&)a|(long&)b; (double&)c==0; less so...
> >>
> >> (and I am assuming that signaling NaNs don't make the whole transformation
> >> impossible, which might be wrong)
> >
> > Yeah.  I also think it's not so much important - I just wanted to mention
> > vectors...
> >
> > Btw, I still think we need a more sensible infrastructure for passes
> > to gather, analyze and modify complex conditions.  (I'm always pointing
> > to tree-affine.c as an, albeit not very good, example for handling
> > a similar problem)
> Thanks for mentioning the value-matching capture @@, I wasn't aware of
> this match.pd feature.
> The current patch keeps it restricted to only bitwise operators on integers.
> Bootstrap+test running on x86_64-unknown-linux-gnu.
> OK to commit if passes ?

+/* PR35691: Transform
+   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
+   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
+

Please omit the vertical space

+(for bitop (bit_and bit_ior)
+ cmp (eq ne)
+ (simplify
+  (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop))

if you capture the first integer_zerop as @2 then you can re-use it...

+   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1)))
+(cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0));

... here in place of the { build_zero_cst ... }.

Ok with those changes.

Richard.
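
For readers who want to convince themselves of the identity behind the pattern, a small C sketch follows. The function names are made up for illustration. The operands have the same precision but different signedness, and y is converted to the type of x before the IOR, as in the pattern under review.

```c
#include <stdbool.h>

/* Original form: two comparisons against zero combined bitwise.  */
static bool
both_zero_before (unsigned x, int y)
{
  return (x == 0) & (y == 0);
}

/* Transformed form: one IOR and a single comparison.  The int to unsigned
   conversion is value-preserving modulo 2^N, so no bit of y is lost.  */
static bool
both_zero_after (unsigned x, int y)
{
  return (x | (unsigned) y) == 0;
}

/* Same idea for the "!= 0 |" variant.  */
static bool
any_nonzero_before (unsigned x, int y)
{
  return (x != 0) | (y != 0);
}

static bool
any_nonzero_after (unsigned x, int y)
{
  return (x | (unsigned) y) != 0;
}
```

The IOR of two values is zero exactly when both are zero, which is why a single compare suffices.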


Re: [PATCH, 02/N] Introduce tests for -fsanitize-address-use-after-scope (v3)

2016-11-07 Thread Jakub Jelinek
On Mon, Nov 07, 2016 at 11:04:23AM +0100, Martin Liška wrote:
> Third version of the patch.
> 
> Martin

> From e790d926afd3d2d6ad41d14d1e91698bf651b41a Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Mon, 19 Sep 2016 17:39:29 +0200
> Subject: [PATCH 2/2] Introduce tests for -fsanitize-address-use-after-scope
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-09-26  Martin Liska  
> 
>   * c-c++-common/asan/force-inline-opt0-1.c: Disable
>   -f-sanitize-address-use-after-scope.
>   * c-c++-common/asan/inc.c: Change number of expected ASAN_CHECK
>   internal fn calls.
>   * g++.dg/asan/use-after-scope-1.C: New test.
>   * g++.dg/asan/use-after-scope-2.C: Likewise.
>   * g++.dg/asan/use-after-scope-3.C: Likewise.
>   * g++.dg/asan/use-after-scope-types-1.C: Likewise.
>   * g++.dg/asan/use-after-scope-types-2.C: Likewise.
>   * g++.dg/asan/use-after-scope-types-3.C: Likewise.
>   * g++.dg/asan/use-after-scope-types-4.C: Likewise.
>   * g++.dg/asan/use-after-scope-types-5.C: Likewise.
>   * g++.dg/asan/use-after-scope-types.h: Likewise.
>   * gcc.dg/asan/use-after-scope-1.c: Likewise.
>   * gcc.dg/asan/use-after-scope-2.c: Likewise.
>   * gcc.dg/asan/use-after-scope-3.c: Likewise.
>   * gcc.dg/asan/use-after-scope-4.c: Likewise.
>   * gcc.dg/asan/use-after-scope-5.c: Likewise.
>   * gcc.dg/asan/use-after-scope-6.c: Likewise.
>   * gcc.dg/asan/use-after-scope-7.c: Likewise.
>   * gcc.dg/asan/use-after-scope-8.c: Likewise.
>   * gcc.dg/asan/use-after-scope-9.c: Likewise.
>   * gcc.dg/asan/use-after-scope-switch-1.c: Likewise.
>   * gcc.dg/asan/use-after-scope-switch-2.c: Likewise.
>   * gcc.dg/asan/use-after-scope-switch-3.c: Likewise.
>   * gcc.dg/asan/use-after-scope-goto-1.c: Likewise.
>   * gcc.dg/asan/use-after-scope-goto-2.c: Likewise.

Ok, thanks.

Jakub


Re: [PATCH][1/2] GIMPLE Frontend, C FE parts (and GIMPLE parser)

2016-11-07 Thread Richard Biener
On Fri, 4 Nov 2016, Jakub Jelinek wrote:

> Hi!
> 
> Just 2 nits:
> 
> On Fri, Oct 28, 2016 at 01:46:57PM +0200, Richard Biener wrote:
> > +/* Return a pointer to the Nth token in PARERs tokens_buf.  */
> 
> PARSERs ?

Fixed.

> > @@ -454,7 +423,7 @@ c_lex_one_token (c_parser *parser, c_token *token)
> >  /* Return a pointer to the next token from PARSER, reading it in if
> > necessary.  */
> >  
> > -static inline c_token *
> > +c_token *
> >  c_parser_peek_token (c_parser *parser)
> >  {
> >if (parser->tokens_avail == 0)
> 
> I wonder if turning all of these into non-inlines is a good idea.
> Can't you move them to the common header instead?

The issue with moving is that I failed to export the definition of
c_parser in c-parser.h due to gengtype putting vec 
handlers into gtype-c.h but not gtype-objc.h and thus objc bootstrap
fails :/

I believe (well, I hope) that code generation for the C parser
should be mostly unaffected (inlining is still done as determined
useful) and the performance of the GIMPLE parser shouldn't be
too important.

If anybody feels like digging into the gengtype issue, I gave up
after trying for half a day to trick it to do what I want
(like for example also putting it in gtype-objc.h).

> The rest I defer to Joseph or Marek.

Thanks,
Richard.


Re: [PATCH][1/2] GIMPLE Frontend, C FE parts (and GIMPLE parser)

2016-11-07 Thread Richard Biener
On Mon, 7 Nov 2016, Richard Biener wrote:

> On Fri, 4 Nov 2016, Jakub Jelinek wrote:
> 
> > Hi!
> > 
> > Just 2 nits:
> > 
> > On Fri, Oct 28, 2016 at 01:46:57PM +0200, Richard Biener wrote:
> > > +/* Return a pointer to the Nth token in PARERs tokens_buf.  */
> > 
> > PARSERs ?
> 
> Fixed.
> 
> > > @@ -454,7 +423,7 @@ c_lex_one_token (c_parser *parser, c_token *token)
> > >  /* Return a pointer to the next token from PARSER, reading it in if
> > > necessary.  */
> > >  
> > > -static inline c_token *
> > > +c_token *
> > >  c_parser_peek_token (c_parser *parser)
> > >  {
> > >if (parser->tokens_avail == 0)
> > 
> > I wonder if turning all of these into non-inlines is a good idea.
> > Can't you move them to the common header instead?
> 
> The issue with moving is that I failed to export the definition of
> c_parser in c-parser.h due to gengtype putting vec 
> handlers into gtype-c.h but not gtype-objc.h and thus objc bootstrap
> fails :/

If anybody wants to try, f82dc04b921a52a9a5c90d957a824e1c2d04
has it (objc build) still broken on the gimplefe git branch.

> I believe (well, I hope) that code generation for the C parser
> should be mostly unaffected (inlining is still done as determined
> useful) and the performance of the GIMPLE parser shouldn't be
> too important.
> 
> If anybody feels like digging into the gengtype issue, I gave up
> after trying for half a day to trick it to do what I want
> (like for example also putting it in gtype-objc.h).
> 
> > The rest I defer to Joseph or Marek.
> 
> Thanks,
> Richard.
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko

On 07/11/16 13:04, Jakub Jelinek wrote:
> On Mon, Nov 07, 2016 at 11:22:28AM +0300, Maxim Ostapenko wrote:
> > Hi,
> >
> > this patch set performs libsanitizer merge from upstream.
> >
> > Patch 1 is the library merge itself.
> >
> > Patch 2 is the reapplied change for SPARC by David S. Miller.
> >
> > Patch 3 changes heuristic for extracting last PC from stack frame for ARM in
> > fast unwind routine. More details can be found here
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61771).
> >
> > Patch 4 replaces Jakub's fix for
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63888 and removes
> > CheckODRViolationViaPoisoning call from RegisterGlobal to avoid false
> > positive odr violation reports.
> >
> > Patch 5 combines necessary compiler changes.
> >
> > Patch 6 adds several new tests, backported from upstream.
>
> The patches 1-6 are ok for trunk now, if you fix the missing space
> before ( in patch 5.

Ok, I'm going to land these shortly, thank you for review.

> > Patch 7 adds support for ASan odr indicators at compiler side.
>
> This one can be applied incrementally once the issues reported in there
> are resolved.

Yes, I'll fix the patch.

> And the libtsan ABI stuff (__intercept*stat*) can be resolved incrementally
> too.
>
> Thanks.
>
> Jakub


[RFC] Fix PR rtl-optimization/59461

2016-11-07 Thread Eric Botcazou
It's a missed optimization of a redundant zero-extension on the SPARC, which 
originally comes from PR rtl-optimization/58295 for ARM.  The extension is 
eliminated on the ARM because the load is explicitly zero-extended in RTL;
on the SPARC the load is implicitly zero-extended by means of LOAD_EXTEND_OP 
and the combiner is blocked by limitations of the nonzero_bits machinery.

The approach is two-pronged:
 1. it lifts a limitation in reg_nonzero_bits_for_combine that was recently 
added (https://gcc.gnu.org/ml/gcc-patches/2013-11/msg03782.html) and prevents 
the combiner from reasoning on larger modes under certain circumstances.
 2. it makes nonzero_bits1 propagate results from inner REGs to paradoxical 
SUBREGs if both WORD_REGISTER_OPERATIONS and LOAD_EXTEND_OP are set.

This also eliminates quite a few zero-extensions in the compile.exp testsuite 
at -O2 on the SPARC.  Tested on x86-64/Linux and SPARC/Solaris.
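
As a rough illustration of why better nonzero-bits information removes the extension, consider the sketch below. It is not GCC code: and_is_redundant and load_byte_zero_extended are hypothetical names, and the mask is a plain integer rather than the per-mode information the combiner tracks.

```c
#include <stdbool.h>
#include <stdint.h>

/* NONZERO is a conservative mask of the bits that may be set in a value.
   An AND with MASK is a no-op when no possibly-set bit lies outside MASK.  */
static bool
and_is_redundant (uint64_t nonzero, uint64_t mask)
{
  return (nonzero & ~mask) == 0;
}

/* Shape of the load in the testcase: the byte is zero-extended on load,
   so only the low 8 bits of the result can be nonzero.  */
static unsigned
load_byte_zero_extended (const unsigned char *p)
{
  return *p;
}
```

If the analysis can prove that only the low 8 bits may be set, a following AND with 0xff can be dropped, which is what the scan-assembler-not directive in the new test looks for.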


2016-11-07  Eric Botcazou  

PR rtl-optimization/59461
* doc/rtl.texi (paradoxical subregs): Add missing word.
* combine.c (reg_nonzero_bits_for_combine): Do not discard results
in modes with precision larger than that of last_set_mode.
* rtlanal.c (nonzero_bits1) : If WORD_REGISTER_OPERATIONS is
set and LOAD_EXTEND_OP is appropriate, propagate results from inner
REGs to paradoxical SUBREGs.
(num_sign_bit_copies1) : Likewise.  Check that the mode is not
larger than a word before invoking LOAD_EXTEND_OP on it.


2016-11-07  Eric Botcazou  

* gcc.target/sparc/pr59461.c: New test.

-- 
Eric Botcazou

/* PR rtl-optimization/59461 */

/* { dg-do compile } */
/* { dg-options "-O2" } */

extern char zeb_test_array[10];

unsigned char ee_isdigit2(unsigned int i)
{
  unsigned char c = zeb_test_array[i];
  unsigned char retval;

  retval = ((c>='0') & (c<='9')) ? 1 : 0;
  return retval;
}

/* { dg-final { scan-assembler-not "and\t%" } } */
Index: doc/rtl.texi
===
--- doc/rtl.texi	(revision 241856)
+++ doc/rtl.texi	(working copy)
@@ -1882,7 +1882,7 @@ When used as an rvalue, the low-order bi
 taken from @var{reg} while the high-order bits may or may not be
 defined.
 
-The high-order bits of rvalues are in the following circumstances:
+The high-order bits of rvalues are defined in the following circumstances:
 
 @itemize
 @item @code{subreg}s of @code{mem}
Index: combine.c
===
--- combine.c	(revision 241856)
+++ combine.c	(working copy)
@@ -9878,18 +9878,17 @@ reg_nonzero_bits_for_combine (const_rtx
 		  (DF_LR_IN (ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb),
 		   REGNO (x)
 {
-  unsigned HOST_WIDE_INT mask = rsp->last_set_nonzero_bits;
-
-  if (GET_MODE_PRECISION (rsp->last_set_mode) < GET_MODE_PRECISION (mode))
-	/* We don't know anything about the upper bits.  */
-	mask |= GET_MODE_MASK (mode) ^ GET_MODE_MASK (rsp->last_set_mode);
-
-  *nonzero &= mask;
+  /* Note that, even if the precision of last_set_mode is lower than that
+	 of mode, record_value_for_reg invoked nonzero_bits on the register
+	 with nonzero_bits_mode (because last_set_mode is necessarily integral
+	 and HWI_COMPUTABLE_MODE_P in this case) so bits in nonzero_bits_mode
+	 are all valid, hence in mode too since nonzero_bits_mode is defined
+	 to the largest HWI_COMPUTABLE_MODE_P mode.  */
+  *nonzero &= rsp->last_set_nonzero_bits;
   return NULL;
 }
 
   tem = get_last_value (x);
-
   if (tem)
 {
   if (SHORT_IMMEDIATES_SIGN_EXTEND)
@@ -9898,7 +9897,8 @@ reg_nonzero_bits_for_combine (const_rtx
 
   return tem;
 }
-  else if (nonzero_sign_valid && rsp->nonzero_bits)
+
+  if (nonzero_sign_valid && rsp->nonzero_bits)
 {
   unsigned HOST_WIDE_INT mask = rsp->nonzero_bits;
 
Index: rtlanal.c
===
--- rtlanal.c	(revision 241856)
+++ rtlanal.c	(working copy)
@@ -4242,7 +4242,7 @@ cached_nonzero_bits (const_rtx x, machin
 /* Given an expression, X, compute which bits in X can be nonzero.
We don't care about bits outside of those defined in MODE.
 
-   For most X this is simply GET_MODE_MASK (GET_MODE (MODE)), but if X is
+   For most X this is simply GET_MODE_MASK (GET_MODE (X)), but if X is
an arithmetic operation, we can do better.  */
 
 static unsigned HOST_WIDE_INT
@@ -4549,18 +4549,17 @@ nonzero_bits1 (const_rtx x, machine_mode
   /* If this is a SUBREG formed for a promoted variable that has
 	 been zero-extended, we know that at least the high-order bits
 	 are zero, though others might be too.  */
-
   if (SUBREG_PROMOTED_VAR_P (x) && SUBREG_PROMOTED_UNSIGNED_P (x))
 	nonzero = GET_MODE_MASK (GET_MODE (x))
 		  & cached_nonzero_bits (SUBREG_REG (x), GET_MODE (x),
 	 known_x, known_mode, known_ret);
 
-  inner_mode = GET_MODE (SUBREG_REG (x));
   /* If the inner mode is a single word f

Re: Ping^6 Re: [Patch AArch64] Add floatdihf2 and floatunsdihf2 patterns

2016-11-07 Thread James Greenhalgh
On Fri, Oct 21, 2016 at 05:31:14PM +0100, James Greenhalgh wrote:
> On Wed, Oct 12, 2016 at 04:56:52PM +0100, James Greenhalgh wrote:
> > On Wed, Sep 28, 2016 at 05:17:14PM +0100, James Greenhalgh wrote:
> > > On Wed, Sep 21, 2016 at 10:42:03AM +0100, James Greenhalgh wrote:
> > > > On Tue, Sep 13, 2016 at 10:31:28AM +0100, James Greenhalgh wrote:
> > > > > On Tue, Sep 06, 2016 at 10:19:50AM +0100, James Greenhalgh wrote:
> > > > > > > This patch adds patterns for conversion from 64-bit integer to
> > > > > > > 16-bit floating-point values under AArch64 targets which don't
> > > > > > > have support for the ARMv8.2-A 16-bit floating point extensions.
> > > > > > > 
> > > > > > > We implement these by first saturating to a SImode (we know that any
> > > > > > > values >= 65504 will round to infinity after conversion to HFmode),
> > > > > > > then converting to a DFmode (unsigned conversions could go to SFmode,
> > > > > > > but there is no performance benefit to this). Then converting to
> > > > > > > HFmode.
> > > > > > > 
> > > > > > > Having added these patterns, the expansion path in "expand_float"
> > > > > > > will now try to use them for conversions from SImode to HFmode as
> > > > > > > there is no floatsihf2 pattern. expand_float first tries widening
> > > > > > > the integer size and looking for a match, so it will try SImode ->
> > > > > > > DImode. But our DI mode pattern is going to then saturate us back to
> > > > > > > SImode which is wasteful.
> > > > > > > 
> > > > > > > Better, would be for us to provide float(uns)sihf2 patterns
> > > > > > > directly. So that's what this patch does.
> > > > > > > 
> > > > > > > The testcase added in this patch would fail on trunk for AArch64.
> > > > > > > There is no libgcc routine to make the conversion, and we don't
> > > > > > > provide appropriate patterns in the backend, so we get a link-time
> > > > > > > error.
> > > > > > 
> > > > > > Bootstrapped and tested on aarch64-none-linux-gnu
> > > > > > 
> > > > > > OK for trunk?
> > > > > 
> > > > > Ping.
> > > > 
> > > > Ping^2
> > > 
> > > Ping^3
> > 
> > Ping^4
> 
> Ping^5

Ping^6

Thanks,
James
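
The first two steps of the DImode path described in the quoted text (saturate to SImode, then convert to DFmode) can be sketched in C as below. The function names are hypothetical and this is not what the machine description emits; the final DFmode to HFmode truncation is left out because a 16-bit floating-point type is not available on every host.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a 64-bit value into the 32-bit signed range.
   Magnitudes this large overflow HFmode anyway, so no information that
   matters for the final result is lost.  */
static int32_t
saturate_to_si (int64_t x)
{
  if (x > INT32_MAX)
    return INT32_MAX;
  if (x < INT32_MIN)
    return INT32_MIN;
  return (int32_t) x;
}

/* SImode to DFmode is exact: double has a 53-bit significand.  */
static double
di_to_df_via_si (int64_t x)
{
  return (double) saturate_to_si (x);
}
```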

> > > > > > 2016-09-06  James Greenhalgh  
> > > > > > 
> > > > > > * config/aarch64/aarch64.md (sihf2): Convert to expand.
> > > > > > (dihf2): Likewise.
> > > > > > (aarch64_fp16_hf2): New.
> > > > > > 
> > > > > > 2016-09-06  James Greenhalgh  
> > > > > > 
> > > > > > * gcc.target/aarch64/floatdihf2_1.c: New.
> > > > > > 
> > > > > 
> > > > > > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> > > > > > index 6afaf90..1882a72 100644
> > > > > > --- a/gcc/config/aarch64/aarch64.md
> > > > > > +++ b/gcc/config/aarch64/aarch64.md
> > > > > > @@ -4630,7 +4630,14 @@
> > > > > >[(set_attr "type" "f_cvti2f")]
> > > > > >  )
> > > > > >  
> > > > > > -(define_insn "hf2"
> > > > > > > +;; If we do not have ARMv8.2-A 16-bit floating point extensions, the
> > > > > > > +;; midend will arrange for an SImode conversion to HFmode to first go
> > > > > > > +;; through DFmode, then to HFmode.  But first it will try converting
> > > > > > > +;; to DImode then down, which would match our DImode pattern below and
> > > > > > > +;; give very poor code-generation.  So, we must provide our own emulation
> > > > > > > +;; of the mid-end logic.
> > > > > > +
> > > > > > +(define_insn "aarch64_fp16_hf2"
> > > > > >[(set (match_operand:HF 0 "register_operand" "=w")
> > > > > > (FLOATUORS:HF (match_operand:GPI 1 "register_operand" "r")))]
> > > > > >"TARGET_FP_F16INST"
> > > > > > @@ -4638,6 +4645,53 @@
> > > > > >[(set_attr "type" "f_cvti2f")]
> > > > > >  )
> > > > > >  
> > > > > > +(define_expand "sihf2"
> > > > > > +  [(set (match_operand:HF 0 "register_operand")
> > > > > > +   (FLOATUORS:HF (match_operand:SI 1 "register_operand")))]
> > > > > > +  "TARGET_FLOAT"
> > > > > > +{
> > > > > > +  if (TARGET_FP_F16INST)
> > > > > > > +emit_insn (gen_aarch64_fp16_sihf2 (operands[0], operands[1]));
> > > > > > +  else
> > > > > > +{
> > > > > > +  rtx convert_target = gen_reg_rtx (DFmode);
> > > > > > +  emit_insn (gen_sidf2 (convert_target, operands[1]));
> > > > > > +  emit_insn (gen_truncdfhf2 (operands[0], convert_target));
> > > > > > +}
> > > > > > +  DONE;
> > > > > > +}
> > > > > > +)
> > > > > > +
> > > > > > > +;; For DImode there is no wide enough floating-point mode that we
> > > > > > > +;; can convert through natively (TFmode would work, but requires a library
> > > > > > > +;; call).  However, we know that any value >= 65504 will be rounded
> > > > > > > +;; to infinity on conversion.  This is well within the range of SImode, so
> > > > > > > +;; we can:
> > > > > > +;;   Saturate to SImode.
> > > > > > +;;   Convert from that to DFmode
> > > > > > +;;   Convert from that

Re: [PATCH][AArch64] Fix PR target/77822: Use tighter predicates for zero_extract patterns

2016-11-07 Thread Kyrill Tkachov

Ping.

Thanks,
Kyrill

On 31/10/16 12:10, Kyrill Tkachov wrote:

Ping.

Thanks,
Kyrill

On 24/10/16 14:12, Kyrill Tkachov wrote:


On 24/10/16 12:29, Kyrill Tkachov wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01321.html



I just noticed my original ChangeLog entry was truncated.
It is
2016-10-04  Kyrylo Tkachov  

PR target/77822
* config/aarch64/aarch64.md (*tb1): Use
aarch64_simd_shift_imm_ predicate for operand 1.
(, ANY_EXTRACT): Use tighter predicates on operands 2 and 3
to restrict them to an appropriate range and add FAIL check if the
region they specify is out of range.  Delete useless constraint
strings.
(*, ANY_EXTRACT): Add appropriate predicates on operands
2 and 3 to restrict their range and add pattern predicate.

2016-10-04  Kyrylo Tkachov  

PR target/77822
* g++.dg/torture/pr77822.C: New test.

Kyrill



On 17/10/16 17:15, Kyrill Tkachov wrote:

Hi all,

For the attached testcase the code ends up trying to extract bits outside the
range of the normal register widths. The aarch64 patterns for ubfz and tbnz
end up accepting such operands and emitting invalid assembly
such as 'ubfx x18,x2,192,32'

The solution is to add proper predicates and guards to the operands of the
zero_extract operations that are going on.  I had a look at all the other
patterns in aarch64 that generate/use zero_extract and they all have guards
on their operands in one form or another to avoid them accessing an area
that is out of range.

With this patch the testcase compiles and assembles fine.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-10-17  Kyrylo Tkachov  

PR target/77822
* config/aarch64/aarch64.md (*tb1): Use
aarch64_simd_shift_imm_ predicate for operand 1.
(, ANY_EXTRACT): Use tighter predicates on operands 2 and 3
to restrict them to an appropriate range and add FAIL check if the
region they specify is out of range.  Delete useless constraint
strings.
(*, ANY_EXTRACT): Add appropriate predicates on operands
2 and 3 to restrict their range and add pattern predicate.

2016-10-17  Kyrylo Tkachov  

PR target/77822


Re: [PATCH][AArch64] Fix PR target/77822: Use tighter predicates for zero_extract patterns

2016-11-07 Thread James Greenhalgh
On Mon, Oct 17, 2016 at 05:15:21PM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> For the attached testcase the code ends up trying to extract bits outside the
> range of the normal register widths. The aarch64 patterns for ubfz and tbnz
> end up accepting such operands and emitting invalid assembly
> such as 'ubfx x18,x2,192,32'
> 
> The solution is to add proper predicates and guards to the operands of the
> zero_extract operations that are going on.  I had a look at all the other
> patterns in aarch64 that generate/use zero_extract and they all have guards
> on their operands in one form or another to avoid them accessing an area
> that is out of range.
> 
> With this patch the testcase compiles and assembles fine.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Ok for trunk?

Ok, sorry for the delay on review.

Thanks,
James

> 2016-10-17  Kyrylo Tkachov  
> 
> PR target/77822
> * config/aarch64/aarch64.md (*tb1): Use
> aarch64_simd_shift_imm_ predicate for operand 1.
> (, ANY_EXTRACT): Use tighter predicates on operands 2 and 3
> to restrict them to an appropriate range and add FAIL check if the
> region they specify is out of range.  Delete useless constraint
> strings.
> (*, ANY_EXTRACT): Add appropriate predicates on operands
> 2 and 3 to restrict their range and add pattern predicate.
> 
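
The kind of range guard discussed in this thread can be illustrated with the following sketch. The function extract_in_range is a hypothetical stand-in, not the predicate or pattern condition used in aarch64.md.

```c
#include <stdbool.h>

/* Hypothetical check: a bitfield of WIDTH bits starting at bit POS must lie
   entirely inside a register of REGSIZE bits (32 or 64 on AArch64).  The
   subtraction is safe because POS < REGSIZE is tested first.  */
static bool
extract_in_range (unsigned regsize, unsigned pos, unsigned width)
{
  return width >= 1 && pos < regsize && width <= regsize - pos;
}
```

With this check the operands from the invalid 'ubfx x18,x2,192,32' (position 192, width 32, 64-bit register) are rejected, while an extraction of bits 32..63 is accepted.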



[PATCH] Fix PR78228

2016-11-07 Thread Richard Biener

The following fixes phiopt to not introduce undefined behavior
in its abs replacement code in case we negate only positive values
in the original code.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-11-07  Richard Biener  

PR tree-optimization/78228
* tree-ssa-phiopt.c (abs_replacement): Avoid introducing
undefined behavior.

* gcc.dg/tree-ssa/phi-opt-15.c: New testcase.

Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 241891)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -1453,6 +1453,14 @@ abs_replacement (basic_block cond_bb, ba
   else
 negate = false;
 
+  /* If the code negates only iff positive then make sure to not
+ introduce undefined behavior when negating or computing the absolute.
+ ???  We could use range info if present to check for arg1 == INT_MIN.  */
+  if (negate
+  && (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg1))
+ && ! TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1))))
+return false;
+
   result = duplicate_ssa_name (result, NULL);
 
   if (negate)
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c  (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int i)
+{
+  if (i > 0)
+i = -i;
+  return i;
+}
+
+/* { dg-final { scan-tree-dump-not "ABS" "optimized" } } */
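
To see why the replacement is unsafe for signed types, the sketch below restates the testcase in plain C. The predicate abs_rewrite_is_safe is a hypothetical illustration of the condition, not something the pass computes in this form.

```c
#include <limits.h>

/* Shape of the testcase: only strictly positive values are negated, so
   INT_MIN is returned unchanged and no signed overflow can occur.  */
static int
neg_if_positive (int i)
{
  if (i > 0)
    i = -i;
  return i;
}

/* Rewriting the function as -abs (i) would evaluate abs (INT_MIN), which
   overflows when signed overflow is undefined.  The rewrite is therefore
   only safe when the input is known not to be INT_MIN.  */
static int
abs_rewrite_is_safe (int i)
{
  return i != INT_MIN;
}
```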


[PATCH] Fix PR78218

2016-11-07 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-11-07  Richard Biener  

PR tree-optimization/78218
* gimple-ssa-store-merging.c
(pass_store_merging::terminate_all_aliasing_chains):
Drop unused argument, fix alias check to also consider uses.
(pass_store_merging::execute): Adjust.

* gcc.dg/torture/pr78218.c: New testcase.

Index: gcc/gimple-ssa-store-merging.c
===
--- gcc/gimple-ssa-store-merging.c  (revision 241893)
+++ gcc/gimple-ssa-store-merging.c  (working copy)
@@ -726,7 +726,7 @@ private:
   hash_map m_stores;
 
   bool terminate_and_process_all_chains ();
-  bool terminate_all_aliasing_chains (tree, imm_store_chain_info **,
+  bool terminate_all_aliasing_chains (imm_store_chain_info **,
  bool, gimple *);
   bool terminate_and_release_chain (imm_store_chain_info *);
 }; // class pass_store_merging
@@ -755,8 +755,7 @@ pass_store_merging::terminate_and_proces
If that is the case we have to terminate any chain anchored at BASE.  */
 
 bool
-pass_store_merging::terminate_all_aliasing_chains (tree dest,
-  imm_store_chain_info
+pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info
 **chain_info,
   bool var_offset_p,
   gimple *stmt)
@@ -788,7 +787,10 @@ pass_store_merging::terminate_all_aliasi
  unsigned int i;
  FOR_EACH_VEC_ELT ((*chain_info)->m_store_info, i, info)
{
- if (stmt_may_clobber_ref_p (info->stmt, dest))
+ if (ref_maybe_used_by_stmt_p (stmt,
+   gimple_assign_lhs (info->stmt))
+ || stmt_may_clobber_ref_p (stmt,
+gimple_assign_lhs (info->stmt)))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -1458,7 +1460,7 @@ pass_store_merging::execute (function *f
}
 
  /* Store aliases any existing chain?  */
- terminate_all_aliasing_chains (lhs, chain_info, false, stmt);
+ terminate_all_aliasing_chains (chain_info, false, stmt);
  /* Start a new chain.  */
  struct imm_store_chain_info *new_chain
= new imm_store_chain_info (base_addr);
@@ -1477,13 +1479,13 @@ pass_store_merging::execute (function *f
}
}
  else
-   terminate_all_aliasing_chains (lhs, chain_info,
+   terminate_all_aliasing_chains (chain_info,
   offset != NULL_TREE, stmt);
 
  continue;
}
 
- terminate_all_aliasing_chains (NULL_TREE, NULL, false, stmt);
+ terminate_all_aliasing_chains (NULL, false, stmt);
}
   terminate_and_process_all_chains ();
 }
Index: gcc/testsuite/gcc.dg/torture/pr78218.c
===
--- gcc/testsuite/gcc.dg/torture/pr78218.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr78218.c  (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+
+struct 
+{
+  int v;
+} a[2];
+
+int b; 
+
+void __attribute__((noinline,noclone))
+check ()
+{
+  if (a[0].v != 1)
+__builtin_abort ();
+}
+
+int main ()
+{
+  a[1].v = 1;
+  a[0] = a[1];
+  a[1].v = 0;
+  check (a);
+  return 0;
+}


[PATCH] Fix PR78229

2016-11-07 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
and branch.

Richard.

2016-11-07  Richard Biener  

PR target/78229
* config/i386/i386.c (ix86_gimple_fold_builtin): Do not adjust
EH info.

* g++.dg/pr78229.C: New testcase.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 241891)
+++ gcc/config/i386/i386.c  (working copy)
@@ -37664,7 +37664,7 @@ ix86_gimple_fold_builtin (gimple_stmt_it
  gsi_insert_before (gsi, g, GSI_SAME_STMT);
  g = gimple_build_assign (gimple_call_lhs (stmt), NOP_EXPR, lhs);
  gimple_set_location (g, loc);
- gsi_replace (gsi, g, true);
+ gsi_replace (gsi, g, false);
  return true;
}
   break;
Index: gcc/testsuite/g++.dg/pr78229.C
===
--- gcc/testsuite/g++.dg/pr78229.C  (revision 0)
+++ gcc/testsuite/g++.dg/pr78229.C  (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile { target x86_64-*-* i?86-*-* } } */
+/* { dg-options "-O2 -mbmi -w" } */
+
+void a();
+inline int b(int c) {
+int d = c;
+return __builtin_ia32_tzcnt_u32(d);
+}
+struct e {};
+int f, g, h;
+void fn3() {
+float j;
+&j;
+  {
+   e k;
+   while (h) {
+   if (g == 0)
+ continue;
+   int i = b(g);
+   f = i;
+   }
+   a();
+  }
+}


Re: [PATCH] Fix PR78228

2016-11-07 Thread Jakub Jelinek
On Mon, Nov 07, 2016 at 01:17:25PM +0100, Richard Biener wrote:
> 
> The following fixes phiopt to not introduce undefined behavior
> in its abs replacement code in case we negate only positive values
> in the original code.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> 
> Richard.
> 
> 2016-11-07  Richard Biener  
> 
>   PR tree-optimization/78228
>   * tree-ssa-phiopt.c (abs_replacement): Avoid introducing
>   undefined behavior.
> 
>   * gcc.dg/tree-ssa/phi-opt-15.c: New testcase.
> 
> Index: gcc/tree-ssa-phiopt.c
> ===
> --- gcc/tree-ssa-phiopt.c (revision 241891)
> +++ gcc/tree-ssa-phiopt.c (working copy)
> @@ -1453,6 +1453,14 @@ abs_replacement (basic_block cond_bb, ba
>else
>  negate = false;
>  
> +  /* If the code negates only iff positive then make sure to not
> + introduce undefined behavior when negating or computing the absolute.
> + ???  We could use range info if present to check for arg1 == INT_MIN.  */

Perhaps just

> +  if (negate
> +  && (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg1))
> +   && ! TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1))))
{
  wide_int minv = TYPE_MIN_VALUE (TYPE_DOMAIN (TREE_TYPE (arg1)));
  if (!expr_not_equal_to (arg1, minv))
return false;
}
?

Jakub


Re: [PATCH] Fix PR78228

2016-11-07 Thread Richard Biener
On Mon, 7 Nov 2016, Jakub Jelinek wrote:

> On Mon, Nov 07, 2016 at 01:17:25PM +0100, Richard Biener wrote:
> > 
> > The following fixes phiopt to not introduce undefined behavior
> > in its abs replacement code in case we negate only positive values
> > in the original code.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > 
> > Richard.
> > 
> > 2016-11-07  Richard Biener  
> > 
> > PR tree-optimization/78228
> > * tree-ssa-phiopt.c (abs_replacement): Avoid introducing
> > undefined behavior.
> > 
> > * gcc.dg/tree-ssa/phi-opt-15.c: New testcase.
> > 
> > Index: gcc/tree-ssa-phiopt.c
> > ===
> > --- gcc/tree-ssa-phiopt.c   (revision 241891)
> > +++ gcc/tree-ssa-phiopt.c   (working copy)
> > @@ -1453,6 +1453,14 @@ abs_replacement (basic_block cond_bb, ba
> >else
> >  negate = false;
> >  
> > +  /* If the code negates only iff positive then make sure to not
> > + introduce undefined behavior when negating or computing the absolute.
> > + ???  We could use range info if present to check for arg1 == INT_MIN.  */
> 
> Perhaps just
> 
> > +  if (negate
> > +  && (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg1))
> > + && ! TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1))))
> {
>   wide_int minv = TYPE_MIN_VALUE (TYPE_DOMAIN (TREE_TYPE (arg1)));
>   if (!expr_not_equal_to (arg1, minv))
>   return false;
> }
> ?

rather wi::min_value (TREE_TYPE (arg1), SIGNED) I guess.  Didn't know
of expr_not_equal_to, seems to be only used from i386.c at the moment.

We can improve things on trunk but I'd prefer to be safe on the 
branch(es).

Richard.
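
For readers following the thread, here is a minimal sketch of the source pattern under discussion, as I read the description. The function names are made up for illustration and none of this is code from the patch. The original form only ever negates positive values, so it is well defined for INT_MIN, whereas a rewrite to -abs (x) would evaluate abs (INT_MIN), which overflows.

```c
#include <limits.h>

/* Negates only positive values.  INT_MIN is returned untouched, so no
   signed overflow can happen on any input.  */
int neg_if_positive(int x)
{
    return x > 0 ? -x : x;
}

/* Model of the guard a -abs (x) rewrite would need: the rewrite is only
   defined when x is known not to be INT_MIN.  */
int abs_rewrite_is_defined(int x)
{
    return x != INT_MIN;
}
```

With wrapping overflow (for example under -fwrapv) the concern goes away, which is why the check in the patch is conditional on ! TYPE_OVERFLOW_WRAPS.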


[PATCH] Fix PR78205 -- fix BB SLP "gap" handling

2016-11-07 Thread Richard Biener

The following moves an overly conservative check that we do not access
excess elements when vectorizing a BB to a place where we can do
a better job with respect to the elements we actually use.

This means that for the included testcase we are not confused
by the read from c[4] but just do not vectorize the stores to x[0]
and x[1].

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2016-11-07  Richard Biener  

PR tree-optimization/78205
* tree-vect-stmts.c (vectorizable_load): Move check whether
we may run into gaps when BB vectorizing SLP permutations ...
* tree-vect-slp.c (vect_supported_load_permutation_p): ...
here where we can do a more precise check.

* gcc.dg/vect/bb-slp-pr78205.c: New testcase.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 241893)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -6548,18 +6611,6 @@ vectorizable_load (gimple *stmt, gimple_
   if (slp && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ())
slp_perm = true;
 
-  /* ???  The following is overly pessimistic (as well as the loop
- case above) in the case we can statically determine the excess
-elements loaded are within the bounds of a decl that is accessed.
-Likewise for BB vectorizations using masked loads is a possibility.  */
-  if (bb_vinfo && slp_perm && group_size % nunits != 0)
-   {
- dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-  "BB vectorization with gaps at the end of a load "
-  "is not supported\n");
- return false;
-   }
-
   /* Invalidate assumptions made by dependence analysis when vectorization
 on the unrolled body effectively re-orders stmts.  */
   if (!PURE_SLP_STMT (stmt_info)
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 241893)
+++ gcc/tree-vect-slp.c (working copy)
@@ -1459,6 +1459,25 @@ vect_supported_load_permutation_p (slp_i
SLP_TREE_LOAD_PERMUTATION (node).release ();
  else
{
+ stmt_vec_info group_info
+   = vinfo_for_stmt (SLP_TREE_SCALAR_STMTS (node)[0]);
+ group_info = vinfo_for_stmt (GROUP_FIRST_ELEMENT (group_info));
+ unsigned nunits
+   = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (group_info));
+ unsigned k, maxk = 0;
+ FOR_EACH_VEC_ELT (SLP_TREE_LOAD_PERMUTATION (node), j, k)
+   if (k > maxk)
+ maxk = k;
+ /* In BB vectorization we may not actually use a loaded vector
+accessing elements in excess of GROUP_SIZE.  */
+ if (maxk >= (GROUP_SIZE (group_info) & ~(nunits - 1)))
+   {
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+  "BB vectorization with gaps at the end of "
+  "a load is not supported\n");
+ return false;
+   }
+
  /* Verify the permutation can be generated.  */
  vec tem;
  unsigned n_perms;
Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c
===
--- gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c  (revision 0)
+++ gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c  (working copy)
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+double x[2], a[4], b[4], c[5];
+
+void foo ()
+{
+  a[0] = c[0];
+  a[1] = c[1];
+  a[2] = c[0];
+  a[3] = c[1];
+  b[0] = c[2];
+  b[1] = c[3];
+  b[2] = c[2];
+  b[3] = c[3];
+  x[0] = c[4];
+  x[1] = c[4];
+}
+
+/* We may not vectorize the store to x[] as it accesses c out-of bounds
+   but we do want to vectorize the other two store groups.  */
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp2" } } */
+/* { dg-final { scan-tree-dump-times "x\\\[\[0-1\]\\\] = " 2 "optimized" } } */
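
The new check in vect_supported_load_permutation_p can be modelled outside the compiler roughly as follows. This is only a sketch: load_perm_has_gap and its parameters are hypothetical names, and it assumes nunits is a power of two, as the mask in the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the largest element index used by the permutation
   falls into the trailing partial vector of the group, i.e. when loading
   it as a full vector would read past the end of the group.  */
bool load_perm_has_gap(const unsigned *perm, size_t n,
                       unsigned group_size, unsigned nunits)
{
    /* The mask below only rounds down correctly for powers of two.  */
    assert(nunits != 0 && (nunits & (nunits - 1)) == 0);

    unsigned maxk = 0;
    for (size_t i = 0; i < n; i++)
        if (perm[i] > maxk)
            maxk = perm[i];

    /* group_size rounded down to a multiple of nunits.  */
    return maxk >= (group_size & ~(nunits - 1));
}
```

Assuming the five loads from c[] in the testcase form one group of size 5 and a vector holds two doubles, the permutations {0,1,0,1} and {2,3,2,3} pass, while {4,4} is rejected because element 4 sits in the partial vector.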


[patch,avr] Add new option -mabsdata.

2016-11-07 Thread Georg-Johann Lay
This patch adds a new command line option -mabsdata which can be used to set 
attribute absdata for all data in static storage so it can be accessed by LDS 
and STS instructions.


This is only useful for some reduced Tiny devices like ATtiny40.

For other reduced Tiny where all of SRAM fits LDS / STS, the new option is 
automatically set by the device specs file.


For ordinary devices the option is accepted but has no effect.

Ok for trunk?

Johann


gcc/
PR target/78093
* doc/invoke.texi (AVR Options) [-mabsdata]: Document new option.
* config/avr/avr.opt (-mabsdata): New option.
* config/avr/avr.c (avr_encode_section_info) [AVR_TINY]: If
-mabsdata & symbol is not progmem, tag as AVR_SYMBOL_FLAG_TINY_ABSDATA.
* config/avr/avr-mcus.def (attiny4/5/9/10/20): Use AVR_ISA_LDS.
* config/avr/gen-avr-mmcu-specs.c (print_mcu): Print cc1_absdata
spec depending on AVR_ISA_LDS.
* config/avr/specs.h (CC1_SPEC): Enhanced by cc1_absdata spec.

gcc/testsuite/
PR target/78093
* gcc.target/avr/torture/tiny-absdata-2.c: New test.
Index: config/avr/avr-arch.h
===
--- config/avr/avr-arch.h	(revision 241841)
+++ config/avr/avr-arch.h	(working copy)
@@ -157,7 +157,9 @@ enum avr_device_specific_features
   AVR_ISA_NONE,
   AVR_ISA_RMW = 0x1, /* device has RMW instructions. */
   AVR_SHORT_SP= 0x2, /* Stack Pointer has 8 bits width. */
-  AVR_ERRATA_SKIP = 0x4  /* device has a core erratum. */
+  AVR_ERRATA_SKIP = 0x4, /* device has a core erratum. */
+  AVR_ISA_LDS = 0x8  /* whether LDS / STS is valid for all data in static
+storage.  Only useful for reduced Tiny.  */
 };
 
 /* Map architecture to its texinfo string.  */
Index: config/avr/avr-mcus.def
===
--- config/avr/avr-mcus.def	(revision 241841)
+++ config/avr/avr-mcus.def	(working copy)
@@ -341,11 +341,11 @@ AVR_MCU ("atxmega128a1u",ARCH_AVRXME
 AVR_MCU ("atxmega128a4u",ARCH_AVRXMEGA7, AVR_ISA_RMW,  "__AVR_ATxmega128A4U__",0x2000, 0x0, 3)
 /* Tiny family */
 AVR_MCU ("avrtiny",  ARCH_AVRTINY, AVR_ISA_NONE, NULL, 0x0040, 0x0, 1)
-AVR_MCU ("attiny4",  ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny4__",0x0040, 0x0, 1)
-AVR_MCU ("attiny5",  ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny5__",0x0040, 0x0, 1)
-AVR_MCU ("attiny9",  ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny9__",0x0040, 0x0, 1) 
-AVR_MCU ("attiny10", ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny10__",   0x0040, 0x0, 1)
-AVR_MCU ("attiny20", ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny20__",   0x0040, 0x0, 1)
+AVR_MCU ("attiny4",  ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny4__",0x0040, 0x0, 1)
+AVR_MCU ("attiny5",  ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny5__",0x0040, 0x0, 1)
+AVR_MCU ("attiny9",  ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny9__",0x0040, 0x0, 1) 
+AVR_MCU ("attiny10", ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny10__",   0x0040, 0x0, 1)
+AVR_MCU ("attiny20", ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny20__",   0x0040, 0x0, 1)
 AVR_MCU ("attiny40", ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny40__",   0x0040, 0x0, 1)
 /* Assembler only.  */
 AVR_MCU ("avr1", ARCH_AVR1, AVR_ISA_NONE, NULL,0x0060, 0x0, 1)
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 241841)
+++ config/avr/avr.c	(working copy)
@@ -10182,14 +10182,18 @@ avr_encode_section_info (tree decl, rtx
   && SYMBOL_REF_P (XEXP (rtl, 0)))
 {
   rtx sym = XEXP (rtl, 0);
+  bool progmem_p = -1 == avr_progmem_p (decl, DECL_ATTRIBUTES (decl));
 
-  if (-1 == avr_progmem_p (decl, DECL_ATTRIBUTES (decl)))
+  if (progmem_p)
 {
   // Tag symbols for later addition of 0x4000 (AVR_TINY_PM_OFFSET).
   SYMBOL_REF_FLAGS (sym) |= AVR_SYMBOL_FLAG_TINY_PM;
 }
 
   if (avr_decl_absdata_p (decl, DECL_ATTRIBUTES (decl))
+  || (TARGET_ABSDATA
+  && !progmem_p
+  && !addr_attr)
   || (addr_attr
   // If addr_attr is non-null, it has an argument.  Peek into it.
   && TREE_INT_CST_LOW (TREE_VALUE (TREE_VALUE (addr_attr))) < 0xc0))
@@ -10198,7 +10202,7 @@ avr_encode_section_info (tree decl, rtx
   SYMBOL_REF_FLAGS (sym) |= AVR_SYMBOL_FLAG_TINY_ABSDATA;
 }
 
-  if (-1 == avr_progmem_p (decl, DECL_ATTRIBUTES (decl))
+  if (progmem_p
   && avr_decl_absdata_p (decl, DECL_ATTRIBUTES (decl)))
 {
   error ("%q+D has incompatible attributes %qs and %qs",
Index: config/avr/avr.opt
===
--- config/avr/avr.opt	(revision 241841)
+

Re: [PATCH] combine lhs zero_extract fix (PR78186)

2016-11-07 Thread Christophe Lyon
On 7 November 2016 at 10:14, Segher Boessenkool
 wrote:
> Hi Christophe,
>
> On Fri, Nov 04, 2016 at 02:31:28PM +0100, Christophe Lyon wrote:
>> Since this commit I have noticed execution failures on "old" arm targets:
>>
>>   gcc.dg/torture/pr48124-4.c   -O1  execution test
>>   gcc.dg/torture/pr48124-4.c   -O2  execution test
>>   gcc.dg/torture/pr48124-4.c   -O2 -flto -fno-use-linker-plugin
>> -flto-partition=none  execution test
>>   gcc.dg/torture/pr48124-4.c   -O2 -flto -fuse-linker-plugin
>> -fno-fat-lto-objects  execution test
>>   gcc.dg/torture/pr48124-4.c   -O3 -g  execution test
>>   gcc.dg/torture/pr48124-4.c   -Os  execution test
>>
>> For instance on target arm-none-linux-gnueabi --with-cpu=cortex-a9
>> --with-mode=arm
>> and running the tests with -march=armv5t
>
> Confirmed.  What a nasty, nasty bug, and it has been here for decades
> it seems.  Could you please open a PR?
>
>
Sure, I've created PR78232 for this.

Thanks.

Christophe

> Segher


[PATCH] Fix -O0 AVX512 comparison ICE (PR target/78227)

2016-11-07 Thread Jakub Jelinek
Hi!

The following testcases ICE at -O0, because ix86_expand_sse_cmp avoids using
the passed in dest only if optimize or if there is some value overlap, but
we actually need to do that also if we have a maskcmp where we want to use
a different mode than dest has.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-07  Jakub Jelinek  

PR target/78227
* config/i386/i386.c (ix86_expand_sse_cmp): Force dest into
cmp_mode argument even for -O0 if cmp_mode != mode and maskcmp.

* gcc.target/i386/pr78227-1.c: New test.
* gcc.target/i386/pr78227-2.c: New test.

--- gcc/config/i386/i386.c.jj   2016-11-04 20:09:48.0 +0100
+++ gcc/config/i386/i386.c  2016-11-07 10:14:15.625018144 +0100
@@ -23561,6 +23561,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_
 cmp_op1 = force_reg (cmp_ops_mode, cmp_op1);
 
   if (optimize
+  || (cmp_mode != mode && maskcmp)
   || (op_true && reg_overlap_mentioned_p (dest, op_true))
   || (op_false && reg_overlap_mentioned_p (dest, op_false)))
 dest = gen_reg_rtx (maskcmp ? cmp_mode : mode);
--- gcc/testsuite/gcc.target/i386/pr78227-1.c.jj   2016-11-07 10:15:52.606762613 +0100
+++ gcc/testsuite/gcc.target/i386/pr78227-1.c   2016-11-07 10:24:58.821480125 +0100
@@ -0,0 +1,30 @@
+/* PR target/78227 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512f -O0 -Wno-psabi" } */
+
+typedef int V __attribute__((vector_size (64)));
+typedef long long int W __attribute__((vector_size (64)));
+
+V
+foo1 (V v)
+{
+  return v > 0;
+}
+
+V
+bar1 (V v)
+{
+  return v != 0;
+}
+
+W
+foo2 (W w)
+{
+  return w > 0;
+}
+
+W
+bar2 (W w)
+{
+  return w != 0;
+}
--- gcc/testsuite/gcc.target/i386/pr78227-2.c.jj   2016-11-07 10:22:17.055670476 +0100
+++ gcc/testsuite/gcc.target/i386/pr78227-2.c   2016-11-07 10:25:03.722413765 +0100
@@ -0,0 +1,30 @@
+/* PR target/78227 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O0 -Wno-psabi" } */
+
+typedef signed char V __attribute__((vector_size (64)));
+typedef short int W __attribute__((vector_size (64)));
+
+V
+foo1 (V v)
+{
+  return v > 0;
+}
+
+V
+bar1 (V v)
+{
+  return v != 0;
+}
+
+W
+foo2 (W w)
+{
+  return w > 0;
+}
+
+W
+bar2 (W w)
+{
+  return w != 0;
+}

Jakub
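
As background on what the testcases compute, here is a small sketch of a vector comparison using the GNU vector extension. It uses a 16-byte vector rather than the 64-byte ones in the tests so that it does not need AVX512; the type and function names are mine, not from the patch.

```c
/* Four 32-bit lanes; the tests in the patch use vector_size (64).  */
typedef int v4si __attribute__((vector_size(16)));

/* Lane-wise comparison: each result lane is -1 (all bits set) where the
   comparison holds and 0 where it does not.  */
v4si greater_than_zero(v4si v)
{
    v4si zero = {0, 0, 0, 0};
    return v > zero;
}
```

The result has the same mode as the operands, whereas an AVX512 mask compare produces a mask register value first. That mismatch between cmp_mode and mode is the case the patch handles at -O0.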


Re: [patch,avr] Add new option -mabsdata.

2016-11-07 Thread Georg-Johann Lay

On 07.11.2016 13:54, Georg-Johann Lay wrote:

This patch adds a new command line option -mabsdata which can be used to set
attribute absdata for all data in static storage so it can be accessed by LDS
and STS instructions.

This is only useful for some reduced Tiny devices like ATtiny40.

For other reduced Tiny where all of SRAM fits LDS / STS, the new option is
automatically set by the device specs file.

For ordinary devices the option is accepted but has no effect.

Ok for trunk?

Johann


gcc/
PR target/78093
* doc/invoke.texi (AVR Options) [-mabsdata]: Document new option.
* config/avr/avr.opt (-mabsdata): New option.
* config/avr/avr.c (avr_encode_section_info) [AVR_TINY]: If
-mabsdata & symbol is not progmem, tag as AVR_SYMBOL_FLAG_TINY_ABSDATA.
* config/avr/avr-mcus.def (attiny4/5/9/10/20): Use AVR_ISA_LDS.
* config/avr/gen-avr-mmcu-specs.c (print_mcu): Print cc1_absdata
spec depending on AVR_ISA_LDS.
* config/avr/specs.h (CC1_SPEC): Enhanced by cc1_absdata spec.

gcc/testsuite/
PR target/78093
* gcc.target/avr/torture/tiny-absdata-2.c: New test.


Here is the complete log entry (avr-arch.h was missing):

gcc/
PR target/78093
* doc/invoke.texi (AVR Options) [-mabsdata]: Document new option.
* config/avr/avr.opt (-mabsdata): New option.
* config/avr/avr-arch.h (avr_device_specific_features): Add AVR_ISA_LDS.
* config/avr/avr.c (avr_encode_section_info) [AVR_TINY]: If
-mabsdata & symbol is not progmem, tag as AVR_SYMBOL_FLAG_TINY_ABSDATA.
* config/avr/avr-mcus.def (attiny4/5/9/10/20): Use AVR_ISA_LDS.
* config/avr/gen-avr-mmcu-specs.c (print_mcu): Print cc1_absdata
spec depending on AVR_ISA_LDS.
* config/avr/specs.h (CC1_SPEC): Enhanced by cc1_absdata spec.
gcc/testsuite/
PR target/78093
* gcc.target/avr/torture/tiny-absdata-2.c: New test.
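
The tagging rule that the avr_encode_section_info hunk of the original posting implements can be written as a standalone predicate. This is a sketch only: struct decl_info and tag_tiny_absdata are hypothetical stand-ins for the tree and attribute queries in avr.c.

```c
#include <stdbool.h>

/* Hypothetical summary of what avr.c queries from the decl.  */
struct decl_info {
    bool has_absdata_attr;   /* attribute absdata given explicitly */
    bool is_progmem;         /* object lives in flash */
    bool has_addr_attr;      /* attribute address / io with an argument */
    unsigned long addr;      /* that argument, if present */
};

/* Should the symbol get AVR_SYMBOL_FLAG_TINY_ABSDATA?  MABSDATA models
   TARGET_ABSDATA, i.e. whether -mabsdata is in effect.  */
bool tag_tiny_absdata(const struct decl_info *d, bool mabsdata)
{
    return d->has_absdata_attr
           || (mabsdata && !d->is_progmem && !d->has_addr_attr)
           || (d->has_addr_attr && d->addr < 0xc0);
}
```

So with -mabsdata a plain variable in static storage is tagged, a progmem object is not, and objects with an address attribute keep following the existing 0xc0 rule.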



[committed] Move 3 gcc.target/i386/*.C tests

2016-11-07 Thread Jakub Jelinek
Hi!

Richard noticed 3 misplaced tests - C++ tests don't belong in
gcc.target/ which tests just C.

I've bootstrapped/regtested this on x86_64-linux and i686-linux and
committed to trunk as obvious.

2016-11-07  Jakub Jelinek  

PR middle-end/71529
* gcc.target/i386/pr71529.C: Moved to ...
* g++.dg/opt/pr71529.C: ... here.  New test.  Guard for i?86/x86_64.

PR target/64411
* gcc.target/i386/pr64411.C: Moved to ...
* g++.dg/opt/pr64411.C: ... here.  New test.  Guard for i?86/x86_64
lp64.

PR target/65105
* gcc.target/i386/pr65105-4.C: Moved to ...
* g++.dg/opt/pr65105-4.C: ... here.  New test.  Guard for i?86/x86_64.
Run into compile test rather than execute test.

--- gcc/testsuite/gcc.target/i386/pr71529.C.jj  2016-06-15 19:09:09.0 
+0200
+++ gcc/testsuite/gcc.target/i386/pr71529.C 2016-11-07 10:56:21.835713206 
+0100
@@ -1,22 +0,0 @@
-/* PR71529 */
-/* { dg-do compile { target { ! x32 } } } */
-/* { dg-options "-fcheck-pointer-bounds -mmpx -O2" } */
-
-class c1
-{
- public:
-  virtual ~c1 ();
-};
-
-class c2
-{
- public:
-  virtual ~c2 ();
-};
-
-class c3 : c1, c2 { };
-
-int main (int, char **)
-{
-  c3 obj;
-}
--- gcc/testsuite/gcc.target/i386/pr64411.C.jj  2016-03-15 17:10:18.0 
+0100
+++ gcc/testsuite/gcc.target/i386/pr64411.C 2016-11-07 10:54:34.485101960 
+0100
@@ -1,27 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-Os -mcmodel=medium -fPIC -fschedule-insns -fselective-scheduling" } */
-
-typedef __SIZE_TYPE__ size_t;
-
-extern "C"  long strtol ()
-  { return 0; }
-
-static struct {
-  void *sp[2];
-} info;
-
-union S813
-{
-  void * c[5];
-}
-s813;
-
-S813 a813[5];
-S813 check813 (S813, S813 *, S813);
-
-void checkx813 ()
-{
-  __builtin_memset (&s813, '\0', sizeof (s813));
-  __builtin_memset (&info, '\0', sizeof (info));
-  check813 (s813, &a813[1], a813[2]);
-}
--- gcc/testsuite/gcc.target/i386/pr65105-4.C.jj2015-10-11 
19:11:14.214767354 +0200
+++ gcc/testsuite/gcc.target/i386/pr65105-4.C   2016-11-07 10:51:05.333808029 
+0100
@@ -1,19 +0,0 @@
-/* PR target/pr65105 */
-/* { dg-do run { target { ia32 } } } */
-/* { dg-options "-O2 -march=slm" } */
-
-struct s {
-  long long l1, l2, l3, l4, l5;
-} *a;
-long long b;
-long long fn1()
-{
-  try
-{
-  b = (a->l1 | a->l2 | a->l3 | a->l4 | a->l5);
-  return a->l1;
-}
-  catch (int)
-{
-}
-}
--- gcc/testsuite/g++.dg/opt/pr71529.C.jj   2016-11-07 10:55:34.151330081 
+0100
+++ gcc/testsuite/g++.dg/opt/pr71529.C  2016-11-07 10:56:13.319823373 +0100
@@ -0,0 +1,22 @@
+// PR middle-end/71529
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && { ! x32 } } } }
+// { dg-options "-fcheck-pointer-bounds -mmpx -O2" }
+
+class c1
+{
+ public:
+  virtual ~c1 ();
+};
+
+class c2
+{
+ public:
+  virtual ~c2 ();
+};
+
+class c3 : c1, c2 { };
+
+int main (int, char **)
+{
+  c3 obj;
+}
--- gcc/testsuite/g++.dg/opt/pr64411.C.jj   2016-11-07 10:51:38.557378145 
+0100
+++ gcc/testsuite/g++.dg/opt/pr64411.C  2016-11-07 10:54:13.115378412 +0100
@@ -0,0 +1,28 @@
+// PR target/64411
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && lp64 } } }
+// { dg-options "-Os -mcmodel=medium -fPIC -fschedule-insns -fselective-scheduling" }
+
+typedef __SIZE_TYPE__ size_t;
+
+extern "C"  long strtol ()
+  { return 0; }
+
+static struct {
+  void *sp[2];
+} info;
+
+union S813
+{
+  void * c[5];
+}
+s813;
+
+S813 a813[5];
+S813 check813 (S813, S813 *, S813);
+
+void checkx813 ()
+{
+  __builtin_memset (&s813, '\0', sizeof (s813));
+  __builtin_memset (&info, '\0', sizeof (info));
+  check813 (s813, &a813[1], a813[2]);
+}
--- gcc/testsuite/g++.dg/opt/pr65105-4.C.jj 2016-11-07 10:48:58.587448018 
+0100
+++ gcc/testsuite/g++.dg/opt/pr65105-4.C2016-11-07 10:50:52.066979690 
+0100
@@ -0,0 +1,19 @@
+// PR target/65105
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && ia32 } } }
+// { dg-options "-O2 -march=slm" }
+
+struct s {
+  long long l1, l2, l3, l4, l5;
+} *a;
+long long b;
+long long fn1()
+{
+  try
+{
+  b = (a->l1 | a->l2 | a->l3 | a->l4 | a->l5);
+  return a->l1;
+}
+  catch (int)
+{
+}
+}

Jakub


Re: [PATCH] combine lhs zero_extract fix (PR78186)

2016-11-07 Thread Segher Boessenkool
On Mon, Nov 07, 2016 at 02:00:46PM +0100, Christophe Lyon wrote:
> > Confirmed.  What a nasty, nasty bug, and it has been here for decades
> > it seems.  Could you please open a PR?
> >
> Sure, I've created PR78232 for this.

Thanks!  I have a patch btw, it's regstrapping.  Not sure it is fully
correct (whether it handles all possible cases), but hey.


Segher


[PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834, take 3)

2016-11-07 Thread Jakub Jelinek
On Fri, Nov 04, 2016 at 08:07:37PM +0100, Richard Biener wrote:
> >If/once this is in, I'm planning to test/submit a patch adding
> >  /* If one decl is known to be a function or label in a function and
> > the other is some kind of data, they can't overlap.  */
> >  if ((TREE_CODE (exprx) == FUNCTION_DECL
> >   || TREE_CODE (exprx) == LABEL_DECL)
> >  != (TREE_CODE (expry) == FUNCTION_DECL
> >   || TREE_CODE (expry) == LABEL_DECL))
> >return 1;
> >before that.
> >
> >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> OK for trunk and branches (if appropriate)

And here is the incremental patch to disambiguate between code section
objects and variables.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk (only)?

2016-11-07  Jakub Jelinek  

PR target/77834
* alias.c (nonoverlapping_memrefs_p): If one decl is
FUNCTION_DECL or LABEL_DECL and the other is not, return 1.

--- gcc/alias.c.jj  2016-11-04 20:13:32.0 +0100
+++ gcc/alias.c 2016-11-07 11:18:57.982160034 +0100
@@ -2755,6 +2755,14 @@ nonoverlapping_memrefs_p (const_rtx x, c
   || TREE_CODE (expry) == CONST_DECL)
 return 1;
 
+  /* If one decl is known to be a function or label in a function and
+ the other is some kind of data, they can't overlap.  */
+  if ((TREE_CODE (exprx) == FUNCTION_DECL
+   || TREE_CODE (exprx) == LABEL_DECL)
+  != (TREE_CODE (expry) == FUNCTION_DECL
+ || TREE_CODE (expry) == LABEL_DECL))
+return 1;
+
   /* If either of the decls doesn't have DECL_RTL set (e.g. marked as
  living in multiple places), we can't tell anything.  Exception
  are FUNCTION_DECLs for which we can create DECL_RTL on demand.  */
@@ -2804,7 +2812,7 @@ nonoverlapping_memrefs_p (const_rtx x, c
 
   /* Offset based disambiguation not appropriate for loop invariant */
   if (loop_invariant)
-return 0;  
+return 0;
 
   /* Offset based disambiguation is OK even if we do not know that the
  declarations are necessarily different


Jakub
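
The condition in the patch compares two booleans with !=, which acts as an exclusive or: it fires only when exactly one of the two decls is code. A self-contained sketch of the same test, with a made-up enum standing in for the tree codes:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for FUNCTION_DECL, LABEL_DECL and data decls.  */
enum decl_kind { KIND_FUNCTION, KIND_LABEL, KIND_VAR, KIND_OTHER };

static bool is_code(enum decl_kind k)
{
    return k == KIND_FUNCTION || k == KIND_LABEL;
}

/* True when exactly one side is a function or label, in which case the
   two references cannot overlap.  */
bool code_vs_data_disjoint(enum decl_kind x, enum decl_kind y)
{
    return is_code(x) != is_code(y);
}
```

Two functions, or a function and a label, are deliberately not disambiguated by this test; those cases fall through to the later checks.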


Re: [PATCH] rs6000: Do swdiv at expand time

2016-11-07 Thread David Edelsohn
On Mon, Nov 7, 2016 at 4:32 AM, Segher Boessenkool
 wrote:
> We transform floating point divide instructions to a faster series of
> simple instructions, "swdiv".  Currently we do not do that until the
> first splitter pass, which is much too late for most optimisations
> that can happen on those new instructions, e.g. the constant loads
> are not CSEd inside an unrolled loop.  This patch changes things so
> those divide instructions are expanded during expand already.
>
> Bootstrapped and tested on powerpc64-linux; Bill has run SPEC on it,
> and if anything it shows a slight improvement.
>
> Is this okay for trunk?

Okay.

But commenting on the ChangeLog entry is half the fun!

- David
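
For context, a software divide of this kind is, as far as I understand it, a reciprocal estimate refined by Newton-Raphson steps and then multiplied by the numerator. The sketch below shows the idea in plain C; it is not the rs6000 sequence, and the single-precision division merely stands in for a hardware estimate instruction. It assumes b is nonzero and within float range.

```c
/* Approximate a / b without a double-precision divide.  */
double swdiv_sketch(double a, double b)
{
    /* Stand-in for the hardware reciprocal estimate: roughly 24 good bits.  */
    double x = (double)(1.0f / (float)b);

    /* Each Newton-Raphson step x' = x * (2 - b * x) roughly doubles the
       number of correct bits in the reciprocal.  */
    x = x * (2.0 - b * x);
    x = x * (2.0 - b * x);

    return a * x;
}
```

The constants such a sequence needs (like the 2.0 here) are what benefit from being visible to CSE early, which is the motivation given in the quoted description.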


Re: [PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834, take 3)

2016-11-07 Thread Richard Biener
On Mon, 7 Nov 2016, Jakub Jelinek wrote:

> On Fri, Nov 04, 2016 at 08:07:37PM +0100, Richard Biener wrote:
> > >If/once this is in, I'm planning to test/submit a patch adding
> > >  /* If one decl is known to be a function or label in a function and
> > > the other is some kind of data, they can't overlap.  */
> > >  if ((TREE_CODE (exprx) == FUNCTION_DECL
> > >   || TREE_CODE (exprx) == LABEL_DECL)
> > >  != (TREE_CODE (expry) == FUNCTION_DECL
> > > || TREE_CODE (expry) == LABEL_DECL))
> > >return 1;
> > >before that.
> > >
> > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > OK for trunk and branches (if appropriate)
> 
> And here is the incremental patch to disambiguate between code section
> objects and variables.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk (only)?

Ok.

Richard.

> 2016-11-07  Jakub Jelinek  
> 
>   PR target/77834
>   * alias.c (nonoverlapping_memrefs_p): If one decl is
>   FUNCTION_DECL or LABEL_DECL and the other is not, return 1.
> 
> --- gcc/alias.c.jj2016-11-04 20:13:32.0 +0100
> +++ gcc/alias.c   2016-11-07 11:18:57.982160034 +0100
> @@ -2755,6 +2755,14 @@ nonoverlapping_memrefs_p (const_rtx x, c
>|| TREE_CODE (expry) == CONST_DECL)
>  return 1;
>  
> +  /* If one decl is known to be a function or label in a function and
> + the other is some kind of data, they can't overlap.  */
> +  if ((TREE_CODE (exprx) == FUNCTION_DECL
> +   || TREE_CODE (exprx) == LABEL_DECL)
> +  != (TREE_CODE (expry) == FUNCTION_DECL
> +   || TREE_CODE (expry) == LABEL_DECL))
> +return 1;
> +
>/* If either of the decls doesn't have DECL_RTL set (e.g. marked as
>   living in multiple places), we can't tell anything.  Exception
>   are FUNCTION_DECLs for which we can create DECL_RTL on demand.  */
> @@ -2804,7 +2812,7 @@ nonoverlapping_memrefs_p (const_rtx x, c
>  
>/* Offset based disambiguation not appropriate for loop invariant */
>if (loop_invariant)
> -return 0;  
> +return 0;
>  
>/* Offset based disambiguation is OK even if we do not know that the
>   declarations are necessarily different
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[PATCH][AArch64] Optimized implementation of search_line_fast for the CPP lexer

2016-11-07 Thread Richard Earnshaw (lists)
This patch contains an implementation of search_line_fast for the CPP
lexer.  It's based in part on the AArch32 (ARM) code but incorporates
new instructions available in AArch64 (reduction add operations) plus
some tricks for reducing the realignment overheads.  We assume a page
size of 4k, but that's a safe assumption -- AArch64 systems can never
have a smaller page size than that: on systems with larger pages we will
go through the realignment code more often than strictly necessary, but
it's still likely to be in the noise (less than 0.5% of the time).
Bootstrapped on aarch64-none-linux-gnu.
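
The page-boundary test and the "less than 0.5%" figure can be checked with a few lines of portable C. This is a sketch with names of my own choosing; it works on plain integers and does not touch memory.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_PAGE_SIZE 4096u

/* True when fewer than 16 bytes remain between ADDR and the end of its
   4k page, so an unaligned 16-byte load starting at ADDR could cross
   into the next page.  */
bool near_page_end(uintptr_t addr)
{
    return MIN_PAGE_SIZE - (addr & (MIN_PAGE_SIZE - 1)) < 16;
}

/* Number of start offsets within a page that take the slow path.  */
unsigned slow_path_offsets(void)
{
    unsigned n = 0;
    for (uintptr_t off = 0; off < MIN_PAGE_SIZE; off++)
        n += near_page_end(off);
    return n;
}
```

Only the last 15 offsets of a page qualify, and 15 / 4096 is about 0.37%, assuming uniformly distributed start positions.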


Although this is AArch64 specific and therefore I don't think it
requires approval from anyone else, I'll wait 24 hours for comments.

* lex.c (search_line_fast): New implementation for AArch64.

R.
diff --git a/libcpp/lex.c b/libcpp/lex.c
index 6f65fa1..cea8848 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -752,6 +752,101 @@ search_line_fast (const uchar *s, const uchar *end 
ATTRIBUTE_UNUSED)
   }
 }
 
+#elif defined (__ARM_NEON) && defined (__ARM_64BIT_STATE)
+#include "arm_neon.h"
+
+/* This doesn't have to be the exact page size, but no system may use
+   a size smaller than this.  ARMv8 requires a minimum page size of
+   4k.  The impact of being conservative here is a small number of
+   cases will take the slightly slower entry path into the main
+   loop.  */
+
+#define AARCH64_MIN_PAGE_SIZE 4096
+
+static const uchar *
+search_line_fast (const uchar *s, const uchar *end ATTRIBUTE_UNUSED)
+{
+  const uint8x16_t repl_nl = vdupq_n_u8 ('\n');
+  const uint8x16_t repl_cr = vdupq_n_u8 ('\r');
+  const uint8x16_t repl_bs = vdupq_n_u8 ('\\');
+  const uint8x16_t repl_qm = vdupq_n_u8 ('?');
+  const uint8x16_t xmask = (uint8x16_t) vdupq_n_u64 (0x8040201008040201ULL);
+
+#ifdef __AARCH64EB
+  const int16x8_t shift = {8, 8, 8, 8, 0, 0, 0, 0};
+#else
+  const int16x8_t shift = {0, 0, 0, 0, 8, 8, 8, 8};
+#endif
+
+  unsigned int found;
+  const uint8_t *p;
+  uint8x16_t data;
+  uint8x16_t t;
+  uint16x8_t m;
+  uint8x16_t u, v, w;
+
+  /* Align the source pointer.  */
+  p = (const uint8_t *)((uintptr_t)s & -16);
+
+  /* Assuming random string start positions, with a 4k page size we'll take
+ the slow path about 0.37% of the time.  */
+  if (__builtin_expect ((AARCH64_MIN_PAGE_SIZE
+- (((uintptr_t) s) & (AARCH64_MIN_PAGE_SIZE - 1)))
+   < 16, 0))
+{
+  /* Slow path: the string starts near a possible page boundary.  */
+  uint32_t misalign, mask;
+
+  misalign = (uintptr_t)s & 15;
+  mask = (-1u << misalign) & 0xffff;
+  data = vld1q_u8 (p);
+  t = vceqq_u8 (data, repl_nl);
+  u = vceqq_u8 (data, repl_cr);
+  v = vorrq_u8 (t, vceqq_u8 (data, repl_bs));
+  w = vorrq_u8 (u, vceqq_u8 (data, repl_qm));
+  t = vorrq_u8 (v, w);
+  t = vandq_u8 (t, xmask);
+  m = vpaddlq_u8 (t);
+  m = vshlq_u16 (m, shift);
+  found = vaddvq_u16 (m);
+  found &= mask;
+  if (found)
+   return (const uchar*)p + __builtin_ctz (found);
+}
+  else
+{
+  data = vld1q_u8 ((const uint8_t *) s);
+  t = vceqq_u8 (data, repl_nl);
+  u = vceqq_u8 (data, repl_cr);
+  v = vorrq_u8 (t, vceqq_u8 (data, repl_bs));
+  w = vorrq_u8 (u, vceqq_u8 (data, repl_qm));
+  t = vorrq_u8 (v, w);
+  if (__builtin_expect (vpaddd_u64 ((uint64x2_t)t), 0))
+   goto done;
+}
+
+  do
+{
+  p += 16;
+  data = vld1q_u8 (p);
+  t = vceqq_u8 (data, repl_nl);
+  u = vceqq_u8 (data, repl_cr);
+  v = vorrq_u8 (t, vceqq_u8 (data, repl_bs));
+  w = vorrq_u8 (u, vceqq_u8 (data, repl_qm));
+  t = vorrq_u8 (v, w);
+} while (!vpaddd_u64 ((uint64x2_t)t));
+
+done:
+  /* Now that we've found the terminating substring, work out precisely where
+ we need to stop.  */
+  t = vandq_u8 (t, xmask);
+  m = vpaddlq_u8 (t);
+  m = vshlq_u16 (m, shift);
+  found = vaddvq_u16 (m);
+  return (((((uintptr_t) p) < (uintptr_t) s) ? s : (const uchar *)p)
+ + __builtin_ctz (found));
+}
+
 #elif defined (__ARM_NEON)
 #include "arm_neon.h"
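
A scalar model may help when reading the NEON sequence above. As I read it, the and with xmask, the pairwise add, the shift and the final reduction collapse the 16 compare results into a 16-bit value whose bit i is set when byte i matched. The sketch below computes that value directly; the function names are mine and __builtin_ctz is a GCC/Clang builtin.

```c
#include <assert.h>

static int is_special(unsigned char c)
{
    return c == '\n' || c == '\r' || c == '\\' || c == '?';
}

/* Bit i of the result is set when block[i] is one of the four
   characters the lexer stops at.  */
unsigned match_mask16(const unsigned char block[16])
{
    unsigned found = 0;
    for (int i = 0; i < 16; i++)
        if (is_special(block[i]))
            found |= 1u << i;
    return found;
}

/* Slow-path model: ignore matches in the MISALIGN bytes that precede the
   real start of the string.  Returns the index of the first remaining
   match, or -1 if there is none in this block.  */
int first_match_from(const unsigned char block[16], unsigned misalign)
{
    assert(misalign < 16);
    unsigned mask = (-1u << misalign) & 0xffff;
    unsigned found = match_mask16(block) & mask;
    return found ? __builtin_ctz(found) : -1;
}
```

For a block holding "ab?def\nhijklmno", the mask has bits 2 and 6 set; starting at offset 3 the first hit is therefore index 6.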
 


Re: Add missing symbols for versioned namespace

2016-11-07 Thread Jonathan Wakely

On 03/11/16 21:54 +0100, François Dumont wrote:

Hi

   I might not be the right one to propose this patch as I am not 
sure that I fully understand gnu-versioned-namespace.ver organization. 
But with it following test failures when using versioned namespace 
vanish:


FAIL: 20_util/allocator/overaligned.cc (test for excess errors)
FAIL: ext/bitmap_allocator/overaligned.cc (test for excess errors)
FAIL: ext/mt_allocator/overaligned.cc (test for excess errors)
FAIL: ext/new_allocator/overaligned.cc (test for excess errors)
FAIL: ext/pool_allocator/overaligned.cc (test for excess errors)

   Ok to commit ?


This looks correct. OK for trunk, thanks.



[AArch64][GCC][PATCHv2 1/3] Add missing Poly64_t intrinsics to GCC

2016-11-07 Thread Tamar Christina
Hi all,

This patch (1 of 3) adds the following NEON intrinsics
to the AArch64 back-end of GCC:

* vsli_n_p64
* vsliq_n_p64

* vld1_p64
* vld1q_p64
* vld1_dup_p64
* vld1q_dup_p64

* vst1_p64
* vst1q_p64
  
* vld2_p64
* vld3_p64
* vld4_p64
* vld2q_p64
* vld3q_p64
* vld4q_p64

* vld2_dup_p64
* vld3_dup_p64
* vld4_dup_p64

* __aarch64_vdup_lane_p64
* __aarch64_vdup_laneq_p64
* __aarch64_vdupq_lane_p64
* __aarch64_vdupq_laneq_p64

* vget_lane_p64
* vgetq_lane_p64

* vreinterpret_p8_p64
* vreinterpretq_p8_p64
* vreinterpret_p16_p64
* vreinterpretq_p16_p64

* vreinterpret_p64_f16
* vreinterpret_p64_f64
* vreinterpret_p64_s8
* vreinterpret_p64_s16
* vreinterpret_p64_s32
* vreinterpret_p64_s64
* vreinterpret_p64_f32
* vreinterpret_p64_u8
* vreinterpret_p64_u16
* vreinterpret_p64_u32
* vreinterpret_p64_u64
* vreinterpret_p64_p8

* vreinterpretq_p64_f64
* vreinterpretq_p64_s8
* vreinterpretq_p64_s16
* vreinterpretq_p64_s32
* vreinterpretq_p64_s64
* vreinterpretq_p64_f16
* vreinterpretq_p64_f32
* vreinterpretq_p64_u8
* vreinterpretq_p64_u16
* vreinterpretq_p64_u32
* vreinterpretq_p64_u64
* vreinterpretq_p64_p8

* vreinterpret_f16_p64
* vreinterpretq_f16_p64
* vreinterpret_f32_p64
* vreinterpretq_f32_p64
* vreinterpret_f64_p64
* vreinterpretq_f64_p64
* vreinterpret_s64_p64
* vreinterpretq_s64_p64
* vreinterpret_u64_p64
* vreinterpretq_u64_p64
* vreinterpret_s8_p64
* vreinterpretq_s8_p64
* vreinterpret_s16_p64
* vreinterpret_s32_p64
* vreinterpretq_s32_p64
* vreinterpret_u8_p64
* vreinterpret_u16_p64
* vreinterpretq_u16_p64
* vreinterpret_u32_p64
* vreinterpretq_u32_p64

* vset_lane_p64
* vsetq_lane_p64

* vget_low_p64
* vget_high_p64

* vcombine_p64
* vcreate_p64

* vst2_lane_p64
* vst3_lane_p64
* vst4_lane_p64
* vst2q_lane_p64
* vst3q_lane_p64
* vst4q_lane_p64

* vget_lane_p64
* vget_laneq_p64
* vset_lane_p64
* vset_laneq_p64

* vcopy_lane_p64
* vcopy_laneq_p64  

* vdup_n_p64
* vdupq_n_p64
* vdup_lane_p64
* vdup_laneq_p64

* vld1_p64
* vld1q_p64
* vld1_dup_p64
* vld1q_dup_p64
* vld1q_dup_p64
* vmov_n_p64
* vmovq_n_p64
* vst3q_p64
* vst4q_p64

* vld1_lane_p64
* vld1q_lane_p64
* vst1_lane_p64
* vst1q_lane_p64
* vcopy_laneq_p64
* vcopyq_laneq_p64
* vdupq_laneq_p64

Added new tests for these and ran regression tests on aarch64-none-linux-gnu
and on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Tamar

gcc/
2016-11-04  Tamar Christina  

* config/aarch64/aarch64-builtins.c (TYPES_SETREGP): Added poly type.
(TYPES_GETREGP): Likewise.
(TYPES_SHIFTINSERTP): Likewise.
(TYPES_COMBINEP): Likewise.
(TYPES_STORE1P): Likewise.
* config/aarch64/aarch64-simd-builtins.def
(combine): Added poly generator.
(get_dregoi): Likewise.
(get_dregci): Likewise.
(get_dregxi): Likewise.
(ssli_n): Likewise.
(ld1): Likewise.
(st1): Likewise.
* config/aarch64/arm_neon.h
(poly64x1x2_t, poly64x1x3_t): New.
(poly64x1x4_t, poly64x2x2_t): Likewise.
(poly64x2x3_t, poly64x2x4_t): Likewise.
(poly64x1_t): Likewise.
(vcreate_p64, vcombine_p64): Likewise.
(vdup_n_p64, vdupq_n_p64): Likewise.
(vld2_p64, vld2q_p64): Likewise.
(vld3_p64, vld3q_p64): Likewise.
(vld4_p64, vld4q_p64): Likewise.
(vld2_dup_p64, vld3_dup_p64): Likewise.
(vld4_dup_p64, vsli_n_p64): Likewise.
(vsliq_n_p64, vst1_p64): Likewise.
(vst1q_p64, vst2_p64): Likewise.
(vst3_p64, vst4_p64): Likewise.
(__aarch64_vdup_lane_p64, __aarch64_vdup_laneq_p64): Likewise.
(__aarch64_vdupq_lane_p64, __aarch64_vdupq_laneq_p64): Likewise.
(vget_lane_p64, vgetq_lane_p64): Likewise.
(vreinterpret_p8_p64, vreinterpretq_p8_p64): Likewise.
(vreinterpret_p16_p64, vreinterpretq_p16_p64): Likewise.
(vreinterpret_p64_f16, vreinterpret_p64_f64): Likewise.
(vreinterpret_p64_s8, vreinterpret_p64_s16): Likewise.
(vreinterpret_p64_s32, vreinterpret_p64_s64): Likewise.
(vreinterpret_p64_f32, vreinterpret_p64_u8): Likewise.
(vreinterpret_p64_u16, vreinterpret_p64_u32): Likewise.
(vreinterpret_p64_u64, vreinterpret_p64_p8): Likewise.
(vreinterpretq_p64_f64, vreinterpretq_p64_s8): Likewise.
(vreinterpretq_p64_s16, vreinterpretq_p64_s32): Likewise.
(vreinterpretq_p64_s64, vreinterpretq_p64_f16): Likewise.
(vreinterpretq_p64_f32, vreinterpretq_p64_u8): Likewise.
(vreinterpretq_p64_u16, vreinterpretq_p64_u32): Likewise.
(vreinterpretq_p64_u64, vreinterpretq_p64_p8): Likewise.
(vreinterpret_f16_p64, vreinterpretq_f16_p64): Likewise.
(vreinterpret_f32_p64, vreinterpretq_f32_p64): Likewise.
(vreinterpret_f64_p64, vreinterpretq_f64_p64): Likewise.
(vreinterpret_s64_p64, vreinterpretq_s64_p64): Likewise.
(vreinterpret_u64_p64, vreinterpretq_u64_p64): Likewise.
(vreinterpret_s8_p64, vreinterpretq_s8_p64): Li

[AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

2016-11-07 Thread Tamar Christina
Hi all,

This patch (3 of 3) adds and updates tests for the NEON intrinsics
added by the previous patches.

Ran regression tests on aarch64-none-linux-gnu
and on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Tamar


gcc/testsuite/
2016-11-04  Tamar Christina  

* gcc.target/aarch64/advsimd-intrinsics/p64.c: New.
* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
(Poly64x1_t, Poly64x2_t): Added type.
(AARCH64_ONLY): Added macro.
* gcc.target/aarch64/advsimd-intrinsics/vcombine.c:
Added test for Poly64.
* gcc.target/aarch64/advsimd-intrinsics/vcreate.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vdup-vmov.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vdup_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vget_high.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vget_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vget_low.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vldX.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vldX_dup.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vldX_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vstX_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld1.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c:
Added AArch64 flags.
* gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c:
Added Aarch64 flags.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
index 462141586b3db7c5256c74b08fa0449210634226..174c1948221025b860aaac503354b406fa804007 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
@@ -32,6 +32,13 @@ extern size_t strlen(const char *);
VECT_VAR(expected, int, 16, 4) -> expected_int16x4
VECT_VAR_DECL(expected, int, 16, 4) -> int16x4_t expected_int16x4
 */
+/* Some instructions don't exist on ARM.
+   Use this macro to guard against them.  */
+#ifdef __aarch64__
+#define AARCH64_ONLY(X) X
+#else
+#define AARCH64_ONLY(X)
+#endif
 
 #define xSTR(X) #X
 #define STR(X) xSTR(X)
@@ -92,6 +99,13 @@ extern size_t strlen(const char *);
 fprintf(stderr, "CHECKED %s %s\n", STR(VECT_TYPE(T, W, N)), MSG);	\
   }
 
+#if defined (__ARM_FEATURE_CRYPTO)
+#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT) \
+	   CHECK(MSG,T,W,N,FMT,EXPECTED,COMMENT)
+#else
+#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT)
+#endif
+
 /* Floating-point variant.  */
 #define CHECK_FP(MSG,T,W,N,FMT,EXPECTED,COMMENT)			\
   {	\
@@ -184,6 +198,9 @@ extern ARRAY(expected, uint, 32, 2);
 extern ARRAY(expected, uint, 64, 1);
 extern ARRAY(expected, poly, 8, 8);
 extern ARRAY(expected, poly, 16, 4);
+#if defined (__ARM_FEATURE_CRYPTO)
+extern ARRAY(expected, poly, 64, 1);
+#endif
 extern ARRAY(expected, hfloat, 16, 4);
 extern ARRAY(expected, hfloat, 32, 2);
 extern ARRAY(expected, hfloat, 64, 1);
@@ -197,11 +214,14 @@ extern ARRAY(expected, uint, 32, 4);
 extern ARRAY(expected, uint, 64, 2);
 extern ARRAY(expected, poly, 8, 16);
 extern ARRAY(expected, poly, 16, 8);
+#if defined (__ARM_FEATURE_CRYPTO)
+extern ARRAY(expected, poly, 64, 2);
+#endif
 extern ARRAY(expected, hfloat, 16, 8);
 extern ARRAY(expected, hfloat, 32, 4);
 extern ARRAY(expected, hfloat, 64, 2);
 
-#define CHECK_RESULTS_NAMED_NO_FP16(test_name,EXPECTED,comment)		\
+#define CHECK_RESULTS_NAMED_NO_FP16_NO_POLY64(test_name,EXPECTED,comment)		\
   {	\
 CHECK(test_name, int, 8, 8, PRIx8, EXPECTED, comment);		\
 CHECK(test_name, int, 16, 4, PRIx16, EXPECTED, comment);		\
@@ -228,6 +248,13 @@ extern ARRAY(expected, hfloat, 64, 2);
 CHECK_FP(test_name, float, 32, 4, PRIx32, EXPECTED, comment);	\
   }	\
 
+#define CHECK_RESULTS_NAMED_NO_FP16(test_name,EXPECTED,comment)		\
+  {	\
+CHECK_RESULTS_NAMED_NO_FP16_NO_POLY64(test_name, EXPECTED, comment);		\
+CHECK_CRYPTO(test_name, poly, 64, 1, PRIx64, EXPECTED, comment);	\
+CHECK_CRYPTO(test_name, poly, 64, 2, PRIx64, EXPECTED, comment);	\
+  }	\
+
 /* Check results against EXPECTED.  Operates on all possible vector types.  */
 #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE)
 #define CHECK_RESULTS_NAMED(test_name,EXPECTED,comment)			\
@@ -398,6 +425,9 @@ static void clean_results (void)
   CLEAN(result, uint, 64, 1);
   CLEAN(result, poly, 8, 8);
   CLEAN(result, poly, 16, 4);
+#if defined (__ARM_FEATURE_CRYPTO)
+  CLEAN(result, poly, 64, 1);
+#endif
 #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE)
   CLEAN(result, float, 16, 4);
 #endif
@@ -413,6 +443,9 @@ static void clean_results (void)
   CLEAN(result, uint, 64, 2);
   CLEAN(resul

[ARM][GCC][PATCHv2 2/3] Add missing Poly64_t intrinsics to GCC

2016-11-07 Thread Tamar Christina
Hi all,

This patch (2 of 3) adds the following NEON intrinsics to
the ARM back-end of GCC:

* vget_lane_p64

Added new tests for these and ran regression tests on aarch64-none-linux-gnu
and on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Tamar

gcc/
2016-11-04  Tamar Christina  

* config/arm/arm_neon.h (vget_lane_p64): New.
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 3898ff7302dc3f21e6b50a8a7b835033c1ae2021..ab29da74e0971cc09ee63b561ecc79e9762e3fb4 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -5411,6 +5411,15 @@ vget_lane_s64 (int64x1_t __a, const int __b)
   return (int64_t)__builtin_neon_vget_lanedi (__a, __b);
 }
 
+#pragma GCC push_options
+#pragma GCC target ("fpu=crypto-neon-fp-armv8")
+__extension__ static __inline poly64_t __attribute__ ((__always_inline__))
+vget_lane_p64 (poly64x1_t __a, const int __b)
+{
+  return (poly64_t)__builtin_neon_vget_lanedi ((int64x1_t) __a, __b);
+}
+
+#pragma GCC pop_options
 __extension__ static __inline uint64_t __attribute__ ((__always_inline__))
 vget_lane_u64 (uint64x1_t __a, const int __b)
 {


[PATCH] Fix PR78224

2016-11-07 Thread Richard Biener

The following fixes an ICE with call cdce where it fails to handle
PHIs in the fallthru destination of a call with EH.  My fix is
simply to split the fallthru edge if the dest may contain PHI nodes.

This may also remove the need to free dominance info (hope there's
a testcase for that -- I'll leave the branches alone).
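
To make the idea concrete, here is a small stand-alone toy model in C.
It is not GCC's CFG API: the "block" struct and the toy_* helpers are
hypothetical stand-ins made up for illustration.  It shows why splitting
the fallthru edge helps: the newly inserted block has exactly one
predecessor, so it cannot need PHI nodes.

```c
#include <stddef.h>

/* Toy basic block: tracks only the number of predecessors and the
   single fallthru successor.  Hypothetical, not GCC's basic_block.  */
struct block
{
  int n_preds;
  struct block *fallthru;
};

/* Insert NEW_BB on the edge SRC -> SRC->fallthru and return it.
   The old destination keeps its predecessor count (the edge from SRC
   is replaced by the edge from NEW_BB), while NEW_BB has exactly one
   predecessor.  */
static struct block *
toy_split_edge (struct block *src, struct block *new_bb)
{
  new_bb->n_preds = 1;
  new_bb->fallthru = src->fallthru;
  src->fallthru = new_bb;
  return new_bb;
}

/* Mirror of the decision in the patch: split only when the fallthru
   destination has more than one predecessor (and so may have PHIs).  */
static struct block *
toy_pick_join_block (struct block *call_bb, struct block *spare)
{
  if (call_bb->fallthru->n_preds > 1)
    return toy_split_edge (call_bb, spare);
  return call_bb->fallthru;
}
```

In the real patch the same test is written with EDGE_COUNT on the
destination's preds, and split_edge creates the new block itself.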

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2016-11-07  Richard Biener  

PR tree-optimization/78224
* tree-call-cdce.c (shrink_wrap_one_built_in_call_with_conds):
Split the fallthru edge in case its successor may have PHIs.
Do not free dominance info.

* g++.dg/torture/pr78224.C: New testcase.

Index: gcc/tree-call-cdce.c
===
--- gcc/tree-call-cdce.c  (revision 241893)
+++ gcc/tree-call-cdce.c  (working copy)
@@ -807,15 +807,20 @@ shrink_wrap_one_built_in_call_with_conds
 can_guard_call_p.  */
   join_tgt_in_edge_from_call = find_fallthru_edge (bi_call_bb->succs);
   gcc_assert (join_tgt_in_edge_from_call);
-  free_dominance_info (CDI_DOMINATORS);
+  /* We don't want to handle PHIs.  */
+  if (EDGE_COUNT (join_tgt_in_edge_from_call->dest->preds) > 1)
+   join_tgt_bb = split_edge (join_tgt_in_edge_from_call);
+  else
+   join_tgt_bb = join_tgt_in_edge_from_call->dest;
 }
   else
-join_tgt_in_edge_from_call = split_block (bi_call_bb, bi_call);
+{
+  join_tgt_in_edge_from_call = split_block (bi_call_bb, bi_call);
+  join_tgt_bb = join_tgt_in_edge_from_call->dest;
+}
 
   bi_call_bsi = gsi_for_stmt (bi_call);
 
-  join_tgt_bb = join_tgt_in_edge_from_call->dest;
-
   /* Now it is time to insert the first conditional expression
  into bi_call_bb and split this bb so that bi_call is
  shrink-wrapped.  */
Index: gcc/testsuite/g++.dg/torture/pr78224.C
===
--- gcc/testsuite/g++.dg/torture/pr78224.C  (revision 0)
+++ gcc/testsuite/g++.dg/torture/pr78224.C  (working copy)
@@ -0,0 +1,51 @@
+// { dg-do compile }
+
+extern "C"{
+  float sqrtf(float);
+}
+
+inline float squareroot(const float f)
+{
+  return sqrtf(f);
+}
+
+inline int squareroot(const int f)
+{
+  return static_cast<int>(sqrtf(static_cast<float>(f)));
+}
+
+template <class T>
+class vector2d
+{
+public:
+  vector2d(T nx, T ny) : X(nx), Y(ny) {}
+  T getLength() const { return squareroot( X*X + Y*Y ); }
+  T X;
+  T Y;
+};
+
+vector2d getMousePos();
+
+class Client
+{
+public:
+  Client();
+  ~Client();
+};
+
+void the_game(float turn_amount)
+{
+  Client client;
+  bool first = true;
+
+  while (1) {
+  if (first) {
+first = false;
+  } else {
+int dx = getMousePos().X;
+int dy = getMousePos().Y;
+
+turn_amount = vector2d(dx, dy).getLength();
+  }
+  }
+}


Re: [PATCH, GCC, wwwdocs] Document new Cortex-M23 and Cortex-M33 processors support in ARM backend

2016-11-07 Thread Thomas Preudhomme

What about ARM maintainers?

Best regards,

Thomas

On 04/11/16 22:16, Gerald Pfeifer wrote:

On Fri, 4 Nov 2016, Thomas Preudhomme wrote:

This patch documents the newly added support in GCC 7 for Cortex-M23 and
Cortex-M33 processors [1][2].

:

Is this ok for ?


Surely so for me.

Gerald


Re: [PATCH, GCC, wwwdocs] Document new Cortex-M23 and Cortex-M33 processors support in ARM backend

2016-11-07 Thread Kyrill Tkachov


On 07/11/16 14:00, Thomas Preudhomme wrote:

What about ARM maintainers?



Fine with me too.
Thanks,
Kyrill


Best regards,

Thomas

On 04/11/16 22:16, Gerald Pfeifer wrote:

On Fri, 4 Nov 2016, Thomas Preudhomme wrote:

This patch documents the newly added support in GCC 7 for Cortex-M23 and
Cortex-M33 processors [1][2].

:

Is this ok for ?


Surely so for me.

Gerald




[GCC][AArch64][PATCH][Testsuite] Fix failing test vector_initialization_nostack.c

2016-11-07 Thread Tamar Christina
Hi all,

This fixes PR78142 by turning off scheduling for the test.
r241590 causes more registers to be used, so the SP register
happens to be picked and used.

I believe this test was explicitly checking that the
SP is not used if not needed.

Ran regression tests on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Tamar

gcc/testsuite/

2016-11-07  Tamar Christina  

PR middle-end/78142
* gcc.target/aarch64/vector_initialization_nostack.c
(dg-options): Disabled scheduling.
diff --git a/gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c b/gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c
index bbad04d00263b6a91b826b4911af92bdd226c821..71699281c5ce79fb5cf37e47b8ba078721c19f3a 100644
--- a/gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c
+++ b/gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -ftree-vectorize -fno-vect-cost-model" } */
+/* { dg-options "-O3 -ftree-vectorize -fno-vect-cost-model -fno-schedule-insns" } */
 float arr_f[100][100];
 float
 f9 (void)


Re: [PATCH][GCC/TESTSUITE] Make test for traditional-cpp depend on

2016-11-07 Thread Tamar Christina
Ping.


From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Tuesday, November 1, 2016 3:46:07 PM
To: GCC Patches; r...@cebitec.uni-bielefeld.de; mikest...@comcast.net
Cc: nd
Subject: [PATCH][GCC/TESTSUITE] Make test for traditional-cpp depend on

Hi all,

A glibc update recently broke this test by adding a CPP
macro that uses the ## token-pasting operator, which traditional-cpp
does not support.
The change in glibc that made the test fail is
6962682ffe5e5f0373047a0b894fee7a774be254.

This fixes PR78136 by changing the test to use a local
include file instead of one from glibc.
The intention of the test is to check that traditional-cpp does
not expand values inside <> blocks of #includes.
As such, the header has to be included via <> syntax. To do this,
the .exp has been modified to add the test directory to the
include search path.
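
For readers unfamiliar with the operator involved: ## is the ISO C
preprocessor operator that glues two tokens into one.  The snippet
below is only a minimal sketch of that operator; the PASTE macro and
paste_demo function are made-up names and are not the glibc macro
that broke the test.

```c
/* ISO C token pasting: the two macro arguments are glued into a
   single identifier.  A traditional (pre-standard) preprocessor does
   not implement the ## operator.  */
#define PASTE(a, b) a##b

static int
paste_demo (void)
{
  int value1 = 42;
  /* PASTE (value, 1) expands to the identifier value1.  */
  return PASTE (value, 1);
}
```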

Ran regression tests on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Tamar

gcc/testsuite/

2016-10-31  Tamar Christina  

PR testsuite/78136
* gcc.dg/cpp/trad/trad.exp
(dg-runtest): Added $srcdir/$subdir/ to Include dirs.
* gcc.dg/cpp/trad/include.c: Use local header file.


Re: New option -flimit-function-alignment

2016-11-07 Thread Bernd Schmidt

On 10/14/2016 08:28 PM, Bernd Schmidt wrote:

On 10/12/2016 09:27 PM, Denys Vlasenko wrote:

Yes, something like "if max_skip >= func_size, temporarily lower
max_skip to func_size-1" (because otherwise we can create padding
bigger-or-equal to the entire function in size, which is stupid
- it's better to just put the function in that space).

This would be nice.


That would be this patch. Bootstrapped and tested on x86_64-linux, ok?


Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01187.html
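
The rule quoted above ("if max_skip >= func_size, temporarily lower
max_skip to func_size-1") can be sketched as a tiny helper.  This is
an illustration only, not the code from the linked patch;
limit_max_skip is a hypothetical name, and the guard for a zero
func_size is my own assumption.

```c
/* Clamp the maximum alignment padding so that it is always smaller
   than the function itself.  FUNC_SIZE == 0 is treated as "size
   unknown" and leaves MAX_SKIP alone (an assumption of this sketch).  */
static unsigned int
limit_max_skip (unsigned int max_skip, unsigned int func_size)
{
  if (func_size > 0 && max_skip >= func_size)
    return func_size - 1;
  return max_skip;
}
```

With a 4-byte function and max_skip of 15, the padding is limited to
3 bytes, so the padding can never be as large as the function.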


Bernd



[GCC][PATCH] Fix ada compile error on Windows x86_64 (committed as r241907 under the obvious rule)

2016-11-07 Thread Tamar Christina
Hi all,

The changes in r240999 re-arranged includes and
left out signal.h for Windows x86 builds.

This breaks the build and prevents GCC builds from
completing with messages such as:

adaint.c:3317:19: error: 'SIGINT' undeclared (first use in this function); did you mean 'SAIT'?

else if (sig == SIGINT)
^~

Bootstrapped successfully on x86_64-w64-mingw32.

Committed as r241907.

Thanks,
Tamar

gcc/

2016-11-07  Tamar Christina  

* gcc/ada/adaint.c: Added signal.h for Windows.
diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index 353914708adbdf301f9d59aaa55debfed469f901..819ea47e449725b08c1a531b340ddc6a74b0e5db 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -190,6 +190,7 @@ UINT CurrentCCSEncoding;
 #include 
 #include 
 #include 
+#include <signal.h>
 #undef DIR_SEPARATOR
 #define DIR_SEPARATOR '\\'
 


Re: [PATCH 0/2] strncmp builtin expansion improvement

2016-11-07 Thread Richard Biener
On Sun, Nov 6, 2016 at 5:32 AM, Aaron Sawdey
 wrote:
> On Fri, 2016-11-04 at 20:43 -0600, Jeff Law wrote:
>> So what's the motivation here?  When we don't have any constants
>> then
>> I'd think we'd be better off punting into the library.
>
> When none of the args to strncmp are constant, I'd be inclined to
> agree. However the current state of affairs is that strncmp is not
> expanded in the case where the length is a constant but the strings are
> not. This patch allows the expansion to be attempted.
>
> The target's cmpstrnsi pattern can then make the decision of which
> cases to expand and which cases to punt to the library. For instance RX
> might always want to expand this for all cases as that target has an
> instruction that is intended to map to strncmp.
>
> My particular motivation is that I'm working on a cmpstrnsi pattern for
> powerpc64 and I want to have access to the case where the strings are
> not constant but the length is.
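
A minimal illustration of the case described above (constant length,
non-constant strings) might look like the following.  It is not a
testcase from this thread; same_prefix16 is a made-up name.

```c
#include <string.h>

/* Neither string is a compile-time constant, but the length is.
   This is the shape of call that the patch wants to hand to the
   target's cmpstrnsi pattern.  */
static int
same_prefix16 (const char *a, const char *b)
{
  return strncmp (a, b, 16) == 0;
}
```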

Your patchset doesn't contain a testcase so I really wonder which case
we know the string length but it is not constant.

Yes, there's COND_EXPR handling in c_strlen but that should be mostly
dead code -- the real code should be using get_maxval_strlen or
get_range_strlen but c_strlen does not use those.

Ideally the str optabs would get profile data and alignment similar to
the mem ones.

Care to share a testcase?

Thanks,
Richard.

> Thanks,
>Aaron
>
> --
> Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
> 050-2/C113  (507) 253-7520 home: 507/263-0782
> IBM Linux Technology Center - PPC Toolchain
>


[PATCH] Avoid peeling for gaps if accesses are aligned

2016-11-07 Thread Richard Biener

Currently we force peeling for gaps whenever element overrun can occur
but for aligned accesses we know that the loads won't trap and thus
we can avoid this.
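
As I understand the argument, it rests on the fact that an aligned
access of a power-of-two size no larger than a page cannot straddle a
page boundary, so if its first element is accessible the rest is too.
The sketch below only checks that arithmetic; the 4096-byte page size
and the helper name are assumptions for illustration.

```c
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size */

/* Return nonzero if the SIZE-byte access starting at ADDR stays
   within a single page.  */
static int
within_one_page (uintptr_t addr, unsigned int size)
{
  return addr / TOY_PAGE_SIZE == (addr + size - 1) / TOY_PAGE_SIZE;
}
```

A 16-byte access at the 16-byte-aligned address 4080 covers bytes
4080..4095 and stays in the first page, while the unaligned address
4088 would spill into the next one.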

Bootstrap and regtest running on x86_64-unknown-linux-gnu (I expect
some testsuite fallout here so didn't bother to invent a new testcase).

Posting this in case somebody thinks the overrun is a bad idea in general
(even when not trapping), for example under ASAN or valgrind.

Richard.

2016-11-07  Richard Biener  

* tree-vect-stmts.c (get_group_load_store_type): If the
access is aligned do not trigger peeling for gaps.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 241893)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -1770,6 +1771,11 @@ get_group_load_store_type (gimple *stmt,
   " non-consecutive accesses\n");
  return false;
}
+ /* If the access is aligned an overrun is fine.  */
+ if (overrun_p
+ && aligned_access_p
+  (STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt))))
+   overrun_p = false;
  if (overrun_p && !can_overrun_p)
{
  if (dump_enabled_p ())
@@ -1789,6 +1795,10 @@ get_group_load_store_type (gimple *stmt,
   /* If there is a gap at the end of the group then these optimizations
 would access excess elements in the last iteration.  */
   bool would_overrun_p = (gap != 0);
+  /* If the access is aligned an overrun is fine.  */
+  if (would_overrun_p
+ && aligned_access_p (STMT_VINFO_DATA_REF (stmt_info)))
+   would_overrun_p = false;
   if (!STMT_VINFO_STRIDED_P (stmt_info)
  && (can_overrun_p || !would_overrun_p)
  && compare_step_with_zero (stmt) > 0)



Re: [PATCH] Fix -O0 AVX512 comparison ICE (PR target/78227)

2016-11-07 Thread Uros Bizjak
On Mon, Nov 7, 2016 at 2:02 PM, Jakub Jelinek  wrote:
> Hi!
>
> The following testcases ICE at -O0, because ix86_expand_sse_cmp avoids using
> the passed in dest only if optimize or if there is some value overlap, but
> we actually need to do that also if we have a maskcmp where we want to use
> a different mode than dest has.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-11-07  Jakub Jelinek  
>
> PR target/78227
> * config/i386/i386.c (ix86_expand_sse_cmp): Force dest into
> cmp_mode argument even for -O0 if cmp_mode != mode and maskcmp.
>
> * gcc.target/i386/pr78227-1.c: New test.
> * gcc.target/i386/pr78227-2.c: New test.

OK with a small nit, please see inline ...

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2016-11-04 20:09:48.0 +0100
> +++ gcc/config/i386/i386.c  2016-11-07 10:14:15.625018144 +0100
> @@ -23561,6 +23561,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_
>  cmp_op1 = force_reg (cmp_ops_mode, cmp_op1);
>
>if (optimize
> +  || (cmp_mode != mode && maskcmp)

Maybe better to switch the condition around, so:

"(maskcmp && cmp_mode != mode)"

>|| (op_true && reg_overlap_mentioned_p (dest, op_true))
>|| (op_false && reg_overlap_mentioned_p (dest, op_false)))
>  dest = gen_reg_rtx (maskcmp ? cmp_mode : mode);
> --- gcc/testsuite/gcc.target/i386/pr78227-1.c.jj  2016-11-07 10:15:52.606762613 +0100
> +++ gcc/testsuite/gcc.target/i386/pr78227-1.c   2016-11-07 10:24:58.821480125 +0100
> @@ -0,0 +1,30 @@
> +/* PR target/78227 */
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512f -O0 -Wno-psabi" } */
> +
> +typedef int V __attribute__((vector_size (64)));
> +typedef long long int W __attribute__((vector_size (64)));
> +
> +V
> +foo1 (V v)
> +{
> +  return v > 0;
> +}
> +
> +V
> +bar1 (V v)
> +{
> +  return v != 0;
> +}
> +
> +W
> +foo2 (W w)
> +{
> +  return w > 0;
> +}
> +
> +W
> +bar2 (W w)
> +{
> +  return w != 0;
> +}
> --- gcc/testsuite/gcc.target/i386/pr78227-2.c.jj  2016-11-07 10:22:17.055670476 +0100
> +++ gcc/testsuite/gcc.target/i386/pr78227-2.c   2016-11-07 10:25:03.722413765 +0100
> @@ -0,0 +1,30 @@
> +/* PR target/78227 */
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512bw -O0 -Wno-psabi" } */
> +
> +typedef signed char V __attribute__((vector_size (64)));
> +typedef short int W __attribute__((vector_size (64)));
> +
> +V
> +foo1 (V v)
> +{
> +  return v > 0;
> +}
> +
> +V
> +bar1 (V v)
> +{
> +  return v != 0;
> +}
> +
> +W
> +foo2 (W w)
> +{
> +  return w > 0;
> +}
> +
> +W
> +bar2 (W w)
> +{
> +  return w != 0;
> +}
>
> Jakub


Re: [RFA] Fix various PPC build failures due to int-in-boolean-context code

2016-11-07 Thread Bernd Edlinger
On Fri, Oct 28, 2016 at 09:12:29AM -0600, Jeff Law wrote:
 >
 > The PPC port is stumbling over the new integer in boolean context warnings.
 >
 > In particular this code from rs6000_option_override_internal is
 > problematical:
 >
 >   HOST_WIDE_INT flags = ((TARGET_DEFAULT) ? TARGET_DEFAULT
 >  :
 > processor_target_table[cpu_index].target_enable);
 >
 > The compiler is flagging the (TARGET_DEFAULT) condition.  That's
 > supposed to be a boolean.
 >
 > After all the macro expansions are done it ultimately looks something
 > like this:
 >
 >  long flags = (((1L << 7)) ? (1L << 7)
 > : processor_target_table[cpu_index].target_enable);
 >
 > Note the (1L << 7) used as the condition for the ternary.  That's what
 > has the int-in-boolean-context warning tripping.  It's a false positive
 > IMHO.

Hmm...

 From the warning's perspective it would look far less suspicious,
if we make this an unsigned shift op.

I looked at options.h and I think we could also use one bit more
if the shift was unsigned.

Furthermore there are macros TARGET_..._P which do not put
brackets around the macro parameter.
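
Both points can be shown with a small stand-alone sketch.  The TOY_*
names below are made up for illustration and are not the macros that
opth-gen.awk generates; the sketch also assumes a 32-bit unsigned int.

```c
/* With an unsigned one, bit 31 of a 32-bit mask is usable; shifting a
   signed 1 into the sign bit would be undefined behavior.  */
#define TOY_MASK_TOP (1U << 31)
#define TOY_MASK_B   (1U << 1)

/* Unparenthesized argument, as in the old TARGET_..._P macros.  */
#define TOY_B_P_OLD(flags) ((flags & TOY_MASK_B) != 0)

/* Parenthesized argument, as proposed.  */
#define TOY_B_P_NEW(flags) (((flags) & TOY_MASK_B) != 0)
```

If the argument is an expression such as "a | b", the old form expands
to "a | (b & TOY_MASK_B)" because & binds tighter than |, so it can
report the bit as set when it is not.  The parenthesized form tests
the whole expression against the mask.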

So how about this?

Cross-compiler for powerpc-eabi builds without warning.

Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Bernd.
2016-11-07  Bernd Edlinger  

	* opth-gen.awk: Use unsigned shifts for bit masks.  Allow all bits
	to be used.  Add brackets around macro argument.

Index: gcc/opth-gen.awk
===
--- gcc/opth-gen.awk	(revision 241884)
+++ gcc/opth-gen.awk	(working copy)
@@ -350,11 +350,11 @@ for (i = 0; i < n_opts; i++) {
 		mask_bits[name] = 1
 		vname = var_name(flags[i])
 		mask = "MASK_"
-		mask_1 = "1"
+		mask_1 = "1U"
 		if (vname != "") {
 			mask = "OPTION_MASK_"
 			if (host_wide_int[vname] == "yes")
-mask_1 = "HOST_WIDE_INT_1"
+mask_1 = "HOST_WIDE_INT_1U"
 		} else
 			extra_mask_bits[name] = 1
 		print "#define " mask name " (" mask_1 " << " masknum[vname]++ ")"
@@ -362,16 +362,16 @@ for (i = 0; i < n_opts; i++) {
 }
 for (i = 0; i < n_extra_masks; i++) {
 	if (extra_mask_bits[extra_masks[i]] == 0)
-		print "#define MASK_" extra_masks[i] " (1 << " masknum[""]++ ")"
+		print "#define MASK_" extra_masks[i] " (1U << " masknum[""]++ ")"
 }
 
 for (var in masknum) {
 	if (var != "" && host_wide_int[var] == "yes") {
-		print" #if defined(HOST_BITS_PER_WIDE_INT) && " masknum[var] " >= HOST_BITS_PER_WIDE_INT"
+		print "#if defined(HOST_BITS_PER_WIDE_INT) && " masknum[var] " > HOST_BITS_PER_WIDE_INT"
 		print "#error too many masks for " var
 		print "#endif"
 	}
-	else if (masknum[var] > 31) {
+	else if (masknum[var] > 32) {
 		if (var == "")
 			print "#error too many target masks"
 		else
@@ -401,7 +401,7 @@ for (i = 0; i < n_opts; i++) {
 		print "#define TARGET_" name \
 		  " ((" vname " & " mask name ") != 0)"
 		print "#define TARGET_" name "_P(" vname ")" \
-		  " ((" vname " & " mask name ") != 0)"
+		  " (((" vname ") & " mask name ") != 0)"
 	}
 }
 for (i = 0; i < n_extra_masks; i++) {


Re: Simplify X / X, 0 / X and X % X

2016-11-07 Thread Jeff Law

On 11/07/2016 03:02 AM, Richard Biener wrote:

On Sat, Nov 5, 2016 at 3:30 AM, Jeff Law  wrote:

On 11/04/2016 02:07 PM, Marc Glisse wrote:


Hello,

since we were discussing this recently...

The condition is copied from the existing 0 % X case, visible in the
context of the diff.

As far as I understand, the main case where we do not want to optimize
is during constexpr evaluation in the C++ front-end (it wants to detect
the undefined behavior), and with late folding I think this means we
only need to care about an explicit 0/0, not about X/X where X would
become 0 after the simplification.
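
For reference, the identities in question, written out as plain C.
This is only a sketch of the arithmetic facts and not the match.pd
patterns from the patch; the function names are made up.  Each result
holds only for a nonzero argument, and calling any of them with 0 is
undefined behavior.

```c
/* For any nonzero x these identities hold, which is what lets a
   folder replace the left-hand sides without evaluating x.  */
static int div_self (int x) { return x / x; }   /* 1 when x != 0 */
static int zero_div (int x) { return 0 / x; }   /* 0 when x != 0 */
static int mod_self (int x) { return x % x; }   /* 0 when x != 0 */
```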

And later, if we do have something like X/0, we could handle it the same
way as we currently handle *(char*)0, insert a trap after that
instruction and clear the following code, which likely gives better code
than replacing 0/0 with 1.


Yup.  I'd prefer to insert a trap if we ultimately expose a division by zero
-- including cases where that division occurs as a result of a PHI arg being
zero and the PHI result being used as a denominator in a division
expression.

It ought to be extremely easy to detect & transform that case (and probably
warn for it too).


We have gimple-ssa-isolate-paths.c for that, right?

Right.  I was thinking about instrumenting for it today to see if it's
worth any effort.  It shouldn't take more than a few minutes once I
refamiliarize myself with isolate-paths.






jeff



Re: [PATCH] rtx_writer: avoid printing trailing default values

2016-11-07 Thread David Malcolm
On Fri, 2016-11-04 at 20:40 +0100, Bernd Schmidt wrote:
> On 11/04/2016 08:25 PM, David Malcolm wrote:
> 
> >   return m_compact;
> 
> Ok with this one plus a comment.
> 

Thanks.

Using m_compact required turning the static function into a (private)
member function.  For reference, here's what I committed (r241908),
having verified bootstrap & regrtest.
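
The core idea of the change, trimming the trailing operands that hold
default values before printing, can be sketched in isolation.  This is
a simplified stand-alone model and not the rtx_writer code:
toy_print_limit is a hypothetical name, and a NULL string stands in
for "operand has its default value".

```c
#include <stddef.h>

/* Return the index one past the last operand worth printing:
   everything from START up to, but not including, the trailing run
   of default (NULL) operands.  */
static int
toy_print_limit (const char *const *ops, int n_ops, int start)
{
  int limit = n_ops;
  while (limit > start && ops[limit - 1] == NULL)
    limit--;
  return limit;
}
```

Only trailing defaults are dropped; a default operand followed by a
non-default one still has to be printed so that positions stay
unambiguous.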
Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 241907)
+++ gcc/ChangeLog	(revision 241908)
@@ -1,3 +1,16 @@
+2016-11-07  David Malcolm  
+
+	* print-rtl.c (rtx_writer::operand_has_default_value_p): New
+	method.
+	(rtx_writer::print_rtx): In compact mode, omit trailing operands
+	that have the default values.
+	* print-rtl.h (rtx_writer::operand_has_default_value_p): New
+	method.
+	* rtl-tests.c (selftest::test_dumping_insns): Remove empty
+	label string from expected dump.
+	(seltest::test_uncond_jump): Remove trailing "(nil)" for REG_NOTES
+	from expected dump.
+
 2016-11-07  Jakub Jelinek  
 
 	PR target/77834
Index: gcc/print-rtl.c
===
--- gcc/print-rtl.c	(revision 241907)
+++ gcc/print-rtl.c	(revision 241908)
@@ -564,6 +564,43 @@
 }
 }
 
+/* Subroutine of rtx_writer::print_rtx.
+   In compact mode, determine if operand IDX of IN_RTX is interesting
+   to dump, or (if in a trailing position) it can be omitted.  */
+
+bool
+rtx_writer::operand_has_default_value_p (const_rtx in_rtx, int idx)
+{
+  const char *format_ptr = GET_RTX_FORMAT (GET_CODE (in_rtx));
+
+  switch (format_ptr[idx])
+{
+case 'e':
+case 'u':
+  return XEXP (in_rtx, idx) == NULL_RTX;
+
+case 's':
+  return XSTR (in_rtx, idx) == NULL;
+
+case '0':
+  switch (GET_CODE (in_rtx))
+	{
+	case JUMP_INSN:
+	  /* JUMP_LABELs are always omitted in compact mode, so treat
+	 any value here as omittable, so that earlier operands can
+	 potentially be omitted also.  */
+	  return m_compact;
+
+	default:
+	  return false;
+
+	}
+
+default:
+  return false;
+}
+}
+
 /* Print IN_RTX onto m_outfile.  This is the recursive part of printing.  */
 
 void
@@ -681,9 +718,18 @@
 	fprintf (m_outfile, " %d", INSN_UID (in_rtx));
 }
 
+  /* Determine which is the final operand to print.
+ In compact mode, skip trailing operands that have the default values
+ e.g. trailing "(nil)" values.  */
+  int limit = GET_RTX_LENGTH (GET_CODE (in_rtx));
+  if (m_compact)
+while (limit > idx && operand_has_default_value_p (in_rtx, limit - 1))
+  limit--;
+
   /* Get the format string and skip the first elements if we have handled
  them already.  */
-  for (; idx < GET_RTX_LENGTH (GET_CODE (in_rtx)); idx++)
+
+  for (; idx < limit; idx++)
 print_rtx_operand (in_rtx, idx);
 
   switch (GET_CODE (in_rtx))
Index: gcc/print-rtl.h
===
--- gcc/print-rtl.h	(revision 241907)
+++ gcc/print-rtl.h	(revision 241908)
@@ -39,6 +39,7 @@
   void print_rtx_operand_code_r (const_rtx in_rtx);
   void print_rtx_operand_code_u (const_rtx in_rtx, int idx);
   void print_rtx_operand (const_rtx in_rtx, int idx);
+  bool operand_has_default_value_p (const_rtx in_rtx, int idx);
 
  private:
   FILE *m_outfile;
Index: gcc/rtl-tests.c
===
--- gcc/rtl-tests.c	(revision 241907)
+++ gcc/rtl-tests.c	(revision 241908)
@@ -122,7 +122,7 @@
   /* Labels.  */
   rtx_insn *label = gen_label_rtx ();
   CODE_LABEL_NUMBER (label) = 42;
-  ASSERT_RTL_DUMP_EQ ("(clabel 0 42 \"\")\n", label);
+  ASSERT_RTL_DUMP_EQ ("(clabel 0 42)\n", label);
 
   LABEL_NAME (label)= "some_label";
   ASSERT_RTL_DUMP_EQ ("(clabel 0 42 (\"some_label\"))\n", label);
@@ -176,8 +176,7 @@
   ASSERT_TRUE (control_flow_insn_p (jump_insn));
 
   ASSERT_RTL_DUMP_EQ ("(cjump_insn 1 (set (pc)\n"
-		  "(label_ref 0))\n"
-		  " (nil))\n",
+		  "(label_ref 0)))\n",
 		  jump_insn);
 }
 


[patch, fortran, committed] Fill in some more locations

2016-11-07 Thread Thomas Koenig

Hello world,

I have committed the little patchlet below as obvious, after
regression-testing.

Regards

Thomas

2016-11-07  Thomas Koenig  

PR fortran/78826
* match.c (gfc_match_select_type):  Add where for expr1.
* resolve.c (resolve_select_type): Add where for expr1 of new
statement.
Index: match.c
===
--- match.c	(revision 241887)
+++ match.c	(working copy)
@@ -5898,6 +5898,7 @@ gfc_match_select_type (void)
 {
   expr1 = gfc_get_expr ();
   expr1->expr_type = EXPR_VARIABLE;
+  expr1->where = expr2->where;
   if (gfc_get_sym_tree (name, NULL, &expr1->symtree, false))
 	{
 	  m = MATCH_ERROR;
Index: resolve.c
===
--- resolve.c	(revision 241887)
+++ resolve.c	(working copy)
@@ -8857,6 +8857,7 @@ resolve_select_type (gfc_code *code, gfc_namespace
 	  new_st->expr1->value.function.actual = gfc_get_actual_arglist ();
 	  new_st->expr1->value.function.actual->expr = gfc_get_variable_expr (selector_expr->symtree);
 	  new_st->expr1->value.function.actual->expr->where = code->loc;
+	  new_st->expr1->where = code->loc;
 	  gfc_add_vptr_component (new_st->expr1->value.function.actual->expr);
 	  vtab = gfc_find_derived_vtab (body->ext.block.case_list->ts.u.derived);
 	  st = gfc_find_symtree (vtab->ns->sym_root, vtab->name);


Re: [patch, fortran, committed] Fill in some more locations

2016-11-07 Thread Thomas Koenig

On 07.11.2016 at 16:25, Thomas Koenig wrote:


PR fortran/78826


... should have been PR 78226.



[PING, PATCH] Do not simplify "(and (reg) (const bit))" to if_then_else.

2016-11-07 Thread Dominik Vogt
Ping.

https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02525.html

On Mon, Oct 31, 2016 at 08:56:10PM +0100, Dominik Vogt wrote:
> The attached patch does a little change in
> combine.c:combine_simplify_rtx() to prevent a "simplification"
> where the rtl code gets more complex in reality.  The complete
> description of the change can be found in the commit comment in
> the attached patch.
> 
> The patch reduces the number of patterns in the s390 backend and
> slightly reduces the size of the compiled SPEC2006 code.  (Code
> size or runtime only tested on s390x with -m64.)  It is
> theoretically possible that this patch leads to somewhat worse
> code on some target if that only has a pattern for the formerly replaced
> rtl expression but not for the original one.
> 
> The patch has passed the testsuite on s390, s390x biarch, x86_64
> and Power biarch.
> 
> --
> 
> (I'm not sure whether the const_int expression can appear in both
> operands or only as the second.  If the latter is the case, the
> conditions can be simplified a bit.)
> 
> What do you think about this patch?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Fix build of jit (was Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v3))

2016-11-07 Thread David Malcolm
On Mon, 2016-11-07 at 11:03 +0100, Martin Liška wrote:
> Hello.
> 
> After discussion with Jakub, I'm resending new version of the patch,
> where I changed following:
> 1) gimplify_ctxp->live_switch_vars is used to track variables
> introduced in switch_expr. Every time
>a case_label_expr is seen, these are unpoisoned. It's quite
> conservative, however it covers all
>corner cases one can come up with. Compared to clang, we are much
> more precise in switch statements
>where a variable liveness crosses label boundary.
> 2) I found a bug where ASAN_CHECK was optimized out due to missing
> check of IFN_ASAN_MARK internal fn.
>Test was added for that.
> 3) Multiple switch tests have been added, which is going to be sent
> in upcoming email.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression
> tests (+ asan bootstrap finishes
> successfully).

The patch (r241896) introduced an error in the build of the jit:

../../src/gcc/jit/jit-builtins.c:62:1: error: invalid conversion from
‘int’ to ‘gcc::jit::built_in_attribute’ [-fpermissive]
 };
 ^

which seems to be due to the "0" for ATTRS in:

--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -165,6 +165,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT,
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT,
  "__asan_after_dynamic_init",
  BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory",
+ BT_FN_VOID_PTR_PTRMODE, 0)
+DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNCLOBBER_N, "__asan_unpoison_stack_memory",
+ BT_FN_VOID_PTR_PTRMODE, 0)

Is the attached patch OK as a fix? (assuming testing passes)  Or should
these builtins have other attrs?  (sorry, am not very familiar with the
sanitizer code).
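
For readers unfamiliar with this failure mode, here is a minimal sketch of why a literal 0 breaks a C++ consumer of a .def-style list. The enum, struct and macro names are hypothetical stand-ins chosen for illustration, not the real contents of jit-builtins.c or sanitizer.def.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for an attribute enum used by a C++ consumer.
enum built_in_attribute { ATTR_NULL, ATTR_NOTHROW_LEAF_LIST };

struct builtin_data
{
  const char *name;
  built_in_attribute attr;
};

// Hypothetical X-macro list playing the role of a .def file.
#define DEMO_BUILTINS(X) \
  X ("__asan_after_dynamic_init", ATTR_NOTHROW_LEAF_LIST) \
  X ("__asan_poison_stack_memory", ATTR_NULL) \
  X ("__asan_unpoison_stack_memory", ATTR_NULL)

// Writing 0 instead of ATTR_NULL in the list above would not compile here:
// C++ has no implicit conversion from int to an enumeration type.
#define DEMO_ROW(NAME, ATTR) { NAME, ATTR },
static const builtin_data demo_table[] = { DEMO_BUILTINS (DEMO_ROW) };
#undef DEMO_ROW

static const int demo_table_size = sizeof demo_table / sizeof demo_table[0];

// Small self-check: both new entries carry the named "no attributes" value.
static void demo_check_table ()
{
  assert (demo_table[1].attr == ATTR_NULL);
  assert (demo_table[2].attr == ATTR_NULL);
}
```

A consumer that expands the same list into plain integers, such as ECF_* flag masks, would accept 0, so a named token that each consumer defines for itself keeps both kinds of expansion valid.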

Dave
From 6db5f9e50dc95f504d33970ee553172bbf400ae7 Mon Sep 17 00:00:00 2001
From: David Malcolm 
Date: Mon, 7 Nov 2016 11:21:20 -0500
Subject: [PATCH] Fix build of jit

gcc/ChangeLog:
	* asan.c (ATTR_NULL): Define.
	* sanitizer.def (BUILT_IN_ASAN_CLOBBER_N): Use ATTR_NULL rather
	than 0.
	(BUILT_IN_ASAN_UNCLOBBER_N): Likewise.
---
 gcc/asan.c| 2 ++
 gcc/sanitizer.def | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 1e0ce8d..4a124cb 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2463,6 +2463,8 @@ initialize_sanitizer_builtins (void)
 #define BT_FN_I16_CONST_VPTR_INT BT_FN_IX_CONST_VPTR_INT[4]
 #define BT_FN_I16_VPTR_I16_INT BT_FN_IX_VPTR_IX_INT[4]
 #define BT_FN_VOID_VPTR_I16_INT BT_FN_VOID_VPTR_IX_INT[4]
+#undef ATTR_NULL
+#define ATTR_NULL 0
 #undef ATTR_NOTHROW_LEAF_LIST
 #define ATTR_NOTHROW_LEAF_LIST ECF_NOTHROW | ECF_LEAF
 #undef ATTR_TMPURE_NOTHROW_LEAF_LIST
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index 1c142e9..596b8b0 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -166,9 +166,9 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT,
 		  "__asan_after_dynamic_init",
 		  BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory",
-		  BT_FN_VOID_PTR_PTRMODE, 0)
+		  BT_FN_VOID_PTR_PTRMODE, ATTR_NULL)
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNCLOBBER_N, "__asan_unpoison_stack_memory",
-		  BT_FN_VOID_PTR_PTRMODE, 0)
+		  BT_FN_VOID_PTR_PTRMODE, ATTR_NULL)
 
 /* Thread Sanitizer */
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_INIT, "__tsan_init", 
-- 
1.8.5.3



Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-07 Thread Bernd Schmidt

On 11/04/2016 05:35 PM, Jakub Jelinek wrote:


2016-11-04  Jakub Jelinek  

PR target/77834
* dse.c (dse_step5): Call scan_reads even if just
insn_info->frame_read.  Improve and fix dump file messages.


Sounds reasonable, and I checked and it seems not to change code 
generation for any .i files from my collection. So, OK.



Bernd


Re: Fix build of jit (was Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v3))

2016-11-07 Thread Jakub Jelinek
On Mon, Nov 07, 2016 at 11:07:13AM -0500, David Malcolm wrote:
> The patch (r241896) introduced an error in the build of the jit:
> 
> ../../src/gcc/jit/jit-builtins.c:62:1: error: invalid conversion from
> ‘int’ to ‘gcc::jit::built_in_attribute’ [-fpermissive]
>  };
>  ^
> 
> which seems to be due to the "0" for ATTRS in:
> 
> --- a/gcc/sanitizer.def
> +++ b/gcc/sanitizer.def
> @@ -165,6 +165,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT,
>  DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT,
> "__asan_after_dynamic_init",
> BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory",
> +   BT_FN_VOID_PTR_PTRMODE, 0)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNCLOBBER_N, "__asan_unpoison_stack_memory",
> +   BT_FN_VOID_PTR_PTRMODE, 0)

I believe the 0 here is a bug, I'd think we should be using something like
ATTR_TMPURE_NOTHROW_LEAF_LIST that we are using for __asan_load* - the functions
aren't going to throw, nor call anything in the current TU.  Not 100% sure
about the TMPURE, after all they do write/read memory (the shadow one).
So maybe ATTR_NOTHROW_LEAF_LIST instead for now?  Martin?

> Is the attached patch OK as a fix? (assuming testing passes)  Or should
> these builtins have other attrs?  (sorry, am not very familiar with the
> sanitizer code).

Jakub


[PATCH,testsuite] MIPS: Upgrade to MIPS IV if using (HAS_MOVN) with MIPS III.

2016-11-07 Thread Toma Tabacu
Hi,

The (HAS_MOVN) option should cause an upgrade to MIPS IV if the target is
pre-MIPS IV. However, the upgrade condition checks for "$isa < 3", which means
that we won't upgrade if we're targeting MIPS III.

This results in failures for the movcc-{1,2,3}.c and branch-cost-2.c tests
when the target is MIPS III.

This patch fixes the condition to include MIPS III.
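
As a rough illustration of the off-by-one, here is a sketch that models the ISA level as a plain integer (3 for MIPS III, 4 for MIPS IV). The function names are made up for this example and do not exist in mips.exp.

```cpp
#include <cassert>

// Old condition: only ISA levels below MIPS III trigger the upgrade.
static bool needs_mips4_old (int isa, bool has_movn)
{ return isa < 3 && has_movn; }

// Fixed condition: every pre-MIPS IV level triggers it, including MIPS III.
static bool needs_mips4_new (int isa, bool has_movn)
{ return isa < 4 && has_movn; }

// Small self-check of the one case where the two conditions disagree.
static void demo_check_isa ()
{
  assert (!needs_mips4_old (3, true));  // MIPS III was skipped
  assert (needs_mips4_new (3, true));   // MIPS III is now upgraded
}
```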

Tested with mips-mti-elf.

Regards,
Toma Tabacu

gcc/testsuite/ChangeLog:

2016-11-07  Toma Tabacu  

* gcc.target/mips/mips.exp (mips-dg-options): Upgrade to MIPS IV if
using (HAS_MOVN) with MIPS III.

diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp
index 39f44ff..e22d782 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -1129,7 +1129,7 @@ proc mips-dg-options { args } {
 # We need MIPS IV or higher for:
#
#
-   } elseif { $isa < 3
+   } elseif { $isa < 4
   && [mips_have_test_option_p options "HAS_MOVN"] } {
mips_make_test_option options "-mips4"
 # We need MIPS III or higher for:



Re: [PATCH] Make direct emission of time profiler counter

2016-11-07 Thread Christophe Lyon
On 7 November 2016 at 09:58, Martin Liška  wrote:
> On 11/05/2016 09:38 AM, Jan Hubicka wrote:
>> Looks OK if it passes.
>>
>> Honza
>
> Thanks, fixed on trunk as r241894.
> Martin

Thanks, this fixed the problems I reported.

Christophe


Re: [PATCH 0/2] strncmp builtin expansion improvement

2016-11-07 Thread Aaron Sawdey
On Mon, 2016-11-07 at 15:26 +0100, Richard Biener wrote:
> Your patchset doesn't contain a testcase so I really wonder which
> case we know the string length but it is not constant.
> 
> Yes, there's COND_EXPR handling in c_strlen but that should be mostly
> dead code -- the real code should be using get_maxval_strlen or
> get_range_strlen but c_strlen does not use those.
> 
> Ideally the str optabs would get profile data and alignment similar
> to the mem ones.
> 
> Care to share a testcase?

I think I haven't explained this well. The case I am interested in is
where the string arguments are indeed of unknown length, but the length
argument to strncmp is a constant. This is the case that I'm attempting
to address with this patch series.

This is from the strncmp-1.c test case, but modified for a constant
length argument to strncmp.

#include 
#include 
#include 

void
test (const unsigned char *s1, const unsigned char *s2, int expected)
{
  register int value = strncmp ((char *) s1, (char *) s2, 5);

  if (expected < 0 && value >= 0)
    abort ();
  else if (expected == 0 && value != 0)
    abort ();
  else if (expected > 0 && value <= 0)
    abort ();
}

I added this small bit to builtins.c so we can see what happens:

Index: gcc/builtins.c
===
--- gcc/builtins.c  (revision 241911)
+++ gcc/builtins.c  (working copy)
@@ -67,6 +67,7 @@
 #include "internal-fn.h"
 #include "case-cfn-macros.h"
 #include "gimple-fold.h"
+#include "print-tree.h"
 
 
 struct target_builtins default_target_builtins;
@@ -3932,6 +3933,9 @@
 len1 = c_strlen (arg1, 1);
 len2 = c_strlen (arg2, 1);
 
+printf("len1 = %p len2 = %p\n",(void*)len1,(void*)len2);
+debug_tree(arg3);
+
 if (len1)
   len1 = size_binop_loc (loc, PLUS_EXPR, ssize_int (1), len1);
 if (len2)

The output then is as follows:

build/gcc/xgcc -B build/gcc -S -O1 strncmp-test.c
len1 = (nil) len2 = (nil)
 
constant 5>

Looking in the .s file you can see that strncmp was not expanded.
However the current code in i386.md for cmpstrnsi does not handle the
case where the 0 byte in both strings may occur before the length given
to strncmp.

test:
.LFB22:
        .cfi_startproc
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        movl    %edx, %ebx
        movl    $5, %edx
        call    strncmp
        movl    %ebx, %edx

I think it's pretty clear from the code in expand_builtin_strncmp that
if len1 and len2 are both NULL, you end up with len=len2 and then it
returns NULL_RTX.
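
To make the hazard mentioned above concrete (a 0 byte occurring before the constant length), here is a minimal sketch comparing a fixed-length byte compare with strncmp. The helper names and the sample buffers are made up for this illustration. It is not a model of what cmpstrnsi does, only of why an expansion cannot blindly compare all N bytes.

```cpp
#include <cassert>
#include <cstring>

// Reduce a comparison result to its sign: -1, 0 or 1.
static int sign (int v) { return (v > 0) - (v < 0); }

// Naive fixed-length compare: looks at all N bytes, ignoring terminators.
static int fixed_len_compare (const char *a, const char *b, std::size_t n)
{ return sign (std::memcmp (a, b, n)); }

// Reference semantics: strncmp stops at the first 0 byte or after N bytes.
static int bounded_str_compare (const char *a, const char *b, std::size_t n)
{ return sign (std::strncmp (a, b, n)); }

// Both strings are "ab"; the bytes after the terminator differ.
static const char demo_s1[6] = "ab\0xy";
static const char demo_s2[6] = "ab\0zz";

// Small self-check: the two compares disagree on these inputs.
static void demo_check_strncmp ()
{
  assert (bounded_str_compare (demo_s1, demo_s2, 5) == 0);
  assert (fixed_len_compare (demo_s1, demo_s2, 5) != 0);
}
```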

Thanks,
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: [Patch, rtl] PR middle-end/78016, keep REG_NOTE order during insn copy

2016-11-07 Thread Bernd Schmidt

On 11/03/2016 03:00 PM, Eric Botcazou wrote:

FWIW here's a more complete version of my patch which I'm currently
testing. Let me know if you think it's at least a good enough
intermediate step to be installed.


It is, thanks.


Testing showed the same issue as Jiong found, so I've committed it with 
that extra tweak.



Bernd



Re: [Patch, rtl] PR middle-end/78016, keep REG_NOTE order during insn copy

2016-11-07 Thread Jiong Wang



On 07/11/16 17:04, Bernd Schmidt wrote:

On 11/03/2016 03:00 PM, Eric Botcazou wrote:

FWIW here's a more complete version of my patch which I'm currently
testing. Let me know if you think it's at least a good enough
intermediate step to be installed.


It is, thanks.


Testing showed the same issue as Jiong found, so I've committed it 
with that extra tweak.


Thanks very much!  I have closed PR middle-end/78016

Regards,
Jiong


Re: [match.pd] Fix for PR35691

2016-11-07 Thread Prathamesh Kulkarni
On 7 November 2016 at 15:43, Richard Biener  wrote:
> On Fri, 4 Nov 2016, Prathamesh Kulkarni wrote:
>
>> On 4 November 2016 at 13:41, Richard Biener  wrote:
>> > On Thu, 3 Nov 2016, Marc Glisse wrote:
>> >
>> >> On Thu, 3 Nov 2016, Richard Biener wrote:
>> >>
>> >> > > > > The transform would also work for vectors (element_precision for
>> >> > > > > the test but also a value-matching zero which should ensure the
>> >> > > > > same number of elements).
>> >> > > > Um sorry, I didn't get how to check vectors to be of equal length 
>> >> > > > by a
>> >> > > > matching zero.
>> >> > > > Could you please elaborate on that ?
>> >> > >
>> >> > > He may have meant something like:
>> >> > >
>> >> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
>> >> >
>> >> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which was 
>> >> > the
>> >> > point of the pattern.
>> >>
>> >> Oups, that's what I had written first, and then I somehow managed to 
>> >> confuse
>> >> myself enough to remove it so as to remove the call to types_match :-(
>> >>
>> >> > > So the last operand is checked with operand_equal_p instead of
>> >> > > integer_zerop. But the fact that we could compute bit_ior on the
>> >> > > comparison results should already imply that the number of elements 
>> >> > > is the
>> >> > > same.
>> >> >
>> >> > Though for equality compares we also allow scalar results IIRC.
>> >>
>> >> Oh, right, I keep forgetting that :-( And I have no idea how to generate 
>> >> one
>> >> for a testcase, at least until the GIMPLE FE lands...
>> >>
>> >> > > On platforms that have IOR on floats (at least x86 with SSE, maybe 
>> >> > > some
>> >> > > vector mode on s390?), it would be cool to do the same for floats 
>> >> > > (most
>> >> > > likely at the RTL level).
>> >> >
>> >> > On GIMPLE view-converts could come to the rescue here as well.  Or we
>> >> > can just allow bit-and/or on floats as much as we allow them on pointers.
>> >>
>> >> Would that generate sensible code on targets that do not have logic insns 
>> >> for
>> >> floats? Actually, even on x86_64 that generates inefficient code, so there
>> >> would be some work (for instance grep finds no gen_iordf3, only 
>> >> gen_iorv2df3).
>> >>
>> >> I am also a bit wary of doing those obfuscating optimizations too early...
>> >> a==0 is something that other optimizations might use. long
>> >> c=(long&)a|(long&)b; (double&)c==0; less so...
>> >>
>> >> (and I am assuming that signaling NaNs don't make the whole transformation
>> >> impossible, which might be wrong)
>> >
>> > Yeah.  I also think it's not so much important - I just wanted to mention
>> > vectors...
>> >
>> > Btw, I still think we need a more sensible infrastructure for passes
>> > to gather, analyze and modify complex conditions.  (I'm always pointing
>> > to tree-affine.c as an, albeit not very good, example for handling
>> > a similar problem)
>> Thanks for mentioning the value-matching capture @@, I wasn't aware of
>> this match.pd feature.
>> The current patch keeps it restricted to only bitwise operators on integers.
>> Bootstrap+test running on x86_64-unknown-linux-gnu.
>> OK to commit if passes ?
>
> +/* PR35691: Transform
> +   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
> +   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
> +
>
> Please omit the vertical space
>
> +(for bitop (bit_and bit_ior)
> + cmp (eq ne)
> + (simplify
> +  (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop))
>
> if you capture the first integer_zerop as @2 then you can re-use it...
>
> +   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +   && INTEGRAL_TYPE_P (TREE_TYPE (@1))
> +   && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE
> (@1)))
> +(cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0));
>
> ... here inplace of the { build_zero_cst ... }.
>
> Ok with that changes.
Thanks, committed the attached version as r241915.

>
> Richard.
2016-11-07  Prathamesh Kulkarni  

PR middle-end/35691
* match.pd: Add following two patterns:
(x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
(x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.

testsuite/
* gcc.dg/pr35691-1.c: New test-case.
* gcc.dg/pr35691-4.c: Likewise.

diff --git a/gcc/match.pd b/gcc/match.pd
index 48f7351..29ddcd8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -519,6 +519,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (TYPE_UNSIGNED (type))
 (bit_and @0 (bit_not (lshift { build_all_ones_cst (type); } @1)
 
+/* PR35691: Transform
+   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
+   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
+(for bitop (bit_and bit_ior)
+ cmp (eq ne)
+ (simplify
+  (bitop (cmp @0 integer_zerop@2) (cmp @1 integer_zerop))
+   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1)))
+(cmp (bit_ior @0 (convert @1)) @2
+
 /* Fold (A
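
The two rewrites in the pattern above can be checked on ordinary integers. The following is a small sketch, assuming an unsigned x and a signed y of the same precision so that the conversion to x's type mirrors the (convert @1) in the pattern. The function names are illustrative only.

```cpp
#include <cassert>

// Original forms: two comparisons against zero combined with & or |.
static bool both_zero (unsigned x, int y) { return (x == 0) & (y == 0); }
static bool any_nonzero (unsigned x, int y) { return (x != 0) | (y != 0); }

// Folded forms: OR the operands first (converting y to x's type), then
// compare once.  The OR is zero exactly when every bit of both is zero.
static bool both_zero_folded (unsigned x, int y)
{ return (x | static_cast<unsigned> (y)) == 0; }
static bool any_nonzero_folded (unsigned x, int y)
{ return (x | static_cast<unsigned> (y)) != 0; }

// Small self-check over a few representative values of each operand.
static void demo_check_pr35691 ()
{
  const unsigned xs[] = { 0u, 1u, ~0u };
  const int ys[] = { 0, 1, -1 };
  for (unsigned x : xs)
    for (int y : ys)
      {
        assert (both_zero (x, y) == both_zero_folded (x, y));
        assert (any_nonzero (x, y) == any_nonzero_folded (x, y));
      }
}
```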

Re: [match.pd] Fix for PR35691

2016-11-07 Thread Prathamesh Kulkarni
On 7 November 2016 at 23:06, Prathamesh Kulkarni
 wrote:
> On 7 November 2016 at 15:43, Richard Biener  wrote:
>> On Fri, 4 Nov 2016, Prathamesh Kulkarni wrote:
>>
>>> On 4 November 2016 at 13:41, Richard Biener  wrote:
>>> > On Thu, 3 Nov 2016, Marc Glisse wrote:
>>> >
>>> >> On Thu, 3 Nov 2016, Richard Biener wrote:
>>> >>
>>> >> > > > > The transform would also work for vectors (element_precision for
>>> >> > > > > the test but also a value-matching zero which should ensure the
>>> >> > > > > same number of elements).
>>> >> > > > Um sorry, I didn't get how to check vectors to be of equal length 
>>> >> > > > by a
>>> >> > > > matching zero.
>>> >> > > > Could you please elaborate on that ?
>>> >> > >
>>> >> > > He may have meant something like:
>>> >> > >
>>> >> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
>>> >> >
>>> >> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which 
>>> >> > was the
>>> >> > point of the pattern.
>>> >>
>>> >> Oups, that's what I had written first, and then I somehow managed to 
>>> >> confuse
>>> >> myself enough to remove it so as to remove the call to types_match :-(
>>> >>
>>> >> > > So the last operand is checked with operand_equal_p instead of
>>> >> > > integer_zerop. But the fact that we could compute bit_ior on the
>>> >> > > comparison results should already imply that the number of elements 
>>> >> > > is the
>>> >> > > same.
>>> >> >
>>> >> > Though for equality compares we also allow scalar results IIRC.
>>> >>
>>> >> Oh, right, I keep forgetting that :-( And I have no idea how to generate 
>>> >> one
>>> >> for a testcase, at least until the GIMPLE FE lands...
>>> >>
>>> >> > > On platforms that have IOR on floats (at least x86 with SSE, maybe 
>>> >> > > some
>>> >> > > vector mode on s390?), it would be cool to do the same for floats 
>>> >> > > (most
>>> >> > > likely at the RTL level).
>>> >> >
>>> >> > On GIMPLE view-converts could come to the rescue here as well.  Or we
>>> >> > can just allow bit-and/or on floats as much as we allow them on pointers.
>>> >>
>>> >> Would that generate sensible code on targets that do not have logic 
>>> >> insns for
>>> >> floats? Actually, even on x86_64 that generates inefficient code, so 
>>> >> there
>>> >> would be some work (for instance grep finds no gen_iordf3, only 
>>> >> gen_iorv2df3).
>>> >>
>>> >> I am also a bit wary of doing those obfuscating optimizations too 
>>> >> early...
>>> >> a==0 is something that other optimizations might use. long
>>> >> c=(long&)a|(long&)b; (double&)c==0; less so...
>>> >>
>>> >> (and I am assuming that signaling NaNs don't make the whole 
>>> >> transformation
>>> >> impossible, which might be wrong)
>>> >
>>> > Yeah.  I also think it's not so much important - I just wanted to mention
>>> > vectors...
>>> >
>>> > Btw, I still think we need a more sensible infrastructure for passes
>>> > to gather, analyze and modify complex conditions.  (I'm always pointing
>>> > to tree-affine.c as an, albeit not very good, example for handling
>>> > a similar problem)
>>> Thanks for mentioning the value-matching capture @@, I wasn't aware of
>>> this match.pd feature.
>>> The current patch keeps it restricted to only bitwise operators on integers.
>>> Bootstrap+test running on x86_64-unknown-linux-gnu.
>>> OK to commit if passes ?
>>
>> +/* PR35691: Transform
>> +   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
>> +   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
>> +
>>
>> Please omit the vertical space
>>
>> +(for bitop (bit_and bit_ior)
>> + cmp (eq ne)
>> + (simplify
>> +  (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop))
>>
>> if you capture the first integer_zerop as @2 then you can re-use it...
>>
>> +   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>> +   && INTEGRAL_TYPE_P (TREE_TYPE (@1))
>> +   && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE
>> (@1)))
>> +(cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0));
>>
>> ... here inplace of the { build_zero_cst ... }.
>>
>> Ok with that changes.
> Thanks, committed the attached version as r241915.
ugh, the svn commit message has:

testsuite/
* gcc.dg/pr35691-1.c: New test-case.
* gcc.dg/pr35691-4.c: Likewise.

pr35691-4.c was a typo, should be pr35691-2.c :/
However testsuite/ChangeLog correctly has entry for pr35691-2.c
Is it possible to edit the commit message for r241915 ?
Sorry about this.

Regards,
Prathamesh
>
>>
>> Richard.


Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Mike Stump
On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:
> This is an initial patch in a series that converts Darwin's configury to 
> detect ld64 features, rather than the current process of hard-coding them on 
> target system version.

So, I really do hate to ask, but does this have to be a config option?  
Normally, we'd just have configure examine things by itself.  For Canadian 
crosses, there should be enough state present to key off of directly, especially 
if they are wired up to work.

I'd rather have the thing that doesn't just work without that config flag, 
just work.  I'd like to think I can figure out how to make it just work, if 
given an idea of what doesn't actually work.

Essentially, you do the operation that doesn't work, detect it failed to work, 
then you know it didn't work.



Re: [PATCH][GCC/TESTSUITE] Make test for traditional-cpp depend on

2016-11-07 Thread Mike Stump
On Nov 1, 2016, at 8:46 AM, Tamar Christina  wrote:
> 
> A glibc update recently broke this test by adding a CPP
> macro that uses the ## token-pasting operator, which traditional-cpp
> does not support.
> The change in glibc that made the test fail is from
> 6962682ffe5e5f0373047a0b894fee7a774be254.
> 
> This fixes (PR78136) by changing the test to use a local
> include file instead of one from glibc.
> The intention of the test is to test that traditional-cpp does
> not expand values inside <> blocks of #includes.
> As such the header has to be included via <> syntax. To do this
> the .exp has been modified to add the test directory to the
> include search path.
> 
> Ran regression tests on aarch64-none-linux-gnu.
> 
> Ok for trunk?

Ok.

Can you remove the comment "Newlib uses ## when including stdlib.h as of 
2007-09-07." while you are at it?  I think it doesn't make any sense after the 
change unless one reads the history.

> 2016-10-31  Tamar Christina  
> 
>   PR testsuite/78136
>   * gcc.dg/cpp/trad/trad.exp
>   (dg-runtest): Added $srcdir/$subdir/ to Include dirs.
>   * gcc.dg/cpp/trad/include.c: Use local header file.



Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Iain Sandoe

> On 7 Nov 2016, at 09:51, Mike Stump  wrote:
> 
> [ possible dup ]
> 
>> Begin forwarded message:
>> 
>> From: Mike Stump 
>> Subject: Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to 
>> be detected as Darwin's linker
>> Date: November 7, 2016 at 9:48:53 AM PST
>> To: Iain Sandoe 
>> Cc: GCC Patches , Jeff Law 
>> 
>> On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:
>>> This is an initial patch in a series that converts Darwin's configury to 
>>> detect ld64 features, rather than the current process of hard-coding them 
>>> on target system version.
>> 
>> So, I really do hate to ask, but does this have to be a config option?  
>> Normally, we'd just have configure examine things by itself.  For Canadian 
>> crosses, there should be enough state present to key off of directly, 
>> especially if they are wired up to work.
>> 
>> I'd rather have the thing that doesn't just work without that config flag, 
>> just work.  I'd like to think I can figure out how to make it just work, if 
>> given an idea of what doesn't actually work.
>> 
>> Essentially, you do the operation that doesn't work, detect it failed to 
>> work, then you know it didn't work.

Well, if you can run the tool, that’s fine - I wanted to cover the base where 
we have a native or Canadian that’s using a newer ld64 than is installed by the 
‘last available xcode’ on a given platform - which is the common case (since 
the older versions of ld64 in particular don’t really support the features we 
want, they def. won’t support building LLVM for ex.).

I am *really really* trying to get away from the assumption that darwinNN 
implies some ld64 capability - because that’s just wrong, really - makes way 
too many assumptions.  I also want to get to the “end game” that we just 
configure *-*-darwin and use the cross-capability of the toolchain (we’re a 
ways away from that upstream, but my local patch set achieves it at least for 
5.4 and 6.2).

It’s true that adding configure options is not #1 choice in life - but I think 
darwin is getting to the stage where there are too many choices to cover 
without.

Open to alternate suggestions, of course
Iain



Re: [PATCH fix PR71767 1/4 : ld64 atoms] Make PIC indirections and constant labels linker-visible.

2016-11-07 Thread Mike Stump
On Nov 6, 2016, at 11:37 AM, Iain Sandoe  wrote:
> OK for trunk?
> OK for open branches?

Ok.

> 2016-11-06  Iain Sandoe  
> 
>   PR target/71767
>   * config/darwin.c (machopic_indirection_name): Make data section
>   indirections linker-visible.
>   * config/darwin.h (ASM_GENERATE_INTERNAL_LABEL): Make local constant
>   labels linker-visible.


Re: [PATCH fix PR71767 4/4 : testsuite] Fix testsuite fallout from section and linker sym visibility changes.

2016-11-07 Thread Mike Stump
On Nov 6, 2016, at 11:41 AM, Iain Sandoe  wrote:
> OK for trunk (after the relevant patches are applied)?
> OK for open branches (likewise)?

Ok.

>   PR target/71767
> 
>   * g++.dg/abi/key2.C: Adjust for changed Darwin sections and
>   linker-visible symbols.
>   * g++.dg/torture/darwin-cfstring-3.C: Likewise.
>   * gcc.dg/const-uniq-1.c: Likewise.
>   * gcc.dg/torture/darwin-cfstring-3.c: Likewise.
>   * gcc.target/i386/pr70799-1.c: Likewise.


Re: [PATCH fix PR71767 3/4 : Darwin sections] Fix PR71767 - adjust the sections used in response to ld64 version.

2016-11-07 Thread Mike Stump
On Nov 6, 2016, at 11:40 AM, Iain Sandoe  wrote:
> 
> OK for trunk?
> OK for open branches?

Ok.

> 2016-11-06  Iain Sandoe  
> 
>   PR target/71767
>   * config/darwin-sections.def (picbase_thunk_section): New.
>   * config/darwin.c (darwin_init_sections): Set up picbase thunk section.
>   (darwin_rodata_section, darwin_objc2_section, machopic_select_section,
>   darwin_asm_declare_constant_name, darwin_emit_weak_or_comdat,
>   darwin_function_section): Don’t use coalesced with newer linkers.
>   (darwin_override_options): Decide on usage of coalesced sections on the
>   basis of the target linker version.
>   * config/darwin.h (MIN_LD64_NO_COAL_SECTS): New.
>   * config/darwin.opt (mtarget-linker): New.
>   * config/i386/i386.c (ix86_code_end): Do not force the thunks into a
>   coalesced section, instead use a thunks section.


Re: [PATCH, Darwin] fix for PR67710 : Update 'as' specs and inputs to handle newer assembler versions.

2016-11-07 Thread Mike Stump
On Nov 6, 2016, at 12:53 PM, Iain Sandoe  wrote:
> OK for trunk?
> OK for open branches?

Ok.

> 2016-11-06  Iain Sandoe  
>   Rainer Orth  
> 
>   target/PR67710
>   * config.in: Regenerate
>   * config/darwin-driver.c (darwin_driver_init): Emit a version string
>   for the assembler.
>   * config/darwin.h (ASM_MMACOSX_VERSION_MIN_SPEC): New, new tests.
>   * config/darwin.opt (asm_macosx_version_min): New.
>   * config/i386/darwin.h: Handle ASM_MMACOSX_VERSION_MIN_SPEC.
>   * configure: Regenerate
>   * configure.ac: Check for mmacosx-version-min handling.
> 
> gcc/testsuite/
> 
> 2016-11-06  Iain Sandoe  
>   Rainer Orth  
> 
>   target/PR67710
>   *  gcc.dg/darwin-minversion-1.c: Update min version check.
>   *  gcc.dg/darwin-minversion-2.c: Likewise.
>   *  gcc.dg/darwin-minversion-3.c: Likewise.
> 
> libgcc/
> 
> 2016-11-06  Iain Sandoe  
>   Rainer Orth  
> 
>   target/PR67710
>   *  libgcc/config/t-darwin: Default builds to 10.5 codegen.



Re: [PATCH, Darwin] Fix PR57438 by avoiding empty function bodies and trailing labels.

2016-11-07 Thread Mike Stump
On Nov 6, 2016, at 12:13 PM, Iain Sandoe  wrote:
> 
> OK for trunk?
> OK for open branches?

For the darwin parts, Ok.

> 2016-11-06  Iain Sandoe  
> 
>   PR target/57438
>   * config/i386/i386.c (ix86_code_end): Note that we emitted code where
>   the function might otherwise appear empty for picbase thunks.
>   (ix86_output_function_epilogue): If we find a zero-sized function
>   assume that reaching it is UB and trap.  If we find a trailing label
>   append a nop.
>   * config/rs6000/rs6000.c (rs6000_output_function_epilogue): If we find
>   a zero-sized function assume that reaching it is UB and trap.  If we
>   find a trailing label, append a nop.
> 
> gcc/testsuite/
> 
> 2016-11-06  Iain Sandoe  
> 
>   PR target/57438
>   * gcc.dg/pr57438-1.c: New.
>   * gcc.dg/pr57438-2.c: New.


Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Joseph Myers
On Sun, 6 Nov 2016, Iain Sandoe wrote:

> This adds an option --with-ld64[=version] that allows the configurer to 

New configure options should be documented in install.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


[hsa-branch] Append UID to local variable names

2016-11-07 Thread Martin Jambor
Hi,

when looking at stuff to merge to trunk, I have found out that this
patch has slipped through the cracks.  It adds the UID to names of
private symbols so that variables with the same name but different
scope, particularly OpenMP re-mapped ones, do not clash.
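
The naming scheme can be sketched outside the compiler as follows. This is a simplified model under stated assumptions: demo_decl is a hypothetical stand-in for a declaration, and std::string replaces the obstack-allocated buffers the real function uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a declaration: its name (possibly empty),
// its unique ID, and whether it is a variable local to a function.
struct demo_decl
{
  std::string name;
  unsigned uid;
  bool is_local_var;
};

// Anonymous declarations get a name built from the UID alone; named local
// variables get the UID appended; everything else keeps its name.
static std::string demo_declaration_name (const demo_decl &d)
{
  if (d.name.empty ())
    return "__hsa_anon_" + std::to_string (d.uid);
  if (d.is_local_var)
    return d.name + "_" + std::to_string (d.uid);
  return d.name;
}

// Small self-check: two locals named "i" in different scopes no longer clash.
static void demo_check_names ()
{
  demo_decl outer = { "i", 7u, true };
  demo_decl inner = { "i", 9u, true };
  assert (demo_declaration_name (outer) != demo_declaration_name (inner));
}
```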

Committed to the hsa branch, will include it in the merge to trunk
too.

Thanks,

Martin


2016-11-07  Martin Jambor  

* hsa-gen.c (hsa_get_declaration_name): Append UID to local variable
names.
---
 gcc/hsa-gen.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index b6e8345..f138434 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -781,7 +781,8 @@ hsa_needs_cvt (BrigType16_t dtype, BrigType16_t stype)
   return false;
 }
 
-/* Return declaration name if exists.  */
+/* Return declaration name if it exists or create one from UID if it does not.
+   If DECL is a local variable, make UID part of its name.  */
 
 const char *
 hsa_get_declaration_name (tree decl)
@@ -789,7 +790,7 @@ hsa_get_declaration_name (tree decl)
   if (!DECL_NAME (decl))
 {
   char buf[64];
-  snprintf (buf, 64, "__hsa_anon_%i", DECL_UID (decl));
+  snprintf (buf, 64, "__hsa_anon_%u", DECL_UID (decl));
   size_t len = strlen (buf);
   char *copy = (char *) obstack_alloc (&hsa_obstack, len + 1);
   memcpy (copy, buf, len + 1);
@@ -808,7 +809,19 @@ hsa_get_declaration_name (tree decl)
   if (name[0] == '*')
 name++;
 
-  return name;
+  if ((TREE_CODE (decl) == VAR_DECL)
+  && decl_function_context (decl))
+{
+  size_t len = strlen (name);
+  char *buf = (char *) alloca (len + 32);
+  snprintf (buf, len + 32, "%s_%u", name, DECL_UID (decl));
+  len = strlen (buf);
+  char *copy = (char *) obstack_alloc (&hsa_obstack, len + 1);
+  memcpy (copy, buf, len + 1);
+  return copy;
+}
+  else
+return name;
 }
 
 /* Lookup or create the associated hsa_symbol structure with a given VAR_DECL
-- 
2.10.1



[hsa-branch] Remove superfluous lastprivate check

2016-11-07 Thread Martin Jambor
Hi,

this is another simple cleanup that I forgot to commit, which just
removes a lastprivate check (which hsa now can handle) at a place
where it cannot ever be anyway.

Committed to the hsa branch, will include it in the pile of OpenMP
stuff to request to merge to trunk later this week.

Thanks,

Martin


2016-11-07  Martin Jambor  

* omp-low.c (grid_target_follows_gridifiable_pattern): Do not
check for lastprivate clause on teams construct.
---
 gcc/omp-low.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index ac87a91..65b0ddc 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -17972,13 +17972,6 @@ grid_target_follows_gridifiable_pattern (gomp_target *target, grid_prop *grid)
 "clause is present\n ");
  return false;
 
-   case OMP_CLAUSE_LASTPRIVATE:
- if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, tloc,
-GRID_MISSED_MSG_PREFIX "a lastprivate "
-"clause is present\n ");
- return false;
-
case OMP_CLAUSE_THREAD_LIMIT:
  if (!integer_zerop (OMP_CLAUSE_OPERAND (clauses, 0)))
group_size = OMP_CLAUSE_OPERAND (clauses, 0);
-- 
2.10.1



Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Jeff Law

On 11/07/2016 10:48 AM, Mike Stump wrote:

On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:

This is an initial patch in a series that converts Darwin's configury to detect 
ld64 features, rather than the current process of hard-coding them on target 
system version.


So, I really do hate to ask, but does this have to be a config option?  
Normally, we'd just have configure examine things by itself.  For Canadian 
crosses, there should be enough state present to key off of directly, especially 
if they are wired up to work.

I'd rather have the thing that doesn't just work without that config flag, 
just work.  I'd like to think I can figure out how to make it just work, if 
given an idea of what doesn't actually work.

Essentially, you do the operation that doesn't work, detect it failed to work, 
then you know it didn't work.

But how is that supposed to work in a cross environment when he can't 
directly query the linker's behavior?


In an ideal world we could trivially query the linker's behavior prior 
to invocation.  But we don't have that kind of infrastructure in place.


ISTM the way to go is to have a configure test to try and DTRT 
automatically for native builds and a flag to set for crosses (or 
potentially override the configure test).
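
A minimal sketch of that policy, under stated assumptions: GCC's real logic
lives in configure.ac (autoconf shell), and the names Override, BuildKind and
linker_supports_feature below are hypothetical, introduced only to illustrate
the order of precedence.  An explicit flag wins, a native build probes the
linker, and a cross build falls back to a conservative default.

```cpp
#include <cassert>

// Hypothetical illustration only; not part of GCC's configury.
enum class BuildKind { Native, Cross };
enum class Override { None, Yes, No };  // value of the configure flag, if given

// Decide whether a linker feature is available.
//  - user_flag: explicit setting from the command line, or Override::None.
//  - probe: tries the feature with the real linker; only usable for native
//    builds, where the target linker can actually be executed.
//  - cross_default: conservative answer when the linker cannot be queried.
bool linker_supports_feature(BuildKind kind, Override user_flag,
                             bool (*probe)(), bool cross_default)
{
  if (user_flag != Override::None)   // An explicit flag overrides any test.
    return user_flag == Override::Yes;
  if (kind == BuildKind::Native)     // Native: try it and see.
    return probe();
  return cross_default;              // Cross: no way to ask the linker.
}
```

The point of the design is that the flag is only mandatory for crosses;
native builds keep working with no extra configure arguments.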



Jeff


[PATCH] A special predicate for type size equality

2016-11-07 Thread Martin Jambor
Hi,

this has been in my TODO list for at least two years, probably longer,
although I no longer remember why I added it there.  The idea is to
introduce a special wrapper around operand_equal_p for TYPE_SIZE
comparisons, which would try simple pointer equality before calling the more
complex operand_equal_p (TYPE_SIZE (t1), TYPE_SIZE (t2), 0), because
when equal, the sizes are most likely going to be the same tree anyway.

All users also check whether either TYPE_SIZE is NULL, most of them to
test for known size equality, but unfortunately there is one (ODR
warning) that tests for known inequality.  Nevertheless, the former use
case seems so much more natural that I have outlined it into the new
predicate as well.
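
The shape of the predicate can be sketched outside GCC as follows.  This is
only an illustration under stated assumptions: SizeExpr, Type and
sizes_structurally_equal are toy stand-ins for trees, TYPE_SIZE and
operand_equal_p, not the real data structures.

```cpp
#include <cassert>

// Toy stand-ins for GCC trees; hypothetical names for illustration.
struct SizeExpr { long value; };
struct Type { const SizeExpr *size; };  // size == nullptr means "unknown"

// Stands in for the comparatively expensive operand_equal_p walk.
static bool sizes_structurally_equal(const SizeExpr *a, const SizeExpr *b)
{
  return a->value == b->value;
}

// True only when both sizes are known and have the same value.
bool type_sizes_equal(const Type &t1, const Type &t2)
{
  const SizeExpr *s1 = t1.size;
  const SizeExpr *s2 = t2.size;
  if (!s1 || !s2)
    return false;                 // Unknown size: not "known equal".
  if (s1 == s2)
    return true;                  // Fast path: same object.
  return sizes_structurally_equal(s1, s2);
}
```

Note that a false result means "not known to be equal", so a caller that
needs known inequality still has to check both sizes for NULL itself, which
is why the ODR hunk keeps its TYPE_SIZE tests.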

I am no longer sure whether it is a scenario that happens so often to
justify a wrapper, but I'd like to propose it anyway, at least to remove
it from the TODO list as a not-so-good-idea-after-all :-)

Bootstrapped and tested on x86_64-linux.  Is it a good idea?  OK for
trunk?

Thanks,

Martin

2016-11-03  Martin Jambor  

* fold-const.c (type_sizes_equal_p): New function.
* fold-const.h (type_sizes_equal_p): Declare.
* ipa-devirt.c (odr_types_equivalent_p): Use it.
* ipa-polymorphic-call.c (meet_with): Likewise.
* tree-ssa-alias.c (stmt_kills_ref_p): Likewise.
---
 gcc/fold-const.c   | 19 +++
 gcc/fold-const.h   |  1 +
 gcc/ipa-devirt.c   |  2 +-
 gcc/ipa-polymorphic-call.c | 10 ++
 gcc/tree-ssa-alias.c   |  7 +--
 5 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 603aff0..ab77b8d 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -3342,6 +3342,25 @@ operand_equal_for_comparison_p (tree arg0, tree arg1, 
tree other)
 
   return 0;
 }
+
+/* Given two types, return true if both have a non-NULL TYPE_SIZE and these
+   sizes have the same value.  */
+
+bool
+type_sizes_equal_p (const_tree t1, const_tree t2)
+{
+  gcc_checking_assert (TYPE_P (t1));
+  gcc_checking_assert (TYPE_P (t2));
+  t1 = TYPE_SIZE (t1);
+  t2 = TYPE_SIZE (t2);
+
+  if (!t1 || !t2)
+return false;
+  else if (t1 == t2)
+return true;
+  else
+return operand_equal_p (t1, t2, 0);
+}
 
 /* See if ARG is an expression that is either a comparison or is performing
arithmetic on comparisons.  The comparisons must only be comparing
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index ae37142..014ca34 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -89,6 +89,7 @@ extern void fold_undefer_and_ignore_overflow_warnings (void);
 extern bool fold_deferring_overflow_warnings_p (void);
 extern void fold_overflow_warning (const char*, enum 
warn_strict_overflow_code);
 extern int operand_equal_p (const_tree, const_tree, unsigned int);
+extern bool type_sizes_equal_p (const_tree, const_tree);
 extern int multiple_of_p (tree, const_tree, const_tree);
 #define omit_one_operand(T1,T2,T3)\
omit_one_operand_loc (UNKNOWN_LOCATION, T1, T2, T3)
diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index 49e2195..d2db6f2 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -1671,7 +1671,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, bool 
*warned,
 
   /* Those are better to come last as they are utterly uninformative.  */
   if (TYPE_SIZE (t1) && TYPE_SIZE (t2)
-  && !operand_equal_p (TYPE_SIZE (t1), TYPE_SIZE (t2), 0))
+  && !type_sizes_equal_p (t1, t2))
 {
   warn_odr (t1, t2, NULL, NULL, warn, warned,
G_("a type with different size "
diff --git a/gcc/ipa-polymorphic-call.c b/gcc/ipa-polymorphic-call.c
index 8d9f22a..b66fd76 100644
--- a/gcc/ipa-polymorphic-call.c
+++ b/gcc/ipa-polymorphic-call.c
@@ -2454,10 +2454,7 @@ ipa_polymorphic_call_context::meet_with 
(ipa_polymorphic_call_context ctx,
   if (!dynamic
  && (ctx.dynamic
  || (!otr_type
- && (!TYPE_SIZE (ctx.outer_type)
- || !TYPE_SIZE (outer_type)
- || !operand_equal_p (TYPE_SIZE (ctx.outer_type),
-  TYPE_SIZE (outer_type), 0)
+ && (!type_sizes_equal_p (ctx.outer_type, outer_type)
{
  dynamic = true;
  updated = true;
@@ -2472,10 +2469,7 @@ ipa_polymorphic_call_context::meet_with 
(ipa_polymorphic_call_context ctx,
   if (!dynamic
  && (ctx.dynamic
  || (!otr_type
- && (!TYPE_SIZE (ctx.outer_type)
- || !TYPE_SIZE (outer_type)
- || !operand_equal_p (TYPE_SIZE (ctx.outer_type),
-  TYPE_SIZE (outer_type), 0)
+ && (!type_sizes_equal_p (ctx.outer_type, outer_type)
dynamic = true;
   outer_type = ctx.outer_type;
   offset = ctx.offset;
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index ebae6cf..98cd1d7 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tr

Re: [PATCH v2] aarch64: Add split-stack initial support

2016-11-07 Thread Adhemerval Zanella


On 14/10/2016 15:59, Wilco Dijkstra wrote:
> Hi,
> 

Thanks for the thoughtful review and sorry for the late response.

>> Split-stack prologue on function entry is as follows (this goes before the
>> usual function prologue):
> 
>>  mrs x9, tpidr_el0
>>  mov x10, -
> 
> As Jiong already remarked, the nop won't work. Do we know the maximum
> adjustment that the linker is allowed to make? If so, and we can limit the
> adjustment to 16MB in most cases, emitting 2 subtracts is best. Larger
> offsets need mov/movk/sub but that should be extremely rare.

There is no limit afaik in gold's split-stack allocation handling,
and I think one could be added for each backend (in the method
override required to implement it).

In fact it is not really required to tie the nop generation to the
instruction generated by 'aarch64_internal_mov_immediate'; it is
just a matter of simplifying the linker code.

And although adjustments above 16MB should be rare, the nilptr2.go test
allocates 134217824, so this test fails with such a low stack limit.  I am
not sure how typical such stack usage is in Go, but I think we should at
least support the current testcase scenario.  So for the current iteration
I kept my current approach, but I am open to suggestions.
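
To make the 16MB figure concrete, here is a small sketch of the arithmetic
behind the two-subtract scheme.  The helper name split_adjust and the struct
are hypothetical; the masks are the ones from the suggested sequence
(N & 0xfff000 and N & 0xfff), which together cover only the low 24 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: split a stack adjustment N into the two immediates
// of a "sub x10, sp, hi" / "sub x10, x10, lo" pair.
struct SplitAdjust {
  std::uint64_t hi;  // N & 0xfff000 (bits 12..23)
  std::uint64_t lo;  // N & 0xfff    (bits 0..11)
  bool fits;         // true when N < 16MB, i.e. hi + lo == N
};

SplitAdjust split_adjust(std::uint64_t n)
{
  SplitAdjust r;
  r.hi = n & 0xfff000;
  r.lo = n & 0xfff;
  r.fits = (n >> 24) == 0;  // Anything at or above 2^24 is lost by the masks.
  return r;
}
```

For the value quoted above, 134217824 is 0x8000060, so the masks yield
hi = 0 and lo = 0x60 and the adjustment does not fit; a mov/movk sequence
(or a linker limit) would be needed for that case.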


> 
>>  nop/movk
> 
>>  add x10, sp, x10
>>  ldr x9, [x9, 16]
> 
> Is there any need to detect underflow of x10 or is there a guarantee that 
> stacks are
> never allocated in the low 2GB (given the maximum adjustment is 2GB)? It's 
> safe
> to do a signed comparison.

I do not think so; at least none of the current backends that implement
split stack do so.

> 
>>  cmp x10, x9
>>  b.cs enough
> 
> Why save/restore x30 and the call x30+8 trick when we could pass the
> continuation address and use a tailcall? That also avoids emitting extra 
> unwind info.
> 
>>  stp x30, [sp, -16]
>>  bl __morestack
>>  ldp x30, [sp], 16
>>  ret
> 
> This part doesn't make any sense - both x28 and carry flag as an input, and 
> spread
> across the prolog - why???
> 
>> enough:
>>  mov x10, sp
>   [prolog]
>>  b.cs continue
>>  mov x10, x28
> continue:
>   [rest of function]
> 
> Why not do this?
> 
> function:
>   mrs x9, tpidr_el0
>   sub x10, sp, N & 0xfff000
>   sub x10, x10, N & 0xfff
>   ldr x9, [x9, 16]
>   adr x12, main_fn_entry
>   mov x11, sp   [if function has stacked arguments]
>   cmp x10, x9
>   b.ge main_fn_entry
>   b __morestack
> main_fn_entry: [x11 is argument pointer]
>   [prolog]
>   [rest of function]
> 
> In __morestack you need to save x8 as well (another argument register!) and 
> x12 (the 
> continuation address). After returning from the call x8 doesn't need to be 
> preserved.

Indeed this strategy is way better and I adjusted the code to follow it.
The only change is that I am using:

[...]
cmp x9, x10
b.lt main_fn_entr
b   __morestack.
[...]

So I can issue a 'cmp , 0' on __morestack to indicate
the function was called.

> 
> There are several issues with unwinding in __morestack. x28 is not described 
> as a callee-save
> so will be corrupted if unwinding across a __morestack call. This won't 
> unwind correctly after
> the ldp as the unwinder will use the restored frame pointer to try to restore 
> x29/x30:
> 
> + ldp x29, x30, [x28, STACKFRAME_BASE]
> + ldr x28, [x28, STACKFRAME_BASE + 80]
> +
> + .cfi_remember_state
> + .cfi_restore 30
> + .cfi_restore 29
> + .cfi_def_cfa 31, 0

Indeed, it misses the x28 save/restore. I think I have added the missing bits,
but I must confess that I am not well versed in CFI directives.  I would
appreciate it if you could help me with this new version.

> 
> This stores a random x30 value on the stack, what is the purpose of this? 
> Nothing can unwind
> to here:
> 
> + # Start using new stack
> + stp x29, x30, [x0, -16]!
> + mov sp, x0
> 
> Also we no longer need split_stack_arg_pointer_used_p () or any code that 
> uses it (functions
> that don't have any arguments passed on the stack could omit the mov x11, sp).

Right, with the new strategy you proposed of doing a branch this is indeed
no longer required.  I removed it from this new patch.

> 
> Wilco
> 
From dd2927aa5deb8d609c748014f3b566962fb852c5 Mon Sep 17 00:00:00 2001
From: Adhemerval Zanella 
Date: Wed, 4 May 2016 21:13:39 +
Subject: [PATCH 2/2] aarch64: Add split-stack initial support

This patch adds the split-stack support on aarch64 (PR #67877).  As for
other ports this patch should be used along with glibc and gold support.

The support is done similarly to other architectures: a __private_ss field is
added to the TCB in glibc, a target-specific __morestack implementation and
helper functions are added in libgcc, and compiler support is adjusted
(split-stack prologue, va_start for argument handling).  I also plan to
send the gold support to adjus

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Jack Howarth
Iain,
 It certainly looks like you dropped a file here. The proposed
ChangeLog shows...

* config.in: Likewise.

but the previously proposed hunk from...

diff --git a/gcc/config.in b/gcc/config.in
index a736de3..a7ff3ee 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1934,6 +1934,18 @@
 #endif


+/* Define to 1 if ld64 supports '-export_dynamic'. */
+#ifndef USED_FOR_TARGET
+#undef LD64_HAS_EXPORT_DYNAMIC
+#endif
+
+
+/* Define to ld64 version. */
+#ifndef USED_FOR_TARGET
+#undef LD64_VERSION
+#endif
+
+
 /* Define to the linker option to ignore unused dependencies. */
 #ifndef USED_FOR_TARGET
 #undef LD_AS_NEEDED_OPTION

from PR71767-vs-240230 has gone missing. The current patch still
produces a compiler which triggers warnings of...

warning: section "__textcoal_nt" is deprecated

during the bootstrap until that hunk of the original patch is restored.
Jack

On Sun, Nov 6, 2016 at 2:39 PM, Iain Sandoe  wrote:
> Hi Folks,
>
> This is an initial patch in a series that converts Darwin's configury to 
> detect ld64 features, rather than the current process of hard-coding them on 
> target system version.
>
> This adds an option --with-ld64[=version] that allows the configurer to 
> specify that the Darwin ld64 linker is in use.  If the version is given then 
> that will be used to determine the capabilities of the linker in native and 
> canadian crosses.  For Darwin targets this flag will default to "on", since 
> such targets require an ld64-compatible linker.
>
> If a DEFAULT_LINKER is set via --with-ld= then this will also be tested to 
> see if it is ld64.
>
> The ld64 version is determined (unless overridden by --with-ld64=version) and 
> this is exported for use in setting a default value for -mtarget-linker 
> (needed for run-time code-gen changes to section choices).
>
> In this initial patch, support for -rdynamic is converted to be detected at 
> config time, or by the ld64 version if that is explicitly given (as an 
> example of usage).
>
> OK for trunk?
> OK for open branches?
> Iain
>
> gcc/
>
> 2016-11-06  Iain Sandoe  
>
>PR target/71767
> * configure.ac (with-ld64): New arg-with.  gcc_ld64_version: New,
> new test.  gcc_cv_ld64_export_dynamic: New, New test.
> * configure: Regenerate.
> * config.in: Likewise.
> * darwin.h: Use LD64_HAS_DYNAMIC export. DEF_LD64: New, define.
> * darwin10.h(DEF_LD64): Update for this target version.
> * darwin12.h(LINK_GCC_C_SEQUENCE_SPEC): Remove rdynamic test.
> (DEF_LD64): Update for this target version.
> ---
>  gcc/config/darwin.h   | 16 ++-
>  gcc/config/darwin10.h |  5 
>  gcc/config/darwin12.h |  7 -
>  gcc/configure.ac  | 74 
> +++
>  4 files changed, 100 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
> index 045f70b..541bcb3 100644
> --- a/gcc/config/darwin.h
> +++ b/gcc/config/darwin.h
> @@ -165,6 +165,12 @@ extern GTY(()) int darwin_ms_struct;
> specifying the handling of options understood by generic Unix
> linkers, and for positional arguments like libraries.  */
>
> +#if LD64_HAS_EXPORT_DYNAMIC
> +#define DARWIN_EXPORT_DYNAMIC " %{rdynamic:-export_dynamic}"
> +#else
> +#define DARWIN_EXPORT_DYNAMIC " %{rdynamic: %nrdynamic is not supported}"
> +#endif
> +
>  #define LINK_COMMAND_SPEC_A \
> "%{!fdump=*:%{!fsyntax-only:%{!c:%{!M:%{!MM:%{!E:%{!S:\
>  %(linker)" \
> @@ -185,7 +191,9 @@ extern GTY(()) int darwin_ms_struct;
>  %{!nostdlib:%{!nodefaultlibs:\
>%{%:sanitize(address): -lasan } \
>%{%:sanitize(undefined): -lubsan } \
> -  %(link_ssp) %(link_gcc_c_sequence)\
> +  %(link_ssp) \
> +  " DARWIN_EXPORT_DYNAMIC " % +  %(link_gcc_c_sequence) \
>  }}\
>  %{!nostdlib:%{!nostartfiles:%E}} %{T*} %{F*} }}}"
>
> @@ -932,4 +940,10 @@ extern void darwin_driver_init (unsigned int *,struct 
> cl_decoded_option **);
> fall-back default.  */
>  #define DEF_MIN_OSX_VERSION "10.5"
>
> +#ifndef LD64_VERSION
> +#define LD64_VERSION "85.2"
> +#else
> +#define DEF_LD64 LD64_VERSION
> +#endif
> +
>  #endif /* CONFIG_DARWIN_H */
> diff --git a/gcc/config/darwin10.h b/gcc/config/darwin10.h
> index 5829d78..a81fbdc 100644
> --- a/gcc/config/darwin10.h
> +++ b/gcc/config/darwin10.h
> @@ -32,3 +32,8 @@ along with GCC; see the file COPYING3.  If not see
>
>  #undef DEF_MIN_OSX_VERSION
>  #define DEF_MIN_OSX_VERSION "10.6"
> +
> +#ifndef LD64_VERSION
> +#undef DEF_LD64
> +#define DEF_LD64 "97.7"
> +#endif
> diff --git a/gcc/config/darwin12.h b/gcc/config/darwin12.h
> index e366982..f88e2a4 100644
> --- a/gcc/config/darwin12.h
> +++ b/gcc/config/darwin12.h
> @@ -21,10 +21,15 @@ along with GCC; see the file COPYING3.  If not see
>  #undef  LINK_GCC_C_SEQUENCE_SPEC
>  #define LINK_GCC_C_SEQUENCE_SPEC \
>  "%:version-compare(>= 10.6 mmacosx-version-min= -no_compact_unwind) \
> -   %{rdynamic:-export_dynamic} %{!stati

[PATCH, rs6000] Modify include paths in config.gcc for Advance Toolchain builds

2016-11-07 Thread Peter Bergner
Gabriel and I have been tracking down an include path issue for GCC 6
Advance Toolchain builds (i.e., --with-advance-toolchain=...).  The solution
that fixes the problem for us is to configure with --with-local-prefix=...
and remove the following hunk from config.gcc.  Gabriel has confirmed
this fixes his AT builds (native and cross) and I've verified that this
patch bootstraps with no regressions.

Is this ok for trunk and the GCC 6 branch?

Peter

* config.gcc (powerpc*-*-*, rs6000*-*-*): Remove setting of
INCLUDE_EXTRA_SPEC for Advance Toolchain builds.

Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 241917)
+++ gcc/config.gcc  (working copy)
@@ -4137,16 +4137,6 @@ case "${target}" in
(at="/opt/$with_advance_toolchain"
 echo "/* Use Advance Toolchain $at */"
 echo
-echo "#ifndef USE_AT_INCLUDE_FILES"
-echo "#define USE_AT_INCLUDE_FILES 1"
-echo "#endif"
-echo
-echo "#if USE_AT_INCLUDE_FILES"
-echo "#undef  INCLUDE_EXTRA_SPEC"
-echo "#define INCLUDE_EXTRA_SPEC" \
- "\"-isystem $at/include\""
-echo "#endif"
-echo
 echo "#undef  LINK_OS_EXTRA_SPEC32"
 echo "#define LINK_OS_EXTRA_SPEC32" \
  "\"%(link_os_new_dtags)" \



C++ PATCH to announce template instantiations if not -quiet

2016-11-07 Thread Jason Merrill
It occurred to me that a simple trace of template instantiations would
fit simply into the stream of function declarations that
announce_function prints when -quiet is not specified to the compiler.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit ae7b4a929fbd05de433451a1d92794d962366646
Author: Jason Merrill 
Date:   Fri Nov 4 09:22:32 2016 -0400

Add template instantiations to the announce_function stream.

* pt.c (push_tinst_level_loc): Add template instantiations to the
announce_function stream.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c8d4a06..f910d40 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -9170,6 +9170,13 @@ push_tinst_level_loc (tree d, location_t loc)
   if (limit_bad_template_recursion (d))
 return false;
 
+  /* When not -quiet, dump template instantiations other than functions, since
+ announce_function will take care of those.  */
+  if (!quiet_flag
+  && TREE_CODE (d) != TREE_LIST
+  && TREE_CODE (d) != FUNCTION_DECL)
+fprintf (stderr, " %s", decl_as_string (d, TFF_DECL_SPECIFIERS));
+
   new_level = ggc_alloc ();
   new_level->decl = d;
   new_level->locus = loc;


[PATCH] print_rtx: implement support for reuse IDs (v2)

2016-11-07 Thread David Malcolm
On Tue, 2016-10-25 at 14:47 +0200, Bernd Schmidt wrote:
> On 10/21/2016 10:27 PM, David Malcolm wrote:
> > Thanks.  I attempted to use those fields of recog_data, but it doesn't
> > seem to be exactly what's needed here.
> 
> Yeah, I may have been confused. I'm not sure that just looking at
> SCRATCHes is the right thing either, but I think you're on the right
> track, and we can use something like your patch for now and extend it
> later if necessary.
> 
> > + public:
> > +  rtx_reuse_manager ();
> > +  ~rtx_reuse_manager ();
> > +  static rtx_reuse_manager *get () { return singleton; }
> 
> OTOH, this setup looks a bit odd to me. Are you trying to avoid
> converting the print_rtx stuff to its own class, or avoid passing the
> reuse manager as an argument to a lot of functions?
>
> Some of this setup might not even be necessary. We have a "used" flag on
> rtx objects which is used to unshare RTL, and I think could also be used
> for a similar purpose when dumping. So, before printing, call
> reset_insn_used_flags on everything, then have another pass to set bits
> on everything that could conceivably be shared, and when you find
> something that already has the bit set, enter it into a table. Finally,
> print everything out, using the table. I think this would be somewhat
> simpler than adding another header file and class definition.

Now that we have a class rtx_writer, it's much clearer to drop the
singleton.

In this version I've eliminated the rtx_reuse_manager singleton,
instead allowing callers to pass a rtx_reuse_manager * to
rtx_writer's ctor.  This can be NULL, allowing most dumps to opt
out of the reuse-tracking, minimizing the risk of changing an
existing testcase; only print_rtl_function makes use of it (and
the selftests).

I eliminated print-rtl-reuse.h, moving class rtx_reuse_manager into
print-rtl.h and print-rtl.c

I kept the class rtx_reuse_manager, as it seems appropriate to
put responsibility for this aspect of dumping into its own class.
I attempted to move it into rtx_writer itself, but doing so made
the code less clear.
 
> > +void
> > +rtx_reuse_manager::preprocess (const_rtx x)
> > +{
> > +  subrtx_iterator::array_type array;
> > +  FOR_EACH_SUBRTX (iter, array, x, NONCONST)
> > +if (uses_rtx_reuse_p (*iter))
> > +  {
> > +   if (int *count = m_rtx_occurrence_count.get (*iter))
> > + {
> > +   if (*count == 1)
> > + {
> > +   m_rtx_reuse_ids.put (*iter, m_next_id++);
> > + }
> > +   (*count)++;
> > + }
> > +   else
> > + m_rtx_occurrence_count.put (*iter, 1);
> > +  }
> 
> Formatting rules suggest no braces around single statements, I think a
> more readable version of this would be:
> 
>    if (uses_rtx_reuse_p (*iter))
>      {
>        int *count = m_rtx_occurrence_count.get (*iter);
>        if (count)
>          {
>            if ((*count)++ == 1)
>              m_rtx_reuse_ids.put (*iter, m_next_id++);
>          }
>        else
>          m_rtx_occurrence_count.put (*iter, 1);
>      }
> 
> 
> Bernd

Fixed in the way you noted.
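
For readers following along without the full patch, the counting scheme
discussed above can be sketched in standalone form.  This is an illustration
under stated assumptions, not the patch's code: Node and ReuseTracker are
hypothetical stand-ins for rtx and rtx_reuse_manager, std::unordered_map
replaces GCC's hash_map, and starting IDs at 0 is an arbitrary choice here.

```cpp
#include <cassert>
#include <unordered_map>

// Toy stand-in for an rtx; hypothetical, for illustration only.
struct Node { int payload; };

class ReuseTracker {
public:
  // Record one occurrence of n; hand out an ID on the second sighting,
  // so that only nodes seen more than once ever get a reuse ID.
  void note(const Node *n)
  {
    int &count = m_count[n];   // value-initialised to 0 on first use
    if (count++ == 1)
      m_ids[n] = m_next_id++;
  }

  // Return the reuse ID of n, or -1 if n was seen at most once.
  int reuse_id(const Node *n) const
  {
    auto it = m_ids.find(n);
    return it == m_ids.end() ? -1 : it->second;
  }

private:
  std::unordered_map<const Node *, int> m_count;
  std::unordered_map<const Node *, int> m_ids;
  int m_next_id = 0;
};
```

A printer built on this would run the counting pass over everything first,
then emit the ID as a prefix the first time a shared node is printed and a
back-reference on later sightings.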

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
* config/i386/i386.c: Include print-rtl.h.
(selftest::ix86_test_dumping_memory_blockage): New function.
(selftest::ix86_run_selftests): Call it.
* print-rtl-function.c (print_rtx_function): Create an
rtx_reuse_manager and use it.
* print-rtl.c: Include "rtl-iter.h".
(rtx_writer::rtx_writer): Add reuse_manager param.
(rtx_reuse_manager::rtx_reuse_manager): New ctor.
(uses_rtx_reuse_p): New function.
(rtx_reuse_manager::preprocess): New function.
(rtx_reuse_manager::has_reuse_id): New function.
(rtx_reuse_manager::seen_def_p): New function.
(rtx_reuse_manager::set_seen_def): New function.
(rtx_writer::print_rtx): If "in_rtx" has a reuse ID, print it as a
prefix the first time in_rtx is seen, and print reuse_rtx
subsequently.
(print_inline_rtx): Supply NULL for new reuse_manager param.
(debug_rtx): Likewise.
(print_rtl): Likewise.
(print_rtl_single): Likewise.
(rtx_writer::print_rtl_single_with_indent): Likewise.
* print-rtl.h: Include bitmap.h when building for host.
(rtx_writer::rtx_writer): Add reuse_manager param.
(rtx_writer::m_rtx_reuse_manager): New field.
(class rtx_reuse_manager): New class.
* rtl-tests.c (selftest::assert_rtl_dump_eq): Add reuse_manager
param and use it when constructing rtx_writer.
(selftest::test_dumping_rtx_reuse): New function.
(selftest::rtl_tests_c_tests): Call it.
* selftest-rtl.h (class rtx_reuse_manager): New forward decl.
(selftest::assert_rtl_dump_eq): Add reuse_manager param.
(ASSERT_RTL_DUMP_EQ): Supply NULL for reuse_manager param.
(ASSERT_RTL_DUMP_EQ_WITH_REUSE): New macro.
---
 gcc/config/i386/i
