date:20170424

Re: [PATCH] Remove dead code from c_common_get_alias_set

2017-04-24 Thread Richard Biener

On Fri, 21 Apr 2017, Bernd Edlinger wrote:

> Hi!
> 
> 
> This removes some dead and unreachable code in c_common_get_alias_set:
> Because cc1 was recently changed to be only called with one file at a
> time, the code after "if (num_in_fnames == 1) return -1;" is no longer
> reachable, and can thus be removed.

While I think you are correct it looks like c_common_parse_file still
happily parses multiple infiles.  That is, only for
flag_preprocess_only we have a

  if (num_in_fnames > 1)
    error ("too many filenames given.  Type %s --help for usage",
           progname);

and:

gcc> ./cc1 -quiet t.c t2.c
t2.c:5:6: error: conflicting types for ‘bar’
 void bar () { struct X x; *(volatile char *)x.buf = 1; }
  ^~~
t.c:8:1: note: previous definition of ‘bar’ was here
 bar (int x)
 ^~~

which means it actually still "works" to combine two source files
(yes, the driver no longer seems to have the ability to pass down
multiple inputs to cc1).

Thus, can you first remove that "feature"?

Thanks,
Richard.

> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: X /[ex] 4 < Y /[ex] 4

2017-04-24 Thread Marc Glisse


On Mon, 24 Apr 2017, Jakub Jelinek wrote:


+/* X / 4 < Y / 4 iif X < Y when the division is known to be exact.  */


s/iif/iff/ ?


Indeed, thanks, I've fixed it locally.

--
Marc Glisse
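
The equivalence in the comment can be checked by brute force on a small
range.  This is only a sketch: plain C ints that are multiples of 4 stand in
for the operands of an exact division, the helper names are hypothetical,
and LIM is assumed to be a multiple of 4.

```c
#include <stdbool.h>

/* Returns true if, for all multiples of 4 in [-LIM, LIM], comparing the
   quotients gives the same answer as comparing the operands.  LIM is
   assumed to be a multiple of 4 so that every division is exact.  */
bool
exact_div_preserves_less (int lim)
{
  for (int x = -lim; x <= lim; x += 4)
    for (int y = -lim; y <= lim; y += 4)
      if ((x / 4 < y / 4) != (x < y))
        return false;
  return true;
}

/* Without exactness the equivalence fails: 4 < 5, but 4 / 4 == 5 / 4.  */
bool
inexact_counterexample (void)
{
  int x = 4, y = 5;
  return (x < y) && !(x / 4 < y / 4);
}
```

The second helper shows why the "known to be exact" condition matters:
truncating division can map two different operands to the same quotient.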

Re: PR79697: Delete calls to strdup, strndup, realloc if there is no lhs

2017-04-24 Thread Prathamesh Kulkarni

On 27 February 2017 at 00:10, Prathamesh Kulkarni
 wrote:
> On 26 February 2017 at 05:16, Martin Sebor  wrote:
>> On 02/25/2017 11:54 AM, Prathamesh Kulkarni wrote:
>>>
>>> On 25 February 2017 at 14:43, Marc Glisse  wrote:

>>>> On Sat, 25 Feb 2017, Prathamesh Kulkarni wrote:
>>>>
>>>>> Hi,
>>>>> The attached patch deletes calls to strdup, strndup if its
>>>>> return value is unused,
>>>>> and same for realloc if the first arg is NULL.
>>>>> Bootstrap+tested on x86_64-unknown-linux-gnu.
>>>>> OK for GCC 8 ?
>>>>
>>>>
>>>>
>>>> Instead of specializing realloc(0,*) wherever we can perform the same
>>>> optimization as with malloc, wouldn't it be better to optimize:
>>>> realloc(0,n) -> malloc(n)
>>>> and let the malloc optimizations happen?
>>>
>>> Thanks for the suggestions. In the attached patch, realloc (0, n) is
>>> folded to malloc (n).
>>> Bootstrap+test in progress on x86_64-unknown-linux-gnu.
>>> Does the patch look OK ?
>>
>>
>> Although it's not part of the bug report, I wonder if a complete
>> patch should also extend the malloc/free DCE to str{,n}dup/free
>> calls and eliminate pairs like these:
>>
>>   void f (const char *s)
>>   {
>>     char *p = strdup (s);
>>     free (p);
>>   }
>>
>> (That seems to be just a matter of adding a couple of conditionals
>> for BUILT_IN_STR{,N}DUP in propagate_necessity).
> Hi Martin,
> Thanks for the suggestions, I have updated the patch accordingly.
> Does it look OK ?
Hi,
The attached patch for PR79697 passes bootstrap+test on x86_64 and
cross-tested on arm*-*-* and aarch64*-*-*.
As per the previous suggestions, the patch folds realloc (0, n) -> malloc (n).
Is it OK for trunk ?

Thanks,
Prathamesh
>>
>> Martin
>>
>> PS Another optimization, though one that's most likely outside
>> the scope of this patch, is to eliminate all of the following:
>>
>>   void f (const char *s, char *s2)
>>   {
>> char *p = strdup (s);
>> strcpy (p, s2);
>> free (p);
>>   }
>>
>> as is done in:
>>
>>   void f (unsigned n, const char *s)
>>   {
>> char *p = malloc (n);
>> memcpy (p, s, n);
>> free (p);
>>   }
>>
> Hmm, dse1 detects that the store to p in memcpy() is dead and deletes the 
> call.
> cddce1 then removes calls to  malloc and free thus making the function empty.
> It doesn't work if memcpy is replaced by strcpy:
>
> void f (unsigned n, char *s2)
> {
>   char *p = __builtin_malloc (n);
>   __builtin_strcpy (p, s2);
>   __builtin_free (p);
> }
>
> I suppose strcpy should be special-cased too in dse similar to memcpy ?
>
> Thanks,
> Prathamesh
2017-04-24  Prathamesh Kulkarni  

PR tree-optimization/79697
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Check if callee
is BUILT_IN_STRDUP, BUILT_IN_STRNDUP, BUILT_IN_REALLOC.
(propagate_necessity): Check if def_callee is BUILT_IN_STRDUP or
BUILT_IN_STRNDUP.
* gimple-fold.c (gimple_fold_builtin_realloc): New function.
(gimple_fold_builtin): Call gimple_fold_builtin_realloc.

testsuite/
* gcc.dg/tree-ssa/pr79697.c: New test.
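
The realloc fold relies on the C library guarantee that realloc with a null
first argument behaves like malloc for the requested size.  A minimal
source-level sketch of that equivalence follows; the helper names are
hypothetical and not part of the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Allocates N bytes through realloc with a null first argument; per the C
   standard this behaves like malloc (N).  */
void *
alloc_via_realloc (size_t n)
{
  return realloc (NULL, n);
}

/* Returns 1 if a buffer obtained this way can be written, read and freed
   like one obtained from malloc, and 0 if the allocation failed.  */
int
roundtrip_ok (void)
{
  char *p = alloc_via_realloc (4);
  if (!p)
    return 0;
  memcpy (p, "abc", 4);
  int ok = strcmp (p, "abc") == 0;
  free (p);
  return ok;
}
```

Once the call is rewritten to malloc, the existing malloc/free handling in
DCE can treat it like any other unused allocation.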

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index a75dd91..e6eceea 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -3251,6 +3251,28 @@ gimple_fold_builtin_acc_on_device (gimple_stmt_iterator *gsi, tree arg0)
   return true;
 }
 
+/* Fold realloc (0, n) -> malloc (n).  */
+
+static bool
+gimple_fold_builtin_realloc (gimple_stmt_iterator *gsi)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  tree arg = gimple_call_arg (stmt, 0);
+  tree size = gimple_call_arg (stmt, 1);
+
+  if (operand_equal_p (arg, null_pointer_node, 0))
+{
+  tree fn_malloc = builtin_decl_implicit (BUILT_IN_MALLOC);
+  if (fn_malloc)
+   {
+ gcall *repl = gimple_build_call (fn_malloc, 1, size);
+ replace_call_with_call_and_fold (gsi, repl);
+ return true;
+   }
+}
+  return false;
+}
+
 /* Fold the non-target builtin at *GSI and return whether any simplification
was made.  */
 
@@ -3409,6 +3431,9 @@ gimple_fold_builtin (gimple_stmt_iterator *gsi)
 case BUILT_IN_ACC_ON_DEVICE:
   return gimple_fold_builtin_acc_on_device (gsi,
gimple_call_arg (stmt, 0));
+case BUILT_IN_REALLOC:
+  return gimple_fold_builtin_realloc (gsi);
+
 default:;
 }
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr79697.c b/gcc/testsuite/gcc.dg/tree-ssa/pr79697.c
new file mode 100644
index 000..d4f6473
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr79697.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-gimple -fdump-tree-cddce-details -fdump-tree-optimized" } */
+
+void f(void)
+{
+  __builtin_strdup ("abc");
+}
+
+void g(void)
+{
+  __builtin_strndup ("abc", 3);
+}
+
+void h(void)
+{
+  __builtin_realloc (0, 10);
+}
+
+void k(void)
+{
+  char *p = __builtin_strdup ("abc");
+  __builtin_free (p);
+
+  char *q = __builtin_strndup ("abc", 3);
+  __builtin_free (q);
+}
+
+/* { dg-final { scan-tree-dump "D

Re: [PATCH] squash spurious warnings in dominance.c

2017-04-24 Thread Richard Biener

On Sat, Apr 22, 2017 at 2:51 AM, Martin Sebor  wrote:
> Bug 80486 - spurious -Walloc-size-larger-than and
> -Wstringop-overflow in dominance.c during profiledbootstrap
> points out a number of warnings that show up in dominance.c
> during a profiledbootstrap.  I'm pretty sure the warnings
> are due to the size check the C++ new expression introduces
> to avoid unsigned overflow before calling operator new, and
> by some optimization like jump threading introducing a branch
> with the call to the allocation function and memset with
> the excessive constant size.
>
> Two ways to avoid it come to mind: 1) use the libiberty
> XCNEWVEC and XNEWVEC macros instead of C++ new expressions,
> and 2) constraining the size variable to a valid range.
>
> Either of these approaches should result in better code than
> the new expression because they both eliminate the test for
> the overflow.  Attached is a patch that implements (1). I
> chose it mainly because it seems in line with GCC's memory
> management policy and with avoiding exceptions.
>
> An alternate patch should be straightforward.  Either add
> an assert like the one below or change the type of
> m_n_basic_blocks from size_t to unsigned.  This approach,
> though less intrusive, will likely bring the warning back
> in ILP32 builds; I'm not sure if it matters.

Please change m_n_basic_blocks (and local copies) from size_t
to unsigned int.  This is an odd inconsistency that's worth fixing
in any case.

Richard.

> Martin
>
> diff --git a/gcc/dominance.c b/gcc/dominance.c
> index c76e62e..ebb0a8f 100644
> --- a/gcc/dominance.c
> +++ b/gcc/dominance.c
> @@ -161,6 +161,9 @@ void
>  dom_info::dom_init (void)
>  {
>size_t num = m_n_basic_blocks;
> +
> +  gcc_assert (num < SIZE_MAX / sizeof (basic_block) / 2);
> +
>m_dfs_parent = new_zero_array  (num);
>m_dom = new_zero_array  (num);
>
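
The size check under discussion can be modelled in plain C.  This is a
sketch under stated assumptions: the helper names are hypothetical, and the
overflow test only approximates what a C++ new[] expression has to guard
against.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if COUNT * ELT_SIZE would overflow size_t, which is the
   condition an array allocation has to guard against.  */
int
alloc_size_overflows (size_t count, size_t elt_size)
{
  return elt_size != 0 && count > SIZE_MAX / elt_size;
}

/* Returns 1 if no count representable in unsigned int can make the
   product overflow for elements of ELT_SIZE bytes.  Checking UINT_MAX is
   enough because the overflow condition is monotonic in COUNT.  */
int
unsigned_count_is_safe (size_t elt_size)
{
  return !alloc_size_overflows ((size_t) UINT_MAX, elt_size);
}
```

On an LP64 host, unsigned_count_is_safe (8) holds, so the check can be
proven dead once the count is an unsigned int.  Where size_t and unsigned
int have the same width it does not hold, which matches the ILP32 concern
raised above.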

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Allan Sandfeld Jensen

On Saturday 22 April 2017, Allan Sandfeld Jensen wrote:
> Replaces definitions of immediate logical shift intrinsics with GCC
> extension syntax. Tests are added to ensure the intrinsics still produce
> the right instructions and that a few basic optimizations now work.
> 
> Compared to the earlier version of the patch, all potentially undefined
> shifts are now avoided, which also means no variable shifts or arithmetic
> right shifts.

Fixed 2 errors in the tests.
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b58f5050db0..b9406550fc5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2017-04-22  Allan Sandfeld Jensen  
+
+	* config/i386/emmintrin.h (_mm_slli_*, _mm_srli_*):
+	Use vector intrinstics instead of builtins.
+	* config/i386/avx2intrin.h (_mm256_slli_*, _mm256_srli_*):
+	Use vector intrinstics instead of builtins.
+
 2017-04-21  Uros Bizjak  
 
 	* config/i386/i386.md (*extzvqi_mem_rex64): Move above *extzv.
diff --git a/gcc/config/i386/avx2intrin.h b/gcc/config/i386/avx2intrin.h
index 82f170a3d61..acb49734131 100644
--- a/gcc/config/i386/avx2intrin.h
+++ b/gcc/config/i386/avx2intrin.h
@@ -667,7 +667,7 @@ extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_slli_epi16 (__m256i __A, int __B)
 {
-  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
+  return ((__B & 0xff) < 16) ? (__m256i)((__v16hi)__A << (__B & 0xff)) : _mm256_setzero_si256();
 }
 
 extern __inline __m256i
@@ -681,7 +681,7 @@ extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_slli_epi32 (__m256i __A, int __B)
 {
-  return (__m256i)__builtin_ia32_pslldi256 ((__v8si)__A, __B);
+  return ((__B & 0xff) < 32) ? (__m256i)((__v8si)__A << (__B & 0xff)) : _mm256_setzero_si256();
 }
 
 extern __inline __m256i
@@ -695,7 +695,7 @@ extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_slli_epi64 (__m256i __A, int __B)
 {
-  return (__m256i)__builtin_ia32_psllqi256 ((__v4di)__A, __B);
+  return ((__B & 0xff) < 64) ? (__m256i)((__v4di)__A << (__B & 0xff)) : _mm256_setzero_si256();
 }
 
 extern __inline __m256i
@@ -758,7 +758,7 @@ extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_srli_epi16 (__m256i __A, int __B)
 {
-  return (__m256i)__builtin_ia32_psrlwi256 ((__v16hi)__A, __B);
+  return ((__B & 0xff) < 16) ? (__m256i) ((__v16hu)__A >> (__B & 0xff)) : _mm256_setzero_si256();
 }
 
 extern __inline __m256i
@@ -772,7 +772,7 @@ extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_srli_epi32 (__m256i __A, int __B)
 {
-  return (__m256i)__builtin_ia32_psrldi256 ((__v8si)__A, __B);
+  return ((__B & 0xff) < 32) ? (__m256i) ((__v8su)__A >> (__B & 0xff)) : _mm256_setzero_si256();
 }
 
 extern __inline __m256i
@@ -786,7 +786,7 @@ extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_srli_epi64 (__m256i __A, int __B)
 {
-  return (__m256i)__builtin_ia32_psrlqi256 ((__v4di)__A, __B);
+  return ((__B & 0xff) < 64) ? (__m256i) ((__v4du)__A >> (__B & 0xff)) : _mm256_setzero_si256();
 }
 
 extern __inline __m256i
diff --git a/gcc/config/i386/emmintrin.h b/gcc/config/i386/emmintrin.h
index 828f4a07a9b..5c048d9fd0d 100644
--- a/gcc/config/i386/emmintrin.h
+++ b/gcc/config/i386/emmintrin.h
@@ -1140,19 +1140,19 @@ _mm_mul_epu32 (__m128i __A, __m128i __B)
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_slli_epi16 (__m128i __A, int __B)
 {
-  return (__m128i)__builtin_ia32_psllwi128 ((__v8hi)__A, __B);
+  return ((__B & 0xff) < 16) ? (__m128i)((__v8hi)__A << (__B & 0xff)) : _mm_setzero_si128();
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_slli_epi32 (__m128i __A, int __B)
 {
-  return (__m128i)__builtin_ia32_pslldi128 ((__v4si)__A, __B);
+  return ((__B & 0xff) < 32) ? (__m128i)((__v4si)__A << (__B & 0xff)) : _mm_setzero_si128();
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_slli_epi64 (__m128i __A, int __B)
 {
-  return (__m128i)__builtin_ia32_psllqi128 ((__v2di)__A, __B);
+  return ((__B & 0xff) < 64) ? (__m128i)((__v2di)__A << (__B & 0xff)) : _mm_setzero_si128();
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -1205,19 +1205,19 @@ _mm_slli_si128 (__m128i __A, const int __N)
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_srli_epi16 (__m128i __A, int __B)
 {
-  return (__m128i)__builtin_ia32_psrlwi128 ((__v8hi)__A, __B);
+  return ((__B & 0xff) < 16) ? (__m128i)((__v8hu)__A >> (__B & 0xff)) : _mm_setzero_si128();
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_srli_epi32 (__m128i __A, int __B)
 {
-  return (__m128i)__builtin_ia32_psrldi128

Re: Let tree_single_nonzero_warnv_p use VRP

2017-04-24 Thread Richard Biener

On Sun, Apr 23, 2017 at 11:38 PM, Marc Glisse  wrote:
> Hello,
>
> this patch teaches tree_expr_nonzero_warnv_p to handle SSA_NAME using
> range information and known (non-)zero bits, by delegating to
> expr_not_equal_to which already knows how to handle all that.
>
> This makes one strict overflow warning disappear. It isn't particularly
> surprising, since the new code makes tree_expr_nonzero_warnv_p return true
> without warning (we do not remember if the range information was obtained
> using strict overflow). In my opinion, improving code generation is more
> important than this specific warning.
>
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

Hmm, I think you need to guard this with an INTEGRAL_TYPE_P check
given the comment on tree_single_nonzero_warnv_p also talks about
FP.

OK with that change.

Richard.

> 2017-04-24  Marc Glisse  
>
> gcc/
> * fold-const.c (tree_single_nonzero_warnv_p): Handle SSA_NAME.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/cmpmul-1.c: New file.
> * gcc.dg/Wstrict-overflow-18.c: Xfail.
>
> --
> Marc Glisse

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Jakub Jelinek

On Mon, Apr 24, 2017 at 09:33:09AM +0200, Allan Sandfeld Jensen wrote:
> --- a/gcc/config/i386/avx2intrin.h
> +++ b/gcc/config/i386/avx2intrin.h
> @@ -667,7 +667,7 @@ extern __inline __m256i
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm256_slli_epi16 (__m256i __A, int __B)
>  {
> -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> +  return ((__B & 0xff) < 16) ? (__m256i)((__v16hi)__A << (__B & 0xff)) : 
> _mm256_setzero_si256();
>  }

What is the advantage of doing that when you replace one operation with
several (&, <, ?:, <<)?
I'd say instead we should fold the builtins if in the gimple fold target
hook we see the shift count constant and can decide based on that.
Or we could use __builtin_constant_p (__B) to decide whether to use
the generic vector shifts or builtin, but that means larger IL.

Jakub
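
The __builtin_constant_p alternative could look roughly like the following.
This is only a sketch, not the proposed implementation: it uses the GCC
generic vector extension on a 128-bit type so that it builds on any target,
and shl16_any_count is a hypothetical stand-in for the real target builtin.

```c
/* Eight 16-bit lanes, using the GCC generic vector extension.  */
typedef unsigned short v8hu __attribute__ ((vector_size (16)));

/* Hypothetical stand-in for the target builtin: accepts any count and
   yields zero for (masked) counts of 16 or more.  */
static inline v8hu
shl16_any_count (v8hu a, int count)
{
  unsigned c = count & 0xff;
  v8hu r = { 0 };
  for (int i = 0; i < 8; i++)
    r[i] = c < 16 ? (unsigned short) (a[i] << c) : 0;
  return r;
}

/* Uses the generic vector shift only when the count is a compile-time
   constant that is known to be in range; otherwise defers to the
   builtin stand-in.  Both paths return the same value.  */
static inline v8hu
shl16 (v8hu a, int count)
{
  if (__builtin_constant_p (count) && (count & 0xff) < 16)
    return a << (count & 0xff);
  return shl16_any_count (a, count);
}
```

With a constant count the optimizers see an ordinary vector shift; with a
variable count no compare-and-branch is added around the builtin, at the
cost of the larger IL mentioned above.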

Re: X /[ex] 4 < Y /[ex] 4

2017-04-24 Thread Richard Biener

On Mon, Apr 24, 2017 at 9:04 AM, Marc Glisse  wrote:
> On Mon, 24 Apr 2017, Jakub Jelinek wrote:
>
>>> +/* X / 4 < Y / 4 iif X < Y when the division is known to be exact.  */
>>
>>
>> s/iif/iff/ ?
>
>
> Indeed, thanks, I've fixed it locally.

Ok.

As for strict-overflow warnings, I'd like to kill -f[no-]strict-overflow by
aliasing it to -f[no-]wrapv and thus removing the !TYPE_OVERFLOW_WRAPS
&& !TYPE_OVERFLOW_UNDEFINED && !TYPE_OVERFLOW_TRAPS case.

This means effectively killing -Wstrict-overflow which is IMHO reasonable
as it's quite broken plus we now have ubsan which does a better job.

Thanks,
Richard.

> --
> Marc Glisse
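
For reference, the difference between wrapping and undefined signed overflow
can be shown without invoking undefined behavior.  This is a sketch: the
helper names are hypothetical, and the conversion back to int relies on
GCC's documented modulo behavior for out-of-range values.

```c
#include <limits.h>

/* Adds with two's-complement wraparound, the semantics -fwrapv gives to
   signed addition.  The sum is computed in unsigned arithmetic, which is
   always well defined.  */
int
wrapping_add (int a, int b)
{
  return (int) ((unsigned int) a + (unsigned int) b);
}

/* Evaluates "x + 1 > x" under wrapping semantics.  With undefined
   overflow a compiler may assume this is always true; with wrapping it
   is false for INT_MAX.  */
int
incremented_is_greater (int x)
{
  return wrapping_add (x, 1) > x;
}
```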

[PATCH] Fix PR79725

2017-04-24 Thread Richard Biener


I have tested the following patch to remove (where easy) dead code
in the way of sinking.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2017-04-24  Richard Biener  

PR tree-optimization/79725
* tree-ssa-sink.c (statement_sink_location): Return whether
failure reason was zero uses.  Move that check later.
(sink_code_in_bb): Deal with zero uses by removing the stmt
if possible.

* gcc.dg/tree-ssa/ssa-sink-15.c: New testcase.

Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-15.c
===
*** gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-15.c (nonexistent)
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-15.c (working copy)
***
*** 0 
--- 1,14 
+ /* PR79725 */
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -fdump-tree-optimized" } */
+ 
+ _Complex double f(_Complex double x[])
+ {
+   _Complex float p = 1.0;
+   for (int i = 0; i < 100; i++)
+ p = x[i];
+   return p;
+ }
+ 
+ /* Verify we end up with a single BB and no loop.  */
+ /* { dg-final { scan-tree-dump-times "goto" 0 "optimized" } } */
Index: gcc/tree-ssa-sink.c
===
--- gcc/tree-ssa-sink.c (revision 247060)
+++ gcc/tree-ssa-sink.c (working copy)
@@ -244,7 +244,7 @@ select_best_block (basic_block early_bb,
 
 static bool
 statement_sink_location (gimple *stmt, basic_block frombb,
-gimple_stmt_iterator *togsi)
+gimple_stmt_iterator *togsi, bool *zero_uses_p)
 {
   gimple *use;
   use_operand_p one_use = NULL_USE_OPERAND_P;
@@ -254,6 +254,8 @@ statement_sink_location (gimple *stmt, b
   ssa_op_iter iter;
   imm_use_iterator imm_iter;
 
+  *zero_uses_p = false;
+
   /* We only can sink assignments.  */
   if (!is_gimple_assign (stmt))
 return false;
@@ -263,10 +265,6 @@ statement_sink_location (gimple *stmt, b
   if (def_p == NULL_DEF_OPERAND_P)
 return false;
 
-  /* Return if there are no immediate uses of this stmt.  */
-  if (has_zero_uses (DEF_FROM_PTR (def_p)))
-return false;
-
   /* There are a few classes of things we can't or don't move, some because we
  don't have code to handle it, some because it's not profitable and some
  because it's not legal.
@@ -292,11 +290,17 @@ statement_sink_location (gimple *stmt, b
   */
   if (stmt_ends_bb_p (stmt)
   || gimple_has_side_effects (stmt)
-  || gimple_has_volatile_ops (stmt)
   || (cfun->has_local_explicit_reg_vars
  && TYPE_MODE (TREE_TYPE (gimple_assign_lhs (stmt))) == BLKmode))
 return false;
 
+  /* Return if there are no immediate uses of this stmt.  */
+  if (has_zero_uses (DEF_FROM_PTR (def_p)))
+{
+  *zero_uses_p = true;
+  return false;
+}
+
   if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (DEF_FROM_PTR (def_p)))
 return false;
 
@@ -483,12 +487,23 @@ sink_code_in_bb (basic_block bb)
 {
   gimple *stmt = gsi_stmt (gsi);
   gimple_stmt_iterator togsi;
+  bool zero_uses_p;
 
-  if (!statement_sink_location (stmt, bb, &togsi))
+  if (!statement_sink_location (stmt, bb, &togsi, &zero_uses_p))
{
+ gimple_stmt_iterator saved = gsi;
  if (!gsi_end_p (gsi))
gsi_prev (&gsi);
- last = false;
+ /* If we face a dead stmt remove it as it possibly blocks
+sinking of uses.  */
+ if (zero_uses_p
+ && ! gimple_vdef (stmt))
+   {
+ gsi_remove (&saved, true);
+ release_defs (stmt);
+   }
+ else
+   last = false;
  continue;
}
   if (dump_file)

Re: Let tree_single_nonzero_warnv_p use VRP

2017-04-24 Thread Jakub Jelinek

On Mon, Apr 24, 2017 at 09:41:01AM +0200, Richard Biener wrote:
> On Sun, Apr 23, 2017 at 11:38 PM, Marc Glisse  wrote:
> > Hello,
> >
> > this patch teaches tree_expr_nonzero_warnv_p to handle SSA_NAME using
> > range information and known (non-)zero bits, by delegating to
> > expr_not_equal_to which already knows how to handle all that.
> >
> > This makes one strict overflow warning disappear. It isn't particularly
> > surprising, since the new code makes tree_expr_nonzero_warnv_p return true
> > without warning (we do not remember if the range information was obtained
> > using strict overflow). In my opinion, improving code generation is more
> > important than this specific warning.
> >
> > Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
> 
> Hmm, I think you need to guard this with an INTEGRAL_TYPE_P check
> given the comment on tree_single_nonzero_warnv_p also talks about
> FP.

I vaguely remember there were issues with that, because VRP uses
the *_nonzero_warnv* functions to compute the ranges and now those
functions would use range info.  But that was some time ago, and maybe this
patch is different enough from what I was trying back then.

So just please watch carefully for any fallout.

Jakub

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Allan Sandfeld Jensen

On Monday 24 April 2017, Jakub Jelinek wrote:
> On Mon, Apr 24, 2017 at 09:33:09AM +0200, Allan Sandfeld Jensen wrote:
> > --- a/gcc/config/i386/avx2intrin.h
> > +++ b/gcc/config/i386/avx2intrin.h
> > @@ -667,7 +667,7 @@ extern __inline __m256i
> > 
> >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> >  _mm256_slli_epi16 (__m256i __A, int __B)
> >  {
> > 
> > -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> > +  return ((__B & 0xff) < 16) ? (__m256i)((__v16hi)__A << (__B & 0xff)) :
> > _mm256_setzero_si256();
> > 
> >  }
> 
> What is the advantage of doing that when you replace one operation with
> several (&, <, ?:, <<)?
> I'd say instead we should fold the builtins if in the gimple fold target
> hook we see the shift count constant and can decide based on that.
> Or we could use __builtin_constant_p (__B) to decide whether to use
> the generic vector shifts or builtin, but that means larger IL.

The advantage is that in this builtin, the __B is always a literal (or 
constexpr), so the if statement is resolved at compile time.

`Allan

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Jakub Jelinek

On Mon, Apr 24, 2017 at 09:51:29AM +0200, Allan Sandfeld Jensen wrote:
> On Monday 24 April 2017, Jakub Jelinek wrote:
> > On Mon, Apr 24, 2017 at 09:33:09AM +0200, Allan Sandfeld Jensen wrote:
> > > --- a/gcc/config/i386/avx2intrin.h
> > > +++ b/gcc/config/i386/avx2intrin.h
> > > @@ -667,7 +667,7 @@ extern __inline __m256i
> > > 
> > >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> > >  _mm256_slli_epi16 (__m256i __A, int __B)
> > >  {
> > > 
> > > -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> > > +  return ((__B & 0xff) < 16) ? (__m256i)((__v16hi)__A << (__B & 0xff)) :
> > > _mm256_setzero_si256();
> > > 
> > >  }
> > 
> > What is the advantage of doing that when you replace one operation with
> > several (&, <, ?:, <<)?
> > I'd say instead we should fold the builtins if in the gimple fold target
> > hook we see the shift count constant and can decide based on that.
> > Or we could use __builtin_constant_p (__B) to decide whether to use
> > the generic vector shifts or builtin, but that means larger IL.
> 
> The advantage is that in this builtin, the __B is always a literal (or 
> constexpr), so the if statement is resolved at compile time.

Do we really want to support all the thousands _mm* intrinsics in constexpr
contexts?  People can just use generic vectors instead.

That said, both the options I've mentioned above provide the same advantages
and don't have the disadvantages of pessimizing normal code.

Jakub

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Allan Sandfeld Jensen

On Monday 24 April 2017, Jakub Jelinek wrote:
> On Mon, Apr 24, 2017 at 09:51:29AM +0200, Allan Sandfeld Jensen wrote:
> > On Monday 24 April 2017, Jakub Jelinek wrote:
> > > On Mon, Apr 24, 2017 at 09:33:09AM +0200, Allan Sandfeld Jensen wrote:
> > > > --- a/gcc/config/i386/avx2intrin.h
> > > > +++ b/gcc/config/i386/avx2intrin.h
> > > > @@ -667,7 +667,7 @@ extern __inline __m256i
> > > > 
> > > >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> > > >  _mm256_slli_epi16 (__m256i __A, int __B)
> > > >  {
> > > > 
> > > > -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> > > > +  return ((__B & 0xff) < 16) ? (__m256i)((__v16hi)__A << (__B &
> > > > 0xff)) : _mm256_setzero_si256();
> > > > 
> > > >  }
> > > 
> > > What is the advantage of doing that when you replace one operation with
> > > several (&, <, ?:, <<)?
> > > I'd say instead we should fold the builtins if in the gimple fold
> > > target hook we see the shift count constant and can decide based on
> > > that. Or we could use __builtin_constant_p (__B) to decide whether to
> > > use the generic vector shifts or builtin, but that means larger IL.
> > 
> > The advantage is that in this builtin, the __B is always a literal (or
> > constexpr), so the if statement is resolved at compile time.
> 
> Do we really want to support all the thousands _mm* intrinsics in constexpr
> contexts?  People can just use generic vectors instead.
> 
I would love to support it, but first we need a C extension attribute matching 
constexpr, and I consider it a separate issue.

> That said, both the options I've mentioned above provide the same
> advantages and don't have the disadvantages of pessimizing normal code.
> 
What pessimizing? This produces the same or better code for all legal
arguments. The only difference besides better generated code is that it allows
the intrinsics to be used incorrectly with non-literal arguments because we
lack the C extension for constexpr to prevent that.

`Allan

[PATCH] Fix PR80494

2017-04-24 Thread Richard Biener


The following patch fixes this PR.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-04-24  Richard Biener  

PR tree-optimization/80494
* tree-scalar-evolution.c (analyze_scalar_evolution_1): Bail
out for complex types.

* gfortran.dg/pr80494.f90: New testcase.

Index: gcc/tree-scalar-evolution.c
===
--- gcc/tree-scalar-evolution.c (revision 247091)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -2049,7 +2049,9 @@ analyze_scalar_evolution_1 (struct loop
   basic_block bb;
   struct loop *def_loop;
 
-  if (loop == NULL || TREE_CODE (type) == VECTOR_TYPE)
+  if (loop == NULL
+  || TREE_CODE (type) == VECTOR_TYPE
+  || TREE_CODE (type) == COMPLEX_TYPE)
 return chrec_dont_know;
 
   if (TREE_CODE (var) != SSA_NAME)
Index: gcc/testsuite/gfortran.dg/pr80494.f90
===
--- gcc/testsuite/gfortran.dg/pr80494.f90   (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr80494.f90   (working copy)
@@ -0,0 +1,32 @@
+! { dg-do compile }
+! { dg-options "-std=gnu -O2" }
+
+subroutine CalcCgr(C,rmax,ordgr_max)
+  integer, intent(in) :: rmax,ordgr_max
+  double complex :: Zadj(2,2), Zadj2(2,2)
+  double complex, intent(out) :: C(0:rmax,0:rmax,0:rmax)
+  double complex, allocatable :: Cexpgr(:,:,:,:)
+  double complex :: Caux
+  integer :: rmaxB,rmaxExp,r,n0,n1,n2,k,l,i,j,m,n,nn
+
+  rmaxB = 2*rmax
+  rmaxExp = rmaxB
+  allocate(Cexpgr(0:rmaxExp/2,0:rmaxExp,0:rmaxExp,0:ordgr_max))
+   
+  rloop: do r=0,rmaxExp/2
+do n0=r,1,-1
+  do nn=r-n0,0,-1
+do i=1,2
+  Caux = Caux - Zadj(i,l)
+end do
+Cexpgr(n0,0,0,0) = Caux/(2*(nn+1))
+  end do
+end do
+do n1=0,r
+  n2 = r-n1
+  if (r.le.rmax) then
+C(0,n1,n2) = Cexpgr(0,n1,n2,0)
+  end if
+end do
+  end do rloop
+end subroutine CalcCgr

Re: [Patch, Fortran] PR 80121: Memory leak with derived-type intent(out) argument

2017-04-24 Thread Christophe Lyon

Hi,

On 23 April 2017 at 10:51, Janus Weil  wrote:
> Hi Thomas,
>
>>> the patch in the attachment fixes a memory leak by auto-deallocating
>>> the allocatable components of an allocatable intent(out) argument.
>>>
>>> Regtests cleanly on x86_64-linux-gnu. Ok for trunk?
>>
>> OK for trunk.
>
> thanks for the review! Committed as r247083.

This patch causes an error message from DejaGnu:
(DejaGnu) proc "cleanup-tree-dump original" does not exist.
I'm not familiar with Fortran, so I'm not sure whether the fix is as simple as
removing cleanup-tree-dump, as was done in the neighboring tests?

Thanks,

Christophe

>
>
>> Also (because this is a quite serious bug)
>> OK for gcc 7 after the release of 7.1.
>
> I tend to be a bit hesitant with backporting non-regression-fixes, but
> in this case I agree that it might make sense. (Will commit to
> 7-branch once it's open again, if no one objects.)
>
> At least the patch fixes the leak in the least invasive way I could
> find, although I was thinking out loud about ways to refactor the
> intent(out) handling:
>
>> My feeling is that it would be a good idea to handle allocatable derived 
>> types inside
>> of the callee as well. I can see at least two advantages:
>> * It would avoid code duplication if the procedure is called several times.
>> * It would take some complexity out of gfc_conv_procedure_call, which is 
>> quite a
>> monster.
>>
>> From the technical side a treatment in the callee should be possible AFAICS. 
>> I wonder
>> why it is being done in the caller at all?
>
>
> Any feedback is welcome here (possibly I'm missing some reason why
> this needs to be done by the caller?) ...
>
> Cheers,
> Janus

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Jakub Jelinek

On Mon, Apr 24, 2017 at 10:02:40AM +0200, Allan Sandfeld Jensen wrote:
> > That said, both the options I've mentioned above provide the same
> > advantages and don't have the disadvantages of pessimizing normal code.
> > 
> What pessimizing? This produces the same or better code for all legal
> arguments. The only difference besides better generated code is that it
> allows

No.  Have you really tried that?

> the intrinsics to be used incorrectly with non-literal arguments because we
> lack the C extension for constexpr to prevent that.

Consider e.g. -O2 -mavx2 -mtune=intel:
#include <x86intrin.h>

__m256i
foo (__m256i x, int s)
{
  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)x, s);
}

__m256i
bar (__m256i x, int s)
{
  return ((s & 0xff) < 16) ? (__m256i)((__v16hi)x << (s & 0xff))
                           : _mm256_setzero_si256 ();
}

The first one generates
        movl    %edi, %edi
        vmovq   %rdi, %xmm1
        vpsllw  %xmm1, %ymm0, %ymm0
        ret
(because that is actually what the instruction does), the second one
        movzbl  %dil, %edi
        cmpl    $15, %edi
        jg      .L5
        vmovq   %rdi, %xmm1
        vpsllw  %xmm1, %ymm0, %ymm0
        ret
        .p2align 4,,7
        .p2align 3
.L5:
        vpxor   %xmm0, %xmm0, %xmm0
        ret

Jakub

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Allan Sandfeld Jensen

On Monday 24 April 2017, Jakub Jelinek wrote:
> On Mon, Apr 24, 2017 at 10:02:40AM +0200, Allan Sandfeld Jensen wrote:
> > > That said, both the options I've mentioned above provide the same
> > > advantages and don't have the disadvantages of pessimizing normal code.
> > 
> > What pessimizing? This produces the same or better code for all legal
> > arguments. The only difference besides better generated code is that it
> > allows
> 
> No.  Have you really tried that?
> 
> > the intrinsics to be used incorrectly with non-literal arguments because
> > we lack the C extension for constexpr to prevent that.
> 
> Consider e.g. -O2 -mavx2 -mtune=intel:
> #include 
> 
> __m256i
> foo (__m256i x, int s)
> {
>   return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)x, s);
> }
> 
> __m256i
> bar (__m256i x, int s)
> {
>   return ((s & 0xff) < 16) ? (__m256i)((__v16hi)x << (s & 0xff)) :
> _mm256_setzero_si256 (); }
> 
> The first one generates
> movl%edi, %edi
> vmovq   %rdi, %xmm1
> vpsllw  %xmm1, %ymm0, %ymm0
> ret
> (because that is actually what the instruction does), the second one
That is a different instruction. That is vpsllw, not vpsllwi.

The intrinsics I changed are the immediate versions; I didn't change the
non-immediate versions. It is probably a bug if you can give non-immediate
values to the immediate-only intrinsics. At least both versions handle it, if
in different ways, but those are illegal arguments.

`Allan

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Allan Sandfeld Jensen

On Monday 24 April 2017, Allan Sandfeld Jensen wrote:
> On Monday 24 April 2017, Jakub Jelinek wrote:
> > On Mon, Apr 24, 2017 at 10:02:40AM +0200, Allan Sandfeld Jensen wrote:
> > > > That said, both the options I've mentioned above provide the same
> > > > advantages and don't have the disadvantages of pessimizing normal
> > > > code.
> > > 
> > > What pessimizing? This produce the same or better code for all legal
> > > arguments. The only difference besides better generated code is that it
> > > allows
> > 
> > No.  Have you really tried that?
> > 
> > > the intrinsics to be used incorrectly with non-literal arguments
> > > because we lack the C-extension for constexp to prevent that.
> > 
> > Consider e.g. -O2 -mavx2 -mtune=intel:
> > #include 
> > 
> > __m256i
> > foo (__m256i x, int s)
> > {
> > 
> >   return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)x, s);
> > 
> > }
> > 
> > __m256i
> > bar (__m256i x, int s)
> > {
> > 
> >   return ((s & 0xff) < 16) ? (__m256i)((__v16hi)x << (s & 0xff)) :
> > _mm256_setzero_si256 (); }
> > 
> > The first one generates
> > 
> > movl    %edi, %edi
> > vmovq   %rdi, %xmm1
> > vpsllw  %xmm1, %ymm0, %ymm0
> > ret
> > 
> > (because that is actually what the instruction does), the second one
> 
> That is a different instruction. That is the vpsllw not vpsllwi
> 
> The intrinsics I changed is the immediate version, I didn't change the non-
> immediate version. It is probably a bug if you can give non-immediate
> values to the immediate only intrinsic. At least both versions handles it,
> if in different ways, but is is illegal arguments.
> 
Though now that I think about it, this means my change to the existing 
sse-psslw-1.c test and friends is wrong, because it uses variable input.

`Allan

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Jakub Jelinek

On Mon, Apr 24, 2017 at 10:34:58AM +0200, Allan Sandfeld Jensen wrote:
> That is a different instruction. That is the vpsllw not vpsllwi
> 
> The intrinsics I changed is the immediate version, I didn't change the non-
> immediate version. It is probably a bug if you can give non-immediate values 
> to the immediate only intrinsic. At least both versions handles it, if in 
> different ways, but is is illegal arguments.

The documentation is unclear on that and I've only recently fixed up some
cases where these intrinsics weren't able to handle non-constant arguments
in some cases, while both ICC and clang coped with that fine.
So it is clearly allowed and handled by all the compilers and needs to be
supported, people use that in real-world code.

Jakub

[PATCH] Fix PR79201 (half-way)

2017-04-24 Thread Richard Biener


One issue in PR79201 is that we don't sink pure/const calls which is
what the following simple patch fixes.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-04-24  Richard Biener  

PR tree-optimization/79201
* tree-ssa-sink.c (statement_sink_location): Handle calls.

* gcc.dg/tree-ssa/ssa-sink-16.c: New testcase.

Index: gcc/tree-ssa-sink.c
===
*** gcc/tree-ssa-sink.c (revision 247091)
--- gcc/tree-ssa-sink.c (working copy)
*** statement_sink_location (gimple *stmt, b
*** 256,263 
  
*zero_uses_p = false;
  
!   /* We only can sink assignments.  */
!   if (!is_gimple_assign (stmt))
  return false;
  
/* We only can sink stmts with a single definition.  */
--- 257,268 
  
*zero_uses_p = false;
  
!   /* We only can sink assignments and non-looping const/pure calls.  */
!   int cf;
!   if (!is_gimple_assign (stmt)
!   && (!is_gimple_call (stmt)
! || !((cf = gimple_call_flags (stmt)) & (ECF_CONST|ECF_PURE))
! || (cf & ECF_LOOPING_CONST_OR_PURE)))
  return false;
  
/* We only can sink stmts with a single definition.  */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-16.c
===
*** gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-16.c (nonexistent)
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-16.c (working copy)
***
*** 0 
--- 1,14 
+ /* { dg-do compile } */
+ /* Note PRE rotates the loop and blocks the sinking opportunity.  */
+ /* { dg-options "-O2 -fno-tree-pre -fdump-tree-sink -fdump-tree-optimized" } */
+ 
+ int f(int n)
+ {
+   int i,j=0;
+   for (i = 0; i < 31; i++)
+ j = __builtin_ffs(i);
+   return j;
+ }
+ 
+ /* { dg-final { scan-tree-dump "Sinking j_. = __builtin_ffs" "sink" } } */
+ /* { dg-final { scan-tree-dump "return 2;" "optimized" } } */
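
For readers following the testcase, here is a hand-written sketch of what
sinking the call amounts to. It is only an illustration, not compiler output:
f_original mirrors the test function, while f_sunk and check_sink_example are
hypothetical names showing the call moved out of the loop, so that only the
last value of i (30) reaches __builtin_ffs and the result folds to 2.

```c
#include <assert.h>

/* Same shape as f() in ssa-sink-16.c: the call result is overwritten on
   every iteration and only the value from the last iteration is used.  */
int f_original (int n)
{
  int i, j = 0;
  (void) n;
  for (i = 0; i < 31; i++)
    j = __builtin_ffs (i);
  return j;
}

/* Hypothetical hand-sunk equivalent: the const call is executed once,
   after the loop, on the last value the loop produced.  */
int f_sunk (int n)
{
  int i, last = 0;
  (void) n;
  for (i = 0; i < 31; i++)
    last = i;
  return __builtin_ffs (last);
}

/* 30 is 0b11110, so its lowest set bit is bit 1 and ffs yields 2.  */
void check_sink_example (void)
{
  assert (f_original (0) == 2);
  assert (f_sunk (0) == f_original (0));
}
```

The sunk form is only valid because the call is const/pure and non-looping,
which is what the new condition in statement_sink_location checks.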

Re: [Patch, Fortran] PR 80121: Memory leak with derived-type intent(out) argument

2017-04-24 Thread Janus Weil

Hi Christophe,

2017-04-24 10:25 GMT+02:00 Christophe Lyon :
 the patch in the attachment fixes a memory leak by auto-deallocating
 the allocatable components of an allocatable intent(out) argument.

 Regtests cleanly on x86_64-linux-gnu. Ok for trunk?
>>>
>>> OK for trunk.
>>
>> thanks for the review! Committed as r247083.
>
> This patch causes an error message from DejaGnu:
> (DejaGnu) proc "cleanup-tree-dump original" does not exist.

thanks for letting me know. I didn't notice that ...


> I'm not familiar with fortran, so I'm not sure it is as obvious as
> removing cleanup-tree-dump as it is done in the other neighboring tests?

Yes, probably it should just be removed. I assume this kind of cleanup
is being done automatically now? I actually took it from this wiki
page:

https://gcc.gnu.org/wiki/TestCaseWriting

So I guess this needs to be updated as well. Will take care of both
points tonight ...

Cheers,
Janus

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Allan Sandfeld Jensen

On Monday 24 April 2017, Jakub Jelinek wrote:
> On Mon, Apr 24, 2017 at 10:34:58AM +0200, Allan Sandfeld Jensen wrote:
> > That is a different instruction. That is the vpsllw not vpsllwi
> > 
> > The intrinsics I changed is the immediate version, I didn't change the
> > non- immediate version. It is probably a bug if you can give
> > non-immediate values to the immediate only intrinsic. At least both
> > versions handles it, if in different ways, but is is illegal arguments.
> 
> The documentation is unclear on that and I've only recently fixed up some
> cases where these intrinsics weren't able to handle non-constant arguments
> in some cases, while both ICC and clang coped with that fine.
> So it is clearly allowed and handled by all the compilers and needs to be
> supported, people use that in real-world code.
> 
Undoubtedly it happens. I just made a mistake myself that created that case. 
But it is rather unfortunate, and means we currently generate wrong code for 
corner-case values.

Note the difference in definition between the two intrinsics: 
_mm_sll_epi16:
FOR j := 0 to 7
    i := j*16
    IF count[63:0] > 15
        dst[i+15:i] := 0
    ELSE
        dst[i+15:i] := ZeroExtend(a[i+15:i] << count[63:0])
    FI
ENDFOR

_mm_slli_epi16:
FOR j := 0 to 7
    i := j*16
    IF imm8[7:0] > 15
        dst[i+15:i] := 0
    ELSE
        dst[i+15:i] := ZeroExtend(a[i+15:i] << imm8[7:0])
    FI
ENDFOR

For a value such as 257, the immediate version does a 1-bit shift, while the 
non-immediate version returns a zero vector. A simple function using the 
immediate intrinsic therefore needs an if-statement when it is transformed to 
use the non-immediate instruction.
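
The two definitions can be modelled per 16-bit lane in plain C. This is only a
sketch of the pseudocode as written in the guide, not of what any compiler
emits; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of the non-immediate form: all 64 bits of the count are
   compared against 15.  */
uint16_t lane_sll (uint16_t a, uint64_t count)
{
  return count > 15 ? 0 : (uint16_t) (a << count);
}

/* One lane of the immediate form exactly as the pseudocode reads:
   only imm8[7:0] takes part in the comparison and the shift.  */
uint16_t lane_slli_as_documented (uint16_t a, int imm)
{
  unsigned int imm8 = (unsigned int) imm & 0xff;
  return imm8 > 15 ? 0 : (uint16_t) (a << imm8);
}

/* 257 is 0x101, so its low byte is 1.  */
void check_lane_models (void)
{
  assert (lane_sll (1, 257) == 0);
  assert (lane_slli_as_documented (1, 257) == 2);
}
```

For in-range counts (0 to 15) the two models agree; they only diverge once
bits above the low byte are set.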

`Allan

Re: PR79715: Special case strcpy/strncpy for dse

2017-04-24 Thread Prathamesh Kulkarni

On 1 March 2017 at 13:24, Richard Biener  wrote:
> On Tue, 28 Feb 2017, Jeff Law wrote:
>
>> On 02/28/2017 05:59 AM, Prathamesh Kulkarni wrote:
>> > On 28 February 2017 at 15:40, Jakub Jelinek  wrote:
>> > > On Tue, Feb 28, 2017 at 03:33:11PM +0530, Prathamesh Kulkarni wrote:
>> > > > Hi,
>> > > > The attached patch adds special-casing for strcpy/strncpy to dse pass.
>> > > > Bootstrapped+tested on x86_64-unknown-linux-gnu.
>> > > > OK for GCC 8 ?
>> > >
>> > > What is special on strcpy/strncpy?  Unlike memcpy/memmove/memset, you
>> > > don't
>> > > know the length they store (at least not in general), you don't know the
>> > > value, all you know is that they are for the first argument plain store
>> > > without remembering the pointer or anything based on it anywhere except 
>> > > in
>> > > the return value.
>> > > I believe stpcpy, stpncpy, strcat, strncat at least have the same
>> > > behavior.
>> > > On the other side, without knowing the length of the store, you can't
>> > > treat
>> > > it as killing something (ok, some calls like strcpy or stpcpy store at
>> > > least
>> > > the first byte).
>> > Well, I assumed a store to dest by strcpy (and friends), which gets
>> > immediately freed would count
>> > as a dead store since free would kill the whole memory block pointed
>> > to by dest ?
>> Yes.  But does it happen often in practice?  I wouldn't mind exploring this
>> for gcc-8, but I'd like to see real-world code where this happens.
>
> Actually I don't mind for "real-world code" - the important part is
> that I believe it's reasonable to assume it can happen from some C++
> abstraction and optimization.
Hi,
I have updated the patch to include stp[n]cpy and str[n]cat.
In initialize_ao_ref_for_dse for strncat, I suppose for strncat we
need to treat size as NULL_TREE
instead of setting it 2nd arg since we cannot know size (in general)
after strncat ?
Patch passes bootstrap+test on x86_64-unknown-linux-gnu.
Cross-tested on arm*-*-*, aarch64*-*-*.
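
To illustrate the strncat question: the bytes it stores do not start at dest
but at the old terminator, so the 2nd argument's length alone does not
describe the written range. A minimal sketch in plain C (the helper names are
hypothetical, not part of the patch):

```c
#include <assert.h>
#include <string.h>

/* Append at most n bytes of src to dest and return the offset from dest
   of the first byte strncat stores, i.e. the old terminator.  */
size_t strncat_first_written (char *dest, const char *src, size_t n)
{
  size_t start = strlen (dest);
  strncat (dest, src, n);
  return start;
}

void check_strncat_range (void)
{
  char buf[16] = "abc";
  size_t off = strncat_first_written (buf, "xyz", 2);
  /* The store covers buf[3..5] ('x', 'y', '\0'), not buf[0..1].  */
  assert (off == 3);
  assert (strcmp (buf, "abcxy") == 0);
}
```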

Thanks,
Prathamesh
>
> Richard.
2017-04-24  Prathamesh Kulkarni  

* tree-ssa-dse.c (initialize_ao_ref_for_dse): Add cases for
BUILT_IN_STRNCPY, BUILT_IN_STRCPY, BUILT_IN_STPNCPY, BUILT_IN_STPCPY,
BUILT_IN_STRNCAT, BUILT_IN_STRCAT.
(maybe_trim_memstar_call): Likewise.
(dse_dom_walker::dse_optimize_stmt): Likewise.

testsuite/
* gcc.dg/tree-ssa/pr79715.c: New test.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr79715.c b/gcc/testsuite/gcc.dg/tree-ssa/pr79715.c
new file mode 100644
index 000..2cd4c99
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr79715.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse-details" } */
+
+void f1(const char *s)
+{
+  unsigned n = __builtin_strlen (s) + 1;
+  char *p = __builtin_malloc (n);
+  __builtin_strcpy (p, s);
+  __builtin_free (p);
+}
+
+void f2(const char *s, unsigned n)
+{
+  char *p = __builtin_malloc (n);
+  __builtin_strncpy (p, s, n);
+  __builtin_free (p);
+}
+
+void f3(const char *s, unsigned n)
+{
+  char *p = __builtin_malloc (n);
+  __builtin_stpncpy (p, s, n);
+  __builtin_free (p);
+}
+
+void f4(char *s, char *t)
+{
+  __builtin_strcat (s, t);
+  __builtin_free (s);
+}
+
+void f5(char *s, char *t, unsigned n)
+{
+  __builtin_strncat (s, t, n);
+  __builtin_free (s);
+}
+
+/* { dg-final { scan-tree-dump "Deleted dead call: __builtin_strcpy" "dse1" } } */
+/* { dg-final { scan-tree-dump "Deleted dead call: __builtin_strncpy" "dse1" } } */
+/* { dg-final { scan-tree-dump "Deleted dead call: __builtin_stpncpy" "dse1" } } */
+/* { dg-final { scan-tree-dump "Deleted dead call: __builtin_strcat" "dse1" } } */
+/* { dg-final { scan-tree-dump "Deleted dead call: __builtin_strncat" "dse1" } } */
diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 90230ab..752b2fa 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -92,15 +92,24 @@ initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write)
   /* It's advantageous to handle certain mem* functions.  */
   if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
 {
+  tree size = NULL_TREE;
   switch (DECL_FUNCTION_CODE (gimple_call_fndecl (stmt)))
{
  case BUILT_IN_MEMCPY:
  case BUILT_IN_MEMMOVE:
  case BUILT_IN_MEMSET:
+ case BUILT_IN_STRNCPY:
+ case BUILT_IN_STRCPY:
+ case BUILT_IN_STPNCPY:
+ case BUILT_IN_STPCPY:
{
- tree size = NULL_TREE;
  if (gimple_call_num_args (stmt) == 3)
size = gimple_call_arg (stmt, 2);
+   }
+ /* fallthrough.  */
+ case BUILT_IN_STRCAT:
+ case BUILT_IN_STRNCAT:
+   {
  tree ptr = gimple_call_arg (stmt, 0);
  ao_ref_init_from_ptr_and_size (write, ptr, size);
  return true;
@@ -395,6 +404,12 @@ maybe_trim_memstar_call (ao_ref *ref, sbitmap live, gimple *stmt)
 {
 case BUILT_IN_MEMCPY:
 case BUILT_IN_MEMMOVE:
+case BUILT_IN_STRN

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Jakub Jelinek

On Mon, Apr 24, 2017 at 11:01:29AM +0200, Allan Sandfeld Jensen wrote:
> On Monday 24 April 2017, Jakub Jelinek wrote:
> > On Mon, Apr 24, 2017 at 10:34:58AM +0200, Allan Sandfeld Jensen wrote:
> > > That is a different instruction. That is the vpsllw not vpsllwi
> > > 
> > > The intrinsics I changed is the immediate version, I didn't change the
> > > non- immediate version. It is probably a bug if you can give
> > > non-immediate values to the immediate only intrinsic. At least both
> > > versions handles it, if in different ways, but is is illegal arguments.
> > 
> > The documentation is unclear on that and I've only recently fixed up some
> > cases where these intrinsics weren't able to handle non-constant arguments
> > in some cases, while both ICC and clang coped with that fine.
> > So it is clearly allowed and handled by all the compilers and needs to be
> > supported, people use that in real-world code.
> > 
> Undoubtedly it happens. I just make a mistake myself that created that case. 
> But it is rather unfortunate, and means we make wrong code currently for 
> corner case values.

The intrinsic documentation is poor, usually you have a good documentation
on what the instructions do, and then you just have to guess what the
intrinsics do.  You can of course ask Intel for clarification.

If you try:
#include 

__m128i
foo (__m128i a, int b)
{
  return _mm_slli_epi16 (a, b);
}
and call it with 257 from somewhere else, you can see that all the compilers
will give you zero vector.  And similarly if you use 257 literally instead
of b.  So what the intrinsic (unlike the instruction)
actually does is that it compares all bits of the imm8 argument (supposedly
using unsigned comparison) and if it is bigger than 15 (or 7 or 31 or 63
depending on the bitsize of element) it yields 0 vector.
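
As a rough per-lane model of that behaviour for the 16-bit case (a sketch
only; the function names are hypothetical, and treating a negative argument
as a large unsigned value is an assumption that follows the "supposedly
unsigned" reading above):

```c
#include <assert.h>
#include <stdint.h>

/* One 16-bit lane: every bit of the count takes part in the unsigned
   comparison, so anything above 15 yields zero.  */
uint16_t lane_slli_de_facto (uint16_t a, int count)
{
  unsigned int c = (unsigned int) count;
  return c > 15 ? 0 : (uint16_t) (a << c);
}

void check_de_facto_model (void)
{
  assert (lane_slli_de_facto (1, 257) == 0);
  assert (lane_slli_de_facto (1, 15) == 0x8000);
  assert (lane_slli_de_facto (1, 16) == 0);
}
```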

Jakub

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-04-24 Thread Allan Sandfeld Jensen

On Monday 24 April 2017, Jakub Jelinek wrote:
> On Mon, Apr 24, 2017 at 11:01:29AM +0200, Allan Sandfeld Jensen wrote:
> > On Monday 24 April 2017, Jakub Jelinek wrote:
> > > On Mon, Apr 24, 2017 at 10:34:58AM +0200, Allan Sandfeld Jensen wrote:
> > > > That is a different instruction. That is the vpsllw not vpsllwi
> > > > 
> > > > The intrinsics I changed is the immediate version, I didn't change
> > > > the non- immediate version. It is probably a bug if you can give
> > > > non-immediate values to the immediate only intrinsic. At least both
> > > > versions handles it, if in different ways, but is is illegal
> > > > arguments.
> > > 
> > > The documentation is unclear on that and I've only recently fixed up
> > > some cases where these intrinsics weren't able to handle non-constant
> > > arguments in some cases, while both ICC and clang coped with that
> > > fine.
> > > So it is clearly allowed and handled by all the compilers and needs to
> > > be supported, people use that in real-world code.
> > 
> > Undoubtedly it happens. I just make a mistake myself that created that
> > case. But it is rather unfortunate, and means we make wrong code
> > currently for corner case values.
> 
> The intrinsic documentation is poor, usually you have a good documentation
> on what the instructions do, and then you just have to guess what the
> intrinsics do.  You can of course ask Intel for clarification.
> 
> If you try:
> #include 
> 
> __m128i
> foo (__m128i a, int b)
> {
>   return _mm_slli_epi16 (a, b);
> }
> and call it with 257 from somewhere else, you can see that all the
> compilers will give you zero vector.  And similarly if you use 257
> literally instead of b.  So what the intrinsic (unlike the instruction)
> actually does is that it compares all bits of the imm8 argument (supposedly
> using unsigned comparison) and if it is bigger than 15 (or 7 or 31 or 63
> depending on the bitsize of element) it yields 0 vector.
> 
Good point. I was using Intel's documentation at 
https://software.intel.com/sites/landingpage/IntrinsicsGuide/, but if all 
compilers including ours do something else, practicality wins.

It did make me curious, so I tested what _mm_slli_epi16(v, -250); compiles to. 
For some reason that becomes an undefined shift using the non-immediate sll in 
gcc, but returns the zero vector in clang. With my patch it was a 6-bit shift, 
but that is apparently not the de facto standard.


`Allan

Re: [PATCH][AArch64] Allow const0_rtx operand for atomic compare-exchange patterns

2017-04-24 Thread Kyrill Tkachov


Pinging this back into context so that I don't forget about it...

https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01648.html

Thanks,
Kyrill

On 28/02/17 12:29, Kyrill Tkachov wrote:

Hi all,

For the testcase in this patch we currently generate:
foo:
        mov     w1, 0
        ldaxr   w2, [x0]
        cmp     w2, 3
        bne     .L2
        stxr    w3, w1, [x0]
        cmp     w3, 0
.L2:
        cset    w0, eq
        ret

Note that the STXR could have been storing the WZR register instead of
moving zero into w1.  This is due to overly strict predicates and
constraints in the store exclusive pattern and the atomic compare exchange
expanders and splitters.  This simple patch fixes that in the patterns
concerned and with it we can generate:
foo:
        ldaxr   w1, [x0]
        cmp     w1, 3
        bne     .L2
        stxr    w2, wzr, [x0]
        cmp     w2, 0
.L2:
        cset    w0, eq
        ret
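
For orientation, a function of roughly the following shape would lead to
sequences like the ones above: a weak compare-exchange with acquire ordering
that expects 3 and stores 0. This is an assumed reconstruction for
illustration, not a copy of the testcase in the patch, and the names are
hypothetical.

```c
#include <assert.h>

/* Weak compare-exchange: if *a == 3, try to store 0.  Returns nonzero
   on success.  A weak exchange may fail spuriously, so there is no
   retry loop in the generated code.  */
int cas_3_to_0 (int *a)
{
  int expected = 3;
  return __atomic_compare_exchange_n (a, &expected, 0, 1 /* weak */,
                                      __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
}

void check_cas_3_to_0 (void)
{
  int v = 5;
  /* Value mismatch: must fail and leave *a untouched.  */
  assert (cas_3_to_0 (&v) == 0);
  assert (v == 5);
}
```

Because the desired value is the constant 0, the store can use wzr directly
once the predicates accept const0_rtx.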


Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for GCC 8?

Thanks,
Kyrill

2017-02-28  Kyrylo Tkachov  

* config/aarch64/atomics.md (atomic_compare_and_swap expander):
Use aarch64_reg_or_zero predicate for operand 4.
(aarch64_compare_and_swap define_insn_and_split):
Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
(aarch64_store_exclusive): Likewise for operand 2.

2017-02-28  Kyrylo Tkachov  

* gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.

Re: [PATCH][AArch64] Emit tighter strong atomic compare-exchange loop when comparing against zero

2017-04-24 Thread Kyrill Tkachov


Pinging this back into context so that I don't forget about it...

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00376.html

Thanks,
Kyrill

On 08/03/17 16:35, Kyrill Tkachov wrote:

Hi all,

For the testcase in this patch where the value of x is zero we currently
generate:
foo:
        mov     w1, 4
.L2:
        ldaxr   w2, [x0]
        cmp     w2, 0
        bne     .L3
        stxr    w3, w1, [x0]
        cbnz    w3, .L2
.L3:
        cset    w0, eq
        ret

We currently cannot merge the cmp and b.ne inside the loop into a cbnz
because we need the condition flags set for the return value of the function
(i.e. the cset at the end).
But if we re-jig the sequence in that case we can generate a tighter loop:
foo:
        mov     w1, 4
.L2:
        ldaxr   w2, [x0]
        cbnz    w2, .L3
        stxr    w3, w1, [x0]
        cbnz    w3, .L2
.L3:
        cmp     w2, 0
        cset    w0, eq
        ret

So we add an explicit compare after the loop and inside the loop we use the
fact that we're comparing against zero to emit a CBNZ. This means we may do
the comparison twice (once inside the CBNZ, once at the CMP at the end), but
there is now less code inside the loop.
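
A function of roughly this shape would produce such a loop: a strong
compare-exchange with acquire ordering that expects zero and stores 4. This
is an assumed sketch for illustration, not a copy of the testcase in the
patch, and the names are hypothetical.

```c
#include <assert.h>

/* Strong compare-exchange: if *a == 0, store 4.  A strong exchange must
   not fail spuriously, hence the retry loop around the exclusive store
   in the generated code.  Returns nonzero on success.  */
int cas_0_to_4 (int *a)
{
  int expected = 0;
  return __atomic_compare_exchange_n (a, &expected, 4, 0 /* strong */,
                                      __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
}

void check_cas_0_to_4 (void)
{
  int v = 0;
  assert (cas_0_to_4 (&v) == 1);   /* 0 -> 4 succeeds */
  assert (v == 4);
  assert (cas_0_to_4 (&v) == 0);   /* no longer zero, so it fails */
  assert (v == 4);
}
```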

I've seen this sequence appear in glibc locking code so maybe it's worth
adding the extra bit of complexity to the compare-exchange splitter to catch
this case.

Bootstrapped and tested on aarch64-none-linux-gnu. In previous iterations of
the patch where I had gotten some logic wrong it would cause miscompiles of
libgomp leading to timeouts in its testsuite but this version passes
everything cleanly.

Ok for GCC 8? (I know it's early, but might as well get it out in case
someone wants to try it out)

Thanks,
Kyrill

2017-03-08  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
Emit CBNZ inside loop when doing a strong exchange and comparing
against zero.  Generate the CC flags after the loop.

2017-03-08  Kyrylo Tkachov  

* gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: New test.

[wwwdocs] Change "GCC 6" to "GCC 7" in /gcc7/index.html

2017-04-24 Thread Jonathan Wakely


Committed to CVS.

Index: htdocs/gcc-7/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/index.html,v
retrieving revision 1.1
diff -u -r1.1 index.html
--- htdocs/gcc-7/index.html	20 Apr 2017 10:32:58 -	1.1
+++ htdocs/gcc-7/index.html	24 Apr 2017 09:38:17 -
@@ -8,7 +8,7 @@
 
 GCC 7 Release Series
 
-As of this time no releases of GCC 6 have yet been made.
+As of this time no releases of GCC 7 have yet been made.
 
 References and Acknowledgements

[PATCH] Cleanup SCCVN

2017-04-24 Thread Richard Biener


This does a cleanup I postponed for GCC 8, namely not fail value-numbering
completely if we end up with a large SCC but instead just drop that SCC
to varying.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2017-04-24  Richard Biener  

* tree-ssa-sccvn.h (run_scc_vn): Adjust prototype.
* tree-ssa-sccvn.c (print_scc): Print SCC size.
(extract_and_process_scc_for_name): Never fail but drop SCC to varying.
(DFS): Adjust and never fail.
(sccvn_dom_walker::fail): Remove.
(sccvn_dom_walker::before_dom_children): Adjust.
(run_scc_vn): Likewise and never fail.
* tree-ssa-pre.c (pass_pre::execute): Adjust.
(pass_fre::execute): Likewise.

Index: gcc/tree-ssa-pre.c
===
--- gcc/tree-ssa-pre.c  (revision 247091)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -5086,11 +5086,7 @@ pass_pre::execute (function *fun)
  loop_optimizer_init may create new phis, etc.  */
   loop_optimizer_init (LOOPS_NORMAL);
 
-  if (!run_scc_vn (VN_WALK))
-{
-  loop_optimizer_finalize ();
-  return 0;
-}
+  run_scc_vn (VN_WALK);
 
   init_pre ();
   scev_initialize ();
@@ -5202,8 +5198,7 @@ pass_fre::execute (function *fun)
 {
   unsigned int todo = 0;
 
-  if (!run_scc_vn (VN_WALKREWRITE))
-return 0;
+  run_scc_vn (VN_WALKREWRITE);
 
   memset (&pre_stats, 0, sizeof (pre_stats));
 
Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 247091)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -3153,7 +3153,7 @@ print_scc (FILE *out, vec scc)
   tree var;
   unsigned int i;
 
-  fprintf (out, "SCC consists of:");
+  fprintf (out, "SCC consists of %u:", scc.length ());
   FOR_EACH_VEC_ELT (scc, i, var)
 {
   fprintf (out, " ");
@@ -4316,7 +4316,7 @@ process_scc (vec scc)
and process them.  Returns true if all went well, false if
we run into resource limits.  */
 
-static bool
+static void
 extract_and_process_scc_for_name (tree name)
 {
   auto_vec scc;
@@ -4332,24 +4332,37 @@ extract_and_process_scc_for_name (tree n
   scc.safe_push (x);
 } while (x != name);
 
-  /* Bail out of SCCVN in case a SCC turns out to be incredibly large.  */
-  if (scc.length ()
-  > (unsigned)PARAM_VALUE (PARAM_SCCVN_MAX_SCC_SIZE))
+  /* Drop all defs in the SCC to varying in case a SCC turns out to be
+ incredibly large.
+ ???  Just switch to a non-optimistic mode that avoids any iteration.  */
+  if (scc.length () > (unsigned)PARAM_VALUE (PARAM_SCCVN_MAX_SCC_SIZE))
 {
   if (dump_file)
-   fprintf (dump_file, "WARNING: Giving up with SCCVN due to "
-"SCC size %u exceeding %u\n", scc.length (),
-(unsigned)PARAM_VALUE (PARAM_SCCVN_MAX_SCC_SIZE));
-
-  return false;
+   {
+ print_scc (dump_file, scc);
+ fprintf (dump_file, "WARNING: Giving up value-numbering SCC due to "
+  "size %u exceeding %u\n", scc.length (),
+  (unsigned)PARAM_VALUE (PARAM_SCCVN_MAX_SCC_SIZE));
+   }
+  tree var;
+  unsigned i;
+  FOR_EACH_VEC_ELT (scc, i, var)
+   {
+ gimple *def = SSA_NAME_DEF_STMT (var);
+ mark_use_processed (var);
+ if (SSA_NAME_IS_DEFAULT_DEF (var)
+ || gimple_code (def) == GIMPLE_PHI)
+   set_ssa_val_to (var, var);
+ else
+   defs_to_varying (def);
+   }
+  return;
 }
 
   if (scc.length () > 1)
 sort_scc (scc);
 
   process_scc (scc);
-
-  return true;
 }
 
 /* Depth first search on NAME to discover and process SCC's in the SSA
@@ -4359,7 +4372,7 @@ extract_and_process_scc_for_name (tree n
Returns true if successful, false if we stopped processing SCC's due
to resource constraints.  */
 
-static bool
+static void
 DFS (tree name)
 {
   auto_vec itervec;
@@ -4399,12 +4412,11 @@ start_over:
{
  /* See if we found an SCC.  */
  if (VN_INFO (name)->low == VN_INFO (name)->dfsnum)
-   if (!extract_and_process_scc_for_name (name))
- return false;
+   extract_and_process_scc_for_name (name);
 
  /* Check if we are done.  */
  if (namevec.is_empty ())
-   return true;
+   return;
 
  /* Restore the last use walker and continue walking there.  */
  use = name;
@@ -4687,7 +4699,7 @@ class sccvn_dom_walker : public dom_walk
 {
 public:
   sccvn_dom_walker ()
-: dom_walker (CDI_DOMINATORS, true), fail (false), cond_stack (0) {}
+: dom_walker (CDI_DOMINATORS, true), cond_stack (0) {}
 
   virtual edge before_dom_children (basic_block);
   virtual void after_dom_children (basic_block);
@@ -4697,7 +4709,6 @@ public:
   void record_conds (basic_block,
 enum tree_code code, tree lhs, tree rhs, bool value);
 
-  bool fail;
   auto_vec > >
 cond_stack;
 };
@@ -4

Re: [PATCH GCC8][02/33]Remove code handling pseudo candidate

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:38 PM, Bin Cheng  wrote:
> Hi,
> We don't have pseudo candidate nowadays, so remove any related code.
>
> Is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
>
> 2017-04-11  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (get_computation_cost_at): Remove pseudo
> iv_cand code.
> (determine_group_iv_cost_cond, determine_iv_cost): Ditto.
> (iv_ca_set_no_cp, create_new_iv): Ditto.

Re: [PATCH GCC8][03/33]Refactor invariant variable/expression handling

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:38 PM, Bin Cheng  wrote:
> Hi,
> This patch refactors how invariant variable/expressions are handled.  Now
> they are recorded in the same kind data structure and handled similarly,
> which makes code easier to understand.
>
> Is it OK?

Ok.

Richard.

> Thanks,
> bin
>
> 2017-04-11  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (struct cost_pair): Rename depends_on to
> inv_vars.  Add inv_exprs.
> (struct iv_cand): Rename depends_on to inv_vars.
> (struct ivopts_data): Rename max_inv_id/n_invariant_uses to
> max_inv_var_id/n_inv_var_uses.  Move max_inv_expr_id around.
> Refactor field used_inv_exprs from has_map to array n_inv_expr_uses.
> (dump_cand): Dump inv_vars.
> (tree_ssa_iv_optimize_init): Support inv_vars and inv_exprs.
> (record_invariant, find_depends, add_candidate_1): Ditto.
> (set_group_iv_cost, force_var_cost): Ditto.
> (split_address_cost, ptr_difference_cost, difference_cost): Ditto.
> (get_computation_cost_at, get_computation_cost): Ditto.
> (determine_group_iv_cost_generic): Ditto.
> (determine_group_iv_cost_address): Ditto.
> (determine_group_iv_cost_cond, autoinc_possible_for_pair): Ditto.
> (determine_group_iv_costs): Ditto.
> (iv_ca_recount_cost): Update call to ivopts_global_cost_for_size.
> (iv_ca_set_remove_invariants): Renamed to ...
> (iv_ca_set_remove_invs): ... this.  Support inv_vars and inv_exprs.
> (iv_ca_set_no_cp): Use iv_ca_set_remove_invs.
> (iv_ca_set_add_invariants):  Renamed to ...
> (iv_ca_set_add_invs): ... this.  Support inv_vars and inv_exprs.
> (iv_ca_set_cp): Use iv_ca_set_add_invs.
> (iv_ca_has_deps): Support inv_vars and inv_exprs.
> (iv_ca_new, iv_ca_free, iv_ca_dump, free_loop_data): Ditto.
> (create_new_ivs): Remove useless dump.
>
> gcc/testsuite/ChangeLog
> 2017-04-11  Bin Cheng  
>
> * g++.dg/tree-ssa/ivopts-3.C: Adjust test string.

Re: [PATCH GCC8][04/33]Single interface finding invariant variables

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:39 PM, Bin Cheng  wrote:
> Hi,
> This patch refactors interface finding invariant variables.  Now customers
> only need to call find_inv_vars, rather than set global variable
> fd_ivopts_data then call walk_tree.
> Is it OK?

Ok.

Richard.

> Thanks,
> bin
>
> 2017-04-11  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (struct walk_tree_data): New.
> (find_inv_vars_cb): New.
> (find_depends): Renamed to ...
> (find_inv_vars): ... this.
> (add_candidate_1, force_var_cost): Call find_inv_vars.
> (split_address_cost, determine_group_iv_cost_cond): Ditto.

Re: [PATCH GCC8][06/33]Simple refactor of function rewrite_use_address

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:40 PM, Bin Cheng  wrote:
> Hi,
> Simple refactor for function rewrite_use_address.

Ok.

Richard.

> Thanks,
> bin
> 2017-04-11  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (rewrite_use_address): Simple refactor.

Re: [PATCH GCC8][05/33]Count invariant and candidate separately

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:40 PM, Bin Cheng  wrote:
> Hi,
> Simple refactor counting invariant (both variables and expressions) and
> induction variables separately.
> Is it OK?

Ok.

Richard.

> Thanks,
> bin
> 2017-04-11  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (struct iv_ca): Rename n_regs to n_invs.
> (ivopts_global_cost_for_size): Rename parameter and update uses.
> (iv_ca_recount_cost): Update uses.
> (iv_ca_set_remove_invs, iv_ca_set_no_cp): Record invariants and
> candidates separately in n_invs and n_cands.
> (iv_ca_set_add_invs, iv_ca_set_cp, iv_ca_new): Ditto.

Re: [PATCH GCC8][07/33]Offset validity check in address expression

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:41 PM, Bin Cheng  wrote:
> Hi,
> For now, we check validity of offset by computing the maximum offset then
> checking if offset is smaller than the max offset.  This is inaccurate, for
> example, some targets may require offset to be aligned by power of 2.  This
> patch introduces new interface checking validity of offset.  It also
> buffers rtx among different calls.
>
> Is it OK?

-  static vec max_offset_list;
-
+  auto_vec addr_list;
   as = TYPE_ADDR_SPACE (TREE_TYPE (use->iv->base));
   mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));

-  num = max_offset_list.length ();
+  num = addr_list.length ();
   list_index = (unsigned) as * MAX_MACHINE_MODE + (unsigned) mem_mode;
   if (list_index >= num)

num here is always zero and thus the compare is always true.

+  addr_list.safe_grow_cleared (list_index + MAX_MACHINE_MODE);
+  for (; num < addr_list.length (); num++)
+   addr_list[num] = NULL;

the loop is now redundant (safe_grow_cleared)

+  addr = addr_list[list_index];
+  if (!addr)
 {

always true again...

I wonder if you really intended to drop 'static' from addr_list?
There's no caching across function calls.
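
To spell out the caching point, here is a plain C analogue (not the GCC code;
table, builds and the function names are made up for illustration). With
automatic storage the table is empty on every call, so the "already built?"
check always misses; with static storage the entry survives between calls.

```c
#include <assert.h>
#include <stddef.h>

int builds;   /* counts how often the "expensive" entry is built */

static int build_entry (size_t idx)
{
  builds++;
  return (int) idx + 100;
}

/* Automatic storage: starts empty on each call, so nothing is cached.
   idx must be below 8.  */
int lookup_auto (size_t idx)
{
  int table[8] = { 0 };
  if (!table[idx])
    table[idx] = build_entry (idx);
  return table[idx];
}

/* Static storage: the entry built by the first call is reused.  */
int lookup_static (size_t idx)
{
  static int table[8];
  if (!table[idx])
    table[idx] = build_entry (idx);
  return table[idx];
}

void check_caching (void)
{
  builds = 0;
  lookup_auto (2);
  lookup_auto (2);
  assert (builds == 2);   /* rebuilt on every call */

  builds = 0;
  lookup_static (2);
  lookup_static (2);
  assert (builds <= 1);   /* built at most once across calls */
}
```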

+ /* Split group if aksed to, or the offset against the first
+use can't fit in offset part of addressing mode.  IV uses
+having the same offset are still kept in one group.  */
+ if (offset != 0 &&
+ (split_p || !addr_offset_valid_p (use, offset)))

&& goes to the next line.

Richard.



> Thanks,
> bin
> 2017-04-11  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (compute_max_addr_offset): Delete.
> (addr_offset_valid_p): New function.
> (split_address_groups): Check offset validity with above function.

Re: [PATCH GCC8][08/33]Clean get_computation_*interfaces

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:41 PM, Bin Cheng  wrote:
> Hi,
> This patch cleans get_computation* interfaces.  Specifically, it removes
> get_computation and get_computation_cost_at.
>
> Is it OK?

Ok.

Richard.

> Thanks,
> bin
> 2017-04-11  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (get_computation_at): Reorder parameters.
> (get_computation): Delete.
> (get_computation_cost): Implement like get_computation_cost_at.
> Use get_computation_at.
> (get_computation_cost_at): Delete.
> (rewrite_use_nonlinear_expr): Use get_computation_at.
> (rewrite_use_compare, remove_unused_ivs): Ditto.

Re: [PATCH GCC8][09/33]Compute separate aff_trees for invariant and induction parts

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:42 PM, Bin Cheng  wrote:
> Hi,
> This patch computes and returns separate aff_trees for invariant expression
> and induction expression, so that invariant and induction parts can be handled
> separately in both cost computation and code generation.
>
> Is it OK?

Ok.

Richard.

> Thanks,
> bin
> 2017-04-11  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (get_computation_aff_1): New.
> (get_computation_aff): Reorder parameters.  Use get_computation_aff_1.
> (get_computation_at, rewrite_use_address): Update use of
> get_computation_aff.

Re: [PATCH GCC8][10/33]Clean get_scaled_computation_cost_at and the dump info

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:42 PM, Bin Cheng  wrote:
> Hi,
> This patch simplifies function get_scaled_computation_cost_at and the dump 
> information.
>
> Is it OK?

Ok.

Richard.

> Thanks,
> bin
> 2017-04-11  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (get_scaled_computation_cost_at): Delete
> parameter cand.  Update dump information.
> (get_computation_cost): Update uses.

Re: [PATCH GCC8][11/33]New interfaces for tree affine

2017-04-24 Thread Richard Biener

On Tue, Apr 18, 2017 at 12:43 PM, Bin Cheng  wrote:
> Hi,
> This patch adds three simple interfaces for tree affine which will be used in
> cost computation later.
>
> Is it OK?

+static inline tree
+aff_combination_type (aff_tree *aff)

misses a function comment.  Please do not introduce new 'static inline'
functions in headers but instead use plain 'inline'.

+/* Return true if AFF is simple enough.  */
+static inline bool
+aff_combination_simple_p (aff_tree *aff)
+{

what is "simple"?  Based on that find a better name.
"singleton"?  But aff_combination_const_p isn't
simple_p (for whatever reason).

Richard.

> Thanks,
> bin
> 2017-04-11  Bin Cheng  
>
> * tree-affine.h (aff_combination_type): New interface.
> (aff_combination_const_p, aff_combination_simple_p): New interfaces.

Fix -fdump-ipa-all ICE

2017-04-24 Thread Jan Hubicka

Hi,
this patch fixes an ICE in dumping that is triggered by a somewhat overactive
sanity check.  I think it would be nice to get it into release branches so we
could debug things more easily.

I am going to commit to trunk, OK for release branches?
Honza
PR middle-end/79931
* ipa-devirt.c (dump_possible_polymorphic_call_targets): Fix ICE.
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 246970)
+++ ipa-devirt.c(working copy)
@@ -3367,7 +3367,13 @@ dump_possible_polymorphic_call_targets (
   fprintf (f, "  Speculative targets:");
   dump_targets (f, targets);
 }
-  gcc_assert (targets.length () <= len);
+  /* Ugly: during callgraph construction the target cache may get populated
+ before all targets are found.  While this is harmless (because all local
+ types are discovered and only in those case we devirtualize fully and we
+ don't do speculative devirtualization before IPA stage) it triggers
+ assert here when dumping at that stage also populates the case with
+ speculative targets.  Quietly ignore this.  */
+  gcc_assert (symtab->state < IPA_SSA || targets.length () <= len);
   fprintf (f, "\n");
 }

Re: Fix -fdump-ipa-all ICE

2017-04-24 Thread Jakub Jelinek

On Mon, Apr 24, 2017 at 12:48:19PM +0200, Jan Hubicka wrote:
> Hi,
> this patch fixes an ICE in dumping that is triggered by a somewhat overactive
> sanity check.  I think it would be nice to get it into release branches so we
> could debug things more easily.
> 
> I am going to commit to trunk, OK for release branches?
> Honza
>   PR middle-end/79931
>   * ipa-devirt.c (dump_possible_polymorphic_call_targets): Fix ICE.

No testcase in the patch?

> Index: ipa-devirt.c
> ===
> --- ipa-devirt.c  (revision 246970)
> +++ ipa-devirt.c  (working copy)
> @@ -3367,7 +3367,13 @@ dump_possible_polymorphic_call_targets (
>fprintf (f, "  Speculative targets:");
>dump_targets (f, targets);
>  }
> -  gcc_assert (targets.length () <= len);
> +  /* Ugly: during callgraph construction the target cache may get populated
> + before all targets are found.  While this is harmless (because all local
> + types are discovered and only in those case we devirtualize fully and we
> + don't do speculative devirtualization before IPA stage) it triggers
> + assert here when dumping at that stage also populates the case with
> + speculative targets.  Quietly ignore this.  */
> +  gcc_assert (symtab->state < IPA_SSA || targets.length () <= len);
>fprintf (f, "\n");
>  }
>  

Jakub

[PATCH] [MSP430] [STAGE 1] Fix PR78849: ICE on initialization of global struct containing __int20 array

2017-04-24 Thread Jozef Lawrynowicz

The attached patch modifies the setting of TYPE_SIZE for __intN types
to use GET_MODE_BITSIZE rather than the bitsize extracted from the N
value. TYPE_SIZE for sizetype and bitsizetype are also modified to use
GET_MODE_BITSIZE rather than the precision of the type.

This fixes an issue for the msp430 target where the TYPE_SIZE of the
__int20 type was set using the precision (20 bits) instead of the
in-memory size (32 bits) of the type. This was reported in PR78849:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78849
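
To make the precision/size difference concrete, here is a minimal sketch (not GCC code).  storage_bits is a hypothetical helper that assumes the in-memory size is the precision rounded up to a power-of-two number of bits; that matches the 20-bit/32-bit case described above but is not how GCC derives mode sizes.

```cpp
#include <cassert>

// Hypothetical helper (an assumption for illustration, not a GCC function):
// round a precision in bits up to the next power-of-two width, minimum 8.
unsigned storage_bits (unsigned precision_bits)
{
  assert (precision_bits > 0 && precision_bits <= 1024);
  unsigned bits = 8;
  while (bits < precision_bits)
    bits *= 2;
  return bits;
}

// Size in bytes of an array whose elements each occupy ELEMENT_BITS bits.
unsigned array_bytes (unsigned element_bits, unsigned count)
{
  assert (element_bits % 8 == 0);
  return element_bits / 8 * count;
}
```

Under that assumption a three-element array of a 20-bit type needs 12 bytes, whereas 3 * 20 bits is not even a whole number of bytes, which is why laying out an initializer from the precision cannot work.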

I've added a testcase for the bug report, however I had to modify the
test driver, msp430.exp, for the test to pass. The other test drivers
add -pedantic-errors to DEFAULT_CFLAGS, and this causes an error when
the __int20 type is used:
> pr78849.c:5:3: error: ISO C does not support '__int20' types [-Wpedantic]
msp430.exp now removes -pedantic-errors from DEFAULT_CFLAGS.

The patch passed bootstrap and regression testing with no regressions
on recent trunk (r247020) for x86_64-pc-linux-gnu.
The patch passed regression testing with "-mcpu=msp430x/-mlarge" for
msp430-elf with no regressions, and fixed some failures in the gcc and
g++ testsuites:

> c-c++-common/torture/builtin-arith-overflow-7.c   -O0  execution test
> c-c++-common/torture/builtin-arith-overflow-8.c   -O0  execution test
> c-c++-common/torture/builtin-arith-overflow-9.c   -O0  execution test
> gcc.dg/pow-sqrt-1.c execution test
> gcc.dg/pow-sqrt-2.c execution test
> gcc.dg/pow-sqrt-3.c execution test
> gcc.dg/pr47201.c (test for excess errors)
>
> g++.dg/torture/pr37922.C   -O1  execution test
> g++.dg/torture/pr37922.C   -O2  execution test
> g++.dg/torture/pr37922.C   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> g++.dg/torture/pr37922.C   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
> g++.dg/torture/pr37922.C   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> g++.dg/torture/pr37922.C   -O3 -g  execution test
> g++.dg/torture/pr37922.C   -Os  execution test
> g++.dg/torture/pr41775.C   -O1  (test for excess errors)
> g++.dg/torture/pr41775.C   -O2  (test for excess errors)
> g++.dg/torture/pr41775.C   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> g++.dg/torture/pr41775.C   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
> g++.dg/torture/pr41775.C   -O3 -g  (test for excess errors)
> g++.dg/torture/pr41775.C   -Os  (test for excess errors)

I am not aware of any targets except msp430 where the value returned
by GET_MODE_BITSIZE and the types' precision are different, hence
despite these changes being in a target-independent part of the
compiler I wouldn't expect any change in behaviour for other targets.
If this patch is acceptable, I would appreciate it if someone could
commit it for me, as I don't have write access to the SVN repository.


0001-Use-GET_MODE_BITSIZE-when-setting-TYPE_SIZE.patch
Description: Binary data

[PATCH][RFC] Enable -fstrict-overflow by default

2017-04-24 Thread Richard Biener


The following makes signed overflow undefined for all (non-)optimization
levels.  The intent is to remove -fno-strict-overflow signed overflow
behavior as that is not a sensible option to the user (it ends up
with the worst of both -fwrapv and -fno-wrapv).  The implementation
details need to be preserved for the foreseeable future to not wreck
UBSAN with either associating (-fwrapv behavior) or optimizing
(-fno-wrapv behavior).
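
For readers less familiar with what "signed overflow undefined" means for user code, a small sketch follows.  The function names are made up for illustration; the point is only that overflow checks should not rely on wrapping unless -fwrapv is in effect.

```cpp
#include <cassert>
#include <climits>

// With signed overflow undefined, a compiler may fold `x + 1 > x` to true,
// so a check written that way is unreliable.  It is shown only as a comment:
//   bool next_is_larger (int x) { return x + 1 > x; }

// Well-defined alternative: compare against the limit before adding.
bool can_increment (int x)
{
  return x < INT_MAX;
}

// Wrapping addition that does not depend on -fwrapv: add in unsigned
// arithmetic (which wraps by definition) and convert back.  The conversion
// back to int is implementation-defined before C++20; GCC defines it as
// reduction modulo 2^N.
int wrapping_add (int a, int b)
{
  return static_cast<int> (static_cast<unsigned> (a)
                           + static_cast<unsigned> (b));
}
```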

The other choice would be to make -fwrapv the default for -O[01].

A second patch in this series would unify -f[no-]wrapv, -f[no-]trapv
and -f[no-]strict-overflow with a 
-fsigned-integer-overflow={undefined,wrapping,trapping[,sanitized]}
option, making conflicts amongst the options explicit (and reduce
the number of flag_ variables).  'sanitized' would essentially map
to today's flag_strict_overflow = 0.  There's another sole user
of flag_strict_overflow, POINTER_TYPE_OVERFLOW_UNDEFINED - not sure
what to do about that, apart from exposing it as different flag
altogether.

Further patches in the series would remove -Wstrict-overflow (and
cleanup VRP for example).

Anyway, the most controversial part(?) is below.

Any comments on this particular patch (and the overall proposal)?

Cleaning up the options is probably a no-brainer anyways.

Thanks,
Richard.

2017-04-24  Richard Biener  

* common.opt (fstrict-overflow): Enable by default.
* opts.c (default_options_table): Remove OPT_fstrict_overflow entry.

Index: gcc/common.opt
===
--- gcc/common.opt  (revision 247091)
+++ gcc/common.opt  (working copy)
@@ -2342,7 +2342,7 @@ Common Report Var(flag_strict_aliasing)
 Assume strict aliasing rules apply.
 
 fstrict-overflow
-Common Report Var(flag_strict_overflow) Optimization
+Common Report Var(flag_strict_overflow) Init(1) Optimization
 Treat signed overflow as undefined.
 
 fsync-libcalls
Index: gcc/opts.c
===
--- gcc/opts.c  (revision 247091)
+++ gcc/opts.c  (working copy)
@@ -496,7 +496,6 @@ static const struct default_options defa
 { OPT_LEVELS_2_PLUS, OPT_fschedule_insns2, NULL, 1 },
 #endif
 { OPT_LEVELS_2_PLUS, OPT_fstrict_aliasing, NULL, 1 },
-{ OPT_LEVELS_2_PLUS, OPT_fstrict_overflow, NULL, 1 },
 { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_freorder_blocks_algorithm_, NULL,
   REORDER_BLOCKS_ALGORITHM_STC },
 { OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 },

Re: [PATCH] [MSP430] [STAGE 1] Fix PR78849: ICE on initialization of global struct containing __int20 array

2017-04-24 Thread Jozef Lawrynowicz

On 24 April 2017 at 12:02, Jozef Lawrynowicz  wrote:
> The patch passed bootstrap and regression testing with no regressions
> on recent trunk (r247020) for x86_64-pc-linux-gnu.
> The patch passed regression testing with "-mcpu=msp430x/-mlarge" for
> msp430-elf with no regressions, and fixed some failures in the gcc and
> g++ testsuites:

Building and testing GCC for msp430-elf was done on the gcc-6-branch,
since trunk doesn't build with C++ enabled for msp430-elf.

[PATCH] PR libstdc++/80493 fix invalid exception specification

2017-04-24 Thread Jonathan Wakely


This was fixed for std::optional, but not the TS version, so Clang
rejects it.
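
A hedged sketch of the pattern, for illustration only: Box and Throwing are hypothetical types, and the standard C++17 trait std::is_nothrow_swappable is used here in place of the library-internal __is_nothrow_swappable from the patch.  As I understand the problem, an unqualified swap(...) inside the class's exception specification finds the member swap first, so the expression is not the two-argument swap that was intended.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <typename T>
struct Box
{
  T value;

  // Inside the class an unqualified `swap` in the exception specification
  // would find this member first, so ask the trait instead.
  void swap (Box &other)
    noexcept (std::is_nothrow_move_constructible<T>::value
              && std::is_nothrow_swappable<T>::value)
  {
    using std::swap;
    swap (value, other.value);
  }
};

// A type whose move operations may throw, to show the specification
// evaluating to false.
struct Throwing
{
  Throwing () = default;
  Throwing (Throwing &&) {}
  Throwing &operator= (Throwing &&) { return *this; }
};
```

With this, noexcept(a.swap(b)) is true for Box<int> and false for Box<Throwing>.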

PR libstdc++/80493
* include/experimental/optional (optional::swap): Fix exception
specification.

Tested powerpc64le-linux, committed to trunk.


commit 1d7d6db208e6371f00f8aecd77eb1a78a6fbe578
Author: Jonathan Wakely 
Date:   Mon Apr 24 12:15:13 2017 +0100

PR libstdc++/80493 fix invalid exception specification

PR libstdc++/80493
* include/experimental/optional (optional::swap): Fix exception
specification.

diff --git a/libstdc++-v3/include/experimental/optional 
b/libstdc++-v3/include/experimental/optional
index 197a1fc..4a1e71d 100644
--- a/libstdc++-v3/include/experimental/optional
+++ b/libstdc++-v3/include/experimental/optional
@@ -690,7 +690,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   void
   swap(optional& __other)
   noexcept(is_nothrow_move_constructible<_Tp>()
-   && noexcept(swap(declval<_Tp&>(), declval<_Tp&>())))
+   && __is_nothrow_swappable<_Tp>::value)
   {
 using std::swap;

[PATCH] PR libstdc++/80504 qualify calls to avoid ADL

2017-04-24 Thread Jonathan Wakely


This has always been wrong, since 2004 when the overloads were added
to .

PR libstdc++/80504
* include/bits/refwrap.h (ref, cref): Qualify calls.
* testsuite/20_util/reference_wrapper/80504.cc: New test.

Tested powerpc64le-linux, committed to trunk. I'll backport it too.

commit e284d7bff062e16b4e5b9c27f08e1e401840f867
Author: Jonathan Wakely 
Date:   Mon Apr 24 12:21:34 2017 +0100

PR libstdc++/80504 qualify calls to avoid ADL

PR libstdc++/80504
* include/bits/refwrap.h (ref, cref): Qualify calls.
* testsuite/20_util/reference_wrapper/80504.cc: New test.

diff --git a/libstdc++-v3/include/bits/refwrap.h 
b/libstdc++-v3/include/bits/refwrap.h
index 124ee97..786087e 100644
--- a/libstdc++-v3/include/bits/refwrap.h
+++ b/libstdc++-v3/include/bits/refwrap.h
@@ -361,17 +361,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
 void cref(const _Tp&&) = delete;
 
-  /// Partial specialization.
+  /// std::ref overload to prevent wrapping a reference_wrapper
   template<typename _Tp>
 inline reference_wrapper<_Tp>
 ref(reference_wrapper<_Tp> __t) noexcept
-{ return ref(__t.get()); }
+{ return __t; }
 
-  /// Partial specialization.
+  /// std::cref overload to prevent wrapping a reference_wrapper
   template<typename _Tp>
 inline reference_wrapper<const _Tp>
 cref(reference_wrapper<_Tp> __t) noexcept
-{ return cref(__t.get()); }
+{ return { __t.get() }; }
 
   // @} group functors
 
diff --git a/libstdc++-v3/testsuite/20_util/reference_wrapper/80504.cc 
b/libstdc++-v3/testsuite/20_util/reference_wrapper/80504.cc
new file mode 100644
index 000..727a560
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/reference_wrapper/80504.cc
@@ -0,0 +1,34 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do compile { target c++11 } }
+
+#include <functional>
+
+namespace X {
+  struct Y { };
+  template<typename T> void ref(T) { }
+  template<typename T> void cref(T) { }
+}
+
+int main()
+{
+  X::Y i;
+  std::reference_wrapper<X::Y> r(i);
+  ref(r);
+  cref(r);
+}

Re: Fix -fdump-ipa-all ICE

2017-04-24 Thread Martin Liška

On 04/24/2017 12:51 PM, Jakub Jelinek wrote:
> No testcase in the patch?

As Honza is busy right now, I'm sending one.

Martin
>From 87cef5e3123723f81c44dfafe86fa10b7925cea8 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 24 Apr 2017 14:02:54 +0200
Subject: [PATCH] Add new test-case.

gcc/testsuite/ChangeLog:

2017-04-24  Martin Liska  

	* g++.dg/ipa/pr79931.C: New test.
---
 gcc/testsuite/g++.dg/ipa/pr79931.C | 24 
 1 file changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr79931.C

diff --git a/gcc/testsuite/g++.dg/ipa/pr79931.C b/gcc/testsuite/g++.dg/ipa/pr79931.C
new file mode 100644
index 000..78f6e03c458
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr79931.C
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-all" } */
+
+class DocumentImpl;
+struct NodeImpl
+{
+  virtual DocumentImpl * getOwnerDocument();
+  virtual NodeImpl * getParentNode();
+  virtual NodeImpl * removeChild(NodeImpl *oldChild);
+};
+struct AttrImpl : NodeImpl
+{
+  NodeImpl *insertBefore(NodeImpl *newChild, NodeImpl *refChild);
+};
+struct DocumentImpl : NodeImpl
+{
+  virtual NodeImpl *removeChild(NodeImpl *oldChild);
+  virtual int* getRanges();
+};
+NodeImpl *AttrImpl::insertBefore(NodeImpl *newChild, NodeImpl *refChild) {
+  NodeImpl *oldparent = newChild->getParentNode();
+  oldparent->removeChild(newChild);
+  this->getOwnerDocument()->getRanges();
+}
-- 
2.12.2

Re: Fix -fdump-ipa-all ICE

2017-04-24 Thread Jan Hubicka

> On 04/24/2017 12:51 PM, Jakub Jelinek wrote:
> > No testcase in the patch?
> 
> As Honza is busy right now, I'm sending one.
Thanks (in fact I just forgot to include it and was about to send it now) but
help is welcome!

Honza
> 
> Martin

> >From 87cef5e3123723f81c44dfafe86fa10b7925cea8 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Mon, 24 Apr 2017 14:02:54 +0200
> Subject: [PATCH] Add new test-case.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-04-24  Martin Liska  
> 
>   * g++.dg/ipa/pr79931.C: New test.
> ---
>  gcc/testsuite/g++.dg/ipa/pr79931.C | 24 
>  1 file changed, 24 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr79931.C
> 
> diff --git a/gcc/testsuite/g++.dg/ipa/pr79931.C 
> b/gcc/testsuite/g++.dg/ipa/pr79931.C
> new file mode 100644
> index 000..78f6e03c458
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ipa/pr79931.C
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-ipa-all" } */
> +
> +class DocumentImpl;
> +struct NodeImpl
> +{
> +  virtual DocumentImpl * getOwnerDocument();
> +  virtual NodeImpl * getParentNode();
> +  virtual NodeImpl * removeChild(NodeImpl *oldChild);
> +};
> +struct AttrImpl : NodeImpl
> +{
> +  NodeImpl *insertBefore(NodeImpl *newChild, NodeImpl *refChild);
> +};
> +struct DocumentImpl : NodeImpl
> +{
> +  virtual NodeImpl *removeChild(NodeImpl *oldChild);
> +  virtual int* getRanges();
> +};
> +NodeImpl *AttrImpl::insertBefore(NodeImpl *newChild, NodeImpl *refChild) {
> +  NodeImpl *oldparent = newChild->getParentNode();
> +  oldparent->removeChild(newChild);
> +  this->getOwnerDocument()->getRanges();
> +}
> -- 
> 2.12.2
>

Re: [Patch, Fortran] PR 80121: Memory leak with derived-type intent(out) argument

2017-04-24 Thread Bernhard Reutner-Fischer

On 24 April 2017 10:56:57 CEST, Janus Weil  wrote:
>Hi Christophe,
>
>2017-04-24 10:25 GMT+02:00 Christophe Lyon
>:
> the patch in the attachment fixes a memory leak by
>auto-deallocating
> the allocatable components of an allocatable intent(out) argument.
>
> Regtests cleanly on x86_64-linux-gnu. Ok for trunk?

 OK for trunk.
>>>
>>> thanks for the review! Committed as r247083.
>>
>> This patch causes an error message from DejaGnu:
>> (DejaGnu) proc "cleanup-tree-dump original" does not exist.
>
>thanks for letting me know. I didn't notice that ...
>
>
>> I'm not familiar with fortran, so I'm not sure it is as obvious as
>> removing cleanup-tree-dump as it is done in the other neighboring
>tests?
>
>Yes, probably it should just be removed. I assume this kind of cleanup
>is being done automatically now? I actually took it from this wiki
>page:

Yes it is done automatically nowadays.
>
>https://gcc.gnu.org/wiki/TestCaseWriting
>
>So I guess this needs to be updated as well. Will take care of both
>points tonight ...

Obviously I did not think about updating the wiki back then.
TIA for fixing the wiki.

Cheers,

Re: Fix -fdump-ipa-all ICE

2017-04-24 Thread Jakub Jelinek

On Mon, Apr 24, 2017 at 02:12:31PM +0200, Jan Hubicka wrote:
> > On 04/24/2017 12:51 PM, Jakub Jelinek wrote:
> > > No testcase in the patch?
> > 
> > As Honza is busy right now, I'm sending one.
> Thanks (in fact I just forgot to include it and was about to send it now) but
> help is welcome!

Ok for release branches.

Jakub

Re: Let tree_single_nonzero_warnv_p use VRP

2017-04-24 Thread Marc Glisse


On Mon, 24 Apr 2017, Jakub Jelinek wrote:


On Mon, Apr 24, 2017 at 09:41:01AM +0200, Richard Biener wrote:

On Sun, Apr 23, 2017 at 11:38 PM, Marc Glisse  wrote:

Hello,

this patch teaches tree_expr_nonzero_warnv_p to handle SSA_NAME using
range information and known (non-)zero bits, by delegating to
expr_not_equal_to which already knows how to handle all that.

This makes one strict overflow warning disappear. It isn't particularly
surprising, since the new code makes tree_expr_nonzero_warnv_p return true
without warning (we do not remember if the range information was obtained
using strict overflow). In my opinion, improving code generation is more
important than this specific warning.

Bootstrap+regtest on powerpc64le-unknown-linux-gnu.


Hmm, I think you need to guard this with a INTEGRAL_TYPE_P check
given the comment on tree_single_nonzero_warnv_p also talks about
FP.


The SSA_NAME case in expr_not_equal_to starts with

  if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
return false;

Do you still want the extra check in tree_single_nonzero_warnv_p before 
calling expr_not_equal_to?



I vaguely remember there were issues with that, because VRP uses
the *_nonzero_warnv* functions to compute the ranges and now those
functions would use range info.  But it has been some time ago and maybe this
patch is different enough from what I've been trying back then.


I hope it is safe, but we'll see.


So just please watch carefully for any fallout.


Ok.

--
Marc Glisse

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immediate logical shifts

2017-04-24 Thread Allan Sandfeld Jensen

On Monday 24 April 2017, Jakub Jelinek wrote:
> On Mon, Apr 24, 2017 at 09:33:09AM +0200, Allan Sandfeld Jensen wrote:
> > --- a/gcc/config/i386/avx2intrin.h
> > +++ b/gcc/config/i386/avx2intrin.h
> > @@ -667,7 +667,7 @@ extern __inline __m256i
> > 
> >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> >  _mm256_slli_epi16 (__m256i __A, int __B)
> >  {
> > 
> > -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> > +  return ((__B & 0xff) < 16) ? (__m256i)((__v16hi)__A << (__B & 0xff)) :
> > _mm256_setzero_si256();
> > 
> >  }
> 
> What is the advantage of doing that when you replace one operation with
> several (&, <, ?:, <<)?
> I'd say instead we should fold the builtins if in the gimple fold target
> hook we see the shift count constant and can decide based on that.
> Or we could use __builtin_constant_p (__B) to decide whether to use
> the generic vector shifts or builtin, but that means larger IL.
> 
Okay, I have tried that, and I also made it more obvious how the intrinsics
can become non-immediate shifts.
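
The guard used in the patch below can be sketched portably.  This is only a scalar model with hypothetical names (Lanes, shift_left_lanes, shift_left_lanes_imm), not the intrinsics themselves: it mirrors the "(unsigned int)__B < 16" test, where a count outside the lane width yields zero instead of being undefined as a plain C++ shift would be.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::uint16_t, 8>;

// Model of a per-lane logical left shift on 16-bit lanes: a count of 16 or
// more gives all-zero lanes rather than undefined behaviour.
Lanes shift_left_lanes (const Lanes &a, unsigned count)
{
  Lanes result{};             // value-initialized, i.e. all zero
  if (count >= 16)
    return result;
  for (std::size_t i = 0; i < a.size (); ++i)
    result[i] = static_cast<std::uint16_t> (a[i] << count);
  return result;
}

// Signed-count entry point: converting to unsigned turns negative counts
// into very large ones, so a single comparison rejects both cases.
Lanes shift_left_lanes_imm (const Lanes &a, int count)
{
  return shift_left_lanes (a, static_cast<unsigned> (count));
}
```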

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b58f5050db0..b9406550fc5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2017-04-22  Allan Sandfeld Jensen  
+
+	* config/i386/emmintrin.h (_mm_slli_*, _mm_srli_*):
+	Use vector intrinsics instead of builtins.
+	* config/i386/avx2intrin.h (_mm256_slli_*, _mm256_srli_*):
+	Use vector intrinsics instead of builtins.
+
 2017-04-21  Uros Bizjak  
 
 	* config/i386/i386.md (*extzvqi_mem_rex64): Move above *extzv.
diff --git a/gcc/config/i386/avx2intrin.h b/gcc/config/i386/avx2intrin.h
index 82f170a3d61..64ba52b244e 100644
--- a/gcc/config/i386/avx2intrin.h
+++ b/gcc/config/i386/avx2intrin.h
@@ -665,13 +665,6 @@ _mm256_slli_si256 (__m256i __A, const int __N)
 
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_slli_epi16 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_sll_epi16 (__m256i __A, __m128i __B)
 {
   return (__m256i)__builtin_ia32_psllw256((__v16hi)__A, (__v8hi)__B);
@@ -679,9 +672,11 @@ _mm256_sll_epi16 (__m256i __A, __m128i __B)
 
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_slli_epi32 (__m256i __A, int __B)
+_mm256_slli_epi16 (__m256i __A, int __B)
 {
-  return (__m256i)__builtin_ia32_pslldi256 ((__v8si)__A, __B);
+  if (__builtin_constant_p(__B))
+return ((unsigned int)__B < 16) ? (__m256i)((__v16hi)__A << __B) : _mm256_setzero_si256();
+  return _mm256_sll_epi16(__A, _mm_cvtsi32_si128(__B));
 }
 
 extern __inline __m256i
@@ -693,9 +688,11 @@ _mm256_sll_epi32 (__m256i __A, __m128i __B)
 
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_slli_epi64 (__m256i __A, int __B)
+_mm256_slli_epi32 (__m256i __A, int __B)
 {
-  return (__m256i)__builtin_ia32_psllqi256 ((__v4di)__A, __B);
+  if (__builtin_constant_p(__B))
+return ((unsigned int)__B < 32) ? (__m256i)((__v8si)__A << __B) : _mm256_setzero_si256();
+  return _mm256_sll_epi32(__A, _mm_cvtsi32_si128(__B));
 }
 
 extern __inline __m256i
@@ -707,6 +704,15 @@ _mm256_sll_epi64 (__m256i __A, __m128i __B)
 
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_slli_epi64 (__m256i __A, int __B)
+{
+  if (__builtin_constant_p(__B))
+return ((unsigned int)__B < 64) ? (__m256i)((__v4di)__A << __B) : _mm256_setzero_si256();
+  return _mm256_sll_epi64(__A, _mm_cvtsi32_si128(__B));
+}
+
+extern __inline __m256i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_srai_epi16 (__m256i __A, int __B)
 {
   return (__m256i)__builtin_ia32_psrawi256 ((__v16hi)__A, __B);
@@ -756,13 +762,6 @@ _mm256_srli_si256 (__m256i __A, const int __N)
 
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srli_epi16 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_psrlwi256 ((__v16hi)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_srl_epi16 (__m256i __A, __m128i __B)
 {
   return (__m256i)__builtin_ia32_psrlw256((__v16hi)__A, (__v8hi)__B);
@@ -770,9 +769,11 @@ _mm256_srl_epi16 (__m256i __A, __m128i __B)
 
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srli_epi32 (__m256i __A, int __B)
+_mm256_srli_epi16 (__m256i __A, int __B)
 {
-  return (__m256i)__builtin_ia32_psrldi256 ((__v8si)__A, __B);
+  if (__builtin_constant_p(__B))
+return ((unsigned int)__B < 16) ? (__m256i)((__v16hu)__A >> __B) : _mm256_setzero_si256();
+  return _mm256_srl_epi16(__A, _mm_cvtsi32_si128(__B));
 }
 
 extern __inline __m256i
@@ -784,9 +785,11 @@ _mm256_srl_epi32 (__m256i __A, __m128i __B)
 
 extern __inline __m256i
 __attribut

[PATCH] PR libstdc++/80506 fix constant used in condition

2017-04-24 Thread Jonathan Wakely


We use the wrong constant for the Marsaglia Tsang algorithm.

PR libstdc++/80506
* include/bits/random.tcc (gamma_distribution::operator()): Fix magic
number used in loop condition.

Tested powerpc64le-linux, committed to trunk.
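
For context, a minimal standalone sketch of the Marsaglia-Tsang rejection sampler, showing where the 0.0331 constant sits.  This is not the libstdc++ implementation; sample_gamma is a hypothetical name and the sketch assumes a shape parameter alpha >= 1 with unit scale.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Marsaglia-Tsang sampler for Gamma(alpha, 1), assuming alpha >= 1.
template <typename Engine>
double sample_gamma (double alpha, Engine &rng)
{
  assert (alpha >= 1.0);
  std::normal_distribution<double> normal (0.0, 1.0);
  std::uniform_real_distribution<double> uniform (0.0, 1.0);

  const double d = alpha - 1.0 / 3.0;
  const double c = 1.0 / std::sqrt (9.0 * d);

  for (;;)
    {
      double x, v;
      do
        {
          x = normal (rng);
          v = 1.0 + c * x;
        }
      while (v <= 0.0);
      v = v * v * v;

      const double u = uniform (rng);
      // Cheap "squeeze" test using the constant 0.0331.
      if (u < 1.0 - 0.0331 * x * x * x * x)
        return d * v;
      // Exact acceptance test.
      if (std::log (u) < 0.5 * x * x + d * (1.0 - v + std::log (v)))
        return d * v;
    }
}
```

The loop condition in the patch is the negation of these two acceptance tests: it keeps sampling while both fail.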


commit aa4da6523b7bfab7ba92c2e9e505155e1ce432a7
Author: Jonathan Wakely 
Date:   Mon Apr 24 13:10:49 2017 +0100

PR libstdc++/80506 fix constant used in condition

PR libstdc++/80506
* include/bits/random.tcc (gamma_distribution::operator()): Fix magic
number used in loop condition.

diff --git a/libstdc++-v3/include/bits/random.tcc 
b/libstdc++-v3/include/bits/random.tcc
index df05ebe..63d1c02 100644
--- a/libstdc++-v3/include/bits/random.tcc
+++ b/libstdc++-v3/include/bits/random.tcc
@@ -2356,7 +2356,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__v = __v * __v * __v;
__u = __aurng();
  }
-   while (__u > result_type(1.0) - 0.331 * __n * __n * __n * __n
+   while (__u > result_type(1.0) - 0.0331 * __n * __n * __n * __n
   && (std::log(__u) > (0.5 * __n * __n + __a1
* (1.0 - __v + std::log(__v)))));

Re: [PATCH v5] S/390: Optimize atomic_compare_exchange and atomic_compare builtins.

2017-04-24 Thread Ulrich Weigand

Dominik Vogt wrote:
> On Mon, Mar 27, 2017 at 09:27:35PM +0100, Dominik Vogt wrote:
> > The attached patch optimizes the atomic_exchange and
> > atomic_compare patterns on s390 and s390x (mostly limited to
> > SImode and DImode).  Among general optimization, the changes fix
> > most of the problems reported in PR 80080:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080
> > 
> > Bootstrapped and regression tested on a zEC12 with s390 and s390x
> > biarch.
> 
> v5:
>   * Generate LT pattern directly for const 0 value.
>   * Split into three patches.
> 
> Bootstrapped and regression tested on a zEC12 with s390 and s390x
> biarch.

> gcc/ChangeLog-dv-atomic-gcc7-1
> 
>   * config/s390/s390.md ("cstorecc4"): Use load-on-condition and deal
>   with CCZmode for TARGET_Z196.

> gcc/ChangeLog-dv-atomic-gcc7-2
> 
>   * config/s390/s390.md (define_peephole2): New peephole to help
>   combining the load-and-test pattern with volatile memory.

> gcc/ChangeLog-dv-atomic-gcc7-3
> 
>   * s390-protos.h (s390_expand_cs_hqi): Removed.
>   (s390_expand_cs, s390_expand_atomic_exchange_tdsi): New prototypes.
>   * config/s390/s390.c (s390_emit_compare_and_swap): Handle all integer
>   modes as well as CCZ1mode and CCZmode.
>   (s390_expand_atomic_exchange_tdsi, s390_expand_atomic): Adapt to new
>   signature of s390_emit_compare_and_swap.
>   (s390_expand_cs_hqi): Likewise, make static.
>   (s390_expand_cs_tdsi): Generate an explicit compare before trying
>   compare-and-swap, in some cases.
>   (s390_expand_cs): Wrapper function.
>   (s390_expand_atomic_exchange_tdsi): New backend specific expander for
>   atomic_exchange.
>   (s390_match_ccmode_set): Allow CCZmode <-> CCZ1 mode.
>   * config/s390/s390.md ("atomic_compare_and_swap"): Merge the
>   patterns for small and large integers.  Forbid symref memory operands.
>   Move expander to s390.c.  Require cc register.
>   ("atomic_compare_and_swap_internal")
>   ("*atomic_compare_and_swap_1")
>   ("*atomic_compare_and_swapdi_2")
>   ("*atomic_compare_and_swapsi_3"): Use s_operand to forbid
>   symref memory operands.  Remove CC mode and call s390_match_ccmode
>   instead.
>   ("atomic_exchange"): Allow and implement all integer modes.
>
> gcc/testsuite/ChangeLog-dv-atomic-gcc7
> 
>   * gcc.target/s390/md/atomic_compare_exchange-1.c: New test.
>   * gcc.target/s390/md/atomic_compare_exchange-1.inc: New test.
>   * gcc.target/s390/md/atomic_exchange-1.inc: New test.


These all look good to me now.

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com

[PATCH] Fix test-case on ppc64le (PR testsuite/79455).

2017-04-24 Thread Martin Liška

Hi.

Test-case mentioned in the PR looks as follows on ppc64le:

WARNING: ThreadSanitizer: data race (pid=45910)
  Atomic read of size 1 at 0x10020200 by thread T2:
#0 pthread_mutex_lock ../../../../libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:3608 (libtsan.so.0+0x00044724)
#1 Thread2(void*) /home/marxin/Programming/gcc/gcc/testsuite/c-c++-common/tsan/race_on_mutex.c:22 (race_on_mutex.exe+0x10001110)

  Previous write of size 8 at 0x10020200 by thread T1:
#0 memset ../../../../libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:558 (libtsan.so.0+0x00036194)
#1 __pthread_mutex_init  (libpthread.so.0+0xadcc)
#2 Thread1(void*) /home/marxin/Programming/gcc/gcc/testsuite/c-c++-common/tsan/race_on_mutex.c:12 (race_on_mutex.exe+0x1fe4)

compared to what's on x86_64-linux-gnu:

WARNING: ThreadSanitizer: data race (pid=8917)
  Atomic read of size 1 at 0x00602100 by thread T2:
#0 pthread_mutex_lock ../../../../libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:3608 (libtsan.so.0+0x0003bc1f)
#1 Thread2(void*) /home/marxin/Programming/gcc2/gcc/testsuite/c-c++-common/tsan/race_on_mutex.c:22 (race_on_mutex.exe+0x00400f1c)

  Previous write of size 1 at 0x00602100 by thread T1:
#0 pthread_mutex_init ../../../../libsanitizer/tsan/tsan_interceptors.cc:1117 (libtsan.so.0+0x0002bf5e)
#1 Thread1(void*) /home/marxin/Programming/gcc2/gcc/testsuite/c-c++-common/tsan/race_on_mutex.c:12 (race_on_mutex.exe+0x00400e99)

Bill suggested disabling memset builtin expansion, but that won't help, as the
back-trace leads into libpthread.so.
Thus I'm making the scanned pattern more generic.
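
As a quick sanity check of the relaxed patterns, here is a sketch that runs them through std::regex.  That is an assumption for illustration: DejaGnu matches dg-output with Tcl regular expressions, not ECMAScript ones, and output_matches is a hypothetical helper.  The pattern strings are copied from the dg-output directives in the patch below.

```cpp
#include <regex>
#include <string>

// Patterns from the dg-output directives before and after the change.
const char *const old_write_pattern
  = "  Previous write of size 1 at .* by thread T1:";
const char *const new_write_pattern
  = "  Previous write of size . at .* by thread T1:";
const char *const old_frame_pattern = "#0 pthread_mutex_init .* (.)*";
const char *const new_frame_pattern = "#. .*pthread_mutex_init .* (.)*";

// Return true if PATTERN matches anywhere in LINE.
bool output_matches (const std::string &pattern, const std::string &line)
{
  return std::regex_search (line, std::regex (pattern));
}
```

Fed the ppc64le lines quoted above, the old patterns fail on "size 8" and on frame "#1 __pthread_mutex_init", while the new ones accept both the ppc64le and the x86_64 output.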

Tested on both x86_64-linux-gnu and ppc64le-linux-gnu.

Ready for trunk?
Thanks,
Martin
>From 2a9a475169bc95f147d67df343458fc0f7069512 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 24 Apr 2017 14:59:18 +0200
Subject: [PATCH] Fix test-case on ppc64le (PR testsuite/79455).

gcc/testsuite/ChangeLog:

2017-04-24  Martin Liska  

	* c-c++-common/tsan/race_on_mutex.c: Make the scanned pattern
	more generic.
---
 gcc/testsuite/c-c++-common/tsan/race_on_mutex.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/tsan/race_on_mutex.c b/gcc/testsuite/c-c++-common/tsan/race_on_mutex.c
index ae30d053c92..80c193789d7 100644
--- a/gcc/testsuite/c-c++-common/tsan/race_on_mutex.c
+++ b/gcc/testsuite/c-c++-common/tsan/race_on_mutex.c
@@ -40,6 +40,6 @@ int main() {
 /* { dg-output "  Atomic read of size 1 at .* by thread T2:(\n|\r\n|\r)" } */
 /* { dg-output "#0 pthread_mutex_lock.*" } */
 /* { dg-output "#1 Thread2.* .*(race_on_mutex.c:22|\\?{2}:0) (.*)" } */
-/* { dg-output "  Previous write of size 1 at .* by thread T1:(\n|\r\n|\r)" } */
-/* { dg-output "#0 pthread_mutex_init .* (.)*" } */
-/* { dg-output "#1 Thread1.* .*(race_on_mutex.c:12|\\?{2}:0) .*" } */
+/* { dg-output "  Previous write of size . at .* by thread T1:(\n|\r\n|\r)" } */
+/* { dg-output "#. .*pthread_mutex_init .* (.)*" } */
+/* { dg-output "#. Thread1.* .*(race_on_mutex.c:12|\\?{2}:0) .*" } */
-- 
2.12.2

Re: [RFC, testsuite] Add dg-save-linenr

2017-04-24 Thread David Malcolm

On Sat, 2017-04-22 at 19:49 +0200, Tom de Vries wrote:
> Hi,
> 
> there are currently two types of line number supported in
> dg-{error,warning,message,bogus} directives: absolute and relative.
> With an absolute line number, it's immediately clear what line number
> is meant, but when a line is added at the start of the file, the line
> number needs to be updated.  With a relative line number, that problem
> is solved, but when relative line numbers become large, it becomes less
> clear what line it refers to, and when adding a line in between the
> directive using the relative line number and the line it refers to, the
> relative line number still needs to be updated.
> 
> This patch adds a directive dg-save-linenr with argument varname, that
> saves the line number of the directive in a variable varname, which can
> be used as line number in dg directives.
> 
> Testing status:
> - tested updated test-case objc.dg/try-catch-12.m
> - ran tree-ssa.exp
> 
> RFC:
> - good idea?

Excellent idea; thanks!  There are various places where I'd find this
useful.

> - naming of directive dg-save-linenr (dg-linenr, dg-save-line-nr,
>dg-save-lineno, dg-save-line-number, etc)

How about just "dg-line"?  (if it's not already taken)
or "dg-name-line" / "dg-named-line" ?
in that the directive is effectively giving the line a name, giving:

[...]

extern void some_func (int *); /* { dg-line some_func_decl } */

[...]

  /* { dg-message "but argument is of type" "" { target *-*-* }
some_func_decl } */



> - allowed variable names (currently: start with letter, followed by
>alphanumerical or underscore)

Seems reasonable; lack of leading digit allows it to be distinguished
from absolute and relative numbers.
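
A minimal sketch of that distinction, written in C++ rather than the Tcl the testsuite actually uses. It assumes absolute line numbers are plain digits, relative ones are "." optionally followed by +N or -N, and names follow the rule quoted above; the enum and function names are hypothetical:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum class LineSpec { Absolute, Relative, Named, Invalid };

// Classify the line argument of a dg directive by its leading character.
LineSpec classify_line_spec(const std::string& s) {
  auto is_digit = [](char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
  };
  // True if s[from..] is non-empty and consists of digits only.
  auto all_digits = [&](std::size_t from) {
    if (from >= s.size()) return false;
    for (std::size_t i = from; i < s.size(); ++i)
      if (!is_digit(s[i])) return false;
    return true;
  };

  if (s.empty()) return LineSpec::Invalid;

  // Absolute: digits only, e.g. "42".
  if (all_digits(0)) return LineSpec::Absolute;

  // Relative: "." alone, or ".+N" / ".-N".
  if (s[0] == '.') {
    if (s.size() == 1) return LineSpec::Relative;
    if ((s[1] == '+' || s[1] == '-') && all_digits(2))
      return LineSpec::Relative;
    return LineSpec::Invalid;
  }

  // Named: a letter followed by alphanumerics or underscores.
  if (std::isalpha(static_cast<unsigned char>(s[0]))) {
    for (std::size_t i = 1; i < s.size(); ++i) {
      unsigned char c = static_cast<unsigned char>(s[i]);
      if (!std::isalnum(c) && c != '_') return LineSpec::Invalid;
    }
    return LineSpec::Named;
  }
  return LineSpec::Invalid;
}
```

Because a name can never start with a digit or a dot, the three forms cannot be confused, which is also why a sigil is not strictly needed.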

> - should we use a prefix symbol or some such when the variable is used
>    (and possibly defined as well)? F.i.:
>    /* { dg-save-linenr %some_func_decl } */
>    /* { dg-message "but argument is of type" "" { target *-*-* }
>       %some_func_decl } */

These are sometimes called "sigils".

I'd prefer not.

> - error message formulation

Nit: the new function should have a leading comment, explaining the
usage.


Thanks again
Dave

Re: [C++ PATCH] Fix-it info for duplicate tokens

2017-04-24 Thread Nathan Sidwell


On 04/21/2017 12:01 PM, Volker Reichelt wrote:

Hi,

the following patch adds fix-it info to error messages in 4 places
where the C++ parser complains about duplicate tokens.
The fix-it infos suggest to remove the duplicates.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

OK for trunk?



ok


--
Nathan Sidwell

Re: [PATCH] Simplify quoting in diagnostics of C++ frontend

2017-04-24 Thread Nathan Sidwell


On 04/23/2017 02:53 PM, Volker Reichelt wrote:

Hi,

the following patch simplifies quoting in diagnostics by using
%qD instead of %<%D%> etc.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

Btw, line 14563 in pt.c
   error ("enumerator value %E is outside the range of underlying "
contains an unquoted %E. Shouldn't that be replaced with %qE?


yes, please add that change too.


OK for trunk?


ok


--
Nathan Sidwell

Re: [C++ PATCH] Fix-it info for duplicate tokens

2017-04-24 Thread David Malcolm

On Fri, 2017-04-21 at 18:01 +0200, Volker Reichelt wrote:
> Hi,
> 
> the following patch adds fix-it info to error messages in 4 places
> where the C++ parser complains about duplicate tokens.
> The fix-it infos suggest to remove the duplicates.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu.
> 
> OK for trunk?

OK for trunk (with my "diagnostic messages" maintainer hat on).

Ideally we'd put a secondary range into each of these messages, via
   rich_loc.add_range (other_token_loc, false)
highlighting the other token, but I don't think the parser retains the
pertinent location_t information.


Thanks
Dave

> Regards,
> Volker
> 
> 
> 2017-04-21  Volker Reichelt  
> 
> * parser.c (cp_parser_cv_qualifier_seq_opt): Add fix-it info
> to
> error message.
> (cp_parser_virt_specifier_seq_opt): Likewise.
> (set_and_check_decl_spec_loc): Likewise twice.
> 
> Index: gcc/cp/parser.c
> ===
> --- gcc/cp/parser.c (revision 246880)
> +++ gcc/cp/parser.c (working copy)
> @@ -20258,7 +20258,9 @@
>  
>if (cv_quals & cv_qualifier)
> {
> - error_at (token->location, "duplicate cv-qualifier");
> + gcc_rich_location richloc (token->location);
> + richloc.add_fixit_remove ();
> + error_at_rich_loc (&richloc, "duplicate cv-qualifier");
>   cp_lexer_purge_token (parser->lexer);
> }
>else
> @@ -20405,7 +20407,9 @@
>  
>if (virt_specifiers & virt_specifier)
> {
> - error_at (token->location, "duplicate virt-specifier");
> + gcc_rich_location richloc (token->location);
> + richloc.add_fixit_remove ();
> + error_at_rich_loc (&richloc, "duplicate virt-specifier");
>   cp_lexer_purge_token (parser->lexer);
> }
>else
> @@ -27665,7 +27669,11 @@
> error_at (location,
>   "both %<__thread%> and %<thread_local%> specified");
>   else
> -   error_at (location, "duplicate %qD", token->u.value);
> +   {
> + gcc_rich_location richloc (location);
> + richloc.add_fixit_remove ();
> + error_at_rich_loc (&richloc, "duplicate %qD", token
> ->u.value);
> +   }
> }
>else
> {
> @@ -27686,8 +27694,9 @@
>  "constexpr",
> "__complex"
>   };
> - error_at (location,
> -   "duplicate %qs", decl_spec_names[ds]);
> + gcc_rich_location richloc (location);
> + richloc.add_fixit_remove ();
> + error_at_rich_loc (&richloc, "duplicate %qs",
> decl_spec_names[ds]);
> }
>  }
>  }
> 
> 2017-04-21  Volker Reichelt  
> 
> * g++.dg/diagnostic/duplicate1.C: New test.
> * g++.dg/cpp0x/duplicate1.C: New test.
> 
> Index: gcc/testsuite/g++.dg/diagnostic/duplicate1.C
> ===
> --- gcc/testsuite/g++.dg/diagnostic/duplicate1.C2017-04-21
> +++ gcc/testsuite/g++.dg/diagnostic/duplicate1.C2017-04-21
> @@ -0,0 +1,18 @@
> +// { dg-options "-fdiagnostics-show-caret" }
> +
> +struct A
> +{
> +  void foo() const const;  /* { dg-error "duplicate cv-qualifier" }
> +  { dg-begin-multiline-output "" }
> +   void foo() const const;
> +^
> +-
> +  { dg-end-multiline-output "" } */
> +};
> +
> +volatile volatile int i = 0;  /* { dg-error "duplicate" }
> +  { dg-begin-multiline-output "" }
> + volatile volatile int i = 0;
> +  ^~~~
> +  
> +  { dg-end-multiline-output "" } */
> Index: gcc/testsuite/g++.dg/cpp0x/duplicate1.C
> ===
> --- gcc/testsuite/g++.dg/cpp0x/duplicate1.C 2017-04-21
> +++ gcc/testsuite/g++.dg/cpp0x/duplicate1.C 2017-04-21
> @@ -0,0 +1,29 @@
> +// { dg-options "-fdiagnostics-show-caret" }
> +// { dg-do compile { target c++11 } }
> +
> +struct A
> +{
> +  virtual void foo() const;
> +};
> +
> +struct B final final : A  /* { dg-error "duplicate virt-specifier" }
> +  { dg-begin-multiline-output "" }
> + struct B final final : A
> +^
> +-
> +  { dg-end-multiline-output "" } */
> +{
> +  virtual void foo() const override final override;  /* { dg-error
> "duplicate virt-specifier" }
> +  { dg-begin-multiline-output "" }
> +   virtual void foo() const override final override;
> +   ^~~~
> +   
> +  { dg-end-multiline-output "" } */
> +};
> +
> +thread_local thread_local int i = 0;  /* { dg-error "duplicate" }
> +  { dg-begin-multiline-output "" }
> + thread_local thread_local int i = 0;
> +  ^~~~
> +  
> +  { dg-end-multiline-output "" } */
> ===
>

Re: X /[ex] 4 < Y /[ex] 4

2017-04-24 Thread Jeff Law


On 04/24/2017 12:24 AM, Marc Glisse wrote:

Hello,

we were missing this simplification on comparisons. Note that the testcase
still does not simplify as much as one might like, we don't turn x+z [...]

I'm all for improving our ability to analyze and generate good code for 
std::vector.  So any steps you take in this space are greatly appreciated.
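
As a sanity check of the identity in the subject line (X / 4 < Y / 4 iff X < Y when both divisions are exact), here is a small brute-force sketch. The function names and the choice of range are hypothetical, and a positive divisor is assumed:

```cpp
// Returns true if (x / d < y / d) == (x < y) holds for every pair of
// multiples of d in [-limit, limit], i.e. whenever both divisions are exact.
bool exact_div_compare_holds(int d, int limit) {
  for (int x = -limit; x <= limit; ++x) {
    if (x % d != 0) continue;  // keep only exact dividends
    for (int y = -limit; y <= limit; ++y) {
      if (y % d != 0) continue;
      if ((x / d < y / d) != (x < y)) return false;
    }
  }
  return true;
}

// The same check without the exactness filter: with truncating division
// the equivalence breaks, e.g. 5 / 4 < 6 / 4 is false although 5 < 6.
bool truncating_div_compare_holds(int d, int limit) {
  for (int x = -limit; x <= limit; ++x)
    for (int y = -limit; y <= limit; ++y)
      if ((x / d < y / d) != (x < y)) return false;
  return true;
}
```

The second function shows why the simplification is restricted to exact divisions, such as those produced by pointer subtraction.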


Martin Sebor was considering looking at a variety of issues affecting 
our ability to do a good job with std::vector.  You might want to 
coordinate with him to make sure y'all don't duplicate work.


jeff

[PATCH] S/390: PR79895: Fix TImode constant handling

2017-04-24 Thread Andreas Krebbel

The P constraint letter is supposed to match every constant which is
acceptable during reload.  However, constraints do not appear to be
able to handle const_wide_int yet.  It works with predicates so the
alternative is modelled with a new predicate now.
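
For illustration, a sketch of why the constant in the new test case cannot be an ordinary const_int. It assumes a 64-bit HOST_WIDE_INT, the GCC/Clang __int128 extension, and that a constant is representable as const_int only if it is the sign extension of its low 64 bits; the helper name is hypothetical:

```cpp
#include <cstdint>

// True if the 128-bit value is the sign extension of its low 64 bits,
// i.e. it would fit in a single signed 64-bit host integer.
bool fits_in_signed_64(unsigned __int128 v) {
  std::uint64_t lo = static_cast<std::uint64_t>(v);
  std::uint64_t hi = static_cast<std::uint64_t>(v >> 64);
  // High half expected if the low half were sign-extended to 128 bits.
  std::uint64_t sign_ext = (lo >> 63) ? ~std::uint64_t{0} : 0;
  return hi == sign_ext;
}
```

Under these assumptions (unsigned __int128)1 << 127 does not fit, since its high half is 0x8000000000000000 while its low half is zero, so it has to be handled as a wide constant.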

Bootstrapped and regression tested on s390x.

Bye,

-Andreas-

gcc/ChangeLog:

2017-04-24  Andreas Krebbel  

* config/s390/predicates.md (reload_const_wide_int_operand): New
predicate.
* config/s390/s390.md ("movti"): Remove d/P alternative.
("movti_bigconst"): New pattern definition.

gcc/testsuite/ChangeLog:

2017-04-24  Andreas Krebbel  

* gcc.target/s390/pr79895.c: New test.
---
 gcc/config/s390/predicates.md   |  5 +
 gcc/config/s390/s390.md | 13 +++--
 gcc/testsuite/gcc.target/s390/pr79895.c |  9 +
 3 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr79895.c

diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
index 0c82efc..34a7ea2 100644
--- a/gcc/config/s390/predicates.md
+++ b/gcc/config/s390/predicates.md
@@ -199,6 +199,11 @@
   (and (match_code "const_int")
   (match_test "INTVAL (op) <= 32767 && INTVAL (op) >= -32768"
 
+(define_predicate "reload_const_wide_int_operand"
+  (and (match_code "const_wide_int")
+   (match_test "legitimate_reload_constant_p (op)")))
+
+
 ;; operators --
 
 ;; Return nonzero if OP is a valid comparison operator
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 59f189c..36e2a40 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -1479,11 +1479,20 @@
 ; movti instruction pattern(s).
 ;
 
+
+; Separate out the register pair alternative since constraints (P) are
+; not able to deal with const_wide_int's.  But predicates do.
+(define_insn "*movti_bigconst"
+  [(set (match_operand:TI 0 "register_operand"  "=d")
+(match_operand:TI 1 "reload_const_wide_int_operand" ""))]
+  "TARGET_ZARCH"
+  "#")
+
 ; FIXME: More constants are possible by enabling jxx, jyy constraints
 ; for TImode (use double-int for the calculations)
 (define_insn "movti"
-  [(set (match_operand:TI 0 "nonimmediate_operand" "=d,S,v,  v,  v,v,d,v,R,  
d,o")
-(match_operand:TI 1 "general_operand"  " 
S,d,v,j00,jm1,d,v,R,v,dPT,d"))]
+  [(set (match_operand:TI 0 "nonimmediate_operand" "=d,S,v,  v,  v,v,d,v,R, 
d,o")
+(match_operand:TI 1 "general_operand"  " 
S,d,v,j00,jm1,d,v,R,v,dT,d"))]
   "TARGET_ZARCH"
   "@
lmg\t%0,%N0,%S1
diff --git a/gcc/testsuite/gcc.target/s390/pr79895.c 
b/gcc/testsuite/gcc.target/s390/pr79895.c
new file mode 100644
index 000..02374e4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/pr79895.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O1 -mno-lra" } */
+
+unsigned __int128 g;
+void
+foo ()
+{
+  g = (unsigned __int128)1 << 127;
+}
-- 
2.9.1

[PATCH] S/390: PR80464: Split MEM->GPR vector moves

2017-04-24 Thread Andreas Krebbel

We do this already for TImode values but it was missing for vector
modes.

Bootstrapped and regression tested on s390x.

Bye,

-Andreas-

gcc/ChangeLog:

2017-04-24  Andreas Krebbel  

* config/s390/vector.md: Split MEM->GPR vector moves for
non-s_operand addresses.

gcc/testsuite/ChangeLog:

2017-04-24  Andreas Krebbel  

* gfortran.fortran-torture/compile/pr80464.f90: New test.
---
 gcc/config/s390/vector.md  | 19 +++
 .../gfortran.fortran-torture/compile/pr80464.f90   | 39 ++
 2 files changed, 58 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.fortran-torture/compile/pr80464.f90

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 7535b9d..2952893 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -248,6 +248,25 @@
   operands[5] = operand_subword (operands[1], 0, 0, <MODE>mode);
 })
 
+; This is the vector equivalent to the TImode splitter in s390.md.  It
+; is required if both target GPRs occur in the source address operand.
+
+; For non-s_operands at least one of the target GPRs does not conflict
+; with the address operand and one of the splitters above will take
+; over.
+(define_split
+  [(set (match_operand:V_128 0 "register_operand" "")
+(match_operand:V_128 1 "memory_operand" ""))]
+  "TARGET_ZARCH && reload_completed
+   && !VECTOR_REG_P (operands[0])
+   && !s_operand (operands[1], VOIDmode)"
+  [(set (match_dup 0) (match_dup 1))]
+{
+  rtx addr = operand_subword (operands[0], 1, 0, <MODE>mode);
+  addr = gen_lowpart (Pmode, addr);
+  s390_load_address (addr, XEXP (operands[1], 0));
+  operands[1] = replace_equiv_address (operands[1], addr);
+})
 
 ; Moves for smaller vector modes.
 
diff --git a/gcc/testsuite/gfortran.fortran-torture/compile/pr80464.f90 
b/gcc/testsuite/gfortran.fortran-torture/compile/pr80464.f90
new file mode 100644
index 000..d3a3943
--- /dev/null
+++ b/gcc/testsuite/gfortran.fortran-torture/compile/pr80464.f90
@@ -0,0 +1,39 @@
+subroutine bla(a,bar,lb,ne,nt,v,b)
+  character*8 lb
+  integer bar(20),foo(8,5)
+  real*8 a(3,*),x(3,8),v(0:3,*)
+  if(lb(4:4).eq.'3') then
+ n=8
+  elseif(lb(4:5).eq.'10') then
+ n=10
+ ns=6
+ m=4
+  endif
+  call blub(id)
+  do
+ if(id.eq.0) exit
+ if(lb(4:4).eq.'6') then
+m=1
+ endif
+ if((n.eq.20).or.(n.eq.8)) then
+if(b.eq.0) then
+   do i=1,ns
+  do j=1,3
+ x(j,i)=a(j,bar(foo(i,ig)))
+  enddo
+   enddo
+else
+   do i=1,ns
+  do j=1,3
+ x(j,i)=a(j,bar(foo(i,ig)))+v(j,bar(foo(i,ig)))
+  enddo
+   enddo
+endif
+ endif
+ do i=1,m
+if(lb(4:5).eq.'1E') then
+   call blab(x)
+endif
+ enddo
+  enddo
+end subroutine bla
-- 
2.9.1

Re: Let tree_single_nonzero_warnv_p use VRP

2017-04-24 Thread Richard Biener

On April 24, 2017 2:49:04 PM GMT+02:00, Marc Glisse wrote:
>On Mon, 24 Apr 2017, Jakub Jelinek wrote:
>
>> On Mon, Apr 24, 2017 at 09:41:01AM +0200, Richard Biener wrote:
>>> On Sun, Apr 23, 2017 at 11:38 PM, Marc Glisse wrote:
>>>> Hello,
>>>>
>>>> this patch teaches tree_expr_nonzero_warnv_p to handle SSA_NAME using
>>>> range information and known (non-)zero bits, by delegating to
>>>> expr_not_equal_to which already knows how to handle all that.
>>>>
>>>> This makes one strict overflow warning disappear. It isn't particularly
>>>> surprising, since the new code makes tree_expr_nonzero_warnv_p return true
>>>> without warning (we do not remember if the range information was obtained
>>>> using strict overflow). In my opinion, improving code generation is more
>>>> important than this specific warning.
>>>>
>>>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>>
>>> Hmm, I think you need to guard this with a INTEGRAL_TYPE_P check
>>> given the comment on tree_single_nonzero_warnv_p also talks about
>>> FP.
>
>The SSA_NAME case in expr_not_equal_to starts with
>
>   if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
> return false;
>
>Do you still want the extra check in tree_single_nonzero_warnv_p before
>calling expr_not_equal_to?

Yes because you create a wide int with TYPE_PRECISION.

Richard.

>> I vaguely remember there were issues with that, because VRP uses
>> the *_nonzero_warnv* functions to compute the ranges and now those
>> functions would use range info.  But it has been some time ago and maybe
>> this patch is different enough from what I've been trying back then.
>
>I hope it is safe, but we'll see.
>
>> So just please watch carefully for any fallout.
>
>Ok.

Re: [PATCH] Remove dead code from c_common_get_alias_set

2017-04-24 Thread Bernd Edlinger

On 04/24/17 09:00, Richard Biener wrote:
> On Fri, 21 Apr 2017, Bernd Edlinger wrote:
>
>> Hi!
>>
>>
>> This removes some dead and unreachable code in c_common_get_alias_set:
>> Because cc1 was recently changed to be only called with one file at a
>> time, the code after "if (num_in_fnames == 1) return -1;" is no longer
>> reachable, and can thus be removed.
>
> While I think you are correct it looks like c_common_parse_file still
> happily parses multiple infiles.  That is, only for
> flag_preprocess_only we have a
>
>   if (num_in_fnames > 1)
> error ("too many filenames given.  Type %s --help for usage",
>progname);
>
> and:
>
> gcc> ./cc1 -quiet t.c t2.c
> t2.c:5:6: error: conflicting types for ‘bar’
>  void bar () { struct X x; *(volatile char *)x.buf = 1; }
>   ^~~
> t.c:8:1: note: previous definition of ‘bar’ was here
>  bar (int x)
>  ^~~
>
> which means it actually still "works" to combine two source files
> (yes, the driver no longer seems to have the ability to pass down
> multiple inputs to cc1).
>
> Thus, can you first remove that "feature"?
>

Yes, sure.  See updated patch.


Thanks
Bernd.
2017-04-24  Bernd Edlinger  

	* c-common.c (c_type_hasher, type_hash_table): Remove.
	(c_common_get_alias_set): Remove unreachable code.
	* c-opts.c (c_common_post_options): Make sure cc1 takes only one file.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(Revision 247029)
+++ gcc/c-family/c-common.c	(Arbeitskopie)
@@ -3508,67 +3508,6 @@ c_apply_type_quals_to_decl (int type_quals, tree d
 }
 }
 
-struct c_type_hasher : ggc_ptr_hash<tree_node>
-{
-  static hashval_t hash (tree);
-  static bool equal (tree, tree);
-};
-
-/* Hash function for the problem of multiple type definitions in
-   different files.  This must hash all types that will compare
-   equal via comptypes to the same value.  In practice it hashes
-   on some of the simple stuff and leaves the details to comptypes.  */
-
-hashval_t
-c_type_hasher::hash (tree t)
-{
-  int n_elements;
-  int shift, size;
-  tree t2;
-  switch (TREE_CODE (t))
-{
-/* For pointers, hash on pointee type plus some swizzling.  */
-case POINTER_TYPE:
-  return hash (TREE_TYPE (t)) ^ 0x3003003;
-/* Hash on number of elements and total size.  */
-case ENUMERAL_TYPE:
-  shift = 3;
-  t2 = TYPE_VALUES (t);
-  break;
-case RECORD_TYPE:
-  shift = 0;
-  t2 = TYPE_FIELDS (t);
-  break;
-case QUAL_UNION_TYPE:
-  shift = 1;
-  t2 = TYPE_FIELDS (t);
-  break;
-case UNION_TYPE:
-  shift = 2;
-  t2 = TYPE_FIELDS (t);
-  break;
-default:
-  gcc_unreachable ();
-}
-  /* FIXME: We want to use a DECL_CHAIN iteration method here, but
- TYPE_VALUES of ENUMERAL_TYPEs is stored as a TREE_LIST.  */
-  n_elements = list_length (t2);
-  /* We might have a VLA here.  */
-  if (TREE_CODE (TYPE_SIZE (t)) != INTEGER_CST)
-size = 0;
-  else
-size = TREE_INT_CST_LOW (TYPE_SIZE (t));
-  return ((size << 24) | (n_elements << shift));
-}
-
-bool
-c_type_hasher::equal (tree t1, tree t2)
-{
-  return lang_hooks.types_compatible_p (t1, t2);
-}
-
-static GTY(()) hash_table<c_type_hasher> *type_hash_table;
-
 /* Return the typed-based alias set for T, which may be an expression
or a type.  Return -1 if we don't do anything special.  */
 
@@ -3607,60 +3546,6 @@ c_common_get_alias_set (tree t)
 	return get_alias_set (t1);
 }
 
-  /* Handle the case of multiple type nodes referring to "the same" type,
- which occurs with IMA.  These share an alias set.  FIXME:  Currently only
- C90 is handled.  (In C99 type compatibility is not transitive, which
- complicates things mightily. The alias set splay trees can theoretically
- represent this, but insertion is tricky when you consider all the
- different orders things might arrive in.) */
-
-  if (c_language != clk_c || flag_isoc99)
-return -1;
-
-  /* Save time if there's only one input file.  */
-  if (num_in_fnames == 1)
-return -1;
-
-  /* Pointers need special handling if they point to any type that
- needs special handling (below).  */
-  if (TREE_CODE (t) == POINTER_TYPE)
-{
-  tree t2;
-  /* Find bottom type under any nested POINTERs.  */
-  for (t2 = TREE_TYPE (t);
-	   TREE_CODE (t2) == POINTER_TYPE;
-	   t2 = TREE_TYPE (t2))
-	;
-  if (!RECORD_OR_UNION_TYPE_P (t2)
-	  && TREE_CODE (t2) != ENUMERAL_TYPE)
-	return -1;
-  if (TYPE_SIZE (t2) == 0)
-	return -1;
-}
-  /* These are the only cases that need special handling.  */
-  if (!RECORD_OR_UNION_TYPE_P (t)
-  && TREE_CODE (t) != ENUMERAL_TYPE
-  && TREE_CODE (t) != POINTER_TYPE)
-return -1;
-  /* Undefined? */
-  if (TYPE_SIZE (t) == 0)
-return -1;
-
-  /* Look up t in hash table.  Only one of the compatible types within each
- alias set is recorded in the table.  */
-  if (!type_hash_table)
-type_hash_table = hash_tab

Re: Let tree_single_nonzero_warnv_p use VRP

2017-04-24 Thread Martin Sebor


On 04/23/2017 03:38 PM, Marc Glisse wrote:

Hello,

this patch teaches tree_expr_nonzero_warnv_p to handle SSA_NAME using
range information and known (non-)zero bits, by delegating to
expr_not_equal_to which already knows how to handle all that.

This makes one strict overflow warning disappear. It isn't particularly
surprising, since the new code makes tree_expr_nonzero_warnv_p return
true without warning (we do not remember if the range information was
obtained using strict overflow). In my opinion, improving code
generation is more important than this specific warning.


Since this change effectively introduces a regression and also
adds an xfail for it I would suggest to open a new bug to track
it and reference the bug in the xfail.  That way, if/when someone
comes across either in the future, the background will be easier
to find.

Martin



Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2017-04-24  Marc Glisse  

gcc/
* fold-const.c (tree_single_nonzero_warnv_p): Handle SSA_NAME.

gcc/testsuite/
* gcc.dg/tree-ssa/cmpmul-1.c: New file.
* gcc.dg/Wstrict-overflow-18.c: Xfail.

Re: [PATCH 0/7] [ARC] Fix constraint letters and allow extra registers

2017-04-24 Thread Andrew Burgess

* Claudiu Zissulescu [2017-04-14 14:14:37 +0200]:

> From: claziss 
> 
> Hi,
> 
> There is an issue with 'h'- register class for ARCv2, which accepts
> only the first 32 general purpose registers as opposed to the ARCv1
> which accepts all 64 GPRs. Fix this issue in two patches for CMP and
> ADD instructions.
> 
> Also, allow the compiler to use extra GPRs if they are available and
> mark D0, D1 registers fixed when not available.
> 
> Fix also C++ calling multiple inheritances when compiling for PIC, and
> allow addresses to use Rx + @symbol.

These all look good.

Thanks,
Andrew



> 
> --
> 
> Claudiu Zissulescu (7):
>   [ARC] Differentiate between ARCv1 and ARCv2 'h'-reg class for CMP
> insns.
>   [ARC] Differentiate between ARCv1 and ARCv2 'h'-reg class for ADD
> insns.
>   [ARC] Allow extension core registers to be used for addresses.
>   [ARC] Make D0, D1 double regs fix when not used.
>   [ARC] Use ACCL, ACCH registers whenever they are available.
>   [ARC] [Cxx] Fix calling multiple inheritances.
>   [ARC] Addresses can use long immediate for offsets.
> 
>  gcc/config/arc/arc.c | 124 
> +--
>  gcc/config/arc/arc.h |  20 ---
>  gcc/config/arc/arc.md|  28 +-
>  gcc/config/arc/predicates.md |  13 +
>  4 files changed, 135 insertions(+), 50 deletions(-)
> 
> -- 
> 1.9.1
>

Re: Let tree_single_nonzero_warnv_p use VRP

2017-04-24 Thread Marc Glisse


On Mon, 24 Apr 2017, Martin Sebor wrote:


On 04/23/2017 03:38 PM, Marc Glisse wrote:

Hello,

this patch teaches tree_expr_nonzero_warnv_p to handle SSA_NAME using
range information and known (non-)zero bits, by delegating to
expr_not_equal_to which already knows how to handle all that.

This makes one strict overflow warning disappear. It isn't particularly
surprising, since the new code makes tree_expr_nonzero_warnv_p return
true without warning (we do not remember if the range information was
obtained using strict overflow). In my opinion, improving code
generation is more important than this specific warning.


Since this change effectively introduces a regression and also
adds an xfail for it I would suggest to open a new bug to track
it and reference the bug in the xfail.  That way, if/when someone
comes across either in the future, the background will be easier
to find.


Well, it seems that Richard is going to kill Wstrict-overflow, so this 
testcase will disappear. But ok, I'll do that.


--
Marc Glisse

[PING for gcc 8] Re: [PATCH] Fix spelling suggestions for reserved words (PR c++/80177)

2017-04-24 Thread David Malcolm

Ping for gcc 8.

On Fri, 2017-03-31 at 12:41 -0400, David Malcolm wrote:
> As noted in the PR, the C++ frontend currently offers a poor
> suggestion for this misspelling:
> 
> a.C: In function ‘void f()’:
> a.C:3:3: error: ‘static_assertion’ was not declared in this scope
>static_assertion (1 == 0, "1 == 0");
>^~~~
> a.C:3:3: note: suggested alternative: ‘__cpp_static_assert’
>static_assertion (1 == 0, "1 == 0");
>^~~~
>__cpp_static_assert
> 
> when it ought to offer "static_assert" as a suggestion instead.
> 
> The root causes are two issues within lookup_name_fuzzy
> (called here with FUZZY_LOOKUP_NAME):
> 
> (a) If it finds a good enough match in the preprocessor it will
> return the best match *before* considering reserved words,
> rather than picking the closest match overall.
> 
> The fix is to merge all the results into one best_match
> instance, and pick the overall winner.  However, given that
> some candidates are identifiers (trees), and others are cpp
> macros, the best_match instance's candidate type needs to
> be converted from tree to const char *.  This has some minor
> knock-on effects within name-lookup.c.  Sadly it means some
> extra calls to strlen (one per candidate), but this will be
> purely when error-handling.
> 
> (b) It rejects "static_assert" here:
> 
>   4998if (!cp_keyword_starts_decl_specifier_p (resword->rid))
>   4999  continue;
> 
> as "static_assert" doesn't start a decl specifier.
> 
> The fix is to only apply this rejection criterion if we're
> looking
> for typenames, rather than for names in general.
> 
> This patch addresses both issues and adds test coverage.
> 
> Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
> Adds 7 PASS and 1 UNSUPPORTED (for -std=c++98) to g++.sum
> 
> OK for next stage 1?
> 
> gcc/cp/ChangeLog:
>   PR c++/80177
>   * name-lookup.c (suggest_alternative_in_explicit_scope):
> Convert
>   candidate type of bm from tree to const char *.
>   (consider_binding_level): Likewise.
>   (lookup_name_fuzzy): Likewise, using this to merge the best
>   result from the preprocessor into bm, rather than immediately
>   returning, so that better matches from reserved words can
> "win".
>   Guard the rejection of keywords that don't start decl
> -specifiers
>   so it only happens for FUZZY_LOOKUP_TYPENAME.
> 
> gcc/testsuite/ChangeLog:
>   PR c++/80177
>   * g++.dg/spellcheck-pr80177.C: New test case.
> ---
>  gcc/cp/name-lookup.c  | 37 +
> --
>  gcc/testsuite/g++.dg/spellcheck-pr80177.C |  7 ++
>  2 files changed, 23 insertions(+), 21 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/spellcheck-pr80177.C
> 
> diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
> index 994f7f0..16ec0a1 100644
> --- a/gcc/cp/name-lookup.c
> +++ b/gcc/cp/name-lookup.c
> @@ -48,7 +48,8 @@ static bool lookup_using_namespace (tree, struct
> scope_binding *, tree,
>   tree, int);
>  static bool qualified_lookup_using_namespace (tree, tree,
> struct scope_binding
> *, int);
> -static void consider_binding_level (tree name, best_match <tree, tree> &bm,
> +static void consider_binding_level (tree name,
> + best_match <tree, const char *> &bm,
>   cp_binding_level *lvl,
>   bool look_within_fields,
>   enum lookup_name_fuzzy_kind
> kind);
> @@ -4550,14 +4551,13 @@ suggest_alternative_in_explicit_scope
> (location_t location, tree name,
>  
>cp_binding_level *level = NAMESPACE_LEVEL (scope);
>  
> -  best_match <tree, tree> bm (name);
> +  best_match <tree, const char *> bm (name);
>consider_binding_level (name, bm, level, false,
> FUZZY_LOOKUP_NAME);
>  
>/* See if we have a good suggesion for the user.  */
> -  tree best_id = bm.get_best_meaningful_candidate ();
> -  if (best_id)
> +  const char *fuzzy_name = bm.get_best_meaningful_candidate ();
> +  if (fuzzy_name)
>  {
> -  const char *fuzzy_name = IDENTIFIER_POINTER (best_id);
>gcc_rich_location richloc (location);
>richloc.add_fixit_replace (fuzzy_name);
>inform_at_rich_loc (&richloc, "suggested alternative: %qs",
> @@ -4797,7 +4797,7 @@ qualified_lookup_using_namespace (tree name,
> tree scope,
> Traverse binding level LVL, looking for good name matches for
> NAME
> (and BM).  */
>  static void
> -consider_binding_level (tree name, best_match <tree, tree> &bm,
> +consider_binding_level (tree name, best_match <tree, const char *> &bm,
>   cp_binding_level *lvl, bool
> look_within_fields,
>   enum lookup_name_fuzzy_kind kind)
>  {
> @@ -4809,7 +4809,7 @@ consider_binding_level (tree name, best_match <tree, tree> &bm,
>   tree best_matching_field
> = lookup_member_fuzzy (type, name, want_type_p);
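
The idea in (a) of the quoted message, pooling identifiers, macros and reserved words and picking the overall closest match, can be sketched as follows. This is a simplified illustration assuming plain Levenshtein distance; edit_distance and best_candidate are hypothetical names and not the best_match API itself:

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Classic two-row Levenshtein distance between a and b.
std::size_t edit_distance(const std::string& a, const std::string& b) {
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

// Pick the closest candidate from a single pool, regardless of whether it
// came from a scope, the preprocessor, or the reserved-word table.
std::string best_candidate(const std::string& goal,
                           const std::vector<std::string>& candidates) {
  std::string best;
  std::size_t best_dist = std::numeric_limits<std::size_t>::max();
  for (const std::string& c : candidates) {
    std::size_t d = edit_distance(goal, c);
    if (d < best_dist) {
      best_dist = d;
      best = c;
    }
  }
  return best;
}
```

With "static_assertion" as the goal and a pool containing both "__cpp_static_assert" and "static_assert", the latter wins because it is only three deletions away. Using strings as the common candidate type mirrors the switch from tree to const char * described above.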

Re: [wwwdocs] powerpc: Another update for gcc-7/changes.html

2017-04-24 Thread Gerald Pfeifer

On Fri, 21 Apr 2017, Segher Boessenkool wrote:
>>> +  There are new options -mstack-protector-guard=global,
>>> +-mstack-protector-guard=tls,
>>> +-mstack-protector-guard-reg=, and
>>> +-mstack-protector-guard-offset=, to change how the stack
>>> +protector gets the value to use as canary.
>> no comma before "to change".
> Oxford comma :-)  I'll get rid of it, sure.

The comma before "and" (and the last item in the list) is an
Oxford comma, and I'm all for keeping it.  The one before "to
change" would be an Amsterdam comma. :-)

>> Well, one question:  Don't these options do more than just changing
>> how the value is obtained?  The way I read the documentation the first
>> two initiate generation of stack protection code?  Am I confused, or
>> should either the web patch or the documentation be adjusted?
> No, this is correct.  The documentation could be clearer yes.  I'll
> try to improve it; help is more than welcome ;-)

I'll volunteer myself to review any patch (to see whether it helps
improve my understanding and otherwise), but am afraid I can't come
up with one.

Gerald

[PATCH, gcc 8] C++: hints for missing std:: headers

2017-04-24 Thread David Malcolm

If the user forgets to include an STL header, then an attempt
to use a class in an explicitly scoped "std::" currently leads to
this error:

test.cc:3:8: error: 'string' is not a member of 'std'
   std::string s ("hello world");
^~

This patch attempts to make this error a bit more user-friendly
by hinting at which header file is missing, for a subset of
STL types/headers (somewhat arbitrarily chosen by me when browsing
cppreference.com).

This turns the above into:

test.cc:3:8: error: 'string' is not a member of 'std'
   std::string s ("hello world");
^~
test.cc:3:8: note: 'std::string' is defined in header '<string>'; did you 
forget to '#include <string>'?

...and ultimately, once fix-it hints can contain newlines, we can
also provide a fix-it hint, making it easy for an IDE to insert
the missing #include when the user clicks on the error.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
* name-lookup.c (get_std_name_hint): New function.
(maybe_suggest_missing_header): New function.
(suggest_alternative_in_explicit_scope): Call
maybe_suggest_missing_header.

gcc/testsuite/ChangeLog:
* g++.dg/lookup/missing-std-include.C: New test file.
---
 gcc/cp/name-lookup.c  | 109 ++
 gcc/testsuite/g++.dg/lookup/missing-std-include.C |  29 ++
 2 files changed, 138 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/lookup/missing-std-include.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index eda6db2..0c5df93 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -4537,6 +4537,113 @@ suggest_alternatives_for (location_t location, tree 
name,
   candidates.release ();
 }
 
+/* Subroutine of maybe_suggest_missing_header for handling unrecognized names
+   for some of the most common names within "std::".
+   Given non-NULL NAME, a name for lookup within "std::", return the header
+   name defining it within the C++ Standard Library (without '<' and '>'),
+   or NULL.  */
+
+static const char *
+get_std_name_hint (const char *name)
+{
+  struct std_name_hint
+  {
+const char *name;
+const char *header;
+  };
+  static const std_name_hint hints[] = {
+/* <array>.  */
+{"array", "array"}, // C++11
+/* <deque>.  */
+{"deque", "deque"},
+/* <forward_list>.  */
+{"forward_list", "forward_list"},  // C++11
+/* <fstream>.  */
+{"basic_filebuf", "fstream"},
+{"basic_ifstream", "fstream"},
+{"basic_ofstream", "fstream"},
+{"basic_fstream", "fstream"},
+/* <iostream>.  */
+{"cin", "iostream"},
+{"cout", "iostream"},
+{"cerr", "iostream"},
+{"clog", "iostream"},
+{"wcin", "iostream"},
+{"wcout", "iostream"},
+{"wclog", "iostream"},
+/* <list>.  */
+{"list", "list"},
+/* <map>.  */
+{"map", "map"},
+{"multimap", "map"},
+/* <queue>.  */
+{"queue", "queue"},
+{"priority_queue", "queue"},
+/* <ostream>.  */
+{"ostream", "ostream"},
+{"wostream", "ostream"},
+{"ends", "ostream"},
+{"flush", "ostream"},
+{"endl", "ostream"},
+/* <set>.  */
+{"set", "set"},
+{"multiset", "set"},
+/* <sstream>.  */
+{"basic_stringbuf", "sstream"},
+{"basic_istringstream", "sstream"},
+{"basic_ostringstream", "sstream"},
+{"basic_stringstream", "sstream"},
+/* <stack>.  */
+{"stack", "stack"},
+/* <string>.  */
+{"string", "string"},
+{"wstring", "string"},
+{"u16string", "string"},
+{"u32string", "string"},
+/* <unordered_map>.  */
+{"unordered_map", "unordered_map"}, // C++11
+{"unordered_multimap", "unordered_map"}, // C++11
+/* <unordered_set>.  */
+{"unordered_set", "unordered_set"}, // C++11
+{"unordered_multiset", "unordered_set"}, // C++11
+/* <vector>.  */
+{"vector", "vector"},
+  };
+  const size_t num_hints = sizeof (hints) / sizeof (hints[0]);
+  for (size_t i = 0; i < num_hints; i++)
+{
+  if (0 == strcmp (name, hints[i].name))
+   return hints[i].header;
+}
+  return NULL;
+}
+
+/* Subroutine of suggest_alternative_in_explicit_scope, for use when we have no
+   suggestions to offer.
+   If SCOPE is the "std" namespace, then suggest pertinent header
+   files for NAME.  */
+
+static void
+maybe_suggest_missing_header (location_t location, tree name, tree scope)
+{
+  if (scope == NULL_TREE)
+return;
+  if (TREE_CODE (scope) != NAMESPACE_DECL)
+return;
+  /* We only offer suggestions for the "std" namespace.  */
+  if (scope != std_node)
+return;
+  gcc_assert (TREE_CODE (name) == IDENTIFIER_NODE);
+
+  const char *name_str = IDENTIFIER_POINTER (name);
+  const char *header_hint = get_std_name_hint (name_str);
+  if (header_hint)
+inform (location,
+   "%<std::%s%> is defined in header %<<%s>%>;"
+   " did you forget to %<#include <%s>%>?",
+   name_str, header_hint, header_hint);
+}
+
 /* Look for alternatives for NAME, an IDENTIFIER_NODE for which name
lookup failed within the explicitly provided SCOPE.  Suggest the
the best mean

[PATCH] C: fix-it hint for removing stray semicolons

2017-04-24 Thread David Malcolm

Patch adds a fix-it hint to a pre-existing pedwarn to make
it easier for IDEs to assist in fixing the mistake.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/c/ChangeLog:
* c-parser.c (c_parser_struct_or_union_specifier): Add fix-it
hint for removing extra semicolon.

gcc/testsuite/ChangeLog:
* gcc.dg/semicolon-fixits.c: New test case.
---
 gcc/c/c-parser.c|  9 +++--
 gcc/testsuite/gcc.dg/semicolon-fixits.c | 17 +
 2 files changed, 24 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/semicolon-fixits.c

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 988369e..9398652 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -2948,8 +2948,13 @@ c_parser_struct_or_union_specifier (c_parser *parser)
  /* Parse any stray semicolon.  */
  if (c_parser_next_token_is (parser, CPP_SEMICOLON))
{
- pedwarn (c_parser_peek_token (parser)->location, OPT_Wpedantic,
-  "extra semicolon in struct or union specified");
+ location_t semicolon_loc
+   = c_parser_peek_token (parser)->location;
+ gcc_rich_location richloc (semicolon_loc);
+ richloc.add_fixit_remove ();
+ pedwarn_at_rich_loc
+   (&richloc, OPT_Wpedantic,
+"extra semicolon in struct or union specified");
  c_parser_consume_token (parser);
  continue;
}
diff --git a/gcc/testsuite/gcc.dg/semicolon-fixits.c b/gcc/testsuite/gcc.dg/semicolon-fixits.c
new file mode 100644
index 000..e7d5322
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/semicolon-fixits.c
@@ -0,0 +1,17 @@
+/* { dg-options "-fdiagnostics-show-caret -Wpedantic" } */
+
+/* Struct with extra semicolon.  */
+struct s1 { int i;; }; /* { dg-warning "19: extra semicolon in struct or union specified" } */
+/* { dg-begin-multiline-output "" }
+ struct s1 { int i;; };
+   ^
+   -
+   { dg-end-multiline-output "" } */
+
+/* Union with extra semicolon.  */
+union u1 { int i;; }; /* { dg-warning "18: extra semicolon in struct or union specified" } */
+/* { dg-begin-multiline-output "" }
+ union u1 { int i;; };
+  ^
+  -
+   { dg-end-multiline-output "" } */
-- 
1.8.5.3

[PATCH] C++: fix-it hint for removing stray semicolons

2017-04-24 Thread David Malcolm

Patch adds a fix-it hint to a pre-existing pedwarn to make
it easier for IDEs to assist in fixing the mistake.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
* parser.c (cp_parser_member_declaration): Add fix-it hint
for removing stray semicolons.

gcc/testsuite/ChangeLog:
* g++.dg/semicolon-fixits.C: New test case.
---
 gcc/cp/parser.c |  6 +-
 gcc/testsuite/g++.dg/semicolon-fixits.C | 17 +
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/semicolon-fixits.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index f164d7e..9d5baf8 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -23123,7 +23123,11 @@ cp_parser_member_declaration (cp_parser* parser)
{
  cp_token *token = cp_lexer_peek_token (parser->lexer);
  if (!in_system_header_at (token->location))
-   pedwarn (token->location, OPT_Wpedantic, "extra %<;%>");
+   {
+ gcc_rich_location richloc (token->location);
+ richloc.add_fixit_remove ();
+ pedwarn_at_rich_loc (&richloc, OPT_Wpedantic, "extra %<;%>");
+   }
}
   else
{
diff --git a/gcc/testsuite/g++.dg/semicolon-fixits.C b/gcc/testsuite/g++.dg/semicolon-fixits.C
new file mode 100644
index 000..a9cc783
--- /dev/null
+++ b/gcc/testsuite/g++.dg/semicolon-fixits.C
@@ -0,0 +1,17 @@
+/* { dg-options "-fdiagnostics-show-caret -Wpedantic" } */
+
+/* Struct with extra semicolon.  */
+struct s1 { int i;; }; /* { dg-warning "19: extra .;." } */
+/* { dg-begin-multiline-output "" }
+ struct s1 { int i;; };
+   ^
+   -
+   { dg-end-multiline-output "" } */
+
+/* Union with extra semicolon.  */
+union u1 { int i;; }; /* { dg-warning "18: extra .;." } */
+/* { dg-begin-multiline-output "" }
+ union u1 { int i;; };
+  ^
+  -
+   { dg-end-multiline-output "" } */
-- 
1.8.5.3

[PATCH] C++: fix-it hints suggesting accessors for private fields

2017-04-24 Thread David Malcolm

Given e.g.
   class foo
   {
   public:
 int get_field () const { return m_field; }

   private:
 int m_field;
   };

...if the user attempts to access the private field from the
wrong place we emit:

test.cc: In function ‘int test(foo*)’:
test.cc:12:13: error: ‘int foo::m_field’ is private within this context
   return f->m_field;
 ^~~
test.cc:7:7: note: declared private here
   int m_field;
   ^~~

This patch adds a note with a fix-it hint to the above, suggesting
the correct accessor to use:

test.cc:12:13: note: field ‘int foo::m_field’ can be accessed via ‘int foo::get_field() const’
   return f->m_field;
 ^~~
 get_field()

Assuming that an IDE can offer to apply fix-it hints, this should
make it easier to handle refactorings where one makes a field
private and adds a getter.

It also helps by letting the user know that a getter exists, and
the name of the getter ("is it "field", "get_field", etc?").
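
For illustration, this is roughly what the code looks like once the suggested fix-it has been applied.  It is only a sketch: the constructor is an addition of mine so the example can be exercised, and I have made the parameter a pointer to const, neither of which is in the example above.

```cpp
class foo
{
public:
  // Constructor added for this sketch only; not in the example above.
  explicit foo (int value) : m_field (value) {}

  int get_field () const { return m_field; }

private:
  int m_field;
};

// Before the fix-it this read "return f->m_field;", which is rejected
// because m_field is private.  The hint replaces the member name with a
// call to the accessor.
int
test (const foo *f)
{
  return f->get_field ();
}
```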

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
* call.c (maybe_suggest_accessor): New function.
(enforce_access): Call maybe_suggest_accessor for inaccessible
decls.
* cp-tree.h (locate_field_accessor): New decl.
* search.c (matches_code_and_type_p): New function.
(field_access_p): New function.
(direct_accessor_p): New function.
(reference_accessor_p): New function.
(field_accessor_p): New function.
(dfs_locate_field_accessor_pre): New function.
(locate_field_accessor): New function.

gcc/testsuite/ChangeLog:
* g++.dg/other/accessor-fixits-1.C: New test case.
* g++.dg/other/accessor-fixits-2.C: New test case.
* g++.dg/other/accessor-fixits-3.C: New test case.
---
 gcc/cp/call.c  |  28 
 gcc/cp/cp-tree.h   |   1 +
 gcc/cp/search.c| 204 +
 gcc/testsuite/g++.dg/other/accessor-fixits-1.C | 176 +
 gcc/testsuite/g++.dg/other/accessor-fixits-2.C | 104 +
 gcc/testsuite/g++.dg/other/accessor-fixits-3.C |  15 ++
 6 files changed, 528 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/other/accessor-fixits-1.C
 create mode 100644 gcc/testsuite/g++.dg/other/accessor-fixits-2.C
 create mode 100644 gcc/testsuite/g++.dg/other/accessor-fixits-3.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index c15b8e4..67f18aa 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -6408,6 +6408,31 @@ build_op_delete_call (enum tree_code code, tree addr, tree size,
   return error_mark_node;
 }
 
+/* Helper function for enforce_access when FIELD_DECL is not accessible
+   along BASETYPE_PATH (e.g. due to being private).
+   Attempt to locate an accessor function for the field, and if one is
+   available, add a note and fix-it hint suggesting using it.  */
+
+static void
+maybe_suggest_accessor (tree basetype_path, tree field_decl)
+{
+  tree accessor = locate_field_accessor (basetype_path, field_decl);
+  if (!accessor)
+return;
+
+  /* The accessor must itself be accessible for it to be a reasonable
+ suggestion.  */
+  if (!accessible_p (basetype_path, accessor, true))
+return;
+
+  rich_location richloc (line_table, input_location);
+  pretty_printer pp;
+  pp_printf (&pp, "%s()", IDENTIFIER_POINTER (DECL_NAME (accessor)));
+  richloc.add_fixit_replace (pp_formatted_text (&pp));
+  inform_at_rich_loc (&richloc, "field %q#D can be accessed via %q#D",
+ field_decl, accessor);
+}
+
 /* If the current scope isn't allowed to access DECL along
BASETYPE_PATH, give an error.  The most derived class in
BASETYPE_PATH is the one used to qualify DECL. DIAG_DECL is
@@ -6441,17 +6466,20 @@ enforce_access (tree basetype_path, tree decl, tree diag_decl,
  error ("%q#D is private within this context", diag_decl);
  inform (DECL_SOURCE_LOCATION (diag_decl),
  "declared private here");
+ maybe_suggest_accessor (basetype_path, diag_decl);
}
  else if (TREE_PROTECTED (decl))
{
  error ("%q#D is protected within this context", diag_decl);
  inform (DECL_SOURCE_LOCATION (diag_decl),
  "declared protected here");
+ maybe_suggest_accessor (basetype_path, diag_decl);
}
  else
{
  error ("%q#D is inaccessible within this context", diag_decl);
  inform (DECL_SOURCE_LOCATION (diag_decl), "declared here");
+ maybe_suggest_accessor (basetype_path, diag_decl);
}
}
   return false;
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 67dfea2..e5bb6b7 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6323,6 +6323,7 @@ extern tree lookup_fnfields   (tree, tree, int);
 extern tree lookup_member

Re: [RFC, testsuite] Add dg-save-linenr

2017-04-24 Thread David Malcolm

On Mon, 2017-04-24 at 11:20 -0400, David Malcolm wrote:
> On Sat, 2017-04-22 at 19:49 +0200, Tom de Vries wrote:
> > Hi,
> > 
> > there are currently two types of line number supported in
> > dg-{error,warning,message,bogus} directives: absolute and relative.
> > With an absolute line number, it's immediately clear what line number
> > is meant, but when a line is added at the start of the file, the line
> > number needs to be updated.  With a relative line number, that problem
> > is solved, but when relative line numbers become large, it becomes less
> > clear what line it refers to, and when adding a line in between the
> > directive using the relative line number and the line it refers to, the
> > relative line number still needs to be updated.
> > 
> > This patch adds a directive dg-save-linenr with argument varname, that
> > saves the line number of the directive in a variable varname, which can
> > be used as line number in dg directives.
> > 
> > Testing status:
> > - tested updated test-case objc.dg/try-catch-12.m
> > - ran tree-ssa.exp
> > 
> > RFC:
> > - good idea?
> 
> Excellent idea; thanks!  There are various places where I'd find this
> useful

e.g. the test cases within
https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01061.html

Re: std::vector move assign patch

2017-04-24 Thread Marc Glisse


On Thu, 9 Jan 2014, Jonathan Wakely wrote:


On 9 January 2014 12:22, H.J. Lu wrote:

On Fri, Dec 27, 2013 at 10:27 AM, François Dumont  wrote:

Hi

Here is a patch to fix an issue in normal mode during the move
assignment. The destination vector allocator instance is moved too during
the assignment which is wrong.

As I discovered this problem while working on issues with management of
safe iterators during move operations, this patch also fixes those issues in
the debug mode for the vector container. Fixes for other containers in debug
mode will come later.

2013-12-27  François Dumont 

* include/bits/stl_vector.h (std::vector<>::_M_move_assign): Pass
*this allocator instance when building temporary vector instance
so that *this allocator do not get moved.
* include/debug/safe_base.h
(_Safe_sequence_base(_Safe_sequence_base&&)): New.
* include/debug/vector (__gnu_debug::vector<>(vector&&)): Use
latter.
(__gnu_debug::vector<>(vector&&, const allocator_type&)): Swap
safe iterators if the instance is moved.
(__gnu_debug::vector<>::operator=(vector&&)): Likewise.
* testsuite/23_containers/vector/allocator/move.cc (test01): Add
check on a vector iterator.
* testsuite/23_containers/vector/allocator/move_assign.cc
(test02): Likewise.
(test03): New, test with a non-propagating allocator.
* testsuite/23_containers/vector/debug/move_assign_neg.cc: New.

Tested under Linux x86_64 normal and debug modes.

I will be on vacation for a week starting today so if you want to apply it
quickly do not hesitate to do it yourself.



This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59738


Fixed by the attached patch, tested x86_64-linux and committed to
trunk.  I've also rotated the libstdc++ ChangeLog.


2014-01-09  Jonathan Wakely  

   PR libstdc++/59738
   * include/bits/stl_vector.h (vector<>::_M_move_assign): Restore
   support for non-Movable types.


It seems that this patch had 2 consequences that may or may not have been 
planned. Consider this example (from PR64601)


#include <vector>
typedef std::vector<int> V;
void f(V&v,V&w){ V(std::move(w)).swap(v); }
void g(V&v,V&w){ v=std::move(w); }

1) We generate shorter code for f than for g, probably since the fix for 
PR59738. g ends up zeroing v, copying w to v, and finally zeroing w, and 
for weird reasons (and because we swap the members one by one) the 
standard prevents us from assuming that v and w do not overlap in weird 
ways so we cannot optimize as much as one might expect.


2) g(v,v) seems to turn v into a nice empty vector, while f(v,v) turns it 
into an invalid vector pointing at released memory.


Since 2) is a nice side-effect, it may not be worth rewriting operator= in 
a way that improves 1) but loses 2). Anyway, just mentioning this here.
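
For anyone who wants to play with this, here is a small self-contained sketch of the two functions plus a check that they agree when v and w are distinct objects.  The element type int and the helper same_result_for_distinct_vectors are my assumptions for illustration.  It deliberately does not test f(v,v) or g(v,v), and it says nothing about the state of w afterwards.

```cpp
#include <utility>
#include <vector>

typedef std::vector<int> V;

// Move-construct a temporary from w, then swap it into v.
void f (V &v, V &w) { V (std::move (w)).swap (v); }

// Plain move assignment.
void g (V &v, V &w) { v = std::move (w); }

// Hypothetical helper: both forms should leave v holding w's old contents
// when v and w do not alias.
bool
same_result_for_distinct_vectors ()
{
  V v1 = {9}, w1 = {1, 2, 3};
  V v2 = {9}, w2 = {1, 2, 3};
  f (v1, w1);
  g (v2, w2);
  return v1 == v2 && v1 == V {1, 2, 3};
}
```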


--
Marc Glisse

Re: [PATCH] Add gcc_jit_type_get_aligned

2017-04-24 Thread David Malcolm

On Fri, 2017-03-31 at 17:13 -0400, David Malcolm wrote:
> On Thu, 2017-03-30 at 22:28 +0200, Florian Weimer wrote:
> > * David Malcolm:
> > 
> > > Here's a work-in-progress implementation of the idea, adding this
> > > entrypoint to the API:
> > > 
> > >   extern gcc_jit_type *
> > >   gcc_jit_type_get_aligned (gcc_jit_type *type,
> > > unsigned int alignment_in_bytes);
> > 
> > Should be size_t, not unsigned int.  A 2**31 alignment isn't as
> > ridiculous as it might seem.  x86-64 already has a 2**30 alignment
> > requirement in some contexts.
> 
> Thanks; fixed in this version.
> 
> Here's a completed version of the patch.
> 
> It also implements the missing C++ binding
> gccjit::type::get_const, needed by a test case.
> 
> Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
> Takes jit.sum from 8609 to 9349 PASS results.
> 
> Release managers: is it acceptable to commit this to trunk in
> stage 4?  It purely touches jit-related code/testcases, but I
> appreciate it's very late to be adding features.
> 
> Otherwise I'll commit it in the next stage 1.

I've committed this to trunk (for gcc 8) as r247111.

[...snip...]

Re: [Patch, Fortran] PR 80121: Memory leak with derived-type intent(out) argument

2017-04-24 Thread Janus Weil

2017-04-24 14:29 GMT+02:00 Bernhard Reutner-Fischer :
>>>
>>> This patch causes an error message from DejaGnu:
>>> (DejaGnu) proc "cleanup-tree-dump original" does not exist.
>>
>>thanks for letting me know. I didn't notice that ...
>>
>>
>>> I'm not familiar with fortran, so I'm not sure it is as obvious as
>>> removing cleanup-tree-dump as it is done in the other neighboring
>>tests?
>>
>>Yes, probably it should just be removed. I assume this kind of cleanup
>>is being done automatically now? I actually took it from this wiki
>>page:
>
> Yes it is done automatically nowadays.

Thanks for the confirmation.


>>https://gcc.gnu.org/wiki/TestCaseWriting
>>
>>So I guess this needs to be updated as well. Will take care of both
>>points tonight ...
>
> Obviously I did not think about updating the wiki back then.
> TIA for fixing the wiki.

No problem. Wiki is updated by now, and the dejagnu test is fixed with r247115.

Cheers,
Janus

Re: [PATCH] squash spurious warnings in dominance.c

2017-04-24 Thread Martin Sebor


On 04/24/2017 01:32 AM, Richard Biener wrote:

On Sat, Apr 22, 2017 at 2:51 AM, Martin Sebor  wrote:

Bug 80486 - spurious -Walloc-size-larger-than and
-Wstringop-overflow in dominance.c during profiledbootstrap
points out a number of warnings that show up in dominance.c
during a profiledbootstrap.  I'm pretty sure the warnings
are due to the size check the C++ new expression introduces
to avoid unsigned overflow before calling operator new, and
by some optimization like jump threading introducing a branch
with the call to the allocation function and memset with
the excessive constant size.

Two ways to avoid it come to mind: 1) use the libiberty
XCNEWVEC and XNEWVEC macros instead of C++ new expressions,
and 2) constraining the size variable to a valid range.

Either of these approaches should result in better code than
the new expression because they both eliminate the test for
the overflow.  Attached is a patch that implements (1). I
chose it mainly because it seems in line with GCC's memory
management policy and with avoiding exceptions.

An alternate patch should be straightforward.  Either add
an assert like the one below or change the type of
m_n_basic_blocks from size_t to unsigned.  This approach,
though less intrusive, will likely bring the warning back
in ILP32 builds; I'm not sure if it matters.


Please change m_n_basic_blocks (and local copies) from size_t
to unsigned int.  This is an odd inconsistency that's worth fixing
in any case.


Attached is this version of the patch.  It also eliminates
the warnings and passes profiledbootstrap/regression test
on x86_64.
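
As a stand-alone illustration of the helper after the change, here is a sketch that can be compiled outside GCC.  The all_zero function is a hypothetical addition used only to check the result.  My understanding, which I have not verified beyond the testing above, is that a 32-bit element count keeps sizeof (T) * num within what a 64-bit size_t can represent, so the bogus constant sizes no longer appear on LP64 hosts; ILP32 may behave differently, as noted earlier in the thread.

```cpp
#include <cstring>

// Allocate NUM elements of T and zero them.  T is assumed to be a type for
// which an all-zero byte pattern is a valid value, as in dominance.c.
template <typename T>
inline T *
new_zero_array (unsigned num)
{
  T *result = new T[num];
  std::memset (result, 0, sizeof (T) * num);
  return result;
}

// Hypothetical checker, not part of dominance.c: true if the first NUM
// elements of P are all zero.
bool
all_zero (const int *p, unsigned num)
{
  for (unsigned i = 0; i < num; i++)
    if (p[i] != 0)
      return false;
  return true;
}
```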

Martin

PR bootstrap/80486 - spurious -Walloc-size-larger-than and -Wstringop-overflow in dominance.c during profiledbootstrap

gcc/ChangeLog:

	PR bootstrap/80486
	* dominance.c (dom_info::m_n_basic_blocks): Change type to unsigned.
	(new_zero_array): Adjust signature.
	(dom_info::dom_init): Use unsigned rather than size_t.
	(dom_info::dom_info): Same.

diff --git a/gcc/dominance.c b/gcc/dominance.c
index c76e62e..1d4bd54 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -125,7 +125,7 @@ private:
   bitmap m_fake_exit_edge;
 
   /* Number of basic blocks in the function being compiled.  */
-  size_t m_n_basic_blocks;
+  unsigned m_n_basic_blocks;
 
   /* True, if we are computing postdominators (rather than dominators).  */
   bool m_reverse;
@@ -148,7 +148,7 @@ void debug_dominance_tree (cdi_direction, basic_block);
`x = new T[num] {};'.  */
 
 template<typename T>
-inline T *new_zero_array (size_t num)
+inline T *new_zero_array (unsigned num)
 {
   T *result = new T[num];
   memset (result, 0, sizeof (T) * num);
@@ -160,14 +160,15 @@ inline T *new_zero_array (size_t num)
 void
 dom_info::dom_init (void)
 {
-  size_t num = m_n_basic_blocks;
+  unsigned num = m_n_basic_blocks;
+
   m_dfs_parent = new_zero_array <TBB> (num);
   m_dom = new_zero_array <TBB> (num);
 
   m_path_min = new TBB[num];
   m_key = new TBB[num];
   m_set_size = new unsigned int[num];
-  for (size_t i = 0; i < num; i++)
+  for (unsigned i = 0; i < num; i++)
 {
   m_path_min[i] = m_key[i] = i;
   m_set_size[i] = 1;
@@ -221,13 +222,13 @@ dom_info::dom_info (function *fn, cdi_direction dir)
 dom_info::dom_info (vec<basic_block> region, cdi_direction dir)
 {
   m_n_basic_blocks = region.length ();
-  unsigned int nm1 = m_n_basic_blocks - 1;
+  unsigned nm1 = m_n_basic_blocks - 1;
 
   dom_init ();
 
   /* Determine max basic block index in region.  */
   int max_index = region[0]->index;
-  for (size_t i = 1; i <= nm1; i++)
+  for (unsigned i = 1; i <= nm1; i++)
 if (region[i]->index > max_index)
   max_index = region[i]->index;
   max_index += 1;  /* set index on the first bb out of region.  */

Re: PR79697: Delete calls to strdup, strndup, realloc if there is no lhs

2017-04-24 Thread Jeff Law


On 02/25/2017 01:40 AM, Prathamesh Kulkarni wrote:

Hi,
The attached patch deletes calls to strdup and strndup if their
return value is unused, and likewise for realloc if the first arg is NULL.
Bootstrap+tested on x86_64-unknown-linux-gnu.
OK for GCC 8 ?

Thanks,
Prathamesh


pr79697-1.txt


2017-02-25  Prathamesh Kulkarni

PR tree-optimization/79697
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Check if callee
is BUILT_IN_STRDUP, BUILT_IN_STRNDUP, BUILT_IN_REALLOC.

testsuite/
* gcc.dg/tree-ssa/pr79697.c: New test.

OK for the trunk.

jeff

[PATCH] Fix rtl sharing issue in RTL loop unroller (PR rtl-optimization/80500)

2017-04-24 Thread Jakub Jelinek

Hi!

The following testcase ICEs, because ve->reg is a SUBREG and thus should not
be shared, but we initialize sum to ve->reg, then add to it some REGs (which
can be shared always), and finally assign the addition etc. result into
ve->reg.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2017-04-24  Jakub Jelinek  

PR rtl-optimization/80500
* loop-unroll.c (combine_var_copies_in_loop_exit): Call copy_rtx on
sum's initial value.

* gcc.dg/pr80500.c: New test.

--- gcc/loop-unroll.c.jj   2017-04-04 07:32:57.0 +0200
+++ gcc/loop-unroll.c   2017-04-24 09:22:35.816571901 +0200
@@ -1913,6 +1913,7 @@ combine_var_copies_in_loop_exit (struct
   if (ve->var_expansions.length () == 0)
 return;
 
+  sum = copy_rtx (sum);
   start_sequence ();
   switch (ve->op)
 {
--- gcc/testsuite/gcc.dg/pr80500.c.jj   2017-04-24 09:29:39.382031846 +0200
+++ gcc/testsuite/gcc.dg/pr80500.c  2017-04-24 09:29:15.0 +0200
@@ -0,0 +1,15 @@
+/* PR rtl-optimization/80500 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -funroll-loops -ftree-loop-if-convert -fvariable-expansion-in-unroller" } */
+
+signed char v;
+
+void
+foo (int x)
+{
+  while (x != 0)
+{
+  v = (x >= 0) + 1;
+  ++x;
+}
+}

Jakub

Re: [PATCH GCC8][01/33]Handle TRUNCATE between tieable modes in rtx_cost

2017-04-24 Thread Jeff Law


On 04/18/2017 04:37 AM, Bin Cheng wrote:

Is it OK?

Thanks,
bin

2017-04-11  Bin Cheng  

* rtlanal.c (rtx_cost): Handle TRUNCATE between tieable modes.

This is fine.  You might consider adding tests for this kind of change, 
but I also realize they could end up being pretty fragile.  Hmm, maybe 
they would be better as unit tests of rtx_cost?


jeff

Re: [PATCH GCC8][02/33]Remove code handling pseudo candidate

2017-04-24 Thread Jeff Law


On 04/18/2017 04:38 AM, Bin Cheng wrote:

Hi,
We don't have pseudo candidate nowadays, so remove any related code.

Is it OK?

Thanks,
bin

2017-04-11  Bin Cheng  

* tree-ssa-loop-ivopts.c (get_computation_cost_at): Remove pseudo
iv_cand code.
(determine_group_iv_cost_cond, determine_iv_cost): Ditto.
(iv_ca_set_no_cp, create_new_iv): Ditto.


OK.
jeff

Re: [PATCH 1/7] [D] libiberty: Add support for demangling scope attributes.

2017-04-24 Thread Jeff Law


On 04/15/2017 09:19 AM, Iain Buclaw wrote:

The next version of D adds a new `scope' function attribute.  This
adds support for demangling them.

---


01-d-demangle-scope-postfix.patch


commit 15a0592cf6403fccbf43f3c7dc44f7d22c0f3dfa
Author: Iain Buclaw
Date:   Sat Apr 15 11:15:41 2017 +0200

 libiberty/ChangeLog:
 
 2017-04-15  Iain Buclaw
 
 	* d-demangle.c (dlang_attributes): Handle scope attributes.

* testsuite/d-demangle-expected: Add tests.

OK for the trunk.
jeff

[PATCH] Fix make_compound_operation in COMPAREs (PR rtl-optimization/80501)

2017-04-24 Thread Jakub Jelinek

Hi!

For SUBREGs, make_compound_operation* recurses on the SUBREG_REG.
If the original in_code is EQ (i.e. equality comparison against 0) or SET,
we handle it correctly, but as the following testcase shows, for COMPARE
(some other comparison against 0) we in some cases don't.

The problem is that the recursive call can encounter:
  /* If we are in a comparison and this is an AND with a power of two,
 convert this into the appropriate bit extract.  */
  else if (in_code == COMPARE
   && (i = exact_log2 (UINTVAL (XEXP (x, 1)))) >= 0
   && (equality_comparison || i < GET_MODE_PRECISION (mode) - 1))
new_rtx = make_extraction (mode,
   make_compound_operation (XEXP (x, 0),
next_code),
   i, NULL_RTX, 1, 1, 0, 1);
and mode in that case is the mode of the SUBREG_REG, so on the following
testcase SImode, while SUBREG's mode is QImode, and inner is AND with 0x80
constant.  For COMPARE (i.e. non-equality_comparison), we can do that only
if the AND is with a mask smaller than the sign bit and we can then just
extract the corresponding bits in lower positions.  But with SUBREGs,
we actually need to make sure that we only consider masks smaller than
the sign bit of the outer SUBREG's mode.

Apparently the caller already has a spot where I've fixed similar issues
already, but only if the mask was completely outside of the bits of the
inner mode, the following patch just extends it also to the sign bit
of the SUBREG's mode.
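
To see the arithmetic without reading RTL, here is a rough C++ model of the two forms for the QImode-in-SImode case.  The function names are mine, and the model assumes the usual two's complement narrowing to an 8-bit signed type (what GCC does, and what C++20 guarantees).

```cpp
#include <cstdint>

// Models (subreg:QI (and:SI x (const_int 0x80)) 0): mask first, then view
// the low byte as a signed 8-bit value.
std::int8_t
low_byte_of_and (int x)
{
  return static_cast<std::int8_t> (x & 0x80);
}

// Models (subreg:QI (lshiftrt:SI x (const_int 7)) 0): logical shift right
// by 7, then view the low byte as a signed 8-bit value.
std::int8_t
low_byte_of_shift (int x)
{
  return static_cast<std::int8_t> (static_cast<unsigned> (x) >> 7);
}
```

With x == 0x80 the first form is -128 and the second is 1.  Both are nonzero, so an equality comparison against 0 cannot tell them apart, but ">= 1" gives different answers, which is the case the patch guards against.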

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/7.1?

2017-04-24  Jakub Jelinek  

PR rtl-optimization/80501
* combine.c (make_compound_operation_int): Set subreg_code to SET
even for AND with mask of the sign bit of mode.

* gcc.c-torture/execute/pr80501.c: New test.

--- gcc/combine.c.jj   2017-03-30 15:24:24.0 +0200
+++ gcc/combine.c   2017-04-24 13:05:49.264713210 +0200
@@ -8170,12 +8170,15 @@ make_compound_operation_int (machine_mod
|| GET_CODE (inner) == SUBREG
 /* (subreg:SI (and:DI (reg:DI) (const_int 0x800000000)) 0)
   is (const_int 0), rather than
-  (subreg:SI (lshiftrt:DI (reg:DI) (const_int 35)) 0).  */
+  (subreg:SI (lshiftrt:DI (reg:DI) (const_int 35)) 0).
+  Similarly (subreg:QI (and:SI (reg:SI) (const_int 0x80)) 0)
+  for non-equality comparisons against 0 is not equivalent
+  to (subreg:QI (lshiftrt:SI (reg:SI) (const_int 7)) 0).  */
|| (GET_CODE (inner) == AND
&& CONST_INT_P (XEXP (inner, 1))
&& GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (inner))
&& exact_log2 (UINTVAL (XEXP (inner, 1)))
-  >= GET_MODE_BITSIZE (mode))))
+  >= GET_MODE_BITSIZE (mode) - 1)))
  subreg_code = SET;
 
tem = make_compound_operation (inner, subreg_code);
--- gcc/testsuite/gcc.c-torture/execute/pr80501.c.jj   2017-04-24 13:20:19.681024137 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr80501.c   2017-04-24 13:20:08.0 +0200
@@ -0,0 +1,23 @@
+/* PR rtl-optimization/80501 */
+
+signed char v = 0;
+
+static signed char
+foo (int x, int y)
+{
+  return x << y;
+}
+
+__attribute__((noinline, noclone)) int
+bar (void)
+{
+  return foo (v >= 0, __CHAR_BIT__ - 1) >= 1;
+}
+
+int
+main ()
+{
+  if (sizeof (int) > sizeof (char) && bar () != 0)
+__builtin_abort ();
+  return 0;
+}

Jakub

Re: [PATCH] Fix rtl sharing issue in RTL loop unroller (PR rtl-optimization/80500)

2017-04-24 Thread Jeff Law


On 04/24/2017 03:13 PM, Jakub Jelinek wrote:

Hi!

The following testcase ICEs, because ve->reg is a SUBREG and thus should not
be shared, but we initialize sum to ve->reg, then add to it some REGs (which
can be shared always), and finally assign the addition etc. result into
ve->reg.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2017-04-24  Jakub Jelinek  

PR rtl-optimization/80500
* loop-unroll.c (combine_var_copies_in_loop_exit): Call copy_rtx on
sum's initial value.

* gcc.dg/pr80500.c: New test.

OK with some kind of comment indicating why we need to do the copy.

Jeff

Re: [PATCH] handle enumerated types in -Wformat-overflow (PR 80397)

2017-04-24 Thread Jeff Law


On 04/11/2017 12:57 PM, Martin Sebor wrote:

In a review of my fix for bug 80364 Jakub pointed out that to
determine whether an argument to an integer directive is of
an integer type the gimple-ssa-sprintf pass tests the type code
for equality to INTEGER_TYPE when it should instead be using
INTEGRAL_TYPE_P().  This has the effect of the pass being unable
to use the available range of arguments of enumerated types,
resulting in both false positives and false negatives, and in
some cases, in emitting suboptimal code.

The attached patch replaces those tests with INTEGRAL_TYPE_P().

Since this is not a regression I submit it for GCC 8.

You might consider using POINTER_TYPE_P as well.

It's also worth noting that an enum object can have values that aren't 
part of the enum.  It's a long standing language wart that I don't 
expect to ever be fixed.  As a result enums often don't have as good of 
range data as you might expect.
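
A small sketch of why that matters for a pass like -Wformat-overflow.  The enum and the helper below are hypothetical; the fixed underlying type is used so that, in C++, every unsigned char value is a valid value of the enum even though only three enumerators are declared.

```cpp
#include <cstdio>

// Hypothetical enumeration with enumerators 0..2 only.
enum color : unsigned char { red, green, blue };

// Number of characters "%d" would produce for C, not counting the
// terminating NUL.  snprintf with a null buffer and size 0 just measures.
int
printed_length (color c)
{
  return std::snprintf (nullptr, 0, "%d", static_cast<int> (c));
}
```

Going by the enumerators alone one might expect a single digit, but a value such as static_cast<color> (200) is well-formed and prints three characters, so the range has to come from the underlying type rather than from the enumerator list.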


jeff

Re: [PATCH 2/7] [D] libiberty: Add support for demangling template constraints.

2017-04-24 Thread Jeff Law


On 04/15/2017 09:20 AM, Iain Buclaw wrote:

This implements a previously undocumented part of the D ABI spec,
where symbols instantiated inside a template constraint are given a
different prefix.  In practise however this should never be
encountered, as such instantiations are normally considered
speculative, and so never make it to object code.

---


02-d-demangle-template-constraints.patch


commit 9f3fa1f5842dc317b0aaf2f9aa159548c9b7a7f7
Author: Iain Buclaw
Date:   Sat Apr 15 11:29:35 2017 +0200

 libiberty/ChangeLog:
 
 2017-04-15  Iain Buclaw
 
 	* d-demangle.c (dlang_identifier): Handle template constraint symbols.

(dlang_parse_template): Only advance if template symbol prefix is
followed by a digit.
* testsuite/d-demangle-expected: Add tests.

OK for the trunk.

jeff

Re: [PATCH 3/7] [D] libiberty: Recognize anonymous symbols names.

2017-04-24 Thread Jeff Law


On 04/15/2017 09:21 AM, Iain Buclaw wrote:

This implements another previously undocumented part of the D ABI
spec, where symbols with no name are always encoded into the mangled
name.

 SymbolName:
 LName
 TemplateInstanceName
 0 // anonymous symbols

This has never really been a problem, as strtol() kindly jumps over
any leading zeros in the number it is parsing.  However this change
makes it so they are at least explicitly skipped over, rather than
silently ignored.
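
The strtol() behaviour mentioned above can be seen in isolation with a sketch like the following.  The helper name parse_length_prefix is made up for the example and is not the demangler's actual code.

```cpp
#include <cstdlib>

// Hypothetical helper: parse the decimal length at the start of MANGLED and
// store a pointer to the first unparsed character in *REST.
long
parse_length_prefix (const char *mangled, const char **rest)
{
  char *end = nullptr;
  long len = std::strtol (mangled, &end, 10);
  *rest = end;
  return len;
}
```

Given "03foo", where the leading 0 stands for an anonymous symbol, strtol reads "03" as 3 and stops at 'f', so the zero disappears into the following length without being noticed.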

---


03-d-demangle-anonymous-symbols.patch


commit 6ffcb4ce75e471304960c97bec596c89e26894f8
Author: Iain Buclaw
Date:   Sat Apr 15 11:32:07 2017 +0200

 libiberty/ChangeLog:
 
 2017-04-15  Iain Buclaw
 
 	* d-demangle.c (dlang_parse_symbol): Skip over anonymous symbols.

* testsuite/d-demangle-expected: Add tests.

OK for the trunk.

jeff

Re: [PATCH][libgcc, fuchsia]

2017-04-24 Thread Jeff Law


On 01/17/2017 11:40 AM, Josh Conner via gcc-patches wrote:

The attached patch adds fuchsia support to libgcc.

OK for trunk?

Thanks -

Josh

2017-01-17  Joshua Conner  

 * config/arm/unwind-arm.h (_Unwind_decode_typeinfo_ptr): Use
 pc-relative indirect handling for fuchsia.
 * config/t-slibgcc-fuchsia: New file.
 * config.host (*-*-fuchsia*, aarch64*-*-fuchsia*, arm*-*-fuchsia*,
 x86_64-*-fuchsia*): Add definitions.


OK for the trunk.

jeff

Re: RFC: seeking insight on store_data_bypass_p (recog.c)

2017-04-24 Thread Jeff Law


On 04/14/2017 09:58 AM, Richard Sandiford wrote:


.md files do have the option of using a single rtl instruction to
represent a sequence of several machine instructions but:

(a) they're then effectively asking the target-independent code to
 treat the sequence "as-if" it was a single indivisible instruction.

(b) that hampers scheduling in lots of ways, so should be avoided
 unless there's really no alternative.  One problem is that it
 stops other machine instructions from being scheduled in the
 sequence.  Another is that it makes it harder to describe the
 microarchitecture effects of the sequence, since more than one
 instruction is going through the pipeline.

So yeah, if a target does put several machine instructions into
a single rtl instruction, and one of those instructions is a store,
using store_data_bypass_p on it is going to give poor results.
But it would give poor results in general, even without the bypass.
I think it's a case of "don't do that".

Sometimes it is (or was) useful to treat multiple machine
instructions as single rtl instructions during early rtl
optimisation.  It's still better to split them into individual
machine instructions for scheduling though, via define_split or
define_insn_and_split.

Agreed 100% with everything Richard says here.

I wouldn't lose any sleep if the store bypass code missed cases where 
multiple instructions are implemented in a single insn.


Jeff

Re: [PATCH] Fix make_compound_operation in COMPAREs (PR rtl-optimization/80501)

2017-04-24 Thread Segher Boessenkool

Hi Jakub,

On Mon, Apr 24, 2017 at 11:23:42PM +0200, Jakub Jelinek wrote:
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/7.1?

Yes, okay everywhere, thank you for fixing it!

Segher


> 2017-04-24  Jakub Jelinek  
> 
>   PR rtl-optimization/80501
>   * combine.c (make_compound_operation_int): Set subreg_code to SET
>   even for AND with mask of the sign bit of mode.
> 
>   * gcc.c-torture/execute/pr80501.c: New test.

[PATCH, rs6000] pr80482 Relax vector builtin parameter checks

2017-04-24 Thread Bill Seurer

[PATCH, rs6000] pr80482 Relax vector builtin parameter checks

This patch relaxes the parameter checking for powerpc vector builtins.
Instead of requiring the parameter types to be identical, it now only
requires them to be compatible.  This allows mixing parameters whose types
differ only in their qualifiers (const, volatile, etc.).
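
As a rough user-level analogy for "identical" versus "compatible", the sketch below uses the GCC builtin __builtin_types_compatible_p, which ignores top-level qualifiers.  It is not the code path touched by the patch (that uses lang_hooks.types_compatible_p inside the compiler); the generic vector_size typedefs are an assumption standing in for "vector float" so that the example builds on any target.

```c
#include <assert.h>

/* Generic GCC vector types, used here as stand-ins for the AltiVec
   "vector float" and "vector int" types.  */
typedef float v4sf __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

/* Qualified and unqualified variants of the same vector type are
   different types, but the builtin treats them as compatible because
   it ignores top-level const and volatile.  */
static int
qualified_vector_is_compatible (void)
{
  return __builtin_types_compatible_p (const volatile v4sf, v4sf);
}

/* Vectors with different element types are not compatible.  */
static int
float_and_int_vectors_compatible (void)
{
  return __builtin_types_compatible_p (v4sf, v4si);
}
```

The first function is the situation the new test exercises (mixing const/volatile operands); the second is a case that should still be rejected.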

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80482 for more information.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu and
powerpc64be-unknown-linux-gnu with no regressions.  Is this ok for trunk?

[gcc]

2017-04-24  Bill Seurer  

* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Change
type checks to test for compatibility instead of equality.

[gcc/testsuite]

2017-04-24  Bill Seurer  

* gcc.target/powerpc/vec-constvolatile.c: New test.


Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c(revision 247111)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -5595,11 +5595,11 @@ altivec_resolve_overloaded_builtin (location_t loc
   tree arg1 = (*arglist)[1];
   tree arg1_type = TREE_TYPE (arg1);
 
-  /* Both arguments must be vectors and the types must match.  */
-  if (arg0_type != arg1_type)
-   goto bad;
+  /* Both arguments must be vectors and the types must be compatible.  */
   if (TREE_CODE (arg0_type) != VECTOR_TYPE)
goto bad;
+  if (!lang_hooks.types_compatible_p (arg0_type, arg1_type))
+   goto bad;
 
   switch (TYPE_MODE (TREE_TYPE (arg0_type)))
{
@@ -5610,8 +5610,8 @@ altivec_resolve_overloaded_builtin (location_t loc
  case TImode:
{
  /* For scalar types just use a multiply expression.  */
- return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0),
-   arg0, arg1);
+ return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
+ fold_convert (TREE_TYPE (arg0), arg1));
}
  case SFmode:
{
@@ -5655,13 +5655,12 @@ altivec_resolve_overloaded_builtin (location_t loc
  || (TYPE_MODE (TREE_TYPE (arg0_type)) == SFmode)
  || (TYPE_MODE (TREE_TYPE (arg0_type)) == DFmode))
{
- /* Both arguments must be vectors and the types must match.  */
- if (arg0_type != arg1_type)
-   goto bad;
+ /* Both arguments must be vectors and the types must be compatible.  */
  if (TREE_CODE (arg0_type) != VECTOR_TYPE)
goto bad;
+ if (!lang_hooks.types_compatible_p (arg0_type, arg1_type))
+   goto bad;
 
-
  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
{
  /* vec_cmpneq (va, vb) == vec_nor (vec_cmpeq (va, vb),
@@ -5720,11 +5719,12 @@ altivec_resolve_overloaded_builtin (location_t loc
   tree arg2_type = TREE_TYPE (arg2);
 
   /* All 3 arguments must be vectors of (signed or unsigned) (int or
- __int128) and the types must match.  */
-  if ((arg0_type != arg1_type) || (arg1_type != arg2_type))
-   goto bad;
+ __int128) and the types must be compatible.  */
   if (TREE_CODE (arg0_type) != VECTOR_TYPE)
goto bad;
+  if (!lang_hooks.types_compatible_p (arg0_type, arg1_type) ||
+ !lang_hooks.types_compatible_p (arg1_type, arg2_type))
+   goto bad;
 
   switch (TYPE_MODE (TREE_TYPE (arg0_type)))
{
@@ -5783,11 +5783,13 @@ altivec_resolve_overloaded_builtin (location_t loc
   tree arg2_type = TREE_TYPE (arg2);
 
   /* All 3 arguments must be vectors of (signed or unsigned) (int or
-   __int128) and the types must match.  */
-  if (arg0_type != arg1_type || arg1_type != arg2_type)
-   goto bad;
+   __int128) and the types must be compatible.  */
+  /* Both arguments must be vectors and the types must be compatible.  */
   if (TREE_CODE (arg0_type) != VECTOR_TYPE)
goto bad;
+  if (!lang_hooks.types_compatible_p (arg0_type, arg1_type) ||
+ !lang_hooks.types_compatible_p (arg1_type, arg2_type))
+   goto bad;
 
   switch (TYPE_MODE (TREE_TYPE (arg0_type)))
{
Index: gcc/testsuite/gcc.target/powerpc/vec-constvolatile.c
===
--- gcc/testsuite/gcc.target/powerpc/vec-constvolatile.c(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-constvolatile.c(working copy)
@@ -0,0 +1,31 @@
+/* Test that const and volatile qualifiers can mix on vec_mul operands.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -mvsx" } */
+
+#include 
+
+void P() {
+  const volatile vector float cvva = vec_splats(0.00187682f);
+  volatile vector float vva = vec_splats(0.00187682f);
+  const vector float cva = vec_splats(0.00187682f);
+  vector float va = vec_splats(0.00187682f);

Re: [PATCH, rs6000] pr80482 Relax vector builtin parameter checks

2017-04-24 Thread Jakub Jelinek

On Mon, Apr 24, 2017 at 05:38:58PM -0500, Bill Seurer wrote:
> [PATCH, rs6000] pr80482 Relax vector builtin parameter checks
> 
> This patch changes the parameter testing for powerpc vector builtins to relax
> the existing requirement that the parameters be identical to instead that they
> be compatible.  This allows for mixing parameters with differing qualified
> (const, volatile, etc.) types.
> 
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80482 for more information.
> 
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu and
> powerpc64be-unknown-linux-gnu with no regressions.  Is this ok for trunk?
> 
> [gcc]
> 
> 2017-04-24  Bill Seurer  
> 
The ChangeLog entries as well as the commit message should contain
PR target/80482

>   * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Change
>   type checks to test for compatibility instead of equality.

I'll defer the actual review to rs6000 maintainers.

Jakub

[PATCH] avoid assuming all integers are representable in HOST_WIDE_INT (PR #80497)

2017-04-24 Thread Martin Sebor


Bug 80497 brings to light that my fix for PR 80364 where I corrected
the handling for int128_t was incomplete.  I handled the non-constant
case but missed the INTEGER_CST case just a few lines above.  The
attached patch also corrects that problem plus one more elsewhere
in the pass.
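
For readers unfamiliar with the underlying problem, here is a small standalone sketch of why a 128-bit constant cannot be converted blindly.  It assumes a 64-bit HOST_WIDE_INT and a compiler that provides __int128; fits_shwi and to_shwi_checked are hypothetical names for this example, not GCC internals.  Note that the patch itself guards on the precision of the argument's type rather than on the value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a "fits in a signed HOST_WIDE_INT" check,
   assuming HOST_WIDE_INT is 64 bits wide.  */
static int
fits_shwi (__int128 value)
{
  return value >= INT64_MIN && value <= INT64_MAX;
}

/* Convert only when the value is representable.  Returns 1 and stores
   the result in *OUT on success, 0 otherwise.  */
static int
to_shwi_checked (__int128 value, int64_t *out)
{
  if (!fits_shwi (value))
    return 0;
  *out = (int64_t) value;
  return 1;
}
```

A value such as (__int128) 1 << 126, like the one used in the new test, fails the check, whereas 0 converts without trouble.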

Both of the changes in this patch seem safe enough to make even now
in GCC 7 but since they are ice-on-invalid-code perhaps it's better
to wait for 7.1?

Martin

PR tree-optimization/80497 - ICE at -O1 and above on valid code on x86_64-linux-gnu in tree_to_uhwi

gcc/ChangeLog:

	PR tree-optimization/80497
	* gimple-ssa-sprintf.c (get_int_range): Avoid assuming all integer
	constants are representable in HOST_WIDE_INT.
	(parse_directive): Ditto.

gcc/testsuite/ChangeLog:

	PR tree-optimization/80497
	* gcc.dg/tree-ssa/builtin-sprintf-warn-17.c: New test.

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 2e62086..d3771dd 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -948,7 +948,8 @@ get_int_range (tree arg, HOST_WIDE_INT *pmin, HOST_WIDE_INT *pmax,
   *pmin = tree_to_shwi (TYPE_MIN_VALUE (type));
   *pmax = tree_to_shwi (TYPE_MAX_VALUE (type));
 }
-  else if (TREE_CODE (arg) == INTEGER_CST)
+  else if (TREE_CODE (arg) == INTEGER_CST
+	   && TYPE_PRECISION (TREE_TYPE (arg)) <= TYPE_PRECISION (type))
 {
   /* For a constant argument return its value adjusted as specified
 	 by NEGATIVE and NEGBOUND and return true to indicate that the
@@ -2916,7 +2917,9 @@ parse_directive (pass_sprintf_length::call_info &info,
   if (width != -1)
 	dollar = width + info.argidx;
   else if (star_width
-	   && TREE_CODE (star_width) == INTEGER_CST)
+	   && TREE_CODE (star_width) == INTEGER_CST
+	   && (TYPE_PRECISION (TREE_TYPE (star_width))
+		   <= TYPE_PRECISION (integer_type_node)))
 	dollar = width + tree_to_shwi (star_width);
 
   /* Bail when the numbered argument is out of range (it will
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-17.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-17.c
new file mode 100644
index 000..27aa839
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-17.c
@@ -0,0 +1,42 @@
+/* PR tree-optimization/80497 - ICE at -O1 and above on valid code on
+   x86_64-linux-gnu in "tree_to_uhwi"
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wformat-overflow" }
+   { dg-require-effective-target int128 } */
+
+extern char buf[];
+
+const __int128_t sint128_max
+  = (__int128_t)1 << (sizeof sint128_max * __CHAR_BIT__ - 2);
+
+void fn0 (void)
+{
+  __int128_t si128 = 0;
+
+  __builtin_sprintf (buf, "%*i", si128, 0);
+
+  __builtin_sprintf (buf, "%.*i", si128, 0);
+
+  __builtin_sprintf (buf, "%i", si128);
+
+  __builtin_sprintf (buf, "%2$*1$i", si128, 0);
+
+  __builtin_sprintf (buf, "%2$.*1$i", si128, 0);
+}
+
+void fn1 (void)
+{
+  __int128_t si128 = sint128_max;
+
+  __builtin_sprintf (buf, "%*i", si128, 0);
+
+  __builtin_sprintf (buf, "%.*i", si128, 0);
+
+  __builtin_sprintf (buf, "%i", si128);
+
+  __builtin_sprintf (buf, "%2$*1$i", si128, 0);
+
+  __builtin_sprintf (buf, "%2$.*1$i", si128, 0);
+}
+
+/* { dg-prune-output "expects argument of type .int." } */

Re: [PATCH, rs6000] pr80482 Relax vector builtin parameter checks

2017-04-24 Thread Segher Boessenkool

On Mon, Apr 24, 2017 at 05:38:58PM -0500, Bill Seurer wrote:
> [PATCH, rs6000] pr80482 Relax vector builtin parameter checks
> 
> This patch changes the parameter testing for powerpc vector builtins to relax
> the existing requirement that the parameters be identical to instead that they
> be compatible.  This allows for mixing parameters with differing qualified
> (const, volatile, etc.) types.
> 
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80482 for more information.
> 
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu and
> powerpc64be-unknown-linux-gnu with no regressions.  Is this ok for trunk?

It looks fine to me, okay for trunk, thanks (with Jakub's comment taken
care of).

Also okay for the 7 branch if the RMs agree (it fixes a regression from
GCC 6 and it seems unlikely to cause new problems).


Segher


> [gcc]
> 
> 2017-04-24  Bill Seurer  
> 
>   * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Change
>   type checks to test for compatibility instead of equality.
> 
> [gcc/testsuite]
> 
> 2017-04-24  Bill Seurer  
> 
>   * gcc.target/powerpc/vec-constvolatile.c: New test.

Add support for use_hazard_barrier_return function attribute

2017-04-24 Thread Prachi Godbole

This patch adds support for the function attribute
__attribute__ ((use_hazard_barrier_return)).  The attribute makes the
compiler generate a hazard barrier return (jr.hb) instead of a normal
return instruction.
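
A minimal usage sketch follows.  It assumes a MIPS compiler with this patch applied; on any other compiler the HB_RETURN macro (a hypothetical name for this example) expands to nothing, so the code still builds.  The function update_config is likewise only an illustration.

```c
#include <assert.h>

/* Use the attribute only where the compiler reports support for it.
   HB_RETURN is a hypothetical convenience macro for this example.  */
#if defined (__mips__) && defined (__has_attribute)
# if __has_attribute (use_hazard_barrier_return)
#  define HB_RETURN __attribute__ ((use_hazard_barrier_return))
# endif
#endif
#ifndef HB_RETURN
# define HB_RETURN
#endif

/* Illustrative function: write a configuration value and, where the
   attribute is supported, return through jr.hb so that hazards are
   cleared before the caller continues.  */
HB_RETURN void
update_config (volatile unsigned *reg, unsigned value)
{
  *reg = value;
}
```

Per the ChangeLog below, such a function is neither inlined nor used for sibling calls, since either would bypass the barrier on return.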

Changelog:

2017-04-25  Prachi Godbole  

gcc/
* config/mips/mips.h (machine_function): New variable
use_hazard_barrier_return_p.
* config/mips/mips.md (UNSPEC_JRHB): New unspec.
(mips_hb_return_internal): New insn pattern.
* config/mips/mips.c (mips_attribute_table): Add attribute
use_hazard_barrier_return.
(mips_use_hazard_barrier_return_p): New static function.
(mips_function_attr_inlinable_p): Likewise.
(mips_compute_frame_info): Set use_hazard_barrier_return_p.  Emit error
for unsupported architecture choice.
(mips_function_ok_for_sibcall, mips_can_use_return_insn): Return false
for use_hazard_barrier_return.
(mips_expand_epilogue): Emit hazard barrier return.
* doc/extend.texi: Document use_hazard_barrier_return.

gcc/testsuite/
* gcc.target/mips/hazard-barrier-return-attribute.c: New test.


Ok for stage1?

Regards,
Prachi


Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi (revision 246899)
+++ gcc/doc/extend.texi (working copy)
@@ -4496,6 +4496,12 @@ On MIPS targets, you can use the @code{nocompressi
 to locally turn off MIPS16 and microMIPS code generation.  This attribute
 overrides the @option{-mips16} and @option{-mmicromips} options on the
 command line (@pxref{MIPS Options}).
+
+@item use_hazard_barrier_return
+@cindex @code{use_hazard_barrier_return} function attribute, MIPS
+This function attribute instructs the compiler to generate hazard barrier return
+that clears all execution and instruction hazards while returning, instead of
+generating a normal return instruction.
 @end table
 
 @node MSP430 Function Attributes
Index: gcc/config/mips/mips.md
===
--- gcc/config/mips/mips.md (revision 246899)
+++ gcc/config/mips/mips.md (working copy)
@@ -156,6 +156,7 @@
 
   ;; The `.insn' pseudo-op.
   UNSPEC_INSN_PSEUDO
+  UNSPEC_JRHB
 ])
 
 (define_constants
@@ -6578,6 +6579,20 @@
   [(set_attr "type" "jump")
(set_attr "mode" "none")])
 
+;; Insn to clear execution and instruction hazards while returning.
+;; However, it doesn't clear hazards created by the insn in its delay slot.
+;; Thus, explicitly place a nop in its delay slot.
+
+(define_insn "mips_hb_return_internal"
+  [(return)
+   (unspec_volatile [(match_operand 0 "pmode_register_operand" "")]
+   UNSPEC_JRHB)]
+  ""
+  {
+return "%(jr.hb\t$31%/%)";
+  }
+  [(set_attr "insn_count" "2")])
+
 ;; Normal return.
 
 (define_insn "_internal"
Index: gcc/config/mips/mips.c
===
--- gcc/config/mips/mips.c  (revision 246899)
+++ gcc/config/mips/mips.c  (working copy)
@@ -615,6 +615,7 @@ static const struct attribute_spec mips_attribute_
 mips_handle_use_shadow_register_set_attr, false },
   { "keep_interrupts_masked",  0, 0, false, true,  true, NULL, false },
   { "use_debug_exception_return", 0, 0, false, true,  true, NULL, false },
+  { "use_hazard_barrier_return", 0, 0, true, false, false, NULL, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 

@@ -1275,6 +1276,16 @@ mips_use_debug_exception_return_p (tree type)
   TYPE_ATTRIBUTES (type)) != NULL;
 }
 
+/* Check if the attribute to use hazard barrier return is set for
+   the function declaration DECL.  */
+
+static bool
+mips_use_hazard_barrier_return_p (tree decl)
+{
+  return lookup_attribute ("use_hazard_barrier_return",
+   DECL_ATTRIBUTES (decl)) != NULL;
+}
+
 /* Return the set of compression modes that are explicitly required
by the attributes in ATTRIBUTES.  */
 
@@ -1460,6 +1471,21 @@ mips_can_inline_p (tree caller, tree callee)
   return default_target_can_inline_p (caller, callee);
 }
 
+/* Implement TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P.
+
+   A function requesting clearing of all instruction and execution hazards
+   before returning cannot be inlined - thereby not clearing any hazards.
+   All our other function attributes are related to how out-of-line copies
+   should be compiled or called.  They don't in themselves prevent inlining.  */
+
+static bool
+mips_function_attr_inlinable_p (const_tree decl)
+{
+  if (mips_use_hazard_barrier_return_p (const_cast(decl)))
+return false;
+  return hook_bool_const_tree_true (decl);
+}
+
 /* Handle an "interrupt" attribute with an optional argument.  */
 
 static tree
@@ -7863,6 +7889,11 @@ mips_function_ok_for_sibcall (tree decl, tree exp
   && !targetm.binds_local_p (decl))
 return false;
 
+  /* Can't generate sibling calls if returning from current function using
+ hazard barrier return.

Re: PR79697: Delete calls to strdup, strndup, realloc if there is no lhs

2017-04-24 Thread Prathamesh Kulkarni

On 25 April 2017 at 02:41, Jeff Law  wrote:
> On 02/25/2017 01:40 AM, Prathamesh Kulkarni wrote:
>>
>> Hi,
>> The attached patch deletes calls to strdup, strndup if its
>> return-value is unused,
>> and same for realloc if the first arg is NULL.
>> Bootstrap+tested on x86_64-unknown-linux-gnu.
>> OK for GCC 8 ?
>>
>> Thanks,
>> Prathamesh
>>
>>
>> pr79697-1.txt
>>
>>
>> 2017-02-25  Prathamesh Kulkarni
>>
>> PR tree-optimization/79697
>> * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Check if
>> callee
>> is BUILT_IN_STRDUP, BUILT_IN_STRNDUP, BUILT_IN_REALLOC.
>>
>> testsuite/
>> * gcc.dg/tree-ssa/pr79697.c: New test.
>
> OK for the trunk.
Hi Jeff,
Did you intend to approve the original patch (pr79697-1.txt) or the
latest one (pr79697-3.txt that also folds realloc (0, n) to malloc
(n)) ?
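
For context on the realloc case, here is a small sketch of the library behaviour that both patch versions rely on: ISO C specifies that realloc with a null pointer behaves like malloc for the given size.  The helper name alloc_via_realloc is hypothetical and only serves the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate N bytes through realloc with a null
   pointer, which ISO C defines to behave like malloc (N).  */
static void *
alloc_via_realloc (size_t n)
{
  return realloc (NULL, n);
}

/* Use the buffer like any malloc'ed block and release it with free.
   Returns 1 on success, 0 if the allocation failed.  */
static int
roundtrip (void)
{
  char *p = alloc_via_realloc (8);
  if (p == NULL)
    return 0;
  memcpy (p, "gcc", 4);
  int ok = strcmp (p, "gcc") == 0;
  free (p);
  return ok;
}
```

When the result of such a call is never used, the allocation has no observable effect, which is why the compiler may delete the call.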

Thanks,
Prathamesh
>
> jeff
>

[PING] [PATCH] [AArch64] PR target/71663 Improve Vector Initialization

2017-04-24 Thread Hurugalawadi, Naveen

Hi,

Please consider this as a personal reminder to review the patch
at the following link and let me know your comments on the same.

https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00718.html

Thanks,
Naveen

[PING][PATCH][AArch64] Add addr_type attribute

2017-04-24 Thread Hurugalawadi, Naveen

Hi,  

Please consider this as a personal reminder to review the patch
at the following link and let me know your comments on the same.

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00222.html

Thanks,
Naveen
