[PATCH] Fix PR82436

2017-10-06 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2017-10-06  Richard Biener  

PR tree-optimization/82436
* tree-vect-slp.c (vect_supported_load_permutation_p): More
conservatively choose the vectorization factor when checking
whether we can perform the required load permutation.
(vect_transform_slp_perm_load): Assert when we may not fail.

* gcc.dg/vect/pr82436.c: New testcase.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 253439)
+++ gcc/tree-vect-slp.c (working copy)
@@ -1567,14 +1567,20 @@ vect_supported_load_permutation_p (slp_i
   return true;
 }
 
-  /* For loop vectorization verify we can generate the permutation.  */
+  /* For loop vectorization verify we can generate the permutation.  Be
+     conservative about the vectorization factor, there are permutations
+     that will use three vector inputs only starting from a specific factor
+     and the vectorization factor is not yet final.
+     ???  The SLP instance unrolling factor might not be the maximum one.  */
   unsigned n_perms;
+  unsigned test_vf
+    = least_common_multiple (SLP_INSTANCE_UNROLLING_FACTOR (slp_instn),
+			     LOOP_VINFO_VECT_FACTOR
+			       (STMT_VINFO_LOOP_VINFO (vinfo_for_stmt (stmt))));
   FOR_EACH_VEC_ELT (SLP_INSTANCE_LOADS (slp_instn), i, node)
 if (node->load_permutation.exists ()
-   && !vect_transform_slp_perm_load
- (node, vNULL, NULL,
-  SLP_INSTANCE_UNROLLING_FACTOR (slp_instn), slp_instn, true,
-  &n_perms))
+	&& !vect_transform_slp_perm_load (node, vNULL, NULL, test_vf,
+					  slp_instn, true, &n_perms))
   return false;
 
   return true;
@@ -3560,6 +3566,7 @@ vect_transform_slp_perm_load (slp_tree n
  dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM,
stmt, 0);
}
+ gcc_assert (analyze_only);
  return false;
}
 
@@ -3583,6 +3590,7 @@ vect_transform_slp_perm_load (slp_tree n
dump_printf (MSG_MISSED_OPTIMIZATION, "%d ", mask[i]);
  dump_printf (MSG_MISSED_OPTIMIZATION, "}\n");
}
+ gcc_assert (analyze_only);
  return false;
}
 
Index: gcc/testsuite/gcc.dg/vect/pr82436.c
===
--- gcc/testsuite/gcc.dg/vect/pr82436.c (nonexistent)
+++ gcc/testsuite/gcc.dg/vect/pr82436.c (working copy)
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Ofast -fno-tree-scev-cprop" } */
+/* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
+
+struct reflection_type
+{
+  int h;
+  int k;
+  int l;
+  double f_exp;
+  double f_sigma;
+  _Complex double f_calc;
+  double f_pred;
+  double i_exp;
+  double i_sigma;
+  double i_pred;
+};
+
+double y, w;
+int foo (struct reflection_type *r, int n, unsigned s)
+{
+  int i;
+  y = 0;
+  w = 0;
+  for (i = 1; i < n; ++i)
+{
+  struct reflection_type *x = &r[i*s];
+  double fpred = x->f_pred;
+  double fexp = x->f_exp;
+  double tem = (fpred - fexp);
+  y += __builtin_fabs (tem / x->f_sigma);
+  w += __builtin_fabs (tem / fexp);
+}
+  return i;
+}


Re: [PATCH] C++17 P0067R5 std::to_chars and std::from_chars (partial)

2017-10-06 Thread Christophe Lyon
On 5 October 2017 at 22:27, Jonathan Wakely  wrote:
> On 05/10/17 22:00 +0200, Christophe Lyon wrote:
>>
>> Hi Jonathan,
>>
>> On 3 October 2017 at 16:31, Jonathan Wakely  wrote:
>>>
>>> On 02/10/17 15:13 +0100, Jonathan Wakely wrote:


 +#ifndef _GLIBCXX_CHARCONV
 +#define _GLIBCXX_CHARCONV 1
 +
 +#pragma GCC system_header
 +
 +#if __cplusplus >= 201402L
 +
 +#include 
 +#include 
 +#include 
 +#include  // for std::errc
>>>
>>>
>>>
>>> I forgot to mention that I've made this header work for C++14 not just
>>> C++17. I
>>> did this for similar reasons as we make some C++17 things available for
>>> -std=gnu++11 and -std=gnu++14:
>>>
>>> #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or
>>> gnu++11
>>>
>>> But in this case  is a completely new header so we don't have
>>> to
>>> limit it to -std=gnu++NN modes only. The new functions won't pollute
>>> namespace
>>> std unless the new header gets included, which strictly-conforming C++14
>>> code
>>> won't do anyway.
>>>
>>
>> Sorry for the delay, I'm still catching up after Linaro Connect.
>>
>> I've noticed that one of the new tests (20_util/to_chars/1.cc)
>> fails to compile on ARM and AArch64 bare-metal targets (with Newlib).
>> That is, arm-none-eabi and aarch64-none-elf targets.
>>
>> The error message I'm seeing is:
>> /libstdc++-v3/testsuite/20_util/to_chars/1.cc: In function 'void
>> test01()':
>> /libstdc++-v3/testsuite/20_util/to_chars/1.cc:88: error:
>> 'std::to_string' has not been declared
>> In file included from /libstdc++-v3/testsuite/20_util/to_chars/1.cc:38:
>> /libstdc++-v3/testsuite/20_util/to_chars/1.cc:90: error: 'to_string'
>> was not declared in this scope
>> /libstdc++-v3/testsuite/util/testsuite_hooks.h:57: note: in definition
>> of macro 'VERIFY'
>
>
> Should be fixed with the attached patch.
>

Indeed, this makes the tests unsupported rather than fail.

Thanks,

Christophe

Reviewed-by:  Christophe Lyon  


RE: [PATCH][GCC][testsuite][mid-end][ARM][AARCH64] Fix failing vec align tests.

2017-10-06 Thread Tamar Christina


> -Original Message-
> From: Rainer Orth [mailto:r...@cebitec.uni-bielefeld.de]
> Sent: 05 October 2017 20:16
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd; James Greenhalgh; Richard Earnshaw;
> Marcus Shawcroft
> Subject: Re: [PATCH][GCC][testsuite][mid-end][ARM][AARCH64] Fix failing
> vec align tests.
> 
> Hi Tamar,
> 
> > Previously I had corrected the vect_hw_misalign check which prompted
> > these three tests to start failing because the condition needs to be
> > inverted in the testcases.
> >
> > Regtested on aarch64-none-elf, arm-none-linux-gnueabihf and x86_64-pc-
> linux-gnu.
> >
> > Ok for trunk?
> >
> > Thanks,
> > Tamar.
> >
> > gcc/testsuite/
> > 2017-10-02  Tamar Christina  
> >
> > * gcc.dg/vect/vect-align-1.c: Fix vect_hw_misalign condition.
> > * gcc.dg/vect/vect-align-2.c: Likewise.
> > * gcc.dg/vect/vect-multitypes-1.c: Likewise.
> 
> unfortunately, your patch caused gcc.dg/vect/vect-multitypes-1.c to FAIL on
> sparc-sun-solaris2.11 (32 and 64-bit):
> 
> FAIL: gcc.dg/vect/vect-multitypes-1.c -flto -ffat-lto-objects  scan-tree-dump-
> times vect "Vectorizing an unaligned access" 4
> FAIL: gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect
> "Vectorizing an unaligned access" 4

Thanks! I'll take a look.

Tamar

> 
> It had XFAILed before.
> 
>   Rainer
> 
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix PR82396: qsort comparator non-negative on sorted output

2017-10-06 Thread Christophe Lyon
On 5 October 2017 at 22:28, Alexander Monakov  wrote:
> On Thu, 5 Oct 2017, Maxim Kuvyrkov wrote:
>> I'm still working on analysis, but it appears to me that Alexander's patch
>> (current state of trunk) fails qsort check due to not being symmetric for
>> load/store analysis (write == 0 or write == 1) in comparisons with
>> "irrelevant" instructions.  Wilco's patch does not seem to address that, and,
>> possibly, makes the failure latent (I may be wrong here, it's late and I
>> didn't finish analysis yet).
>
> Yes, your analysis is incomplete, it should be easy to see that for 
> always-false
> multi_mem_insn_p, autopref_rank_for_schedule implements lexicographical order.
> The problem is that when multi_mem_insn_p may be true, autopref_rank_data is
> not guaranteed to be transitive.
>
> I think your patch loses transitivity in autopref_rank_for_schedule, see 
> Wilco's
> response.
>
> FWIW, this hunk from my patch posted back on Friday is sufficient to restore
> bootstrap as confirmed (again, back on Friday) by Steve.  It avoids the fancy
> non-transitive comparison for qsort (but autopref_rank_data is still used in
> multipref_dfa_lookahead_guard).
>
> (I'm surprised Kyrill wasn't Cc'ed - adding him now)
>
> Alexander
>
> * haifa-sched.c (autopref_rank_for_schedule): Do not invoke
> autopref_rank_data, order by min_offset.
>
> diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
> index 549e8961411..cea1242e1f1 100644
> --- a/gcc/haifa-sched.c
> +++ b/gcc/haifa-sched.c
> @@ -5725,7 +5669,7 @@ autopref_rank_for_schedule (const rtx_insn *insn1, 
> const rtx_insn *insn2)
>int irrel2 = data2->status == AUTOPREF_MULTIPASS_DATA_IRRELEVANT;
>
>if (!irrel1 && !irrel2)
> -   r = autopref_rank_data (data1, data2);
> +   r = data1->min_offset - data2->min_offset;
>else
> r = irrel2 - irrel1;
>  }

Hi,

FWIW, I've run validations with this small patch, and it fixes:
- the aarch64-linux-gnu build problem
- a few other regressions (ICEs) that had probably already been
reported (not sure I saw all the regression reports after r253295)

See 
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/253456-aarch64-bootstrap.patch/report-build-info.html

Where "REF-BUILDFAILED" for aarch64-linux-gnu means the reference
build failed and the patched build succeeded.
The "REGRESSED" cell for aarch64-none-elf with ilp32 only "restores" a
previous failure of gcc.target/aarch64/aapcs64/func-ret-3.c execution,
 -O3 -g

Thanks,

Christophe


Re: [PATCH] Improve -fstore-merging for bool/enum constants (PR tree-optimization/82434)

2017-10-06 Thread Richard Biener
On Thu, 5 Oct 2017, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase fails, because can_native_encode_type_p doesn't
> handle BOOLEAN_TYPE nor ENUMERAL_TYPE (while native_encode_expr handles
> those just fine).
> But, it isn't just those, can_native_encode_type_p doesn't really make
> sense to me, since whether native_encode_expr fails or not doesn't
> really depend on the expression type, but rather on what exact tcc_constant
> the expression is, what size it has and some other properties of the
> expression.
> Instead of writing a routine similar to can_native_encode_string_p that
> would handle all the cases when native_encode_expr fails, I've changed
> native_encode_expr itself, so that it has a faster dry run mode, where
> ptr is NULL, which doesn't store anything, but just returns what it would
> return given a non-NULL ptr.
> The patch then changes vectorizable_store as well as store merging to use
> this to check whether native_encode_expr will be successful.
> 
> In addition to that, I've found a thinko in store merging stmt counting,
> where it would unnecessarily look for 3rd non-debug stmt even when 2
> stmts is what it checks after the loop.  And store merging was for some
> unknown reason calling native_encode_expr with 4 arguments, while the
> standard/preferred way to call it is with 3 arguments, then it verifies
> whether the constant encoding can fit into the buffer etc.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2017-10-05  Jakub Jelinek  
> 
>   PR tree-optimization/82434
>   * fold-const.h (can_native_encode_type_p,
>   can_native_encode_string_p): Remove.
>   * fold-const.c (native_encode_int): Formatting fixes.  If ptr is NULL,
>   don't encode anything, just return what would be otherwise returned.
>   (native_encode_fixed, native_encode_complex, native_encode_vector):
>   Likewise.
>   (native_encode_string): Likewise.  Inline by hand
>   can_native_encode_string_p.
>   (can_native_encode_type_p): Remove.
>   (can_native_encode_string_p): Remove.
>   * tree-vect-stmts.c (vectorizable_store): Instead of testing just
>   STRING_CSTs using can_native_encode_string_p, test all
>   CONSTANT_CLASS_P values using native_encode_expr with NULL ptr.
>   * gimple-ssa-store-merging.c (encode_tree_to_bitpos): Remove last
>   argument from native_encode_expr.
>   (rhs_valid_for_store_merging_p): Use native_encode_expr with NULL ptr.
>   (pass_store_merging::execute): Don't unnecessarily look for 3 stmts,
>   but just 2.
> 
>   * gcc.dg/store_merging_9.c: New test.
> 
> --- gcc/fold-const.h.jj   2017-09-05 23:28:10.0 +0200
> +++ gcc/fold-const.h  2017-10-05 13:16:27.355770215 +0200
> @@ -27,8 +27,6 @@ extern int folding_initializer;
>  /* Convert between trees and native memory representation.  */
>  extern int native_encode_expr (const_tree, unsigned char *, int, int off = 
> -1);
>  extern tree native_interpret_expr (tree, const unsigned char *, int);
> -extern bool can_native_encode_type_p (tree);
> -extern bool can_native_encode_string_p (const_tree);
>  
>  /* Fold constants as much as possible in an expression.
> Returns the simplified expression.
> --- gcc/fold-const.c.jj   2017-10-04 16:45:28.0 +0200
> +++ gcc/fold-const.c  2017-10-05 13:17:42.195863063 +0200
> @@ -6982,11 +6982,15 @@ native_encode_int (const_tree expr, unsi
>int byte, offset, word, words;
>unsigned char value;
>  
> -  if ((off == -1 && total_bytes > len)
> -  || off >= total_bytes)
> +  if ((off == -1 && total_bytes > len) || off >= total_bytes)
>  return 0;
>if (off == -1)
>  off = 0;
> +
> +  if (ptr == NULL)
> +/* Dry run.  */
> +return MIN (len, total_bytes - off);
> +
>words = total_bytes / UNITS_PER_WORD;
>  
>for (byte = 0; byte < total_bytes; byte++)
> @@ -7009,8 +7013,7 @@ native_encode_int (const_tree expr, unsi
>   }
>else
>   offset = BYTES_BIG_ENDIAN ? (total_bytes - 1) - byte : byte;
> -  if (offset >= off
> -   && offset - off < len)
> +  if (offset >= off && offset - off < len)
>   ptr[offset - off] = value;
>  }
>return MIN (len, total_bytes - off);
> @@ -7036,8 +7039,7 @@ native_encode_fixed (const_tree expr, un
>  
>i_type = lang_hooks.types.type_for_size (GET_MODE_BITSIZE (mode), 1);
>  
> -  if (NULL_TREE == i_type
> -  || TYPE_PRECISION (i_type) != total_bytes)
> +  if (NULL_TREE == i_type || TYPE_PRECISION (i_type) != total_bytes)
>  return 0;
>
>value = TREE_FIXED_CST (expr);
> @@ -7065,11 +7067,15 @@ native_encode_real (const_tree expr, uns
>   up to 192 bits.  */
>long tmp[6];
>  
> -  if ((off == -1 && total_bytes > len)
> -  || off >= total_bytes)
> +  if ((off == -1 && total_bytes > len) || off >= total_bytes)
>  return 0;
>if (off == -1)
>  off = 0;
> +
> +  if (ptr == NULL)
> +/* Dry run.  */
> 

[PATCH] Testcase for PR82421

2017-10-06 Thread Richard Biener

Which was fixed by the code-gen rewrite.

Richard.

2017-10-06  Richard Biener  

PR tree-optimization/82421
* gcc.dg/graphite/pr82421.c: New testcase.

Index: gcc/testsuite/gcc.dg/graphite/pr82421.c
===
--- gcc/testsuite/gcc.dg/graphite/pr82421.c (nonexistent)
+++ gcc/testsuite/gcc.dg/graphite/pr82421.c (working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -floop-nest-optimize" } */
+
+int pc;
+
+void
+qy (int l9)
+{
+  int tw = 4;
+  int fb[tw];
+
+  while (l9 < 1)
+{
+  int dr;
+
+  pc = fb[2];
+  for (dr = 0; dr < tw; ++dr)
+   fb[dr] = 0;
+  ++l9;
+}
+}


[PATCH] Testcase for PR82422

2017-10-06 Thread Richard Biener

Committed.

Richard.

2017-10-06  Richard Biener  

PR tree-optimization/82422
* gcc.dg/graphite/pr82422.c: New testcase.

Index: gcc/testsuite/gcc.dg/graphite/pr82422.c
===
--- gcc/testsuite/gcc.dg/graphite/pr82422.c (nonexistent)
+++ gcc/testsuite/gcc.dg/graphite/pr82422.c (working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -floop-nest-optimize -Wno-aggressive-loop-optimizations" } */
+
+int a;
+int b[6];
+int c ()
+{
+  int d;
+  for (; d; d++)
+b[d] = 0;
+  for (; d < 8; d++)
+a += b[d];
+}


[PATCH] Fix PR82397

2017-10-06 Thread Richard Biener

I am testing the following patch to fix the qsort intransitivity
of dr_group_sort_cmp.  The patch removes the overly powerful
operand_equal_p checks (handling commutativity)

I am also testing a followup to address the missed equalities by
canonicalization -- the interesting trees happen to be all built
by split_constant_offset where we can do some elaborate canonicalization
in linear complexity (as opposed to operand_equal_p's exponential
handling or trying to handle it in data_ref_compare_tree where it would
be done at least multiple times during qsort invocation).

Bootstrapped on x86_64-unknown-linux-gnu, testing still in progress
(so is a quick SPEC 2k6 build where the issue showed up in multiple
places).

Richard.

2017-10-06  Richard Biener  

PR tree-optimization/82397
* tree-vect-data-refs.c (dr_group_sort_cmp): Do not use
operand_equal_p but rely on data_ref_compare_tree for detecting
equalities.
(vect_analyze_data_ref_accesses): Use data_ref_compare_tree
to match up with dr_group_sort_cmp.

* gfortran.dg/pr82397.f: New testcase.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 253475)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -2727,43 +2727,30 @@ dr_group_sort_cmp (const void *dra_, con
 return loopa->num < loopb->num ? -1 : 1;
 
   /* Ordering of DRs according to base.  */
-  if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0))
-{
-  cmp = data_ref_compare_tree (DR_BASE_ADDRESS (dra),
-  DR_BASE_ADDRESS (drb));
-  if (cmp != 0)
-return cmp;
-}
+  cmp = data_ref_compare_tree (DR_BASE_ADDRESS (dra),
+  DR_BASE_ADDRESS (drb));
+  if (cmp != 0)
+return cmp;
 
   /* And according to DR_OFFSET.  */
-  if (!dr_equal_offsets_p (dra, drb))
-{
-  cmp = data_ref_compare_tree (DR_OFFSET (dra), DR_OFFSET (drb));
-  if (cmp != 0)
-return cmp;
-}
+  cmp = data_ref_compare_tree (DR_OFFSET (dra), DR_OFFSET (drb));
+  if (cmp != 0)
+return cmp;
 
   /* Put reads before writes.  */
   if (DR_IS_READ (dra) != DR_IS_READ (drb))
 return DR_IS_READ (dra) ? -1 : 1;
 
   /* Then sort after access size.  */
-  if (!operand_equal_p (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dra))),
-   TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (drb))), 0))
-{
-  cmp = data_ref_compare_tree (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dra))),
-				   TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (drb))));
-  if (cmp != 0)
-return cmp;
-}
+  cmp = data_ref_compare_tree (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dra))),
+			       TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (drb))));
+  if (cmp != 0)
+return cmp;
 
   /* And after step.  */
-  if (!operand_equal_p (DR_STEP (dra), DR_STEP (drb), 0))
-{
-  cmp = data_ref_compare_tree (DR_STEP (dra), DR_STEP (drb));
-  if (cmp != 0)
-return cmp;
-}
+  cmp = data_ref_compare_tree (DR_STEP (dra), DR_STEP (drb));
+  if (cmp != 0)
+return cmp;
 
   /* Then sort after DR_INIT.  In case of identical DRs sort after stmt UID.  */
   cmp = tree_int_cst_compare (DR_INIT (dra), DR_INIT (drb));
@@ -2835,9 +2822,9 @@ vect_analyze_data_ref_accesses (vec_info
 and they are both either store or load (not load and store,
 not masked loads or stores).  */
  if (DR_IS_READ (dra) != DR_IS_READ (drb)
- || !operand_equal_p (DR_BASE_ADDRESS (dra),
-  DR_BASE_ADDRESS (drb), 0)
- || !dr_equal_offsets_p (dra, drb)
+ || data_ref_compare_tree (DR_BASE_ADDRESS (dra),
+   DR_BASE_ADDRESS (drb)) != 0
+ || data_ref_compare_tree (DR_OFFSET (dra), DR_OFFSET (drb)) != 0
  || !gimple_assign_single_p (DR_STMT (dra))
  || !gimple_assign_single_p (DR_STMT (drb)))
break;
@@ -2851,7 +2838,7 @@ vect_analyze_data_ref_accesses (vec_info
break;
 
  /* Check that the data-refs have the same step.  */
- if (!operand_equal_p (DR_STEP (dra), DR_STEP (drb), 0))
+ if (data_ref_compare_tree (DR_STEP (dra), DR_STEP (drb)) != 0)
break;
 
  /* Do not place the same access in the interleaving chain twice.  */
Index: gcc/testsuite/gfortran.dg/pr82397.f
===
--- gcc/testsuite/gfortran.dg/pr82397.f (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr82397.f (working copy)
@@ -0,0 +1,32 @@
+! { dg-do compile }
+! { dg-options "-Ofast" }
+
+  subroutine foo(U,V,R,N,A)
+  integer N
+  real*8 U(N,N,N),V(N,N,N),R(N,N,N),A(0:3)
+  integer I3, I2, I1
+C
+  do I3=2,N-1
+   do I2=2,N-1
+do I1=2,N-1
+ 

Re: [PATCH v3 1/14] D: The front-end (DMD) language implementation and license.

2017-10-06 Thread Iain Buclaw
On 6 October 2017 at 02:57, Walter Bright  wrote:
>
>
> On 10/5/2017 3:59 AM, Iain Buclaw wrote:
>>
>> On 3 October 2017 at 23:36, Joseph Myers  wrote:
>>>
>>> On Tue, 3 Oct 2017, Jeff Law wrote:
>>>
 /* Copyright (c) 2010-2014 by Digital Mars
   * All Rights Reserved, written by Walter Bright
   * http://www.digitalmars.com
   * Distributed under the Boost Software License, Version 1.0.
   * (See accompanying file LICENSE or copy at
 http://www.boost.org/LICENSE_1_0.txt)

 If the code was assigned to the FSF in 2011, then the FSF would have
 ownership of the code.  And the FSF would be the only entity that could
 change the license (which according to your message changed to Boost in
 2014).  So something seems wrong here.
>>>
>>>
>>> The standard FSF assignment would allow the contributor to distribute
>>> their own code under such terms as they see fit.
>>>
>>
>> Walter, would you mind clarifying details of your assignment? Was it a
>> standard assignment? Did you request for any amendments?
>
>
> I'm good with FSF owning their copy and it being under the GPL and Digital
> Mars owning our copy and it being Boost licensed.
>

Out of curiosity, I did have a look at some of the tops of gofrontend
sources this morning.  They are all copyright the Go Authors, and are
licensed as BSD.  So I'm not sure if having copyright FSF and
distributing under GPL is strictly required.  And from a maintenance
point of view, it would be easier to merge in upstream changes as-is
without some diff/merging tool.

Regards,
Iain.


Re: [PATCH] Fix PR82396: qsort comparator non-negative on sorted output

2017-10-06 Thread Maxim Kuvyrkov
> On Oct 5, 2017, at 11:28 PM, Alexander Monakov  wrote:
> 
> On Thu, 5 Oct 2017, Maxim Kuvyrkov wrote:
>> I'm still working on analysis, but it appears to me that Alexander's patch
>> (current state of trunk) fails qsort check due to not being symmetric for
>> load/store analysis (write == 0 or write == 1) in comparisons with
>> "irrelevant" instructions.  Wilco's patch does not seem to address that, and,
>> possibly, makes the failure latent (I may be wrong here, it's late and I
>> didn't finish analysis yet).
> 
> Yes, your analysis is incomplete, it should be easy to see that for 
> always-false
> multi_mem_insn_p, autopref_rank_for_schedule implements lexicographical order.
> The problem is that when multi_mem_insn_p may be true, autopref_rank_data is
> not guaranteed to be transitive.

Agree.

> 
> I think your patch loses transitivity in autopref_rank_for_schedule, see 
> Wilco's
> response.

Agree.

> 
> FWIW, this hunk from my patch posted back on Friday is sufficient to restore
> bootstrap as confirmed (again, back on Friday) by Steve.  It avoids the fancy
> non-transitive comparison for qsort (but autopref_rank_data is still used in
> multipref_dfa_lookahead_guard).
> 
> (I'm surprised Kyrill wasn't Cc'ed - adding him now)
> 
> Alexander
> 
>   * haifa-sched.c (autopref_rank_for_schedule): Do not invoke
>   autopref_rank_data, order by min_offset.
> 
> diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
> index 549e8961411..cea1242e1f1 100644
> --- a/gcc/haifa-sched.c
> +++ b/gcc/haifa-sched.c
> @@ -5725,7 +5669,7 @@ autopref_rank_for_schedule (const rtx_insn *insn1, 
> const rtx_insn *insn2)
>   int irrel2 = data2->status == AUTOPREF_MULTIPASS_DATA_IRRELEVANT;
> 
>   if (!irrel1 && !irrel2)
> - r = autopref_rank_data (data1, data2);
> + r = data1->min_offset - data2->min_offset;
>   else
>   r = irrel2 - irrel1;
> }

I think that this is the best solution so far.  Could you add a comment like 
the following?
==
Ideally, we would call autopref_rank_data() here, but we can't since it is not
guaranteed to return transitive results for multi_mem_insns.  We use an
approximation here and rely on lookahead_guard below to force instruction order
according to autopref_rank_data().
==

I think that the above patch qualifies as obvious to unbreak the bootstrap.  
Wilco, any objection to the above fix?

Regards,

--
Maxim Kuvyrkov
www.linaro.org


Re: [PATCH] Add sanopt support for UBSAN_PTR.

2017-10-06 Thread Jakub Jelinek
On Thu, Oct 05, 2017 at 03:52:45PM +0200, Martin Liška wrote:
> > Do you really need to handle offset != NULL_TREE?
> > If the bit offset is representable in shwi, then it will just be
> > in bitpos and offset will be NULL.
> 
> For this:
> UBSAN_PTR (&MEM[(void *)&b + 9223372036854775807B], 1);
> 
> I have offset:
>  
> constant 9223372036854775807>
> 
> which is a valid offset.

Sure, it is valid, but this is an optimization code.
Objects larger than 1/8 of 64-bit address space (since we have
always 64-bit HWI) are non-existent and will not appear for some years.
So I think it isn't a problem if we just don't optimize those.

> >> +&& (!is_global_var (base) || decl_binds_to_current_def_p (base)))
> >> +  {
> >> +HOST_WIDE_INT bytepos = bitpos / BITS_PER_UNIT;
> >> +offset_int total_offset = bytepos;
> >> +if (offset != NULL_TREE)
> >> +  total_offset += wi::sext (wi::to_offset (offset), POINTER_SIZE);
> >> +total_offset += wi::sext (wi::to_offset (cur_offset),
> >> +  POINTER_SIZE);
> > 
> > Why are you sign extending it each time?  Can't it be sign extended after
> > all the additions?
> 
> I had problem with this:
> UBSAN_PTR (&MEM[(void *)&b + -9223372036854775807B], 9223372036854775805);
> 
> when not doing sign extension, I was told that
> fits_shwi_p (to_offset (-9223372036854775807) + to_offset
> (9223372036854775805)) is false.

What I meant is that you can extend only the final result, and actually not
go through a HWI at all, you compute an offset_int, extend it and pass it
to other functions.

> > On the other side, is it safe to add the (bytepos + offset) part with
> > the cur_offset part unconditionally?
> > If the former is "positive" and cur_offset is "negative", they cancel each
> > other and we IMHO shouldn't try to optimize it away.  bytepos could be some
> > very large integer, outside of bounds of the var, and cur_offset some
> > negative constant, where both var + bytepos and (var + bytepos) + cur_offset
> > overflow.  Or bytepos could be negative, outside of the bounds of the
> > variable.  I think you need to check separately that bytepos is >= 0 and
> > <= DECL_SIZE_UNIT and that bytepos + cur_offset is within that range too.
> > 
> >> +/* New total offset can fit in pointer type.  */
> >> +if (wi::fits_shwi_p (total_offset))
> > 
> > Why do you need to convert a wide_int to signed HOST_WIDE_INT and back?
> > Or do you want to check for some overflows above?
> 
> Do I understand it correctly that:
> 
>   p = b - 9223372036854775807LL; /* pointer overflow check is needed */
>   p2 = p + 9223372036854775805LL;
> 
> should not be handled because &b + -9223372036854775807B is outside the
> address range of b?
> Am I right?

Let's talk with smaller numbers.  If we have:
1)
UBSAN_PTR (&MEM_REF[ptr + 128], 256)
both offsets have the same "sign", so the check is redundant if there is say
UBSAN_PTR (ptr, 128+256); (or larger offset).

2)
UBSAN_PTR (&MEM_REF[ptr + 128], -64)
bitpos is positive, cur_offset "negative", here optimizing away the check if
UBSAN_PTR (ptr, 128-64); is wrong, there could be address space boundary in
between ptr+64 and ptr+128, and perhaps the only code to detect it would be
this UBSAN_PTR.  But we can optimize it away if there is
UBSAN_PTR (ptr, 128);
check dominating it, then we know there is no address space boundary in
between ptr and ptr+128, therefore none between ptr+64 and ptr+128 either.

3)
UBSAN_PTR (&MEM_REF[ptr + 128], -256)
similarly to above, but cur_offset larger than bytepos.  Here we want to
look for
UBSAN_PTR (ptr, 128-256);

4-6) invert all the signs above to get the rest of the cases.

Actually, there are other cases, e.g. when bytepos + cur_offset is larger
than sizetype max or smaller than sizetype min.  In that case, IMHO we just
shouldn't optimize.  But the check wouldn't be whether the offset_int
fits into shwi, but rather whether the resulting offset is representable
in signed sizetype (whether sign-extending it from sizetype prec is equal to
the non-extended value).

In any case, cur_offset's offset_int value needs to be sign-extended from
sizetype precision first (for any work with it, we always want to treat it
as signed in these routines), and bitpos being HOST_WIDE_INT divided by
BITS_PER_UNIT and then converted into offset_int is also naturally
sign-extended.

> > 
> >> +  {
> >> +tree toffset = build_int_cst (TREE_TYPE (cur_offset),
> >> +  total_offset.to_shwi ());
> >> +tree ptr2 = build1 (ADDR_EXPR,
> >> +build_pointer_type (TREE_TYPE (base)),
> >> +base);
> > 
> > Why do you need to create these trees here when you actually don't use them
> > (or shouldn't need)
> 
> Is it really impossible to catch something by has_dominating_ubsan_ptr_check 
> (ctx, ptr2, total_offset)?

See above, yes, it is, but if

Re: [PATCH v3 1/14] D: The front-end (DMD) language implementation and license.

2017-10-06 Thread Walter Bright



On 10/6/2017 1:34 AM, Iain Buclaw wrote:
> On 6 October 2017 at 02:57, Walter Bright  wrote:
>> On 10/5/2017 3:59 AM, Iain Buclaw wrote:
>>> On 3 October 2017 at 23:36, Joseph Myers  wrote:
>>>> On Tue, 3 Oct 2017, Jeff Law wrote:
>>>>> /* Copyright (c) 2010-2014 by Digital Mars
>>>>>  * All Rights Reserved, written by Walter Bright
>>>>>  * http://www.digitalmars.com
>>>>>  * Distributed under the Boost Software License, Version 1.0.
>>>>>  * (See accompanying file LICENSE or copy at
>>>>> http://www.boost.org/LICENSE_1_0.txt)
>>>>>
>>>>> If the code was assigned to the FSF in 2011, then the FSF would have
>>>>> ownership of the code.  And the FSF would be the only entity that could
>>>>> change the license (which according to your message changed to Boost in
>>>>> 2014).  So something seems wrong here.
>>>>
>>>> The standard FSF assignment would allow the contributor to distribute
>>>> their own code under such terms as they see fit.
>>>
>>> Walter, would you mind clarifying details of your assignment? Was it a
>>> standard assignment? Did you request for any amendments?
>>
>> I'm good with FSF owning their copy and it being under the GPL and Digital
>> Mars owning our copy and it being Boost licensed.
>
> Out of curiosity, I did have a look at some of the tops of gofrontend
> sources this morning.  They are all copyright the Go Authors, and are
> licensed as BSD.  So I'm not sure if having copyright FSF and
> distributing under GPL is strictly required.  And from a maintenance
> point of view, it would be easier to merge in upstream changes as-is
> without some diff/merging tool.
>
> Regards,
> Iain.


That certainly seems like a more convenient solution.


[PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in Intel AVX512 configuration

2017-10-06 Thread Shalnov, Sergey
Hi,
GCC uses a full 512-bit register when moving an SF/DF value between two
registers.
The patch avoids 512-bit register usage if the "-mprefer-avx256" option is used.

2017-10-06  Sergey Shalnov  

gcc/
* config/i386/i386.md (*movsf_internal, *movdf_internal):
Avoid 512-bit AVX modes for TARGET_PREFER_AVX256.



0002-Avoid-512-bit-mode-MOV-for-prefer-avx256-option.patch


Re: Allow non-wi wi

2017-10-06 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Oct 3, 2017 at 8:34 PM, Richard Sandiford
>  wrote:
>> This patch uses global rather than member operators for wide-int.h,
>> so that the first operand can be a non-wide-int type.
>
> Not sure why we had the in-class ones.  If we had some good arguments
> they'd still stand.  Do you remember?

Not really, sorry.  This might not have been discussed specifically.
It looks like Kenny and Mike's initial commit to the wide-int branch
had member operators, so it could just have been carried over by
default.  And using member operators in the initial commit might have
been influenced by double_int (which has them too), since at that time
wide_int was very much a direct replacement for double_int.

Thanks,
Richard


[PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate

2017-10-06 Thread Sudi Das


 Hi

This patch is a fix for PR 82440.
The predicates aarch64_reg_or_bic_imm and aarch64_reg_or_orr_imm missed out on
checking for a CONST_VECTOR before calling aarch64_simd_valid_immediate 
function.
Also I think James forgot to add the test cases in the original patch submitted.

Testing done : Checked for regressions on bootstrapped aarch64-none-linux-gnu.
Ok for trunk?

Thanks
Sudi

The ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-06  Sudakshina Das  

PR target/82440
* config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Modified.
(aarch64_reg_or_bic_imm): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-06  Sudakshina Das  

* gcc.target/aarch64/bic_imm_1.c: New test.
* gcc.target/aarch64/orr_imm_1.c: Likewise.
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 887a13e..bf23b88 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -71,13 +71,15 @@
 
 (define_predicate "aarch64_reg_or_orr_imm"
(ior (match_operand 0 "register_operand")
-	(match_test "aarch64_simd_valid_immediate (op, mode, false,
-		   NULL, AARCH64_CHECK_ORR)")))
+	(and (match_code "const_vector")
+	 (match_test "aarch64_simd_valid_immediate (op, mode, false,
+		NULL, AARCH64_CHECK_ORR)"
 
 (define_predicate "aarch64_reg_or_bic_imm"
(ior (match_operand 0 "register_operand")
-	(match_test "aarch64_simd_valid_immediate (op, mode, false,
-		   NULL, AARCH64_CHECK_BIC)")))
+	(and (match_code "const_vector")
+	 (match_test "aarch64_simd_valid_immediate (op, mode, false,
+		NULL, AARCH64_CHECK_BIC)"
 
 (define_predicate "aarch64_fp_compare_operand"
   (ior (match_operand 0 "register_operand")
diff --git a/gcc/testsuite/gcc.target/aarch64/bic_imm_1.c b/gcc/testsuite/gcc.target/aarch64/bic_imm_1.c
new file mode 100644
index 000..b14f009
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bic_imm_1.c
@@ -0,0 +1,56 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 --save-temps -ftree-vectorize" } */
+
+/* Each function uses the corresponding 'CLASS' in
+   macro CHECK (aarch64_simd_valid_immediate).  */
+
+void
+bic_6 (int *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] &= ~(0xab);
+}
+
+void
+bic_7 (int *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] &= ~(0xcd00);
+}
+
+void
+bic_8 (int *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] &= ~(0xef);
+}
+
+void
+bic_9 (int *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] &= ~(0x1200);
+}
+
+void
+bic_10 (short *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] &= ~(0x34);
+}
+
+
+void
+bic_11 (short *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] &= ~(0x5600);
+}
+
+
+/* { dg-final { scan-assembler "bic\\tv\[0-9\]+.4s, #171" } } */
+/* { dg-final { scan-assembler "bic\\tv\[0-9\]+.4s, #205, lsl #8" } } */
+/* { dg-final { scan-assembler "bic\\tv\[0-9\]+.4s, #239, lsl #16" } } */
+/* { dg-final { scan-assembler "bic\\tv\[0-9\]+.4s, #18, lsl #24" } } */
+/* { dg-final { scan-assembler "bic\\tv\[0-9\]+.8h, #52" } } */
+/* { dg-final { scan-assembler "bic\\tv\[0-9\]+.8h, #86, lsl #8" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/orr_imm_1.c b/gcc/testsuite/gcc.target/aarch64/orr_imm_1.c
new file mode 100644
index 000..ff6f683
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/orr_imm_1.c
@@ -0,0 +1,54 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 --save-temps -ftree-vectorize" } */
+
+/* Each function uses the corresponding 'CLASS' in
+   macro CHECK (aarch64_simd_valid_immediate).  */
+
+void
+orr_0 (int *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] |= 0xab;
+}
+
+void
+orr_1 (int *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] |= 0xcd00;
+}
+
+void
+orr_2 (int *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] |= 0x00ef;
+}
+
+void
+orr_3 (int *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] |= 0x1200;
+}
+
+void
+orr_4 (short *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] |= 0x00340034;
+}
+
+void
+orr_5 (int *a)
+{
+  for (int i = 0; i < 1024; i++)
+a[i] |= 0x56005600;
+}
+
+/* { dg-final { scan-assembler "orr\\tv\[0-9\]+.4s, #171" } } */
+/* { dg-final { scan-assembler "orr\\tv\[0-9\]+.4s, #205, lsl #8" } } */
+/* { dg-final { scan-assembler "orr\\tv\[0-9\]+.4s, #239, lsl #16" } } */
+/* { dg-final { scan-assembler "orr\\tv\[0-9\]+.4s, #18, lsl #24" } } */
+/* { dg-final { scan-assembler "orr\\tv\[0-9\]+.8h, #52" } } */
+/* { dg-final { scan-assembler "orr\\tv\[0-9\]+.8h, #86, lsl #8" } } */


Re: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in Intel AVX512 configuration

2017-10-06 Thread Jakub Jelinek
On Fri, Oct 06, 2017 at 09:33:21AM +, Shalnov, Sergey wrote:
> Hi,
> GCC uses full 512-bit register in case of moving SF/DF value between two 
> registers.
> The patch avoid 512-bit register usage if "-mprefer-avx256" option used.
> 
> 2017-10-06  Sergey Shalnov  
> 
> gcc/
>   * config/i386/i386.md(*movsf_internal, *movdf_internal):
>   Avoid 512-bit AVX modes for TARGET_PREFER_AVX256.
> 
From b96e657153b9aea24ff002e7e156ba12b2d443b5 Mon Sep 17 00:00:00 2001
From: Sergey Shalnov 
Date: Fri, 6 Oct 2017 10:45:40 +0300
Subject: [PATCH 1/1] Avoid 512-bit mode MOV for prefer-avx256 option

---
 gcc/config/i386/i386.md | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 99497a9..a6d7cca 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3564,8 +3564,9 @@
 
   /* movaps is one byte shorter for non-AVX targets.  */
   (eq_attr "alternative" "13,17")
-(cond [(ior (match_operand 0 "ext_sse_reg_operand")
-(match_operand 1 "ext_sse_reg_operand"))
+(cond [(and (not (match_test "TARGET_PREFER_AVX256"))
+ (ior (match_operand 0 "ext_sse_reg_operand")
+  (match_operand 1 "ext_sse_reg_operand")))
  (const_string "V8DF")
(ior (not (match_test "TARGET_SSE2"))
 (match_test 
"TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))

How does that work with -mprefer-avx256 -mavx512f -mno-avx512vl?
The constraints in these alternatives use v, and [SD]Fmode are included in
VALID_AVX512F_SCALAR_MODE and thus you can get e.g. %?mm23 among the
operands.
EVEX encoded VMOVAPD or VMOVAPS is AVX512F cpuid only with 512-bit operands,
with 128-bit/256-bit it is AVX512VL + AVX512F.

So, in both of the spots you've changed in this patch, but also in the
spots you've changed earlier, you need to use
TARGET_PREFER_AVX256 && TARGET_AVX512VL rather than just
TARGET_PREFER_AVX256, because without TARGET_AVX512VL it is not a matter
of preferring it, but a must.  Unless we disable %xmm16+ registers
for TARGET_PREFER_AVX256 && !TARGET_AVX512VL code, but that would be weird:
an optimization preference would e.g. break people using %xmm16+ register
variables etc.

@@ -3739,8 +3740,9 @@
  better to maintain the whole registers in single format
  to avoid problems on using packed logical operations.  */
   (eq_attr "alternative" "6")
-(cond [(ior  (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
+(cond [(and (not (match_test "TARGET_PREFER_AVX256"))
+ (ior (match_operand 0 "ext_sse_reg_operand")
+  (match_operand 1 "ext_sse_reg_operand")))
  (const_string "V16SF")
(ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
 (match_test "TARGET_SSE_SPLIT_REGS"))
-- 
1.8.3.1




Jakub


Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate

2017-10-06 Thread Jakub Jelinek
On Fri, Oct 06, 2017 at 09:52:35AM +, Sudi Das wrote:
> This patch is a fix for PR 82440.
> The predicates aarch64_reg_or_bic_imm and aarch64_reg_or_orr_imm missed out on
> checking for a CONST_VECTOR before calling aarch64_simd_valid_immediate 
> function.
> Also I think James forgot to add the test cases in the original patch 
> submitted.
> 
> Testing done : Checked for regressions on bootstrapped aarch64-none-linux-gnu.
> Ok for trunk?
> 
> Thanks
> Sudi
> 
> The ChangeLog entry is as follows:
> 
> *** gcc/ChangeLog ***
> 
> 2017-10-06  Sudakshina Das  
> 
>   PR target/82440
>   * config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Modified.
>   (aarch64_reg_or_bic_imm): Likewise.

I'll defer the actual review to aarch64 maintainers, just want to say that
this is not a correct ChangeLog entry.  You should say what has changed, not
just that something has changed.  Something like
Only call aarch64_simd_valid_immediate on CONST_VECTORs.
or similar.

Jakub


Re: [PATCH] Add sanopt support for UBSAN_PTR.

2017-10-06 Thread Martin Liška
On 10/05/2017 07:06 PM, Martin Sebor wrote:
> On 10/04/2017 03:05 AM, Martin Liška wrote:
>> Hello.
>>
>> Following patch adds support for optimizing out unnecessary UBSAN_PTR checks.
>> It handles separately positive and negative offsets, zero offset is ignored.
>> Apart from that, we utilize get_inner_reference for local and global 
>> variables,
>> that also helps to reduce some.
>>
>> Some numbers:
>>
>> 1) postgres:
>>
>> bloaty /tmp/after2 -- /tmp/before2
>>  VM SIZE  FILE SIZE
>>  ++ GROWING    ++
>>   [ = ]   0 .debug_abbrev  +1.84Ki  +0.3%
>>
>>  -- SHRINKING  --
>>  -30.3% -3.98Mi .text  -3.98Mi -30.3%
>>   [ = ]   0 .debug_info    -3.69Mi -17.2%
>>   [ = ]   0 .debug_loc -2.02Mi -13.4%
>>  -43.1% -1.37Mi .data  -1.37Mi -43.1%
>>   [ = ]   0 .debug_ranges   -390Ki -14.3%
>>   [ = ]   0 .debug_line -295Ki -11.6%
>>   -4.0% -26.3Ki .eh_frame  -26.3Ki  -4.0%
>>   [ = ]   0 [Unmapped] -1.61Ki -62.3%
>>   [ = ]   0 .strtab    -1.15Ki  -0.4%
>>   [ = ]   0 .symtab    -1.08Ki  -0.3%
>>   -0.4%    -368 .eh_frame_hdr -368  -0.4%
>>   [ = ]   0 .debug_aranges    -256  -0.7%
>>   [DEL] -16 [None]   0  [ = ]
>>
>>  -28.0% -5.37Mi TOTAL  -11.8Mi -18.8%
> 
> This looks like an impressive improvement!  FWIW, I've been
> meaning to look into similar opportunities mentioned in bug
> 79265.

Hi. Thank you very much for the feedback. If you want, I can help with the PR.

> 
> Would making use of get_range_info() make sense here to detect
> and eliminate even more cases?
> 
> Just a few minor mostly stylistic suggestions.
> 
> +/* Traits class for tree triplet hash maps below.  */
> +
> +struct sanopt_tree_couple_hash : typed_noop_remove 
> +{
> ...
> +  static inline bool
> +  equal (const sanopt_tree_couple &ref1, const sanopt_tree_couple &ref2)
> 
> Member functions defined within the body of the class are implicitly
> inline so while not wrong, there is no need to declare them inline
> explicitly.

Done that in v2.

> 
> Also, since mark_deleted uses reinterpret_cast (as suggested by
> GCC coding conventions) it seems that is_deleted should do the
> same for consistency.  Alternatively, if there isn't enough
> interest/consensus to follow this guideline perhaps it should
> be removed from the GCC coding conventions.  (Very few GCC code
> seems to use reinterpret_cast.)

Likewise.

> 
> 
> +/* Return true when pointer PTR for a given OFFSET is already sanitized
> +   in a given sanitization context CTX.  */
> 
> Shouldn't the comment read CUR_OFFSET?  I ask because the function
> also declares a local variable OFFSET.

Yes, should be fixed.

> 
> +static bool
> +has_dominating_ubsan_ptr_check (sanopt_ctx *ctx, tree ptr, tree cur_offset)
> +{
> ...
> +  /* We already have recorded a UBSAN_PTR check for this pointer. Perhaps we
> + can drop this one.  But only if this check doesn't specify larger 
> offset.
> + */
> +  tree offset = gimple_call_arg (g, 1);
> 
> Martin
> 
> PS It seems to me that the test could be enabled for all targets
> where UBSan is supported by making use of SIZE_MAX to compute
> the values of the constants instead of hardwiring LP64 values.
> I noticed the test doesn't exercise member arrays.  Are those
> not handled by the patch?

I decided to use __PTRDIFF_MAX__ as I need a signed type. Is that ok?

Martin

> 
>> Left checks:
>> 261039
>> Optimized out:
>> 85643
>>
>> 2) tramp3d:
>>
>> bloaty after -- before
>>  VM SIZE FILE SIZE
>>  ++ GROWING   ++
>>   +167% +30 [Unmapped]    +1.01Ki   +39%
>>
>>  -- SHRINKING --
>>  -58.5% -2.52Mi .text -2.52Mi -58.5%
>>  -64.2%  -574Ki .data  -574Ki -64.2%
>>   -5.7% -4.27Ki .eh_frame -4.27Ki  -5.7%
>>   -6.4% -1.06Ki .gcc_except_table -1.06Ki  -6.4%
>>   -7.2%    -192 .bss    0  [ = ]
>>   -0.1% -32 .rodata   -32  -0.1%
>>   [ = ]   0 .strtab   -29  -0.0%
>>   -1.1% -24 .dynsym   -24  -1.1%
>>   -1.5% -24 .rela.plt -24  -1.5%
>>   [ = ]   0 .symtab   -24  -0.1%
>>   -0.6% -16 .dynstr   -16  -0.6%
>>   -1.5% -16 .plt  -16  -1.5%
>>   -1.4%  -8 .got.plt   -8  -1.4%
>>   -0.6%  -4 .hash  -4  -0.6%
>>   -1.1%  -2 .gnu.version   -2  -1.1%
>>
>>  -58.0% -3.09Mi TOTAL -3.08Mi -55.0%
>>
>> Left checks:
>> 31131
>> Optimized out:
>> 36752
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2017-10-04  Martin Liska  
>>
>> * sanopt.c (struct sanopt_tree_couple): New struct.
>> (struct sanopt_tree_couple_hash): Likewise.
>> (struct sanopt_ctx): Add ptr_check_ma

[PATCH v2] Add sanopt support for UBSAN_PTR.

2017-10-06 Thread Martin Liška
Hi.

Thanks for the feedback; I implemented what was suggested in Jakub's review. I also
added comments explaining all the cases that can happen, mentioned as cases 1-6.
It seemed easier to send a newer version of the patch to show how I resolved the
feedback.

Patch has been tested.

Martin
From 2185ebc117ebf15c21eedbd4e212d6b495d76aab Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 2 Oct 2017 16:14:56 +0200
Subject: [PATCH] Add sanopt support for UBSAN_PTR.

gcc/ChangeLog:

2017-10-06  Martin Liska  

	* sanopt.c (struct sanopt_tree_triplet_hash): Remove inline
	keyword for member functions.
	(struct sanopt_tree_couple): New struct.
	(struct sanopt_tree_couple_hash): New struct.
	(struct sanopt_ctx): Add new hash_map.
	(has_dominating_ubsan_ptr_check): New function.
	(record_ubsan_ptr_check_stmt): Likewise.
	(maybe_optimize_ubsan_ptr_ifn): Likewise.
	(sanopt_optimize_walker): Handle IFN_UBSAN_PTR.
	(pass_sanopt::execute): Handle also SANITIZE_POINTER_OVERFLOW.

gcc/testsuite/ChangeLog:

2017-10-04  Martin Liska  

	* c-c++-common/ubsan/ptr-overflow-sanitization-1.c: New test.
---
 gcc/sanopt.c   | 268 -
 .../ubsan/ptr-overflow-sanitization-1.c|  80 ++
 2 files changed, 335 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/ptr-overflow-sanitization-1.c

diff --git a/gcc/sanopt.c b/gcc/sanopt.c
index d17c7db3321..3c15354e09a 100644
--- a/gcc/sanopt.c
+++ b/gcc/sanopt.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfghooks.h"
 #include "tree-dfa.h"
 #include "tree-ssa.h"
+#include "varasm.h"
 
 /* This is used to carry information about basic blocks.  It is
attached to the AUX field of the standard CFG block.  */
@@ -105,7 +106,7 @@ struct sanopt_tree_triplet_hash : typed_noop_remove 
   typedef sanopt_tree_triplet value_type;
   typedef sanopt_tree_triplet compare_type;
 
-  static inline hashval_t
+  static hashval_t
   hash (const sanopt_tree_triplet &ref)
   {
 inchash::hash hstate (0);
@@ -115,7 +116,7 @@ struct sanopt_tree_triplet_hash : typed_noop_remove 
 return hstate.end ();
   }
 
-  static inline bool
+  static bool
   equal (const sanopt_tree_triplet &ref1, const sanopt_tree_triplet &ref2)
   {
 return operand_equal_p (ref1.t1, ref2.t1, 0)
@@ -123,31 +124,86 @@ struct sanopt_tree_triplet_hash : typed_noop_remove 
 	   && operand_equal_p (ref1.t3, ref2.t3, 0);
   }
 
-  static inline void
+  static void
   mark_deleted (sanopt_tree_triplet &ref)
   {
 ref.t1 = reinterpret_cast (1);
   }
 
-  static inline void
+  static void
   mark_empty (sanopt_tree_triplet &ref)
   {
 ref.t1 = NULL;
   }
 
-  static inline bool
+  static bool
   is_deleted (const sanopt_tree_triplet &ref)
   {
-return ref.t1 == (void *) 1;
+return ref.t1 == reinterpret_cast (1);
   }
 
-  static inline bool
+  static bool
   is_empty (const sanopt_tree_triplet &ref)
   {
 return ref.t1 == NULL;
   }
 };
 
+/* Tree couple for ptr_check_map.  */
+struct sanopt_tree_couple
+{
+  tree ptr;
+  bool pos_p;
+};
+
+/* Traits class for tree triplet hash maps below.  */
+
+struct sanopt_tree_couple_hash : typed_noop_remove 
+{
+  typedef sanopt_tree_couple value_type;
+  typedef sanopt_tree_couple compare_type;
+
+  static hashval_t
+  hash (const sanopt_tree_couple &ref)
+  {
+inchash::hash hstate (0);
+inchash::add_expr (ref.ptr, hstate);
+hstate.add_int (ref.pos_p);
+return hstate.end ();
+  }
+
+  static bool
+  equal (const sanopt_tree_couple &ref1, const sanopt_tree_couple &ref2)
+  {
+return operand_equal_p (ref1.ptr, ref2.ptr, 0)
+	   && ref1.pos_p == ref2.pos_p;
+  }
+
+  static void
+  mark_deleted (sanopt_tree_couple &ref)
+  {
+ref.ptr = reinterpret_cast (1);
+  }
+
+  static void
+  mark_empty (sanopt_tree_couple &ref)
+  {
+ref.ptr = NULL;
+  }
+
+  static bool
+  is_deleted (const sanopt_tree_couple &ref)
+  {
+return ref.ptr == reinterpret_cast (1);
+  }
+
+  static bool
+  is_empty (const sanopt_tree_couple &ref)
+  {
+return ref.ptr == NULL;
+  }
+};
+
 /* This is used to carry various hash maps and variables used
in sanopt_optimize_walker.  */
 
@@ -166,6 +222,10 @@ struct sanopt_ctx
  that virtual table pointer.  */
   hash_map > vptr_check_map;
 
+  /* This map maps a couple (tree and boolean) to a vector of UBSAN_PTR
+ call statements that check that pointer overflow.  */
+  hash_map > ptr_check_map;
+
   /* Number of IFN_ASAN_CHECK statements.  */
   int asan_num_accesses;
 
@@ -344,6 +404,179 @@ maybe_optimize_ubsan_null_ifn (struct sanopt_ctx *ctx, gimple *stmt)
   return remove;
 }
 
+/* Return true when pointer PTR for a given CUR_OFFSET is already sanitized
+   in a given sanitization context CTX.  */
+
+static bool
+has_dominating_ubsan_ptr_check (sanopt_ctx *ctx, tree ptr,
+offset_int &cur_offset)
+{
+  bool positive = !wi::neg_p (cur_offset);
+
+  sanopt_tree_couple couple;
+  co

Re: libbacktrace patch committed: Support compressed debug sections

2017-10-06 Thread Gerald Pfeifer
On Thu, 28 Sep 2017, Ian Lance Taylor wrote:
> This patch to libbacktrace adds support for compressed debug sections.
> 2017-09-28  Ian Lance Taylor  
> 
> PR other/67165
> * elf.c (__builtin_prefetch): Define if not __GNUC__.
> (unlikely): Define.
> (SHF_UNCOMPRESSED, ELFCOMPRESS_ZLIB): Define.
> (b_elf_chdr): Define type.
> (enum debug_section): Add ZDEBUG_xxx values.

Since this change I am seeing the following in my nightly GCC build
and test logs on FreeBSD systems:

gmake[2]: autogen: Command not found
gmake[2]: *** [Makefile:176: check] Error 127
gmake[1]: *** [Makefile:3759: check-fixincludes] Error 2
/scratch/tmp/gerald/GCC-HEAD/libbacktrace/ztest.c: In function 'test_large':
/scratch/tmp/gerald/GCC-HEAD/libbacktrace/ztest.c:384:41: warning: passing 
argument 2 of 'uncompress' from incompatible pointer type 
[-Wincompatible-pointer-types]
   r = uncompress (uncompressed_buf, &uncompressed_bufsize,
 ^
In file included from /scratch/tmp/gerald/GCC-HEAD/libbacktrace/ztest.c:43:0:
/usr/include/zlib.h:1265:21: note: expected 'uLongf * {aka long unsigned int 
*}' but argument is of type 'size_t * {aka unsigned int *}'
 ZEXTERN int ZEXPORT uncompress OF((Bytef *dest,   uLongf *destLen,
 ^~
gmake[4]: *** [Makefile:306: check-DEJAGNU] Error 1
gmake[3]: *** [Makefile:350: check-am] Error 2
gmake[2]: *** [Makefile:904: check-recursive] Error 1
gmake[1]: *** [Makefile:22343: check-target-libgomp] Error 2
Fatal error 'mutex is on list' at line 272 in file 
/usr/src/lib/libthr/thread/thr_mutex.c (errno = 0)
Fatal error 'mutex is on list' at line 272 in file 
/usr/src/lib/libthr/thread/thr_mutex.c (errno = 0)
gmake: *** [Makefile:2286: do-check] Error 2

Gerald


Re: [PATCH v2] Add sanopt support for UBSAN_PTR.

2017-10-06 Thread Jakub Jelinek
On Fri, Oct 06, 2017 at 12:21:00PM +0200, Martin Liška wrote:
> +/* Return true when pointer PTR for a given CUR_OFFSET is already sanitized
> +   in a given sanitization context CTX.  */
> +
> +static bool
> +has_dominating_ubsan_ptr_check (sanopt_ctx *ctx, tree ptr,
> + offset_int &cur_offset)
> +{
> +  bool positive = !wi::neg_p (cur_offset);
> +
> +  sanopt_tree_couple couple;

Perhaps s/positive/pos_p/ for the vars too?  And remove the empty line
in between bool pos_p = ...; and sanopt_tree_couple couple;

> +  couple.ptr = ptr;
> +  couple.pos_p = positive;
> +
> +  auto_vec &v = ctx->ptr_check_map.get_or_insert (couple);
> +  gimple *g = maybe_get_dominating_check (v);
> +  if (!g)
> +return false;
> +
> +  /* We already have recorded a UBSAN_PTR check for this pointer.  Perhaps we
> + can drop this one.  But only if this check doesn't specify larger 
> offset.
> + */
> +  tree offset = gimple_call_arg (g, 1);
> +  gcc_assert (TREE_CODE (offset) == INTEGER_CST);
> +  offset_int ooffset = wi::sext (wi::to_offset (offset), POINTER_SIZE);
> +
> +  if (positive && wi::les_p (cur_offset, ooffset))
> +return true;
> +  else if (!positive && wi::les_p (ooffset, cur_offset))
> +return true;

Maybe better as:
  if (pos_p)
{
  if (wi::les_p (cur_offset, ooffset))
return true;
}
  else if (wi::les_p (ooffset, cur_offset))
return true;
?

> +/* Record UBSAN_PTR check of given context CTX.  Register pointer PTR on
> +   a given OFFSET that it's handled by GIMPLE STMT.  */
> +
> +static void
> +record_ubsan_ptr_check_stmt (sanopt_ctx *ctx, gimple *stmt, tree ptr,
> +  const offset_int &offset)
> +{
> +  bool positive = !wi::neg_p (offset);
> +
> +  sanopt_tree_couple couple;

Similarly.  Or you could in this case just write couple.pos_p = !wi::neg_p 
(offset);

> +  couple.ptr = ptr;
> +  couple.pos_p = positive;
> +
> +  auto_vec &v = ctx->ptr_check_map.get_or_insert (couple);
> +  v.safe_push (stmt);
> +}
> +

> +  tree cur_offset = gimple_call_arg (stmt, 1);

I wonder if instead of having cur_offset mean offset_int above and
tree here you couldn't use say off for the tree and cur_offset for
the offset_int.

> +  offset_int cur_offset_int = wi::sext (wi::to_offset (cur_offset),
> + POINTER_SIZE);

I personally find it more readable to put = on a new line:
  offset_int cur_offset_int
= wi::sext (wi::to_offset (cur_offset), POINTER_SIZE);
Though, in this case with the above name changes it could even fit:
  offset_int cur_offset = wi::sext (wi::to_offset (off), POINTER_SIZE);

> +   offset_int expr_offset = bitpos / BITS_PER_UNIT;
> +   offset_int total_offset = expr_offset + cur_offset_int;

And wi::neg_p (expr_offset) can be more cheaply written as bitpos < 0
in all spots below.

What I'd add here is
  if (total_offset != wi::sext (total_offset, POINTER_SIZE))
{
  record_ubsan_ptr_check_stmt (...);
  return false;
}
though (or branch to that stuff at the end of function).

> +
> +   /* If BASE is a fixed size automatic variable or
> +  global variable defined in the current TU, we don't have
> +  to instrument anything if offset is within address
> +  of the variable.  */
> +   if ((VAR_P (base)
> +|| TREE_CODE (base) == PARM_DECL
> +|| TREE_CODE (base) == RESULT_DECL)
> +   && DECL_SIZE_UNIT (base)
> +   && TREE_CODE (DECL_SIZE_UNIT (base)) == INTEGER_CST
> +   && (!is_global_var (base) || decl_binds_to_current_def_p (base)))
> + {
> +   offset_int base_size = wi::to_offset (DECL_SIZE_UNIT (base));
> +   if (!wi::neg_p (expr_offset) && wi::les_p (total_offset,
> +  base_size))

Similar comment about line breaking, putting && on another line would
make the args to the function on one line.

> + {
> +   if (!wi::neg_p (total_offset)
> +   && wi::les_p (total_offset, base_size)
> +   && wi::fits_shwi_p (total_offset))

Why wi::fits_shwi_p?  total_offset is non-negative and smaller or equal
than base_size, so that should be enough to return true.
And if it is to make sure total_offset has not overflown, then that should
be done by the sext, otherwise it will only work for 64-bit arches.

> + return true;
> + }
> + }
> +
> +   /* Following expression: UBSAN_PTR (&MEM_REF[ptr + x], y) can be
> +  handled as follows:
> +
> +  1) sign (x) == sign (y), then check for dominating check of (x + y)
> +  2) sign (x) != sign (y), then first check if we have a dominating
> + check for ptr + x.  If so, then we have 2 situations:
> + a) sign (x) == sign (x + y), here we are done, example:
> +UBSAN_PTR (&MEM_REF[ptr + 100], -50)
> + b) check for domi

Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate

2017-10-06 Thread Sudi Das

Hi Jakub

I have modified the entries:

*** gcc/ChangeLog ***

2017-10-05  Sudakshina Das  

PR target/82440
* config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Change to
only call aarch64_simd_valid_immediate on CONST_VECTORs.
(aarch64_reg_or_bic_imm): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-05  Sudakshina Das  

* gcc.target/aarch64/bic_imm_1.c: New test.
* gcc.target/aarch64/orr_imm_1.c: Likewise.


Thanks
Sudi


From: Jakub Jelinek 
Sent: Friday, October 6, 2017 11:11 AM
To: Sudi Das
Cc: gcc-patches@gcc.gnu.org; nd; sell...@cavium.com; Marcus Shawcroft; Richard 
Earnshaw; James Greenhalgh
Subject: Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate
    
On Fri, Oct 06, 2017 at 09:52:35AM +, Sudi Das wrote:
> This patch is a fix for PR 82440.
> The predicates aarch64_reg_or_bic_imm and aarch64_reg_or_orr_imm missed out on
> checking for a CONST_VECTOR before calling aarch64_simd_valid_immediate 
> function.
> Also I think James forgot to add the test cases in the original patch 
> submitted.
> 
> Testing done : Checked for regressions on bootstrapped aarch64-none-linux-gnu.
> Ok for trunk?
> 
> Thanks
> Sudi
> 
> The ChangeLog entry is as follows:
> 
> *** gcc/ChangeLog ***
> 
> 2017-10-06  Sudakshina Das  
> 
>    PR target/82440
>    * config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Modified.
>    (aarch64_reg_or_bic_imm): Likewise.

I'll defer the actual review to aarch64 maintainers, just want to say that
this is not a correct ChangeLog entry.  You should say what has changed, not
just that something has changed.  Something like
Only call aarch64_simd_valid_immediate on CONST_VECTORs.
or similar.

    Jakub


Re: [PATCH] [graphite] translate reads and writes in a single traversal of memory ops

2017-10-06 Thread Richard Biener
On Thu, Oct 5, 2017 at 4:27 PM, Sebastian Pop  wrote:
>
>
> On Mon, Oct 2, 2017 at 4:18 AM, Richard Biener 
> wrote:
>>
>> On Mon, Oct 2, 2017 at 6:53 AM, Sebastian Pop 
>> wrote:
>> > The patch moves the code that translates reads and writes to isl
>> > representation
>> > in a same loop.  This is to avoid traversing the scop blocks and arrays
>> > with
>> > memory operations 3 times.
>>
>> LGTM.
>
>
> Richard, could you please commit this patch, as I will need to figure out
> why my
> ssh keys don't let me to commit the code.  I will probably need to update
> the key.

Done.  You probably still have a v1 key, which was rejected at some point.
I would guess you'll need to contact overseers to replace your key.

Richard.

> Thanks,
> Sebastian
>


Re: [C++ PATCH] Fix comment

2017-10-06 Thread Nathan Sidwell

On 10/05/2017 05:25 PM, Jason Merrill wrote:

On Wed, Oct 4, 2017 at 11:44 AM, Nathan Sidwell  wrote:

On 10/04/2017 11:29 AM, Jason Merrill wrote:

On Wed, Oct 4, 2017 at 10:12 AM, Nathan Sidwell  wrote:



Hmm, I think the error is in the behavior, not the comment.  :)


I wouldn't disagree.


Fixing thus.


Nice.  This is a visible ABI change, so might want to add it to 8.1's 
release notes?

1) variadic fns can no longer mutate the caller's object
2) variadic fns now observe the static type of the object, not the 
dynamic type.


(and if any user's relying on that, it's their problem :)

nathan

--
Nathan Sidwell


Re: tree-complex.c: fix some_nonzerop test over reals (and a bug fix)

2017-10-06 Thread Richard Biener
On Thu, Oct 5, 2017 at 4:41 PM, Laurent Thevenoux
 wrote:
> Hello,
>
> This patch improves the some_nonzerop(tree t) function from tree-complex.c 
> file (the function is only used there).
>
> This function returns true if the tree passed as a parameter is not the
> constant 0 (or +0.0, but only for reals when !flag_signed_zeros). The result
> is then used to determine whether some simplifications are possible for
> complex expansions (addition, multiplication, and division).
>
> Unfortunately, if the tree is a real constant, the function always returns
> true, even for +0.0, because of the explicit test on flag_signed_zeros (so if
> your system enables signed zeros you cannot benefit from those
> simplifications). To avoid this behavior and allow complex expansion
> simplifications, I propose the following patch, which tests for the sign of
> the real constant 0.0 instead of checking the flag.

But

+  if (TREE_CODE (t) == REAL_CST)
+{
+  if (real_identical (&TREE_REAL_CST (t), &dconst0))
+   zerop = !real_isneg (&TREE_REAL_CST (t));
+}

shouldn't you do

   zerop = !flag_signed_zeros || !real_isneg (&TREE_REAL_CST (t));

?

Also doesn't this miscalculate A - B for {ar, 0.} - {br, 0.} given the
simplification simply
returns bi?

So I'm not convinced this handling is correct.

> This first fix reveals a bug (thanks to 
> c-c++-common/torture/complex-sign-add.c ) in the simplification section of 
> expand_complex_addition (also fixed in the patch).

Your patch looks backward and the existing code looks ok.

@@ -937,7 +940,7 @@ expand_complex_addition (gimple_stmt_iterator
*gsi, tree inner_type,

 case PAIR (VARYING, ONLY_REAL):
   rr = gimplify_build2 (gsi, code, inner_type, ar, br);
-  ri = ai;
+  ri = bi;
   break;

down we have

case PAIR (ONLY_REAL, VARYING):
  if (code == MINUS_EXPR)
goto general;
  rr = gimplify_build2 (gsi, code, inner_type, ar, br);
  ri = bi;
  break;

which for sure would need to change as well then.  But for your
changed code we know
bi is zero (but ai may not).

Richard.

> The patch has passed bootstrap and testing on x86_64-pc-linux-gnu .
>
> Best regards,
> Laurent Thévenoux


Re: [PATCH v2] Add sanopt support for UBSAN_PTR.

2017-10-06 Thread Martin Liška
On 10/06/2017 12:53 PM, Jakub Jelinek wrote:
> On Fri, Oct 06, 2017 at 12:21:00PM +0200, Martin Liška wrote:
>> +/* Return true when pointer PTR for a given CUR_OFFSET is already sanitized
>> +   in a given sanitization context CTX.  */
>> +
>> +static bool
>> +has_dominating_ubsan_ptr_check (sanopt_ctx *ctx, tree ptr,
>> +offset_int &cur_offset)
>> +{
>> +  bool positive = !wi::neg_p (cur_offset);
>> +
>> +  sanopt_tree_couple couple;
> 
> Perhaps s/positive/pos_p/ for the vars too?  And remove the empty line
> in between bool pos_p = ...; and sanopt_tree_couple couple;
> 
>> +  couple.ptr = ptr;
>> +  couple.pos_p = positive;
>> +
>> +  auto_vec &v = ctx->ptr_check_map.get_or_insert (couple);
>> +  gimple *g = maybe_get_dominating_check (v);
>> +  if (!g)
>> +return false;
>> +
>> +  /* We already have recorded a UBSAN_PTR check for this pointer.  Perhaps 
>> we
>> + can drop this one.  But only if this check doesn't specify larger 
>> offset.
>> + */
>> +  tree offset = gimple_call_arg (g, 1);
>> +  gcc_assert (TREE_CODE (offset) == INTEGER_CST);
>> +  offset_int ooffset = wi::sext (wi::to_offset (offset), POINTER_SIZE);
>> +
>> +  if (positive && wi::les_p (cur_offset, ooffset))
>> +return true;
>> +  else if (!positive && wi::les_p (ooffset, cur_offset))
>> +return true;
> 
> Maybe better as:
>   if (pos_p)
> {
>   if (wi::les_p (cur_offset, ooffset))
>   return true;
> }
>   else if (wi::les_p (ooffset, cur_offset))
> return true;
> ?
> 
>> +/* Record UBSAN_PTR check of given context CTX.  Register pointer PTR on
>> +   a given OFFSET that it's handled by GIMPLE STMT.  */
>> +
>> +static void
>> +record_ubsan_ptr_check_stmt (sanopt_ctx *ctx, gimple *stmt, tree ptr,
>> + const offset_int &offset)
>> +{
>> +  bool positive = !wi::neg_p (offset);
>> +
>> +  sanopt_tree_couple couple;
> 
> Similarly.  Or you could in this case just write couple.pos_p = !wi::neg_p 
> (offset);
> 
>> +  couple.ptr = ptr;
>> +  couple.pos_p = positive;
>> +
>> +  auto_vec &v = ctx->ptr_check_map.get_or_insert (couple);
>> +  v.safe_push (stmt);
>> +}
>> +
> 
>> +  tree cur_offset = gimple_call_arg (stmt, 1);
> 
> I wonder if instead of having cur_offset mean offset_int above and
> tree here you couldn't use say off for the tree and cur_offset for
> the offset_int.
> 
>> +  offset_int cur_offset_int = wi::sext (wi::to_offset (cur_offset),
>> +POINTER_SIZE);
> 
> I personally find it more readable to put = on a new line:
>   offset_int cur_offset_int
> = wi::sext (wi::to_offset (cur_offset), POINTER_SIZE);
> Though, in this case with the above name changes it could even fit:
>   offset_int cur_offset = wi::sext (wi::to_offset (off), POINTER_SIZE);
> 
>> +  offset_int expr_offset = bitpos / BITS_PER_UNIT;
>> +  offset_int total_offset = expr_offset + cur_offset_int;
> 
> And wi::neg_p (expr_offset) can be more cheaply written as bitpos < 0
> in all spots below.
> 
> What I'd add here is
>   if (total_offset != wi::sext (total_offset, POINTER_SIZE))
> {
>   record_ubsan_ptr_check_stmt (...);
>   return false;
> }
> though (or branch to that stuff at the end of function).
> 
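The wraparound test suggested above works because a sum fits the pointer width exactly when sign-extending it from that width is a no-op. A sketch with a hypothetical 32-bit pointer width (GCC's code uses POINTER_SIZE and offset_int instead):

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a 64-bit value from bit 31, mimicking wi::sext (v, 32).
static int64_t sext32 (int64_t v)
{
  return (int64_t) (int32_t) (uint32_t) v;
}

// If sign extension changes the value, the total offset wrapped around the
// pointer width and the check cannot be treated as redundant.
static bool offset_fits_pointer (int64_t total_offset)
{
  return sext32 (total_offset) == total_offset;
}
```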
>> +
>> +  /* If BASE is a fixed size automatic variable or
>> + global variable defined in the current TU, we don't have
>> + to instrument anything if offset is within address
>> + of the variable.  */
>> +  if ((VAR_P (base)
>> +   || TREE_CODE (base) == PARM_DECL
>> +   || TREE_CODE (base) == RESULT_DECL)
>> +  && DECL_SIZE_UNIT (base)
>> +  && TREE_CODE (DECL_SIZE_UNIT (base)) == INTEGER_CST
>> +  && (!is_global_var (base) || decl_binds_to_current_def_p (base)))
>> +{
>> +  offset_int base_size = wi::to_offset (DECL_SIZE_UNIT (base));
>> +  if (!wi::neg_p (expr_offset) && wi::les_p (total_offset,
>> + base_size))
> 
> Similar comment about line breaking, putting && on another line would
> make the args to the function on one line.
> 
>> +{
>> +  if (!wi::neg_p (total_offset)
>> +  && wi::les_p (total_offset, base_size)
>> +  && wi::fits_shwi_p (total_offset))
> 
> Why wi::fits_shwi_p?  total_offset is non-negative and smaller than or equal
> to base_size, so that should be enough to return true.
> And if it is to make sure total_offset has not overflown, then that should
> be done by the sext, otherwise it will only work for 64-bit arches.
> 
>> +return true;
>> +}
>> +}
>> +
>> +  /* Following expression: UBSAN_PTR (&MEM_REF[ptr + x], y) can be
>> + handled as follows:
>> +
>> + 1) sign (x) == sign (y), then check for dominating check of (x + y)
>> + 2) sign (x) != sign (y), then first check if we have a dominating
>> +check fo

Re: [PATCH] PR82396 workaround for AArch64 bootstrap failure

2017-10-06 Thread Richard Biener
On Fri, Oct 6, 2017 at 1:21 AM, Wilco Dijkstra  wrote:
> r253236 broke AArch64 bootstrap.  This is a temporary workaround that
> disables qsort checking in the scheduler to enable continued development
> and testing on AArch64.  This will be removed once the autopref scheduling
> code has been fixed.
>
> AArch64 bootstrap completes, OK for commit?

Ok.

> ChangeLog:
> 2017-10-05  Wilco Dijkstra  
>
> PR rtl-optimization/82396
> * haifa-sched.c (ready_sort_real): Disable qsort checking.
>
> --
> diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
> index 549e8961411ecd0a04ac3b24ba78b5d53e63258a..e7014cbb8b378c998148189b4d871e93c3e81918 100644
> --- a/gcc/haifa-sched.c
> +++ b/gcc/haifa-sched.c
> @@ -3084,7 +3084,8 @@ ready_sort_real (struct ready_list *ready)
>if (n_ready_real == 2)
>  swap_sort (first, n_ready_real);
>else if (n_ready_real > 2)
> -qsort (first, n_ready_real, sizeof (rtx), rank_for_schedule);
> +/* HACK: Disable qsort checking for now (PR82396).  */
> +(qsort) (first, n_ready_real, sizeof (rtx), rank_for_schedule);
>
>if (sched_verbose >= 4)
>  {
>
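The workaround relies on a preprocessor rule: a function-like macro is only expanded when its name is immediately followed by '(', so writing "(qsort)(...)" calls the real function even while a checking macro named qsort is in scope. A sketch with hypothetical names:

```cpp
#include <cassert>

// Hypothetical function-like "checking" macro shadowing a real function.
#define checked_add(a, b) ((a) + (b) + 1000)

// The real function; parenthesizing the name suppresses macro expansion,
// so this definition compiles even with the macro above in scope.
static int (checked_add) (int a, int b)
{
  return a + b;
}

static int via_macro (int a, int b)
{
  return checked_add (a, b);      // name followed by '(' -> macro expands
}

static int via_function (int a, int b)
{
  return (checked_add) (a, b);    // parenthesized name -> real function
}
```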


Re: Allow non-wi wi

2017-10-06 Thread Richard Biener
On Fri, Oct 6, 2017 at 11:35 AM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Tue, Oct 3, 2017 at 8:34 PM, Richard Sandiford
>>  wrote:
>>> This patch uses global rather than member operators for wide-int.h,
>>> so that the first operand can be a non-wide-int type.
>>
>> Not sure why we had the in-class ones.  If we had some good arguments
>> they'd still stand.  Do you remember?
>
> Not really, sorry.  This might not have been discussed specifically.
> It looks like Kenny and Mike's initial commit to the wide-int branch
> had member operators, so it could just have been carried over by
> default.  And using member operators in the initial commit might have
> been influenced by double_int (which has them too), since at that time
> wide_int was very much a direct replacement for double_int.

Ah, yeah...

> Thanks,
> Richard


[PATCH][GRAPHITE] Fix PR82449

2017-10-06 Thread Richard Biener

The following fences off a few more SCEVs through scev_analyzable_p, given
that at the end we need those to pass chrec_apply when getting a rename
through SCEV.

The SCEV in question is

  {(integer(kind=4)) {0, +, {1, +, 1}_1}_1, +, 1}_2

which fails to chrec_apply in the CHREC_LEFT part because that part
is not affine (and we're usually not replacing an IV with a constant
where chrec_apply might handle one or the other case).
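For reference, chrec_apply evaluates an evolution at a given iteration; for an affine chrec {base, +, step} it is just a closed form, which is why the new predicate requires CHREC_LEFT and CHREC_RIGHT to be affine or chrec-free. A numeric sketch (not GCC's tree-based implementation):

```cpp
#include <cassert>
#include <cstdint>

// What chrec_apply computes for an affine chrec {base, +, step}: the value
// after x iterations.  The SCEV above has a CHREC_LEFT whose own step is
// again a chrec, so no such closed form applies to it.
static int64_t chrec_apply_affine (int64_t base, int64_t step, int64_t x)
{
  return base + step * x;
}
```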

Bootstrapped and tested on x86_64-unknown-linux-gnu.

This fixes three out of the remaining 8 codegen errors in SPEC CPU 2006.

Ok?

Thanks,
Richard.

2017-10-06  Richard Biener  

PR tree-optimization/82449
* sese.c (can_chrec_apply): New function.
(scev_analyzable_p): Check we can call chrec_apply on the SCEV.

* gfortran.dg/graphite/pr82449.f: New testcase.

Index: gcc/sese.c
===
--- gcc/sese.c  (revision 253477)
+++ gcc/sese.c  (working copy)
@@ -421,6 +421,27 @@ invariant_in_sese_p_rec (tree t, const s
   return true;
 }
 
+/* Check whether we can call chrec_apply on CHREC with arbitrary X and VAR.  */
+
+static bool
+can_chrec_apply (tree chrec)
+{
+  if (automatically_generated_chrec_p (chrec))
+return false;
+  switch (TREE_CODE (chrec))
+{
+case POLYNOMIAL_CHREC:
+  if (evolution_function_is_affine_p (chrec))
+   return (can_chrec_apply (CHREC_LEFT (chrec))
+   && can_chrec_apply (CHREC_RIGHT (chrec)));
+  return false;
+CASE_CONVERT:
+  return can_chrec_apply (TREE_OPERAND (chrec, 0));
+default:;
+  return tree_does_not_contain_chrecs (chrec);
+}
+}
+
 /* Return true when DEF can be analyzed in REGION by the scalar
evolution analyzer.  */
 
@@ -449,6 +470,7 @@ scev_analyzable_p (tree def, sese_l ®
|| !defined_in_sese_p (scev, region))
 && (tree_does_not_contain_chrecs (scev)
|| evolution_function_is_affine_p (scev))
+&& can_chrec_apply (scev)
 && (! loop
|| ! loop_in_sese_p (loop, region)
|| ! chrec_contains_symbols_defined_in_loop (scev, loop->num));
Index: gcc/testsuite/gfortran.dg/graphite/pr82449.f
===
--- gcc/testsuite/gfortran.dg/graphite/pr82449.f	(nonexistent)
+++ gcc/testsuite/gfortran.dg/graphite/pr82449.f	(working copy)
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! { dg-options "-O2 -floop-nest-optimize" }
+
+  SUBROUTINE JDFIDX(MKL,KGSH)
+  DIMENSION MKL(6,6)
+  NKL=0
+  400 DO 40 KG = 1,KGSH
+  DO 40 LG = 1,KG
+  NKL = NKL + 1
+   40 MKL(LG,KG) = NKL
+  END


Re: [PATCH v2] Add sanopt support for UBSAN_PTR.

2017-10-06 Thread Jakub Jelinek
On Fri, Oct 06, 2017 at 01:43:20PM +0200, Martin Liška wrote:
> Thanks for feedback, all resolved except this one:
> 
> ../../gcc/sanopt.c:561:3: warning: suggest braces around empty body in an ‘else’ statement [-Wempty-body]
>; /* Don't record base_addr + expr_offset, it's not a guarding
>^

Ah, indeed (unlike for if (...) /* comment */; ).
So perhaps do if (!cond) /* comment */; else { ... }
instead of if (cond) { ... } else /* comment */; ?
Or put the else stuff into the comment.
I certainly can't find a single occurrence of the else {} idiom
in gcc/*.[ch].
But with the multi-line {} I guess I can live with it too.
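The restructuring suggested above can be sketched as follows (hypothetical function; per the thread, the warning fires for an empty commented else body but not for an empty commented if body):

```cpp
// Inverting the condition moves the empty, comment-only body under the
// 'if', where -Wempty-body does not complain, and keeps the real work in
// the 'else'.
static int clamp_nonneg (int x)
{
  int r = 0;
  if (x <= 0)
    ;  /* Nothing to record: not a guarding condition.  */
  else
    r = x;
  return r;
}
```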

So, if it passes bootstrap/regtest, ok for trunk, with or without
the above suggested change.

Jakub


Fix profile update in switch conversion

2017-10-06 Thread Jan Hubicka
Hi,
this patch fixes a missing profile update that triggers during profiledbootstrap.

Honza

* tree-switch-conversion.c (do_jump_if_equal, emit_cmp_and_jump_insns):
Update edge counts.

Index: tree-switch-conversion.c
===
--- tree-switch-conversion.c	(revision 253444)
+++ tree-switch-conversion.c	(working copy)
@@ -2248,10 +2248,12 @@ do_jump_if_equal (basic_block bb, tree o
   edge false_edge = split_block (bb, cond);
   false_edge->flags = EDGE_FALSE_VALUE;
   false_edge->probability = prob.invert ();
+  false_edge->count = bb->count.apply_probability (false_edge->probability);
 
   edge true_edge = make_edge (bb, label_bb, EDGE_TRUE_VALUE);
   fix_phi_operands_for_edge (true_edge, phi_mapping);
   true_edge->probability = prob;
+  true_edge->count = bb->count.apply_probability (true_edge->probability);
 
   return false_edge->dest;
 }
@@ -2291,10 +2293,12 @@ emit_cmp_and_jump_insns (basic_block bb,
   edge false_edge = split_block (bb, cond);
   false_edge->flags = EDGE_FALSE_VALUE;
   false_edge->probability = prob.invert ();
+  false_edge->count = bb->count.apply_probability (false_edge->probability);
 
   edge true_edge = make_edge (bb, label_bb, EDGE_TRUE_VALUE);
   fix_phi_operands_for_edge (true_edge, phi_mapping);
   true_edge->probability = prob;
+  true_edge->count = bb->count.apply_probability (true_edge->probability);
 
   return false_edge->dest;
 }
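The update itself is simple bookkeeping: each outgoing edge's count is the block count scaled by the edge probability, so the two edge counts sum back to the block count. A sketch with plain integers instead of GCC's profile_count/profile_probability types:

```cpp
#include <cassert>
#include <cstdint>

struct edge_counts { int64_t true_count, false_count; };

// When a block with count C gains a conditional with branch probability p,
// the true edge gets C scaled by p and the false edge the remainder, so
// flow conservation (true + false == C) is preserved.
static edge_counts split_counts (int64_t bb_count, double prob)
{
  edge_counts e;
  e.true_count = (int64_t) (bb_count * prob);
  e.false_count = bb_count - e.true_count;
  return e;
}
```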


[C PATCH] Fix -Wtautological-compare (PR c/82437; #2)

2017-10-06 Thread Jakub Jelinek
On Thu, Oct 05, 2017 at 11:05:57PM +0200, Marek Polacek wrote:
> On Thu, Oct 05, 2017 at 10:34:26PM +0200, Jakub Jelinek wrote:
> > Hi!
> > 
> > In warn_tautological_bitwise_comparison, there is even a comment
> > mentioning the fact that the types of the two constants might not be the
> > same (it is called with the original arguments before they are promoted
> > to common type for the comparison).
> > 
> > On the following testcase, one of the constants was unsigned (because
> > it has been converted to that when anded with unsigned variable), while
> > the other was signed and wi::to_widest (bitopcst) & wi::to_widest (cst)
> > wasn't sign-extended from 32-bit (because bitopcst was unsigned),
> > while wi::to_widest (cst) was (because cst was int), so we warned even
> > when we shouldn't.
> > 
> > The following patch instead uses the precision of the larger type and
> > zero extends from that (because we really don't need to care about bits
> > beyond that precision).
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Looks ok, thanks.

Actually, after committing it I've realized it was wrong.  Sorry for that.

By forcing UNSIGNED sign in wide_int::from it forces zero extension from the
precision of the tree to prec, but if there are any bits in between, we
need extension according to the sign of the constant.  As we create the
wide_int with prec, comparison of the wide_ints only looks at the low prec
bits (or, if both wide_ints are guaranteed to be sign-extended from that
the bits above will match too).
So, the right thing to do is wide_int::from (cst, prec, TYPE_SIGN (TREE_TYPE (cst))),
which is wi::to_wide (cst, prec).
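The difference is easy to demonstrate with the 32-bit constant -1879048192 from the testcase: zero-extending it to a wider precision (what the reverted wide_int::from (..., UNSIGNED) call did) yields a different value than extending with the constant's own sign (what wi::to_wide does), so comparisons went wrong. A sketch:

```cpp
#include <cassert>
#include <cstdint>

// Widen a negative 32-bit constant to 64 bits two ways: zero extension
// (the buggy UNSIGNED variant) versus sign extension (the constant's own
// sign).  The results differ, which is what made the warning misfire.
static uint64_t zero_extend_32 (int32_t v) { return (uint32_t) v; }
static uint64_t sign_extend_32 (int32_t v) { return (uint64_t) (int64_t) v; }
```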

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-10-06  Jakub Jelinek  

PR c/82437
* c-warn.c (warn_tautological_bitwise_comparison): Use wi::to_wide
instead of wide_int::from.

* c-c++-common/Wtautological-compare-7.c: New test.

--- gcc/c-family/c-warn.c.jj	2017-10-06 09:05:41.0 +0200
+++ gcc/c-family/c-warn.c	2017-10-06 09:28:41.597944319 +0200
@@ -362,8 +362,8 @@ warn_tautological_bitwise_comparison (lo
   int prec = MAX (TYPE_PRECISION (TREE_TYPE (cst)),
  TYPE_PRECISION (TREE_TYPE (bitopcst)));
 
-  wide_int bitopcstw = wide_int::from (bitopcst, prec, UNSIGNED);
-  wide_int cstw = wide_int::from (cst, prec, UNSIGNED);
+  wide_int bitopcstw = wi::to_wide (bitopcst, prec);
+  wide_int cstw = wi::to_wide (cst, prec);
 
   wide_int res;
   if (TREE_CODE (bitop) == BIT_AND_EXPR)
--- gcc/testsuite/c-c++-common/Wtautological-compare-7.c.jj	2017-10-06 09:26:04.812942462 +0200
+++ gcc/testsuite/c-c++-common/Wtautological-compare-7.c	2017-10-06 09:25:57.545035088 +0200
@@ -0,0 +1,11 @@
+/* PR c/82437 */
+/* { dg-do compile { target int32 } } */
+/* { dg-options "-Wtautological-compare" } */
+
+int
+foo (unsigned long long int x)
+{
+  if ((x | 0x19000ULL) != -1879048192) /* { dg-bogus "bitwise comparison always evaluates to" } */
+return 0;
+  return 1;
+}


Jakub


[PATCH] Implement smart multiple switch expansion algorithms.

2017-10-06 Thread Martin Liška
Hello.

As presented at this year's Cauldron, I rewrote the current switch expansion to
support multiple algorithms (jump table and bit-test) that can each be used for
just a fraction of the cases.  A balanced decision tree is built on top of that.
I decided to do a bigger refactoring and put each of the three mentioned
algorithms into its own class.

There's a bigger change in jump_table_cluster::can_be_handled where the
constant 10 (3 respectively) is compared to the number of handled values,
not the number of cases.  The latter is wrong in my opinion.
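The profitability test being changed can be sketched as a density check: a jump table is worthwhile when the spanned case range is at most a small multiple of the number of distinct handled values (10 when optimizing for speed, 3 for size, mirroring the constants mentioned above; this is a simplified sketch, not the real jump_table_cluster::can_be_handled):

```cpp
#include <cassert>
#include <cstdint>

// A jump table over [low, high] is profitable when the table is dense
// enough: range no larger than max_ratio times the handled-value count.
static bool jump_table_profitable (int64_t low, int64_t high,
                                   uint64_t n_values, bool optimize_size)
{
  uint64_t range = (uint64_t) (high - low) + 1;
  uint64_t max_ratio = optimize_size ? 3 : 10;
  return range <= max_ratio * n_values;
}
```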

There are some numbers for cc1plus:

$ bloaty ./objdir2/gcc/cc1plus -- ./objdir/gcc/cc1plus
 VM SIZE  FILE SIZE
 ++ GROWING++
   +19% +1.32Mi .rodata+1.32Mi   +19%
  [ = ]   0 .symtab+7.38Ki  +0.5%
  [ = ]   0 .strtab+5.18Ki  +0.2%
  [ = ]   0 .debug_info   +743  +0.0%
  +0.0%+712 .eh_frame +712  +0.0%
  +0.1%+440 .eh_frame_hdr +440  +0.1%
  [ = ]   0 .debug_aranges +80  +0.1%
  +0.0% +67 .dynstr+67  +0.0%
  [ = ]   0 .debug_str  +6  +0.0%

 -- SHRINKING  --
  -1.2%  -214Ki .text   -214Ki  -1.2%
  [ = ]   0 .debug_loc -66.7Ki  -0.1%
  [ = ]   0 .debug_line-14.0Ki  -0.2%
  [ = ]   0 .debug_ranges  -9.56Ki  -0.1%
  -6.8%  -3 [Unmapped]-375 -14.4%
  [ = ]   0 .debug_abbrev  -46  -0.0%

  +3.8% +1.11Mi TOTAL  +1.03Mi  +0.5%

So it grows, and that can be easily explained:

insn-attrtab.o:
 VM SIZE  FILE SIZE
 ++ GROWING++
  [ = ]   0 .rela.rodata   +2.00Mi  +215%
  +214%  +682Ki .rodata +682Ki  +214%
  [ = ]   0 .rela.debug_loc+29.3Ki   +36%
   +32% +1.91Ki .eh_frame  +1.91Ki   +32%
  [ = ]   0 .debug_loc +37  +5.6%
  [ = ]   0 .debug_str  +2  +0.0%

 -- SHRINKING  --
 -50.1% -63.3Ki .text  -63.3Ki -50.1%
  [ = ]   0 .debug_line-1.71Ki -10.4%
  [ = ]   0 .rela.debug_info  -768  -0.3%
  [ = ]   0 .rela.text-624  -0.8%
  [ = ]   0 .rela.debug_ranges-384  -2.7%
  [ = ]   0 .debug_info-87  -0.2%
  [ = ]   0 [Unmapped]  -2  -8.7%

  +137%  +621Ki TOTAL  +2.63Mi  +139%

It's caused by:
...
;; GIMPLE switch case clusters: JT(2710):-1-5090 
;; GIMPLE switch case clusters: JT(2710):-1-5090 
;; GIMPLE switch case clusters: JT(2967):-1-5090 
;; GIMPLE switch case clusters: JT(1033):-1-5017 
...

so there are many switch statements with very reasonable density.

and
insn-dfatab.o:+13% +99.4Ki TOTAL   +455Ki   +14%
insn-latencytab.o:   +19%  +136Ki TOTAL   +609Ki   +20%

There shouldn't be any fallout other than what I mentioned in the previous
email that is a prerequisite for this patch.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

I'm still planning to come up with some numbers and stats. Will do that next week.

Martin


From db7ad52eb5d65e41d617fb3ff10ab1bc2cef8d04 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 20 Sep 2017 15:22:04 +0200
Subject: [PATCH] Implement smart multiple switch expansion algorithms.

gcc/ChangeLog:

2017-09-29  Martin Liska  

	* tree-switch-conversion.c (MAX_CASE_BIT_TESTS): Remove and move
	to header file.
	(hoist_edge_and_branch_if_true): Move to ...
	(bit_test_cluster::hoist_edge_and_branch_if_true): ... this.
	(expand_switch_using_bit_tests_p): Remove.
	(struct case_bit_test): Move to header file.
	(case_bit_test_cmp): Rename to case_bit_test::cmp.
	(emit_case_bit_tests): Move to bit_test_cluster::emit.
	(struct switch_conv_info): Transform to switch_conversion class.
	(collect_switch_conv_info): Move to ...
	(switch_conversion::collect): ... this.
	(check_range): Move to ...
	(switch_conversion::check_range): ... this.
	(check_all_empty_except_final): Likewise.
	(switch_conversion::check_all_empty_except_final): Likewise.
	(check_final_bb): Likewise.
	(switch_conversion::check_final_bb): Likewise.
	(create_temp_arrays): Likewise.
	(switch_conversion::create_temp_arrays): Likewise.
	(free_temp_arrays): Remove.
	(gather_default_values): Move to ...
	(switch_conversion::gather_default_values): ... this.
	(build_constructors): Likewise.
	(switch_conversion::build_constructors): Likewise.
	(array_value_type): Likewise.
	(switch_conversion::array_value_type): Likewise.
	(build_one_array): Likewise.
	(switch_conversion::build_one_array): Likewise.
	(build_arrays): Likewise.
	(switch_conversion::build_arrays): Likewise.
	(gen_def_assigns): Likewise.
	(switch_conversion::gen_def_assigns): Likewise.
	(prune_bbs): Likewise.
	(switch_conversion::prune_bbs): Likewise.
	(fix_phi_nodes): Likewise.
	(switch_conversion::fix_phi_nodes): Likewise.
	(gen_inbound_check): Likewise.
	(switch_conversion::gen_inbound_check): Likewise.
	(construct

[C++ Patch] PR 47791

2017-10-06 Thread Paolo Carlini

Hi,

this is a very old issue, with no impact on the functionality, which I 
already analyzed a bit a while ago: submitter noticed that 
finish_function & co could be cleaned up a bit wrt the constants passed 
in the flags parameter. When I had again a look today cleaning up the 
code appeared very straightforward:  just change the finish_function 
parameter to bool type and adjust the callers. Interestingly, as a 
consequence, cp_parser_function_definition_after_declarator can be also 
simplified, because ctor_initializer_p was only used for the 
finish_function call. Likewise the helper functions called by 
cp_parser_function_definition_after_declarator itself, which were just 
preparing ctor_initializer_p.


Tested x86_64-linux.

Thanks, Paolo.



2017-10-06  Paolo Carlini  

PR c++/47791
* decl.c (finish_function): Take a bool instead of an int; adjust.
* cp-tree.h (finish_function): Adjust declaration.
* decl2.c (generate_tls_wrapper, finish_objects,
finish_static_storage_duration_function): Adjust calls.
* lambda.c (maybe_add_lambda_conv_op, finish_lambda_function):
Likewise.
* method.c (synthesize_method): Likewise.
* optimize.c (maybe_thunk_body, maybe_clone_body): Likewise.
* pt.c (instantiate_decl): Likewise.
* parser.c (cp_parser_function_definition_after_declarator,
cp_parser_late_parsing_for_member, cp_parser_omp_declare_reduction):
Likewise.
(cp_parser_ctor_initializer_opt,
cp_parser_ctor_initializer_opt_and_function_body,
cp_parser_function_try_block,
cp_parser_function_definition_after_declarator,
cp_parser_function_transaction): Return void; adjust declarations.
Index: cp-tree.h
===
--- cp-tree.h   (revision 253483)
+++ cp-tree.h   (working copy)
@@ -6106,7 +6106,7 @@ extern bool start_function
(cp_decl_specifier_se
 extern tree begin_function_body(void);
 extern void finish_function_body   (tree);
 extern tree outer_curly_brace_block(tree);
-extern tree finish_function(int);
+extern tree finish_function(bool);
 extern tree grokmethod (cp_decl_specifier_seq *, const cp_declarator *, tree);
 extern void maybe_register_incomplete_var  (tree);
 extern void maybe_commonize_var(tree);
Index: decl.c
===
--- decl.c  (revision 253483)
+++ decl.c  (working copy)
@@ -7824,7 +7824,7 @@ start_cleanup_fn (void)
 static void
 end_cleanup_fn (void)
 {
-  expand_or_defer_fn (finish_function (0));
+  expand_or_defer_fn (finish_function (/*inline_p=*/false));
 
   pop_from_top_level ();
 }
@@ -15456,20 +15456,16 @@ maybe_save_function_definition (tree fun)
 
 /* Finish up a function declaration and compile that function
all the way to assembler language output.  The free the storage
-   for the function definition.
+   for the function definition. INLINE_P is TRUE if we just
+   finished processing the body of an in-class inline function
+   definition.  (This processing will have taken place after the
+   class definition is complete.)  */
 
-   FLAGS is a bitwise or of the following values:
- 2 - INCLASS_INLINE
-   We just finished processing the body of an in-class inline
-   function definition.  (This processing will have taken place
-   after the class definition is complete.)  */
-
 tree
-finish_function (int flags)
+finish_function (bool inline_p)
 {
   tree fndecl = current_function_decl;
   tree fntype, ctype = NULL_TREE;
-  int inclass_inline = (flags & 2) != 0;
 
   /* When we get some parse errors, we can end up without a
  current_function_decl, so cope.  */
@@ -15729,7 +15725,7 @@ tree
  bindings for the template parameters that we added in
  maybe_begin_member_template_processing when start_function was
  called.  */
-  if (inclass_inline)
+  if (inline_p)
 maybe_end_member_template_processing ();
 
   /* Leave the scope of the class.  */
Index: decl2.c
===
--- decl2.c (revision 253483)
+++ decl2.c (working copy)
@@ -3328,7 +3328,7 @@ generate_tls_wrapper (tree fn)
 TREE_READONLY (fn) = true;
   finish_return_stmt (convert_from_reference (var));
   finish_function_body (body);
-  expand_or_defer_fn (finish_function (0));
+  expand_or_defer_fn (finish_function (/*inline_p=*/false));
 }
 
 /* Start the process of running a particular set of global constructors
@@ -3395,7 +3395,7 @@ finish_objects (int method_type, int initp, tree b
 
   /* Finish up.  */
   finish_compound_stmt (body);
-  fn = finish_function (0);
+  fn = finish_function (/*inline_p=*/false);
 
   if (method_type == 'I')
 {
@@ -3535,7 +3535,7 @@ fin

Re: Fix profile update in switch conversion

2017-10-06 Thread Martin Liška
On 10/06/2017 02:18 PM, Jan Hubicka wrote:
> Hi,
> this patch fixes missing profile updat that triggers during profiledbootstrap.
> 
> Honza

Thanks for the fix ;) I've just sent a patch for more complex switch lowering.
The issue may also be present there. Will check.

Martin


Re: [PATCH][GCC][ARM] Dot Product commandline options [Patch (1/8)]

2017-10-06 Thread Tamar Christina
Hi All,

This is a respin of the patch with the feedback processed.

Regtested on arm-none-eabi, armeb-none-eabi,
aarch64-none-elf and aarch64_be-none-elf with no issues found.

Ok for trunk?

gcc/
2017-10-06  Tamar Christina  

* config/arm/arm.h (TARGET_DOTPROD): New.
* config/arm/arm.c (arm_arch_dotprod): New.
(arm_option_reconfigure_globals): Add arm_arch_dotprod.
* config/arm/arm-c.c (__ARM_FEATURE_DOTPROD): New.
* config/arm/arm-cpus.in (cortex-a55, cortex-a75): Enable +dotprod.
(armv8.2-a, cortex-a75.cortex-a55): Likewise.
(feature dotprod, group dotprod, ALL_SIMD_INTERNAL): New.
(ALL_FPU_INTERNAL): Use ALL_SIMD_INTERNAL.
* config/arm/t-multilib (v8_2_a_simd_variants): Add dotprod.
* doc/invoke.texi (armv8.2-a): Document dotprod

From: Kyrill Tkachov 
Sent: Wednesday, September 13, 2017 11:01:55 AM
To: Tamar Christina; gcc-patches@gcc.gnu.org
Cc: nd; Ramana Radhakrishnan; Richard Earnshaw; ni...@redhat.com
Subject: Re: [PATCH][GCC][ARM] Dot Product commandline options [Patch (1/8)]

Hi Tamar,

On 01/09/17 14:19, Tamar Christina wrote:
> Hi All,
>
> This patch adds support for the +dotprod extension to ARM.
> Dot Product requires Adv.SIMD to work and so enables this option
> by default when enabled.
>
> It is available from ARMv8.2-a and onwards and is enabled by
> default on Cortex-A55 and Cortex-A75.
>
> Regtested and bootstrapped on arm-none-eabi and no issues.

I'm assuming you mean arm-none-linux-gnueabihf :)

> Ok for trunk?
>
> gcc/
> 2017-09-01  Tamar Christina  
>
>   * config/arm/arm.h (TARGET_DOTPROD): New.
>   * config/arm/arm.c (arm_arch_dotprod): New.
>   (arm_option_reconfigure_globals): Add arm_arch_dotprod.
>   * config/arm/arm-c.c (__ARM_FEATURE_DOTPROD): New.
>   * config/arm/arm-cpus.in (cortex-a55, cortex-75): Enabled +dotprod.
>   (armv8.2-a, cortex-a75.cortex-a55): Likewise.
>   * config/arm/arm-isa.h (isa_bit_dotprod, ISA_DOTPROD): New.

arm-isa.h is now autogenerated after r251799 so you'll need to rebase on
top of that.
That being said, that patch was temporarily reverted [1] so you'll have
to apply it manually in your
tree to rebase, or wait until it is reapplied.

[1] https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00579.html

The patch looks ok to me otherwise with a documentation nit below.

>   * config/arm/t-multilib (v8_2_a_simd_variants): Add dotprod.
>   * doc/invoke.texi (armv8.2-a): Document dotprod
>

--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15492,6 +15492,10 @@ The ARMv8.1 Advanced SIMD and floating-point instructions.
  The cryptographic instructions.  This also enables the Advanced SIMD and
  floating-point instructions.

+@item +dotprod
+Enable the Dot Product extension.  This also enables Advanced SIMD instructions
+and allows auto vectorization of dot products to the Dot Product instructions.

This should be "auto-vectorization"

Thanks,
Kyrill
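For readers following the series, the loop shape that auto-vectorization can map to the new [us]dot instructions is a widening multiply-accumulate over 8-bit elements into 32-bit accumulators. A scalar reference sketch of the idiom (a candidate for vectorization under -march=armv8.2-a+dotprod; no guarantee any particular loop is transformed):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar form of the unsigned dot-product idiom: 8-bit elements multiplied
// pairwise, widened, and accumulated into a 32-bit sum.
static uint32_t udot (const uint8_t *a, const uint8_t *b, size_t n)
{
  uint32_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += (uint32_t) a[i] * b[i];
  return sum;
}
```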



diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 55472434c3a6e90c5693bbaabd3265f7d968787f..295f03bf8ee02be7c89ed2967d283be206e9f25a 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -73,6 +73,7 @@ arm_cpu_builtins (struct cpp_reader* pfile)
   def_or_undef_macro (pfile, "__ARM_FEATURE_QRDMX", TARGET_NEON_RDMA);
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_CRC32", TARGET_CRC32);
+  def_or_undef_macro (pfile, "__ARM_FEATURE_DOTPROD", TARGET_DOTPROD);
   def_or_undef_macro (pfile, "__ARM_32BIT_STATE", TARGET_32BIT);
 
   cpp_undef (pfile, "__ARM_FEATURE_CMSE");
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 07de4c9375ba7a0df0d8bd00385e54a4042e5264..a49f7aa83c14d7505eede81d52143f425d5a2f0c 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -156,6 +156,8 @@ define feature crypto
 # FP16 data processing (half-precision float).
 define feature fp16
 
+# Dot Product instructions extension to ARMv8.2-a.
+define feature dotprod
 
 # ISA Quirks (errata?).  Don't forget to add this to the fgroup
 # ALL_QUIRKS below.
@@ -181,12 +183,14 @@ define fgroup ALL_CRYPTO	crypto
 
 # List of all SIMD bits to strip out if SIMD is disabled.  This does
 # strip off 32 D-registers, but does not remove support for
-# double-precision FP.
-define fgroup ALL_SIMD	fp_d32 neon ALL_CRYPTO
+# double-precision FP. Make sure bits that are not an FPU bit go to
+# ALL_SIMD instead of ALL_SIMD_INTERNAL.
+define fgroup ALL_SIMD_INTERNAL	fp_d32 neon ALL_CRYPTO
+define fgroup ALL_SIMD	ALL_SIMD_INTERNAL dotprod
 
 # List of all FPU bits to strip out if -mfpu is used to override the
 # default.  fp16 is deliberately missing from this list.
-define fgroup ALL_FPU_INTERNAL	vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD
+define fgroup ALL_FPU_INTERNAL	vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
 
 # Similarly, but including fp16 and other extensions that aren't part of
 # -mfpu s

Re: [PATCH][GCC][AArch64] Dot Product commandline options [Patch (4/8)]

2017-10-06 Thread Tamar Christina
Hi All,

this is the respin with the rewording as requested. Assuming still OK for trunk.

Regtested on arm-none-eabi, armeb-none-eabi,
aarch64-none-elf and aarch64_be-none-elf with no issues found.

Ok for trunk?

gcc/
2017-10-06  Tamar Christina  

* config/aarch64/aarch64.h (AARCH64_FL_DOTPROD): New.
(AARCH64_ISA_DOTPROD, TARGET_DOTPROD): New.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Add TARGET_DOTPROD.
* config/aarch64/aarch64-option-extensions.def (dotprod): New.
* config/aarch64/aarch64-cores.def (cortex-a55, cortex-a75): Enable TARGET_DOTPROD.
(cortex-a75.cortex-a55): Likewise.
* doc/invoke.texi (aarch64-feature-modifiers): Document dotprod.

From: James Greenhalgh 
Sent: Monday, September 4, 2017 11:47:03 AM
To: Tamar Christina
Cc: gcc-patches@gcc.gnu.org; nd; Richard Earnshaw; Marcus Shawcroft
Subject: Re: [PATCH][GCC][AArch64] Dot Product commandline options [Patch (4/8)]

On Fri, Sep 01, 2017 at 02:20:59PM +0100, Tamar Christina wrote:
> Hi All,
>
> This patch adds support for the +dotprod extension to AArch64.
> Dot Product requires Adv.SIMD to work and so enables this option
> by default when enabled.
>
> It is available from ARMv8.2-a and onwards and is enabled by
> default on Cortex-A55 and Cortex-A75.
>
> Regtested and bootstrapped on aarch64-none-elf and no issues.
>
> Ok for trunk?

Just a couple of rewordings needed, and then OK.

> gcc/
> 2017-09-01  Tamar Christina  
>
>   * config/aarch64/aarch64.h (AARCH64_FL_DOTPROD): New.
>   (AARCH64_ISA_DOTPROD, TARGET_DOTPROD): New.
> * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Add TARGET_DOTPROD.
>   * config/aarch64/aarch64-option-extensions.def (dotprod): New.
> * config/aarch64/aarch64-cores.def (cortex-a55, cortex-a75): Enable TARGET_DOTPROD.
>   (cortex-a75.cortex-a55): Likewise.
>   * doc/invoke.texi (aarch64-feature-modifiers): Document dotprod.
>
> --
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -151,7 +151,8 @@ extern unsigned aarch64_architecture_version;
>  #define AARCH64_FL_F16 (1 << 9)  /* Has ARMv8.2-A FP16 extensions.  */
>  /* ARMv8.3-A architecture extensions.  */
> -#define AARCH64_FL_V8_3(1 << 10)  /* Has ARMv8.3-A features.  */
> -#define AARCH64_FL_RCPC(1 << 11)  /* Has support for RCpc model.  */
> +#define AARCH64_FL_RCPC   (1 << 11)  /* Has support for RCpc model.  */
> +#define AARCH64_FL_DOTPROD(1 << 12)  /* Has dot product.  */

Are these correctly formatted with the line above? "Has dot product" is not
very descriptive.

>  /* ARMv8.3-A features.  */
>  #define TARGET_ARMV8_3   (AARCH64_ISA_V8_3)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 
> 4cb5836a9da22681d192c3750fc8e5a50024ac10..61fbc087f4974c0eb833c2daa131a2f7269d1b84
>  100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14271,6 +14271,9 @@ Enable FP16 extension.  This also enables floating-point instructions.
>  Enable the RcPc extension.  This does not change code generation from GCC,
>  but is passed on to the assembler, enabling inline asm statements to use
>  instructions from the RcPc extension.
> +@item dotprod
> +Enable the Dot Product extension.  This also enables Advanced SIMD instructions
> +and allows auto vectorization of dot products to the Dot Product instructions.

I'd drop the text from "and allows" onwards, it isn't very useful for
figuring out exactly what idioms will be supported, and we don't use that
text on other extensions.

Thanks,
James

diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 177e638682f9dae3476b76e48a2d96c70d5acbd1..c7d866f3b567bbb55bf2c5152c9d0729fc2eff2c 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -106,6 +106,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
 
 
   aarch64_def_or_undef (TARGET_CRC32, "__ARM_FEATURE_CRC32", pfile);
+  aarch64_def_or_undef (TARGET_DOTPROD, "__ARM_FEATURE_DOTPROD", pfile);
 
   cpp_undef (pfile, "__AARCH64_CMODEL_TINY__");
   cpp_undef (pfile, "__AARCH64_CMODEL_SMALL__");
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 10893324d3fd856ba60247fd1a48c56d0cf2fc39..16e44855872112c81db349e098f932edd52117be 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -83,8 +83,8 @@ AARCH64_CORE("thunderx2t99",  thunderx2t99,  thunderx2t99, 8_1A,  AARCH64_FL_FOR
 /* ARMv8.2-A Architecture Processors.  */
 
 /* ARM ('A') cores. */
-AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa53, 0x41, 0xd05, -1)
-AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, 0xd0a, -1)
+AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  AA

Re: [PATCH][GCC][ARM] Dot Product NEON patterns [Patch (2/8)]

2017-10-06 Thread Tamar Christina
Hi All,

this is a minor respin with changes echo'd from feedback from aarch64.
I assume still OK for trunk.

Regtested on arm-none-eabi, armeb-none-eabi,
aarch64-none-elf and aarch64_be-none-elf with no issues found.

Ok for trunk?

gcc/
2017-10-06  Tamar Christina  

* config/arm/arm-builtins.c (arm_unsigned_uternop_qualifiers): New.
(UTERNOP_QUALIFIERS, arm_umac_lane_qualifiers, UMAC_LANE_QUALIFIERS): 
New.
* config/arm/arm_neon_builtins.def (sdot, udot, sdot_lane, udot_lane):
New.
* config/arm/iterators.md (DOTPROD, VSI2QI, vsi2qi): New.
(UNSPEC_DOT_S, UNSPEC_DOT_U, opsuffix): New.
* config/arm/neon.md (neon_dot): New.
(neon_dot_lane, dot_prod): New.
* config/arm/types.md (neon_dot, neon_dot_q): New.
* config/arm/unspecs.md (sup): Add UNSPEC_DOT_S, UNSPEC_DOT_U.

From: Kyrill Tkachov 
Sent: Wednesday, September 13, 2017 10:36:38 AM
To: Tamar Christina; gcc-patches@gcc.gnu.org
Cc: nd; Ramana Radhakrishnan; Richard Earnshaw; ni...@redhat.com
Subject: Re: [PATCH][GCC][ARM] Dot Product NEON patterns [Patch (2/8)]

Hi Tamar,

On 01/09/17 14:33, Tamar Christina wrote:
> Hi All,
>
> This patch adds the instructions for Dot Product to ARM along
> with the intrinsics and vectorizer pattern.
>
> Armv8.2-a dot product supports 8-bit element values both
> signed and unsigned.
>
> Dot product is available from Armv8.2-a and onwards.
>
> Regtested and bootstrapped on arm-none-eabi and no issues.
>
> Ok for trunk?

This is ok once the prerequisites are approved with one ChangeLog nit.

Kyrill

> gcc/
> 2017-09-01  Tamar Christina  
>
>   * config/arm/arm-builtins.c (arm_unsigned_uternop_qualifiers): New.
>   (UTERNOP_QUALIFIERS, arm_umac_lane_qualifiers, UMAC_LANE_QUALIFIERS): 
> New.
>   * config/arm/arm_neon_builtins.def (sdot, udot, sdot_lane, udot_lane): 
> new.
>   * config/arm/iterators.md (DOTPROD, DOT_MODE, dot_mode): New.
>   (UNSPEC_DOT_S, UNSPEC_DOT_U, opsuffix): New.
>   * config/arm/neon.md (neon_dot): New.
>   (neon_dot_lane, dot_prod): New.
>   * config/arm/types.md (neon_dot, neon_dot_q): New.
>   * config/arm/unspecs.md (UNSPEC_DOT_S, UNSPEC_DOT_U): New.
>

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 
7acbaf1bb40a4f270e75968804546508f7839e49..139e09fd929e17216ad9383505f1453a73d071fb
 100644
--- a/gcc/config/arm/iterators.md

--snip---

  ;;
  ;; Code attributes
  ;;
@@ -816,6 +822,7 @@
(UNSPEC_VSRA_S_N "s") (UNSPEC_VSRA_U_N "u")
(UNSPEC_VRSRA_S_N "s") (UNSPEC_VRSRA_U_N "u")
(UNSPEC_VCVTH_S "s") (UNSPEC_VCVTH_U "u")
+  (UNSPEC_DOT_S "s") (UNSPEC_DOT_U "u")
  ])

In your ChangeLog you list this as "New" whereas your patch just adds them to 
the "sup" int_attr.

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 569f960fd2e534df6e972245b35ae6def5bec033..6d1b20c80f9a24a8d26a10bea6f6e316886bce31 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -105,6 +105,13 @@ arm_ternop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_none, qualifier_none, qualifier_none, qualifier_none };
 #define TERNOP_QUALIFIERS (arm_ternop_qualifiers)
 
+/* unsigned T (unsigned T, unsigned T, unsigned T).  */
+static enum arm_type_qualifiers
+arm_unsigned_uternop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
+  qualifier_unsigned };
+#define UTERNOP_QUALIFIERS (arm_unsigned_uternop_qualifiers)
+
 /* T (T, immediate).  */
 static enum arm_type_qualifiers
 arm_binop_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
@@ -131,6 +138,13 @@ arm_mac_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   qualifier_none, qualifier_lane_index };
 #define MAC_LANE_QUALIFIERS (arm_mac_lane_qualifiers)
 
+/* unsigned T (unsigned T, unsigned T, unsigned T, lane index).  */
+static enum arm_type_qualifiers
+arm_umac_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
+  qualifier_unsigned, qualifier_lane_index };
+#define UMAC_LANE_QUALIFIERS (arm_umac_lane_qualifiers)
+
 /* T (T, T, immediate).  */
 static enum arm_type_qualifiers
 arm_ternop_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def
index 07f0368343a0c940c1cc1848d31f28a47a587b6f..982eec810dafb5ec955273099853f8842020d104 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -331,3 +331,7 @@ VAR11 (STORE1, vst4,
 	v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf)
 VAR9 (STORE1LANE, vst4_lane,
 	v8qi, v4hi, v4hf, v2si, v2sf, v8hi, v8hf, v4si, v4sf)
+VAR2 (TERNOP, sdot, v8qi, v16qi)
+VAR2 (UTERNOP, udot, v8qi, v16qi)
+VAR2 (MAC_LANE, sdot_lane, v8qi, v16qi)
+VAR2

Re: [PATCH][GCC][Testsuite][ARM][AArch64] Enable Dot Product for generic tests for ARM and AArch64 [Patch (7/8)]

2017-10-06 Thread Tamar Christina
Hi All,

This is a respin with the changes suggested. Note that this patch is now 8/8
in the series.

Regtested on arm-none-eabi, armeb-none-eabi,
aarch64-none-elf and aarch64_be-none-elf with no issues found.

Ok for trunk?

gcc/testsuite
2017-10-06  Tamar Christina  

* gcc.dg/vect/vect-reduc-dot-s8a.c
(dg-additional-options, dg-require-effective-target): Add +dotprod.
* gcc.dg/vect/vect-reduc-dot-u8a.c
(dg-additional-options, dg-require-effective-target): Add +dotprod.

From: Tamar Christina
Sent: Monday, September 4, 2017 12:35:39 PM
To: James Greenhalgh
Cc: gcc-patches@gcc.gnu.org; nd; Richard Earnshaw; Marcus Shawcroft
Subject: RE: [PATCH][GCC][Testsuite][ARM][AArch64] Enable Dot Product for 
generic tests for ARM and AArch64 [Patch (7/8)]

> I'm surprised that this worked!
>
> It looks like you unconditionally add the -march=armv8.2-a+dotprod options,
> which should cause you to generate instructions which will not execute on
> targets which don't support this instruction. As far as I can see, this is an
> execute test, so that should cause undefined instruction exceptions on an
> Armv8-A target at the very least.

It's not; there is no dg-do specified, which means it defaults to "compile".
This is a straight compilation test that checks whether the target can do the
reduction. There may be a main, but it's never executed, which is why I don't
have a hardware check against it.

The unconditional armv8.2-a+dotprod is for this reason: it doesn't matter what
hardware is available.

>
> So, not OK in its current form.
>
> Thanks,
> James
>
> >
> > Ok for trunk?
> >
> > gcc/testsuite
> > 2017-09-01  Tamar Christina  
> >
> > * gcc.dg/vect/vect-reduc-dot-s8a.c
> > (dg-additional-options, dg-require-effective-target): Add +dotprod.
> > * gcc.dg/vect/vect-reduc-dot-u8a.c
> > (dg-additional-options, dg-require-effective-target): Add +dotprod.
> >
> > --

diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c
index dc4f52019d5435edbbc811b73dee0f98ff44c1b1..acb6862f8274fb954f69bd45e8edeedcdca4cbf7 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c
@@ -1,4 +1,7 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_dotprod_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_dotprod_neon }  */
 
 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c
index f3cc6c78c25305d91becd585be8949514ebc521c..c23fe5df252b94d6722708096b8aba7edd100f1a 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c
@@ -1,4 +1,7 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_dotprod_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-add-options arm_v8_2a_dotprod_neon }  */
 
 #include 
 #include "tree-vect.h"



Re: [PATCH][GCC][AArch64] Dot Product SIMD patterns [Patch (5/8)]

2017-10-06 Thread Tamar Christina
Hi All,

This is a respin with the suggested feedback applied.

Regtested on arm-none-eabi, armeb-none-eabi,
aarch64-none-elf and aarch64_be-none-elf with no issues found.

Ok for trunk?

gcc/
2017-10-06  Tamar Christina  

* config/aarch64/aarch64-builtins.c
(aarch64_types_quadopu_lane_qualifiers): New.
(TYPES_QUADOPU_LANE): New.
* config/aarch64/aarch64-simd.md (aarch64_dot): New.
(dot_prod, aarch64_dot_lane): New.
(aarch64_dot_laneq): New.
* config/aarch64/aarch64-simd-builtins.def (sdot, udot): New.
(sdot_lane, udot_lane, sdot_laneq, udot_laneq): New.
* config/aarch64/iterators.md (sur): Add UNSPEC_SDOT, UNSPEC_UDOT.
(Vdottype, DOTPROD): New.
(sur): Add SDOT and UDOT.

From: Tamar Christina
Sent: Tuesday, September 5, 2017 7:42:40 PM
To: James Greenhalgh
Cc: gcc-patches@gcc.gnu.org; nd; Richard Earnshaw; Marcus Shawcroft
Subject: Re: [PATCH][GCC][AArch64] Dot Product SIMD patterns [Patch (5/8)]

>
> 
> From: James Greenhalgh 
> Sent: Monday, September 4, 2017 12:01 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd; Richard Earnshaw; Marcus Shawcroft
> Subject: Re: [PATCH][GCC][AArch64] Dot Product SIMD patterns [Patch (5/8)]
>
> On Fri, Sep 01, 2017 at 02:22:17PM +0100, Tamar Christina wrote:
> > Hi All,
> >
> > This patch adds the instructions for Dot Product to AArch64 along
> > with the intrinsics and vectorizer pattern.
> >
> > Armv8.2-a dot product supports 8-bit element values both
> > signed and unsigned.
> >
> > Dot product is available from Armv8.2-a and onwards.
> >
> > Regtested and bootstrapped on aarch64-none-elf and no issues.
> >
> > Ok for trunk?
> >
> > gcc/
> > 2017-09-01  Tamar Christina  
> >
> >   * config/aarch64/aarch64-builtins.c
> >   (aarch64_types_quadopu_lane_qualifiers): New.
> >   (TYPES_QUADOPU_LANE): New.
> >   * config/aarch64/aarch64-simd.md (aarch64_dot): New.
> >   (dot_prod, aarch64_dot_lane): New.
> >   (aarch64_dot_laneq): New.
> >   * config/aarch64/aarch64-simd-builtins.def (sdot, udot): New.
> >   (sdot_lane, udot_lane, sdot_laneq, udot_laneq): New.
> >   * config/aarch64/iterators.md (UNSPEC_SDOT, UNSPEC_UDOT): New.
> >   (DOT_MODE, dot_mode, Vdottype, DOTPROD): New.
> >   (sur): Add SDOT and UDOT.
> >
> > --
>
> > diff --git a/gcc/config/aarch64/aarch64-simd.md 
> > b/gcc/config/aarch64/aarch64-simd.md
> > index 
> > f3e084f8778d70c82823b92fa80ff96021ad26db..21d46c84ab317c2d62afdf8c48117886aaf483b0
> >  100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -386,6 +386,87 @@
> >  }
> >  )
> >
> > +;; These instructions map to the __builtins for the Dot Product operations.
> > +(define_insn "aarch64_dot"
> > +  [(set (match_operand:VS 0 "register_operand" "=w")
> > + (unspec:VS [(match_operand:VS 1 "register_operand" "0")
> > + (match_operand: 2 "register_operand" "w")
> > + (match_operand: 3 "register_operand" "w")]
> > + DOTPROD))]
> > +  "TARGET_DOTPROD"
> > +  "dot\\t%0., %2., %3."
> > +  [(set_attr "type" "neon_dot")]
>
> Would there be a small benefit in modelling this as:
>
>   [(set (match_operand:VS 0 "register_operand" "=w")
> (add:VS ((match_operand:VS 1 "register_operand" "0")
>  (unspec:VS [(match_operand: 2 "register_operand" 
> "w")
> (match_operand: 3 "register_operand" "w")]
> DOTPROD)))]
>

Maybe, I can't think of anything at the moment, but it certainly won't hurt.

>
> > +)
> > +
> > +;; These expands map to the Dot Product optab the vectorizer checks for.
> > +;; The auto-vectorizer expects a dot product builtin that also does an
> > +;; accumulation into the provided register.
> > +;; Given the following pattern
> > +;;
> > +;; for (i=0; i > +;; c = a[i] * b[i];
> > +;; r += c;
> > +;; }
> > +;; return result;
> > +;;
> > +;; This can be auto-vectorized to
> > +;; r  = a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3];
> > +;;
> > +;; given enough iterations.  However the vectorizer can keep unrolling the 
> > loop
> > +;; r += a[4]*b[4] + a[5]*b[5] + a[6]*b[6] + a[7]*b[7];
> > +;; r += a[8]*b[8] + a[9]*b[9] + a[10]*b[10] + a[11]*b[11];
> > +;; ...
> > +;;
> > +;; and so the vectorizer provides r, in which the result has to be 
> > accumulated.
> > +(define_expand "dot_prod"
> > +  [(set (match_operand:VS 0 "register_operand")
> > + (unspec:VS [(match_operand: 1 "register_operand")
> > + (match_operand: 2 "register_operand")
> > + (match_operand:VS 3 "register_operand")]
> > + DOTPROD))]
>
> This is just an expand that always ends in a DONE, so doesn't need the
> full description here, just:
>
>   [(match_operand:VS 0 "register_operand)
>(match_operand: 1 "register_operand")
>(match_operand: 2 "register_operand")
>   

[PATCH] Add -fsanitize=pointer-{compare,subtract}.

2017-10-06 Thread Martin Liška
Hi.

Adding a missing functionality mentioned and explained here:
https://github.com/google/sanitizers/wiki/AddressSanitizerClangVsGCC-(5.0-vs-7.1)#feature-8

Currently it only works for heap-allocated variables. I'm working on support
for stack and global variables.

The functionality is not included in -fsanitize=address by default; one needs
to explicitly ask for both instrumentation and enabling at run time. It's an
expensive check.

Ready to be installed?
Martin

gcc/ChangeLog:

2017-10-06  Martin Liska  

* asan.c (is_pointer_compare_opcode): New function.
(instrument_pointer_comparison): Likewise.
(asan_instrument): Handle SANITIZE_POINTER_COMPARE and
SANITIZE_POINTER_SUBTRACT.
(gate_asan): Likewise.
* doc/invoke.texi: Document the options.
* flag-types.h (enum sanitize_code): Add
SANITIZE_POINTER_COMPARE and SANITIZE_POINTER_SUBTRACT.
* gcc.c (SANITIZER_EARLY_SPEC): Handle
-fsanitize=pointer-compare and -fsanitize=pointer-subtract.
(SANITIZER_SPEC): Likewise.
(sanitize_spec_function): Likewise.
* opts.c (finish_options): Add checking for the options.
* sanitizer.def (BUILT_IN_ASAN_POINTER_COMPARE): New builtin.
(BUILT_IN_ASAN_POINTER_SUBTRACT): Likewise.
* toplev.c (process_options):  Handle SANITIZE_POINTER_COMPARE and
SANITIZE_POINTER_SUBTRACT.

gcc/testsuite/ChangeLog:

2017-10-06  Martin Liska  

* gcc.dg/asan/pointer-compare-1.c: New test.
* gcc.dg/asan/pointer-compare-2.c: New test.
* gcc.dg/asan/pointer-subtract-1.c: New test.
* gcc.dg/asan/pointer-subtract-2.c: New test.
---
 gcc/asan.c | 104 -
 gcc/doc/invoke.texi|  20 +
 gcc/flag-types.h   |   2 +
 gcc/gcc.c  |  12 ++-
 gcc/opts.c |  10 +++
 gcc/sanitizer.def  |   4 +
 gcc/testsuite/gcc.dg/asan/pointer-compare-1.c  |  31 
 gcc/testsuite/gcc.dg/asan/pointer-compare-2.c  |  19 +
 gcc/testsuite/gcc.dg/asan/pointer-subtract-1.c |  31 
 gcc/testsuite/gcc.dg/asan/pointer-subtract-2.c |  19 +
 gcc/toplev.c   |   4 +-
 11 files changed, 253 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/asan/pointer-compare-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/pointer-compare-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/pointer-subtract-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asan/pointer-subtract-2.c


diff --git a/gcc/asan.c b/gcc/asan.c
index 2aa0a795af2..6feea017795 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2370,6 +2370,104 @@ maybe_instrument_call (gimple_stmt_iterator *iter)
   return instrumented;
 }
 
+/* Return true if a given opcode CODE is potentially a non-valid comparison
+   of pointer types.  */
+
+static bool
+is_pointer_compare_opcode (tree_code code)
+{
+  return (code == LE_EXPR || code == LT_EXPR || code == GE_EXPR
+	  || code == GT_EXPR);
+}
+
+/* Instrument potential invalid operation executed on pointer types:
+   comparison different from != and == and subtraction of pointers.  */
+
+static void
+instrument_pointer_comparison (void)
+{
+  basic_block bb;
+  gimple_stmt_iterator i;
+
+  bool sanitize_comparison_p = sanitize_flags_p (SANITIZE_POINTER_COMPARE);
+  bool sanitize_subtraction_p = sanitize_flags_p (SANITIZE_POINTER_SUBTRACT);
+
+  FOR_EACH_BB_FN (bb, cfun)
+{
+  for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+	{
+	  gimple *s = gsi_stmt (i);
+
+	  tree ptr1 = NULL_TREE;
+	  tree ptr2 = NULL_TREE;
+	  enum built_in_function fn = BUILT_IN_NONE;
+
+	  if (sanitize_comparison_p)
+	{
+	  if (is_gimple_assign (s)
+		  && gimple_assign_rhs_class (s) == GIMPLE_BINARY_RHS
+		  && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (s)))
+		  && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (s)))
+		  && is_pointer_compare_opcode (gimple_assign_rhs_code (s)))
+		{
+		  ptr1 = gimple_assign_rhs1 (s);
+		  ptr2 = gimple_assign_rhs2 (s);
+		  fn = BUILT_IN_ASAN_POINTER_COMPARE;
+		}
+	  else if (gimple_code (s) == GIMPLE_COND
+		   && POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (s)))
+		   && POINTER_TYPE_P (TREE_TYPE (gimple_cond_rhs (s)))
+		   && is_pointer_compare_opcode (gimple_cond_code (s)))
+		{
+		  ptr1 = gimple_cond_lhs (s);
+		  ptr2 = gimple_cond_rhs (s);
+		  fn = BUILT_IN_ASAN_POINTER_COMPARE;
+		}
+	}
+
+	  if (sanitize_subtraction_p
+	  && is_gimple_assign (s)
+	  && gimple_assign_rhs_class (s) == GIMPLE_BINARY_RHS
+	  && gimple_assign_rhs_code (s) == MINUS_EXPR)
+	{
+	  tree rhs1 = gimple_assign_rhs1 (s);
+	  tree rhs2 = gimple_assign_rhs2 (s);
+
+	  if (TREE_CODE (rhs1) == SSA_NAME
+		  || TREE_CODE (rhs2) == SSA_NAME)
+		{
+		  gassign *def1
+		= dyn_cast <gassign *> (SSA_NAME_DEF

Re: [PATCH][GCC][ARM][AArch64] Testsuite framework changes and execution tests [Patch (8/8)]

2017-10-06 Thread Tamar Christina
Hi All,

This is a minor respin of the patch with the comments addressed. Note this
patch is now 7/8 in the series.


Regtested on arm-none-eabi, armeb-none-eabi,
aarch64-none-elf and aarch64_be-none-elf with no issues found.

Ok for trunk?

gcc/testsuite
2017-10-06  Tamar Christina  

* lib/target-supports.exp
(check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache): New.
(check_effective_target_arm_v8_2a_dotprod_neon_ok): New.
(add_options_for_arm_v8_2a_dotprod_neon): New.
(check_effective_target_arm_v8_2a_dotprod_neon_hw): New.
(check_effective_target_vect_sdot_qi): New.
(check_effective_target_vect_udot_qi): New.
* gcc.target/arm/simd/vdot-exec.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c: New.
* gcc/doc/sourcebuild.texi: Document arm_v8_2a_dotprod_neon.

From: Tamar Christina
Sent: Monday, September 4, 2017 2:01:40 PM
To: Christophe Lyon
Cc: gcc-patches@gcc.gnu.org; nd; James Greenhalgh; Richard Earnshaw; Marcus 
Shawcroft
Subject: RE: [PATCH][GCC][ARM][AArch64] Testsuite framework changes and 
execution tests [Patch (8/8)]

Hi Christophe,

> >
> > gcc/testsuite
> > 2017-09-01  Tamar Christina  
> >
> > * lib/target-supports.exp
> > (check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache):
> New.
> > (check_effective_target_arm_v8_2a_dotprod_neon_ok): New.
> > (add_options_for_arm_v8_2a_dotprod_neon): New.
> > (check_effective_target_arm_v8_2a_dotprod_neon_hw): New.
> > (check_effective_target_vect_sdot_qi): New.
> > (check_effective_target_vect_udot_qi): New.
> > * gcc.target/arm/simd/vdot-exec.c: New.
>
> Aren't you defining P() and ARR() twice in vdot-exec.c?
> I'd expect a preprocessor error; did I read too quickly?
>

Yes, they are defined twice, but they're not redefined: all the definitions
are exactly the same, so the preprocessor doesn't care. I can leave only one
if this is confusing.

>
> Thanks,
>
> Christophe
>
> > * gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c: New.
> > * gcc/doc/sourcebuild.texi: Document arm_v8_2a_dotprod_neon.
> >
> > --
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 56e1b4eb103ab412b29d6dcd9b556515ebc2ac63..c25b0ba2e1a45ea0ce23955f4e87b3e4a2d7f5b0 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1684,6 +1684,17 @@ ARM target supports executing instructions from ARMv8.2 with the FP16
 extension.  Some multilibs may be incompatible with these options.
 Implies arm_v8_2a_fp16_neon_ok and arm_v8_2a_fp16_scalar_hw.
 
+@item arm_v8_2a_dotprod_neon_ok
+@anchor{arm_v8_2a_dotprod_neon_ok}
+ARM target supports options to generate instructions from ARMv8.2 with
+the Dot Product extension.  Some multilibs may be incompatible with these
+options.
+
+@item arm_v8_2a_dotprod_neon_hw
+ARM target supports executing instructions from ARMv8.2 with the Dot
+Product extension.  Some multilibs may be incompatible with these options.
+Implies arm_v8_2a_dotprod_neon_ok.
+
 @item arm_prefer_ldrd_strd
 ARM target prefers @code{LDRD} and @code{STRD} instructions over
 @code{LDM} and @code{STM} instructions.
@@ -2290,6 +2301,11 @@ supported by the target; see the
 @ref{arm_v8_2a_fp16_neon_ok,,arm_v8_2a_fp16_neon_ok} effective target
 keyword.
 
+@item arm_v8_2a_dotprod_neon
+Add options for ARMv8.2 with Adv.SIMD Dot Product support, if this is
+supported by the target; see the
+@ref{arm_v8_2a_dotprod_neon_ok} effective target keyword.
+
 @item bind_pic_locally
 Add the target-specific flags needed to enable functions to bind
 locally when using pic/PIC passes in the testsuite.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c
new file mode 100644
index ..3e7cd6c2fc22a5e5cdb355e269116636b890a9d3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c
@@ -0,0 +1,81 @@
+/* { dg-skip-if "can't compile on arm." { arm*-*-* } } */
+/* { dg-do run { target { aarch64*-*-* } } } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" } */
+/* { dg-require-effective-target arm_v8_2a_dotprod_neon_hw } */
+
+#include 
+
+extern void abort();
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+# define ORDER(x, y) y
+#else
+# define ORDER(x, y) x - y
+#endif
+
+#define P(n1,n2) n1,n1,n1,n1,n2,n2,n2,n2
+#define ARR(nm, p, ty, ...) ty nm##_##p = { __VA_ARGS__ }
+#define TEST(t1, t2, t3, f, r1, r2, n1, n2) \
+	ARR(f, x, t1, r1);		\
+	ARR(f, y, t2, r2);		\
+	t3 f##_##r = {0};		\
+	f##_##r = f (f##_##r, f##_##x, f##_##y);  \
+	if (f##_##r[0] != n1 || f##_##r[1] != n2)   \
+	  abort ();
+
+#define TEST_LANE(t1, t2, t3, f, r1, r2, n1, n2, n3, n4) \
+	ARR(f, x, t1, r1);		\
+	ARR(f, y, t2, r2);		\
+	t3 f##_##rx = {0};		\
+	f##_##rx = f (f##_##rx, f##_##x, f##_##y, OR

Re: Let the target choose a vectorisation alignment

2017-10-06 Thread Richard Sandiford
[+arm maintainers]

Christophe Lyon  writes:
> On 18 September 2017 at 15:57, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Mon, Sep 18, 2017 at 1:58 PM, Richard Sandiford
>>>  wrote:
 The vectoriser aligned vectors to TYPE_ALIGN unconditionally, although
 there was also a hard-coded assumption that this was equal to the type
 size.  This was inconvenient for SVE for two reasons:

 - When compiling for a specific power-of-2 SVE vector length, we might
   want to align to a full vector.  However, the TYPE_ALIGN is governed
   by the ABI alignment, which is 128 bits regardless of size.

 - For vector-length-agnostic code it doesn't usually make sense to align,
   since the runtime vector length might not be a power of two.  Even for
   power of two sizes, there's no guarantee that aligning to the previous
   16 bytes will be an improvement.

 This patch therefore adds a target hook to control the preferred
 vectoriser (as opposed to ABI) alignment.

 Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
 Also tested by comparing the testsuite assembly output on at least one
 target per CPU directory.  OK to install?
>>>
>>> Did you specifically choose to pass the hook a vector type rather than
>>> a mode?
>>
>> It seemed like the safest thing to do for the default implementation,
>> e.g. in case we're vectorising "without SIMD" and thus without an
>> underlying vector mode.  I agree it probably doesn't make much
>> difference for non-default implementations.
>>
>>> I suppose in peeling for alignment the target should be able to
>>> prevent peeling by returning element alignment from the hook?
>>
>> Yeah.  This is what the SVE port does in the default vector-length
>> agnostic mode, and might also make sense in fixed-length mode.
>> Maybe it would be useful for other targets too, if unaligned accesses
>> have a negligible penalty for them.
>>
>
> Since this was committed (r253101), I've noticed a regression
> on arm-none-linux-gnueabihf:
> FAIL: gcc.dg/vect/vect-multitypes-1.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "Vectorizing an unaligned access" 4
> FAIL: gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect
> "Vectorizing an unaligned access" 4
>
> for instance, with
> --target arm-none-linux-gnueabihf
> --with-mode arm
> --with-cpu cortex-a9
> --with-fpu neon-fp16

Thanks for letting me know.  I hadn't realised that AArch32 had an
ABI alignment of 64 bits for 128-bit vectors.

The problem is that, before the patch, we were inconsistent about
using the type alignment and type size for alignment calculations.
vect_compute_data_ref_alignment used the type alignment, so for
128-bit vectors would calculate misalignment relative to 64 bits
rather than 128 bits.  But vect_enhance_data_refs_alignment
calculated the number of peels based on the type size rather
than the type alignment, so would act as though the misalignment
calculated by vect_compute_data_ref_alignment was relative to
128 bits rather than 64 bits.

So for one loop in the test case, we pick a "vector short (8)".
vect_compute_data_ref_alignment realised that one of the original scalar
short accesses was misaligned by 6 bytes (3 elements) relative to the
ABI-defined 8-byte boundary.  However, vect_enhance_data_refs_alignment
applied that misalignment to the vector size of 16 bytes.  We would
then peel 5 iterations to achieve alignment, even though only one
iteration was needed to achieve 64-bit alignment.

(Peeling 5 iterations doesn't achieve 128-bit alignment, since for this
loop we can't compute the misalignment relative to 128 bits at compile
time.  All we were doing was peeling an extra vector iteration to
achieve 64-bit alignment.)

After the patch we base everything on the ABI's 64-bit alignment,
so peel once rather than 5 times.  A combination of these effects
mean that we end up with fewer accesses being reported as unaligned.

In that sense the patch is behaving as designed.  But the question is:
what should the preferred alignment for vectorisation purposes be on Arm?
The options seem to be:

1) Prefer to align to the ABI alignment, as current trunk does.
2) Prefer to align to the vector size.

Both are changes from the pre-patch behaviour for this testcase,
since as I mentioned above, we cannot compute the misalignment
wrt the vector size at compile time (but before the patch we
acted as though we could).

The attached patch would do 2).  (Minus changelog since it's
just an illustration.)  Changes to the testsuite markup are
needed both ways.

Thanks,
Richard


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3e5438a..6cee000 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -273,6 +273,7 @@ static bool arm_array_mode_supported_p (machine_mode,
 static machine_mode arm_preferred_simd_mode (scalar_mode);
 static bool arm_class_likely_spilled_p (reg_class_t);
 static HOST_WIDE_INT 

Re: [GCC][PATCH][TESTSUITE][ARM][COMMITTED] Invert check to misalign in vect_hw_misalign (PR 78421)

2017-10-06 Thread Tamar Christina
Hi All,

I'm looking for permission to backport this patch to the gcc-7 branch to fix 
the failing tests there as well.

Thanks,
Tamar

From: Mike Stump 
Sent: Tuesday, September 26, 2017 5:51:00 PM
To: Christophe Lyon
Cc: Tamar Christina; gcc-patches@gcc.gnu.org; nd; Ramana Radhakrishnan; Richard 
Earnshaw; ni...@redhat.com; Kyrylo Tkachov
Subject: Re: [GCC][PATCH][TESTSUITE][ARM][COMMITTED] Invert check to misalign 
in vect_hw_misalign (PR 78421)

On Sep 25, 2017, at 9:58 PM, Christophe Lyon  wrote:
>
> Yes, thanks! I was missing the 'expr' part.
>
> Here is what I have committed (r253187), to avoid further noise in the 
> results.

Yup, looks good.  Thanks.


Re: [PATCH v3 1/14] D: The front-end (DMD) language implementation and license.

2017-10-06 Thread Ian Lance Taylor
On Fri, Oct 6, 2017 at 1:34 AM, Iain Buclaw  wrote:
>
> Out of curiosity, I did have a look at some of the tops of gofrontend
> sources this morning.  They are all copyright the Go Authors, and are
> licensed as BSD.  So I'm not sure if having copyright FSF and
> distributing under GPL is strictly required.  And from a maintenance
> point of view, it would be easier to merge in upstream changes as-is
> without some diff/merging tool.

The GCC steering committee accepted the gofrontend code under a
non-GPL license with the understanding that the master code would live
in a separate repository that would be mirrored into the GCC repo (the
master repository for gofrontend is currently at
https://go.googlesource.com/gofrontend/).  Personally I don't see a
problem with doing the same for the D frontend.

Ian


Re: [PATCH v3 1/14] D: The front-end (DMD) language implementation and license.

2017-10-06 Thread Andrei Alexandrescu
Thanks, Ian! -- Andrei


Re: [PATCH] [graphite] translate reads and writes in a single traversal of memory ops

2017-10-06 Thread Sebastian Pop
On Fri, Oct 6, 2017 at 6:27 AM, Richard Biener 
wrote:
>
> > Richard, could you please commit this patch, as I will need to figure out
> > why my ssh keys don't let me commit the code.  I will probably need to
> > update the key.
>
> Done.  You probably still have a v1 key which were rejected after some
> point.
> I would guess you'll need to contact overseers to replace your key.
>

Thanks!


Re: [PATCH][GRAPHITE] Fix PR82449

2017-10-06 Thread Sebastian Pop
On Fri, Oct 6, 2017 at 6:56 AM, Richard Biener  wrote:

>
> The following fences off a few more SCEVs through scev_analyzable_p, given
> that at the end we need those to pass chrec_apply when getting a rename
> through SCEV.
>
> The SCEV in question is
>
>   {(integer(kind=4)) {0, +, {1, +, 1}_1}_1, + 1}_2
>
> which fails to chrec_apply in the CHREC_LEFT part because that part
> is not affine (and we're usually not replacing an IV with a constant
> where chrec_apply might handle one or the other case).
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> This fixes three out of the remaining 8 codegen errors in SPEC CPU 2006.
>
> Ok?
>
> Thanks,
> Richard.
>
> 2017-10-06  Richard Biener  
>
> PR tree-optimization/82449
> * sese.c (can_chrec_apply): New function.
> (scev_analyzable_p): Check we can call chrec_apply on the SCEV.
>
> * gfortran.dg/graphite/pr82449.f: New testcase.
>
> Index: gcc/sese.c
> ===
> --- gcc/sese.c  (revision 253477)
> +++ gcc/sese.c  (working copy)
> @@ -421,6 +421,27 @@ invariant_in_sese_p_rec (tree t, const s
>return true;
>  }
>
> +/* Check whether we can call chrec_apply on CHREC with arbitrary X and
> VAR.  */
> +
> +static bool
> +can_chrec_apply (tree chrec)
> +{
> +  if (automatically_generated_chrec_p (chrec))
> +return false;
> +  switch (TREE_CODE (chrec))
> +{
> +case POLYNOMIAL_CHREC:
> +  if (evolution_function_is_affine_p (chrec))
> +   return (can_chrec_apply (CHREC_LEFT (chrec))
> +   && can_chrec_apply (CHREC_RIGHT (chrec)));
> +  return false;
> +CASE_CONVERT:
> +  return can_chrec_apply (TREE_OPERAND (chrec, 0));
> +default:;
> +  return tree_does_not_contain_chrecs (chrec);
> +}
> +}
> +
>  /* Return true when DEF can be analyzed in REGION by the scalar
> evolution analyzer.  */
>
> @@ -449,6 +470,7 @@ scev_analyzable_p (tree def, sese_l ®
> || !defined_in_sese_p (scev, region))
>  && (tree_does_not_contain_chrecs (scev)
> || evolution_function_is_affine_p (scev))
>

Why isn't evolution_function_is_affine_p returning false on {0, +, {1, +,
1}_1}_1?
This is quadratic.



> +&& can_chrec_apply (scev)
>  && (! loop
> || ! loop_in_sese_p (loop, region)
> || ! chrec_contains_symbols_defined_in_loop (scev, loop->num));
> Index: gcc/testsuite/gfortran.dg/graphite/pr82449.f
> ===
> --- gcc/testsuite/gfortran.dg/graphite/pr82449.f(nonexistent)
> +++ gcc/testsuite/gfortran.dg/graphite/pr82449.f(working copy)
> @@ -0,0 +1,11 @@
> +! { dg-do compile }
> +! { dg-options "-O2 -floop-nest-optimize" }
> +
> +  SUBROUTINE JDFIDX(MKL,KGSH)
> +  DIMENSION MKL(6,6)
> +  NKL=0
> +  400 DO 40 KG = 1,KGSH
> +  DO 40 LG = 1,KG
> +  NKL = NKL + 1
> +   40 MKL(LG,KG) = NKL
> +  END
>


Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate

2017-10-06 Thread Richard Earnshaw (lists)
On 06/10/17 12:01, Sudi Das wrote:
> 
> Hi Jakub
> 
> I have modified the entries:
> 
> *** gcc/ChangeLog ***
> 
> 2017-10-05  Sudakshina Das  
> 
> PR target/82440
>   * config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Changed to
>   to only call aarch64_simd_valid_immediate on CONST_VECTORs.

You don't need to say 'Changed to' (or even 'Changed to to' :-).  Simply
say  'Only call ...'.

>   (aarch64_reg_or_bic_imm): Likewise.
> 
> *** gcc/testsuite/ChangeLog ***
> 
> 2017-10-05  Sudakshina Das  
> 
>   * gcc.target/aarch64/bic_imm_1.c: New test.
>   * gcc.target/aarch64/orr_imm_1.c: Likewise..

too many full stops.

OK with those nits fixed.

R.

> 
> 
> Thanks
> Sudi
> 
> 
> From: Jakub Jelinek 
> Sent: Friday, October 6, 2017 11:11 AM
> To: Sudi Das
> Cc: gcc-patches@gcc.gnu.org; nd; sell...@cavium.com; Marcus Shawcroft; 
> Richard Earnshaw; James Greenhalgh
> Subject: Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate
> 
> On Fri, Oct 06, 2017 at 09:52:35AM +, Sudi Das wrote:
>> This patch is a fix for PR 82440.
>> The predicates aarch64_reg_or_bic_imm and aarch64_reg_or_orr_imm missed out 
>> on
>> checking for a CONST_VECTOR before calling aarch64_simd_valid_immediate 
>> function.
>> Also I think James forgot to add the test cases in the original patch 
>> submitted.
>>
>> Testing done : Checked for regressions on bootstrapped 
>> aarch64-none-linux-gnu.
>> Ok for trunk?
>>
>> Thanks
>> Sudi
>>
>> The ChangeLog entry is as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2017-10-06  Sudakshina Das  
>>
>> PR target/82440
>> * config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Modified.
>> (aarch64_reg_or_bic_imm): Likewise.
> 
> I'll defer the actual review to aarch64 maintainers, just want to say that
> this is not a correct ChangeLog entry.  You should say what has changed, not
> just that something has changed.  Something like
> Only call aarch64_simd_valid_immediate on CONST_VECTORs.
> or similar.
> 
> Jakub
> 
> 



Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-06 Thread Jan Hubicka
> Hi Honza,
> Thanks for the detailed suggestions, I have updated the patch accordingly.
> I have following questions on call_summary:
> 1] I added a field bool is_return_callee in ipa_call_summary to track
> whether the caller possibly returns value returned by callee, which
> gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
> and ipa_call_summary_t::duplicate() will already take care of handling
> late insertion/removal of cgraph nodes ? I just initialized
> is_return_callee to false in ipa_call_summary::reset and that seems to
> work. I am not sure though if I have handled it correctly. Could you
> please check that ?

I was actually thinking of introducing a separate summary for the ipa-pure-const
pass, but this seems fine to me too (for one bit it is definitely more efficient).
ipa_call_summary_t::duplicate copies all the fields, so indeed you should be
safe here.

Also it is possible for functions to be inserted late.  Updating of call
summaries is currently handled by ipa_fn_summary_t::insert.
> 
> 2] ipa_inline() called ipa_free_fn_summary, which made
> ipa_call_summaries unavailable during ipa-pure-const pass. I removed
> call to ipa_free_fn_summary from ipa_inline, and moved it to
> ipa_pure_const::execute(). Is that OK ?

Seems OK to me.
> 
> Patch passes bootstrap+test and lto bootstrap+test on 
> x86_64-unknown-linux-gnu.
> Verified SPEC2k6 compiles and runs without miscompares with LTO
> enabled on aarch64-linux-gnu.
> Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
> the patch by building chromium or firefox.
> Would it be OK to commit if it passes above validations ?
> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Honza

> 2017-10-05  Prathamesh Kulkarni  
> 
>   * cgraph.h (set_malloc_flag): Declare.
>   * cgraph.c (set_malloc_flag_1): New function.
>   (set_malloc_flag): Likewise.
>   * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
>   false.
>   (read_ipa_call_summary): Add support for reading is_return_callee.
>   (write_ipa_call_summary): Stream is_return_callee.
>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
>   ipa-prop.h, ipa-fnsummary.h.
>   (malloc_state_e): Define.
>   (malloc_state_names): Define.
>   (funct_state_d): Add field malloc_state.
>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
>   (check_retval_uses): New function.
>   (malloc_candidate_p): Likewise.
>   (analyze_function): Add support for malloc attribute.
>   (pure_const_write_summary): Stream malloc_state.
>   (pure_const_read_summary): Add support for reading malloc_state.
>   (dump_malloc_lattice): New function.
>   (propagate_malloc): New function.
>   (ipa_pure_const::execute): Call propagate_malloc and
>   ipa_free_fn_summary.
>   (pass_local_pure_const::execute): Add support for malloc attribute.
>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
> 
> testsuite/
>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
> 
> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index 3d0cefbd46b..0aad90d59ea 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
>return changed;
>  }
>  
> +/* Worker to set malloc flag.  */
New line here I guess (it is below)
> +static void
> +set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
> +{
> +  if (malloc_p && !DECL_IS_MALLOC (node->decl))
> +{
> +  DECL_IS_MALLOC (node->decl) = true;
> +  *changed = true;
> +}
> +
> +  ipa_ref *ref;
> +  FOR_EACH_ALIAS (node, ref)
> +{
> +  cgraph_node *alias = dyn_cast <cgraph_node *> (ref->referring);
> +  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
> + set_malloc_flag_1 (alias, malloc_p, changed);
> +}
> +
> +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
> +if (e->caller->thunk.thunk_p
> + && (!malloc_p || e->caller->get_availability () > AVAIL_INTERPOSABLE))
> +  set_malloc_flag_1 (e->caller, malloc_p, changed);
> +}
> +
> +/* Set DECL_IS_MALLOC on NODE's decl and on NODE's aliases if any.  */
> +
> +bool
> +cgraph_node::set_malloc_flag (bool malloc_p)
> +{
> +  bool changed = false;
> +
> +  if (!malloc_p || get_availability () > AVAIL_INTERPOSABLE)
> +set_malloc_flag_1 (this, malloc_p, &changed);
> +  else
> +{
> +  ipa_ref *ref;
> +
> +  FOR_EACH_ALIAS (this, ref)
> + {
> +   cgraph_node *alias = dyn_cast <cgraph_node *> (ref->referring);
> +   if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
> + set_malloc_flag_1 (alias, malloc_p, &changed);
> + }
> +}
> +  return changed;
> +}
> +
> diff --git a/gcc/ipa-fnsummary.h 

[C++ PATCH] use hash-table for namespace contents

2017-10-06 Thread Nathan Sidwell
The one remaining case of us pushing things into a namespace using 
not-their-name is the anonymous namespace. If we can fix that, then we 
don't need a hash-map, but can just use a hash-table, which is half the 
size.


This patch does both -- it's simpler than doing this change in two steps.

We have a new named_decl_hash trait, that hashes decls (or overloads) by 
DECL_NAME.  The changes to use that are then trivial, and indeed simpler 
than the hash-map, as we have an INSERT/NO_INSERT arg available to 
find_namespace_slot.


The changes for the anonymous namespace are to give it a NULL name and 
then set DECL_ASSEMBLER_NAME to the special name it has.  do_pushdecl 
has to be tweaked to allow pushing such decls.  We really shouldn't be 
trying to push (other) unnamed things, and I expect I'll encounter 
issues there when progressing with the modules stuff.


There are still some users of the  default_hash_traits *> specialization, but I'll be nuking them shortly and killing that 
specialization.


Applying to trunk.

nathan
--
Nathan Sidwell
2017-10-06  Nathan Sidwell  

	Use hash_table for namespace bindings
	* cp-tree.h (struct named_decl_hash): New.
	(lang_decl_ns): Change type of bindings field.
	* lex.c (maybe_add_lang_decl_raw): Adjust.
	* name-lookup.c (find_namespace_slot): Adjust.
	(do_pushdecl): Push NULL-named namespace.
	(do_push_nested_namespace): Adjust.
	(push_namespace): Push anonymous namespace as NULL name.

Index: cp-tree.h
===
--- cp-tree.h	(revision 253485)
+++ cp-tree.h	(working copy)
@@ -828,6 +828,25 @@ class lkp_iterator : public ovl_iterator
   }
 };
 
+/* hash traits for declarations.  Hashes potential overload sets via
+   DECL_NAME.  */
+
+struct named_decl_hash : ggc_remove <tree>
+{
+  typedef tree value_type; /* A DECL or OVERLOAD  */
+  typedef tree compare_type; /* An identifier.  */
+
+  inline static hashval_t hash (const value_type decl);
+  inline static bool equal (const value_type existing, compare_type candidate);
+
+  static inline void mark_empty (value_type &p) {p = NULL_TREE;}
+  static inline bool is_empty (value_type p) {return !p;}
+
+  /* Nothing is deletable.  Everything is insertable.  */
+  static bool is_deleted (value_type) { return false; }
+  static void mark_deleted (value_type) { gcc_unreachable (); }
+};
+
 struct GTY(()) tree_template_decl {
   struct tree_decl_common common;
   tree arguments;
@@ -2548,10 +2567,10 @@ struct GTY(()) lang_decl_ns {
   vec *usings;
   vec *inlinees;
 
-  /* Map from IDENTIFIER nodes to DECLS.  It'd be nice to have this
- inline, but as the hash_map has a dtor, we can't then put this
- struct into a union (until moving to c++11).  */
-  hash_map<lang_identifier *, tree> *bindings;
+  /* Hash table of bound decls. It'd be nice to have this inline, but
+ as the hash_map has a dtor, we can't then put this struct into a
+ union (until moving to c++11).  */
+  hash_table<named_decl_hash> *bindings;
 };
 
 /* DECL_LANG_SPECIFIC for parameters.  */
@@ -7370,6 +7389,20 @@ type_unknown_p (const_tree expr)
   return TREE_TYPE (expr) == unknown_type_node;
 }
 
+inline hashval_t
+named_decl_hash::hash (const value_type decl)
+{
+  tree name = OVL_NAME (decl);
+  return name ? IDENTIFIER_HASH_VALUE (name) : 0;
+}
+
+inline bool
+named_decl_hash::equal (const value_type existing, compare_type candidate)
+{
+  tree name = OVL_NAME (existing);
+  return candidate == name;
+}
+
 /* -- end of C++ */
 
 #endif /* ! GCC_CP_TREE_H */
Index: lex.c
===
--- lex.c	(revision 253485)
+++ lex.c	(working copy)
@@ -651,7 +651,7 @@ maybe_add_lang_decl_raw (tree t, bool de
 
   if (sel == lds_ns)
 /* Who'd create a namespace, only to put nothing in it?  */
-ld->u.ns.bindings = hash_map<lang_identifier *, tree>::create_ggc (499);
+ld->u.ns.bindings = hash_table<named_decl_hash>::create_ggc (499);
 
   if (GATHER_STATISTICS)
 {
Index: name-lookup.c
===
--- name-lookup.c	(revision 253485)
+++ name-lookup.c	(working copy)
@@ -86,17 +86,9 @@ create_local_binding (cp_binding_level *
 static tree *
 find_namespace_slot (tree ns, tree name, bool create_p = false)
 {
-  tree *slot;
-
-  if (create_p)
-{
-  bool existed;
-  slot = &DECL_NAMESPACE_BINDINGS (ns)->get_or_insert (name, &existed);
-  if (!existed)
-	*slot = NULL_TREE;
-}
-  else
-slot = DECL_NAMESPACE_BINDINGS (ns)->get (name);
+  tree *slot = DECL_NAMESPACE_BINDINGS (ns)
+->find_slot_with_hash (name, name ? IDENTIFIER_HASH_VALUE (name) : 0,
+			   create_p ? INSERT : NO_INSERT);
   return slot;
 }
 
@@ -2950,7 +2942,10 @@ do_pushdecl (tree decl, bool is_friend)
   while (level->kind == sk_class)
 level = level->level_chain;
 
-  if (tree name = DECL_NAME (decl))
+  /* An anonymous namespace has a NULL DECL_NAME, but we still want to
+ insert it.  Other NULL-named decls, not so much.  */
+  tree name = DECL_NAME (decl);
+  if 

Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate

2017-10-06 Thread Sudi Das
Hi Richard

*** gcc/ChangeLog ***

2017-10-05  Sudakshina Das  

PR target/82440
* config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Only call
aarch64_simd_valid_immediate on CONST_VECTORs.
(aarch64_reg_or_bic_imm): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-05  Sudakshina Das  

* gcc.target/aarch64/bic_imm_1.c: New test.
* gcc.target/aarch64/orr_imm_1.c: Likewise.


Also can someone please apply it for me. I do not have commit access.

Thanks
Sudi


From: Richard Earnshaw (lists) 
Sent: Friday, October 6, 2017 2:01 PM
To: Sudi Das; Jakub Jelinek
Cc: gcc-patches@gcc.gnu.org; nd; sell...@cavium.com; Marcus Shawcroft; James 
Greenhalgh
Subject: Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate
    
On 06/10/17 12:01, Sudi Das wrote:
> 
> Hi Jakub
> 
> I have modified the entries:
> 
> *** gcc/ChangeLog ***
> 
> 2017-10-05  Sudakshina Das  
> 
> PR target/82440
>    * config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Changed to
>    to only call aarch64_simd_valid_immediate on CONST_VECTORs.

You don't need to say 'Changed to' (or even 'Changed to to' :-).  Simply
say  'Only call ...'.

>    (aarch64_reg_or_bic_imm): Likewise.
> 
> *** gcc/testsuite/ChangeLog ***
> 
> 2017-10-05  Sudakshina Das  
> 
>    * gcc.target/aarch64/bic_imm_1.c: New test.
>    * gcc.target/aarch64/orr_imm_1.c: Likewise..

too many full stops.

OK with those nits fixed.

R.

> 
> 
> Thanks
> Sudi
> 
> 
> From: Jakub Jelinek 
> Sent: Friday, October 6, 2017 11:11 AM
> To: Sudi Das
> Cc: gcc-patches@gcc.gnu.org; nd; sell...@cavium.com; Marcus Shawcroft; 
> Richard Earnshaw; James Greenhalgh
> Subject: Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate
> 
> On Fri, Oct 06, 2017 at 09:52:35AM +, Sudi Das wrote:
>> This patch is a fix for PR 82440.
>> The predicates aarch64_reg_or_bic_imm and aarch64_reg_or_orr_imm missed out 
>> on
>> checking for a CONST_VECTOR before calling aarch64_simd_valid_immediate 
>> function.
>> Also I think James forgot to add the test cases in the original patch 
>> submitted.
>>
>> Testing done : Checked for regressions on bootstrapped 
>> aarch64-none-linux-gnu.
>> Ok for trunk?
>>
>> Thanks
>> Sudi
>>
>> The ChangeLog entry is as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2017-10-06  Sudakshina Das  
>>
>> PR target/82440
>> * config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Modified.
>> (aarch64_reg_or_bic_imm): Likewise.
> 
> I'll defer the actual review to aarch64 maintainers, just want to say that
> this is not a correct ChangeLog entry.  You should say what has changed, not
> just that something has changed.  Something like
> Only call aarch64_simd_valid_immediate on CONST_VECTORs.
> or similar.
> 
> Jakub
> 
> 



Re: [PATCH] Fix PR82396: qsort comparator non-negative on sorted output

2017-10-06 Thread Wilco Dijkstra
Maxim Kuvyrkov wrote:

Note I've committed: https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00309.html 
which does
change qsort to (qsort) like Jakub proposed.

> I think that this is the best solution so far.  Could you add a comment like 
> the following?
> ==
> Ideally, we would call autopref_rank_data() here, but we can't since it is 
> not guaranteed
> to return transitive results for multi_mem_insns.  We use an approximation 
> here and rely
> on lookahead_guard below to force instruction order according to 
> autopref_rank_data().
> ==

The issue is that autopref_rank_data() doesn't do anything useful. It checks 
for overlaps
which shouldn't happen much, if at all. And as discussed declaring a comparison 
as
"unordered" is simply not possible. lookahead_guard can't fix things up either, 
so there is
really no point in keeping this function. Similarly all the min/max offset 
calculations are
redundant even if we assume the offsets in a LDP/STP instruction are completely 
random.

Wilco







Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate

2017-10-06 Thread Tamar Christina
Hi Sudi,

Committed as r253490 on your behalf.

Thanks,
Tamar

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Sudi Das 
Sent: Friday, October 6, 2017 2:19:46 PM
To: Richard Earnshaw; Jakub Jelinek
Cc: gcc-patches@gcc.gnu.org; nd; sell...@cavium.com; Marcus Shawcroft; James 
Greenhalgh
Subject: Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate

Hi Richard

*** gcc/ChangeLog ***

2017-10-05  Sudakshina Das  

PR target/82440
* config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Only call
aarch64_simd_valid_immediate on CONST_VECTORs.
(aarch64_reg_or_bic_imm): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-05  Sudakshina Das  

* gcc.target/aarch64/bic_imm_1.c: New test.
* gcc.target/aarch64/orr_imm_1.c: Likewise.


Also can someone please apply it for me. I do not have commit access.

Thanks
Sudi


From: Richard Earnshaw (lists) 
Sent: Friday, October 6, 2017 2:01 PM
To: Sudi Das; Jakub Jelinek
Cc: gcc-patches@gcc.gnu.org; nd; sell...@cavium.com; Marcus Shawcroft; James 
Greenhalgh
Subject: Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate

On 06/10/17 12:01, Sudi Das wrote:
>
> Hi Jakub
>
> I have modified the entries:
>
> *** gcc/ChangeLog ***
>
> 2017-10-05  Sudakshina Das  
>
> PR target/82440
>    * config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Changed to
>    to only call aarch64_simd_valid_immediate on CONST_VECTORs.

You don't need to say 'Changed to' (or even 'Changed to to' :-).  Simply
say  'Only call ...'.

>    (aarch64_reg_or_bic_imm): Likewise.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2017-10-05  Sudakshina Das  
>
>    * gcc.target/aarch64/bic_imm_1.c: New test.
>    * gcc.target/aarch64/orr_imm_1.c: Likewise..

too many full stops.

OK with those nits fixed.

R.

>
>
> Thanks
> Sudi
>
>
> From: Jakub Jelinek 
> Sent: Friday, October 6, 2017 11:11 AM
> To: Sudi Das
> Cc: gcc-patches@gcc.gnu.org; nd; sell...@cavium.com; Marcus Shawcroft; 
> Richard Earnshaw; James Greenhalgh
> Subject: Re: [PATCH][AArch64] Fix ICE caused in aarch64_simd_valid_immediate
>
> On Fri, Oct 06, 2017 at 09:52:35AM +, Sudi Das wrote:
>> This patch is a fix for PR 82440.
>> The predicates aarch64_reg_or_bic_imm and aarch64_reg_or_orr_imm missed out 
>> on
>> checking for a CONST_VECTOR before calling aarch64_simd_valid_immediate 
>> function.
>> Also I think James forgot to add the test cases in the original patch 
>> submitted.
>>
>> Testing done : Checked for regressions on bootstrapped 
>> aarch64-none-linux-gnu.
>> Ok for trunk?
>>
>> Thanks
>> Sudi
>>
>> The ChangeLog entry is as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2017-10-06  Sudakshina Das  
>>
>> PR target/82440
>> * config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Modified.
>> (aarch64_reg_or_bic_imm): Likewise.
>
> I'll defer the actual review to aarch64 maintainers, just want to say that
> this is not a correct ChangeLog entry.  You should say what has changed, not
> just that something has changed.  Something like
> Only call aarch64_simd_valid_immediate on CONST_VECTORs.
> or similar.
>
> Jakub
>
>



Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-06 Thread Bernd Edlinger
On 10/05/17 18:16, Martin Sebor wrote:
> In my (very quick) tests the warning appears to trigger on all
> strictly incompatible conversions, even if they are otherwise
> benign, such as:
> 
>    int f (const int*);
>    typedef int F (int*);
> 
>    F* pf1 = f;    // -Wincompatible-pointer-types
>    F* pf2 = (F*)f;    // -Wcast-function-type
> 
> Likewise by:
> 
>    int f (signed char);
>    typedef int F (unsigned char);
> 
>    F* pf = (F*)f;    // -Wcast-function-type
> 
> I'd expect these conversions to be useful and would view warning
> for them with -Wextra as excessive.  In fact, I'm not sure I see
> the benefit of warning on these casts under any circumstances.
> 

Well, while the first example should be safe,
the second one is probably not safe:

Because signed and unsigned char are both promoted to int
by the ABI, but one is in the range -128..127 while the
other is in the range 0..255, right?


Bernd.


Re: [PATCH][GRAPHITE] Fix PR82449

2017-10-06 Thread Richard Biener
On Fri, 6 Oct 2017, Sebastian Pop wrote:

> On Fri, Oct 6, 2017 at 6:56 AM, Richard Biener  wrote:
> 
> >
> > The following fences off a few more SCEVs through scev_analyzable_p given
> > at the end we need those pass chrec_apply when getting a rename through
> > SCEV.
> >
> > The SCEV in question is
> >
> >   {(integer(kind=4)) {0, +, {1, +, 1}_1}_1, + 1}_2
> >
> > which fails to chrec_apply in the CHREC_LEFT part because that part
> > is not affine (and we're usually not replacing a IV with a constant
> > where chrec_apply might handle one or the other case).
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > This fixes three out of the remaining 8 codegen errors in SPEC CPU 2006.
> >
> > Ok?
> >
> > Thanks,
> > Richard.
> >
> > 2017-10-06  Richard Biener  
> >
> > PR tree-optimization/82449
> > * sese.c (can_chrec_apply): New function.
> > (scev_analyzable_p): Check we can call chrec_apply on the SCEV.
> >
> > * gfortran.dg/graphite/pr82449.f: New testcase.
> >
> > Index: gcc/sese.c
> > ===
> > --- gcc/sese.c  (revision 253477)
> > +++ gcc/sese.c  (working copy)
> > @@ -421,6 +421,27 @@ invariant_in_sese_p_rec (tree t, const s
> >return true;
> >  }
> >
> > +/* Check whether we can call chrec_apply on CHREC with arbitrary X and
> > VAR.  */
> > +
> > +static bool
> > +can_chrec_apply (tree chrec)
> > +{
> > +  if (automatically_generated_chrec_p (chrec))
> > +return false;
> > +  switch (TREE_CODE (chrec))
> > +{
> > +case POLYNOMIAL_CHREC:
> > +  if (evolution_function_is_affine_p (chrec))
> > +   return (can_chrec_apply (CHREC_LEFT (chrec))
> > +   && can_chrec_apply (CHREC_RIGHT (chrec)));
> > +  return false;
> > +CASE_CONVERT:
> > +  return can_chrec_apply (TREE_OPERAND (chrec, 0));
> > +default:;
> > +  return tree_does_not_contain_chrecs (chrec);
> > +}
> > +}
> > +
> >  /* Return true when DEF can be analyzed in REGION by the scalar
> > evolution analyzer.  */
> >
> > @@ -449,6 +470,7 @@ scev_analyzable_p (tree def, sese_l ®
> > || !defined_in_sese_p (scev, region))
> >  && (tree_does_not_contain_chrecs (scev)
> > || evolution_function_is_affine_p (scev))
> >
> 
> Why isn't evolution_function_is_affine_p returning false on {0, +, {1, +,
> 1}_1}_1?
> This is quadratic.

It returns false on that but the CHREC we ask it on is

{(integer(kind=4)) {0, +, {1, +, 1}_1}_1, + 1}_2

only the initial value is "quadratic".

Richard.


Re: [PATCH] Add -fsanitize=pointer-{compare,subtract}.

2017-10-06 Thread Jakub Jelinek
On Fri, Oct 06, 2017 at 02:46:05PM +0200, Martin Liška wrote:
> +   if (sanitize_comparison_p)
> + {
> +   if (is_gimple_assign (s)
> +   && gimple_assign_rhs_class (s) == GIMPLE_BINARY_RHS
> +   && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (s)))
> +   && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (s)))
> +   && is_pointer_compare_opcode (gimple_assign_rhs_code (s)))

Isn't it better to test is_pointer_compare_opcode right after
is_gimple_assign and leave the gimple_assign_rhs_class test out?

> + {
> +   ptr1 = gimple_assign_rhs1 (s);
> +   ptr2 = gimple_assign_rhs2 (s);
> +   fn = BUILT_IN_ASAN_POINTER_COMPARE;
> + }
> +   else if (gimple_code (s) == GIMPLE_COND
> +&& POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (s)))
> +&& POINTER_TYPE_P (TREE_TYPE (gimple_cond_rhs (s)))
> +&& is_pointer_compare_opcode (gimple_cond_code (s)))
> + {
> +   ptr1 = gimple_cond_lhs (s);
> +   ptr2 = gimple_cond_rhs (s);
> +   fn = BUILT_IN_ASAN_POINTER_COMPARE;
> + }

You don't handle the case when there is a COND_EXPR with pointer comparison
in the condition.

> + }
> +
> +   if (sanitize_subtraction_p
> +   && is_gimple_assign (s)
> +   && gimple_assign_rhs_class (s) == GIMPLE_BINARY_RHS

The above isn't really needed.

> +   && gimple_assign_rhs_code (s) == MINUS_EXPR)
> + {
> +   tree rhs1 = gimple_assign_rhs1 (s);
> +   tree rhs2 = gimple_assign_rhs2 (s);
> +
> +   if (TREE_CODE (rhs1) == SSA_NAME
> +   || TREE_CODE (rhs2) == SSA_NAME)
> + {
> +   gassign *def1
> + = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (rhs1));
> +   gassign *def2
> + = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (rhs2));
> +
> +   if (def1 && def2
> +   && gimple_assign_rhs_class (def1) == GIMPLE_UNARY_RHS
> +   && gimple_assign_rhs_class (def2) == GIMPLE_UNARY_RHS)
> + {
> +   if (POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (def1)))
> +   && POINTER_TYPE_P
> +   (TREE_TYPE (gimple_assign_rhs1 (def2))))

Better add temporaries for rhs1/2 of def1, then you don't have issues with
too long lines.

> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -10935,6 +10935,26 @@ Enable AddressSanitizer for Linux kernel.
>  See @uref{https://github.com/google/kasan/wiki} for more details.
>  The option cannot be combined with @option{-fcheck-pointer-bounds}.
>  
> +@item -fsanitize=pointer-compare
> +@opindex fsanitize=pointer-compare
> +Instrument comparison operation (<, <=, >, >=, -) with pointer operands.

Pointer-compare doesn't instrument -, so ", -" should be left out.

> +The option cannot be combined with @option{-fsanitize=thread}
> +and/or @option{-fcheck-pointer-bounds}.
> +Note: By default the check is disabled at run time.  To enable it,
> +add @code{detect_invalid_pointer_pairs=1} to the environment variable
> +@env{ASAN_OPTIONS}.  The checking currently works only for pointers allocated
> +on heap.
> +
> +@item -fsanitize=subtract

-fsanitize=pointer-subtract

> +@opindex fsanitize=pointer-compare

s/compare/subtract/

> +Instrument subtraction with pointer operands.
> +The option cannot be combined with @option{-fsanitize=thread}
> +and/or @option{-fcheck-pointer-bounds}.
> +Note: By default the check is disabled at run time.  To enable it,
> +add @code{detect_invalid_pointer_pairs=1} to the environment variable
> +@env{ASAN_OPTIONS}.  The checking currently works only for pointers allocated
> +on heap.
> +
>  @item -fsanitize=thread
>  @opindex fsanitize=thread
>  Enable ThreadSanitizer, a fast data race detector.

Conceptually, these two instrumentations rely on address sanitization;
I'm not really sure if we should support them for kernel sanitization (but I
bet it is just going to be too costly for the kernel).
So, we also need to make sure at least parts of SANITIZE_ADDRESS is enabled
when these options are on.
That can be done by erroring out if -fsanitize=pointer-compare is requested
without -fsanitize=address, or by implicitly enabling -fsanitize=address for
these, or by adding yet another SANITIZE_* bit which would cover
sanitization of memory accesses for asan, that bit would be set by
-fsanitize={address,kernel-address} in addition to the current 2 bits, but
pointer-{subtract,compare} would set its own bit and SANITIZE_ADDRESS and
SANITIZE_USER_ADDRESS only.  Without the new bit we'd emit red zones,
function prologue/epilogue asan changes, registration of global variables,
but not actual instrumentation of memory accesses (and probably not
instrumentation of C++ ctor ordering).

> --- a/gcc/gcc.c
> +++ b/gcc/gcc.c
> @@ -971,7 +971,9 @@ prop

Re: libbacktrace patch committed: Support compressed debug sections

2017-10-06 Thread Ian Lance Taylor
On Fri, Oct 6, 2017 at 3:22 AM, Gerald Pfeifer  wrote:
> On Thu, 28 Sep 2017, Ian Lance Taylor wrote:
>> This patch to libbacktrace adds support for compressed debug sections.
>> 2017-09-28  Ian Lance Taylor  
>>
>> PR other/67165
>> * elf.c (__builtin_prefetch): Define if not __GNUC__.
>> (unlikely): Define.
>> (SHF_UNCOMPRESSED, ELFCOMPRESS_ZLIB): Define.
>> (b_elf_chdr): Define type.
>> (enum debug_section): Add ZDEBUG_xxx values.
>
> Since this change I am seeing the following in my nightly GCC build
> and test logs on FreeBSD systems:
>
> gmake[2]: autogen: Command not found
> gmake[2]: *** [Makefile:176: check] Error 127
> gmake[1]: *** [Makefile:3759: check-fixincludes] Error 2
> /scratch/tmp/gerald/GCC-HEAD/libbacktrace/ztest.c: In function 'test_large':
> /scratch/tmp/gerald/GCC-HEAD/libbacktrace/ztest.c:384:41: warning: passing 
> argument 2 of 'uncompress' from incompatible pointer type 
> [-Wincompatible-pointer-types]
>r = uncompress (uncompressed_buf, &uncompressed_bufsize,
>  ^
> In file included from /scratch/tmp/gerald/GCC-HEAD/libbacktrace/ztest.c:43:0:
> /usr/include/zlib.h:1265:21: note: expected 'uLongf * {aka long unsigned int 
> *}' but argument is of type 'size_t * {aka unsigned int *}'
>  ZEXTERN int ZEXPORT uncompress OF((Bytef *dest,   uLongf *destLen,
>  ^~
> gmake[4]: *** [Makefile:306: check-DEJAGNU] Error 1
> gmake[3]: *** [Makefile:350: check-am] Error 2
> gmake[2]: *** [Makefile:904: check-recursive] Error 1
> gmake[1]: *** [Makefile:22343: check-target-libgomp] Error 2
> Fatal error 'mutex is on list' at line 272 in file 
> /usr/src/lib/libthr/thread/thr_mutex.c (errno = 0)
> Fatal error 'mutex is on list' at line 272 in file 
> /usr/src/lib/libthr/thread/thr_mutex.c (errno = 0)
> gmake: *** [Makefile:2286: do-check] Error 2

Thanks for the report.  I committed this patch, which I hope will fix
the problem.

Ian

2017-10-06  Ian Lance Taylor  

* ztest.c (test_large): Pass unsigned long *, not size_t *, to
zlib uncompress function.
Index: ztest.c
===
--- ztest.c (revision 253490)
+++ ztest.c (working copy)
@@ -369,6 +369,8 @@ test_large (struct backtrace_state *stat
 
   for (i = 0; i < trials; ++i)
 {
+  unsigned long uncompress_sizearg;
+
   cid = ZLIB_CLOCK_GETTIME_ARG;
   if (clock_gettime (cid, &ts1) < 0)
{
@@ -406,7 +408,8 @@ test_large (struct backtrace_state *stat
  return;
}
 
-  r = uncompress (uncompressed_buf, &uncompressed_bufsize,
+  uncompress_sizearg = uncompressed_bufsize;
+  r = uncompress (uncompressed_buf, &uncompress_sizearg,
  compressed_buf + 12, compressed_bufsize - 12);
 
   if (clock_gettime (cid, &ts2) < 0)


Re: [PATCH] Implement smart multiple switch expansion algorithms.

2017-10-06 Thread Wilco Dijkstra
Martin Liska wrote:

> There are some numbers for cc1plus:
>
> $ bloaty ./objdir2/gcc/cc1plus -- ./objdir/gcc/cc1plus
> VM SIZE  FILE SIZE
>   +3.8% +1.11Mi TOTAL  +1.03Mi  +0.5%

> insn-attrtab.o:
> VM SIZE  FILE SIZE
>   +214%  +682Ki .rodata +682Ki  +214%
>  -50.1% -63.3Ki .text  -63.3Ki -50.1%

So is that a 3.8% codesize increase or decrease? If an increase,
I can't see how replacing 63KB of instructions with 682KB of data
is a good tradeoff... There should be an accurate calculation
of the density, taking the switch table width into account (really small
tables can use 1-byte offsets, large tables are typically forced to
use 4-byte offsets). This may need new target callbacks - I changed
PARAM_CASE_VALUES_THRESHOLD on AArch64 to get smaller
code and better performance since the current density calculations
are hardcoded and quite wrong for big tables...

Also what is the codesize difference on SPEC2006/2017? I don't see
any mention of performance impact either...

Wilco

Re: [C++ Patch] PR 47791

2017-10-06 Thread Nathan Sidwell

On 10/06/2017 08:27 AM, Paolo Carlini wrote:

Hi,

this is a very old issue, with no impact on functionality, which I 
already analyzed a bit a while ago: the submitter noticed that 
finish_function & co could be cleaned up a bit wrt the constants passed 
in the flags parameter. When I had another look today, cleaning up the 
code appeared very straightforward: just change the finish_function 
parameter to bool type and adjust the callers. Interestingly, as a 
consequence, cp_parser_function_definition_after_declarator can also be 
simplified, because ctor_initializer_p was only used for the 
finish_function call. Likewise the helper functions called by 
cp_parser_function_definition_after_declarator itself, which were just 
preparing ctor_initializer_p.



ok, thanks


--
Nathan Sidwell


Patch ping

2017-10-06 Thread Jakub Jelinek
Hi!

I'd like to ping a couple of patches:

http://gcc.gnu.org/ml/gcc-patches/2017-09/msg01237.html
  C++2a P0704R1 - fixing const-qualified pointers to members

http://gcc.gnu.org/ml/gcc-patches/2017-09/msg01939.html
  PR libstdc++/81706 - omp declare simd attribute vs. builtins
  C++ part of the patch

http://gcc.gnu.org/ml/gcc-patches/2017-09/msg02006.html
  PR c++/82299 - invalid -Wuseless-cast on direct enum init

http://gcc.gnu.org/ml/gcc-patches/2017-09/msg01540.html
  libgcc __mulvDI3 fix - missed detection of overflow for
  0x * 0x
  __builtin_mul_overflow{,_p} fix for the same

Thanks

Jakub


[C++ PATCH] hash-table for extern-c fns.

2017-10-06 Thread Nathan Sidwell
This patch converts the extern "C" function map to use a hash table, in 
the same way as I've just changed namespace bindings.


There's a small wart, in that the  c_linkage_bindings user (in c-common) 
expects either a single decl or a TREE_LIST.  I.e. not an OVERLOAD.  But 
the hasher expects either a DECL or an OVERLOAD.  Rather than extend the 
hasher (and marginally pessimize it for this special case), I modify the 
extern-c table handling to insert an initial OVERLOAD node, when there 
are multiple functions.  That can CHAIN directly to a TREE_LIST.  Sure, 
we have a wasted OVERLOAD node in this case, but it's going to be rare 
-- programs don't usually declare extern "C" functions of the same name 
in different namespaces.


Changing the c-common function is harder, as OVERLOAD is a C++ FE local 
node.


Applying to trunk.

nathan
--
Nathan Sidwell
2017-10-06  Nathan Sidwell  

	Use hash_table for extern "C" names
	* name-lookup.c (extern_c_fns): Use hash_table.
	(check_extern_c_conflict): Adjust.
	(c_linkage_bindings): Adjust.

Index: name-lookup.c
===
--- name-lookup.c	(revision 253489)
+++ name-lookup.c	(working copy)
@@ -2511,9 +2511,9 @@ update_binding (cp_binding_level *level,
   return decl;
 }
 
-/* Map of identifiers to extern C functions (or LISTS thereof).  */
+/* Table of identifiers to extern C functions (or LISTS thereof).  */
 
-static GTY(()) hash_map *extern_c_fns;
+static GTY(()) hash_table *extern_c_fns;
 
 /* DECL has C linkage. If we have an existing instance, make sure it
has the same exception specification [7.5, 7.6].  If there's no
@@ -2527,17 +2527,15 @@ check_extern_c_conflict (tree decl)
 return;
 
   if (!extern_c_fns)
-extern_c_fns = hash_map::create_ggc (127);
+extern_c_fns = hash_table::create_ggc (127);
 
-  bool existed;
-  tree *slot = &extern_c_fns->get_or_insert (DECL_NAME (decl), &existed);
-  if (!existed)
-*slot = decl;
-  else
+  tree *slot = extern_c_fns
+->find_slot_with_hash (DECL_NAME (decl),
+			   IDENTIFIER_HASH_VALUE (DECL_NAME (decl)), INSERT);
+  if (tree old = *slot)
 {
-  tree old = *slot;
-  if (TREE_CODE (old) == TREE_LIST)
-	old = TREE_VALUE (old);
+  if (TREE_CODE (old) == OVERLOAD)
+	old = OVL_FUNCTION (old);
 
   int mismatch = 0;
   if (DECL_CONTEXT (old) == DECL_CONTEXT (decl))
@@ -2563,9 +2561,24 @@ check_extern_c_conflict (tree decl)
 		 "due to different exception specifications");
 	}
   else
-	/* Chain it on for c_linkage_binding's use.  */
-	*slot = tree_cons (NULL_TREE, decl, *slot);
+	{
+	  if (old == *slot)
+	/* The hash table expects OVERLOADS, so construct one with
+	   OLD as both the function and the chain.  This allocate
+	   an excess OVERLOAD node, but it's rare to have multiple
+	   extern "C" decls of the same name.  And we save
+	   complicating the hash table logic (which is used
+	   elsewhere).  */
+	*slot = ovl_make (old, old);
+
+	  slot = &OVL_CHAIN (*slot);
+
+	  /* Chain it on for c_linkage_binding's use.  */
+	  *slot = tree_cons (NULL_TREE, decl, *slot);
+	}
 }
+  else
+*slot = decl;
 }
 
 /* Returns a list of C-linkage decls with the name NAME.  Used in
@@ -2575,8 +2588,15 @@ tree
 c_linkage_bindings (tree name)
 {
   if (extern_c_fns)
-if (tree *slot = extern_c_fns->get (name))
-  return *slot;
+if (tree *slot = extern_c_fns
+	->find_slot_with_hash (name, IDENTIFIER_HASH_VALUE (name), NO_INSERT))
+  {
+	tree result = *slot;
+	if (TREE_CODE (result) == OVERLOAD)
+	  result = OVL_CHAIN (result);
+	return result;
+  }
+
   return NULL_TREE;
 }
 


RE: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in Intel AVX512 configuration

2017-10-06 Thread Shalnov, Sergey
Jakub,
I completely agree with you. I fixed the patch.
Currently, TARGET_PREFER_AVX256 will work on architectures with AVX512VL. It 
will not work otherwise.

I will try to find a better solution for this. I think I need to look into 
register classes to configure
the available registers for AVX512F and AVX512VL in the case of 
TARGET_PREFER_AVX256.

I would propose merging this patch as a temporary solution.

Sergey


-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On 
Behalf Of Jakub Jelinek
Sent: Friday, October 6, 2017 11:58 AM
To: Shalnov, Sergey 
Cc: 'gcc-patches@gcc.gnu.org' ; 'ubiz...@gmail.com' 
; 'kirill.yuk...@gmail.com' ; 
Senkevich, Andrew ; Ivchenko, Alexander 
; Peryt, Sebastian 
Subject: Re: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in 
Intel AVX512 configuration

On Fri, Oct 06, 2017 at 09:33:21AM +, Shalnov, Sergey wrote:
> Hi,
> GCC uses a full 512-bit register when moving an SF/DF value between two 
> registers.
> The patch avoids 512-bit register usage if the "-mprefer-avx256" option is 
> used.
> 
> 2017-10-06  Sergey Shalnov  
> 
> gcc/
>   * config/i386/i386.md(*movsf_internal, *movdf_internal):
>   Avoid 512-bit AVX modes for TARGET_PREFER_AVX256.
> 
From b96e657153b9aea24ff002e7e156ba12b2d443b5 Mon Sep 17 00:00:00 2001
From: Sergey Shalnov 
Date: Fri, 6 Oct 2017 10:45:40 +0300
Subject: [PATCH 1/1] Avoid 512-bit mode MOV for prefer-avx256 option

---
 gcc/config/i386/i386.md | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 
99497a9..a6d7cca 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3564,8 +3564,9 @@
 
   /* movaps is one byte shorter for non-AVX targets.  */
   (eq_attr "alternative" "13,17")
-(cond [(ior (match_operand 0 "ext_sse_reg_operand")
-(match_operand 1 "ext_sse_reg_operand"))
+(cond [(and (not (match_test "TARGET_PREFER_AVX256"))
+ (ior (match_operand 0 "ext_sse_reg_operand")
+  (match_operand 1 "ext_sse_reg_operand")))
  (const_string "V8DF")
(ior (not (match_test "TARGET_SSE2"))
 (match_test 
"TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))

How does that work with -mprefer-avx256 -mavx512f -mno-avx512vl?
The constraints in these alternatives use v, and [SD]Fmode are included in 
VALID_AVX512F_SCALAR_MODE and thus you can get e.g. %zmm23 among the operands.
EVEX encoded VMOVAPD or VMOVAPS is AVX512F cpuid only with 512-bit operands, 
with 128-bit/256-bit it is AVX512VL + AVX512F.

So, in both of the spots you've changed in this patch, but also in the spots 
you've changed earlier, you need to use
TARGET_PREFER_AVX256 && TARGET_AVX512VL rather than just TARGET_PREFER_AVX256, 
because without TARGET_AVX512VL it is not a matter of preferring it, but a 
must.  Unless we disable %xmm16+ registers for TARGET_PREFER_AVX256 && 
!TARGET_AVX512VL code, but that would be weird:
an optimization preference would e.g. break people using %xmm16+ register 
variables etc.

@@ -3739,8 +3740,9 @@
  better to maintain the whole registers in single format
  to avoid problems on using packed logical operations.  */
   (eq_attr "alternative" "6")
-(cond [(ior  (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
+(cond [(and (not (match_test "TARGET_PREFER_AVX256"))
+ (ior (match_operand 0 "ext_sse_reg_operand")
+  (match_operand 1 "ext_sse_reg_operand")))
  (const_string "V16SF")
(ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
 (match_test "TARGET_SSE_SPLIT_REGS"))
--
1.8.3.1




Jakub


0002-Avoid-512-bit-mode-MOV-for-prefer-avx256-option.patch
Description: 0002-Avoid-512-bit-mode-MOV-for-prefer-avx256-option.patch


Re: [GCC][PATCH][TESTSUITE][ARM][COMMITTED] Invert check to misalign in vect_hw_misalign (PR 78421)

2017-10-06 Thread Mike Stump
On Oct 6, 2017, at 5:48 AM, Tamar Christina  wrote:
> 
> I'm looking for permission to backport this patch to the gcc-7 branch to fix 
> the failing tests there as well.

Ok.



Re: [PATCH][GRAPHITE] Fix PR82449

2017-10-06 Thread Sebastian Pop
On Fri, Oct 6, 2017 at 8:33 AM, Richard Biener  wrote:

> On Fri, 6 Oct 2017, Sebastian Pop wrote:
>
> > On Fri, Oct 6, 2017 at 6:56 AM, Richard Biener 
> wrote:
> >
> > >
> > > The following fences off a few more SCEVs through scev_analyzable_p
> > > given at the end we need those to pass chrec_apply when getting a
> > > rename through SCEV.
> > >
> > > The SCEV in question is
> > >
> > >   {(integer(kind=4)) {0, +, {1, +, 1}_1}_1, + 1}_2
> > >
> > > which fails to chrec_apply in the CHREC_LEFT part because that part
> > > is not affine (and we're usually not replacing a IV with a constant
> > > where chrec_apply might handle one or the other case).
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > >
> > > This fixes three out of the remaining 8 codegen errors in SPEC CPU
> > > 2006.
> > >
> > > Ok?
> > >
> > > Thanks,
> > > Richard.
> > >
> > > 2017-10-06  Richard Biener  
> > >
> > > PR tree-optimization/82449
> > > * sese.c (can_chrec_apply): New function.
> > > (scev_analyzable_p): Check we can call chrec_apply on the SCEV.
> > >
> > > * gfortran.dg/graphite/pr82449.f: New testcase.
>
> >
> > > Index: gcc/sese.c
> > > ===
> > > --- gcc/sese.c  (revision 253477)
> > > +++ gcc/sese.c  (working copy)
> > > @@ -421,6 +421,27 @@ invariant_in_sese_p_rec (tree t, const s
> > >return true;
> > >  }
> > >
> > > +/* Check whether we can call chrec_apply on CHREC with arbitrary X and
> > > VAR.  */
>
> > +
> > > +static bool
> > > +can_chrec_apply (tree chrec)
>

Could we use scev_is_linear_expression ?
It seems like can_chrec_apply has the same semantics.


> > > +{
> > > +  if (automatically_generated_chrec_p (chrec))
> > > +return false;
> > > +  switch (TREE_CODE (chrec))
> > > +{
> > > +case POLYNOMIAL_CHREC:
> > > +  if (evolution_function_is_affine_p (chrec))
> > > +   return (can_chrec_apply (CHREC_LEFT (chrec))
> > > +   && can_chrec_apply (CHREC_RIGHT (chrec)));
> > > +  return false;
> > > +CASE_CONVERT:
> > > +  return can_chrec_apply (TREE_OPERAND (chrec, 0));
> > > +default:;
> > > +  return tree_does_not_contain_chrecs (chrec);
> > > +}
> > > +}
> > > +
> > >  /* Return true when DEF can be analyzed in REGION by the scalar
> > > evolution analyzer.  */
> > >
> > > @@ -449,6 +470,7 @@ scev_analyzable_p (tree def, sese_l &region)
> > > || !defined_in_sese_p (scev, region))
> > >  && (tree_does_not_contain_chrecs (scev)
> > > || evolution_function_is_affine_p (scev))
> > >
> >
> > Why isn't evolution_function_is_affine_p returning false on {0, +, {1, +,
> > 1}_1}_1?
> > This is quadratic.
>
> It returns false on that but the CHREC we ask it on is
>
> {(integer(kind=4)) {0, +, {1, +, 1}_1}_1, + 1}_2
>
> only the initial value is "quadratic".
>

Right.
If I understand correctly, the scop is the body of loop_1,
and we do not need to represent the quadratic evolution
of the initial value.
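For reference, the reason that initial value is quadratic follows directly from the usual chrec semantics (a sketch, using the standard evaluation rule that {a, +, f}_1 at iteration n of loop 1 equals a plus the partial sum of f):

```latex
% With f(k) = {1, +, 1}_1(k) = 1 + k, the CHREC_LEFT part
% {0, +, {1, +, 1}_1}_1 evaluates at iteration n to
\mathrm{val}(n) \;=\; 0 + \sum_{k=0}^{n-1} (1 + k)
             \;=\; n + \frac{n(n-1)}{2}
             \;=\; \frac{n(n+1)}{2}
```

which is quadratic in n, while the outer {., +, 1}_2 evolution in loop 2 stays affine.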


Re: Allow non-wi wi

2017-10-06 Thread Mike Stump

> On Oct 6, 2017, at 2:35 AM, Richard Sandiford  
> wrote:
> 
> Richard Biener  writes:
>> On Tue, Oct 3, 2017 at 8:34 PM, Richard Sandiford
>>  wrote:
>>> This patch uses global rather than member operators for wide-int.h,
>>> so that the first operand can be a non-wide-int type.
>> 
>> Not sure why we had the in-class ones.  If we had some good arguments
>> they'd still stand.  Do you remember?
> 
> Not really, sorry.

No real good reason.  Copying double_int's style is most of the reason.  We 
just wanted to support the API the clients used, and they, at the time, didn't 
require non-member versions.  If they had, we'd have done it outside.



Re: [PATCH, rs6000] Process deferred rescans between mini-passes

2017-10-06 Thread Segher Boessenkool
Hi Bill,

On Wed, Oct 04, 2017 at 04:44:59PM -0500, Bill Schmidt wrote:
> The Power8 swap optimization pass contains a mini-pass to replace certain
> patterns prior to swap optimization proper.  In order for this not to
> distort the dataflow information for swap optimization, we should process
> all the deferred rescans between the two passes.  Currently that is not
> done.  This patch fixes that.
> 
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is
> this ok for trunk?

This is fine, thanks! (also for backports if you want those).


Segher


Re: Patch ping

2017-10-06 Thread Nathan Sidwell

On 10/06/2017 10:12 AM, Jakub Jelinek wrote:

Hi!

I'd like to ping a couple of patches:

http://gcc.gnu.org/ml/gcc-patches/2017-09/msg01237.html
   C++2a P0704R1 - fixing const-qualified pointers to members


ok, thanks


--
Nathan Sidwell


Re: Patch ping

2017-10-06 Thread Nathan Sidwell

On 10/06/2017 10:12 AM, Jakub Jelinek wrote:

Hi!

I'd like to ping a couple of patches:




http://gcc.gnu.org/ml/gcc-patches/2017-09/msg02006.html
   PR c++/82299 - invalid -Wuseless-cast on direct enum init


Agreed, ok, thanks.

nathan

--
Nathan Sidwell


Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-06 Thread Martin Sebor

On 10/06/2017 07:25 AM, Bernd Edlinger wrote:

On 10/05/17 18:16, Martin Sebor wrote:

In my (very quick) tests the warning appears to trigger on all
strictly incompatible conversions, even if they are otherwise
benign, such as:

   int f (const int*);
   typedef int F (int*);

   F* pf1 = f;// -Wincompatible-pointer-types
   F* pf2 = (F*)f;// -Wcast-function-type

Likewise by:

   int f (signed char);
   typedef int F (unsigned char);

   F* pf = (F*)f;// -Wcast-function-type

I'd expect these conversions to be useful and would view warning
for them with -Wextra as excessive.  In fact, I'm not sure I see
the benefit of warning on these casts under any circumstances.



Well, while the first example should be safe,
the second one is probably not:

signed and unsigned char are both promoted to int
by the ABI, but one is in the range -128..127 while the
other is in the range 0..255, right?


Right.  The cast is always safe but whether or not a call to such
a function through the incompatible pointer is also safe depends
on the definition of the function (and on the caller).  If the
function uses just the low bits of the argument then it's most
likely fine for any argument.  If the caller only calls it with
values in the 7-bit range (e.g., the ASCII subset) then it's also
fine.  Otherwise there's the potential for the problem you pointed
out.  (Similarly, if in the first example I gave the cast added
constness to the argument rather than removing it and the function
modified the pointed-to object calling it through the incompatible
pointer on a constant object would also be unsafe.)
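The promotion hazard can be made concrete with a small sketch. The helper names here are hypothetical, not from the patch, and the `(signed char)0x80` conversion is strictly implementation-defined, though it yields -128 on all mainstream targets:

```c
#include <assert.h>

/* Both parameter types promote to int, but the byte 0x80 arrives as
   -128 through signed char and as 128 through unsigned char.  */
static int
seen_as_signed (signed char c)      /* caller's byte lands in -128..127 */
{
  return c;
}

static int
seen_as_unsigned (unsigned char c)  /* caller's byte lands in 0..255 */
{
  return c;
}

typedef int unsigned_fn (unsigned char);

/* The cast itself is always valid; only a call through the result can
   misbehave, which is what -Wcast-function-type flags.  */
static unsigned_fn *pf = (unsigned_fn *) seen_as_signed;
```

So a call through `pf` is fine for arguments in the 7-bit range and silently wrong for anything above it.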

Another class of cases to consider are casts between functions
taking pointers to different but related structs.  Code like this
could be written to mimic C++ calling a base class function on
a derived object.

  struct Base { ... };
  struct Derived { Base b; ... };

  typedef void F (Derived*);

  void foo (Base*);

  F* pf = (F*)foo;

Martin


[PR c++/82424] Dont convert dependent types

2017-10-06 Thread Nathan Sidwell
This fixes a crash with -Wshadow=compatible-local, where we ended up 
trying to convert to or from a dependent type.  Rather than skip that 
outright, I check whether the types are the same (dependent or not) before 
also trying the conversion in the non-dependent case.  I suppose I could try 
matching unqualified variants, but that's probably a slippery slope.


Applying to trunk.

nathan

--
Nathan Sidwell
2017-10-06  Nathan Sidwell  

	cp/
	PR c++/82424
	* name-lookup.c (check_local_shadow): Don't try and convert
	dependent types.

	testsuite/
	PR c++/82424
	* g++.dg/warn/pr82424.C: New.

Index: cp/name-lookup.c
===
--- cp/name-lookup.c	(revision 253493)
+++ cp/name-lookup.c	(working copy)
@@ -2728,7 +2728,11 @@ check_local_shadow (tree decl)
   else if (warn_shadow_local)
 	warning_code = OPT_Wshadow_local;
   else if (warn_shadow_compatible_local
-	   && can_convert (TREE_TYPE (old), TREE_TYPE (decl), tf_none))
+	   && (same_type_p (TREE_TYPE (old), TREE_TYPE (decl))
+		   || (!dependent_type_p (TREE_TYPE (decl))
+		   && !dependent_type_p (TREE_TYPE (old))
+		   && can_convert (TREE_TYPE (old), TREE_TYPE (decl),
+   tf_none
 	warning_code = OPT_Wshadow_compatible_local;
   else
 	return;
Index: testsuite/g++.dg/warn/pr82424.C
===
--- testsuite/g++.dg/warn/pr82424.C	(revision 0)
+++ testsuite/g++.dg/warn/pr82424.C	(working copy)
@@ -0,0 +1,20 @@
+// { dg-additional-options "-Wshadow=compatible-local" }
+
+// pr c++/82424 we were trying to convert between dependent types.
+template  class a
+{
+  struct b;
+  template  void c ();
+};
+template 
+template 
+void
+a::c ()
+{
+  typedef typename T::b b; // Don't go looking inside the typename
+  T thing;
+  {
+T thing; // { dg-warning "shadows a previous local" }
+  }
+}
+


Re: [PATCH] libstdc++: istreambuf_iterator proxy (was: keep attached streambuf)

2017-10-06 Thread François Dumont

On 03/10/2017 22:39, Petr Ovtchenkov wrote:

On Thu, 28 Sep 2017 13:38:06 +0100
Jonathan Wakely  wrote:


On 28/09/17 15:06 +0300, Petr Ovtchenkov wrote:

On Thu, 28 Sep 2017 11:34:25 +0100
Jonathan Wakely  wrote:

+  VERIFY(i == std::istreambuf_iterator());
+
+  VERIFY(memcmp(b, r, 36) == 0);
+
+  s << q;
+  VERIFY(!s.fail());
+
+  copy_n(i, 36, r);

This is undefined behaviour. The end-of-stream iterator value cannot
be dereferenced.

Within this test the istreambuf_iterator in eof state is never dereferenced.

That is quite implementation dependent.

The libc++ and VC++ implementations fail this test, because once an
istreambuf_iterator has been detected to reach end-of-stream it
doesn't get "reset" by changes to the streambuf.

If we keep even "unspecified" behaviour the same, then fixing bugs or
removing drawbacks becomes extremely hard: the drawback would have to be
identified in all libs almost simultaneously.


The libc++ implementation crashes, because operator== on an
end-of-stream iterator sets its streambuf* to null, and any further
increment or dereference will segfault.

So this is testing something that other implementations don't support,
and isn't justifiable from the standard.

I will use N4687 as reference.

27.2.3 par.2 Table 95:

++r

Requires: r is dereferenceable. Postconditions: r is dereferenceable or r is
past-the-end; any copies of the previous value of r are no longer required
either to be dereferenceable or to be in the domain of ==.

(void)r++ equivalent to (void)++r

*r++

{ T tmp = *r;
++r;
return tmp; }

[BTW, you see that r++ without a dereference makes no sense, and even more,

   copies of the previous
   value of r are no longer
   required either to be
   dereferenceable or to be in
   the domain of ==.

From this it follows that the postfix increment operator shouldn't return
an istreambuf_iterator.
]


The test itself simulates "stop and go" istream usage.
stringstream is convenient for illustrating the behaviour, but in "real life"
I would assume a socket or tty in its place.

At the very minimum we should have a comment in the test explaining
how it relies on non-standard, non-portable behaviour.

But I'd prefer to avoid introducing more divergence from other
implementations.

The Standard itself says nothing about the "stop and go" scenario.
At least I don't see any words pro or contra.
But the current implementation blocks usage of istreambuf_iterator
with underlying streams like sockets or ttys, so istreambuf_iterator becomes
an almost useless structure in practice.
Why not create a new istreambuf_iterator each time you need to check 
that the streambuf is no longer at eof?



We have three issues with istreambuf_iterator:
   - debug-dependent behaviour

Fixed.

   - EOL of istreambuf_iterator when it reaches EOF (but this does not mean
 EOL of the associated streambuf)

Controversial.

   - the postfix increment operator returns istreambuf_iterator, but what is
  expected here is a restricted type that accepts only dereference, if possible.

I agree that we need to fix this last point too.

Consider this code:

  std::istringstream inf("abc");
  std::istreambuf_iterator<char> j(inf), eof;
  std::istreambuf_iterator<char> i = j++;

  assert( *i == 'a' );

At this point it looks like i is pointing to 'a' but then when you do:

std::string str(i, eof);

you have:
assert( str == "ac" );

We jump over the 'b'.

We could improve the situation by adding a debug assertion that _M_c is 
eof when pre-increment is being used, or by changing the semantics of 
pre-increment to only call sbumpc if _M_c is eof. But then we would need 
to consider _M_c in the find overload and in some other places in the lib, 
I think.


Rather than going through this complicated path I agree with Petr that 
we need to simply implement the solution advised by the Standard with 
the nested proxy type.
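Roughly, the proxy the Standard describes looks like the following. This is a simplified sketch of the idea, not the actual libstdc++ implementation, and `proxy_sketch` is a hypothetical name:

```cpp
#include <sstream>
#include <streambuf>
#include <cassert>

// The proxy returned by istreambuf_iterator::operator++(int) captures
// the character that was current when the increment happened, so *r++
// still yields that character even though the streambuf has already
// advanced past it.
template <typename CharT, typename Traits = std::char_traits<CharT> >
class proxy_sketch
{
  CharT keep_;                                 // character to hand back
  std::basic_streambuf<CharT, Traits>* sbuf_;  // buffer we advanced past

public:
  proxy_sketch(CharT c, std::basic_streambuf<CharT, Traits>* s)
  : keep_(c), sbuf_(s) { }

  CharT operator*() const { return keep_; }    // the only operation allowed
};
```

Because the proxy only supports dereference, the "ac" skip shown above cannot happen: `*j++` hands back the saved 'a' instead of re-reading the advanced streambuf.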


This is what I have done in the attached patch in a naive way. Do we 
need to maintain ABI compatibility here? If so I'll rework it.


This patch will make libstdc++ pass the llvm test. I even duplicated it 
on our side with a small refinement to check the return value of 
proxy::operator*().


François

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 64b8cfd..a556fce 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -95,12 +95,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // NB: This implementation assumes the "end of stream" value
   // is EOF, or -1.
   mutable streambuf_type*	_M_sbuf;
-  int_type			_M_c;
 
 public:
   ///  Construct end of input stream iterator.
   _GLIBCXX_CONSTEXPR istreambuf_iterator() _GLIBCXX_USE_NOEXCEPT
-  : _M_sbuf(0), _M_c(traits_type::eof()) { }
+  : _M_sbuf() { }
 
 #if __cplusplus >= 201103L
   istreambuf_iterator(const istreambuf_iterator&) noexcept = default;
@@ -110,11 +109,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   ///  Construct start of input stream iterator.
   istreambuf_i

Re: [PATCH][GCC][testsuite][mid-end][ARM][AARCH64] Fix failing vec align tests.

2017-10-06 Thread Christophe Lyon
On 6 October 2017 at 09:45, Tamar Christina  wrote:
>
>
>> -Original Message-
>> From: Rainer Orth [mailto:r...@cebitec.uni-bielefeld.de]
>> Sent: 05 October 2017 20:16
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd; James Greenhalgh; Richard Earnshaw;
>> Marcus Shawcroft
>> Subject: Re: [PATCH][GCC][testsuite][mid-end][ARM][AARCH64] Fix failing
>> vec align tests.
>>
>> Hi Tamar,
>>
>> > Previously I had corrected the vect_hw_misalign check which prompted
>> > these three tests to start failing because the condition needs to be
>> > inverted in the testcases.
>> >
>> > Regtested on aarch64-none-elf, arm-none-linux-gnueabihf and x86_64-pc-
>> linux-gnu.
>> >
>> > Ok for trunk?
>> >
>> > Thanks,
>> > Tamar.
>> >
>> > gcc/testsuite/
>> > 2017-10-02  Tamar Christina  
>> >
>> > * gcc.dg/vect/vect-align-1.c: Fix vect_hw_misalign condition.
>> > * gcc.dg/vect/vect-align-2.c: Likewise.
>> > * gcc.dg/vect/vect-multitypes-1.c: Likewise.
>>
>> unfortunately, your patch caused gcc.dg/vect/vect-multitypes-1.c to FAIL on
>> sparc-sun-solaris2.11 (32 and 64-bit):
>>
>> FAIL: gcc.dg/vect/vect-multitypes-1.c -flto -ffat-lto-objects  
>> scan-tree-dump-
>> times vect "Vectorizing an unaligned access" 4
>> FAIL: gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect
>> "Vectorizing an unaligned access" 4
>
> Thanks! I'll take a look.
>

If that's easier for you, I've noticed the same thing on
armeb-none-linux-gnueabihf
--with-mode arm
--with-cpu cortex-a9
--with-fpu neon-fp16

Christophe


> Tamar
>
>>
>> It had XFAILed before.
>>
>>   Rainer
>>
>> --
>> -
>> Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Add selftest for vec::reverse

2017-10-06 Thread David Malcolm
Martin: I noticed that your switch expansion patch added a
  vec::reverse ()
method.  Here's a proposed selftest for it, mostly to verify
that it handles even vs odd lengths (which it does).

Only lightly tested; hope this is useful.
Dave

gcc/ChangeLog:
* vec.c (selftest::test_reverse): New function.
(selftest::vec_c_tests): Call it.
---
 gcc/vec.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/gcc/vec.c b/gcc/vec.c
index d612703..5d70973 100644
--- a/gcc/vec.c
+++ b/gcc/vec.c
@@ -359,6 +359,43 @@ test_qsort ()
   ASSERT_EQ (10, v.length ());
 }
 
+/* Verify that vec::reverse works correctly.  */
+
+static void
+test_reverse ()
+{
+  /* Reversing an empty vec ought to be a no-op.  */
+  {
+auto_vec <int> v;
+ASSERT_EQ (0, v.length ());
+v.reverse ();
+ASSERT_EQ (0, v.length ());
+  }
+
+  /* Verify reversing a vec with even length.  */
+  {
+auto_vec <int> v;
+safe_push_range (v, 0, 4);
+v.reverse ();
+ASSERT_EQ (3, v[0]);
+ASSERT_EQ (2, v[1]);
+ASSERT_EQ (1, v[2]);
+ASSERT_EQ (0, v[3]);
+ASSERT_EQ (4, v.length ());
+  }
+
+  /* Verify reversing a vec with odd length.  */
+  {
+auto_vec <int> v;
+safe_push_range (v, 0, 3);
+v.reverse ();
+ASSERT_EQ (2, v[0]);
+ASSERT_EQ (1, v[1]);
+ASSERT_EQ (0, v[2]);
+ASSERT_EQ (3, v.length ());
+  }
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -374,6 +411,7 @@ vec_c_tests ()
   test_unordered_remove ();
   test_block_remove ();
   test_qsort ();
+  test_reverse ();
 }
 
 } // namespace selftest
-- 
1.8.5.3
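The in-place reverse the selftest exercises can be sketched as follows, written here against std::vector rather than GCC's vec type (so the names and container are assumptions, not the patch's code):

```cpp
#include <vector>
#include <utility>
#include <cstddef>
#include <cassert>

// Swap the ends moving inward.  With an odd length the middle element
// simply stays put, which is exactly why the selftest covers both
// parities (and the empty no-op case).
template <typename T>
void
reverse_in_place (std::vector<T> &v)
{
  for (std::size_t i = 0, j = v.size (); i + 1 < j; ++i, --j)
    std::swap (v[i], v[j - 1]);
}
```

For a length-4 input {0,1,2,3} the loop performs two swaps; for length 3 it performs one, leaving v[1] untouched.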



Re: [PATCH][GCC][ARM] Dot Product commandline options [Patch (1/8)]

2017-10-06 Thread Richard Earnshaw (lists)
On 06/10/17 13:44, Tamar Christina wrote:
> Hi All,
> 
> This is a respin of the patch with the feedback processed.
> 
> Regtested on arm-none-eabi, armeb-none-eabi,
> aarch64-none-elf and aarch64_be-none-elf with no issues found.
> 
> Ok for trunk?
> 
> gcc/
> 2017-10-06  Tamar Christina  
> 
> * config/arm/arm.h (TARGET_DOTPROD): New.
> * config/arm/arm.c (arm_arch_dotprod): New.
> (arm_option_reconfigure_globals): Add arm_arch_dotprod.
> * config/arm/arm-c.c (__ARM_FEATURE_DOTPROD): New.
> * config/arm/arm-cpus.in (cortex-a55, cortex-75): Enabled +dotprod.
> (armv8.2-a, cortex-a75.cortex-a55): Likewise.
> (feature dotprod, group dotprod, ALL_SIMD_INTERNAL): New.
> (ALL_FPU_INTERNAL): Use ALL_SIMD_INTERNAL.
> * config/arm/t-multilib (v8_2_a_simd_variants): Add dotprod.
> * doc/invoke.texi (armv8.2-a): Document dotprod
> 
> From: Kyrill Tkachov 
> Sent: Wednesday, September 13, 2017 11:01:55 AM
> To: Tamar Christina; gcc-patches@gcc.gnu.org
> Cc: nd; Ramana Radhakrishnan; Richard Earnshaw; ni...@redhat.com
> Subject: Re: [PATCH][GCC][ARM] Dot Product commandline options [Patch (1/8)]
> 
> Hi Tamar,
> 
> On 01/09/17 14:19, Tamar Christina wrote:
>> Hi All,
>>
>> This patch adds support for the +dotprod extension to ARM.
>> Dot Product requires Adv.SIMD to work and so enables this option
>> by default when enabled.
>>
>> It is available from ARMv8.2-a and onwards and is enabled by
>> default on Cortex-A55 and Cortex-A75.
>>
>> Regtested and bootstrapped on arm-none-eabi and no issues.
> 
> I'm assuming you mean arm-none-linux-gnueabihf :)
> 
>> Ok for trunk?
>>
>> gcc/
>> 2017-09-01  Tamar Christina  
>>
>>   * config/arm/arm.h (TARGET_DOTPROD): New.
>>   * config/arm/arm.c (arm_arch_dotprod): New.
>>   (arm_option_reconfigure_globals): Add arm_arch_dotprod.
>>   * config/arm/arm-c.c (__ARM_FEATURE_DOTPROD): New.
>>   * config/arm/arm-cpus.in (cortex-a55, cortex-75): Enabled +dotprod.
>>   (armv8.2-a, cortex-a75.cortex-a55): Likewise.
>>   * config/arm/arm-isa.h (isa_bit_dotprod, ISA_DOTPROD): New.
> 
> arm-isa.h is now autogenerated after r251799 so you'll need to rebase on
> top of that.
> That being said, that patch was temporarily reverted [1] so you'll have
> to apply it manually in your
> tree to rebase, or wait until it is reapplied.
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00579.html
> 
> The patch looks ok to me otherwise with a documentation nit below.
> 
>>   * config/arm/t-multilib (v8_2_a_simd_variants): Add dotprod.
>>   * doc/invoke.texi (armv8.2-a): Document dotprod
>>
> 
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -15492,6 +15492,10 @@ The ARMv8.1 Advanced SIMD and floating-point 
> instructions.
>   The cryptographic instructions.  This also enables the Advanced SIMD and
>   floating-point instructions.
> 
> +@item +dotprod
> +Enable the Dot Product extension.  This also enables Advanced SIMD 
> instructions
> +and allows auto vectorization of dot products to the Dot Product 
> instructions.
> 
> This should be "auto-vectorization"
> 
> Thanks,
> Kyrill
> 
> 
> 


Hmm, can you arrange to add patches as text/plain attachments, so that
when I reply the patch is included for comments, please?

+# double-precision FP. Make sure bits that are not an FPU bit go
instructions
+# ALL_SIMD instead of ALL_SIMD_INTERNAL.

Two spaces after full stop.  The new sentence doesn't make sense.
Instead, I think you should probably put the following:

"ALL_FPU lists all the feature bits associated with the floating-point
unit; these will all be removed if the floating-point unit is disabled
(eg -mfloat-abi=soft).  ALL_FPU_INTERNAL must ONLY contain features that
form part of a named -mfpu option; it is used to map the capabilities
back to a named FPU for the benefit of the assembler.  ALL_SIMD_INTERNAL
and ALL_SIMD are similarly defined to help with the construction of
ALL_FPU and ALL_FPU_INTERNAL; they describe the SIMD extensions that are
either part of a named FPU or optional extensions respectively."

You might need to rejig the other sentence there as well to make it more
consistent.

@@ -239,6 +243,7 @@ define fgroup FP_D32FP_DBL fp_d32
 define fgroup FP_ARMv8 FPv5 FP_D32
 define fgroup NEON FP_D32 neon
 define fgroup CRYPTO   NEON crypto
+define fgroup DOTPROD   NEON dotprod

lines above have a hard tab between the group name and the features it
contains.  Your entry has spaces.  Please fix for consistency.

@@ -1473,9 +1479,10 @@ begin cpu cortex-a55
  cname cortexa55
  tune for cortex-a53
  tune flags LDSCHED
- architecture armv8.2-a+fp16
+ architecture armv8.2-a+fp16+dotprod
  fpu neon-fp-armv8
  option crypto add FP_ARMv8 CRYPTO
+ option dotprod add FP_ARMv8 DOTPROD

We don't have an option entry for +fp16 (all Cortex-a55 cores implement
it), so we should treat dotprod similarly here.  Crypto is a specia

Re: [PATCH] Add sanopt support for UBSAN_PTR.

2017-10-06 Thread Martin Sebor

On 10/06/2017 04:18 AM, Martin Liška wrote:

On 10/05/2017 07:06 PM, Martin Sebor wrote:

On 10/04/2017 03:05 AM, Martin Liška wrote:

Hello.

The following patch adds support for optimizing out unnecessary UBSAN_PTR checks.
It handles positive and negative offsets separately; a zero offset is ignored.
Apart from that, we utilize get_inner_reference for local and global variables,
which also helps to reduce the number of checks.
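The redundancy being removed can be illustrated with a small model. This is hypothetical code sketching the idea, not the pass's actual implementation:

```c
#include <stdint.h>
#include <assert.h>

/* Each UBSAN_PTR check for a positive offset verifies that p + off does
   not wrap around the address space.  */
static int
ptr_plus_offset_ok (uintptr_t p, uintptr_t off)
{
  return p + off >= p;   /* false exactly when the addition wrapped */
}

/* If a dominating check already proved p + 8 safe, a later check of
   p + 4 on the same pointer with the same offset sign is implied by it,
   so the smaller-offset check can be dropped.  (Negative offsets are
   tracked separately with the mirror-image condition.)  */
static int
dominated_check_p (uintptr_t proven_off, uintptr_t cur_off)
{
  return cur_off <= proven_off;
}
```

This is why positive and negative offsets must be kept apart: a proven p + 8 says nothing about p - 8.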

Some numbers:

1) postgres:

bloaty /tmp/after2 -- /tmp/before2
 VM SIZE  FILE SIZE
 ++ GROWING++
  [ = ]   0 .debug_abbrev  +1.84Ki  +0.3%

 -- SHRINKING  --
 -30.3% -3.98Mi .text  -3.98Mi -30.3%
  [ = ]   0 .debug_info-3.69Mi -17.2%
  [ = ]   0 .debug_loc -2.02Mi -13.4%
 -43.1% -1.37Mi .data  -1.37Mi -43.1%
  [ = ]   0 .debug_ranges   -390Ki -14.3%
  [ = ]   0 .debug_line -295Ki -11.6%
  -4.0% -26.3Ki .eh_frame  -26.3Ki  -4.0%
  [ = ]   0 [Unmapped] -1.61Ki -62.3%
  [ = ]   0 .strtab-1.15Ki  -0.4%
  [ = ]   0 .symtab-1.08Ki  -0.3%
  -0.4%-368 .eh_frame_hdr -368  -0.4%
  [ = ]   0 .debug_aranges-256  -0.7%
  [DEL] -16 [None]   0  [ = ]

 -28.0% -5.37Mi TOTAL  -11.8Mi -18.8%


This looks like an impressive improvement!  FWIW, I've been
meaning to look into similar opportunities mentioned in bug
79265.


Hi. Thank you very much for the feedback. If you want, I can help with the PR.



Would making use of get_range_info() make sense here to detect
and eliminate even more cases?

Just a few minor mostly stylistic suggestions.

+/* Traits class for tree triplet hash maps below.  */
+
+struct sanopt_tree_couple_hash : typed_noop_remove 
+{
...
+  static inline bool
+  equal (const sanopt_tree_couple &ref1, const sanopt_tree_couple &ref2)

Member functions defined within the body of the class are implicitly
inline so while not wrong, there is no need to declare them inline
explicitly.


Done that in v2.



Also, since mark_deleted uses reinterpret_cast (as suggested by
GCC coding conventions) it seems that is_deleted should do the
same for consistency.  Alternatively, if there isn't enough
interest/consensus to follow this guideline perhaps it should
be removed from the GCC coding conventions.  (Very little GCC code
seems to use reinterpret_cast.)


Likewise.




+/* Return true when pointer PTR for a given OFFSET is already sanitized
+   in a given sanitization context CTX.  */

Shouldn't the comment read CUR_OFFSET?  I ask because the function
also declares a local variable OFFSET.


Yes, should be fixed.



+static bool
+has_dominating_ubsan_ptr_check (sanopt_ctx *ctx, tree ptr, tree cur_offset)
+{
...
+  /* We already have recorded a UBSAN_PTR check for this pointer. Perhaps we
+ can drop this one.  But only if this check doesn't specify larger offset.
+ */
+  tree offset = gimple_call_arg (g, 1);

Martin

PS It seems to me that the test could be enabled for all targets
where UBSan is supported by making use of SIZE_MAX to compute
the values of the constants instead of hardwiring LP64 values.
I noticed the test doesn't exercise member arrays.  Are those
not handled by the patch?


I decided to use __PTRDIFF_MAX__ as I need a signed type. Is it OK?


I think that should work too.  ptrdiff_t is usually the same size
as size_t.  The two exceptions I could find are VMS and the M32C
target where size_t is always 32-bits wide but ptrdiff_t can be
64-bits.  I don't know what that might mean for the sanitizer (or
if it's even supported there).

Martin



Martin




Left checks:
261039
Optimized out:
85643

2) tramp3d:

bloaty after -- before
 VM SIZE FILE SIZE
 ++ GROWING   ++
  +167% +30 [Unmapped]+1.01Ki   +39%

 -- SHRINKING --
 -58.5% -2.52Mi .text -2.52Mi -58.5%
 -64.2%  -574Ki .data  -574Ki -64.2%
  -5.7% -4.27Ki .eh_frame -4.27Ki  -5.7%
  -6.4% -1.06Ki .gcc_except_table -1.06Ki  -6.4%
  -7.2%-192 .bss0  [ = ]
  -0.1% -32 .rodata   -32  -0.1%
  [ = ]   0 .strtab   -29  -0.0%
  -1.1% -24 .dynsym   -24  -1.1%
  -1.5% -24 .rela.plt -24  -1.5%
  [ = ]   0 .symtab   -24  -0.1%
  -0.6% -16 .dynstr   -16  -0.6%
  -1.5% -16 .plt  -16  -1.5%
  -1.4%  -8 .got.plt   -8  -1.4%
  -0.6%  -4 .hash  -4  -0.6%
  -1.1%  -2 .gnu.version   -2  -1.1%

 -58.0% -3.09Mi TOTAL -3.08Mi -55.0%

Left checks:
31131
Optimized out:
36752

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2017-10-04  Martin Liska  

* sanopt.c (struct sanopt_tree_couple): New struct.
(struct sanopt_tree_couple_hash): Likewise.
(struct sano

Re: [PATCH] Add sanopt support for UBSAN_PTR.

2017-10-06 Thread Jakub Jelinek
On Fri, Oct 06, 2017 at 10:30:11AM -0600, Martin Sebor wrote:
> I think that should work too.  ptrdiff_t is usually the same size
> as size_t.  The two exceptions I could find are VMS and the M32C
> target where size_t is always 32-bits wide but ptrdiff_t can be
> 64-bits.  I don't know what that might mean for the sanitizer (or
> if it's even supported there).

-fsanitize=pointer-overflow does nothing
if (TYPE_PRECISION (sizetype) != POINTER_SIZE)
But the runtime library is supported only on
  x86_64-*-linux* | i?86-*-linux*)
  powerpc*-*-linux*)
  sparc*-*-linux*)
  s390*-*-linux*)
  arm*-*-linux*)
  aarch64*-*-linux*)
  x86_64-*-darwin[1]* | i?86-*-darwin[1]*)
anyway, and on all these targets the sizes are the same and so
is ptrdiff_t.

Jakub


Re: [PATCH], Add PowerPC ISA 3.0 Atomic Memory Operation functions

2017-10-06 Thread Segher Boessenkool
Hi Mike,

On Thu, Oct 05, 2017 at 06:14:14PM -0400, Michael Meissner wrote:
> This patch adds support for most of the PowerPC ISA 3.0 Atomic Memory 
> Operation
> instructions, listed in section 4.5 of the manual.  Currently these functions
> are done via extended asm.  At some point in the future, I will probably move
> the inner part of the patch to use new built-in functions to replace the
> extended asm.

Just some asm comments for now, have to think about the rest a bit more.

> +#define _AMO_LD_SIMPLE(NAME, TYPE, OPCODE, FC) \
> +static __inline__ TYPE \
> +NAME (TYPE *_PTR, TYPE _VALUE) \
> +{ \
> +  unsigned __int128 _TMP; \
> +  TYPE _RET; \
> +  __asm__ volatile ("mr %L1,%3\n" \
> + "\t" OPCODE " %1,%4,%5\t\t# %0\n" \
> + "\tmr %2,%1\n" \
> + : "+Q" (_PTR[0]), "=&r" (_TMP), "=r" (_RET) \
> + : "r" (_VALUE), "b" (&_PTR[0]), "n" (FC)); \
> +  return _RET; \
> +}

You don't need arg 4: "%P0" is the same thing.

Do you really need the mr insns?  Can't you express that in the
arguments?  Perhaps using a union of __int128 with something that
is two long ints?

> +#define _AMO_ST_SIMPLE(NAME, TYPE, OPCODE, FC) \
> +static __inline__ void \
> +NAME (TYPE *_PTR, TYPE _VALUE) \
> +{ \
> +  __asm__ volatile (OPCODE " %1,%2,%3\t\t# %0" \
> + : "+Q" (_PTR[0]) \
> + : "r" (_VALUE), "b" (&_PTR[0]), "n" (FC)); \
> +  return; \
> +}

Same for "b" (arg 2) here.


Segher


Re: [PATCH][GCC] Simplification of 1U << (31 - x)

2017-10-06 Thread Sudi Das

Hi Jakub

As per the discussions, I have created a bug report for the possible
regression this may cause.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82454

Sudi


From: Wilco Dijkstra
Sent: Tuesday, September 26, 2017 2:20 PM
To: Sudi Das; Jakub Jelinek
Cc: Richard Biener; GCC Patches; nd; Richard Earnshaw; James Greenhalgh
Subject: Re: [PATCH][GCC] Simplification of 1U << (31 - x)
    
Jakub Jelinek wrote:

> Well, we don't want to regress performance wise on one of the most important
> primary targets.  I don't care that much if the RTL/backend work is done
> together with the patch, or as a follow-up during stage1/3, but it should be
> done, the testcases I've posted can be used as a basis of a P1 runtime
> performance regression.

It should be sufficient to file a bug about inefficient 64-bit constant 
expansions on
x64. I didn't see a significant difference in my benchmarking of it on x64, so 
I'd say
it's only a performance regression if large benchmarks regress measurably (quite
unlikely).

Wilco

Re: [PATCH] Add sanopt support for UBSAN_PTR.

2017-10-06 Thread Martin Sebor

This looks like an impressive improvement!  FWIW, I've been
meaning to look into similar opportunities mentioned in bug
79265.


Hi. Thank you very much for the feedback. If you want, I can help with the PR.


I belatedly realized I didn't answer one of your questions (or
got an answer to one of mine :) so let me correct my oversight.

If you have the cycles to also tackle bug 79265 that would be
great. (I don't think of the bugs I file as my projects so if
one strikes you as interesting please feel free to grab it.)





Would making use of get_range_info() make sense here to detect
and eliminate even more cases?


I am curious what you think of this idea.

Martin


Re: [PATCH] Implement smart multiple switch expansion algorithms.

2017-10-06 Thread Mikhail Maltsev
Hi.

I have watched Martin's talk on switch lowering improvements (
https://slideslive.com/38902416/switch-lowering-improvements), the last
slide has a question about benchmarks that can be used for tuning the
switch statement optimizer. Martin mentioned one common use case - bytecode
interpreters (such as perlbench from spec CPU 2006 and 2017). But there is
a caveat with modern bytecode interpreters, such as CPython: they use
computed gotos instead of switch statements and also implement the
"Threaded code" technique to improve utilization of the CPU's branch
predictor. (see this comment for detailed explanation:
https://github.com/python/cpython/blob/master/Python/ceval.c#L585)

Another common use case involving hot switch statements are various lexers
and parsers (either hand-coded or generated by tools such as ragel and
re2c). For example, a well-known web server Nginx uses several huge
hand-coded switch statements to parse HTTP requests (
http://lxr.nginx.org/source/src/http/ngx_http_parse.c).

I found an isolated benchmark for this parser: https://natsys-lab.blogspot.
ru/2014/11/the-fast-finite-state-machine-for-http.html (code:
https://github.com/natsys/blog/tree/master/http_benchmark). I hope this can
be helpful for performance analysis.


On Fri, Oct 6, 2017 at 4:46 PM, Wilco Dijkstra 
wrote:

> Martin Liska wrote:
>
> > There are some numbers for cc1plus:
> >
> > $ bloaty ./objdir2/gcc/cc1plus -- ./objdir/gcc/cc1plus
> > VM SIZE  FILE SIZE
> >   +3.8% +1.11Mi TOTAL  +1.03Mi  +0.5%
>
> > insn-attrtab.o:
> > VM SIZE  FILE SIZE
> >   +214%  +682Ki .rodata +682Ki  +214%
> >  -50.1% -63.3Ki .text  -63.3Ki -50.1%
>
> So is that a 3.8% codesize increase or decrease? If an increase,
> I can't see how replacing 63KB of instructions with 682KB of data
> is a good tradeoff... There should be an accurate calculation
> of the density, taking the switch table width into account (really small
> tables can use 1-byte offsets, large tables are typically forced to
> use 4-byte offsets). This may need new target callbacks - I changed
> PARAM_CASE_VALUES_THRESHOLD on AArch64 to get smaller
> code and better performance since the current density calculations
> are hardcoded and quite wrong for big tables...
>
> Also what is the codesize difference on SPEC2006/2017? I don't see
> any mention of performance impact either...
>
> Wilco




-- 
Regards,
   Mikhail Maltsev


Re: [PATCH] Implement smart multiple switch expansion algorithms.

2017-10-06 Thread David Malcolm
On Fri, 2017-10-06 at 14:25 +0200, Martin Liška wrote:
> Hello.
> 
> As presented at this year's Cauldron, I rewrote current switch
> expansion to support
> multiple algorithms (jump table and bit-test) that can be used just
> for a fraction
> of cases. Balanced decision tree is built on top of that. I decided
> to do a bigger
> refactoring and put all there 3 mentioned algorithm to its own class.
> 
> There's a bigger change in jump_table_cluster::can_be_handled where
> the constant 10 (3 respectively)
> is compared to the number of handled values, and not the number of cases.
> The latter is wrong in my opinion.
> 
> There are some numbers for cc1plus:
> 
> $ bloaty ./objdir2/gcc/cc1plus -- ./objdir/gcc/cc1plus
>  VM SIZE  FILE SIZE
>  ++ GROWING++
>+19% +1.32Mi .rodata+1.32Mi   +19%
>   [ = ]   0 .symtab+7.38Ki  +0.5%
>   [ = ]   0 .strtab+5.18Ki  +0.2%
>   [ = ]   0 .debug_info   +743  +0.0%
>   +0.0%+712 .eh_frame +712  +0.0%
>   +0.1%+440 .eh_frame_hdr +440  +0.1%
>   [ = ]   0 .debug_aranges +80  +0.1%
>   +0.0% +67 .dynstr+67  +0.0%
>   [ = ]   0 .debug_str  +6  +0.0%
> 
>  -- SHRINKING  --
>   -1.2%  -214Ki .text   -214Ki  -1.2%
>   [ = ]   0 .debug_loc -66.7Ki  -0.1%
>   [ = ]   0 .debug_line-14.0Ki  -0.2%
>   [ = ]   0 .debug_ranges  -9.56Ki  -0.1%
>   -6.8%  -3 [Unmapped]-375 -14.4%
>   [ = ]   0 .debug_abbrev  -46  -0.0%
> 
>   +3.8% +1.11Mi TOTAL  +1.03Mi  +0.5%
> 
> So it grows, and that can be easily explained:
> 
> insn-attrtab.o:
>  VM SIZE  FILE SIZE
>  ++ GROWING++
>   [ = ]   0 .rela.rodata   +2.00Mi  +215%
>   +214%  +682Ki .rodata +682Ki  +214%
>   [ = ]   0 .rela.debug_loc+29.3Ki   +36%
>+32% +1.91Ki .eh_frame  +1.91Ki   +32%
>   [ = ]   0 .debug_loc +37  +5.6%
>   [ = ]   0 .debug_str  +2  +0.0%
> 
>  -- SHRINKING  --
>  -50.1% -63.3Ki .text  -63.3Ki -50.1%
>   [ = ]   0 .debug_line-1.71Ki -10.4%
>   [ = ]   0 .rela.debug_info  -768  -0.3%
>   [ = ]   0 .rela.text-624  -0.8%
>   [ = ]   0 .rela.debug_ranges-384  -2.7%
>   [ = ]   0 .debug_info-87  -0.2%
>   [ = ]   0 [Unmapped]  -2  -8.7%
> 
>   +137%  +621Ki TOTAL  +2.63Mi  +139%
> 
> It's caused by:
> ...
> ;; GIMPLE switch case clusters: JT(2710):-1-5090 
> ;; GIMPLE switch case clusters: JT(2710):-1-5090 
> ;; GIMPLE switch case clusters: JT(2967):-1-5090 
> ;; GIMPLE switch case clusters: JT(1033):-1-5017 
> ...
> 
> so there are many switch statements with very reasonable density.
> 
> and
> insn-dfatab.o:+13% +99.4Ki TOTAL   +455Ki   +14%
> insn-latencytab.o:   +19%  +136Ki TOTAL   +609Ki   +20%
> 
> There shouldn't be any fallout other from what I mentioned in
> previous email that is
> a prerequisite for this patch.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression
> tests.
> 
> I'm still planning to come with some numbers and stats. Will do that
> next week.
> 
> Martin


> gcc/ChangeLog:
> 
> 2017-09-29  Martin Liska  
> 
>   * tree-switch-conversion.c (MAX_CASE_BIT_TESTS): Remove and move
>   to header file.
>   (hoist_edge_and_branch_if_true): Move to ...
>   (bit_test_cluster::hoist_edge_and_branch_if_true): ... this.

I'm not a reviewer, but there's a lot of "churn" in the patch due to
things moving around; there seems to be a mixture of things simply
moving around, functions becoming methods of classes, and some
things changing.

Would it be helpful to split the patch into two: a patch that moves
the functions to where they'll be needed, and then a patch to do the
"actual" (or functional) changes?

[...snip...]

> diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
> new file mode 100644
> index 000..45ae11f408d
> --- /dev/null
> +++ b/gcc/tree-switch-conversion.h
> @@ -0,0 +1,806 @@
> +/* Tree switch conversion for GNU compiler.
> +   Copyright (C) 2017 Free Software Foundation, Inc.

Should this be "gimple switch conversion" and
gimple-ssa-switch-conversion.h?
(but presumably this is to mirror tree-switch-conversion.c, and
I don't want to advocate for unnecessary churn)

> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public 

Re: [PATCH] libstdc++: istreambuf_iterator proxy (was: keep attached streambuf)

2017-10-06 Thread Petr Ovtchenkov
On Fri, 6 Oct 2017 18:01:36 +0200
François Dumont  wrote:

> ...
> >>> The test itself simulate "stop and go" istream usage.
> >>> stringstream is convenient for behaviuor illustration, but in "real life"
> >>> I can assume socket or tty on this place.
> >> At the very minimum we should have a comment in the test explaining
> >> how it relies on non-standard, non-portable behaviour.
> >>
> >> But I'd prefer to avoid introducing more divergence from other
> >> implementations.
> > Standard itself say nothting about "stop and go" scenario.
> > At least I don't see any words pro or contra.
> > But current implementation block usage of istreambuf_iterator
> > with underlying streams like socket or tty, so istreambuf_iterator become
> > almost useless structure for practice.
> Why not create a new istreambuf_iterator each time you need to check
> that the streambuf is not at eof anymore?

That's a strange question, because EOL (end-of-life) for an istreambuf_iterator
object after EOF of the associated streambuf

  - does not follow directly from the standard, only from (IMO wrong)
implementations, and this is an obstacle to useful usage of istreambuf_iterator;
  - violates expectations from the point of view of the C++ object life cycle;
  - requires destruction and construction of the istreambuf_iterator
object after a check for equality with istreambuf_iterator()
(a strange trick).

> 
> > We have three issues with istreambuf_iterator:
> >- debug-dependent behaviour
> Fixed.

+   __glibcxx_requires_cond(_M_sbuf,
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
vs

+   __glibcxx_requires_cond(_M_sbuf && !_S_is_eof(_M_sbuf->sgetc()),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));

and

+   __glibcxx_requires_cond(_M_sbuf
+   && !traits_type::eq_int_type(_M_c, traits_type::eof()),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
vs

+   __glibcxx_requires_cond(_M_sbuf && !_S_is_eof(_M_sbuf->sgetc()),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));

I insist on the first variant. It is free from code that may lead to different
behaviour between debug/non-debug builds (like _M_sbuf->sgetc()).



> >- EOL of istreambuf_iterator when it reach EOF (but this not mean
> >  EOL of associated streambuf)
> Controversial.
> >- postfix increment operator return istreambuf_iterator, but here
> >  expected restricted type, that accept only dereference, if it possible.
> I agree that we need to fix this last point too.
> 
> Consider this code:
> 
>    std::istringstream inf("abc");
>    std::istreambuf_iterator<char> j(inf), eof;
>    std::istreambuf_iterator<char> i = j++;
> 
>    assert( *i == 'a' );
> 
> At this point it looks like i is pointing to 'a' but then when you do:
> 
> std::string str(i, eof);
> 
> you have:
> assert( str == "ac" );

No. I mean that in my last suggestion ([PATCH])

   std::istreambuf_iterator<char> i = j++;

is impossible ("impossible" here is in agreement with the standard).
So the test test01() in 2.cc isn't correct.

> 
> We jump over the 'b'.
> 
> We could improve the situation by adding a debug assertion that _M_c is 
> eof when pre-increment is being used or by changing semantic of 
> pre-increment to only call sbumpc if _M_c is eof. But then we would need 
> to consider _M_c in find overload and in some other places in the lib I 
> think.
> 
> Rather than going through this complicated path I agree with Petr that 
> we need to simply implement the solution advised by the Standard with 
> the nested proxy type.
> 
> This is what I have done in the attached patch in a naive way. Do we 
> need to have abi compatibility here ? If so I'll rework it.
> 
> This patch will make libstdc++ pass the llvm test. I even duplicate it 
> on our side with a small refinement to check for the return value of the 
> proxy::operator*().
> 
> François
> 

--

  - ptr


Re: [PATCH], Add PowerPC ISA 3.0 Atomic Memory Operation functions

2017-10-06 Thread Michael Meissner
On Fri, Oct 06, 2017 at 11:45:10AM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Thu, Oct 05, 2017 at 06:14:14PM -0400, Michael Meissner wrote:
> > This patch adds support for most of the PowerPC ISA 3.0 Atomic Memory 
> > Operation
> > instructions, listed in section 4.5 of the manual.  Currently these 
> > functions
> > are done via extended asm.  At some point in the future, I will probably 
> > move
> > the inner part of the patch to use new built-in functions to replace the
> > extended asm.
> 
> Just some asm comments for now, have to think about the rest a bit more.
> 
> > +#define _AMO_LD_SIMPLE(NAME, TYPE, OPCODE, FC) \
> > +static __inline__ TYPE \
> > +NAME (TYPE *_PTR, TYPE _VALUE) \
> > +{ \
> > +  unsigned __int128 _TMP; \
> > +  TYPE _RET; \
> > +  __asm__ volatile ("mr %L1,%3\n" \
> > +   "\t" OPCODE " %1,%4,%5\t\t# %0\n" \
> > +   "\tmr %2,%1\n" \
> > +   : "+Q" (_PTR[0]), "=&r" (_TMP), "=r" (_RET) \
> > +   : "r" (_VALUE), "b" (&_PTR[0]), "n" (FC)); \
> > +  return _RET; \
> > +}
> 
> You don't need arg 4: "%P0" is the same thing.

I didn't think there was a print_opcode that did what I wanted, I'll modify it.

> Do you really need the mr insns?  Can't you express that in the
> arguments?  Perhaps using a union of __int128 with something that
> is two long ints?

My first attempt resulted in the compiler doing move directs to form the
__int128 in the vector unit and then move directs back.  So, I figured having
two mr's was a simple price to pay.

I do have thoughts to replace the two with a built-in (but keep amo.h and the
names), and we can probably eliminate some of the moves.

> > +#define _AMO_ST_SIMPLE(NAME, TYPE, OPCODE, FC) \
> > +static __inline__ void \
> > +NAME (TYPE *_PTR, TYPE _VALUE) \
> > +{ \
> > +  __asm__ volatile (OPCODE " %1,%2,%3\t\t# %0" \
> > +   : "+Q" (_PTR[0]) \
> > +   : "r" (_VALUE), "b" (&_PTR[0]), "n" (FC)); \
> > +  return; \
> > +}
> 
> Same for "b" (arg 2) here.
> 
> 
> Segher
> 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



[patch, wwwdocs, committed]

2017-10-06 Thread Thomas Koenig

Hi,

I just committed the change below.  Gerald's bot had no
complaints, so I guess this must be OK :-)


75a76,78
> The main version of libfortran has been changed to 5.
>   
>   
87a91,95
>   
> When an actual argument contains too few elements for a dummy argument,
> an error is now issued. The -std=legacy option can be
> used to still compile such code.
>   


Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-06 Thread Bernd Edlinger
On 10/06/17 17:43, Martin Sebor wrote:
> On 10/06/2017 07:25 AM, Bernd Edlinger wrote:
>> On 10/05/17 18:16, Martin Sebor wrote:
>>> In my (very quick) tests the warning appears to trigger on all
>>> strictly incompatible conversions, even if they are otherwise
>>> benign, such as:
>>>
>>>    int f (const int*);
>>>    typedef int F (int*);
>>>
>>>    F* pf1 = f;    // -Wincompatible-pointer-types
>>>    F* pf2 = (F*)f;    // -Wcast-function-type
>>>
>>> Likewise by:
>>>
>>>    int f (signed char);
>>>    typedef int F (unsigned char);
>>>
>>>    F* pf = (F*)f;    // -Wcast-function-type
>>>
>>> I'd expect these conversions to be useful and would view warning
>>> for them with -Wextra as excessive.  In fact, I'm not sure I see
>>> the benefit of warning on these casts under any circumstances.
>>>
>>
>> Well, while the first example should be safe,
>> the second one is probably not safe:
>>
>> Because the signed and unsigned char are promoted to int,
>> by the ABI but one is in the range -128..127 while the
>> other is in the range 0..255, right?
> 
> Right.  The cast is always safe but whether or not a call to such
> a function through the incompatible pointer is also safe depends
> on the definition of the function (and on the caller).  If the
> function uses just the low bits of the argument then it's most
> likely fine for any argument.  If the caller only calls it with
> values in the 7-bit range (e.g., the ASCII subset) then it's also
> fine.  Otherwise there's the potential for the problem you pointed
> out.  (Similarly, if in the first example I gave the cast added
> constness to the argument rather than removing it and the function
> modified the pointed-to object calling it through the incompatible
> pointer on a constant object would also be unsafe.)
> 
> Another class of cases to consider are casts between functions
> taking pointers to different but related structs.  Code like this
> could be written to mimic C++ calling a base class function on
> a derived object.
> 
>    struct Base { ... };
>    struct Derived { Base b; ... };
> 
>    typedef void F (Derived*);
> 
>    void foo (Base*);
> 
>    F* pf = (F*)foo;
> 

Hmm, yes.

I'm starting to believe that this warning should treat all pointers
as equivalent, but require everything else to be of the same type.
That would at least cover the majority of all "safe" use cases.

And I need a way to by-pass the warning with a generic function
pointer type.  uintptr_t is not the right choice, as you pointed
out already.

But I also see problems with "int (*) ()" as an escape mechanism
because this declaration creates a warning in C with
-Wstrict-prototypes, and in C++ this syntax means "int (*) (void)"
while the C++ type "int (*) (...)" is rejected by the C front end.

I'm starting to believe that the type "void (*) (void)" is better suited for
this purpose, and it is already used in many programs as a type-less
wildcard function type.  I see examples in libgo and libffi at least.



Bernd.


Re: [PATCH], Add PowerPC ISA 3.0 Atomic Memory Operation functions

2017-10-06 Thread Segher Boessenkool
Hi!

On Fri, Oct 06, 2017 at 02:03:43PM -0400, Michael Meissner wrote:
> > Do you really need the mr insns?  Can't you express that in the
> > arguments?  Perhaps using a union of __int128 with something that
> > is two long ints?
> 
> My first attempt resulted in the compiler doing move directs to form the
> __int128 in the vector unit and then move directs back.  So, I figured having
> two mr's was a simple price to pay.
> 
> I do have thoughts to replace the two with a built-in (but keep amo.h and the
> names), and we can probably eliminate some of the moves.

It's so ugly, even if it doesn't cost much ;-)

But don't worry about it, certainly not if the plan is to expand it
as a builtin later.


Segher


Re: [PATCH], Add PowerPC ISA 3.0 Atomic Memory Operation functions

2017-10-06 Thread Michael Meissner
On Fri, Oct 06, 2017 at 01:25:33PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Oct 06, 2017 at 02:03:43PM -0400, Michael Meissner wrote:
> > > Do you really need the mr insns?  Can't you express that in the
> > > arguments?  Perhaps using a union of __int128 with something that
> > > is two long ints?
> > 
> > My first attempt resulted in the compiler doing move directs to form the
> > __int128 in the vector unit and then move directs back.  So, I figured 
> > having
> > two mr's was a simple price to pay.
> > 
> > I do have thoughts to replace the two with a built-in (but keep amo.h and 
> > the
> > names), and we can probably eliminate some of the moves.
> 
> It's so ugly, even if it doesn't cost much ;-)
> 
> But don't worry about it, certainly not if the plan is to expand it
> as a builtin later.

Here is the revised amo.h.  I tested the two test files amo1.c and amo2.c, and
they both compile.  Interestingly, using %P0 results in fewer addi's than the
older version (there were redundant addi's in passing the address).  Can I check
it in?

[gcc]
2017-10-06  Michael Meissner  

* config/rs6000/amo.h: New include file to provide ISA 3.0 atomic
memory operation instruction support.
* config.gcc (powerpc*-*-*): Include amo.h as an extra header.
(rs6000-ibm-aix[789]*): Likewise.
* doc/extend.texi (PowerPC Atomic Memory Operation Functions):
Document new functions.

[gcc/testsuite]
2017-10-06  Michael Meissner  

* gcc.target/powerpc/amo1.c: New test.
* gcc.target/powerpc/amo2.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/amo.h
===
--- gcc/config/rs6000/amo.h (revision 0)
+++ gcc/config/rs6000/amo.h (revision 0)
@@ -0,0 +1,152 @@
+ /* Power ISA 3.0 atomic memory operation include file.
+Copyright (C) 2017 Free Software Foundation, Inc.
+Contributed by Michael Meissner .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published
+by the Free Software Foundation; either version 3, or (at your
+option) any later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+ #ifndef _AMO_H
+ #define _AMO_H
+
+ #if !defined(_ARCH_PWR9) || !defined(_ARCH_PPC64)
+ #error "The atomic memory operations require Power 64-bit ISA 3.0"
+
+ #else
+ #include 
+
+ /* Enumeration of the LWAT/LDAT sub-opcodes.  */
+ enum _AMO_LD {
+   _AMO_LD_ADD = 0x00, /* Fetch and Add.  */
+   _AMO_LD_XOR = 0x01, /* Fetch and Xor.  */
+   _AMO_LD_IOR = 0x02, /* Fetch and Ior.  */
+   _AMO_LD_AND = 0x03, /* Fetch and And.  */
+   _AMO_LD_UMAX= 0x04, /* Fetch and Unsigned Maximum.  */
+   _AMO_LD_SMAX= 0x05, /* Fetch and Signed Maximum.  */
+   _AMO_LD_UMIN= 0x06, /* Fetch and Unsigned Minimum.  */
+   _AMO_LD_SMIN= 0x07, /* Fetch and Signed Minimum.  */
+   _AMO_LD_SWAP= 0x08, /* Swap.  */
+   _AMO_LD_CS_NE   = 0x10, /* Compare and Swap Not Equal.  */
+   _AMO_LD_INC_BOUNDED = 0x18, /* Fetch and Increment Bounded.  */
+   _AMO_LD_INC_EQUAL   = 0x19, /* Fetch and Increment Equal.  */
+   _AMO_LD_DEC_BOUNDED = 0x1A  /* Fetch and Decrement Bounded.  */
+ };
+
+ /* Implementation of the simple LWAT/LDAT operations that take one register and
+modify one word or double-word of memory and return the value that was
+previously in the memory location.
+
+The LWAT/LDAT opcode requires the address to be a single register, and that
+points to a suitably aligned memory location.  Asm volatile is used to
+prevent the optimizer from moving the operation.  */
+
+ #define _AMO_LD_SIMPLE(NAME, TYPE, OPCODE, FC) \
+ static __inline__ TYPE \
+ NAME (TYPE *_PTR, TYPE _VALUE) \
+ {

Re: tree-complex.c: fix some_nonzerop test over reals (and a bug fix)

2017-10-06 Thread Laurent Thevenoux
Hi Richard, thanks for your quick reply.

- Mail original -
> De: "Richard Biener" 
> À: "Laurent Thevenoux" 
> Cc: "GCC Patches" 
> Envoyé: Vendredi 6 Octobre 2017 13:42:57
> Objet: Re: tree-complex.c: fix some_nonzerop test over reals (and a bug fix)
> 
> On Thu, Oct 5, 2017 at 4:41 PM, Laurent Thevenoux
>  wrote:
> > Hello,
> >
> > This patch improves the some_nonzerop(tree t) function from tree-complex.c
> > file (the function is only used there).
> >
> > This function returns true if a tree as a parameter is not the constant 0
> > (or +0.0 only for reals when !flag_signed_zeros ). The former result is
> > then used to determine if some simplifications are possible for complex
> > expansions (addition, multiplication, and division).
> >
> > Unfortunately, if the tree is a real constant, the function always return
> > true, even for +0.0 because of the explicit test on flag_signed_zeros (so
> > if your system enables signed zeros you cannot benefit from those
> > simplifications). To avoid this behavior and allow complex expansion
> > simplifications, I propose the following patch, which test for the sign of
> > the real constant 0.0 instead of checking the flag.
> 
> But
> 
> +  if (TREE_CODE (t) == REAL_CST)
> +{
> +  if (real_identical (&TREE_REAL_CST (t), &dconst0))
> +   zerop = !real_isneg (&TREE_REAL_CST (t));
> +}
> 
> shouldn't you do
> 
>zerop = !flag_signed_zeros || !real_isneg (&TREE_REAL_CST (t));
> 
> ?

I'm not so sure. If I understand your proposition (the tables below give the
values of zerop depending on the value of t and flag_signed_zeros):

   |   zerop
 t | !flag_signed_zeros is false | !flag_signed_zeros is true 
-
+n | true*   | true* 
-n | false   | true* 
 0 | true| true 
-0 | false   | unreachable 

But here, the results marked with a * are not the right values. The existing
code is also wrong: it always returns false if flag_signed_zeros is true;
otherwise the code is correct:

   |   zerop
 t | !flag_signed_zeros is false | !flag_signed_zeros is true 
-
+n | false   | false 
-n | false   | false 
 0 | true| false*
-0 | false   | unreachable 

With the code I propose, we obtain the right results: 

 t | zerop 
--
+n | false 
-n | false 
 0 | true 
-0 | false

Am I really missing something (I'm sorry if I'm wrong)?

> 
> Also doesn't this miscalculate A - B for {ar, 0.} - {br, 0.} given the
> simplification simply
> returns bi?

Here we are in case PAIR (ONLY_REAL, ONLY_REAL), which gives ri = 0. (ai) even 
with signed zeros. So everything is OK.

> 
> So I'm not convinced this handling is correct.
> 
> > This first fix reveals a bug (thanks to
> > c-c++-common/torture/complex-sign-add.c ) in the simplification section of
> > expand_complex_addition (also fixed in the patch).
> 
> Your patch looks backward and the existing code looks ok.
> 
> @@ -937,7 +940,7 @@ expand_complex_addition (gimple_stmt_iterator
> *gsi, tree inner_type,
> 
>  case PAIR (VARYING, ONLY_REAL):
>rr = gimplify_build2 (gsi, code, inner_type, ar, br);
> -  ri = ai;
> +  ri = bi;
>break;

With the existing code we don't perform the simplification step at all, since
it always returns (VARYING, VARYING) when flag_signed_zeros is true.

For instance, with {ar, -0.} + {br, 0.}, if simplification really occurs, it's
in case PAIR (VARYING, ONLY_REAL) (because we consider -0. as non-zero), and
it returns -0. if we return ai (instead of -0. + 0. -> 0.).  But I understand
now that returning bi in this case is meaningless, since {br, bi} is an
ONLY_REAL complex.

Nevertheless, for instance, {ar, -0.} + {br, 0.} -> {ar + br, -0.} is still 
buggy.

A solution could be:

case PAIR (VARYING, ONLY_REAL):
  rr = gimplify_build2 (gsi, code, inner_type, ar, br);
+ if (TREE_CODE (ai) == REAL_CST
+     && code == PLUS_EXPR
+     && real_identical (&TREE_REAL_CST (ai), &dconst0)
+     && real_isneg (&TREE_REAL_CST (ai)))
+   ri = bi;
+ else
    ri = ai;
  break;

Laurent 

> 
> down we have
> 
> case PAIR (ONLY_REAL, VARYING):
>   if (code == MINUS_EXPR)
> goto general;
>   rr = gimplify_build2 (gsi, code, inner_type, ar, br);
>   ri = bi;
>   break;
> 
> which for sure would need to change as well then.  But for your
> changed code we know
> bi is zero (but ai may not).
> 
> Richard.
> 
> > The patch has passed bootstrap and testing on x86_64-pc-linux-gnu .
> >
> > Best regards,
> > Laurent Thévenoux
>


[PATCH] Fix PR80295 [aarch64] [7/8 Regression] ICE in __builtin_update_setjmp_buf expander

2017-10-06 Thread Qing Zhao
Thanks a lot for Wilco’s help on this bug. 

Yes, AArch64 does NOT do anything wrong.

The implementation of __builtin_update_setjmp_buf is not correct.  It takes
a pointer as an operand and treats the mode of the pointer as Pmode, which
is not correct; a conversion from ptr_mode to Pmode is needed for this
pointer.

Bootstrapped on aarch64-unknown-linux-gnu.
Testing on aarch64-unknown-linux-gnu is running.

the new patch is:

=
gcc/ChangeLog

* builtins.c (expand_builtin_update_setjmp_buf): Add a
conversion to Pmode from the buf_addr.

gcc/testsuite/ChangeLog

PR middle-end/80295
* gcc.target/aarch64/pr80295.c: New test.

---
 gcc/builtins.c | 1 +
 gcc/testsuite/gcc.target/aarch64/pr80295.c | 8 ++++++++
 2 files changed, 9 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr80295.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index c8a5ea6..01fb08b 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1199,6 +1199,7 @@ void
 expand_builtin_update_setjmp_buf (rtx buf_addr)
 {
   machine_mode sa_mode = STACK_SAVEAREA_MODE (SAVE_NONLOCAL);
+  buf_addr = convert_memory_address (Pmode, buf_addr);
   rtx stack_save
 = gen_rtx_MEM (sa_mode,
   memory_address
diff --git a/gcc/testsuite/gcc.target/aarch64/pr80295.c b/gcc/testsuite/gcc.target/aarch64/pr80295.c
new file mode 100644
index 000..b3866d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr80295.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=ilp32" } */
+
+void f (void *b) 
+{ 
+  __builtin_update_setjmp_buf (b); 
+}
+
-- 
1.9.1


> 
>> On Oct 5, 2017, at 11:50 AM, Richard Earnshaw (lists) 
>>  wrote:
>> 
>> On 25/09/17 17:35, Qing Zhao wrote:
>>> 
>>> --- a/gcc/config/aarch64/aarch64.h
>>> +++ b/gcc/config/aarch64/aarch64.h
>>> @@ -782,7 +782,7 @@ typedef struct
>>> /* Specify the machine mode that the hardware addresses have.
>>>   After generation of rtl, the compiler makes no further distinction
>>>   between pointers and any other objects of this machine mode.  */
>>> -#define Pmode  DImode
>>> +#define Pmode  (TARGET_ILP32 ? SImode : DImode)
>> 
>> This is wrong.  AArch64 has both ptr_mode and Pmode.  Pmode must always
>> be DImode as (internally) all addresses must be 64-bit.  ptr_mode
>> reflects the ABI choice of 32/64-bit language level addresses.  The
>> register SP must always be a 64-bit value, even when all the top 32 bits
>> are zero.
> 



[PATCH] Fortran -- Handle BOZ in accordance with F2008/2015.

2017-10-06 Thread Steve Kargl
All,

I have spent the last few days trying to reconcile the various Fortran
standards' requirements for handling BOZ.  The short story is that J3
over the last 27 years has made incompatible changes to the interpretation
of a BOZ (under some circumstances).  The interpretations in F2008 and
F2015 now treat a boz-literal-constant as a sequence of bits.  Unfortunately,
due to quirks in how BOZ are currently implemented in gfortran and a boatload
of documented and undocumented extensions, bringing gfortran into
agreement with F2008/F2015 led to a rewrite of BOZ handling.  In the 
rewrite I have made no attempt to use the -std= option to try to maintain
the incompatibilities between standards.

On x86_64-*-freebsd, the attached patch gives

tail gcc/testsuite/gfortran/gfortran.sum

=== gfortran Summary ===

# of expected passes		45888
# of unexpected failures	4
# of unexpected successes	6
# of expected failures		97
# of unsupported tests		79
/home/sgk/gcc/obj/gcc/gfortran  version 8.0.0 20170927 (experimental) (GCC)

The 4 unexpected failures are not related to this patch.

2017-10-06  Steven G. Kargl  

PR fortran/45513
PR fortran/54072
PR fortran/81509
* array.c (resolve_array_list): Handle an array descriptor
with BOZ elements.
* check.c (numeric_check): Error for BOZ when numeric type is expected.
(numeric_or_boz_check): New function.  Check for numeric or BOZ arg.
(int_or_boz_check): New function.  Check for INTEGER or BOZ arg.
(gfc_check_bge_bgt_ble_blt): Use int_or_boz_check.
(gfc_check_cmplx): Use numeric_or_boz_check.
(gfc_check_complex): Permit BOZ arguments.
(gfc_check_dcmplx): Use numeric_or_boz_check.
(gfc_check_dble): Allow BOZ argument.
(boz_args_check): New function.  Disallow two BOZ arguments.
(gfc_boz2int): New function.  In-place conversion of BOZ to INTEGER.
(gfc_check_dshift): Use int_or_boz_check, boz_args_check, gfc_boz2int.
(gfc_check_iand): Renamed to ...
(gfc_check_iand_ieor_ior): ... this.  Use int_or_boz_check,
boz_args_check, and gfc_boz2int.  Convert gfc_notify_std to gfc_error.
(gfc_check_ieor, gfc_check_ior): Removed function.
(gfc_check_int): Use numeric_or_boz_check.
(gfc_check_merge_bits): Use int_or_boz_check, boz_args_check,
gfc_boz2int.
(gfc_check_real): Allow BOZ.  Use numeric_or_boz_check.
(gfc_check_and): Allow BOZ.  Use boz_args_check and gfc_boz2int.
* data.c (gfc_assign_data_value): Implement F2008/F2015 semantics
for BOZ in data statement.
* expr.c (gfc_get_expr): Set boz component of gfc_expr to NULL.
(gfc_copy_expr, free_expr0): Cope with new boz component.
(gfc_check_assign): Replace old is_boz checks with BT_BOZ checks.
Use gfc_boz2int.
* gfortran.h (gfc_expr): Remove is_boz component.  Add boz
component.  Add prototype for gfc_boz2int.
* intrinsic.c (add_functions): Use gfc_check_iand_ieor_ior
in place of gfc_check_iand, gfc_check_ieor, and gfc_check_ior.
* intrinsic.h: Add prototype for gfc_check_iand_ieor_ior.
Remove prototypes for gfc_check_iand, gfc_check_ieor, gfc_check_ior.
* intrinsic.texi: Document (some) changes.
* iresolve.c (gfc_resolve_iand, gfc_resolve_ieor, gfc_resolve_ior):
Mark j with ATTRIBUTE_UNUSED.  Make IAND, IEOR, IOR
conform to F2008/2015.
* libgfortran.h: Add new basic type BT_BOZ.
* primary.c (match_boz_constant): Remove old handling of BOZ.
Cache BOZ string in gfc_expr's boz component.
* resolve.c (resolve_operator): Allow BOZ in binary
numeric and relational operators.  Use gfc_boz2int or gfc_convert_boz
as needed.
(resolve_allocate_expr): Split declaration and initialization.
(resolve_ordinary_assign): Replace is_boz checks with BT_BOZ checks.
* simplify.c (convert_boz): Replace BT_INTEGER with BT_BOZ.
(simplify_cmplx): Rearrange to allow simplification of individual
args.  Convert BOZ as needed.
(gfc_simplify_complex): Account for args with BT_BOZ.
(gfc_simplify_float): Replace is_boz check with BT_BOZ check.
(simplify_intconv): In-place conversion of boz to INTEGER.
* target-memory.c (boz2int): New function.  Conversion of
boz to INTEGER with widest decimal range.
(gfc_convert_boz): Use it.  Remove obsolete is_boz handling.

2017-10-06  Steven G. Kargl  

PR fortran/45513
PR fortran/54072
PR fortran/81509
* gfortran.dg/achar_5.f90: Remove BOZ arg tests.
* gfortran.dg/boz_4.f90: Delete test as it no longer applies.
* gfortran.dg/graphite/id-26.f03:  Fix test.
* gfortran.dg/pr81509_1.f90: New test.
* gfortran.dg/pr81509_2.f90: New test.
* gfortran.dg/unf_io_convert_2.f90: Fix test.

Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-06 Thread Martin Sebor

On 10/06/2017 12:06 PM, Bernd Edlinger wrote:

On 10/06/17 17:43, Martin Sebor wrote:

On 10/06/2017 07:25 AM, Bernd Edlinger wrote:

On 10/05/17 18:16, Martin Sebor wrote:

In my (very quick) tests the warning appears to trigger on all
strictly incompatible conversions, even if they are otherwise
benign, such as:

   int f (const int*);
   typedef int F (int*);

   F* pf1 = f;// -Wincompatible-pointer-types
   F* pf2 = (F*)f;// -Wcast-function-type

Likewise by:

   int f (signed char);
   typedef int F (unsigned char);

   F* pf = (F*)f;// -Wcast-function-type

I'd expect these conversions to be useful and would view warning
for them with -Wextra as excessive.  In fact, I'm not sure I see
the benefit of warning on these casts under any circumstances.



Well, while the first example should be safe,
the second one is probably not safe:

Because the signed and unsigned char are promoted to int,
by the ABI but one is in the range -128..127 while the
other is in the range 0..255, right?


Right.  The cast is always safe but whether or not a call to such
a function through the incompatible pointer is also safe depends
on the definition of the function (and on the caller).  If the
function uses just the low bits of the argument then it's most
likely fine for any argument.  If the caller only calls it with
values in the 7-bit range (e.g., the ASCII subset) then it's also
fine.  Otherwise there's the potential for the problem you pointed
out.  (Similarly, if in the first example I gave the cast added
constness to the argument rather than removing it and the function
modified the pointed-to object calling it through the incompatible
pointer on a constant object would also be unsafe.)

Another class of cases to consider are casts between functions
taking pointers to different but related structs.  Code like this
could be written to mimic C++ calling a base class function on
a derived object.

   struct Base { ... };
   struct Derived { Base b; ... };

   typedef void F (Derived*);

   void foo (Base*);

   F* pf = (F*)foo;



Hmm, yes.

I'm starting to believe that this warning should treat all pointers
as equivalent, but require everything else to be of the same type.
That would at least cover the majority of all "safe" use cases.


Perhaps basing the warning on some form of structural equivalence
between function arguments might be useful.  For instance, in
ILP32, casting between 'int foo (int)' and 'long foo (long)' and
calling the function is probably generally considered safe (even
though it's undefined by the language) and works as people expect
so avoiding the warning there would help keep the false positive
rate down. (Something like this might also work for the kernel
alias macros.)

Similarly, casting between a function that returns a scalar smaller
than int and int and then calling it is probably safe (or maybe
even long, depending on the ABI).

Casting a function returning a value to one returning void and
calling it through the result should always be safe.

I would also suggest to consider disregarding qualifiers as those
are often among the reasons for intentional casts (e.g., when
mixing a legacy API and a more modern const-correct one).

Casts are also not uncommon between variadic and ordinary function
types so some heuristic might be appropriate there.



And I need a way to bypass the warning with a generic function
pointer type.  uintptr_t is not the right choice, as you pointed
out already.

But I also see problems with "int (*) ()" as an escape mechanism,
because this declaration creates a warning in C with
-Wstrict-prototypes, and in C++ this syntax means "int (*) (void)",
while the C++ type "int (*) (...)" is rejected by the C front end.


I wouldn't consider it a problem if the suppression mechanism were
different between languages.  Most such casts are going to be in
source files (as opposed to C headers) so it should be fine to use
each language's unique form of function without a prototype.

Martin


Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-06 Thread Jeff Law
On 10/05/2017 03:47 PM, Joseph Myers wrote:
> On Thu, 5 Oct 2017, Bernd Edlinger wrote:
> 
>> Maybe it would be good to not warn in type-casts, when they can be
>> assumed to be safe, for instance
>> void* <-> any pointer (parameter or result),
>> uintptr_t <-> any int, any pointer (parameter or result),
>> void (*) (void) and void (*) (...) <-> any function pointer.
> 
> Well, void * and uintptr_t aren't necessarily interchangable at the ABI 
> level.  At least, the m68k ABI returns integers in %d0 and pointers in 
> %a0; I don't know if any other ABIs have that peculiarity.
> 
The mn103 (live) and mn102 (dead) probably do.  But my memory is getting
fuzzy on those.

jeff


Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-06 Thread Jeff Law
On 10/06/2017 09:43 AM, Martin Sebor wrote:
> On 10/06/2017 07:25 AM, Bernd Edlinger wrote:
>> On 10/05/17 18:16, Martin Sebor wrote:
>>> In my (very quick) tests the warning appears to trigger on all
>>> strictly incompatible conversions, even if they are otherwise
>>> benign, such as:
>>>
>>>    int f (const int*);
>>>    typedef int F (int*);
>>>
>>>    F* pf1 = f;    // -Wincompatible-pointer-types
>>>    F* pf2 = (F*)f;    // -Wcast-function-type
>>>
>>> Likewise by:
>>>
>>>    int f (signed char);
>>>    typedef int F (unsigned char);
>>>
>>>    F* pf = (F*)f;    // -Wcast-function-type
>>>
>>> I'd expect these conversions to be useful and would view warning
>>> for them with -Wextra as excessive.  In fact, I'm not sure I see
>>> the benefit of warning on these casts under any circumstances.
>>>
>>
>> Well, while the first example should be safe,
>> the second one is probably not safe:
>>
>> Because the signed and unsigned char are promoted to int,
>> by the ABI but one is in the range -128..127 while the
>> other is in the range 0..255, right?
> 
> Right.  The cast is always safe but whether or not a call to such
> a function through the incompatible pointer is also safe depends
> on the definition of the function (and on the caller).  If the
> function uses just the low bits of the argument then it's most
> likely fine for any argument.  If the caller only calls it with
> values in the 7-bit range (e.g., the ASCII subset) then it's also
> fine.  Otherwise there's the potential for the problem you pointed
> out.  (Similarly, if in the first example I gave the cast added
> constness to the argument rather than removing it and the function
> modified the pointed-to object calling it through the incompatible
> pointer on a constant object would also be unsafe.)
> 
> Another class of cases to consider are casts between functions
> taking pointers to different but related structs.  Code like this
> could be written to mimic C++ calling a base class function on
> a derived object.
> 
>   struct Base { ... };
>   struct Derived { Base b; ... };
> 
>   typedef void F (Derived*);
> 
>   void foo (Base*);
> 
>   F* pf = (F*)foo;
Yea.  And one might even find such code in BFD.  It certainly mimics
C++ base and derived classes using C, so it has significant potential to
have this kind of code.
jeff


[PATCH][aarch64] Put vector fnma instruction into canonical form for better code generation.

2017-10-06 Thread Steve Ellcey
This patch is a follow up to a discussion at:

https://gcc.gnu.org/ml/gcc/2017-06/msg00126.html

For some reason the simd version of fnma in aarch64-simd.md
is not in the canonical form of having the neg operator on
the first operand, and instead has it on the second.  This
results in sub-optimal code generation (an extra dup instruction).

I have moved the 'neg', rebuilt GCC, and retested with this patch.
There were no regressions.  OK to check in?


2017-10-06  Steve Ellcey  

* config/aarch64/aarch64-simd.md (fnma4): Move neg operator
to canonical location.


diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 12da8be..d9ced50 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1777,9 +1777,8 @@
 (define_insn "fnma4"
   [(set (match_operand:VHSDF 0 "register_operand" "=w")
    (fma:VHSDF
-     (match_operand:VHSDF 1 "register_operand" "w")
-  (neg:VHSDF
-   (match_operand:VHSDF 2 "register_operand" "w"))
+     (neg:VHSDF (match_operand:VHSDF 1 "register_operand" "w"))
+     (match_operand:VHSDF 2 "register_operand" "w")
      (match_operand:VHSDF 3 "register_operand" "0")))]
   "TARGET_SIMD"
   "fmls\\t%0., %1., %2."



2017-10-06  Steve Ellcey  

* gcc.target/aarch64/fmls.c: New test.


diff --git a/gcc/testsuite/gcc.target/aarch64/fmls.c b/gcc/testsuite/gcc.target/aarch64/fmls.c
index e69de29..1ea0e6a 100644
--- a/gcc/testsuite/gcc.target/aarch64/fmls.c
+++ b/gcc/testsuite/gcc.target/aarch64/fmls.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#define vector __attribute__((vector_size(16)))
+vector double a = {1.0,1.0};
+vector double b = {2.0,2.0};
+double x = 3.0;
+
+
+void __attribute__ ((noinline))
+vf (double x, vector double *v1, vector double *v2, vector double *result)
+{
+  vector double s = v1[0];
+  vector double t = -v2[0];
+  vector double m = {x,x};
+  vector double r = t * m + s;
+  result[0] = r;
+}
+/* { dg-final { scan-assembler-not "dup" } } */



Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-06 Thread Bernd Edlinger
On 10/06/17 22:50, Jeff Law wrote:
> On 10/06/2017 09:43 AM, Martin Sebor wrote:
>> On 10/06/2017 07:25 AM, Bernd Edlinger wrote:
>>> On 10/05/17 18:16, Martin Sebor wrote:
 In my (very quick) tests the warning appears to trigger on all
 strictly incompatible conversions, even if they are otherwise
 benign, such as:

     int f (const int*);
     typedef int F (int*);

     F* pf1 = f;    // -Wincompatible-pointer-types
     F* pf2 = (F*)f;    // -Wcast-function-type

 Likewise by:

     int f (signed char);
     typedef int F (unsigned char);

     F* pf = (F*)f;    // -Wcast-function-type

 I'd expect these conversions to be useful and would view warning
 for them with -Wextra as excessive.  In fact, I'm not sure I see
 the benefit of warning on these casts under any circumstances.

>>>
>>> Well, while the first example should be safe,
>>> the second one is probably not safe:
>>>
>>> Because the signed and unsigned char are promoted to int,
>>> by the ABI but one is in the range -128..127 while the
>>> other is in the range 0..255, right?
>>
>> Right.  The cast is always safe but whether or not a call to such
>> a function through the incompatible pointer is also safe depends
>> on the definition of the function (and on the caller).  If the
>> function uses just the low bits of the argument then it's most
>> likely fine for any argument.  If the caller only calls it with
>> values in the 7-bit range (e.g., the ASCII subset) then it's also
>> fine.  Otherwise there's the potential for the problem you pointed
>> out.  (Similarly, if in the first example I gave the cast added
>> constness to the argument rather than removing it and the function
>> modified the pointed-to object calling it through the incompatible
>> pointer on a constant object would also be unsafe.)
>>
>> Another class of cases to consider are casts between functions
>> taking pointers to different but related structs.  Code like this
>> could be written to mimic C++ calling a base class function on
>> a derived object.
>>
>>    struct Base { ... };
>>    struct Derived { Base b; ... };
>>
>>    typedef void F (Derived*);
>>
>>    void foo (Base*);
>>
>>    F* pf = (F*)foo;
> Yea.  And one might even find such code in BFD.  It certainly mimics
> C++ base and derived classes using C, so it has significant potential to
> have this kind of code.
> jeff
> 

Yes, absolutely.

This use case makes up 99% of all places where I saw the warning until
now.  I will try to ignore all pointer types and see how that works out
on some code bases I have access to.

When that works as expected we should be able to see what heuristics
need to be added next.

FYI I have attached what I am currently bootstrapping.


Bernd.
gcc:
2017-10-06  Bernd Edlinger  

* doc/invoke.texi: Document -Wcast-function-type.
* recog.h (stored_funcptr): Change signature.
* tree-dump.c (dump_node): Avoid warning.
* typed-splay-tree.h (typed_splay_tree): Avoid warning.

libcpp:
2017-10-06  Bernd Edlinger  

* internal.h (maybe_print_line): Change signature.

c-family:
2017-10-06  Bernd Edlinger  

* c.opt (Wcast-function-type): New warning option.
* c-lex.c (get_fileinfo): Avoid warning.
* c-ppoutput.c (scan_translation_unit_directives_only): Remove cast.

c:
2017-10-06  Bernd Edlinger  

* c-typeck.c (c_safe_function_type_cast_p): New helper function.
(build_c_cast): Implement -Wcast-function-type.

cp:
2017-10-06  Bernd Edlinger  

* decl2.c (start_static_storage_duration_function): Avoid warning.
* typeck.c (cxx_safe_function_type_cast_p): New helper function.
(build_reinterpret_cast_1): Implement -Wcast-function-type.

testsuite:
2017-10-06  Bernd Edlinger  

* c-c++-common/Wcast-function-type.c: New test.
Index: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c	(revision 253493)
+++ gcc/c/c-typeck.c	(working copy)
@@ -5474,6 +5474,36 @@ handle_warn_cast_qual (location_t loc, tree type,
   while (TREE_CODE (in_type) == POINTER_TYPE);
 }
 
+/* Check if a type cast between two function types can be considered safe.  */
+
+static bool
+c_safe_function_type_cast_p (tree t1, tree t2)
+{
+  if (TREE_TYPE (t1) == void_type_node
+      && TYPE_ARG_TYPES (t1) == void_list_node)
+    return true;
+
+  if (TREE_TYPE (t2) == void_type_node
+      && TYPE_ARG_TYPES (t2) == void_list_node)
+    return true;
+
+  if (!comptypes (TREE_TYPE (t1), TREE_TYPE (t2)))
+    return false;
+
+  for (t1 = TYPE_ARG_TYPES (t1), t2 = TYPE_ARG_TYPES (t2);
+       t1 && t2;
+       t1 = TREE_CHAIN (t1), t2 = TREE_CHAIN (t2))
+    {
+      if (TREE_CODE (TREE_VALUE (t1)) == POINTER_TYPE
+	  && TREE_CODE (TREE_VALUE (t2)) == POINTER_TYPE)
+	continue;
+      if (!comptypes (TREE_VALUE (t1), TREE_VALUE (t2)))
+	return false;
+    }
+
+  return true;
+}

Re: [PATCH PR79868] [aarch64] Fix error calls in aarch64 code so they can be translated (version 2)

2017-10-06 Thread Steve Ellcey
Ping.

Steve Ellcey
sell...@cavium.com

On Mon, 2017-09-25 at 16:25 -0700, Steve Ellcey wrote:
> This is a new version of my patch to fix PR target/79868, where some
> error messages are impossible to translate correctly due to how the
> strings are dynamically constructed.  It also includes some format
> changes in the error messages to make the messages more consistent with
> each other and with other GCC errors.  This was worked out with help
> from Martin Sebor.  I also had to fix some tests to match the new error
> string formats.
> 
> Tested on AArch64 with no regressions.  OK to check in?
> 
> Steve Ellcey
> sell...@cavium.com
> 
> 
> 2017-09-25  Steve Ellcey  
> 
>   PR target/79868
>   * config/aarch64/aarch64-c.c (aarch64_pragma_target_parse):
>   Change argument type on aarch64_process_target_attr call.
>   * config/aarch64/aarch64-protos.h (aarch64_process_target_attr):
>   Change argument type.
>   * config/aarch64/aarch64.c (aarch64_attribute_info): Change
>   field type.
>   (aarch64_handle_attr_arch): Change argument type, use boolean
>   argument to use different strings in error calls.
>   (aarch64_handle_attr_cpu): Ditto.
>   (aarch64_handle_attr_tune): Ditto.
>   (aarch64_handle_attr_isa_flags): Ditto.
>   (aarch64_process_one_target_attr): Ditto.
>   (aarch64_process_target_attr): Ditto.
>   (aarch64_option_valid_attribute_p): Change argument type on
>   aarch64_process_target_attr call.
> 
> 
> 2017-09-25  Steve Ellcey  
> 
>   PR target/79868
>   * gcc.target/aarch64/spellcheck_1.c: Update dg-error string to
>   match new format.
>   * gcc.target/aarch64/spellcheck_2.c: Ditto.
>   * gcc.target/aarch64/spellcheck_3.c: Ditto.
>   * gcc.target/aarch64/target_attr_11.c: Ditto.
>   * gcc.target/aarch64/target_attr_12.c: Ditto.
>   * gcc.target/aarch64/target_attr_17.c: Ditto.


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-06 Thread Prathamesh Kulkarni
On 6 October 2017 at 06:04, Jan Hubicka  wrote:
>> Hi Honza,
>> Thanks for the detailed suggestions, I have updated the patch accordingly.
>> I have following questions on call_summary:
>> 1] I added field bool is_return_callee in ipa_call_summary to track
>> whether the caller possibly returns value returned by callee, which
>> gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
>> and ipa_call_summary_t::duplicate() will already take care of handling
>> late insertion/removal of cgraph nodes ? I just initialized
>> is_return_callee to false in ipa_call_summary::reset and that seems to
>> work. I am not sure though if I have handled it correctly. Could you
>> please check that ?
>
> I was actually thinking of introducing a separate summary for the
> ipa-pure-const pass, but this seems fine to me too (for one bit it is
> definitely more efficient).  ipa_call_summary_t::duplicate copies all the
> fields, so indeed you should be safe here.
>
> Also it is possible for functions to be inserted late.  Updating of call 
> summaries
> is currently handled by ipa_fn_summary_t::insert
>>
>> 2] ipa_inline() called ipa_free_fn_summary, which made
>> ipa_call_summaries unavailable during ipa-pure-const pass. I removed
>> call to ipa_free_fn_summary from ipa_inline, and moved it to
>> ipa_pure_const::execute(). Is that OK ?
>
> Seems OK to me.
>>
>> Patch passes bootstrap+test and lto bootstrap+test on 
>> x86_64-unknown-linux-gnu.
>> Verified SPEC2k6 compiles and runs without miscompares with LTO
>> enabled on aarch64-linux-gnu.
>> Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
>> the patch by building chromium or firefox.
>> Would it be OK to commit if it passes above validations ?
>>
>> Thanks,
>> Prathamesh
>> >
>> > Thanks,
>> > Honza
>
>> 2017-10-05  Prathamesh Kulkarni  
>>
>>   * cgraph.h (set_malloc_flag): Declare.
>>   * cgraph.c (set_malloc_flag_1): New function.
>>   (set_malloc_flag): Likewise.
>>   * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
>>   * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
>>   false.
>>   (read_ipa_call_summary): Add support for reading is_return_callee.
>>   (write_ipa_call_summary): Stream is_return_callee.
>>   * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
>>   * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
>>   ipa-prop.h, ipa-fnsummary.h.
>>   (malloc_state_e): Define.
>>   (malloc_state_names): Define.
>>   (funct_state_d): Add field malloc_state.
>>   (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
>>   (check_retval_uses): New function.
>>   (malloc_candidate_p): Likewise.
>>   (analyze_function): Add support for malloc attribute.
>>   (pure_const_write_summary): Stream malloc_state.
>>   (pure_const_read_summary): Add support for reading malloc_state.
>>   (dump_malloc_lattice): New function.
>>   (propagate_malloc): New function.
>>   (ipa_pure_const::execute): Call propagate_malloc and
>>   ipa_free_fn_summary.
>>   (pass_local_pure_const::execute): Add support for malloc attribute.
>>   * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
>>
>> testsuite/
>>   * gcc.dg/ipa/propmalloc-1.c: New test-case.
>>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
>>
>> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
>> index 3d0cefbd46b..0aad90d59ea 100644
>> --- a/gcc/cgraph.c
>> +++ b/gcc/cgraph.c
>> @@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
>>return changed;
>>  }
>>
>> +/* Worker to set malloc flag.  */
> New line here I guess (it is below)
>> +static void
>> +set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
>> +{
>> +  if (malloc_p && !DECL_IS_MALLOC (node->decl))
>> +{
>> +  DECL_IS_MALLOC (node->decl) = true;
>> +  *changed = true;
>> +}
>> +
>> +  ipa_ref *ref;
>> +  FOR_EACH_ALIAS (node, ref)
>> +{
>> +  cgraph_node *alias = dyn_cast (ref->referring);
>> +  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
>> + set_malloc_flag_1 (alias, malloc_p, changed);
>> +}
>> +
>> +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
>> +if (e->caller->thunk.thunk_p
>> + && (!malloc_p || e->caller->get_availability () > AVAIL_INTERPOSABLE))
>> +  set_malloc_flag_1 (e->caller, malloc_p, changed);
>> +}
>> +
>> +/* Set DECL_IS_MALLOC on NODE's decl and on NODE's aliases if any.  */
>> +
>> +bool
>> +cgraph_node::set_malloc_flag (bool malloc_p)
>> +{
>> +  bool changed = false;
>> +
>> +  if (!malloc_p || get_availability () > AVAIL_INTERPOSABLE)
>> +set_malloc_flag_1 (this, malloc_p, &changed);
>> +  else
>> +{
>> +  ipa_ref *ref;
>> +
>> +  FOR_EACH_ALIAS (this, ref)
>> + {
>> +   cgraph_node *alias = dyn_cast (ref->referring);
>> +   if (!malloc_p || alias->get_availabi

Re: [patch, fortran] Error for non-contiguous pointers

2017-10-06 Thread Jerry DeLisle
On 10/03/2017 11:35 AM, Thomas Koenig wrote:
> Hello world,
> 
> I have re-thought and simplified the patch for PR49232.
> This now uses gfc_is_simply_contiguous, in the
> non-strict version.  I have also opted for an error
> because, well, the use cases rejected by this are really
> errors, and will very likely lead to wrong code in
> user applications.
> 
> Regression-tested. OK for trunk?

OK

Thanks for the patch.

> 
> Thomas
> 
> 2017-10-03  Thomas Koenig  
> 
>     PR fortran/49232
>     * expr.c (gfc_check_pointer_assign): Error
>     for non-contiguous rhs.
> 
> 2017-10-03  Thomas Koenig  
> 
>     PR fortran/49232
>     * gfortran.dg/contiguous_4.f90: New test.
> 



Re: [patch, fortran] Set implicit ASYNCHRONOUS attribute

2017-10-06 Thread Jerry DeLisle
On 10/04/2017 02:41 PM, Thomas Koenig wrote:
> Hello world,
> 
> the attached patch sets the implicit ASYNCHRONOUS attribute according to F2008,
> 9.6.2.5:
> 
> # If a variable is used in an asynchronous data transfer statement as
> # • an item in an input/output list,
> # • a group object in a namelist, or
> # • a SIZE= specifier
> # the base object of the data-ref is implicitly given the ASYNCHRONOUS 
> attribute
> in the scoping unit of the
> # data transfer statement. This attribute may be confirmed by explicit 
> declaration.
> 
> At the moment, the only thing this does is show up on the fortran tree
> dump. This will hopefully change once asynchronous I/O is implemented.
> And if you are wondering why I put setting the global variable into
> check_io_constraints: It is because the parsing of the YES/NO
> is done there, and I didn't want to duplicate the code.
> 
> No test case because we don't (yet) have tests for -fdump-fortran-original.
> 
> Regression-tested.  OK for trunk?

OK and thanks for the patch.

Jerry

